Why Sophia Search

Orthodox search should make trust visible.

Sophia Search is being built around a simple rule: modern retrieval can help us find things, but named sources must carry the authority.

Core Claims

What the product is for

  1. Sophia Search gives Orthodox Christians cross-references they can trust.
  2. Every cross-reference should show why it can be trusted.
  3. Trust comes from received sources, not generated associations.
  4. Sophia points readers back to the sources.
  5. Search may assist discovery, but it must not speak in place of the sources.

Feature Alignment

How each major feature fulfills the goals

Feature

Church Fathers to Scripture Cross-References

Goals

  • A patristic cross-reference is trusted only when it rests on explicit source evidence.
  • Sophia must distinguish the Father's own citation from an editor's note or source index.
  • Semantic similarity can suggest places to read, but it cannot authenticate a cross-reference.

Canonical Sources

  • The primary source is the patristic work itself when Scripture is explicitly quoted or cited.
  • Edition layers can include PG/PL, Sources Chrétiennes, GCS, CSEL/CCSL, or public-domain ANF/NPNF through CCEL.
  • Current CCEL ThML <scripRef> markup should be treated as an edition reference layer until authorial status is verified.

Technical Implementation

  • Ingest only explicit source-marked scripture references into the trusted graph.
  • Store source edition, source tag, passage label, OSIS reference, work, section, paragraph, and authorial status.
  • Show the provenance in the interface before presenting a link as trusted.
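
The three rules above can be sketched as a single record type plus an ingest gate. This is a minimal illustration, not Sophia's actual schema: the class, field, and function names are hypothetical, the sample values are placeholders, and the gate assumes that "explicit source-marked" means every provenance field is filled in.

```python
from dataclasses import dataclass, fields
from enum import Enum


class AuthorialStatus(Enum):
    # Hypothetical status values; the text only requires that they be distinguished.
    AUTHOR_CITATION = "the Father's own citation"
    EDITION_REFERENCE = "edition reference layer"
    UNVERIFIED = "authorial status unverified"


@dataclass(frozen=True)
class ScriptureReference:
    source_edition: str    # e.g. "ANF/NPNF via CCEL"
    source_tag: str        # e.g. "ThML scripRef"
    passage_label: str     # label as printed in the source
    osis_ref: str          # normalized OSIS reference
    work: str
    section: str
    paragraph: str
    authorial_status: AuthorialStatus


def missing_provenance(ref: ScriptureReference) -> list:
    """Return the names of text fields that are empty or only whitespace."""
    return [
        f.name
        for f in fields(ref)
        if isinstance(getattr(ref, f.name), str) and not getattr(ref, f.name).strip()
    ]


def provenance_line(ref: ScriptureReference) -> str:
    """Build the provenance text an interface could show beside the link."""
    return (
        f"{ref.work} {ref.section}.{ref.paragraph} -> {ref.osis_ref} "
        f"[{ref.source_edition}, {ref.source_tag}; {ref.authorial_status.value}]"
    )


def ingest(ref: ScriptureReference, trusted_graph: list) -> bool:
    """Add the reference only when its provenance is complete."""
    if missing_provenance(ref):
        return False
    trusted_graph.append(ref)
    return True


# Placeholder values for illustration only.
example = ScriptureReference(
    source_edition="ANF/NPNF via CCEL",
    source_tag="ThML scripRef",
    passage_label="John 1:1",
    osis_ref="John.1.1",
    work="Example Work",
    section="1",
    paragraph="2",
    authorial_status=AuthorialStatus.EDITION_REFERENCE,
)
graph = []
if ingest(example, graph):
    print(provenance_line(example))
```

Keeping the authorial status on the record, rather than using it as a filter, lets the interface show a CCEL reference as an edition-layer reference without claiming it is the Father's own citation.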

Feature

Greek to English Alignment Cross-References

Goals

  • Alignment serves the Church's received text traditions as a study aid, not as a new authority.
  • Each link should name the Greek source, English translation, method, and confidence level.
  • Uncertain links should be hidden or clearly marked as uncertain.

Canonical Sources

  • Old Testament target: Septuagint.Bible, the GOARCH and Hellenic Bible Society project for the living Septuagint text.
  • New Testament target: the 1904 Patriarchal Text approved by the Great Church of Christ; accessible through Hellenic Bible Society BYZ04 and eBible GRCBYZ.
  • Current Sophia data is not yet that target: it uses public-domain Brenton LXX, Textus Receptus, KJV/ASV, and public-domain interlinear data.

Technical Implementation

  • Store Greek token IDs, English token IDs, source editions, morphology or lexicon source, method, and confidence.
  • Support one-to-many and many-to-one alignment instead of forcing word-for-word equivalence.
  • When traditions differ, show the difference rather than harmonizing it away.
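
One way to hold these fields is an alignment link that pairs a group of Greek tokens with a group of English tokens, so one-to-many and many-to-one links need no special handling. The sketch below is illustrative only: the class name, the token ID format, and the 0.8 display threshold are assumptions, not values taken from Sophia.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class AlignmentLink:
    greek_token_ids: Tuple[str, ...]     # one or more Greek tokens
    english_token_ids: Tuple[str, ...]   # one or more English tokens
    greek_edition: str
    english_edition: str
    lexicon_source: str                  # morphology or lexicon source
    method: str
    confidence: float                    # assumed to lie between 0.0 and 1.0

    def __post_init__(self):
        if not self.greek_token_ids or not self.english_token_ids:
            raise ValueError("a link needs at least one token on each side")
        if not 0.0 <= self.confidence <= 1.0:
            raise ValueError("confidence must be between 0.0 and 1.0")

    def shape(self) -> str:
        """Describe the link as one-to-one, one-to-many, many-to-one, or many-to-many."""
        greek = "one" if len(self.greek_token_ids) == 1 else "many"
        english = "one" if len(self.english_token_ids) == 1 else "many"
        return f"{greek}-to-{english}"


def display_state(link: AlignmentLink, threshold: float = 0.8) -> str:
    """Decide how a link is presented; the threshold is a hypothetical default."""
    return "shown" if link.confidence >= threshold else "marked uncertain"


# Hypothetical token IDs: one Greek word rendered by two English words.
link = AlignmentLink(
    greek_token_ids=("grc:1",),
    english_token_ids=("eng:1", "eng:2"),
    greek_edition="Textus Receptus",
    english_edition="KJV",
    lexicon_source="Strong's",
    method="interlinear",
    confidence=0.95,
)
print(link.shape(), display_state(link))
```

Because each link names both editions, two traditions that differ at the same verse produce two separate links that can be shown side by side instead of being merged.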

Feature

Semantic Search for Bible and Church Fathers

Goals

  • Semantic search is for discovery, not authentication.
  • A semantic match should always send the reader back to a named text and edition.
  • Semantic results may be useful, but they are not cross-references until source evidence proves them.

Canonical Sources

  • The source is the Bible or Fathers corpus being searched, not the model, embedding, or index.
  • Each corpus should name its edition, version, and licensing status.
  • Review can promote a semantic discovery only when it is tied to explicit source evidence.

Technical Implementation

  • Label semantic results as semantic matches.
  • Keep semantic matches separate from authenticated cross-references.
  • Record search method, index version, corpus version, and any later human or source-backed review.
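
The separation described above can be enforced at the type level: a semantic match and an authenticated cross-reference are different record types, and the only path from one to the other goes through explicit source evidence. The sketch below is a minimal illustration under that assumption; all class, field, and function names are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SemanticMatch:
    query: str
    corpus_ref: str        # named text and location the reader is sent back to
    corpus_version: str
    search_method: str
    index_version: str
    score: float
    label: str = "semantic match"   # always shown with this label


@dataclass(frozen=True)
class SourceEvidence:
    source_edition: str
    source_tag: str
    reviewed_by: str       # human or source-backed review


@dataclass(frozen=True)
class AuthenticatedCrossReference:
    corpus_ref: str
    evidence: SourceEvidence
    promoted_from: SemanticMatch    # keeps the discovery trail


def promote(
    match: SemanticMatch, evidence: Optional[SourceEvidence]
) -> Optional[AuthenticatedCrossReference]:
    """Promote a match only when explicit source evidence is attached."""
    if evidence is None:
        return None
    if not evidence.source_edition.strip() or not evidence.source_tag.strip():
        return None
    return AuthenticatedCrossReference(
        corpus_ref=match.corpus_ref,
        evidence=evidence,
        promoted_from=match,
    )


# Placeholder values for illustration only.
match = SemanticMatch(
    query="light of the world",
    corpus_ref="Example Work 1.2",
    corpus_version="corpus-v1",
    search_method="embedding similarity",
    index_version="index-v1",
    score=0.87,
)
print(promote(match, None))   # a high score alone promotes nothing
```

The similarity score never appears in the promotion check, which mirrors the rule that semantic results are for discovery and not for authentication.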

Source Ledger

What data is actually available now

Old Testament Greek

  • Orthodox target: Septuagint.Bible, a joint project of the Greek Orthodox Archdiocese of America and the Hellenic Bible Society.
  • Ecclesial status: The project says it aims to serve as the official clearinghouse for the living Septuagint text used in Orthodox worship and devotion.
  • Current Sophia data: eBible Brenton LXX via data/grcbrent_vpl.txt. Public domain, accessible, not claimed as Orthodox-blessed.

New Testament Greek

  • Orthodox target: The 1904 Antoniades Patriarchal Text, corrected in 1912.
  • Ecclesial status: The 1904 edition was published as the New Testament approved by the Great Church of Christ, the Ecumenical Patriarchate.
  • Online access: Hellenic Bible Society BYZ04 for reading, and eBible GRCBYZ for public-domain developer formats.
  • Current Sophia data: Textus Receptus from data/tr.json plus public-domain interlinear data. Useful for study, not the Patriarchal Text.

English and Alignment Data

  • Current English packs: KJV and ASV 1901. Public-domain study texts; no Orthodox ecclesial approval is claimed for them.
  • Current alignment source: tahmmee/interlinear_bibledata, which itself cites Strong's data and BibleForgeDB.
  • Rule: These alignments can power hover/click study aids, but they must not be labeled Orthodox-authenticated until rebuilt against the target Greek source and a licensed or explicitly named English source.

Source Policy

What Sophia should say plainly

No source is labeled "Orthodox-authenticated" unless Sophia can name the edition, link to the data or publication, and identify the ecclesial approval or source authority being claimed.

If the source is only public domain, academic, commercial, or inferred by alignment, Sophia should say that plainly and keep it outside the trusted cross-reference graph.
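
The policy reduces to a three-part check, sketched below. The names are hypothetical, and the label wording for sources that fail the check is an assumption; the only fixed point is that edition, link, and claimed approval must all be present.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SourceEntry:
    name: str
    edition: Optional[str] = None
    link: Optional[str] = None        # data path or publication link
    approval: Optional[str] = None    # ecclesial approval or source authority claimed
    basis: str = "public domain"      # or academic, commercial, inferred by alignment


def is_authenticated(entry: SourceEntry) -> bool:
    """All three items must be named; any gap fails the check."""
    return bool(entry.edition and entry.link and entry.approval)


def source_label(entry: SourceEntry) -> str:
    """Say plainly what the source is."""
    if is_authenticated(entry):
        return f"Orthodox-authenticated: {entry.edition} ({entry.approval})"
    return f"Not Orthodox-authenticated: {entry.name} ({entry.basis})"


def allowed_in_trusted_graph(entry: SourceEntry) -> bool:
    return is_authenticated(entry)


# Current data: the edition and file are named, but no ecclesial approval is claimed.
brenton = SourceEntry(
    name="Brenton LXX",
    edition="eBible Brenton LXX",
    link="data/grcbrent_vpl.txt",
    basis="public domain",
)
print(source_label(brenton))
print(allowed_in_trusted_graph(brenton))
```

Naming an edition and a file is not enough on its own: without an identified approval or authority, the entry is labeled by its basis and stays outside the trusted graph.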