Biblical Citations in the Talmud: A New Digital Index and Concordance
I developed a comprehensive digital resource that systematically presents the thousands of biblical citations embedded throughout the Babylonian Talmud.1
This project creates structured concordance tables that map biblical verses to their specific locations in the Babylonian Talmud, providing readers and researchers with unprecedented access to the intricate relationship between these foundational Jewish texts.
The full context is shown, with hyperlinks to the relevant section in the Talmud.
Outline
Intro
The Challenge: Biblical Citations in Talmudic Literature
The Solution: Automated Extraction and Organization
Output: Organized Concordance Tables
Research Applications
Conclusion
Appendix 1 - Per-book Concordance Index
Torah (Five Books of Moses)
Nevi'im (Prophets)
Ketuvim (Writings)
Appendix 2 - Technical
Getting the full Steinsaltz Talmud, with sections
Processing Architecture
Data Schema
Performance Characteristics
The Challenge: Biblical Citations in Talmudic Literature
The Babylonian Talmud contains extensive biblical material—direct quotations, paraphrases, and allusions that form the scriptural foundation for rabbinic legal and theological discourse.
Traditional study methods require scholars to manually cross-reference biblical passages with their Talmudic contexts, a time-intensive process that often limits the scope of comparative analysis.
The basic print indexes are:2
Heiman, התורה הכתובה והמסורה (Torah HaKetuvah VehaMesurah, "The Written and Transmitted Torah"), parts 1, 2, 3
Heiman’s indexes are freely available at HebrewBooks.
Screenshot, for example, from Part 3, p. 1:
A number of other works provide a traditional index of verses that appear in the Talmud, such as Soncino, Artscroll, and Hebrew Wikisource; see the discussion here, at judaism.stackexchange.com - Mi Yodeya: “Is there a resource that shows all uses of a Tanach verse in the Talmud?” (~2014)
As well as these traditional Hebrew reference works:
תורה אור (Torah Or)
תורה תמימה (Torah Temimah)
תורה שלמה (Torah Shelemah)
Now, there’s also Caleb Friedeman (Editor), A Scripture Index to Rabbinic Literature (2021).
You can see the Table of Contents and some sample pages here.
The description at the Amazon page of that work:
A Scripture Index to Rabbinic Literature is a comprehensive Scripture index that catalogs approximately 90,000 references to the Bible found in classical rabbinic literature. This literature comprises two categories: (1) Talmudic literature (i.e., the Mishnah and related works) and (2) midrashic literature (i.e., biblical commentary).
Each rabbinic reference includes a hard citation following SBL Handbook of Style, the page number where the reference can be found in a standard English edition, and an indication of whether the biblical reference is a direct citation, allusion, or editorial reference. This incredibly handy reference work is the first of its kind and is a welcome addition to Hendrickson’s well-crafted line of reference books.
Key points and features:
A comprehensive Scripture index to classical rabbinic literature in English
Includes references to the Mishnah, the Tosefta, the Jerusalem Talmud, and the Babylonian Talmud, as well as the Mekilta, Midrash Rabbah, Pirqe Rabbi Eliezer, and many more
Approximately 90,000 references include a hard citation, a page number in a standard English edition, and an indication of whether the biblical reference is a direct citation, allusion, or editorial reference
Saves researchers large amounts of time and energy by bringing together a vast amount of data that was previously located across many disparate resources.
Being print-only is highly limiting (as I’ve discussed a number of times). The constraints of print almost always force the use of abbreviations and make it impossible to quote the relevant sources in full. Hyperlinking, of course, is only possible in a digital text. In addition, the 2021 index is copyrighted, not open-access.
Sefaria does have a related index, but it’s not especially usable:
https://www.sefaria.org/explore:
For example, Book of Joshua:
https://www.sefaria.org/explore/Joshua:
Close-up, of one connection:
Value propositions of my new index:3
Open access
Digital
No abbreviations
Hyperlinked
Links to specific sections
Browseable, simple navigation, user friendly, easy to compare different discussions of a single verse
The Solution: Automated Extraction and Organization
My solution processes the complete English translation of the Babylonian Talmud from the Steinsaltz digital edition.4
The system identifies biblical quotations through HTML markup analysis—the Steinsaltz edition formats biblical quotations in bold text followed by parenthetical citations like "(Genesis 1:5)".5
The system processes approximately 80,000 source entries (=Steinsaltz sections) from the complete Talmud, ultimately generating over 17,000 unique biblical citation mappings. Each entry captures not only the biblical verse and its location in the Talmud, but also the complete contextual passage and the specific subsection within each Talmudic page.
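The detection step described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual script: it uses only the standard library's `re` module instead of an HTML parser, it assumes the bold quotation contains no nested tags, and the names `find_citations`, `CITATION_RE`, `REF_RE`, and `SKIP_PREFIXES` are hypothetical.

```python
import re

# Assumed markup: a bold quotation followed by a parenthetical citation,
# e.g. <b>And there was evening</b> (Genesis 1:5).
CITATION_RE = re.compile(
    r"<(?:b|strong)>(?P<quote>[^<]*)</(?:b|strong)>"  # bold text, no nested tags
    r"\s*\((?P<ref>[^()]*)\)"                         # parenthetical right after it
)

# "Book chapter:verse"; the book may carry a numeric or Roman-numeral prefix.
REF_RE = re.compile(
    r"^(?P<book>(?:[1-3]|I{1,3})?\s?[A-Za-z][A-Za-z ]*?)"
    r"\s+(?P<chapter>\d+):(?P<verse>\d+)"
)

# Parentheticals that open with these words are treated as cross-references.
SKIP_PREFIXES = ("see ", "cf.", "compare ")


def find_citations(section_html):
    """Return one dict per bold quotation that has a direct verse citation."""
    results = []
    for match in CITATION_RE.finditer(section_html):
        ref = match.group("ref").strip()
        if ref.lower().startswith(SKIP_PREFIXES):
            continue  # cross-reference, not a direct quotation
        parsed = REF_RE.match(ref)
        if parsed is None:
            continue  # parenthetical is not a verse reference
        results.append({
            "book": parsed.group("book").strip(),
            "chapter": parsed.group("chapter"),
            "verse": parsed.group("verse"),
            "verse_text": match.group("quote").strip(),
        })
    return results
```

Calling `find_citations` on a section's HTML returns a list of dictionaries; a parenthetical such as "(see Exodus 2:1)" is skipped by the prefix filter.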
Output: Organized Concordance Tables
The final product consists of 38 separate Markdown files, one for each biblical book, hosted at the GitHub repo. Each file contains organized concordance tables.
Outline:
https://github.com/EzraBrand/bible-rabbinic-index/blob/main/docs/BOOKS_INDEX.md
Screenshot:
These tables group citations by biblical chapter and provide four key pieces of information for each entry:
The specific biblical verse and its quoted text6
The Talmudic location (tractate, page, and subsection)
A direct hyperlink to the passage on ChavrutAI.com
The complete Talmudic context surrounding the citation
For example, the Genesis concordance reveals that the creation narrative in Genesis 1 appears throughout multiple Talmudic tractates, with concentrated discussions in Rosh Hashanah 11a regarding the timing of creation, and scattered references across Berakhot, Chagigah, and other tractates exploring theological implications of the creation account.
https://github.com/EzraBrand/bible-rabbinic-index/blob/main/docs/books/Genesis.md
Screenshot:
Each page is divided into sections, by chapter. The sections/chapters can be accessed via the hamburger icon at the top right.
For example, Genesis 48:
https://github.com/EzraBrand/bible-rabbinic-index/blob/main/docs/books/Genesis.md#chapter-48
Screenshot, with relevant components highlighted with red arrows:
Research Applications
This concordance serves multiple applications that were previously difficult to pursue systematically.
Readers and scholars can now trace how specific biblical verses function across different Talmudic contexts, examining whether particular passages consistently serve similar argumentative purposes or whether their interpretive applications vary by tractate or topic.
The hyperlinked structure connects researchers directly to primary sources on ChavrutAI, enabling immediate verification and deeper study of specific passages.
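As a small illustration of this kind of query, the sketch below counts which tractates cite a given biblical chapter. It assumes the entries have been loaded as dictionaries with the `book`, `chapter`, and `tractate` fields shown in the Data Schema appendix; the function name `tractates_for_chapter` is hypothetical.

```python
from collections import Counter


def tractates_for_chapter(entries, book, chapter):
    """Return (tractate, count) pairs for one biblical chapter, most cited first."""
    counts = Counter(
        entry["tractate"]
        for entry in entries
        if entry["book"] == book and entry["chapter"] == str(chapter)
    )
    return counts.most_common()
```

The same pattern extends to a single verse by also comparing the `verse` field.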
Conclusion
The Bible Rabbinic Index demonstrates how digital humanities methods can support traditional textual scholarship. By automating the labor-intensive process of citation mapping, the tool enables researchers to pursue questions about biblical-Talmudic relationships at scale.
The complete dataset, processing scripts, and organized concordance tables are freely available for academic use.
The intersection between biblical and rabbinic literature represents one of the most significant textual relationships in Jewish intellectual history. This tool provides a new lens through which to examine that relationship systematically, offering researchers both comprehensive coverage and direct access to the primary sources.
Appendix 1 - Per-book Concordance Index
From here:
https://github.com/EzraBrand/bible-rabbinic-index/blob/main/docs/BOOKS_INDEX.md
Torah (Five Books of Moses)
Nevi'im (Prophets)
Ketuvim (Writings)
Appendix 2 - Technical
Getting the full Steinsaltz Talmud, with sections
Full script, Google Colab notebook:
Complete ed. Steinsaltz Talmud Bavli - in sections - combined CSV - 80k rows - 19-Sep-25
Key code snippet:
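from urllib.parse import quote  # percent-encodes spaces in the filename

tractate_name = "Berakhot"  # example; the full script sets this per tractate

# Construct the download URL
base_url = "https://www.sefaria.org"
csv_filename = f"{tractate_name} - en - William Davidson Edition - English.csv"
download_path = f"/download/version/{quote(csv_filename)}"
full_url = base_url + download_path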
Processing Architecture
The system employs a three-stage processing pipeline designed for accuracy and maintainability:
Stage 1: Extraction (extract_concordance_from_csv.py)
BeautifulSoup HTML parser identifies <b> and <strong> elements
Regex-based citation detection with parenthetical reference patterns
Multi-word tractate name parsing using page-pattern identification
Book name canonicalization (Roman numerals, variant spellings)
Exclusion filters for cross-references and meta-commentary
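Two of the Stage 1 steps, multi-word tractate parsing and book-name canonicalization, might look something like the sketch below. The location format ("Tractate page:section"), the alias table, and the choice of Arabic numerals as the canonical prefix are all assumptions made for illustration; the real script's rules may differ, and `parse_location` and `canonical_book` are hypothetical names.

```python
import re

# Page pattern: folio number plus side "a" or "b", optionally ":section".
# Everything before the page is taken as the (possibly multi-word) tractate name.
LOCATION_RE = re.compile(
    r"^(?P<tractate>.+?)\s+(?P<page>\d+[ab])(?::(?P<section>\d+))?$"
)

# Hypothetical tables; the actual mappings are not shown in this post.
ROMAN_PREFIX = {"I": "1", "II": "2"}
BOOK_ALIASES = {"Song of Solomon": "Song of Songs", "Psalm": "Psalms"}


def parse_location(ref):
    """Split 'Rosh Hashanah 11a:3' into ('Rosh Hashanah', '11a', '3')."""
    match = LOCATION_RE.match(ref.strip())
    if match is None:
        raise ValueError(f"unrecognized location: {ref!r}")
    return match.group("tractate"), match.group("page"), match.group("section")


def canonical_book(name):
    """Normalize whitespace, Roman-numeral prefixes, and a few variant spellings."""
    name = " ".join(name.split())
    first, _, rest = name.partition(" ")
    if first in ROMAN_PREFIX and rest:
        name = f"{ROMAN_PREFIX[first]} {rest}"
    return BOOK_ALIASES.get(name, name)
```

Anchoring on the page pattern rather than on whitespace is what lets a two-word name like "Rosh Hashanah" come through intact.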
Stage 2: Export (export_concordance_csv.py)
Deduplication by (book, chapter, verse, tractate, page, section) tuples
Sorting by traditional Jewish biblical book order
Data validation and integrity checks
CSV serialization with proper escaping
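The export stage can be approximated with the standard library alone. In this sketch, `BOOK_ORDER` is deliberately truncated to a few books (a full Tanakh-order list is assumed), and `dedupe_and_sort` and `to_csv` are hypothetical names rather than functions from the actual script.

```python
import csv
import io

# Traditional book order, truncated here for brevity; a full list is assumed.
BOOK_ORDER = ["Genesis", "Exodus", "Leviticus", "Numbers", "Deuteronomy",
              "Joshua", "Judges"]
BOOK_RANK = {book: rank for rank, book in enumerate(BOOK_ORDER)}
KEY_FIELDS = ("book", "chapter", "verse", "tractate", "page", "section")


def dedupe_and_sort(entries):
    """Drop repeated (book, chapter, verse, tractate, page, section) tuples, then sort."""
    seen = set()
    unique = []
    for entry in entries:
        key = tuple(entry[field] for field in KEY_FIELDS)
        if key not in seen:
            seen.add(key)
            unique.append(entry)
    # Books missing from BOOK_ORDER sort last; chapter and verse sort numerically.
    unique.sort(key=lambda e: (BOOK_RANK.get(e["book"], len(BOOK_ORDER)),
                               int(e["chapter"]), int(e["verse"])))
    return unique


def to_csv(entries):
    """Serialize the key fields to CSV text; the csv module handles escaping."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=KEY_FIELDS, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(entries)
    return buffer.getvalue()
```

Sorting on integer chapter and verse values avoids the string-ordering trap where "10" would sort before "2".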
Stage 3: Generation (generate_md_per_book.py)
Chapter-based organization with markdown headers
URL encoding for ChavrutAI hyperlinks with section anchors
HTML sanitization for table formatting
Template-based output generation
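A rough sketch of the generation stage follows. The URL layout in `URL_TEMPLATE` is a placeholder, since the real ChavrutAI link format is not shown in this post; the column headers other than "Bible Verse Text" are likewise assumed, and `render_book` and `table_cell` are hypothetical names.

```python
from collections import defaultdict
from urllib.parse import quote

# Placeholder URL layout; the real ChavrutAI link format is an unknown here.
URL_TEMPLATE = "https://example.org/{tractate}/{page}#section-{section}"


def table_cell(text):
    """Make text safe for a markdown table cell (escape pipes, flatten newlines)."""
    return text.replace("|", "\\|").replace("\n", " ").strip()


def render_book(book, entries):
    """Render one book's entries as markdown, with one table per chapter."""
    by_chapter = defaultdict(list)
    for entry in entries:
        by_chapter[int(entry["chapter"])].append(entry)

    lines = [f"# {book}", ""]
    for chapter in sorted(by_chapter):
        lines += [f"## Chapter {chapter}", "",
                  "| Verse | Bible Verse Text | Talmud Location | Context |",
                  "| --- | --- | --- | --- |"]
        for e in sorted(by_chapter[chapter], key=lambda e: int(e["verse"])):
            url = URL_TEMPLATE.format(tractate=quote(e["tractate"]),
                                      page=e["page"], section=e["section"])
            location = f"[{e['tractate']} {e['page']}:{e['section']}]({url})"
            lines.append(
                f"| {book} {chapter}:{e['verse']} | {table_cell(e['verse_text'])} "
                f"| {location} | {table_cell(e['full_text'])} |"
            )
        lines.append("")
    return "\n".join(lines)
```

Escaping the pipe character matters because an unescaped `|` inside a quoted passage would otherwise split the table row into extra columns.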
Data Schema
JSON intermediate format:
{
  "book": "Genesis",
  "chapter": "1",
  "verse": "1",
  "tractate": "Chagigah",
  "page": "12a",
  "section": "16",
  "verse_text": "In the beginning God created",
  "verse_html": "<b>In the beginning God created</b>",
  "full_text": "Complete Talmudic context passage..."
}
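A consumer of this intermediate format might check each record against the schema before using it. The sketch below assumes the file holds a JSON array of such objects and that every field is a string, as in the example above; `load_entries` and `REQUIRED_FIELDS` are hypothetical names.

```python
import json

REQUIRED_FIELDS = ("book", "chapter", "verse", "tractate", "page",
                   "section", "verse_text", "verse_html", "full_text")


def load_entries(json_text):
    """Parse a JSON array of entries and check each one against the schema."""
    entries = json.loads(json_text)
    for index, entry in enumerate(entries):
        missing = [field for field in REQUIRED_FIELDS if field not in entry]
        if missing:
            raise ValueError(f"entry {index} is missing fields: {missing}")
        if not all(isinstance(entry[field], str) for field in REQUIRED_FIELDS):
            raise ValueError(f"entry {index} has a non-string field")
    return entries
```

Note that chapter, verse, and section are stored as strings, so any numeric sorting needs an explicit `int(...)` conversion.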
Performance Characteristics
Input processing: 80,793 source entries (81k-row CSV)
Output generation: 17,138 unique citations across 38 biblical books
Processing time: ~2-3 minutes for complete pipeline
Intermediate data size: 19 MB of JSON files
Thanks to a discussion in the ‘Digital Humanities IL’ group for inspiring this project.
Unrelated, I just updated my “Words of Wisdom: Word Counts of Classical Jewish Works”.
The major change in v5: I updated the numbers in the Talmud Yerushalmi section, based on my recent independent word count script at my GitHub here.
Full output here, word counts of every Yerushalmi tractate, chapter, and halacha: yerushalmi_word_counts.csv.
Compare also my recent discussion on Talmud indexes, here: “Appendix 3 - Talmudic Indexes: Existing Talmudic Indexes, and Index vs. Outline, and Towards an Outline of Select Chapters and Sugyot”.
Note: My index only covers the Babylonian Talmud; thus, it doesn’t replace the existing indexes mentioned above, which cover all of classical early rabbinic literature.
Sourced from the Sefaria website, by downloading and combining the per-tractate CSVs, which retain the section divisions.
Cross-references - citations beginning with "see," "cf.," or "compare" - are filtered out to maintain focus on direct quotations.
Note: The text in column ‘Bible Verse Text’ is often not fully accurate. The script that extracts that text needs to be optimized; it’s a fairly complex text-processing challenge.