How Often Are the Rabbis ‘Rabbi’, ‘Rav’, and ‘Shmuel’ Mentioned in the Talmud?
How Many Times Are the Major Mononymic (Single-Name) Rabbinic Figures “Rabbi” (Yehuda HaNasi), “Rav” (Abba Arikha), and “Shmuel” (Shmuel bar Abba) Mentioned in the Talmud?
For this project, I wanted to answer a simple question:[1] how often does the Talmud mention the three major mononymic rabbinic figures “Rabbi” (Yehuda HaNasi), “Rav” (Abba Arikha), and “Shmuel” (Shmuel bar Abba)?[2]
The question is simple, but the text is not. “Rabbi” and “Rav” are the standard rabbinic honorifics.[3] Throughout the Talmud those words introduce other named sages, so in a raw search the vast majority of hits are false positives. “Shmuel” is easier, but even there I wanted a method that would first separate the mononym from longer names.
I used two text files from my talmud-nlp-indexer project: the full English Talmud corpus and the gazetteer of Talmudic names. I tokenized both files into word-level units and then matched named figures from the gazetteer against the corpus. To avoid double counting, I matched names from longest to shortest and removed the matched spans before counting. That way, a full name such as “Rabbi Elazar ben Perata” is marked before the shorter pieces inside it can be counted separately.
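The tokenization and ordering step could look roughly like the following. This is a minimal sketch, not the project's actual script: the function names `tokenize` and `sort_longest_first` are hypothetical, and it assumes that a "word-level unit" is a maximal run of alphabetic characters with case preserved.

```python
import re

# Assumption: a token is a maximal run of letters (no digits, no punctuation).
TOKEN_RE = re.compile(r"[^\W\d_]+")


def tokenize(text: str) -> list[str]:
    """Split text into alphabetic word tokens, preserving case."""
    return TOKEN_RE.findall(text)


def sort_longest_first(names: list[str]) -> list[list[str]]:
    """Tokenize each gazetteer name and order entries by token count, longest first."""
    tokenized = [toks for toks in map(tokenize, names) if toks]
    # sorted() is stable, so entries of equal length keep their original order.
    return sorted(tokenized, key=len, reverse=True)
```

Ordering by token count (rather than character count) matters here, because the matching later works on token spans.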
After that removal pass, I counted the remaining exact one-word tokens Rabbi, Rav, and Shmuel. This gives an algorithmic estimate based on a clear rule: remove all other known rabbinic names first, then count the residual mononyms.
The resulting counts were:
Rabbi: 1,733
Rav: 3,460
Shmuel: 2,655
I also generated a concordance for manual review. For each of the three names, I saved up to ten examples with ten words before the hit and twenty words after it. That review step is especially important for Rabbi and Rav. Some residual matches still sit in formulaic reporting clauses such as “Rav says” or “Rabbi says,” and some of those probably do refer to the mononymic sage while others need closer checking in context.
Even with that limitation, the method is useful. It gives a reproducible first-pass count, and it handles the main structural problem in this kind of search: the Talmud reuses the same short words inside many longer names. By removing longer names first, I can get much closer to the cases where the text is actually using a single-name figure.
The result is a transparent workflow: tokenization, longest-first name removal, residual counting, and concordance review. I can refine that workflow later with tractate-level checks, better normalization, or a more detailed name list.[4]
Appendix - Technical
I ran the project on the English Talmud text file and the name gazetteer from the same repository. The corpus contained 2,587,408 alphabetic word tokens. The gazetteer contributed 3,680 tokenized name entries. I excluded the exact one-word targets Rabbi, Rav, and Shmuel from the removal list and kept every other gazetteer entry.
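The exclusion rule can be illustrated with a short filter. This is a sketch under assumptions: `build_removal_list` and `TARGETS` are hypothetical names, and gazetteer entries are assumed to arrive as lists of tokens with case preserved.

```python
# Assumption: the three targets are excluded only as exact one-token entries,
# so longer names that merely start with "Rabbi" or "Rav" stay in the list.
TARGETS = {("Rabbi",), ("Rav",), ("Shmuel",)}


def build_removal_list(gazetteer_entries):
    """Keep every tokenized gazetteer entry except the three one-word targets."""
    removal = []
    for entry in gazetteer_entries:
        key = tuple(entry)
        if key and key not in TARGETS:
            removal.append(key)
    return removal
```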
The matching pass was greedy and local. At each token position in the corpus, the script checked candidate gazetteer entries that began with that token and tried lengths from longest to shortest. When it found a full match, it marked that whole token span as occupied and skipped over those tokens for later counting. This longest-first rule prevented shorter names from being counted inside longer names.
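A greedy, longest-first pass of this kind might be sketched as follows. The names `index_by_first_token` and `mark_name_spans` are hypothetical, and the sketch assumes exact token-by-token matching on already tokenized input. It is an illustration of the rule described above, not the script itself.

```python
from collections import defaultdict


def index_by_first_token(entries):
    """Group tokenized names by their first token, longest entries first."""
    index = defaultdict(list)
    for entry in entries:
        if entry:
            index[entry[0]].append(tuple(entry))
    for candidates in index.values():
        candidates.sort(key=len, reverse=True)
    return index


def mark_name_spans(tokens, entries):
    """Return a boolean list: True where a token belongs to a matched name span."""
    index = index_by_first_token(entries)
    occupied = [False] * len(tokens)
    i = 0
    while i < len(tokens):
        matched = 0
        for cand in index.get(tokens[i], ()):
            if tuple(tokens[i:i + len(cand)]) == cand:
                matched = len(cand)
                break  # candidates are sorted, so the first hit is the longest
        if matched:
            occupied[i:i + matched] = [True] * matched
            i += matched  # skip the whole span so nothing inside it is re-counted
        else:
            i += 1
    return occupied
```

Indexing by first token keeps the pass local: at each position only the names that could start there are tried, instead of the whole gazetteer.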
That pass removed 158,999 corpus tokens as parts of non-target rabbinic name spans. I then counted every remaining unoccupied token equal to rabbi, rav, or shmuel.
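The residual count itself is a single pass over the tokens that were not marked as occupied. In this sketch the function name `count_residual` is hypothetical, and comparing tokens case-insensitively is an assumption (the targets are listed in lowercase above, but the exact case handling of the original script is not stated).

```python
from collections import Counter


def count_residual(tokens, occupied, targets=("rabbi", "rav", "shmuel")):
    """Count unoccupied tokens that equal one of the targets (case-insensitive)."""
    wanted = set(targets)
    counts = Counter()
    for tok, taken in zip(tokens, occupied):
        # Assumption: lowercase before comparing, so "Rav" and "rav" both count.
        if not taken and tok.lower() in wanted:
            counts[tok.lower()] += 1
    return counts
```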
As part of this larger project, I completely revised the new page: https://chavrutai.com/term-index. (See my previous writeup on how an initial version of this glossary was built: “Introducing a New Talmudic Glossary: Mapping Talmudic Names and Technical Terms” [Feb 22, 2026].)
This continues to be a work in progress. The current description there is as follows:
Note: This page is a work in progress. Data may be incomplete or contain errors.
This is a structured glossary of 4,904 terms appearing in the Babylonian Talmud — personal names, place names, Biblical figures, nations, and key concepts. Each entry includes variant spellings, Hebrew/Aramaic text, occurrence counts in the Steinsaltz English Talmud corpus, and links to Wikipedia where available.
Biographical data (teachers, students, father, dates, affiliation) is sourced from Wikidata and is available for entries that have a corresponding Wikidata item (Q-ID). This data is community-maintained and may not be complete or fully accurate for all figures.
Corpus occurrence counts reflect how often each term appears in the full Steinsaltz English translation as indexed in the ChavrutAI search corpus.
Screenshots, in tablet view:
Initial view:
After clicking on a specific entry:
On mononyms (single-word names) in the Talmud, see also my general discussion and lists in my piece “Abba”. Compare also my note on this topic in an earlier piece: “Identifying the Most Quoted Sages in the Talmud’s Aggada: A Programmatic and Quantitative Study” (Jan 17, 2024).
Used in Eretz Yisrael and Babylonia, respectively.
Worth noting: counting in English translation is especially helpful here, because it can avoid instances where “rabbi” is used in a general sense as “master/teacher” rather than as a fixed honorific (especially in dialogue, as a form of address). An example of the general sense is a sentence on the pattern of the following: “Rabbi! But didn't you say that …”.
I accordingly updated the table (mentioned earlier):



