Word Counts of All Chapters in Talmud Bavli
Cross-posted at my Academia.edu page (registration required): “Bavli By the Numbers: Word Counts of All Chapters in Talmud Bavli”. I also attached a PDF below. For word counts of entire tractates, see my previous piece: “Words of Wisdom: Word Counts of Classical Jewish Works”.
I successfully analyzed 304 of the 308 chapters.[1] I also calculated the number of folios per chapter.
For the 304 chapters successfully analyzed, the word count statistics are as follows:
Median Word Count per Chapter: 4,575 words.
Average Word Count per Chapter: 5,503 words.
Cumulative Word Count for all 304 Chapters: 1,672,827 words.
This cumulative figure is close to the total of roughly 1.86 million words that I reported in my “Words of Wisdom”; the difference is about 187,000 words, or roughly 10%. Once the four chapters with missing data are taken into account, the two figures align fairly closely.
Median number of folios (=dapim) per chapter: 7.5
Average number of folios per chapter: 9
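For readers who want to reproduce this kind of summary, here is a minimal sketch of how the median, average, and cumulative figures can be computed from a list of per-chapter counts. The function name summarize and the sample counts are hypothetical and chosen only for illustration; they are not the real data and not necessarily how my own script is organized.

```python
from statistics import mean, median


def summarize(word_counts):
    """Return the median, average, and cumulative word count for a list of chapters."""
    return {
        "chapters": len(word_counts),
        "median": median(word_counts),
        "average": mean(word_counts),
        "total": sum(word_counts),
    }


# Hypothetical per-chapter counts, for illustration only (not the real data).
sample_counts = [1200, 3400, 4575, 6100, 12240]
stats = summarize(sample_counts)
print(stats)

# Consistency check on the figures reported above: the cumulative count
# divided by the number of chapters should round to the reported average.
reported_total, reported_chapters = 1_672_827, 304
print(round(reported_total / reported_chapters))
```

The last two lines simply check that the reported total and the reported average agree with each other.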
Technical
The word counts in this piece are based on the transcriptions in Hebrew Wikisource:
קטגוריה:פרק בתלמוד הבבלי – ויקיטקסט (“Category: Chapter in the Babylonian Talmud”, Hebrew Wikisource)
I ran the counts with a Python script in Google Colab. See my “Bavli By the Numbers” for additional technical information.
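As a rough illustration of what counting words in a Hebrew transcription can look like, here is a minimal sketch. It assumes that a “word” is any whitespace-separated token containing at least one Hebrew letter; that definition, the function name count_hebrew_words, and the sample string are assumptions made for this example and may differ from what my script does.

```python
import re

# A token counts as a word if it contains at least one Hebrew letter
# (Unicode range U+05D0 to U+05EA). This definition is an assumption.
HEBREW_LETTER = re.compile(r"[\u05D0-\u05EA]")


def count_hebrew_words(text: str) -> int:
    """Count whitespace-separated tokens that contain a Hebrew letter."""
    return sum(1 for token in text.split() if HEBREW_LETTER.search(token))


# Five Hebrew words followed by two non-Hebrew tokens that are ignored.
sample = "מאימתי קורין את שמע בערבין 123 --"
print(count_hebrew_words(sample))
```

Tokens made up only of digits or punctuation are skipped, which is one simple way to keep page references and markup out of the count.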
Table key - technical
The traditional chapter names and starting folios in the fields ‘chapter name’ and ‘folio chapter begins on’ come from the following website:[2]
אוצר הספרים היהודי השיתופי, "רשימת פרקי הש"ס" (“List of the Chapters of the Talmud”)
‘Folios’ refers to the traditional page numbering of the Talmud, which has been in use for the last 500 years.[3]
The values in ‘Number folios in chapter’ were calculated programmatically. A missing value means that the chapter is the final chapter of its tractate, for which I could not easily calculate the number.
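One way such a calculation can work is to subtract each chapter's starting folio from the starting folio of the next chapter in the same tractate, which also explains why the final chapter has no value. The sketch below assumes starting folios written as strings such as '2a' or '13b', with side b counted as half a folio; that notation and the function names are assumptions for illustration, not the format of the source website.

```python
from typing import List, Optional


def folio_to_number(ref: str) -> float:
    """Convert a reference such as '2a' or '13b' to a number (side b adds 0.5)."""
    daf, side = int(ref[:-1]), ref[-1]
    if side not in ("a", "b"):
        raise ValueError(f"unexpected folio side in {ref!r}")
    return daf + (0.5 if side == "b" else 0.0)


def folios_per_chapter(starts: List[str]) -> List[Optional[float]]:
    """Given the starting folios of one tractate's chapters, in order,
    return each chapter's length in folios.

    The final chapter gets None, because there is no following chapter
    to subtract from.
    """
    if not starts:
        return []
    numbers = [folio_to_number(s) for s in starts]
    spans: List[Optional[float]] = [nxt - cur for cur, nxt in zip(numbers, numbers[1:])]
    return spans + [None]


# Illustrative starting folios for three consecutive chapters.
print(folios_per_chapter(["2a", "13a", "17b"]))
```

Filling in the final chapter would require knowing the last folio of each tractate, which this sketch does not take as input.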
Table
The table is sorted from highest word count to lowest.
The full table can be found in the attached PDF, or at my Academia.edu page (linked above).
A screenshot of the beginning of the table, showing the longest perakim (chapters), appears below. Unsurprisingly, these include many of the perakim most commonly studied in yeshivot:
האשה נקנית
חזקת הבתים
שנים אוחזין
אלו מציאות
ארבע אבות
[1] For four chapters, my script reported a ‘word count’ of zero because of non-standard formatting on the relevant Wikisource pages: they use the full word ‘גמרא’ instead of the abbreviation ‘גמ'’, which caused the script to fail. To estimate their word counts, one can paste the text from Wikisource into Microsoft Word and use its word count feature, then reduce the result by approximately 10%. (As a quality-assurance measure, I manually validated a subset of the successfully processed chapters in this way.) The chapters that returned a ‘word count’ of zero are the following:
ביצה פרק א; ברכות פרק ח; ברכות פרק ט; נדרים פרק ב.
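To illustrate how a zero count of this kind can arise, here is a minimal sketch. It assumes a script that counts only the words following a section marker; the patterns, the function names, and the sample string are hypothetical, and the real script may be structured differently. The second helper applies the approximate 10% reduction mentioned above to a word-processor count.

```python
import re

# Strict pattern: only the abbreviation is recognized as the marker.
STRICT_MARKER = re.compile(r"גמ'")
# Tolerant pattern: accepts the full word as well as the abbreviation.
# Caveat: the full word can also occur in running text, so a real script
# would need a more careful check than this.
TOLERANT_MARKER = re.compile(r"גמרא|גמ'")


def count_after_marker(text: str, marker) -> int:
    """Count the words after the first marker; return 0 if no marker is found."""
    parts = marker.split(text, maxsplit=1)
    if len(parts) < 2:
        return 0  # marker missing: this is how a spurious zero count appears
    return len(parts[1].split())


def estimate_from_word_processor(raw_count: int) -> int:
    """Reduce a word-processor count by approximately 10%."""
    return round(raw_count * 0.9)


# Hypothetical page text that uses the full word as its marker.
page = "מתני' ביצה שנולדה ביום טוב גמרא במאי עסקינן"
print(count_after_marker(page, STRICT_MARKER))
print(count_after_marker(page, TOLERANT_MARKER))
print(estimate_from_word_processor(5000))
```

With the strict pattern the marker is never found and the count comes back as zero, whereas the tolerant pattern finds the two words that follow the marker.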
Four chapters in Tractate Tamid have no Talmud Bavli on them and consist of Mishnah only. They have therefore been excluded, despite appearing in that Wikisource category:
תמיד פרק ג; תמיד פרק ה; תמיד פרק ו; תמיד פרק ז.
Cf.
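"This tractate is the shortest of the tractates that have Talmud Bavli on them (the folios of Talmud and Mishnah together total 8 folios), and for this reason it was attached in the printed editions to Tractate Me'ilah. As a result, the tractate does not open on folio 2 but on folio 25 (side b), and it ends on folio 32; from folio 32 to folio 33, the last three chapters of the tractate, which contain only Mishnah, were appended." (translated from the Hebrew)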
I’d like to take this opportunity to say thank you to YD for sparking the inspiration behind this project.
[2] As an aside, traditional Talmudic chapter names are an example of ‘incipits’. Incipit - Wikipedia:
“The incipit of a text is the first few words of the text, employed as an identifying label [...] The word incipit comes from Latin and means "it begins" [...] Before the development of titles, texts were often referred to by their incipits [...] Though the word incipit is Latin, the practice of the incipit predates classical antiquity by several millennia and can be found in various parts of the world [...] Many books in the Hebrew Bible are named in Hebrew using incipits. For instance, the first book (Genesis) is called Bereshit ("In the beginning ...") and Lamentations, which begins "How lonely sits the city...", is called Eykha ("How") [...] All the names of parashot are incipits, the title coming from a word, occasionally two words, in its first two verses. The first in each book is, of course, called by the same name as the book as a whole [...] In the Talmud, the chapters of the Gemara are titled in print and known by their first words, e.g. the first chapter of Mesekhet Berachot is called Me-ematai ("From when"). This word is printed at the head of every subsequent page within that chapter of the tractate.”
[3] For a discussion and some bibliography on this, see my article in the Seforim Blog, “From Print to Pixel: Digital Editions of the Talmud Bavli” (June 5, 2023).


