Transcript of my presentation: “A Selective Overview of Digital Resources for the Scholarly Study of Rabbinic Texts”
An edited transcription of the presentation I gave at the workshop "Editions of Classical Jewish Literature in the Digital Era"
Video recording of lecture at YouTube here.
Presentation slides at my Academia.edu page, here (requires registration). Also attached here:
And without further ado, here’s the edited transcript, with headers from the slides added.
Intro
First, I'd like to give an overview of previous guides that exist and what I tried accomplishing with this one. This guide can now be found on my Academia profile. It originally appeared at the Seforim blog in three parts last year. So, I'll start with a short history. I started working on this guide at the beginning of last year, the beginning of 2022. I was working on a comprehensive overview of literary forgeries of Jewish literature, and I came to realize that a full overview and guide of digital resources available for my research and general Jewish studies was lacking. So, I started compiling this guide. As I said, it was published last year at the Seforim blog in three parts and, thanks to Academia, is available on my profile. Anyone could take a look, download it, share it, and I think it could be very useful.
And since then, since last year, I've updated it and revised it considerably, and I believe it's fully up to date. There are no broken links or very few broken links. Some universities have already linked to it.
Scope of this guide and shortcomings of other guides
I'll start off by discussing the scope of my guide and the shortcomings of guides that already exist. This guide is focused on Jewish studies, as opposed to general tools for scholarship. Guides to digital resources do exist, but they're often somewhat limited, out of date, and simple lists without, especially, a helpful logical outline. In this slide I have an image of a typical university webpage of an Israeli academic Talmud Department listing resources with links. I won't mention the specific university, you could guess, but as you can see in the top image, it's simply in alphabetical order; it isn't really in any kind of logical order other than alphabetical. There's also only a limited description, really just one or two sentences, and that's about it. Some university departments do it better, but they're still somewhat lacking, and the resources are often very much splintered between different departments, especially in Israeli universities. In American universities, the coverage is also spread over thousands of years of Jewish history. So there's definitely room for improvement, and that's where I come in.
Also, because I'm not in an institution and I'm independent, I wanted to focus more on open-access resources: things that are high quality and user-friendly. Again, the internet is very fast-moving, and there's a lot out there with broken links that isn't necessarily the best available. Sure, everyone here is mostly aware of the best stuff to use, but a lot of it isn't so well known. So this is trying to make it explicit, kind of popularize it, and let everybody know what exists, what's out there.
In terms of the actual scope to make this guide more manageable, given the background interests and to help people who would be interested, this is primarily concerning resources from post-destruction of the Second Temple, so around the year 100 CE or so, until the late modern period, so around 1850. You could call this period by different names, like Middle, Rabbinic, Diaspora.
Of course, scholars might want to argue on the epoch, but that's the period I focused on. It's not going to discuss early Christianity on one side, and on the opposite end, it doesn't cover all the late modern period or modern Hebrew literature. Again, it's very much focusing on what you could call the rabbinic period.
Outline and Intro to Current Ecosystem
Here's an outline of the current ecosystem. Again, my full guide is about 40 pages, and I try to break it down in a full, user-friendly way, so you can check it out there. Today, I'm only going to cover a few of the resources, just to give a taste. Since this is obviously a higher-level workshop, I'll focus on the more interesting stuff. But there, I cover the full ecosystem.
Today, I'm going to mostly focus on primary texts. But just very quickly, in this slide, I'm going to give the full outline. For sources, we have everybody's favorite, Sefaria. Our next session is going to be on Hiburim, a source I'll discuss a bit more later. That's where a lot of my open-source transcription happens. Sefaria pulls a lot from there.
For scans, we have HebrewBooks, and then the National Library of Israel - Merchav. I'll go a little bit more into that as well, on what the best way is to browse. The National Library of Israel has a lot of scans, not so accessible, but we'll go through that.
Manuscripts...
For indexes, I'll go a little bit into that as well. For secondary literature, which I won't be discussing today but I have in my guide, Kotar has a lot of high-quality academic material. A lot of academic publishers are there. For Kindle, there's a lot of stuff. I personally prefer to do a lot of reading on Kindle. It's very easily accessible, and there's really good academic stuff out there. JSTOR, Academia, and then we have bibliographic info. Again, the National Library, which is also through encyclopedias, is helpful. Europeana has a lot of it accessible. Hebrew Wikipedia, of course, you need to be careful with, but they're really good for researching rabbis and communities, and they have a lot of good bibliography.
I won't be discussing this today, but there's a lot of really good popular media out there of varying quality. Academic blogs, often read by academics, have a lot of really good stuff there, especially for contemporary reviews, as well as Hebrew language blogs. Then there's a huge ecosystem of YouTube videos, podcasts, Twitter, Facebook, and we're not going to go into that at all today. But especially now, there's more and more of some really good high-quality stuff out there. Forums also, very often coming from a more traditional Haredi point of view, often have high-quality discussions.
Popular Editions of Primary texts
So now to get into specific resources, let's start with popular editions.
When I say 'popular editions', I mean as opposed to the more scholarly, scientific editions, which everybody here is more likely to be interested in. But there is, of course, a place for the more popular editions. That's what existed mostly for hundreds of years, and often that's the only edition that's available, especially if it's not necessarily something that has a manuscript available. That's just the basic edition.
Sefaria, as I said, often hosts public domain texts. Again, it's popular, not necessarily the best, but it's very helpful. So are Wikisource and, as I mentioned, HebrewBooks, which, as I'm sure everyone's aware, has OCR (Optical Character Recognition), so it is searchable. They also have something that a lot of people might not be aware of: a beta search. It's not found on their homepage, but it's closer to the Otzar Hachochma experience, where it's a more powerful search, shows all the titles, and is a lot more user-friendly than the default search. And then, of course, the National Library of Israel. In the last few years, as I'm sure most people know, they really upgraded to a powerful search with filters and all that, for hundreds of thousands of books and manuscripts.
Here, I have a screenshot of one of the tens of thousands of scanned PDFs, this one's from Kedushat Levi. They also have their bibliography, which we'll discuss a bit more later. Now, in terms of an index, I want to discuss a bit. That's the screenshot of it at the bottom left. Something that people might not be so aware of is this kind of acts as an index both for Hebrew Books as well as for the National Library and some other websites. But mostly, it lists everything and you can browse it. Their own index is not great. They kind of list this in a lot of places. There are a lot of links. I created my own meta-index list, also available on my Academia page. They've indexed tens of thousands of works in a user-friendly way, so it's definitely worth checking out. Last year, they had 36 web pages of indexes and, even just from a taxonomy perspective, I think it's very interesting.
Again, the National Library unfortunately doesn't have such a great index. They've changed their technology over the years, and a lot of links are broken. The best way to get to it, and I have a screenshot on the bottom right, is through their 'Available Online' filter. You request a book, and they'll say yes, and you can find it. They have tens of thousands of scans available. Some can be downloaded, and they often also link to HebrewBooks, to Otzar, and to other external websites. So it's also a great way of discovering other sites.
Not open access, and probably a subscription, which I'm sure everyone is aware of, is Otzar Hachochma. They also upload a lot of really high-quality recent publications, including by contemporary Orthodox publishers. It's also OCR, so it's basically searchable. OCR is not 100%, but it's very good. The search is good. And we'll move on to digital critical editions.
Primary texts - Edited Digital scientific editions
We had a talk yesterday from the Historical Dictionary Project. They have a lot of stuff, not necessarily the most user-friendly; I think that kind of came up yesterday. Of course, that's not really what it's for. It's not meant for browsing, it's meant for very high-quality, accurate editions. But it is possible to get the text from there as well.
Then Al-Hatorah, which we'll hear from in the next session. They're really good for medieval commentaries directly from manuscript; these are original scientific editions, and they have dozens of really good editions. Plus, they also have a lot of other stuff, which isn't necessarily better or worse than Sefaria. My understanding is that's from the same source, correct? Then, Talmud Shelbey Katz, of course, from an African past. Then we have the Avodah Zarah project from the Schechter Institute, which has many, especially the smaller midrashim, as well as the Devarim Rabbah from Kehal Mech.
On the top right, I have the synoptic edition of, I think, Devarim Rabbah, from Milikowsky, hosted by Bar-Ilan University. That's a separate website. Matt's edition of the Zohar, which he prepared as part of his monumental translation, is available online as well at the Stanford University Press website. The translation, an original translation, is really incredible. They have their paper dictionary.
So that's a screenshot of Matt's Zohar. It's also just a PDF you download, but I would say it's pretty clear. That's the best edition that's out there.
Primary texts - Presentation of Manuscripts #1
Let's discuss some of the manuscripts available online, whether they're images, maybe scans, or transcriptions.
Ktiv, of course, which was discussed a bunch of times. It's a recent project from the last few years, but they already have tens of thousands transcribed. That was also talked about, of course. Then there's Hachi Garsinan and the Digital Mishnah Project; I have a screenshot of that as well. There are all sorts of different ways that you could present a manuscript and a transcription. Hachi Garsinan is an incredible project, with a screenshot here as well. And as discussed, it's a great example of TEI, and just really a good example of how these projects should be done.
Ilanot is a project which I don't think was mentioned yet. It's a new project from the last few years. It's also in a book, and I think there's actually an exhibit now at the National Library of Israel; I'm not sure if it's connected directly. It's a really cool example of how it could be done. The image resolution is incredible, and there are all sorts of different ways of analyzing it. So that's the bottom left, from the Ilanot project. Really incredible work.
Primary texts - Presentation of Manuscripts #2
Okay, so now, just the newer projects, both at Princeton. So we had the Genizah project, which has been around for decades. They’re high-quality, they have transcriptions, a lot of them directly from Goitein, as well as links to his index cards and published transcriptions, not unpublished. And then we also have Schaefer. Of course, Schaefer is famous for his synoptic edition. And then, we have a project for Sefer Hasidim which is kind of similar, looking at all the different versions, and a lot of them are very different. So you have the ability to present the layout of the text. It's all transcribed, I believe it's 14 manuscripts.
Tertiary tools for Primary texts - Search and other Tools
As for some of the newer tools: Sefaria's search is actually also really good, and it's powered by Dicta. So they have their nikud tool. The citation finder at this point is really only for Tanakh and Talmud, maybe a lot more in the future. Expanding acronyms is also really good; I have a screenshot of that there, and you can play around with it later. Synopsis Builder, which I think Gila mentioned as well. There's also a tool to remove nikud, which is not so accessible on their website, but I found it through a search, and I personally find it helpful for simply removing nikud from Sefaria texts.
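For anyone who wants to do the nikud removal locally, here is a minimal sketch in Python. It is not Dicta's tool and makes no claim about how theirs works; the function name `strip_nikud` is my own, and the approach simply assumes that vowel points and cantillation marks are the Unicode combining marks in the Hebrew block (U+0591 to U+05C7).

```python
import unicodedata

def strip_nikud(text: str) -> str:
    """Remove Hebrew vowel points and cantillation marks from text.

    Assumption: everything we want to drop is a nonspacing combining
    mark (Unicode category "Mn") in the range U+0591..U+05C7.
    Punctuation such as maqaf (U+05BE) and sof pasuq (U+05C3) is not
    a combining mark, so it is kept.
    """
    # Decompose precomposed presentation forms (e.g. U+FB2A, shin with
    # shin dot) into base letter + mark, so the mark can be filtered out.
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(
        ch
        for ch in decomposed
        if not ("\u0591" <= ch <= "\u05c7" and unicodedata.category(ch) == "Mn")
    )
```

Calling `strip_nikud` on a vocalized passage copied from Sefaria should return the same consonantal text with the maqaf and other punctuation left in place. Whether you also want to drop that punctuation depends on what you're doing with the text afterward.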
Digital humanities tools for Rabbinic Hebrew - current and future
So here we have an article that I discovered pretty recently, at Lehrhaus, which I thought was a really good overview, and I want to give it more exposure. It gives an overview of everything. There's Dicta's Library, which has tremendous potential in terms of improving readability, parallels, and quotations; again, at this point it's really only Tanakh and Talmud, and hopefully we'll get more in the future. And then Dicta Analytics as well: lemmatization and part-of-speech tagging, which are also really important for machine learning and natural language processing, and incredible for being able to do the pre-processing and preparation.
These are just some quick screenshots I have over there: that's the Library, and that's some of the lemmatization, with the tagging.
Generative artificial intelligence - present and future
We all know generative artificial intelligence, and it's something that I've been testing on my blog in the last few months. Punctuation is something that Avi and Moshe mentioned in the article; it's not yet available in Dicta, but at least with ChatGPT-4, which I pay for, it's possible. I think everybody should pay for it. So far, in my testing it does incredibly well on punctuation.
We'll see how it all plays out, but I think generative AI and large language models will revolutionize many aspects of scholarship. So you could have automated punctuation which, again, was discussed in the article but, as far as I know, is not yet available. Keyphrase extraction I found very helpful for pulling out sources; the Talmud has a lot of formulas, so being able to pull those out is useful. And going forward, I think a specialized LLM, which is something that I'm very interested in, could really revolutionize these studies. ChatGPT-4 wasn't trained specifically on rabbinics. But I think when and if a model is, it could be really revolutionary.
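To make the "formulas" point concrete, here is a simple rule-based baseline in Python. To be clear, this is not what I did with ChatGPT-4 and it uses no language model. It just counts a short, hand-picked list of introductory formulas in unvocalized text. The list `FORMULAS` and the function name `count_formulas` are my own illustrative choices, not a standard inventory.

```python
import re
from collections import Counter

# Illustrative, hand-picked list of Talmudic introductory formulas.
# This is an assumption for the sketch, not a complete inventory.
FORMULAS = ["תנו רבנן", "תניא", "מתיב", "איבעיא להו"]

def count_formulas(text: str, formulas=FORMULAS) -> Counter:
    """Count whitespace-delimited occurrences of each formula in text.

    Limitations of this sketch: it expects unvocalized text, it does
    not match prefixed forms (a formula preceded by an attached letter),
    and it does not match a formula immediately followed by punctuation.
    """
    counts = Counter()
    for formula in formulas:
        # Require whitespace (or start/end of text) on both sides.
        pattern = r"(?<!\S)" + re.escape(formula) + r"(?!\S)"
        counts[formula] = len(re.findall(pattern, text))
    return counts
```

A baseline like this only finds what you already know to look for. The attraction of an LLM is that it can surface formulas and sources you didn't list in advance, but a fixed list is easy to check by hand and is a useful sanity check against model output.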
So that's about it. Again, the full guide is available on my Academia.edu page. I also blog on all this stuff. If you're interested, please subscribe!