How Do AI Models Translate Rabbinic Hebrew? A Comparison of Google Translate, Dicta, ChatGPT-4, and Claude 3
Translation Test – Comparing Google Translate, Dicta, ChatGPT-4, and Claude 3; ChatGPT-4’s translation of Aruch HaShulchan
Outline
Translation Test – Comparing Dicta, ChatGPT-4, and Claude 3
Original Hebrew
Google Translate
Dicta Translate
ChatGPT-4
Claude 3 Sonnet (free version)
ChatGPT-4’s translation of Aruch HaShulchan
Translation Test – Comparing Dicta, ChatGPT-4, and Claude 3
Note: This test was conducted five months ago, on April 21, 2024. I haven't checked whether the results have changed since.[1]
I was given pre-release access to Dicta’s translation tool, which hasn’t been publicly launched yet, and tested it against Google Translate, ChatGPT-4, and Claude 3. In my opinion, Dicta’s translation performed significantly better than Google Translate, but it still fell short of ChatGPT-4 and Claude 3.[2]
I believe Claude 3 (free version) is the best overall, though its translations tend to be a bit looser than ChatGPT-4's.
For comparison, I’ve bolded the notable differences, that is, the especially good or bad translations:
Original Hebrew
מסעות בנימין השלישי
מאת: מנדלי מוכר ספרים
https://benyehuda.org/read/64083
יתברך הבורא וישתבח היוצר, שהוא מנהיג את הגלגלים בעולמות העליונים ואת בריותיו בעולם התחתון ומבין לכל הליכותיהם. אין לך עשב שאין לו מלאך, שמכהו ואומר לו: “גדל!” ואם עשב כך, קל-וחומר בן-אדם, וקל-וחומר בן-בנו של קל-וחומר אדם מישראל. אין הדיוט קופץ בראש, אין שוטה נעשה פלא-יועץ, ואין עם-הארץ – חסיד ובור – משכיל אצלנו, אלא עד שכל אחד ואחד מהם המלאך שלו מכהו וכופאו להיות מה שהוא. אף קבצנינו, ארחי-פרחי שלנו, מלאכי-השרת מכים אותם ואומרים להם: “פרו-ורבו, קבצנים! בית-יעקב, לכו – וחזרו על הפתחים!”…
Google Translate
May the Creator be blessed and the creator be praised, who guides the wheels in the upper worlds and his creatures in the lower world and understands all their ways. You don't have grass that doesn't have an angel, who beats it and tells it: "Grow!" And if he did so, he was a human being, and he was the son of a human being from Israel. No layman jumps in the head, no fool becomes a wonder-counselor, and none of the people of the country - Hasidic and ignorant - is educated with us, except until each and every one of them is struck by his angel and frozen to be what he is. Even our beggars, our archi-farchi, the servant-angels beat them and say to them: "Pro-verbo, beggars! Beit Yaakov, go - and return to the doorways!"...
Dicta Translate
Blessed be the Creator and exalted be the Maker, who governs the wheels in the upper worlds and his creatures in the lower world, and understands all their ways. There is no herb that does not have its angel, who strikes it and says to it: “Grow!” And if a herb is so, how much more so a man, and how much more so a son of a man, and how much more so a son of a son of a man, a Jew. A fool does not become a counselor, a simpleton does not become a wonder-worker, and a boor does not become a scholar among us, unless his angel strikes him and forces him to be what he is. Even our beggars, our vagabonds, the angels strike them and say to them: “Procreate, beggars! House of Jacob, go forth – and return to the doorways!”...
ChatGPT-4
"May the Creator be blessed and the Maker be praised, who leads the spheres in the upper worlds and His creatures in the lower world and understands all their ways. There is no grass that does not have an angel who strikes it and says to it: 'Grow!' And if this is so for grass, all the more so for a human, and even more so for a descendant of a human from Israel. A commoner does not jump to the head, a fool does not become a wonder-counselor, and an ignorant person—neither pious nor uneducated—is considered wise among us, except that each and every one of them has an angel who strikes them and forces them to be what they are. Even our collectors, our wanderers, our ministering angels strike them and say to them: 'Be fruitful and multiply, collectors! House of Jacob, go—and return to the thresholds!'"
Claude 3 Sonnet (free version)
Blessed be the Creator and praised be the Maker, who guides the celestial spheres in the upper worlds and His creatures in the lower world, and understands all their ways. There is no blade of grass that does not have an angel, who strikes it and says to it: "Grow!" And if a blade of grass is so, how much more a human being, and how much more a descendant of a human being of Israel. No commoner rises to the top by chance, no fool becomes a wondrous counselor, and no boor—pious yet ignorant—gains wisdom among us, but rather for each and every one of them there is an angel who strikes and compels them to be what they are. Even our beggars, our wandering vagabonds, have ministering angels who strike them and say to them: "Be fruitful and multiply, beggars! House of Jacob, go—and return to the doorsteps!"
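For readers who want to locate the differences between two of the outputs above mechanically, here is a minimal Python sketch using the standard library's `difflib`. It is not how I chose what to highlight; it is only a rough word-level comparison, and the helper name `word_diff` is my own invention for illustration. It splits on whitespace, so punctuation attached to a word counts as part of that word.

```python
import difflib


def word_diff(a: str, b: str) -> list[tuple[str, str, str]]:
    """Return (tag, words_from_a, words_from_b) for every span where a and b differ.

    `tag` is one of difflib's opcode tags: "replace", "delete", or "insert".
    """
    a_words, b_words = a.split(), b.split()
    # autojunk=False disables difflib's popularity heuristic, which only
    # matters for long inputs but keeps the behaviour predictable.
    matcher = difflib.SequenceMatcher(a=a_words, b=b_words, autojunk=False)
    diffs = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            diffs.append((tag, " ".join(a_words[i1:i2]), " ".join(b_words[j1:j2])))
    return diffs


# Opening clause of the second sentence, as rendered by Dicta and by Claude 3.
dicta = "There is no herb that does not have its angel"
claude = "There is no blade of grass that does not have an angel"

for tag, left, right in word_diff(dicta, claude):
    print(f"{tag}: {left!r} -> {right!r}")
```

A diff like this only shows where the wording diverges. Judging which rendering is better (for example, "beggars" versus "collectors" for קבצנים) still requires reading the Hebrew.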
ChatGPT-4’s translation of Aruch HaShulchan
Josh Broyde has just released an AI translation of the Aruch HaShulchan (whose publication began in 1884):
Aruch_Hashulchan/Aruch_Hashulchan_AI_Translation_Orach_Chaim.pdf
My thoughts:
This is excellent work. Over the past year, I've also tested the top models for translating a wide range of rabbinic and literary Hebrew texts, and I agree that ChatGPT-4’s translation is quite good: significantly better and more consistent than the community translations on Wikisource and Sefaria.
One minor point: I prefer a line-by-line format for translations, as I find it easier to follow. That's probably just a quirk of mine, though. My ideal would be far more customizable formatting options than even Sefaria currently offers.
For an example of OCR and AI translation from German, with many adjustments, see my piece at my Academia page (requires registration): “Krauss's Rules of Transcription of Greek Loanwords in Classical Rabbinic Literature - Consonants - b; g; c; d; g; z; th (=θ, theta); k; l; m; n; x (=ξ, xi)”.
Avi Shmidman responded to me at the time:
“our primary advantage at this point is quantitative - we can run the translation model for anyone on unlimited amounts of text (e.g., we can run the entire Igros Moshe through it in an hour, at no cost). In contrast, if I understand correctly, running millions of words through GPT4 or Claude would be [expensive]”.
This may be less true now with OpenAI’s “mini” model, which is far cheaper than the flagship model.
Of course, it's remarkable work. Developing LLMs is quite challenging, even with the significant recent advances in LLMs and NLP. Plus, they're offering it as open access.
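To make the cost point concrete, here is a back-of-the-envelope sketch in Python. Every number in it is a placeholder assumption, not a real price or a measured tokenization rate. The function name `estimate_api_cost`, the tokens-per-word ratios, and the per-million-token prices are all hypothetical, and you would need to substitute the provider's current published rates and your own token counts.

```python
def estimate_api_cost(
    words: int,
    input_tokens_per_word: float,
    output_tokens_per_word: float,
    usd_per_million_input_tokens: float,
    usd_per_million_output_tokens: float,
) -> float:
    """Rough cost in USD of translating `words` source words through a paid API.

    Assumes the translation has about as many words as the source, and that
    cost is billed per token separately for input and output.
    """
    input_tokens = words * input_tokens_per_word
    output_tokens = words * output_tokens_per_word
    return (
        input_tokens / 1_000_000 * usd_per_million_input_tokens
        + output_tokens / 1_000_000 * usd_per_million_output_tokens
    )


# PLACEHOLDER values only: one million Hebrew words, with made-up
# tokenization ratios and made-up prices.
cost = estimate_api_cost(
    words=1_000_000,
    input_tokens_per_word=2.0,           # hypothetical
    output_tokens_per_word=1.5,          # hypothetical
    usd_per_million_input_tokens=1.0,    # hypothetical
    usd_per_million_output_tokens=2.0,   # hypothetical
)
print(f"Estimated cost: ${cost:,.2f}")
```

The comparison that matters is the ratio between a flagship model's prices and a cheaper model's, since that ratio determines how much of Dicta's quantitative advantage remains.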
In choosing the test paragraph, I was inspired by the fact that Avi Shmidman used a paragraph in a similar style in his BERT paper, also from Project Ben-Yehuda (פרוייקט בן יהודה).