I’m considering putting in a proposal for BibleTech: 2009, but I’m not sure yet. I need to decide soon because the deadline is Monday.
If I make a presentation proposal, it would probably focus on the application of SIL’s Language Explorer to Koine Greek. I would probably focus on morphology and parsing, since that’s what I’ve done the most on, though I might move into the program’s potential for word studies and dictionary creation as well, since I’ve dabbled quite a bit with that (though that’s the dabbling that I lost when my computer crashed a few weeks ago). I’m thinking of a proposal something like:
At BibleTech: 2008, SIL’s software development team presented some of the software they are developing for linguists and translators to use in the field. The focus of their presentation was specifically on programs that related to the task of translation. Thus the Data Notebook, Graphite, Translator’s Workplace, and the Translation Editor were the focus of their presentation. Language Explorer, on the other hand, is a program developed more specifically for the linguistic work that accompanies a translation. This includes language analysis, morphology, syntax, discourse analysis, and dictionary making for the many languages that have not been studied, much less received a translation of the Bible.
But exactly because of the program’s FLExibility (FLEx = FieldWorks Language Explorer) for the description of any language, it is perfect for the analysis of Biblical languages as well. This paper seeks to show the value of FLEx for the study of Biblical languages with an eye toward lexicography and especially the program’s potential for morphological parsing of Koine Greek texts through morpheme-by-morpheme interlinearization and the development of a morpheme lexicon.
Now I’m not sure if this is the best expression of what I’m thinking. Essentially I want to present the work that I’ve done (and hopefully will continue to do between now and March 27th) with an eye toward the possibility of actually being able to automate the process (which FLEx is designed to do). My end hope is that I’ll have a rough morphology of Koine Greek built/written that would make it possible for me to parse various Greek texts that do not have morphological databases readily available – such as the Duke Papyri, the Packard Greek Epigraphy or perhaps even some of the earlier texts from Migne’s Patrologia Graeca. I would love to parse Chrysostom, though that might require some adaptation. How much the language had changed in those few hundred years between Paul and Bishop John is beyond my knowledge.
But all of that is a long way off. Right now, I’ve only finished indicative ω verbs (both regular and contract). Goals for March would probably be the completion of verbal inflectional morphology and then also at least one of the noun declensions – though I’ve been in communication with one SIL linguist who believes that he can explain Greek noun variation through phonology rather than declensions. I haven’t seen how just yet, but I’m very curious and we’re continuing to communicate.
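To give a rough idea of what parsing these verbs morpheme by morpheme involves, here is a minimal Python sketch. It is not how FLEx works internally and it is not my actual FLEx data; the function names and the ending table are made up purely for illustration. It covers only the present active indicative of regular ω verbs, strips accents before matching, and returns only the first ending that fits, whereas a real parser would have to return every possible analysis and deal with contract verbs, augments and the like.

```python
import unicodedata

# Present active indicative endings for regular ω verbs (unaccented),
# listed longest-first so that "ουσιν" is tried before "ει" or "ω".
# This table is an illustrative assumption, not exported from FLEx.
ENDINGS = [
    ("ουσιν", "3pl"),
    ("ουσι", "3pl"),
    ("ομεν", "1pl"),
    ("ετε", "2pl"),
    ("εις", "2sg"),
    ("ει", "3sg"),
    ("ω", "1sg"),
]


def strip_accents(word):
    """Remove accents and breathings by dropping combining marks."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def parse_present_indicative(word):
    """Split a form into stem + ending, or return None if nothing matches."""
    plain = strip_accents(word.lower())
    for ending, person_number in ENDINGS:
        # Require at least one character of stem before the ending.
        if plain.endswith(ending) and len(plain) > len(ending):
            return {
                "stem": plain[: -len(ending)],
                "ending": ending,
                "parse": person_number + " present active indicative",
            }
    return None


if __name__ == "__main__":
    for form in ["λύω", "λύεις", "λύει", "λύομεν", "λύετε", "λύουσιν"]:
        print(form, parse_present_indicative(form))
```

Even this toy version shows why ambiguity matters: a plain suffix match will happily "parse" words that are not present indicative verbs at all, which is where a proper lexicon of stems and affixes comes in.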
So what do you think? Should I make a proposal?
I’d love you to parse Chrysostom too!
I think you should make a proposal. Sounds good.
Well don’t hold your breath, I have a long way to go before I get that far.
Sounds a great idea! SIL does amazing linguistic work on target languages, and needs to match it with the source languages. But I guess you will have quite a lot of ambiguities to deal with. And don’t go down the rabbit trail which a previous SIL person went down of trying to turn parsing of Greek into an automatic translation program – it didn’t work!
> make it possible for me to parse various Greek texts that do not have morphological databases readily available – such as the Duke Papyri, the Packard Greek Epigraphy…
I’d suggest that you not begin with papyri and epigraphical materials. I think you’d find too many lacunae and irregular forms to implement a “first draft” parser. The papyri are notorious for containing non-standard forms and just plain lots of spelling mistakes. Auto parsing is always challenging and subject to a LOT of manual revision, but if you want to experiment with that, tackle an edited text first–one that’s closer to a formal or literary genre rather than the scribbles of some of the papyri.
Peter: Are you serious, someone tried to do that? I don’t think you need to worry. I won’t head down that path.
Dr. Decker: Any possibility of looking at the papyri would be an incredibly long way off that, right now, is no more than a distant desire. And now that you’ve reminded me of the issue of spelling, it might be less than that.
Here is part of what I wrote about this project in 2002, for a lesson at SIL ETP and following a discussion of Babelfish:
The BibleTrans project website is no longer available.
So this project was not based on an automatic parse of the Greek, but I think that was its starting point.
But I can think of one target language (apart from modern Greek) for which an automated adaptation of the Greek just might produce a viable first draft translation: Russian! Greek and Russian have so much in common syntactically that an adaptation might at least be comprehensible. But it would need to be done with a highly complex analysis and synthesis tool, working not just at the word level and able to do some reordering.
Peter, there’s a group working on writing parallel grammars of various languages:
http://www2.parc.com/isl/groups/nltt/pargram/
In theory, because LFG separates grammatical and semantic information into an abstract non-syntactic representation, it would be possible to transfer meaning more easily from one language to the other.
Well, Mike, I will believe that even the most sophisticated machine translation is useful for Bible translation when I see programs like BabelFish able to translate even simple Bible texts meaningfully from any one language to any other (which is not very closely related). I did a test of this in 2002 and was appalled at how bad the results were even from German to English. Here (repeated today) is its version of Psalm 23, showing a complete failure to parse the German:
Using this program from the following German:
Peter: I agree. That’s terrible. This is the only decent sentence:
“You prepare a table in the face of my enemies before me.”
I looked around and Babel uses something called “Systran,” but I cannot find any description of the theoretical basis behind the program…
I think I’ll be sticking with simply computer aided language analysis.
I tried the same German text in Google Translate, and it does somewhat better, at least putting the “not” in the right place:
Google also offers quite an impressive selection of languages. But it doesn’t do too well with Ephesians, presumably because it is expecting modern Greek. And this is what it did with the original Hebrew of Psalm 23:
!!
I’m surprised by much of Google’s translation from German to English though. Several parts of that are actually reasonable.