Successful Automated Parsing

This weekend, I decided to run the automated parser with what little analysis I have: the productive noun morphology and about a third of the productive adjective morphology. Hopefully I’ll have adjectives done today and can move on either to pronouns or to the huge task of verbs.

Anyway, since I didn’t have much analysis to start with, I wasn’t sure if anything would happen when it ran, and at the time I didn’t see any change. So I assumed that either 1) the parser is designed to work only with the International Phonetic Alphabet or 2) because I have so little analysis done, there wasn’t anything for it to parse.

Well, both were wrong. When I pulled up the program this morning and was going along analyzing adjectives, I saw this:

Now, the blue boxes are simply words/forms that are identical to ones I’ve already parsed. That is, since I’ve done the Accusative Masculine Singular form of δίκαιος, it suggests that these should be parsed the same way. That in and of itself is convenient: once I have a word analyzed, the program suggests that analysis for every other occurrence of the word, which basically means that every instance of the article, when I’ve gotten to it, will be taken care of instantly. Anyway, because of the masculine/neuter neutralization, the parser is, of course, wrong here. But that’s not what’s exciting. Well, it’s still exciting up to this point; I wasn’t sure if I could make it work at all.

What’s exciting is the orangish-pink box (salmon?). That color marks the analysis of the automated parser. When I ran it, I had *δικαί in my lexicon as an alternate form of *δίκαι, and I also had the morpheme –ου “gen.m.sg” in my lexicon. But I had never actually analyzed or parsed the form. The program did it itself. Granted, it did it wrong, since the form should actually be neuter, not masculine, in gender. But even so, it did it, and that means that I can make it work, though it will take a good deal more effort before it’s ready.
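For anyone curious about the mechanics, here is a minimal Python sketch of the kind of stem-plus-suffix matching described above. It is only an illustration, not how the program itself is implemented: the `STEMS` and `SUFFIXES` tables, the `nfc` helper, and the `parse` function are all hypothetical names, and the lexicon holds just the two stem forms and the one suffix mentioned in this post.

```python
import unicodedata


def nfc(text):
    """Normalize Greek text so accented vowels compare equal however they were typed."""
    return unicodedata.normalize("NFC", text)


# Hypothetical mini-lexicon: each stem form (primary or alternate) maps to its lexeme.
STEMS = {
    nfc("δίκαι"): nfc("δίκαι"),  # primary form
    nfc("δικαί"): nfc("δίκαι"),  # alternate form with shifted stress
}

# Hypothetical suffix table: each suffix maps to the glosses entered for it so far.
SUFFIXES = {
    nfc("ου"): ["gen.m.sg"],
}


def parse(word):
    """Return every (lexeme, suffix, gloss) analysis the lexicon licenses for word."""
    word = nfc(word)
    analyses = []
    # Try every split point; keep splits where both halves are in the lexicon.
    for i in range(1, len(word)):
        stem, suffix = word[:i], word[i:]
        if stem in STEMS and suffix in SUFFIXES:
            for gloss in SUFFIXES[suffix]:
                analyses.append((STEMS[stem], suffix, gloss))
    return analyses


print(parse("δικαίου"))
```

Because the only gloss entered for –ου in this toy lexicon is masculine, the sketch makes the same mistake described above: it has no way to propose the neuter reading until that sense is added.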

In my excitement, I decided to put the parser to the test more directly. I completed entering the analysis for the *δίκαι type of adjective and then added the root *κόκκιν to my lexicon with its alternate stress form (*κοκκίν). The difference between *δίκαι adjectives and *κόκκιν adjectives is in the feminine (hence, below, it’s labeled Productive Ib instead of Productive II). So if the parser works, then I should be able to parse all the masculine and neuter forms. Here’s the result:

It got everything right except for the neutralization between the vocative and nominative cases, which is easy to fix. That’s simply beautiful. I have since updated the lexicon so that now, the nominative is the primary sense for the morpheme –οι.
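One way to picture the "primary sense" fix is as an ordered list of senses per morpheme, with the first entry offered as the default suggestion. The sketch below is a guess at the idea, not the program’s actual data model: `SUFFIX_SENSES` and `suggest` are made-up names, and the glosses `nom.m.pl` and `voc.m.pl` are assumed labels in the style of “gen.m.sg” above.

```python
# Hypothetical entry: senses are listed in priority order, primary sense first.
SUFFIX_SENSES = {
    "οι": ["nom.m.pl", "voc.m.pl"],
}


def suggest(suffix):
    """Return (primary gloss, remaining candidate glosses) for a suffix."""
    senses = SUFFIX_SENSES.get(suffix, [])
    if not senses:
        return None, []
    # The first sense is the default; the others stay available as alternatives.
    return senses[0], senses[1:]


primary, alternatives = suggest("οι")
print(primary, alternatives)
```

Ordering the senses this way keeps the vocative available for the cases that need it while making the more frequent nominative the first suggestion.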

The final goal, I think (read: I hope), for my presentation at BibleTech:2009 in terms of parsing will be to complete as much morphological analysis as possible, build a small lexicon of the most common words in Ephesians, and demonstrate the parser. How much can I get done before the end of March and still write the paper? Well, we’ll find out…but it’s encouraging that it actually works! Especially after the hours upon hours of work I’ve put into this particular database over the past three months and into the program over the past year. I think I’ve restarted from scratch four times out of sheer frustration. I’m finally making headway.

Looking ahead, what worries me the most is dealing with contract verbs…

5 Comments


1

nathanwells on January 20, 2009 at 10:11 pm

I love the programs SIL has made available – I used (and still use) quite a few while I was working in Cambodia (from what I see, that is what you are using).

I haven’t been following your posts, but why are you parsing Ephesians when it is already parsed by so many? Are you doing research for something specific?

2

Mike Aubrey on January 20, 2009 at 11:40 pm

It’s not so much Ephesians specifically that I’m parsing, more Hellenistic Greek in general. There are a couple of reasons, though:

1) I’m not satisfied with any of the currently available morphology schemes that have been produced – none of them treat questions regarding what words belong to what inflectional classes (declensions) and there is no Greek morphology currently available that satisfactorily deals with Aspect morphology on the verb.

2) It’s a great exercise in learning Greek. I know a whole lot more about what adjectives and nouns are in various declensions than I did before.

3) There are Greek texts that are not readily accessible with morphological analysis – not unless you have money and I don’t, so I consider developing my own as a way of solving that problem.

4) Writing a morphology requires learning the grammar, and essentially writing a grammar as well. There’s no better way to learn the grammar in depth than to analyze it yourself on such a scale that shortcuts and cop-outs aren’t possible. And developing a parser doesn’t let you take shortcuts.

So that’s it in a nutshell, though the biggest reason, I suppose, is that I wanted to know if I could do it. Thus far, I have. Hopefully that will continue.

3

James Tauber on January 21, 2009 at 9:04 am

As this is related to work I’ve been doing for many years for MorphGNT, I’d love to collaborate.

4

Mike Aubrey on January 21, 2009 at 10:57 am

James, I just e-mailed you.

5

BibleTech:2009 Sneak Peek | Logos Talk on April 11, 2011 at 7:26 am

[…] Mike Aubrey will demonstrate the functionality of SIL’s FLeX language program. Mike will illustrate the power of software for Greek studies and translation work. You can follow his preparations on his blog, ΕΝ ΕΦΕΣΩ. […]
