Retractions...

In my last commit I stated that the MST parser was, in at least some sense, "functional", and I'm afraid I have to make a small retraction. I'm following Keith Hall's 'k-best Spanning Tree Parsing' paper, and I quickly noticed what Sebastian and I both agree is probably a typo (in the edge scoring when moving from S1 to S2 in fig. 2). There was a second discrepancy between my parser's output and the final graph presented in the paper: one edge score came out as -8 in my output instead of the -7 shown in the paper. I had chalked this up to another typo, which was a bit presumptuous of me. Sebastian pointed out an alternate explanation for the mismatch, so I'll have to adjust the algorithm to correct the edge re-scoring procedure.
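
For readers unfamiliar with the re-scoring step, here is a minimal sketch of how edges entering a cycle are usually re-scored in the textbook Chu-Liu-Edmonds contraction. It is not the code from my parser and not necessarily the exact procedure in Hall's paper; the function name, the `scores` dictionary layout, and the toy numbers are all assumptions made for illustration.

```python
def rescore_edges_into_cycle(scores, cycle_heads):
    """Re-score edges that enter a cycle from outside it (hypothetical helper).

    scores:      {(head, dependent): score} for every edge in the graph
    cycle_heads: {dependent: head} for the edges that form the cycle
    """
    cycle = set(cycle_heads)
    # Total score of the cycle as it currently stands.
    cycle_score = sum(scores[(head, dep)] for dep, head in cycle_heads.items())

    rescored = {}
    for (head, dep), score in scores.items():
        if head not in cycle and dep in cycle:
            # Entering the cycle at `dep` replaces the cycle edge into `dep`,
            # so that edge's score is removed and the rest of the cycle is kept.
            broken = scores[(cycle_heads[dep], dep)]
            rescored[(head, dep)] = score + cycle_score - broken
    return rescored


# Toy graph (made-up scores): "a" and "b" form a cycle, "root" points at both.
toy_scores = {
    ("root", "a"): 5,
    ("root", "b"): 1,
    ("a", "b"): 10,
    ("b", "a"): 8,
}
toy_cycle = {"a": "b", "b": "a"}  # dependent -> head within the cycle
toy_rescored = rescore_edges_into_cycle(toy_scores, toy_cycle)
```

An off-by-one style mismatch like the one above is the kind of thing that can come from which term gets subtracted in this step, which is why a second pair of eyes on it helps.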

In better news, I have the training procedure working, and I'm using the MaxentClassifier to handle the initial edge scoring. Thanks to Edward for pointing out how to make proper use of that class. This means that, aside from the mistake mentioned above, the only thing left to do is to map the graph output into a true tree structure, and the parser will be finished. Well, that and a quick tie-in to tag the input sequence before beginning the graph construction, but I think all the difficult sections are behind me. At least, I hope so, because I'd really like to produce an exceptional MSc thesis!
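
As a rough idea of what the graph-to-tree mapping could look like, here is a minimal sketch. It assumes the parser's output can be reduced to a dictionary from each dependent's index to its head's index, with index 0 as an artificial root; that representation and the name `heads_to_tree` are assumptions for illustration, not the structure the parser actually uses.

```python
from collections import defaultdict


def heads_to_tree(heads, root=0):
    """Turn {dependent: head} into nested (node, [children]) tuples.

    Raises ValueError if the edges do not form a tree rooted at `root`.
    """
    # Invert the head map so each node knows its dependents.
    children = defaultdict(list)
    for dep, head in heads.items():
        children[head].append(dep)

    seen = set()

    def build(node):
        if node in seen:
            raise ValueError("cycle detected at node %r" % (node,))
        seen.add(node)
        return (node, [build(child) for child in sorted(children[node])])

    tree = build(root)
    # Every dependent plus the root must have been visited exactly once.
    if len(seen) != len(heads) + 1:
        raise ValueError("some nodes are not reachable from the root")
    return tree


# Toy example: word 2 is the head of words 1 and 3, and attaches to the root.
example_heads = {1: 2, 2: 0, 3: 2}
example_tree = heads_to_tree(example_heads)
```

The reachability check at the end is there because a spanning-tree decoder with a scoring bug can hand back a graph that still contains a cycle, and it is better to fail loudly than to return a partial tree.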

Also, our plan is to submit more code for public review on the 3rd or so, and we hope to have feedback by the 7th or 8th, so that there will be time to make adjustments before the early pencils-down date, Aug 11th.
Cheers!

GSoC 08 - NLTK Worklog

Tuesday, 29 July 2008

