Statistical parsers aim to automatically discover a set of language-independent relations between elements such as a Subject, a Predicate or an Object, based on their realization patterns in the data of different languages. A Subject in English, for example, is realized in syntax using word order, while in German it is realized in morphology, using word affixes. This cross-linguistic diversity in the realization of grammatical relations has dramatic effects on parsing accuracy: existing statistical parsing models demonstrate excellent performance on English, but when trained on data from other languages they often fail to yield comparable results. A research question thus emerges: what kind of models are suitable for parsing different languages?
In this talk I motivate, develop and demonstrate the application of a Relational-Realizational (RR) parsing model, which is designed to cope with cross-linguistic diversity by mapping grammatical relations to morphosyntactic realization in a non-rigid, language-independent fashion. The model is defined over a formal grammar that inter-relates function, syntax and morphology. The model parameters encode complex interactions, which, for particular languages, are estimated based on corpus statistics. I demonstrate the application of the model to parsing Hebrew and Swedish, showing significant improvement at no additional computational cost.
Reut Tsarfaty is a Post-Doctoral Researcher at the Computational Linguistics lab at Uppsala University in Sweden, focusing on technologies and evaluation methods for cross-linguistic and cross-framework statistical parsing. She received her Ph.D. and M.Sc. from the Institute for Logic, Language and Computation (ILLC) at the University of Amsterdam, and her B.Sc. from the Computer Science department at the Technion. Reut is an expert in cross-linguistic processing and is particularly interested in modeling rich morphosyntactic and morphosemantic interactions. She is a recipient of the Dutch Science Foundation's prestigious MOSAIC award and is now writing a book on "Parsing Morphologically Rich Languages (PMRL)", to be published by Morgan and Claypool in the summer of 2013.