CLT seminar: Andreas van Cranenburgh – An efficient and linguistically rich statistical parser

SEMINAR

Statistical parsers are effective but typically limited to producing projective dependencies or constituents. Linguistically rich parsers, on the other hand, recognize long-distance relations and analyze both form and function phenomena, but rely on extensive manual grammar engineering. We combine the advantages of the two by building a statistical parser that produces richer analyses.

We investigate new techniques to implement treebank-based parsers that allow for discontinuous constituents. We present two systems. One system is based on a string-rewriting Linear Context-Free Rewriting System (LCFRS) and uses a Probabilistic Discontinuous Tree Substitution Grammar (PDTSG) to improve disambiguation performance. The other system encodes the discontinuities in the labels of phrase-structure trees, allowing for efficient context-free grammar parsing.

The two systems demonstrate that tree fragments, as used in tree-substitution grammar, improve disambiguation performance while capturing non-local relations on an as-needed basis. Additionally, we present results of models that produce function tags, resulting in a more linguistically adequate model of the data.

Andreas van Cranenburgh
Institute for Logic, Language and Computation
University of Amsterdam
http://andreasvc.github.io/

Date: 2015-04-16 10:30 - 12:00

Location: L308, Lennart Torstenssonsgatan 8

Page updated: 2015-04-07 14:13
