SEMINAR

The use of lexical resources that describe the possible mappings between the syntax and semantics of natural language sentences has been a focus of a range of NLP tasks, such as semantic parsing, information extraction, natural language generation and hybrid MT. A widely used resource of this kind is Berkeley FrameNet (BFN), which separates the language-independent semantic frames and frame elements from their language-specific realization.

Following the BFN approach, framenets have also been developed for German, Swedish, Spanish, Japanese and other languages. These framenets share, to a large extent, the same interlingua (the set of frames and frame elements), but they are otherwise not unified: each framenet uses not only its own annotation format but also its own grammatical types and functions, and its own identifiers for lexical units.

This presentation shows the first results of ongoing research that aims at designing and extracting an abstract syntax of a multilingual FrameNet-based grammar, and at generating the corresponding concrete syntaxes.

The grammar is being implemented in Grammatical Framework (GF) – a formalism and a resource grammar library (RGL) for implementing parallel grammars. The GF RGL provides a shared syntactic API that is implemented for nearly 30 languages and that can be used to generalize over the grammatical types and functions of the different framenets. The FrameNet-based grammar, in turn, adds a frame-semantic abstraction layer on top of the GF RGL.
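To make the idea of a shared abstract layer with per-language realizations concrete, here is a small illustrative Python sketch (not GF code, and not the grammar described in the talk): a hypothetical frame with frame elements is verbalized by language-specific templates, mirroring the split between a shared abstract syntax and per-language concrete syntaxes.

```python
# Illustrative sketch only: a language-independent frame with frame elements,
# verbalized by per-language templates. The frame name, elements and templates
# are hypothetical examples, not taken from BFN or from the grammar in the talk.
from dataclasses import dataclass

@dataclass
class Frame:
    name: str        # e.g. "Desiring" (language-independent)
    elements: dict   # frame elements and their fillers

# Per-language linearization templates. In a real grammar the fillers would
# themselves be built by each language's own resource grammar, not shared strings.
LINEARIZERS = {
    "Eng": lambda f: f"{f.elements['Experiencer']} wants {f.elements['Event']}",
    "Swe": lambda f: f"{f.elements['Experiencer']} vill {f.elements['Event']}",
}

frame = Frame("Desiring", {"Experiencer": "the cat", "Event": "to sleep"})
print(LINEARIZERS["Eng"](frame))   # -> "the cat wants to sleep"
```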

Date: 2014-02-13 10:30 - 11:30

Location: L308, Lennart Torstenssonsgatan 8

SEMINAR

Date: 2014-01-16 10:30 - 11:30

Location: L308, Lennart Torstenssonsgatan 8

SEMINAR

Taraka Rama - GSLT PhD student in Natural Language Processing at the Department of Swedish - will defend his licentiate thesis, Vocabulary lists in computational historical linguistics.

External reviewer: Roman Yangarber, University of Helsinki

Examiner: Dimitrios Kokkinakis

Date: 2014-01-31 10:15 - 12:00

Location: L308, Lennart Torstenssonsgatan 8

SEMINAR

Jockers will open his lecture with an argument about the applicability of quantitative methods to literary studies. He'll offer his answer to the "so what" question that is frequently asked by humanists who are unaccustomed to thinking about literature as data, on the one hand, and to quantitative evidence, on the other.

After sketching the broad outlines of how quantitative data might and should be employed in literary studies, Jockers will move to a "proof of concept" derived from his own recent work charting plot structure in 50,000 narratives. In this part of the talk, Jockers will discuss how he employed tools and techniques from natural language processing, sentiment analysis, signal processing, and machine learning in order to extract and compare the plot structures of novels in a corpus of texts spanning the two-hundred-year period from 1800 to 2011.
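As a rough illustration of the kind of pipeline sketched above (emphatically not Jockers's own implementation), the following Python snippet scores sentence-level sentiment with a toy lexicon and smooths the resulting signal into a coarse "plot shape"; the lexicon, sentence splitter and window size are all assumptions made for the sake of the example.

```python
# Minimal sketch: lexicon-based sentence sentiment plus a moving-average
# smoother. The word lists below are tiny illustrative stand-ins, not the
# resources actually used in the work described above.
import re

POSITIVE = {"love", "happy", "joy", "hope", "triumph"}
NEGATIVE = {"death", "fear", "grief", "loss", "despair"}

def sentence_scores(text):
    """Score each sentence as (#positive words - #negative words)."""
    scores = []
    for sentence in re.split(r"[.!?]+", text):
        words = sentence.lower().split()
        if words:
            scores.append(sum(w in POSITIVE for w in words)
                          - sum(w in NEGATIVE for w in words))
    return scores

def smooth(scores, window=10):
    """Moving average as a crude low-pass filter over the sentiment signal."""
    if len(scores) < window:
        return scores
    return [sum(scores[i:i + window]) / window
            for i in range(len(scores) - window + 1)]

# plot_shape = smooth(sentence_scores(open("novel.txt").read()))
```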

He'll explore the six core plot archetypes revealed by the technique and how these shapes change from the 19th to the 20th century. He'll then compare the plot structures of 1,800 contemporary best sellers to the larger corpus in order to suggest that at least one element of market success is related to plot shape.

Matthew L. Jockers is Assistant Professor of English at the University of Nebraska, Faculty Fellow in the Center for Digital Research in the Humanities, and Director of the Nebraska Literary Lab. He oversees UNL’s post-baccalaureate Certificate in Digital Humanities and serves as the faculty advisor for the minor in Digital Humanities. Prior to Nebraska, Jockers was a Lecturer and Academic Technology Specialist in the Department of English at Stanford, where he co-founded the Stanford Literary Lab with Franco Moretti.

Jockers’s research is focused on computational approaches to the study of literature, especially large collections of literature. He has written articles on computational text analysis, authorship attribution, and Irish and Irish-American literature, and he has co-authored several successful amicus briefs defending the fair and transformative use of digital text. Jockers’s books include Macroanalysis: Digital Methods and Literary History (UIUC Press 2013) and Text Analysis with R for Students of Literature (forthcoming from Springer in 2014). Jockers's work has been profiled in the academic and mainstream press, including features in the New York Times, Nature, the Chronicle of Higher Education, Nautilus, Wired, New Scientist, Smithsonian, NBC News and many others.

Jockers's Gothenburg visit is partly funded by Kungliga Vitterhetsakademien.

Date: 2014-03-27 10:30 - 11:30

Location: L308, Lennart Torstenssonsgatan 8

SEMINAR

In the United States, humanists are beginning to engage with large digitized collections of texts in new ways. With techniques guided by ideas such as “distant reading”, “macroanalysis”, and “algorithmic criticism”, literary scholars are appropriating tools long used by corpus linguists and computer scientists, but putting them to use to answer different kinds of questions. This presentation takes a look at some of these techniques, including topic modeling, network analysis, sequence alignment, and geographic entity recognition, with a focus on why and how they are interesting to literary scholars.

Demonstration projects include publication networks of Modernist poetry, the “ecological imaginary” of 19th-century Sweden, and the possible influence of Darwinian thought on Danish prose.
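For readers unfamiliar with the techniques listed above, here is a minimal topic-modeling sketch using scikit-learn's LDA implementation on a toy corpus; the documents and parameter choices are purely illustrative and not drawn from the projects mentioned.

```python
# Toy topic-modeling example: count-vectorize a handful of "documents",
# fit a 2-topic LDA model, and print the top words per topic.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

documents = [
    "the ship sailed across the cold northern sea",
    "the poet published verses in a little magazine",
    "species adapt and struggle for existence in nature",
]

vectorizer = CountVectorizer(stop_words="english")
counts = vectorizer.fit_transform(documents)

lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(counts)

terms = vectorizer.get_feature_names_out()
for i, topic in enumerate(lda.components_):
    top = [terms[j] for j in topic.argsort()[-5:][::-1]]
    print(f"topic {i}: {', '.join(top)}")
```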

Date: 2014-01-23 10:30 - 11:30

Location: L308, Lennart Torstenssonsgatan 8

SEMINAR

In my 2004 DPhil thesis I explored how world knowledge can be learnt automatically from text by employing Inductive Logic Programming, a subfield of machine learning. The knowledge learnt involves cause-effect relations between individuals and companies in the financial domain (e.g. if a person A is fired by a company B, then another person C will be hired, a company D will acquire a company E, and so on). In this talk I will discuss the relevance of this work in relation to research done since by other colleagues on ontology learning and open information extraction.

Date: 2013-12-06 10:15 - 11:45

Location: T340, Olof Wijksgatan 6

SEMINAR

Crowdsourcing provides new ways of cheaply and quickly gathering large amounts of information contributed by volunteers online. This method has revolutionised the collection of linguistic judgements, in computational linguistics and elsewhere. However, to create annotated linguistic resources from crowdsourced data we face the problem of having to combine the judgements of a potentially large group of annotators.

In this talk I will present joint work with Ulle Endriss where we put forward the idea of using principles of social choice theory (which has traditionally dealt with the aggregation of the preferences of individual voters in an election) to design new methods for aggregating linguistic annotations provided by individuals into a single collective annotation.
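To make the aggregation problem concrete, here is a minimal Python sketch in which several annotators label the same items and a plain majority vote produces a single collective annotation. Majority voting is only the simplest possible aggregator; the point of the work presented is that social choice theory offers principled ways to design and analyze alternatives. The data and labels below are invented for illustration.

```python
# Baseline aggregation of crowdsourced labels: per-item majority vote.
from collections import Counter

# annotator -> {item: label}; toy data, purely illustrative
annotations = {
    "ann1": {"s1": "NOUN", "s2": "VERB"},
    "ann2": {"s1": "NOUN", "s2": "NOUN"},
    "ann3": {"s1": "VERB", "s2": "VERB"},
}

def majority_aggregate(annotations):
    """Return one collective label per item by majority vote."""
    items = {item for labels in annotations.values() for item in labels}
    collective = {}
    for item in items:
        votes = Counter(labels[item] for labels in annotations.values()
                        if item in labels)
        collective[item] = votes.most_common(1)[0][0]
    return collective

print(majority_aggregate(annotations))   # {'s1': 'NOUN', 's2': 'VERB'}
```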

Raquel Fernandez, ILLC, Amsterdam

Date: 2013-11-26 13:15 - 15:00

Location: T340, Olof Wijksgatan 6

SEMINAR

In data-driven parsing with Linear Context-Free Rewriting Systems (LCFRS), markovized grammars are obtained by annotating the binarization non-terminals during grammar binarization, as in the corresponding work on PCFG parsing. There are, however, indications that directional parsing with a non-binary LCFRS can be faster than parsing with a binary LCFRS.

Since plain (non-binary) treebank grammars do not perform well, I present a debinarization procedure with which we can obtain a non-binary LCFRS from a previously binarized one. The resulting grammar retains the markovization information. The algorithm has been implemented and successfully applied to the German NeGra treebank.
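As background for readers unfamiliar with markovization, the sketch below shows the basic idea for an ordinary CFG rule rather than an LCFRS rule: during binarization, each intermediate non-terminal is annotated with a bounded amount of horizontal context. This is a deliberately simplified stand-in for the setting of the talk, which works with LCFRS and with reversing this transformation while keeping the markovization information.

```python
def binarize(lhs, rhs, h=2):
    """Left-to-right binarization of a CFG rule; each intermediate symbol
    records the last h siblings already generated (horizontal markovization)."""
    if len(rhs) <= 2:
        return [(lhs, tuple(rhs))]
    rules, seen, current = [], [], lhs
    for sym in rhs[:-2]:
        seen.append(sym)
        new = f"@{lhs}[{','.join(seen[-h:])}]"   # markovized binarization symbol
        rules.append((current, (sym, new)))
        current = new
    rules.append((current, (rhs[-2], rhs[-1])))
    return rules

for rule in binarize("VP", ["V", "NP", "PP", "ADV"]):
    print(rule)
# ('VP', ('V', '@VP[V]'))
# ('@VP[V]', ('NP', '@VP[V,NP]'))
# ('@VP[V,NP]', ('PP', 'ADV'))
```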

Wolfgang Maier (Düsseldorf)

Date: 2013-11-19 13:15 - 14:15

Location: EDIT Room 3364, E-building, Chalmers (Johanneberg)

SEMINAR

I will start by briefly summarizing the content of my previous talk, but from a slightly different angle. I will also amend some of the statements I made then, as well as fill in some blanks where this is called for.

I will then continue to discuss:

- some ideas for a logical interface to NLP components (such as GF)
- some ideas for implementing different dialogue management strategies
- some ideas for a logic-based CLT Cloud
- some ideas for a CLT flagship application

Most of these ideas are only half-baked, but hey, the slot was empty, and I need feedback!

Date: 2013-12-19 10:30 - 11:30

Location: EDIT Room 3364, E-building, Chalmers (Johanneberg)

SEMINAR

Tweets are notoriously hard to analyze because of spelling variation and the limited context available, but tweets are about real-world events, which are often described in more detail on the websites they link to. If those websites are easier to analyze, we may project our analysis of unknown or ambiguous words from the websites to the tweets. We show how such projections can be used to improve POS and NER tagging of tweets, achieving error reductions of up to 18.4% on various datasets.
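One simple way such a projection could be realized (a sketch under assumptions, not necessarily the method presented in the talk) is to tag the linked web page with an off-the-shelf tagger, build a word-to-tag lexicon from it, and fall back to that lexicon for low-confidence predictions on the tweet. The `tweet_tagger` argument and the confidence threshold below are hypothetical placeholders.

```python
# Sketch of projecting tags from a linked web page onto a tweet.
from collections import Counter, defaultdict

def build_projection_lexicon(tagged_webpage_tokens):
    """tagged_webpage_tokens: list of (word, tag) pairs from the linked page."""
    counts = defaultdict(Counter)
    for word, tag in tagged_webpage_tokens:
        counts[word.lower()][tag] += 1
    return {w: c.most_common(1)[0][0] for w, c in counts.items()}

def tag_tweet(tweet_tokens, tweet_tagger, lexicon, confidence_threshold=0.8):
    """tweet_tagger is a placeholder baseline that returns (tag, confidence)."""
    tagged = []
    for word in tweet_tokens:
        tag, conf = tweet_tagger(word)
        if conf < confidence_threshold and word.lower() in lexicon:
            tag = lexicon[word.lower()]      # project the web-page tag
        tagged.append((word, tag))
    return tagged
```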

Barbara Plank, Center for Sprogteknologi, Copenhagen

Date: 2013-12-12 10:30 - 11:30

Location: EDIT Room 3364, E-building, Chalmers (Johanneberg)
