SEMINAR

Semantic Role Labeling (SRL) plays a key role in many text mining (TM) applications. The development of SRL systems for the biomedical domain is hampered by the lack of large domain-specific corpora. In this project we proposed a method for building corpora labeled with semantic roles for the biomedical domain. The method is based on the theory of frame semantics and uses domain knowledge provided by ontologies. Using this method, we have built a corpus for transport events that strictly follows the domain knowledge provided by the Gene Ontology. This demonstrates that ontologies, as a formal representation of domain knowledge, can guide and ease all the tasks involved in building this kind of corpus. Furthermore, ontological domain knowledge leads to well-defined semantics exposed in the corpus, which will be valuable in TM applications. Using the corpus, we have experimented with a word-chunking approach for identifying the semantic roles of biomedical predicates describing transport events. We trained a first-order conditional random field for chunking with traditional role labeling features as well as domain-specific features. The results show that system performance varies between roles and that introducing domain-specific features did not improve performance for all roles.

He Tan is a postdoctoral fellow (forskarassistent) at the Division for Databases and Information Techniques (ADIT) and a member of the Laboratory for Intelligent Information Systems, Linköping University.

Date: 2013-02-07 10:15 - 12:00

Location: L308, Lennart Torstenssonsgatan 8

SEMINAR

Planning event for the regular CLT seminar series

Date: 2013-01-24 10:15 - 11:15

Location: L308, Lennart Torstenssonsgatan 8

SEMINAR

The World Wide Web is the biggest global social network in human history to date. It is an important empowering instrument for information exchange and knowledge sharing. Yet two out of three people on this planet still do not have access to the Web. A truly global Web requires mechanisms that cater for multilingual and multimodal access to the Web (not only text, but also speech), as well as its coupling to other mass communication media such as mobile phones and radio. In this regard, I will discuss Web Science and Technology initiatives and experience, based on the Web projects we are currently undertaking in rural Africa, such as EU-FP7 VOICES and IPI-Foroba Blon.


Prof. Dr. J.M. (Hans) Akkermans is a full professor of Business Informatics at VU University Amsterdam (VUA). He is the founder, first Director (2007-2011) and now Chair of the various boards of The Network Institute (www.thenetworkinstitute.eu), a new interdisciplinary and multi-faculty research institute established at VU Amsterdam and supported by its Executive Board (CvB) since the start of 2008.

The institute researches the networked world in all its facets (technological, social, economic) and hosts more than 250 researchers from different disciplines, including informatics and computer science, mathematics, economics and business administration (marketing, MIS, knowledge management), linguistics and social sciences (communication science, organization science, cultural studies of management).

Akkermans holds Master's and PhD degrees (cum laude) in theoretical physics from the University of Groningen. Recently, as part of the Web Science initiatives at VUA, he initiated and now leads a new international collaboration between VUA, Tim Berners-Lee's World Wide Web Foundation and partners in Africa, called W4RA, the Web alliance for Regreening in Africa (see www.webfoundation.org, www.w4ra.org, w4ra.few.vu.nl). The aim is to expand global access to the Web and create new mechanisms for open information and knowledge sharing.

Date: 2013-01-22 13:15 - 15:00

Location: T340, Olof Wijksgatan 6

SEMINAR

High impact events, political changes and new technologies are reflected in our language and lead to constant change of terms, expressions and names. Not knowing about these changes can severely limit our possibilities to find and interpret information from the past.

In this talk we will present work undertaken in the LiWA and ARCOMEM projects on classifying and automatically finding language changes. Our classification is motivated by the need to find and interpret documents in long-term archives. We will highlight the characteristics of each class, relate the classes to one another and present application scenarios.

For two classes, namely word sense changes and named entity changes, we go into depth, propose unsupervised algorithms and present results. We showcase our algorithms on the New York Times Annotated Corpus (1986-2007) as well as The Times Archive (1785-1985) and show that the proposed algorithms are able to capture named entity changes and move towards automatic detection of word sense changes.

Nina Tahmasebi is at the L3S Research Center, Leibniz Universität Hannover.

Date: 2013-01-31 10:15 - 12:00

Location: L308, Lennart Torstenssonsgatan 8

SEMINAR

Marco Kuhlmann (Uppsala University) will speak on "The Divergence of Mildly Context-Sensitive Grammar Formalisms".

Abstract:

One of the key results in the literature on mathematical models of natural language is the weak equivalence between Tree Adjoining Grammar (TAG) and Combinatory Categorial Grammar (CCG) [1]. However, the version of CCG for which this equivalence was established differs significantly from the formalism that is being used today. In particular, old-style CCG allows the application of combinatory rules to be restricted or even banned on a per-grammar basis, while modern CCG assumes a universal set of rules, isolating all cross-linguistic variation in the lexicon. In this talk I will discuss the relevance of grammar-specific rule restrictions for the expressive power of CCG from a number of different perspectives, focusing on CCG’s ability to give a lexicalized account of word order. This discussion will culminate in a formal result stating that without grammar-specific rule restrictions, the equivalence between TAG and CCG breaks down. This raises important questions about the descriptive adequacy of grammar formalisms.

[1] K. Vijay-Shanker and David J. Weir. The Equivalence of Four Extensions of Context-Free Grammars. Mathematical Systems Theory, 27(6):511–546, 1994.


 

Date: 2013-01-17 10:15 - 12:00

Location: L308, Lennart Torstenssonsgatan 8

SEMINAR

Bonus CLT talk by Alessandro Moschitti: Shallow Semantic Models for Automatically Answering Complex Jeopardy! Questions

Date: 2012-12-07 10:15 - 12:00

Location: L307, Lennart Torstenssonsgatan 8

SEMINAR

The talk will present a small cluster of digital applications held together by a concise linguistic architecture based on typed feature structures (essentially HPSG). The applications are:

  1. two computational grammars based on the LKB platform, one for Norwegian (called 'Norsource', http://typecraft.org/tc2wiki/Norwegian_HPSG_grammar_NorSource ) and one being in a sense 'universal', called 'TypeGram';
  2. an e-learning tool based on Norsource, viz. a 'grammar sparrer' (http://typecraft.org/tc2wiki/A_Norwegian_Grammar_Sparrer);
  3. a demo of a valence database for Norwegian, derived from Norsource (http://regdili.idi.ntnu.no:8080/vpbwebdemo/parse);
  4. a demo of 'Grammar induction from flat annotation', based on the annotation tool TypeCraft (http://typecraft.org/tc2wiki/Main_Page) and the 'universal' grammar mentioned.

The talk will describe the applications and the linguistic architecture behind them, in particular its encoding of valence information and situation types. The talk will then present some ideas for extending these applications, for instance in the direction of a cross-linguistic European valence repository.

Date: 2012-12-06 10:15 - 12:00

Location: L308, Lennart Torstenssonsgatan 8

SEMINAR

1. A best-first anagram hashing filter for approximate string matching with generalized edit distance, by Malin Ahlberg and Gerlof Bouma

2. Same grammar, diverging lexicons: the case of Hindi/Urdu, by K. V. S. Prasad and Shafqat Virk

3. Grammatical Framework: Formalizing the Grammars of the World, by Aarne Ranta (presented by K. V. S. Prasad)

Date: 2012-11-29 10:15 - 12:00

Location: L308, Lennart Torstenssonsgatan 8

SEMINAR

The INESS project is building an infrastructure for treebanks: a virtual laboratory for treebanking with rich functionality that can be used through an ordinary web browser. The infrastructure is geared in particular towards dynamic parsing, disambiguation and advanced search of LFG treebanks, and an important task for INESS is the construction of a large LFG treebank for Norwegian. INESS will also provide for online search and processing of dependency and phrase structure treebanks created by others. In this talk we will give a basic overview of the project and present some of our recent developments.

Victoria Rosén, University of Bergen and Uni Research
Paul Meurer, Uni Research

Date: 2012-11-22 10:15 - 12:00

Location: L308, Lennart Torstenssonsgatan 8

SEMINAR

In this seminar we will discuss the task of tracking lexical change in written Swedish. The focus will be on two related research questions: (1) how do we track the changes of a lexical unit in Swedish texts during a specified time period; and (2) how do we detect interesting lexical changes in Swedish texts, again during a specified time period. For both questions we are interested in what may count as satisfactory evidence for statements about lexical change.

Question (1) will be related to what is known as Culturomics, and question (2) to the Swedish Language Council's work on compiling lists of new Swedish words.

 

Date: 2012-11-15 10:15 - 12:00

Location: L308, Lennart Torstenssonsgatan 8
