SEMINAR

In this talk, I will present the R language (or simply R), a dynamic, lazy, functional programming language designed in 1993 by Ross Ihaka and Robert Gentleman. R adopts the underlying evaluation model of Scheme with the syntax of S, a programming language developed by John Chambers at Bell Laboratories. R is an open-source programming language, and the flexible statistical analysis toolkit implemented in it has made R the lingua franca for doing statistics. The R package repository (CRAN) features 7861 available packages, which extend the language, and CRAN also hosts guides that group sets of R packages and functions by type of analysis, field, or methodology (e.g. Bayesian Inference, Probability Distributions, Machine Learning, Natural Language Processing). The statistical capabilities of R, together with its functional capabilities, can turn R into a rich environment for doing Type Theory. I will therefore conclude this talk by discussing possible extensions of R for A Probabilistic Rich Type Theory for Semantic Interpretation (Cooper, Dobnik, Lappin, and Larsson, 2015).
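Since R's call-by-need evaluation may be unfamiliar, here is a minimal Python sketch of the idea; the Promise class and the first function are illustrative inventions, not R internals:

```python
# A minimal sketch of R-style lazy (call-by-need) argument evaluation,
# emulated in Python with explicit thunks. All names are illustrative.

class Promise:
    """Delay a computation until its value is first needed, then cache it."""
    def __init__(self, thunk):
        self._thunk, self._forced, self._value = thunk, False, None

    def force(self):
        if not self._forced:
            self._value, self._forced = self._thunk(), True
        return self._value

def first(a, b):
    # Only the promise we actually force is evaluated, so passing an
    # expensive (or even failing) computation as `b` is harmless.
    return a.force()

print(first(Promise(lambda: 42), Promise(lambda: 1 / 0)))  # 42; 1/0 never runs
```

In R itself, function arguments are wrapped in promises like this automatically, so arguments are evaluated at most once and only if used.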

Date: 2016-02-18 15:00 - 17:00

Location: Seminar Room, Dicksonsgatan 4

SEMINAR

Writer-based and reader-based views of text-meaning are reflected by the respective questions "What is the author trying to tell me?" and "What does this text mean to me personally?" Contemporary computational linguistics, however, generally takes neither view; applications do not attempt to answer either question. Instead, a text is regarded as an object that is independent of, or detached from, its author or provenance, and as an object that has the same meaning for all readers. This is not adequate, however, for the further development of sophisticated NLP applications for intelligence gathering and question answering, let alone interactive dialog systems. I will review the history of text-meaning in computational linguistics, discuss different views of text-meaning from the perspective of the needs of computational text analysis, and then extend the analysis to include discourse as well -- in particular, the collaborative or negotiated construction of meaning and repair of misunderstanding.

Bio:

Graeme Hirst's research interests cover a range of topics in applied computational linguistics and natural language processing, including lexical semantics, the resolution of ambiguity in text, the analysis of authors' styles in literature and other text (including plagiarism detection and the detection of online sexual predators), identifying markers of Alzheimer's disease in language, and the automatic analysis of arguments and discourse (especially in political and parliamentary texts).

Hirst is the editor of the Synthesis series of books on Human Language Technologies, published by Morgan & Claypool. He is the author of two monographs: Anaphora in Natural Language Understanding and Semantic Interpretation and the Resolution of Ambiguity. He is the recipient of two awards for excellence in teaching. He has supervised more than 50 theses and dissertations, four of which have been published as books. He was elected Chair of the North American Chapter of the Association for Computational Linguistics for 2004-05 and Treasurer of the Association for 2008-2017.

Date: 2016-03-16 15:00 - 17:00

Location: T219, Olof Wijksgatan 6

SEMINAR

We use propositional dynamic logic and ideas about propositional control from the agency literature to construct a simple model of how legal relations interact with actions that change the world, and with actions that change the legal relations.

This work is relevant for attempts to construct restricted fragments of natural language for legal reasoning that could be used in the creation of (more) formal versions of legal documents suitable for "legal knowledge bases".
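As a rough illustration of the distinction at play, here is a toy Python sketch (my own invention, not the formal PDL construction of the talk) separating actions that update the facts of the world from actions that update the legal relations:

```python
# A toy model separating world-changing actions from actions that change
# the legal relations (who may perform which action). Illustrative only;
# the talk develops this formally in propositional dynamic logic.

class LegalModel:
    def __init__(self):
        self.facts = set()        # propositions currently true in the world
        self.permissions = {}     # agent -> set of actions the agent may do

    def grant(self, agent, action):
        """A 'legal' action: it changes the legal relations, not the world."""
        self.permissions.setdefault(agent, set()).add(action)

    def do(self, agent, action, effect):
        """A 'world' action: allowed only if the agent holds the permission."""
        if action not in self.permissions.get(agent, set()):
            raise PermissionError(f"{agent} may not {action}")
        self.facts.add(effect)

m = LegalModel()
m.grant("alice", "sell")
m.do("alice", "sell", "house_sold")  # fine: alice holds the permission
print(m.facts)                       # {'house_sold'}
# m.do("bob", "sell", "house_sold") would raise PermissionError
```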

Jan van Eijck, CWI and ILLC, Amsterdam (http://homepages.cwi.nl/~jve/)

(joint work with Fengkui Ju, Beijing Normal University, Beijing, China)

Date: 2016-02-22 15:00 - 17:00

Location: T307, Olof Wijksgatan 6

SEMINAR

Last week, on 27 January 2016, something extraordinary happened. A team of researchers from DeepMind, the company that became famous for developing an algorithm that learns to play Atari games (and for being purchased by Google for 400 million pounds), published a paper in Nature describing a solution to what many consider a milestone towards general artificial intelligence: an algorithm that is able to beat professional players at the board game of Go, a game so complex that any brute-force approach is bound to fail miserably. In this talk I will present the results of their paper: how deep neural networks were used together with reinforcement learning and Monte-Carlo tree search to navigate a search space more than one googol (10^100) times larger than that of chess, and to win 5 out of 5 matches against the European Go champion Fan Hui.
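To make the search component concrete, here is a bare-bones sketch of the Monte-Carlo tree search loop (selection, expansion, simulation, backup). It is a generic UCT skeleton with invented names, not DeepMind's algorithm, which additionally guides these steps with deep policy and value networks:

```python
import math, random

# Bare-bones Monte-Carlo tree search (UCT variant). The caller supplies
# game-specific legal_moves, apply_move, and rollout functions; proper
# two-player reward handling is omitted for brevity.

class Node:
    def __init__(self, state, parent=None):
        self.state, self.parent = state, parent
        self.children, self.visits, self.value = [], 0, 0.0

def uct(node, c=1.4):
    # Unvisited nodes get top priority; otherwise balance the mean value
    # (exploitation) against the visit counts (exploration).
    if node.visits == 0:
        return float("inf")
    return (node.value / node.visits
            + c * math.sqrt(math.log(node.parent.visits) / node.visits))

def mcts(root, legal_moves, apply_move, rollout, n_iter=1000):
    for _ in range(n_iter):
        node = root
        # 1. Selection: descend to a leaf along the UCT-maximal path.
        while node.children:
            node = max(node.children, key=uct)
        # 2. Expansion: add a child for each legal move.
        for move in legal_moves(node.state):
            node.children.append(Node(apply_move(node.state, move), node))
        # 3. Simulation: random playout from one new child (or the leaf).
        leaf = random.choice(node.children) if node.children else node
        reward = rollout(leaf.state)
        # 4. Backup: propagate the playout result up to the root.
        while leaf is not None:
            leaf.visits += 1
            leaf.value += reward
            leaf = leaf.parent
    return max(root.children, key=lambda n: n.visits)  # most-visited move
```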

Date: 2016-02-04 10:30 - 12:00

Location: EDIT-room 3364, Chalmers Johanneberg

SEMINAR

Date: 2016-01-28 10:30 - 12:00

Location: L308, Lennart Torstenssonsgatan 8

SEMINAR

Abstract syntax is a concept in compilers and programming language semantics. It is a tree representation that abstracts away from the order and shape of lexical items. GF, Grammatical Framework, is a grammar formalism that applies abstract syntax to natural languages. Its initial purpose was to build domain-specific translation systems based on semantic interlinguas. GF has later scaled up to wide-coverage grammars, based on the GF Resource Grammar Library, which applies a shared abstract syntax to 30 languages. The Universal Dependencies (UD) initiative is a more recent, but already more widely known, approach using shared concepts: the labels and part of speech tags in dependency trees. UD trees are built by parsers trained from treebanks, whereas GF uses explicit grammar rules. Thus UD uses manual work for annotating treebanks, whereas GF uses manual work for writing grammars.

In this talk, we will present a conversion from GF abstract syntax trees to UD dependency trees. The conversion has several potential applications: (1) it makes the GF parser usable as a rule-based dependency parser; (2) it enables bootstrapping UD treebanks from GF treebanks; (3) it defines a formal way to assess the informal annotation schemes of UD; (4) it gives a method to check the consistency of manually annotated UD trees with respect to the annotation schemes; (5) it makes information from UD treebanks available for the construction and ranking of GF trees, which can be expected to improve GF applications such as machine translation. The conversion is tested and evaluated by bootstrapping a small treebank for 32 languages, as well as by comparing the GF version of the English Penn treebank with the standard UD version.
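As a toy illustration of what such a conversion involves (my own sketch, not the actual GF-to-UD algorithm or its annotation scheme), one can mark, for each abstract syntax function, which argument is the head and how the remaining arguments are labelled, and then read dependency arcs off the tree:

```python
# Toy sketch: convert head-annotated abstract syntax trees into dependency
# arcs. The real GF-to-UD conversion works on GF's abstract syntax with a
# dedicated annotation scheme; this only shows the general idea.

# For each abstract function: (index of the head argument, labels for the rest)
HEAD_CONFIG = {
    "PredVP": (1, {0: "nsubj"}),   # PredVP np vp -> vp is head, np is nsubj
    "ComplV2": (0, {1: "obj"}),    # ComplV2 v2 np -> verb is head, np is obj
}

def arcs(tree):
    """Return (head_word, dependent_word, label) triples and the head word."""
    if isinstance(tree, str):          # a leaf is its own head word
        return [], tree
    fun, args = tree
    head_ix, labels = HEAD_CONFIG[fun]
    result, heads = [], []
    for arg in args:
        sub_arcs, sub_head = arcs(arg)
        result.extend(sub_arcs)
        heads.append(sub_head)
    for ix, label in labels.items():
        result.append((heads[head_ix], heads[ix], label))
    return result, heads[head_ix]

# "John loves Mary" as PredVP John (ComplV2 loves Mary)
tree = ("PredVP", ["John", ("ComplV2", ["loves", "Mary"])])
print(arcs(tree)[0])  # [('loves', 'Mary', 'obj'), ('loves', 'John', 'nsubj')]
```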

Date: 2015-12-03 10:30 - 12:00

Location: L308, Lennart Torstenssonsgatan 8

SEMINAR

We combine supervised learning with unsupervised learning in deep neural networks. The proposed model is trained to simultaneously minimize the sum of supervised and unsupervised cost functions by backpropagation, avoiding the need for layer-wise pretraining. The model structure is an autoencoder with skip connections from the encoder to the decoder, and the learning task is similar to that in denoising autoencoders but applied to every layer. The skip connections relieve the pressure to represent details at the higher layers of the model because, through the skip connections, the decoder can recover any details discarded by the encoder. We show that the resulting model reaches state-of-the-art performance in various tasks: MNIST and CIFAR-10 classification in a semi-supervised setting, and permutation-invariant MNIST in both semi-supervised and fully labelled settings.
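The combined objective can be sketched compactly, assuming a modern autograd framework (PyTorch here); the architecture and names below are illustrative and omit the per-layer noise injection, batch normalisation, and learned denoising functions of the actual model:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

# Compressed sketch: an autoencoder with a skip connection from the encoder
# to the decoder, trained on the sum of a supervised and an unsupervised
# cost in a single backprop pass. Heavily simplified relative to the talk.

class TinyLadder(nn.Module):
    def __init__(self, d_in=784, d_hid=128, n_cls=10):
        super().__init__()
        self.enc1, self.enc2 = nn.Linear(d_in, d_hid), nn.Linear(d_hid, n_cls)
        self.dec2, self.dec1 = nn.Linear(n_cls, d_hid), nn.Linear(d_hid, d_in)

    def forward(self, x):
        h = F.relu(self.enc1(x))
        y = self.enc2(h)                        # class scores (supervised head)
        # Skip connection: the decoder sees the encoder's hidden layer, so
        # the top layers need not carry low-level detail themselves.
        x_rec = self.dec1(F.relu(self.dec2(y)) + h)
        return y, x_rec

model = TinyLadder()
x, labels = torch.randn(32, 784), torch.randint(0, 10, (32,))
y, x_rec = model(x)
loss = (F.cross_entropy(y, labels)   # supervised cost (labelled examples)
        + F.mse_loss(x_rec, x))      # unsupervised reconstruction cost
loss.backward()                      # one backward pass trains both at once
```

Because both costs flow through the same encoder, unlabelled examples (for which only the reconstruction term applies) still shape the representation used for classification.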

Date: 2015-11-20 13:30 - 15:00

Location: room 8103, EDIT Building, Hörsalsvägen 11, Chalmers Johanneberg

SEMINAR

Current successful approaches to NLP are for the most part based on supervised learning. In turn, supervised learning critically relies on the availability of annotated data. Such data is usually not plentiful, as it takes time and expertise to create. This is the problem of data sparsity. At the same time, available samples usually come from specific domains and languages, e.g., English newswire data, and thus suffer from data bias.

In this talk I will present techniques to overcome data sparsity and bias by proposing to leverage fortuitous data, i.e., data from various sources that is out there, often created as a by-product, but often neglected. I will argue that fortuitous data, combined with weakly supervised learning techniques, helps to improve language technology for tasks such as POS tagging and dependency parsing. Examples include building more robust taggers for Twitter by exploiting information from hyperlinks, and using doubly-annotated data to improve chunking and parsing. Instead of glossing over such data, it is more fruitful to embrace it during learning. Finally, I will present recent (on-going) work on exploiting cognitive processing data to improve language technology.

Date: 2015-10-23 13:15 - 15:00

Location: K332, Lennart Torstenssonsgatan 6

SEMINAR

Computational historical linguistics is a young field, and among its major challenges is the collection and preparation of suitable data resources. Here we present an approach that takes lexical data from a large collection of publicly available wordlists as input and infers automatic assessments regarding the cognacy of words and sounds. We illustrate the workflow and test it by comparing the results obtained from the computation of Maximum Likelihood trees with those provided by experts. The results show that our workflow still lags behind simpler approaches which analyze the data within a distance-based framework. However, since distance-based analyses have a black-box character, not allowing for a rigorous check of the individual decisions that lead to a certain classification proposal, we think that our experiments are an important contribution towards the establishment of more transparent methods in quantitative historical linguistics.

Joint work with Johann-Mattis List
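For contrast with the Maximum Likelihood workflow, the distance-based baseline mentioned above can be sketched in a few lines; the wordlists and transcriptions below are invented for illustration and do not come from the actual data collection:

```python
# Toy sketch of a distance-based comparison: score a language pair by the
# mean normalised edit distance over the concepts both wordlists share.

def edit_distance(a, b):
    # Standard dynamic-programming Levenshtein distance.
    d = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        d[i][0] = i
    for j in range(len(b) + 1):
        d[0][j] = j
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            d[i][j] = min(d[i - 1][j] + 1, d[i][j - 1] + 1,
                          d[i - 1][j - 1] + (a[i - 1] != b[j - 1]))
    return d[len(a)][len(b)]

def language_distance(wl1, wl2):
    """Mean normalised edit distance over the shared concept list."""
    shared = wl1.keys() & wl2.keys()
    return sum(edit_distance(wl1[c], wl2[c]) / max(len(wl1[c]), len(wl2[c]))
               for c in shared) / len(shared)

german = {"hand": "hant", "mountain": "berk", "water": "vasər"}
english = {"hand": "hænd", "mountain": "maʊntən", "water": "wɔtər"}
print(round(language_distance(german, english), 2))
```

Such a score feeds a distance matrix for tree building, but, as noted above, it offers no record of the individual cognacy decisions behind any single number.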

Date: 2015-11-13 10:30 - 12:00

Location: K333, Lennart Torstenssonsgatan 6

SEMINAR

This thesis identifies a problem with how computer-assisted translation (CAT) tools are developed: their final use as an aid to translators is often not fully considered and is left to others to evaluate. A new methodology is proposed that allows a CAT tool to be evaluated both intrinsically and extrinsically, using methods that show the tool's effect on the whole translation process. Special emphasis is placed on prototyping as a resource-effective way to create tools and gain critical feedback before a full implementation. To evaluate the methodology's usefulness, a tool called StyleCheck is developed and evaluated with it. StyleCheck detects when a style guide rule is applied and gives the translator a hint when it is not. Results show that StyleCheck is more effective at getting a style guide applied than translating from scratch or post-editing, although more work on the user interface is required. The methodology proves to be good at coming up with CAT tool improvements, quickly prototyping them, and evaluating them.
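In miniature, such a rule check might look as follows; the rules and hint messages are invented for illustration, and the thesis's actual implementation may differ substantially:

```python
import re

# Miniature sketch of a style-guide checker: each rule is a pattern that,
# when matched in a translated segment, triggers a hint to the translator.

RULES = [
    (re.compile(r"\bcan not\b"), "Style guide: write 'cannot' as one word."),
    (re.compile(r"\d+\s?%"), "Style guide: spell out 'percent' in prose."),
]

def check(segment):
    """Return the hints triggered by a translated segment."""
    return [msg for pattern, msg in RULES if pattern.search(segment)]

for hint in check("Sales can not fall by more than 5 %."):
    print(hint)
```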

Opponent: Anne Schumacher
Examiner: Simon Dobnik

Date: 2015-09-28 10:30 - 12:00

Location: Room 112, Dicksonsgatan 4
