Speakers do not always share identical semantic representations or identical lexicons. Despite these differences, however, dialogue participants manage to communicate successfully most of the time.
In this talk I will present ongoing joint work with Bert Baumgaertner (UC Davis) and Matthew Stone (Rutgers) on the development of agents that can implicitly coordinate with their partners in referential tasks, taking colour terms as a case study. I will describe our algorithms for the generation and resolution of colour descriptions and report results of experiments on how humans use colour terms for reference, both in production and in comprehension.
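To make the referential task concrete, here is a minimal sketch of prototype-based generation and resolution of colour terms. The prototype values, the `resolve`/`generate` names, and the use of Euclidean distance in RGB space are all illustrative assumptions, not the model described in the talk.

```python
import math

# Hypothetical colour-term prototypes (RGB in [0, 255]); an
# assumption for illustration, not the talk's actual semantics.
PROTOTYPES = {
    "red": (220, 40, 40),
    "orange": (240, 140, 30),
    "blue": (40, 70, 210),
}

def resolve(term, candidates):
    """Resolution: pick the candidate object whose colour lies
    closest to the term's prototype (Euclidean distance in RGB)."""
    proto = PROTOTYPES[term]
    return min(candidates, key=lambda c: math.dist(proto, c["rgb"]))

def generate(rgb):
    """Generation: pick the colour term whose prototype lies
    closest to the target object's colour."""
    return min(PROTOTYPES, key=lambda t: math.dist(PROTOTYPES[t], rgb))
```

For example, `resolve("red", objects)` returns the reddest candidate, and `generate((245, 135, 25))` returns `"orange"`. Real speakers are far less deterministic, which is precisely what the production and comprehension experiments probe.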
Most machine learning methods we use in NLP are designed for learning from unbiased labeled data. In practice, however, we always learn from biased data. When we train a parser on a treebank, for example, and apply it to emails, legal text, or university websites, our training data is biased in terms of genre, style, recency, and possibly dialect. In this talk we present learning algorithms for automatically correcting bias, or, equivalently, algorithms for learning under sample bias.
The first part of the talk focuses on large-margin perceptron learning algorithms for learning from weighted data. We discuss sampling vs. weighting and different weight functions. In the second part of the talk we consider the more challenging scenario where the target data cannot be assumed to form a single, coherent distribution, but where instead we need to adapt our model to every new data point on the fly.
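As an illustration of perceptron learning from weighted data, the sketch below scales each mistake-driven update by the example's importance weight (e.g. an estimated density ratio between target and source distributions). The interface and the plain-perceptron update rule are assumptions for exposition; the talk's large-margin variants and weight functions may differ.

```python
import random

def weighted_perceptron(data, weights, epochs=10, seed=0):
    """Perceptron trained on importance-weighted examples.

    data:    list of (features, label) pairs, label in {-1, +1}
    weights: per-example importance weights, e.g. estimates of
             p_target(x) / p_source(x) (estimation not shown here)
    """
    rng = random.Random(seed)
    dim = len(data[0][0])
    w = [0.0] * dim
    b = 0.0
    idx = list(range(len(data)))
    for _ in range(epochs):
        rng.shuffle(idx)
        for i in idx:
            x, y = data[i]
            score = sum(wj * xj for wj, xj in zip(w, x)) + b
            if y * score <= 0:
                # Mistake: scale the update by the example's weight,
                # so examples resembling the target domain count more.
                w = [wj + weights[i] * y * xj for wj, xj in zip(w, x)]
                b += weights[i] * y
    return w, b
```

The alternative discussed in the talk, sampling, would instead draw training examples with probability proportional to these weights and run the unweighted update; both approaches target the same reweighted objective.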