Proceedings of the SLTC 2012 workshop on NLP for CALL, Lund, 25th October, 2012
Conference paper - peer reviewed

Vocabulary is at the core of language teaching and learning, and access to a delimited set of words for basic communication is central to most CALL applications. Vocabulary characteristics also play a fundamental role in matching texts to specific readers. For English, the task of grading texts into different levels of difficulty has long been facilitated by the existence of word lists serving as guides for vocabulary selection. For Swedish, the situation is, with a few exceptions, less fortunate, in that no base vocabulary organized according to aspects of usage has existed. The Swedish base vocabulary, SweVoc, is an attempt to remedy this. It is a comprehensive resource, aimed at differentiating vocabulary items into categories of usage and frequency. As we are of the opinion that no corpus of written text can do full justice to general language use, we have utilized materials from a second language as a reference for delimiting the category of core words. We also believe that the task of defining a base vocabulary cannot be fully automated, and that a considerable amount of manual, traditional lexicographic work has to be invested. Hence, the present approach is not innovative but methodological: word list generation for a specific purpose, much like LSP. We anticipate that SweVoc will be integrated into CALL applications for vocabulary assessment, language teaching and student practice.


