Bootstrapping Language Description: The case of Mpiemo (Bantu A, Central African Republic)

Sourcetitle: 
Proceedings of the 6th edition of the Language Resources and Evaluation Conference (LREC 2008), 28-30 May 2008, Marrakech, Morocco
Year of publication: 
2008
PublicationType: 
Conference paper - peer reviewed

Linguists have long been producing grammatical descriptions of as yet undescribed languages. This is a time-consuming process, which has already adapted to improved technology for recording and storage. We present here a novel application of NLP techniques to bootstrap the analysis of collected data and speed up manual selection work. More precisely, we argue that unsupervised induction of morphology and part-of-speech analysis from raw text data is mature enough to produce useful results. Experiments with Latent Semantic Analysis were less fruitful. We exemplify this on Mpiemo, a so far essentially undescribed Bantu language of the Central African Republic, for which raw text data was available.

http://www.lrec-conf.org/proceedings/lrec2008/pdf/848_paper.pdf

Page updated: 2012-01-30 14:01
