A talk on two related subjects:
1. Information Extraction in the Clinical E-Science Framework
The Clinical E-Science Framework (CLEF) project built a system to extract clinically significant information from the textual component of medical records. Conventional clinical Information Extraction (IE) systems often use purpose-built software, and many involve some degree of knowledge engineering to encode clinical and linguistic knowledge. The CLEF IE system is built largely from off-the-shelf components, and involves no additional knowledge engineering. Instead, clinical knowledge is provided by human-annotated examples, which are used to learn statistical models of the text. This talk will describe the CLEF IE system, and the building of a training data set and gold standard. It will give evaluations of system performance for both entity and relation extraction, with comparisons between training sets of different sizes and types. Finally, the talk will illustrate the quantity of data that can be extracted, by describing the application of the system to a corpus of half a million clinical narratives and reports.
2. Pattern grammar based clinical information extraction: an agile process for building practical systems
When developing any IE application, both software and data must be considered. From a software engineering point of view, the last decade has seen the emergence of reusable NLP frameworks and toolkits. The task of building an NLP application for processing medical records can thus move from de novo systems development to the adaptation of these toolkits and frameworks. From a data point of view, virtually all usable NLP techniques require significant volumes of manually prepared examples. One of the main stumbling blocks to developing medical NLP applications is the lack of such example data.
This talk will describe the application of an increasingly popular software engineering technique, an agile methodology, to tackle both the adaptation of IE systems and the annotation of example texts at the same time, in a large hospital setting. Agile methodologies replace the linear requirements-design-implement approach to software engineering with early implementation and the iterative evolution of requirements. The approach taken maximises the involvement of clinician and medical researcher end-users, at little cost to their time. We believe that this has a beneficial effect on requirements gathering, and on final system quality. The talk will be illustrated with quantitative results from a working proof-of-concept application, and will discuss ongoing work to develop further applications in the same institutional setting, where we now have a successful and expanding production system.
In contrast to CLEF, the system implemented is based on hand-crafted pattern-matching grammars. The talk will discuss this difference, and make comparisons between the two approaches.
Angus Roberts on the web: http://www.dcs.shef.ac.uk/~angus/
Location: L308, Lennart Torstenssonsgatan 8