NLP @ CU Boulder

"The idea of giving computers the ability to process human language is as old as the idea of computers themselves. This vibrant interdisciplinary enterprise has many names corresponding to its many facets, names like speech and language processing, human language technology, natural language processing and computational linguistics. The goal of this exciting field is to provide scientific insights into the nature of human language and to enable human-machine communication and improve human-human communication."


-Professor Jim Martin


Daniel Jurafsky and James H. Martin, Speech and Language Processing: An Introduction to Natural Language Processing, Computational Linguistics and Speech Recognition (2nd ed.), Prentice Hall, 2009

 

The NLP Process

Training computers to deal accurately with languages is a complex process that weaves together linguistic insights and computational models that reference real-world contexts. The process can begin with linguistic analysis, computational models, or a combination of the two. Once it has begun, however, it usually cycles in the following manner.

 

An infographic describing the NLP process

 

The NLP Ecosystem

The NLP ecosystem is made up of linguists, computer scientists, and domain experts, as well as the computational linguists who link these three groups together.

 

 

An infographic about the NLP ecosystem

If this entire process seems interesting to you, why not become a computational linguist?

 

Apply Now

 

Featured Projects

Our faculty are engaged in research projects ranging from language documentation and morphological analysis to semantic analysis and biomedical informatics. We are also currently working on an autonomous conversational agent for classroom settings from junior high through college. Featured below are some of the projects we are most proud of, both past and present.

 

   Ongoing

Jan 28th

DARPA AIDA Program

Active Interpretation of Disparate Alternatives

Project leads

Martha Palmer

Susan Brown

Jim Martin

Chris Heckman

Our goal is to automatically analyze the content of written documents and extract key pieces of information about the events they describe, including where different news sources contradict each other.

Problem

We can’t possibly keep track of everything that is happening day to day: in the news, in medicine, in financial markets, on social media, and so on.

Solution

Natural Language Processing can automatically extract key events, along with who is participating in them and the order in which they happen, to help make our job of keeping on top of things much more tractable.

Techniques Used

  • Deep Learning 
  • Graph Embeddings 
  • Coreference Resolution 
  • Type Matching 
  • Entity & Event Annotation & Recognition  
  • Ontology Construction & Mapping
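
As a rough illustration of the goal of finding where sources contradict each other, the following minimal Python sketch groups already-extracted claims by event and role and flags the ones where sources disagree. The `Claim` structure, its field names, and the sample data are hypothetical and are not taken from the AIDA project. A real system would first have to produce such claims through entity and event recognition and coreference resolution.

```python
from collections import defaultdict
from typing import NamedTuple


class Claim(NamedTuple):
    """One extracted statement (hypothetical structure for illustration)."""
    source: str   # document or outlet the claim came from
    event: str    # identifier of the event, assumed already coreference-resolved
    role: str     # which aspect of the event is described
    value: str    # what the source says about that aspect


def find_contradictions(claims):
    """Return {(event, role): {value: sorted sources}} where sources disagree."""
    grouped = defaultdict(lambda: defaultdict(set))
    for c in claims:
        grouped[(c.event, c.role)][c.value].add(c.source)
    return {
        key: {value: sorted(sources) for value, sources in by_value.items()}
        for key, by_value in grouped.items()
        if len(by_value) > 1  # more than one distinct value means disagreement
    }


# Made-up sample data, not real project output.
claims = [
    Claim("outlet_a", "event_1", "location", "city_x"),
    Claim("outlet_b", "event_1", "location", "city_y"),
    Claim("outlet_a", "event_1", "date", "monday"),
    Claim("outlet_b", "event_1", "date", "monday"),
]

conflicts = find_contradictions(claims)
```

Here only the location of `event_1` is flagged, because the two sources agree on the date. Exact string comparison is a deliberate simplification: deciding whether two differently worded values really conflict is where techniques such as type matching and ontology mapping would come in.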

 

 

    

   Ongoing

Jan 28th

THYME

Temporal History of Your Medical Events    

Project leads

Martha Palmer

Jim Martin
     
Kristin Wright-Bettener

Our goal is to automatically extract the timeline of a disease and its treatment from patient records. This benefits individual patients and their doctors by providing quick, accurate summaries of a patient’s history covering several years. Moreover, aggregating timelines for large numbers of patients can aid in analyzing the effectiveness of alternative treatments and in developing new treatments, benefitting all patients.

 

Problem

Ever-increasing amounts of electronic clinical data and medical subspecialization hinder the ability of doctors and patients to stay on top of all aspects of a patient’s medical history.

Solution

Natural Language Processing can automatically process thousands of patient records in seconds. This allows automatic identification of salient diseases, signs, symptoms, and treatments, while preserving the timeline of the patient’s medical history.

Techniques Used

  • Annotation of Temporal Relations Between Events
  • Annotation and Parsing of Abstract Meaning Representations
  • Coreference Annotation and Resolution 
  • Entity & Event Annotation & Recognition
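
To make the idea of preserving a timeline concrete, here is a minimal Python sketch that turns pairwise "happened before" relations between events into a single ordering. The function name, the event labels, and the sample relations are hypothetical and do not come from the THYME project. The sketch also assumes the temporal relations have already been extracted from the records, which is the hard part in practice.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def build_timeline(before_relations):
    """Order events so that every (earlier, later) pair is respected."""
    sorter = TopologicalSorter()
    for earlier, later in before_relations:
        # add(node, *predecessors): `later` must come after `earlier`
        sorter.add(later, earlier)
    # static_order() raises graphlib.CycleError if the relations are inconsistent
    return list(sorter.static_order())


# Made-up relations for illustration, not real patient data.
relations = [
    ("symptoms_reported", "diagnosis"),
    ("diagnosis", "treatment_started"),
    ("treatment_started", "follow_up_scan"),
]

timeline = build_timeline(relations)
```

A topological sort is only one simple way to combine such relations. It handles strict "before" links but not overlapping or contained events, which clinical timelines also need to represent.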

 

 

    

   Ongoing

Jan 28th

Universal NLP

  

Project leads

Katharina Kann

Alexis Palmer

 

 

NLP is making immense contributions to the English- and Chinese-speaking worlds. Automated teaching that gives children access to education and machine translation that increases access to healthcare are just two examples. For the rest of the world to benefit from NLP, it needs to function in their languages too.

 

Problem

The majority of the world's 7,000 languages have limited data available for Natural Language Processing.

Solution

When a language does not have enough data for classical NLP methods, approaches that reuse knowledge learned from better-resourced languages and related tasks can make up for the shortfall.

Techniques Used

  • Transfer Learning
  • Pre-training
  • Multi-task Training
  • Meta Learning