Published: Jan. 24, 2017

Marjorie McShane spoke to the CU Linguistics department on February 27, 2017 about the NLP-linguistics handshake. Her abstract and biography are below.

The Strategic Incorporation of Linguistic Analysis into Modern NLP

Most current approaches to natural language processing (NLP) share several defining features:

1. breadth of corpus coverage has priority over both breadth of phenomena covered and depth of analysis

2. near-term results are preferred over longer-term R&D

3. work on individual, “silo” tasks – such as word sense disambiguation or sentiment analysis – is preferred over work toward the comprehensive analysis of text meaning

4. machine learning, usually supervised, is the methodology of choice

5. linguists are expected to contribute by annotating ever bigger corpora for ever more features.

This set of preferences has resulted in the practical gains in applications we all know so well, but the methodology seems already to be reaching a ceiling in the quality of results. More attention, therefore, needs to be paid to holistic methods that will enable language processing systems, over time, to achieve near human-level sophistication.

In this talk I will describe some recent advances in automating the deep analysis of linguistic phenomena – such as polysemy, nominal compounding, reference and ellipsis – within the theory of Ontological Semantics. I will focus on how these approaches can serve both the mainstream NLP community and the cognitive systems community, in different ways. The key to feasibility and utility in both cases is enabling the system to select the subset of instances it knows how to treat, judge its confidence in those analyses, and make those confidence metrics – along with their justifications – available to the computational systems to which they contribute.

Marjorie McShane is an Associate Professor in the Cognitive Science Department at RPI, where she co-directs the Language-Endowed Intelligent Agents Lab. She works on developing and integrating functionalities that are often treated in isolation, such as language understanding, agent reasoning, emotion modeling, and physiological simulation. Among her current foci of research are incremental semantic parsing, difficult referring expressions, and recovery from unexpected language inputs.