Computational Semantics
Our goal is richer and more accurate representations of utterances in English, Chinese, Hindi/Urdu, and Arabic. Our principal approach is to apply supervised machine learning to linguistically annotated data. There are several layers of annotation and, correspondingly, several individual NLP components, many of which are trained on a single layer. We begin by describing the end-to-end systems we are building that incorporate these components, and then describe the individual components. Next, we describe the lexical resources that inform the linguistic annotation, followed by the individual layers of annotation and the domains and genres to which they have been applied. Finally, we describe CLEARTK, the CLEAR NLP Toolkit, which is being used by some of the end-to-end systems.