After finishing my PhD, I started my Amazon career doing big data analysis with an industrial/organizational (I/O) psychology team, but quickly realized that I missed working with language, so I transferred to the Alexa Brain organization. I was the first linguist hired in the organization, and my director vaguely knew he "needed a linguist," but wasn't quite sure why... yet. Over the two years I spent in the org, I built a team of linguists and data scientists who worked to improve Alexa's conversational capabilities, including anaphora resolution, context and discourse modeling, and system routing for natural language queries. Our team supported the org's machine learning efforts and provided linguistics expertise for language modeling and language expansion. It was while building this team that I started to realize just how valuable a background in linguistics is to any sort of scaled-up machine learning process (especially those with speech and language applications, but it's true for other ML applications as well). The folks I hired for my team had very different research backgrounds: some did psycholinguistics research, some were phoneticians (like me!), and some were syntacticians. What they all had in common was the ability to see and make use of the underlying structures and patterns in language. In other words, they knew how to treat language like data! Throw in a data science skillset (i.e., data processing and cleaning, data analysis and statistics, and some basic programming skills) and presto! I was leading a team of "data linguist" superstars, without whom the work of the org would have ground to a halt. During my time at Amazon, it became increasingly obvious that the role of "data linguist" is integral to the ML process, and I saw other organizations start to build similar teams, both within Alexa and at other tech companies.
Although I've since left Amazon to raise my daughter full-time, I'm extremely proud of the team I built, and I'm looking forward to returning to the tech world to do something similar, hopefully this time in the area of speech recognition or speech-to-text, because I really miss spectrograms and IPA transcription!
