Published: Aug. 28, 2021

Over the past year and through the current quarter, our Strand 1 researchers laid the groundwork for their central challenge: advancing how machines process human language, gestures, and emotions.

Strand 1’s work centers on developing an interactive AI Partner that listens to and analyzes student conversations with the aim of facilitating problem solving. Part of this work includes ensuring the AI Partner generates appropriate Talk Moves (discussion strategies teachers use to guide classroom conversation). Three Strand 1 researchers, Ananya Ganesh, Martha Palmer, and Katharina Kann, published the results of their Talk Moves research this summer, and their findings now inform the strand’s research approach.

Jake Whitehill and his student Zeqian Li testing their speaker embedding models over Zoom.

Zeqian Li and Jake Whitehill are researching audio-visual diarization ensemble models using a Zoom-based Collaborative Problem Solving dataset. They are also developing a new clustering algorithm that can handle situations where clusters may be compositional, with potential applications to speaker diarization when multiple people speak simultaneously.
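To make the idea of compositional clusters concrete, here is a minimal sketch (not the researchers' actual algorithm): a frame of overlapped speech is assigned to the *subset* of speaker centroids whose average best matches its embedding, rather than being forced into a single cluster. The function name and the cosine-similarity scoring are illustrative assumptions.

```python
import itertools
import math

def _cos(u, v):
    """Cosine similarity between two vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def assign_compositional(frame, centroids):
    """Assign a frame embedding to the subset of speaker centroids whose
    average is most similar to it. A frame with simultaneous speech can
    thus receive a multi-speaker label such as {"A", "B"} instead of
    being forced into exactly one speaker's cluster.
    """
    best_subset, best_sim = None, -1.0
    names = sorted(centroids)
    for r in range(1, len(names) + 1):
        for subset in itertools.combinations(names, r):
            # Model an overlapped frame as the mean of the member centroids.
            mix = [sum(centroids[s][i] for s in subset) / len(subset)
                   for i in range(len(frame))]
            sim = _cos(frame, mix)
            if sim > best_sim:
                best_subset, best_sim = set(subset), sim
    return best_subset, best_sim
```

With centroids A = [1, 0] and B = [0, 1], a frame near [1, 1] matches the composite {A, B} better than either speaker alone, which is the behavior a compositional clustering would want.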

Strand 1 members also lent their expertise to help Strand 2 select the audio and visual hardware needed to begin lab testing of our AI-enabled Collaborative Learning Environments (AICL), and the team has been instrumental in guiding and curating the data needed to develop the novel algorithms behind our AI Partner.

For example, Strand 1’s co-lead, Ross Beveridge, designed a simple small-group task, named “Fibonacci Weights,” to inspire interesting dialogue and joint problem solving among students. Students are given color-coded cubes whose weights progress according to the Fibonacci sequence. They are not told the weights of the cubes or about the Fibonacci sequence, and they must work together as a team to discover and record the weight of each cube.
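As a rough illustration of the task's structure, the cube weights can be generated from the Fibonacci recurrence, where each weight is the sum of the previous two. The 10-gram unit and the starting values here are hypothetical choices; the post does not specify the actual weights used.

```python
def fibonacci_weights(n, unit=10):
    """Return weights for n cubes following the Fibonacci sequence:
    1, 1, 2, 3, 5, ... times a base unit (10 g here, chosen only
    for illustration)."""
    weights = []
    a, b = 1, 1
    for _ in range(n):
        weights.append(a * unit)
        a, b = b, a + b  # Fibonacci recurrence: next = sum of previous two
    return weights
```

For five cubes this yields 10, 10, 20, 30, and 50 units, a progression students can uncover by comparing cubes against one another.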

Photo showing the Fibonacci Weights setup.

This task is designed to capture audio and video recordings of small groups of students collaborating, providing researchers with the data needed to test and improve key elements of the AI Partner.

We can also report progress on Abstract Meaning Representation (AMR) creation and parsing during our fourth quarter. An AMR is a symbolic representation of the meaning of sentences and dialogue that the classroom agent will learn to reason over. Jeffrey Flanigan, Jon Cai, and Michael Regan are currently extending existing AMR corpora with teacher-student and student-student small-group exchanges to support automated AMR parsing. To date, the group has created, and begun experiments with, roughly 1,500 new classroom AMRs covering sensor immersion units in a physics classroom and specific teacher Talk Moves (pressing for accuracy).
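For readers unfamiliar with the formalism, an AMR is a rooted, labeled graph usually written in PENMAN notation. The example below is illustrative only (not drawn from the project's corpus), and the particular frame and role analysis is one plausible reading of the sentence.

```python
# A hand-written AMR, in PENMAN notation, for the classroom-style sentence
# "The teacher asks the student to explain the answer."
# Note the reentrancy: variable `s` (the student) is both the one asked
# (:ARG2 of ask-02) and the one doing the explaining (:ARG0 of explain-01).
amr = """
(a / ask-02
   :ARG0 (t / teacher)
   :ARG1 (e / explain-01
            :ARG0 s
            :ARG1 (n / answer))
   :ARG2 (s / student))
"""
```

Graphs like this abstract away from surface wording, so paraphrases of the same request can map to the same representation, which is what makes them useful for an agent reasoning over classroom dialogue.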

In addition, the design of the Automatic Speech Recognition (ASR) processing pipeline was finalized, and a Google Cloud ASR synchronous interface was implemented. This interface was used to process classroom data collected for microphone comparison tests.

Strand 1 researchers Joewie Koh, Shiran Dudy, and Alessandro Roncone have developed a high-level scheme for the agent that incorporates modules promoting transparency and facilitating explainable decisions. The team has explored settings where actions are chosen in accordance with multiple objectives and is looking into extending the approach to temporally based objective optimization.
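Choosing actions "in accordance with multiple objectives" can be illustrated with a simple weighted scalarization, one common approach to multi-objective selection; this sketch and its names are assumptions for illustration, not the team's actual scheme.

```python
def choose_action(actions, objective_scores, weights):
    """Pick the action maximizing a weighted sum of per-objective scores.
    `objective_scores` maps each action to a list of scores, one per
    objective (e.g., learning gain, discussion equity); `weights` sets
    the relative priority of each objective."""
    def combined(action):
        return sum(w * s for w, s in zip(weights, objective_scores[action]))
    return max(actions, key=combined)
```

Shifting the weights changes which action wins, which is exactly why such an agent can explain a decision in terms of which objective it prioritized at that moment.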