Teams, rather than individuals, are now the usual generators of scientific knowledge. How to optimize team interactions is a question pursued passionately across several disciplines. This research hypothesizes that linguistic entrainment, the convergence of linguistic properties over the course of spoken conversation, may serve as a valid and relatively easy-to-collect measure that is predictive of team success. From the perspective of developing interventions for team innovation, organizations could unobtrusively measure team effectiveness using entrainment and intervene with training to aid teams with low entrainment. Similar interventions would be useful for conversational agents that monitor and facilitate group interactions. The work could also support the development of browsers or data mining applications for corpora such as team meetings or classroom discussions.

    To date, most studies of entrainment have focused on conversational dyads rather than the multi-party conversations typical of teams. The technical objective of this research is to develop, validate, and evaluate new measures of linguistic entrainment tailored to multi-party conversations. In particular, the first research goal is to develop multi-party entrainment measures that are computable using language technologies and that are both motivated and validated by the literature on teams. The second goal is to demonstrate the utility of these measures by relating them to team processes and using them to predict team success. The supporting activities include 1) collection of an experimentally-obtained corpus in which teams converse while collaborating on a task, and in which a team-process intervention manipulates likely entrainment, 2) development of a set of entrainment measures for multi-party dialogue, 3) use of standard psychological teamwork measures to establish convergent validity and of random conversations to establish divergent validity, 4) exploration of how the team factors of gender composition and participation equality impact group entrainment, and 5) evaluation of the utility of measuring entrainment for predicting team and dialogue success. This NSF grant is in collaboration with Susannah Paletz.
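As a concrete illustration of the kind of entrainment measure envisioned (a minimal sketch, not one of the project's actual measures; the toy feature and data format are assumptions), one can ask whether the spread of some linguistic feature across speakers shrinks from the first half of a conversation to the second:

```python
# Hypothetical sketch of a simple multi-party entrainment measure:
# speakers entrain if the spread of a linguistic feature (here, mean
# word length as a stand-in for any acoustic or lexical feature)
# shrinks from the first half of a conversation to the second.
from statistics import mean, pstdev

def feature(utterances):
    """Toy feature: mean word length across a speaker's utterances."""
    words = [w for u in utterances for w in u.split()]
    return mean(len(w) for w in words)

def group_divergence(turns_by_speaker):
    """Std. dev. of the feature across speakers: lower = more similar."""
    return pstdev(feature(utts) for utts in turns_by_speaker.values())

def entrainment(first_half, second_half):
    """Positive when speakers' feature values converge over time."""
    return group_divergence(first_half) - group_divergence(second_half)
```

Any acoustic or lexical feature (speech rate, pitch, word choice) could replace the toy mean-word-length feature, and richer multi-party measures might weight speaker pairs or track convergence turn by turn.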

    How do students learn through text-based discussions in English Language Arts (ELA) classrooms? This study seeks to examine the content of student talk during ELA discussions in order to better understand how students develop their understanding of texts and reasoning skills through discussion. Our proposed study uses Natural Language Processing (NLP) to analyze two important features of students’ discussions about texts: specificity and type of evidence. This LRDC internal grant is in collaboration with Amanda Godley.
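For illustration only (a toy proxy, not the study's actual NLP analysis), specificity is sometimes approximated by counting surface cues such as numbers, quotations, and proper-noun-like tokens in a student's turn:

```python
# Toy illustration (not the project's actual NLP models): a crude
# specificity proxy that counts surface cues often associated with
# specific claims about a text: digits, quotations, and
# capitalized (proper-noun-like) words appearing mid-sentence.
import re

def specificity_score(turn):
    """Fraction of tokens that look like specificity cues."""
    tokens = turn.split()
    if not tokens:
        return 0.0
    cues = 0
    for i, tok in enumerate(tokens):
        if re.search(r"\d", tok):           # contains a digit
            cues += 1
        elif tok.startswith(('"', "'")):    # opens a quotation
            cues += 1
        elif i > 0 and tok[0].isupper():    # capitalized mid-sentence
            cues += 1
    return cues / len(tokens)
```

A turn citing a page number and a character's words scores higher than a vague reaction, which is the intuition (though not the method) behind the specificity analysis.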

    Researchers for this project will develop and validate an automated assessment of students' analytic writing skills in response to reading text. In prior work, the researchers studied an assessment of students' analytic writing to understand progress toward outcomes in the English Language Arts Common Core State Standards and to understand effective writing instruction by teachers. The researchers focused on response-to-text assessment because it is an essential skill for secondary and post-secondary success; current assessments typically examine writing outside of responding to text; and increased attention to analytic writing in schools will result in improved interventions. Recent advances in artificial intelligence offer a potential way forward: automated scoring of students' analytic writing at scale, with feedback to improve both student writing and teacher instruction. This IES grant is in collaboration with Rip Correnti and Lindsay Clare Matsumura.

    Writing and revising are essential parts of learning, yet many college students graduate without demonstrating improvement or mastery of academic writing. This project explores the feasibility of improving students' academic writing through a revision environment that integrates natural language processing methods, best practices in data visualization and user interfaces, and current pedagogical theories. The environment will support and encourage students to develop self-regulation skills that are necessary for writing and revising, including goal-setting, selection of writing strategies, and self-monitoring of progress. As a learning technology, the environment can be applied on a large scale, thereby improving the writing of diverse student populations, including English learners.

    Three stages of investigation are planned. First, to gather data on students' revision behaviors, a series of experiments is conducted to study interactions between students and variations of the revision writing environment. Second, the collected data forms the gold standard for developing an end-to-end system that automatically extracts revisions between student drafts and identifies the goal of each revision. Multiple extraction algorithms are considered, including phrasal alignment based on semantic similarity metrics and deep learning approaches. To identify the goal of a revision, a supervised classifier is trained on the gold standard; a diverse set of features and alternative representations of the identified goals (e.g., granularity, scope) are explored. In addition to this "extract-then-classify" pipeline, an alternative joint sequence labeling model is also developed: the sequence labels recognize revision goals, and label sequences are mutated to generate candidate corrections of the sentence alignments used for revision extraction. Throughout, the writing environment is iteratively refined, with interface prototypes tested in frequent user studies. Third, a complete end-to-end system that integrates the most successful component models is deployed in college-level writing classes, and student progress is tracked across multiple assignments. This NSF grant is in collaboration with Amanda Godley and Rebecca Hwa.
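The "extract-then-classify" idea can be sketched as follows (a hypothetical simplification: token-overlap Jaccard stands in for the semantic similarity metrics, and the labels here are alignment categories rather than learned revision goals):

```python
# Hypothetical sketch of revision extraction between two drafts:
# align each sentence of draft1 to its most similar sentence in
# draft2 (token Jaccard as a stand-in for semantic similarity),
# then label pairs as kept, modified, deleted, or added.

def jaccard(a, b):
    """Token-overlap similarity between two sentences."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    return len(sa & sb) / len(sa | sb) if sa | sb else 0.0

def extract_revisions(draft1, draft2, threshold=0.5):
    """Return (label, old_sentence, new_sentence) tuples."""
    revisions, used = [], set()
    for s1 in draft1:
        best = max(draft2, key=lambda s2: jaccard(s1, s2), default=None)
        if best is not None and jaccard(s1, best) >= threshold and best not in used:
            used.add(best)
            revisions.append(("keep" if s1 == best else "modify", s1, best))
        else:
            revisions.append(("delete", s1, None))    # no good match
    for s2 in draft2:
        if s2 not in used:
            revisions.append(("add", None, s2))       # new in draft2
    return revisions
```

In the actual system, each extracted revision would then be passed to the supervised classifier to identify its goal; the joint sequence labeling alternative instead predicts goals and alignments together.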