Ron Artstein

Publication abstracts

Usman Sohail, Carla Gordon, Ron Artstein, and David Traum. Character initiative in dialogue increases user engagement and rapport. In Semdial 2019 (LondonLogue): Proceedings of the 23rd Workshop on the Semantics and Pragmatics of Dialogue, pages 136–145. London, September 2019.

Abstract: Two dialogue policies to support character initiative were added to the Digital Survivor of Sexual Assault, a conversational agent designed to answer questions about sexual harassment and assault in the U.S. Army: (1) asking questions of the user, and (2) suggesting conversation topics after a period of inactivity. Participants who interacted with a system that had these initiative policies reported that they felt higher engagement and rapport with the character, compared to participants who interacted with the baseline system. There was also a positive correlation between the number of instances of character initiative in a dialogue and the level of engagement and rapport reported by participants.

PDF (397KB)
Back to Ron Artstein’s home

Olga Uryupina, Ron Artstein, Antonella Bristot, Federica Cavicchio, Francesca Delogu, Kepa J. Rodriguez, and Massimo Poesio. Annotating a broad range of anaphoric phenomena, in a variety of genres: The ARRAU corpus. Natural Language Engineering, in press.

Abstract: This paper presents the second release of ARRAU, a multigenre corpus of anaphoric information created over 10 years to provide data for the next generation of coreference/anaphora resolution systems combining different types of linguistic and world knowledge with advanced discourse modeling supporting rich linguistic annotations. The distinguishing features of ARRAU include the following: treating all NPs as markables, including non-referring NPs, and annotating their (non-) referentiality status; distinguishing between several categories of non-referentiality and annotating non-anaphoric mentions; thorough annotation of markable boundaries (minimal/maximal spans, discontinuous markables); annotating a variety of mention attributes, ranging from morphosyntactic parameters to semantic category; annotating the genericity status of mentions; annotating a wide range of anaphoric relations, including bridging relations and discourse deixis; and, finally, annotating anaphoric ambiguity. The current version of the dataset contains 350K tokens and is publicly available from LDC. In this paper, we discuss in detail all the distinguishing features of the corpus, so far only partially presented in a number of conference and workshop papers, and we also discuss the development between the first release of ARRAU in 2008 and this second one.


Simon S. Woo, Ron Artstein, Elsi Kaiser, Xiao Le, and Jelena Mirkovic. Using episodic memory for user authentication. ACM Transactions on Privacy and Security 22(2): Article 11, 2019.

Abstract: Passwords are widely used for user authentication, but they are often difficult for a user to recall, easily cracked by automated programs, and heavily reused. Security questions are also used for secondary authentication. They are more memorable than passwords, because the question serves as a hint to the user, but they are very easily guessed. We propose a new authentication mechanism, called “life-experience passwords (LEPs).” Sitting somewhere between passwords and security questions, an LEP consists of several facts about a user-chosen life event – such as a trip, a graduation, a wedding, and so on. At LEP creation, the system extracts these facts from the user’s input and transforms them into questions and answers. At authentication, the system prompts the user with questions and matches the answers with the stored ones. We show that question choice and design make LEPs much more secure than security questions and passwords, while the question-answer format promotes low password reuse and high recall.

Specifically, we find that: (1) LEPs are 10⁹–10¹⁴ times stronger than an ideal, randomized, eight-character password; (2) LEPs are up to 3 times more memorable than passwords and on par with security questions; and (3) LEPs are reused half as often as passwords. While both LEPs and security questions use personal experiences for authentication, LEPs use several questions that are closely tailored to each user. This increases LEP security against guessing attacks. In our evaluation, only 0.7% of LEPs were guessed by casual friends, and 9.5% by family members or close friends – roughly half of the security question guessing rate. On the downside, LEPs take around 5 times longer to input than passwords. So, these qualities make LEPs suitable for multi-factor authentication at high-value servers, such as financial or sensitive work servers, where stronger authentication strength is needed.


Bethany Lycan and Ron Artstein. Direct and mediated interaction with a Holocaust survivor. In Advanced Social Interaction with Agents: 8th International Workshop on Spoken Dialog Systems, edited by Maxine Eskenazi, Laurence Devillers, and Joseph Mariani, Lecture Notes in Electrical Engineering 510, pages 161–167. Springer, Cham, Switzerland, 2019.

Abstract: The New Dimensions in Testimony dialogue system was placed in two museums under two distinct conditions: docent-led group interaction, and free interaction with visitors. Analysis of the resulting conversations shows that docent-led interactions have a lower vocabulary and a higher proportion of user utterances that directly relate to the system’s subject matter, while free interaction is more personal in nature. Under docent-led interaction the system gives a higher proportion of direct appropriate responses, but overall correct system behavior is about the same in both conditions because the free interaction condition has more instances where the correct system behavior is to avoid a direct response.


Claire Bonial, Lucia Donatelli, Stephanie M. Lukin, Stephen Tratz, Ron Artstein, David Traum, and Clare R. Voss. Augmenting abstract meaning representation for human-robot dialogue. Proceedings of the First International Workshop on Designing Meaning Representations (DMR), pages 199–210. Florence, Italy, August 2019.

Abstract: We detail refinements made to Abstract Meaning Representation (AMR) that make the representation more suitable for supporting a situated dialogue system, where a human remotely controls a robot for purposes of search and rescue and reconnaissance. We propose 36 augmented AMRs that capture speech acts, tense and aspect, and spatial information. This linguistic information is vital for representing important distinctions, for example whether the robot has moved, is moving, or will move. We evaluate two existing AMR parsers for their performance on dialogue data. We also outline a model for graph-to-graph conversion, in which output from AMR parsers is converted into our refined AMRs. The design scheme presented here, though task-specific, is extendable for broad coverage of speech acts using AMR in future task-independent work.

PDF paper (960KB)

Patricia Chaffey, Ron Artstein, Kallirroi Georgila, Kimberly A. Pollard, Setareh Nasihati Gilani, David M. Krum, David Nelson, Kevin Huynh, Alesia Gainer, Seyed Hossein Alavi, Rhys Yahata, and David Traum. Developing a virtual reality wildfire simulation to analyze human communication and interaction with a robotic swarm during emergencies. LTC ’19: 9th Language and Technology Conference. Poznań, Poland, May 2019.

Abstract: Search and rescue missions involving robots face multiple challenges. The ratio of operators to robots is frequently one to one or higher, operators tasked with robots must contend with cognitive overload for long periods, and the robots themselves may be discomfiting to located survivors. To improve on the current state, we propose a swarm of robots equipped with natural language abilities and guided by a central virtual “spokesperson” able to access “plays”. The spokesperson may assist the operator with tasking the robots in their exploration of a zone, which allows the operator to maintain a safe distance. The use of multiple robots enables rescue personnel to cover a larger swath of ground, and the natural language component allows the robots to communicate with survivors located on site. This capability frees the operator to handle situations requiring personal attention, and overall can accelerate the location and assistance of survivors. To develop this system, we are creating a virtual reality simulation in which to study and analyze how humans communicate with these swarms of robots. The data collected from this experiment will inform how best to design emergency response swarm robots that can communicate effectively with the humans around them.

PDF paper (491KB)

Ron Artstein, Carla Gordon, Usman Sohail, Chirag Merchant, Andrew Jones, Julia Campbell, Matthew Trimmer, Jeffrey Bevington, COL Christopher Engen, and David Traum. Digital survivor of sexual assault. IUI ’19: Proceedings of the 24th International Conference on Intelligent User Interfaces, pages 417–425. Marina del Rey, California, March 2019.

Abstract: The Digital Survivor of Sexual Assault (DS2A) is an interface that allows a user to have a conversational experience with a survivor of sexual assault, using Artificial Intelligence technology and recorded videos. The application uses a statistical classifier to retrieve contextually appropriate pre-recorded video utterances by the survivor, together with dialogue management policies which enable users to conduct simulated conversations with the survivor about the sexual assault, its aftermath, and other pertinent topics. The content in the application has been specifically elicited to support the needs for the training of U.S. Army professionals in the Sexual Harassment/Assault Response and Prevention (SHARP) Program, and the application comes with an instructional support package. The system has been tested with approximately 200 users, and is presently being used in the SHARP Academy’s capstone course.

Video of the talk

Gale M. Lucas, Jill Boberg, David Traum, Ron Artstein, Jonathan Gratch, Alesia Gainer, Emmanuel Johnson, Anton Leuski, and Mikio Nakano. Culture, errors, and rapport-building dialogue in social agents. Proceedings of the 18th ACM International Conference on Intelligent Virtual Agents, pages 51–58. Sydney, Australia, November 2018.

Abstract: This work explores whether culture impacts the extent to which social dialogue can mitigate (or exacerbate) the loss of trust caused when agents make conversational errors. Our study uses an agent designed to persuade users to agree with its rankings on two tasks. Participants from the U.S. and Japan completed our study. We perform two manipulations: (1) The presence of conversational errors – the agent exhibited errors in the second task or not; (2) The presence of social dialogue – between the two tasks, users either engaged in a social dialogue with the agent or completed a control task. Replicating previous research, conversational errors reduce the agent’s influence. However, we found that culture matters: there was a marginally significant three-way interaction with culture, presence of social dialogue, and presence of errors. The pattern of results suggests that, for American participants, social dialogue backfired if it is followed by errors, presumably because it extends the period of good performance, creating a stronger contrast effect with the subsequent errors. However, for Japanese participants, social dialogue if anything mitigates the detrimental effect of errors; the negative effect of errors is only seen in the absence of a social dialogue. Agent design should therefore take the culture of the intended users into consideration when considering use of social dialogue to bolster agents against conversational errors.


Johnathan Mell, Gale Lucas, Jill Boberg, Ron Artstein, Jonathan Gratch, and Sharon Mozgai. Towards a repeated negotiating agent that treats people individually: Cooperation, social value orientation, & Machiavellianism. Proceedings of the 18th ACM International Conference on Intelligent Virtual Agents, pages 125–132. Sydney, Australia, November 2018.

Abstract: We present the results of a study in which humans negotiate with computerized agents employing varied tactics over a repeated number of economic ultimatum games. We report that certain agents are highly effective against particular classes of humans: several individual difference measures for the human participant are shown to be critical in determining which agents will be successful. Asking for favors works when playing with pro-social people but backfires with more selfish individuals. Further, making poor offers invites punishment from Machiavellian individuals. These factors may be learned once and applied over repeated negotiations, which means user modeling techniques that can detect these differences accurately will be more successful than those that don’t. Our work additionally shows that a significant benefit of cooperation is also present in repeated games—after sufficient interaction. These results have deep significance to agent designers who wish to design agents that are effective in negotiating with a broad swath of real human opponents. Furthermore, it demonstrates the effectiveness of techniques which can reason about negotiation over time.


Matthew Marge, Claire Bonial, Stephanie M. Lukin, Cory J. Hayes, Ashley Foots, Ron Artstein, Cassidy Henry, Kimberly A. Pollard, Carla Gordon, Felix Gervits, Anton Leuski, Susan G. Hill, Clare R. Voss, and David Traum. Balancing efficiency and coverage in human-robot dialogue collection. AAAI Fall Symposium on Interactive Learning in Artificial Intelligence for Human-Robot Interaction. Arlington, Virginia, October 2018.

Abstract: We describe a multi-phased Wizard-of-Oz approach to collecting human-robot dialogue in a collaborative search and navigation task. The data is being used to train an initial automated robot dialogue system to support collaborative exploration tasks. In the first phase, a wizard freely typed robot utterances to human participants. For the second phase, this data was used to design a GUI that includes buttons for the most common communications, and templates for communications with varying parameters. Comparison of the data gathered in these phases shows that the GUI enabled a faster pace of dialogue while still maintaining high coverage of suitable responses, enabling more efficient targeted data collection, and improvements in natural language understanding using GUI-collected data. As a promising first step towards interactive learning, this work shows that our approach enables the collection of useful training data for navigation-based HRI tasks.

PDF paper (5.5MB)

Ramesh Manuvinakurike, Jacqueline Brixey, Trung Bui, Walter Chang, Ron Artstein, and Kallirroi Georgila. DialEdit: Annotations for spoken conversational image editing. Proceedings of the Fourteenth Joint ACL-ISO Workshop on Interoperable Semantic Annotation, pages 1–9. Santa Fe, New Mexico, August 2018.

Abstract: We present a spoken dialogue corpus and annotation scheme for conversational image editing, where people edit an image interactively through spoken language instructions. Our corpus contains spoken conversations between two human participants: users requesting changes to images and experts performing these modifications in real time. Our annotation scheme consists of 26 dialogue act labels covering instructions, requests, and feedback, together with actions and entities for the content of the edit requests. The corpus supports research and development in areas such as incremental intent recognition, visual reference resolution, image-grounded dialogue modeling, dialogue state tracking, and user modeling.

PDF proceedings (9.8MB)

Kathryn E. Muessig, Kelly A. Knudtson, Karina Soni, Margo Adams Larsen, David Traum, Willa Dong, Donaldson F. Conserve, Anton Leuski, Ron Artstein, and Lisa B. Hightow-Weidman. “I didn’t tell you sooner because I didn’t know how to handle it myself”: Developing a virtual reality program to support HIV-status disclosure decisions. Digital Culture and Education 10: 22–48, 2018.

Abstract: HIV status disclosure is associated with increased social support and protective behaviors against HIV transmission. Yet disclosure poses significant challenges in the face of persistent societal stigma. Few interventions focus on decision-making, self-efficacy, and communication skills to support disclosing HIV status to an intimate partner. Virtual reality (VR) and artificial intelligence (AI) technologies offer powerful tools to address this gap. Informed by Social Cognitive Theory, we created the Tough Talks VR program for HIV-positive young men who have sex with men (YMSM) to practice status disclosure safely and confidentially. Fifty-eight YMSM (ages 18–30, 88% HIV-positive) contributed 132 disclosure dialogues to develop the prototype through focus groups, usability testing, and a technical pilot. The prototype includes three disclosure scenarios (neutral, sympathetic, and negative response) and a database of 125 virtual character utterances. Participants select a VR scenario and realistic virtual character with whom to practice. In a pilot test of the fully automated neutral response scenario, the AI system responded appropriately to 71% of participant utterances. Most pilot study participants agreed Tough Talks was easy to use (9/11) and that they would like to use the system frequently (9/11). Tough Talks demonstrates that VR can be used to practice HIV status disclosure and lessons learned from program development offer insights for the use of AI systems for other areas of health and education.

PDF (1.4MB)

Stephanie M. Lukin, Kimberly A. Pollard, Claire Bonial, Matthew Marge, Cassidy Henry, Ron Artstein, David Traum, and Clare R. Voss. Consequences and factors of stylistic differences in human-robot dialogue. Proceedings of the SIGDIAL 2018 Conference: the 19th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 110–118. Melbourne, Australia, July 2018.

Abstract: This paper identifies stylistic differences in instruction-giving observed in a corpus of human-robot dialogue. Differences in verbosity and structure (i.e., single-intent vs. multi-intent instructions) arose naturally without restrictions or prior guidance on how users should speak with the robot. Different styles were found to produce different rates of miscommunication, and correlations were found between style differences and individual user variation, trust, and interaction experience with the robot. Understanding potential consequences and factors that influence style can inform design of dialogue systems that are robust to natural variation from human users.

PDF paper (187KB)
PDF poster (1.2MB)

Ron Artstein, Jill Boberg, Alesia Gainer, Jonathan Gratch, Emmanuel Johnson, Anton Leuski, Gale Lucas, and David Traum. The Niki and Julie corpus: Collaborative multimodal dialogues between humans, robots, and virtual agents. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), pages 2928–2932. Miyazaki, Japan, May 2018.

Abstract: The Niki and Julie corpus contains more than 600 dialogues between human participants and a human-controlled robot or virtual agent, engaged in a series of collaborative item-ranking tasks designed to measure influence. Some of the dialogues contain deliberate conversational errors by the robot, designed to simulate the kinds of conversational breakdown that are typical of present-day automated agents. Data collected include audio and video recordings, the results of the ranking tasks, and questionnaire responses; some of the recordings have been transcribed and annotated for verbal and nonverbal feedback. The corpus has been used to study influence and grounding in dialogue. All the dialogues are in American English.

PDF paper (641KB)
PDF poster (1.6MB)

Jacqueline Brixey, Eli Pincus, and Ron Artstein. Chahta Anumpa: A multimodal corpus of the Choctaw language. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), pages 3371–3376. Miyazaki, Japan, May 2018.

Abstract: This paper presents a general use corpus for the Native American indigenous language Choctaw. The corpus contains audio, video, and text resources, with many texts also translated in English. The Oklahoma Choctaw and the Mississippi Choctaw variants of the language are represented in the corpus. The data set provides documentation support for the threatened language, and allows researchers and language teachers access to a diverse collection of resources.

PDF (166KB)

Ramesh Manuvinakurike, Jacqueline Brixey, Trung Bui, Walter Chang, Doo Soon Kim, Ron Artstein, and Kallirroi Georgila. Edit me: A corpus and a framework for understanding natural language image editing. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), pages 4322–4326. Miyazaki, Japan, May 2018.

Abstract: This paper introduces the task of interacting with an image editing program through natural language. We present a corpus of image edit requests which were elicited for real world images, and an annotation framework for understanding such natural language instructions and mapping them to actionable computer commands. Finally, we evaluate crowd-sourced annotation as a means of efficiently creating a sizable corpus at a reasonable cost.

PDF paper (1.2MB)
PDF poster (432KB)

David Traum, Cassidy Henry, Stephanie Lukin, Ron Artstein, Felix Gervits, Kimberly Pollard, Claire Bonial, Su Lei, Clare R. Voss, Matthew Marge, Cory J. Hayes, and Susan G. Hill. Dialogue structure annotation for multi-floor interaction. In Proceedings of the Eleventh International Conference on Language Resources and Evaluation (LREC 2018), pages 104–111. Miyazaki, Japan, May 2018.

Abstract: We present an annotation scheme for meso-level dialogue structure, specifically designed for multi-floor dialogue. The scheme includes a transaction unit that clusters utterances from multiple participants and floors into units according to realization of an initiator’s intent, and relations between individual utterances within the unit. We apply this scheme to annotate a corpus of multi-floor human-robot interaction dialogues. We examine the patterns of structure observed in these dialogues and present inter-annotator statistics and relative frequencies of types of relations and transaction units. Finally, some example applications of these annotations are introduced.

PDF (562KB)

Claire Bonial, Stephanie M. Lukin, Ashley Foots, Cassidy Henry, Matthew Marge, Kimberly A. Pollard, Ron Artstein, David Traum, and Clare R. Voss. Human-robot dialogue and collaboration in search and navigation. In Proceedings of the AREA Workshop: Annotation, Recognition, and Evaluation of Actions, Miyazaki, Japan, May 2018.

Abstract: Collaboration with a remotely located robot in tasks such as disaster relief and search and rescue can be facilitated by grounding natural language task instructions into actions executable by the robot in its current physical context. The corpus we describe here provides insight into the translation and interpretation a natural language instruction undergoes starting from verbal human intent, to understanding and processing, and ultimately, to robot execution. We use a ‘Wizard-of-Oz’ methodology to elicit the corpus data in which a participant speaks freely to instruct a robot on what to do and where to move through a remote environment to accomplish collaborative search and navigation tasks. This data offers the potential for exploring and evaluating action models by connecting natural language instructions to execution by a physical robot (controlled by a human ‘wizard’). In this paper, a description of the corpus (soon to be openly available) and examples of actions in the dialogue are provided.

PDF (312KB)

Gale M. Lucas, Jill Boberg, David Traum, Ron Artstein, Jonathan Gratch, Alesia Gainer, Emmanuel Johnson, Anton Leuski, and Mikio Nakano. Getting to know each other: The role of social dialogue in recovery from errors in social robots. In HRI’18: Proceedings of the 2018 ACM/IEEE International Conference on Human Robot Interaction, Chicago, Illinois, March 2018.

Abstract: This work explores the extent to which social dialogue can mitigate (or exacerbate) the loss of trust caused when robots make conversational errors. Our study uses a NAO robot programmed to persuade users to agree with its rankings on two tasks. We perform two manipulations: (1) The timing of conversational errors – the robot exhibited errors either in the first task, the second task, or neither; (2) The presence of social dialogue – between the two tasks, users either engaged in a social dialogue with the robot or completed a control task. We found that the timing of the errors matters: replicating previous research, conversational errors reduce the robot’s influence in the second task, but not on the first task. Social dialogue interacts with the timing of errors, acting as an intensifier: social dialogue helps the robot recover from prior errors, and actually boosts subsequent influence; but social dialogue backfires if it is followed by errors, because it extends the period of good performance, creating a stronger contrast effect with the subsequent errors. The design of social robots should therefore be more careful to avoid errors after periods of good performance than early on in a dialogue.


Claire Bonial, Matthew Marge, Ron Artstein, Felix Gervits, Cory J. Hayes, Cassidy Henry, Susan G. Hill, Anton Leuski, Pooja Moolchandani, Kimberly A. Pollard, David Traum, and Clare R. Voss. Laying down the yellow brick road: Development of a Wizard-of-Oz interface for collecting human-robot dialogue. In AAAI Fall Symposium on Natural Communication for Human-Robot Collaboration, Arlington, Virginia, November 2017.

Abstract: We describe the adaptation and refinement of a graphical user interface designed to facilitate a Wizard-of-Oz (WoZ) approach to collecting human-robot dialogue data. The data collected will be used to develop a dialogue system for robot navigation. Building on an interface previously used in the development of dialogue systems for virtual agents and video playback, we add templates with open parameters which allow the wizard to quickly produce a wide variety of utterances. Our research demonstrates that this approach to data collection is viable as an intermediate step in developing a dialogue system for physical robots in remote locations from their users – a domain in which the human and robot need to regularly verify and update a shared understanding of the physical environment. We show that our WoZ interface and the fixed set of utterances and templates therein provide for a natural pace of dialogue with good coverage of the navigation domain.

PDF (533KB)

Gale M. Lucas, Jill Boberg, David Traum, Ron Artstein, Jon Gratch, Alesia Gainer, Emmanuel Johnson, Anton Leuski, and Mikio Nakano. The role of social dialogue and errors in robots. Proceedings of the 5th International Conference on Human Agent Interaction, pages 431–433. Bielefeld, Germany, October 2017. (Poster)

Abstract: Social robots establish rapport with human users. This work explores the extent to which rapport-building can benefit (or harm) conversations with robots, and under what circumstances this occurs. For example, previous work has shown that agents that make conversational errors are less capable of influencing people than agents that do not make errors [1]. Some work has shown this effect with robots, but prior research has not considered additional factors such as the level of rapport between the person and the robot. We predicted that building rapport through a social dialogue (such as an ice-breaker) could mitigate the detrimental effect of a robot’s errors on influence. Our study used a Nao robot programmed to persuade users to agree with its rankings on two “survival tasks” (e.g., lunar survival task). We manipulated both errors and social dialogue: the robot either exhibited errors in the second survival task or not, and users either engaged in an ice-breaker with the robot between the two survival tasks or completed a control task. Replicating previous research, errors tended to reduce the robot’s influence in the second survival task. Contrary to our prediction, results revealed that the ice-breaker did not mitigate the effect of errors, and if anything, errors were more harmful after the ice-breaker (intended to build rapport) than in the control condition. This backfiring of attempted rapport-building may be due to a contrast effect, suggesting that the design of social robots should avoid introducing dialogues of incongruent quality.

PDF paper (713KB)
PDF poster (4MB)

Eugenia Hee, Ron Artstein, Su Lei, Cristian Cepeda, and David Traum. Assessing differences in multimodal grounding with embodied and disembodied agents. 5th European and 8th Nordic Symposium on Multimodal Communication. Bielefeld, Germany, October 2017.

Abstract: Establishing common ground is an essential part of any collaboration process and can be critical in the success of the desired task at hand. With the increased introduction of artificial agents into society, understanding the way that we interact with both embodied and disembodied versions of these agents becomes even more critical. While people are getting more comfortable with using machines for accessing information and providing services, it is less clear to what degree people strive for common ground with these machines and provide feedback related to their reactions to the provided information. We look at the question of how people provide grounding-related feedback when in conversation with a robot and a virtual human in a variety of tasks and modalities. We examine several different types of activities, including first-contact social dialogue, and several item-ranking tasks, in which participants can reveal their own rankings and rationales and potentially influence others. We also examine several kinds of feedback, including positive and negative signals of understanding and agreement. Finally, we examine verbal utterances and non-verbal signals for these functions. We look at whether different tasks or agent types influence the amount and modalities of different kinds of feedback behaviors. We also look at whether feedback patterns are correlated with different amounts of influence that the agents exert on humans.

PDF paper (842KB)

Anton Leuski and Ron Artstein. Lessons in dialogue system deployment. Proceedings of the SIGDIAL 2017 Conference: the 18th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 352–355. Saarbrücken, Germany, August 2017. (Demo paper)

Abstract: We analyze deployment of an interactive dialogue system in an environment where deep technical expertise might not be readily available. The initial version was created using a collection of research tools. We summarize a number of challenges with its deployment at two museums and describe a new system that simplifies the installation and user interface; reduces reliance on 3rd-party software; and provides a robust data collection mechanism.

PDF paper (393KB)
PDF poster (5.5MB)

Jacqueline Brixey, Rens Hoegen, Wei Lan, Joshua Rusow, Karan Singla, Xusen Yin, Ron Artstein, and Anton Leuski. SHIHbot: A Facebook chatbot for sexual health information on HIV/AIDS. Proceedings of the SIGDIAL 2017 Conference: the 18th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 370–373. Saarbrücken, Germany, August 2017. (Demo paper)

Abstract: We present the implementation of an autonomous chatbot, SHIHbot, deployed on Facebook, which answers a wide variety of sexual health questions on HIV/AIDS. The chatbot’s response database is compiled from professional medical and public health resources in order to provide reliable information to users. The system’s backend is NPCEditor, a response selection platform trained on linked questions and answers; to our knowledge this is the first retrieval-based chatbot deployed on a large public social network.

PDF paper (12MB)

Matthew Marge, Claire Bonial, Ashley Foots, Cory Hayes, Cassidy Henry, Kimberly A. Pollard, Ron Artstein, Clare R. Voss, and David Traum. Exploring variation of natural human commands to a robot in a collaborative navigation task. Proceedings of the First Workshop on Language Grounding for Robotics, pages 58–66. Vancouver, British Columbia, Canada, August 2017. (Poster)

Abstract: Robot-directed communication is variable, and may change based on human perception of robot capabilities. To collect training data for a dialogue system and to investigate possible communication changes over time, we developed a Wizard-of-Oz study that (a) simulates a robot’s limited understanding, and (b) collects dialogues where human participants build a progressively better mental model of the robot’s understanding. With ten participants, we collected ten hours of human-robot dialogue. We analyzed the structure of instructions that participants gave to a remote robot before it responded. Our findings show a general initial preference for including metric information (e.g., move forward 3 feet) over landmarks (e.g., move to the desk) in motion commands, but this decreased over time, suggesting changes in perception.

PDF paper (753KB)
PDF poster (684KB)

Cassidy Henry, Pooja Moolchandani, Kimberly A. Pollard, Claire Bonial, Ashley Foots, Ron Artstein, Cory Hayes, Clare R. Voss, David Traum, and Matthew Marge. Towards efficient human-robot dialogue collection: Moving Fido into the virtual world. WiNLP workshop. Vancouver, British Columbia, Canada, July 2017. (Poster)

Abstract: Our research aims to develop a natural dialogue interface between robots and humans. We describe two focused efforts to increase data collection efficiency towards this end: creation of an annotated corpus of interaction data, and a robot simulation, allowing greater flexibility in when and where we can run experiments.

PDF paper (1.4MB)
PDF poster (1.6MB)

Ron Artstein. Inter-annotator agreement. In Handbook of Linguistic Annotation, edited by Nancy Ide and James Pustejovsky, pages 297–313. Springer, Dordrecht, 2017.

Abstract: This chapter touches upon several issues in the calculation and assessment of inter-annotator agreement. It gives an introduction to the theory behind agreement coefficients and examples of their application to linguistic annotation tasks. Specific examples explore variation in annotator performance due to heterogeneous data, complex labels, item difficulty, and annotator differences, showing how global agreement coefficients may mask these sources of variation, and how detailed agreement studies can give insight into both the annotation process and the nature of the underlying data. The chapter also reviews recent work on using machine learning to exploit the variation among annotators and learn detailed models from which accurate labels can be inferred. I therefore advocate an approach where agreement studies are not used merely as a means to accept or reject a particular annotation scheme, but as a tool for exploring patterns in the data that are being annotated.

PDF preprint (293KB)
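
As a small illustration of what an agreement coefficient measures, the sketch below computes Cohen's kappa for two annotators. Cohen's kappa is chosen here only as a familiar example; the abstract does not say which coefficients the chapter covers, and the function name and sample labels are hypothetical.

```python
from collections import Counter


def cohen_kappa(labels_a, labels_b):
    """Chance-corrected agreement between two annotators (Cohen's kappa)."""
    if len(labels_a) != len(labels_b) or not labels_a:
        raise ValueError("need two non-empty label lists of equal length")
    n = len(labels_a)
    # Observed agreement: proportion of items given the same label.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Expected agreement: chance of a match given each annotator's
    # own label distribution.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        # Both annotators used a single identical label throughout.
        return 1.0
    return (observed - expected) / (1 - expected)


# Hypothetical annotations of four items by two annotators.
annotator_1 = ["y", "y", "n", "n"]
annotator_2 = ["y", "n", "n", "n"]
kappa = cohen_kappa(annotator_1, annotator_2)
```

A single global value like this is the kind of figure the abstract warns can mask variation across items, labels, and annotators.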

Bethany Lycan and Ron Artstein. Direct and mediated interaction with a Holocaust survivor. In International Workshop on Spoken Dialogue Systems Technology. Farmington, Pennsylvania, June 2017. (Short paper/poster)

Abstract: The New Dimensions in Testimony dialogue system was placed in two museums under two distinct conditions: docent-led group interaction, and free interaction with visitors. Analysis of the resulting conversations shows that docent-led interactions have a lower vocabulary and a higher proportion of user utterances that directly relate to the system’s subject matter, while free interaction is more personal in nature. Under docent-led interaction the system gives a higher proportion of direct appropriate responses, but overall correct system behavior is about the same in both conditions because the free interaction condition has more instances where the correct system behavior is to avoid a direct response.

PDF paper (79KB)
PDF poster (590KB)

Ron Artstein, David Traum, Jill Boberg, Alesia Gainer, Jonathan Gratch, Emmanuel Johnson, Anton Leuski, and Mikio Nakano. Listen to my body: Does making friends help influence people? In Proceedings of the Thirtieth International Florida Artificial Intelligence Research Society Conference, pages 430–435. Marco Island, Florida, May 2017.

Abstract: We investigate the effect of relational dialogue on creating rapport and exerting social influence in human-robot conversation, by comparing interactions with and without a relational component, and with different agent types. Human participants interact with two agents – a Nao robot and a virtual human – in four dialogue scenarios: one involving building familiarity, and three involving sharing information and persuasion in item-ranking tasks. Results show that both agents influence human decision-making; people prefer interacting with the robot, feel higher rapport with the robot, and believe the robot has more influence; and that objective influence of the agent on the person is increased by building familiarity, but is not significantly different between the agents.

PDF (1MB)

Simon S. Woo, Elsi Kaiser, Ron Artstein, and Jelena Mirkovic. Life-experience passwords (LEPs). In Annual Computer Security Applications Conference (ACSAC), pages 113–126. Los Angeles, December 2016.

Abstract: Passwords are widely used for user authentication, but they are often difficult for a user to recall, easily cracked by automated programs and heavily reused. Security questions are also used for secondary authentication. They are more memorable than passwords, but are very easily guessed. We propose a new authentication mechanism, called “life-experience passwords (LEPs),” which outperforms passwords and security questions, both at recall and at security. Each LEP consists of several facts about a user-chosen past experience, such as a trip, a graduation, a wedding, etc. At LEP creation, the system extracts these facts from the user’s input and transforms them into questions and answers. At authentication, the system prompts the user with questions and matches her answers with those stored by the system.

In this paper we propose two LEP designs, and evaluate them via user studies. We further compare LEPs to passwords, and find that: (1) LEPs are 30–47 bits stronger than an ideal, randomized, 8-character password, (2) LEPs are up to 3× more memorable, and (3) LEPs are reused half as often as passwords. While both LEPs and security questions use personal experiences for authentication, LEPs use several questions, which are closely tailored to each user. This increases LEP security against guessing attacks. In our evaluation, only 0.7% of LEPs were guessed by friends, while prior research found that friends could guess 17–25% of security questions. LEPs also contained a very small amount of sensitive or fake information. All these qualities make LEPs a promising, new authentication approach.
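
For a rough sense of scale in the bit-strength comparison above, the sketch below computes the strength of an ideal random password as length × log2(alphabet size). The 95-symbol alphabet (printable ASCII) is an assumption made for illustration, not a figure from the paper, and the function name is hypothetical.

```python
import math


def random_password_bits(length, alphabet_size):
    """Strength in bits of a uniformly random password."""
    if length < 0 or alphabet_size < 1:
        raise ValueError("length must be >= 0 and alphabet_size >= 1")
    return length * math.log2(alphabet_size)


# Assumption: 95 printable ASCII symbols; the paper's baseline may differ.
baseline_bits = random_password_bits(8, 95)
```

Under that assumption, an 8-character password comes to roughly 52.6 bits. The abstract reports LEP strength relative to such a baseline.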

Ron Artstein, David Traum, Jill Boberg, Alesia Gainer, Jonathan Gratch, Emmanuel Johnson, Anton Leuski, and Mikio Nakano. Niki and Julie: A robot and virtual human for studying multimodal social interaction. In 18th ACM International Conference on Multimodal Interaction (ICMI), pages 402–403. Tokyo, November 2016. (Demo paper)

Abstract: We demonstrate two agents, a robot and a virtual human, which can be used for studying factors that impact social influence. The agents engage in dialogue scenarios that build familiarity, share information, and attempt to influence a human participant. The scenarios are variants of the classical “survival task,” where members of a team rank the importance of a number of items (e.g., items that might help one survive a crash in the desert). These are ranked individually and then re-ranked following a team discussion, and the difference in ranking provides an objective measure of social influence. Survival tasks have been used in psychology, virtual human research, and human-robot interaction. Our agents are operated in a “Wizard-of-Oz” fashion, where a hidden human operator chooses the agents’ dialogue actions while interacting with an experiment participant.

PDF (2MB)

Albert Rizzo, Stefan Scherer, David DeVault, Jonathan Gratch, Ron Artstein, Arno Hartholt, Gale Lucas, Stacy Marsella, Fabrizio Morbini, Angela Nazarian, Giota Stratou, David Traum, Rachel Wood, Jill Boberg, and Louis-Philippe Morency. Detection and computational analysis of psychological signals using a virtual human interviewing agent. Journal of Pain Management 9(3): 311–321, 2016.

Abstract: It has long been recognized that facial expressions, body posture/gestures and vocal parameters play an important role in human communication and the implicit signalling of emotion. Recent advances in low cost computer vision and behavioral sensing technologies can now be applied to the process of making meaningful inferences as to user state when a person interacts with a computational device. Effective use of this additive information could serve to promote human interaction with virtual human (VH) agents that may enhance diagnostic assessment. This paper will focus on our current research in these areas within the DARPA-funded “Detection and Computational Analysis of Psychological Signals” project, with specific attention to the SimSensei application use case. SimSensei is a virtual human interaction platform that is able to sense and interpret real-time audiovisual behavioral signals from users interacting with the system. It is specifically designed for health care support and leverages years of virtual human research and development at USC-ICT. The platform enables an engaging face-to-face interaction where the virtual human automatically reacts to the state and inferred intent of the user through analysis of behavioral signals gleaned from facial expressions, body gestures and vocal parameters. Akin to how non-verbal behavioral signals have an impact on human to human interaction and communication, SimSensei aims to capture and infer from user non-verbal communication to improve engagement between a VH and a user. The system can also quantify and interpret sensed behavioral signals longitudinally that can be used to inform diagnostic assessment within a clinical context.

Matthew Marge, Claire Bonial, Kimberly A. Pollard, Ron Artstein, Brendan Byrne, Susan G. Hill, Clare Voss, and David Traum. Assessing agreement in human-robot dialogue strategies: A tale of two wizards. In Intelligent Virtual Agents: 16th International Conference, IVA 2016, Los Angeles, CA, USA, September 20–23, 2016 Proceedings (Lecture Notes in Artificial Intelligence 10011), pages 484–488. Springer, Heidelberg, October 2016. (Poster)

Abstract: The Wizard-of-Oz (WOz) method is a common experimental technique in virtual agent and human-robot dialogue research for eliciting natural communicative behavior from human partners when full autonomy is not yet possible. For the first phase of our research reported here, wizards play the role of dialogue manager, acting as a robot’s dialogue processing. We describe a novel step within WOz methodology that incorporates two wizards and control sessions: the wizards function much like corpus annotators, being asked to make independent judgments on how the robot should respond when receiving the same verbal commands in separate trials. We show that inter-wizard discussion after the control sessions and the resolution with a reconciled protocol for the follow-on pilot sessions successfully impacts wizard behaviors and significantly aligns their strategies. We conclude that, without control sessions, we would have been unlikely to achieve both the natural diversity of expression that comes with multiple wizards and a better protocol for modeling an automated system.

PDF preprint (462KB)
PDF poster (1.6MB)

Vasily Konovalov, Oren Melamud, Ron Artstein, and Ido Dagan. Collecting better training data using biased agent policies in negotiation dialogues. In Proceedings of WOCHAT, the Second Workshop on Chatbots and Conversational Agent Technologies. Los Angeles, September 2016.

Abstract: When naturally occurring data is characterized by a highly skewed class distribution, supervised learning often benefits from reducing this skew. Human-agent dialogue data is commonly highly skewed when using standard agent policies. Hence, we suggest that agent policies need to be reconsidered in the context of training data collection. Specifically, in this work we implemented biased agent policies that are optimized for data collection in the negotiation domain. Empirical evaluations show that our method is successful in collecting a reasonably balanced corpus in the highly skewed Job-Candidate domain. Furthermore, using this balanced corpus to train a negotiation intent classifier yields notable performance improvements relative to naturally distributed data.

PDF paper (603KB)

Satheesh Ravi and Ron Artstein. Language portability for dialogue systems: Translating a question-answering system from English into Tamil. Proceedings of the SIGDIAL 2016 Conference: the 17th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 111–116. Los Angeles, September 2016. (Short paper/poster)

Abstract: A training and test set for a dialogue system in the form of linked questions and responses is translated from English into Tamil. Accuracy of identifying an appropriate response in Tamil is 79%, compared to the English accuracy of 89%, suggesting that translation can be useful to start up a dialogue system. Machine translation of Tamil inputs into English also results in 79% accuracy. However, machine translation of the English training data into Tamil results in a drop in accuracy to 54% when tested on manually authored Tamil, indicating that there is still a large gap before machine translated dialogue systems can interact with human users.

PDF paper (172KB)
PDF poster (221KB)

Ron Artstein, Alesia Gainer, Kallirroi Georgila, Anton Leuski, Ari Shapiro, and David Traum. New Dimensions in Testimony demonstration. Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Demonstrations, pages 32–36. Association for Computational Linguistics, San Diego, California, June 2016.

Abstract: New Dimensions in Testimony is a prototype dialogue system that allows users to conduct a conversation with a real person who is not available for conversation in real time. Users talk to a persistent representation of Holocaust survivor Pinchas Gutter on a screen, while a dialogue agent selects appropriate responses to user utterances from a set of pre-recorded video statements, simulating a live conversation. The technology is similar to existing conversational agents, but to our knowledge this is the first system to portray a real person. The demonstration will show the system on a range of screens (from mobile phones to large TVs), and allow users to have individual conversations with Mr. Gutter.

PDF paper (1.7MB)
PDF poster (1.9MB)

Olga Uryupina, Ron Artstein, Antonella Bristot, Federica Cavicchio, Kepa Rodriguez, and Massimo Poesio. ARRAU: linguistically-motivated annotation of anaphoric descriptions. In Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), pages 2058–2062. Portorož, Slovenia, May 2016.

Abstract: This paper presents a second release of the ARRAU dataset: a multi-domain corpus with thorough linguistically motivated annotation of anaphora and related phenomena. Building upon the first release almost a decade ago, considerable effort has been invested in improving the data both quantitatively and qualitatively. Thus, we have doubled the corpus size, expanded the selection of covered phenomena to include referentiality and genericity, and designed and implemented a methodology for enforcing the consistency of the manual annotation. We believe that the new release of ARRAU provides valuable material for ongoing research in complex cases of coreference as well as for a variety of related tasks. The corpus is publicly available through LDC.

PDF paper (128KB)

Vasily Konovalov, Ron Artstein, Oren Melamud, and Ido Dagan. The Negochat corpus of human-agent negotiation dialogues. Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016), pages 3141–3145. Portorož, Slovenia, May 2016.

Abstract: Annotated in-domain corpora are crucial to the successful development of dialogue systems of automated agents, and in particular for developing natural language understanding (NLU) components of such systems. Unfortunately, such important resources are scarce. In this work, we introduce an annotated natural language human-agent dialogue corpus in the negotiation domain. The corpus was collected using Amazon Mechanical Turk following the ‘Wizard-Of-Oz’ approach, where a ‘wizard’ human translates the participants’ natural language utterances in real time into a semantic language. Once dialogue collection was completed, utterances were annotated with intent labels by two independent annotators, achieving high inter-annotator agreement. Our initial experiments with an SVM classifier show that automatically inferring such labels from the utterances is far from trivial. We make our corpus publicly available to serve as an aid in the development of dialogue systems for negotiation agents, and suggest that analogous corpora can be created following our methodology and using our available source code. To the best of our knowledge this is the first publicly available negotiation dialogue corpus.

PDF paper (109KB)

Ron Artstein and Kenneth Silver. Ethics for a combined human-machine dialogue agent. In Ethical and Moral Considerations in Non-Human Agents: Papers from the AAAI Spring Symposium, pages 184–189. Stanford, California, March 2016.

Abstract: We discuss philosophical and ethical issues that arise from a dialogue system intended to portray a real person, using recordings of the person together with a machine agent that selects recordings during a synchronous conversation with a user. System output may count as actions of the speaker if the speaker intends to communicate with users and the outputs represent what the speaker would have chosen to say in context; in such cases the system can justifiably be said to be holding a conversation that is offset in time. The autonomous agent may at times misrepresent the speaker’s intentions, and such failures are analogous to good-faith misunderstandings. The user may or may not need to be informed that the speaker is not organically present, depending on the application.

PDF paper (618KB)

David Traum, Andrew Jones, Kia Hays, Heather Maio, Oleg Alexander, Ron Artstein, Paul Debevec, Alesia Gainer, Kallirroi Georgila, Kathleen Haase, Karen Jungblut, Anton Leuski, Stephen Smith, and William Swartout. New Dimensions in Testimony: Digitally Preserving a Holocaust Survivor’s Interactive Storytelling. In Interactive Storytelling: 8th International Conference on Interactive Digital Storytelling, ICIDS 2015, Copenhagen, Denmark, November 30–December 4, 2015, Proceedings (Lecture Notes in Computer Science 9445), pages 269–281. Springer, Heidelberg, December 2015. Best paper award

Abstract: We describe a digital system that allows people to have an interactive conversation with a human storyteller (a Holocaust survivor) who has recorded a number of dialogue contributions, including many compelling narratives of his experiences and thoughts. The goal is to preserve as much as possible of the experience of face-to-face interaction. The survivor’s stories, answers to common questions, and testimony are recorded in high fidelity, and then delivered interactively to an audience as responses to spoken questions. People can ask questions and receive answers on a broad range of topics including the survivor’s experiences before, after and during the war, his attitudes and philosophy. Evaluation results show that most user questions can be addressed by the system, and that audiences are highly engaged with the resulting interaction.

PDF paper (950KB)

David Traum, Kallirroi Georgila, Ron Artstein, and Anton Leuski. Evaluating spoken dialogue processing for time-offset interaction. Proceedings of the SIGDIAL 2015 Conference: the 16th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 199–208. Prague, Czech Republic, September 2015. Best paper award

Abstract: This paper presents the first evaluation of a fully automated prototype system for time-offset interaction, that is, conversation between a live person and recordings of someone who is not temporally co-present. Speech recognition reaches word error rates as low as 5% with general-purpose language models and 19% with domain-specific models, and language understanding can identify appropriate direct responses to 60–66% of user utterances while keeping errors to 10–16% (the remainder being indirect or off-topic responses). This is sufficient to enable a natural flow and relatively open-ended conversations, with a collection of under 2000 recorded statements.

PDF paper (182KB)
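
For readers unfamiliar with the word error rate figures quoted above, the sketch below shows the standard definition: the word-level edit distance (substitutions, deletions, and insertions) divided by the number of words in the reference transcript. It is a generic illustration, not the evaluation code used in the paper; the function name and example sentences are hypothetical.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")
    # prev[j] is the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from output
            insertion = curr[j - 1] + 1  # extra word in recognizer output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical transcript pair: one of four reference words is dropped.
wer = word_error_rate("where were you born", "where you born")
```

Because insertions count as errors, the rate can exceed 1.0 when the recognizer outputs many more words than the reference contains.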

Ron Artstein, Anton Leuski, Heather Maio, Tomer Mor-Barak, Carla Gordon, and David Traum. How many utterances are needed to support time-offset interaction? In Proceedings of the Twenty-Eighth International Florida Artificial Intelligence Research Society Conference, pages 144–149. Hollywood, Florida, May 2015.

Abstract: Time-offset interaction is a new technology that enables conversational interaction with a person who is not present, using pre-recorded video statements. Statements were recorded by Pinchas Gutter, a Holocaust survivor, talking about his personal experiences before, during and after the Holocaust. Participants interacted with the statements through a “Wizard of Oz” system, where live operators select an appropriate reaction to each utterance in real time; unanswered questions were analyzed to identify gaps, and additional statements were recorded to fill the gaps. Even though participant questions were completely unconstrained, the recorded statements from the first round directly addressed at least 58% of the questions; this number rises to 95% with the second round of recording, when tested on newly elicited utterances. This demonstrates the feasibility for a system to address unseen questions and sustain short conversations when the topic is well defined. The statements have been put into an automated system using existing language understanding technology, to create a preliminary working system of time-offset interaction, allowing a live conversation with a real human who is not present for the conversation in real time.

PDF paper (748KB)

Simon S. Woo, Jelena Mirkovic, Ron Artstein, and Elsi Kaiser. Life-experience passwords (LEPs). In Who are you?! Adventures in Authentication: WAY Workshop. Menlo Park, California, July 2014.

Abstract: User-supplied textual passwords are extensively used today for user authentication. However, these passwords have serious deficiencies in the way they interact with humans’ natural ability to form memories. Strong passwords that are hard to crack are also often hard for humans to remember, while memorable passwords are easily brute-forced or guessed. We propose a novel password design – life-experience passwords (LEPs). We explain how to use users’ existing episodic memories about defining life events to create memorable and hard-to-guess passwords and discuss challenges involved in design and use of LEPs.

PDF paper (121KB)

Jonathan Gratch, Ron Artstein, Gale Lucas, Giota Stratou, Stefan Scherer, Angela Nazarian, Rachel Wood, Jill Boberg, David DeVault, Stacy Marsella, David Traum, Skip Rizzo, and Louis-Philippe Morency. The Distress Analysis Interview Corpus of Human and Computer Interviews. In Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014), pages 3123–3128. Reykjavik, Iceland, May 2014.

Abstract: The Distress Analysis Interview Corpus (DAIC) contains clinical interviews designed to support the diagnosis of psychological distress conditions such as anxiety, depression, and post traumatic stress disorder. The interviews are conducted by humans, human controlled agents and autonomous agents, and the participants include both distressed and non-distressed individuals. Data collected include audio and video recordings and extensive questionnaire responses; parts of the corpus have been transcribed and annotated for a variety of verbal and non-verbal features. The corpus has been used to support the creation of an automated interviewer agent, and for research on the automatic identification of psychological distress.

PDF paper (864KB)

David DeVault, Ron Artstein, Grace Benn, Teresa Dey, Ed Fast, Alesia Gainer, Kallirroi Georgila, Jon Gratch, Arno Hartholt, Margaux Lhommet, Gale Lucas, Stacy Marsella, Fabrizio Morbini, Angela Nazarian, Stefan Scherer, Giota Stratou, Apar Suri, David Traum, Rachel Wood, Yuyu Xu, Albert Rizzo, and Louis-Philippe Morency. SimSensei kiosk: A virtual human interviewer for healthcare decision support. Proceedings of the 13th International Conference on Autonomous Agents and Multiagent Systems (AAMAS 2014), pages 1061–1068, Paris, May 2014. Nominated for best paper award

Abstract: We present SimSensei Kiosk, an implemented virtual human interviewer designed to create an engaging face-to-face interaction where the user feels comfortable talking and sharing information. SimSensei Kiosk is also designed to create interactional situations favorable to the automatic assessment of distress indicators, defined as verbal and nonverbal behaviors correlated with depression, anxiety or post-traumatic stress disorder (PTSD). In this paper, we summarize the design methodology, performed over the past two years, which is based on three main development cycles: (1) analysis of face-to-face human interactions to identify potential distress indicators, dialogue policies and virtual human gestures, (2) development and analysis of a Wizard-of-Oz prototype system where two human operators were deciding the spoken and gestural responses, and (3) development of a fully automatic virtual interviewer able to engage users in 15–25 minute interactions. We show the potential of our fully automatic virtual human interviewer in a user study, and situate its performance in relation to the Wizard-of-Oz prototype.

PDF paper (1.2MB)

Ron Artstein, David Traum, Oleg Alexander, Anton Leuski, Andrew Jones, Kallirroi Georgila, Paul Debevec, William Swartout, Heather Maio, and Stephen Smith. Time-offset interaction with a Holocaust survivor. In IUI ’14: Proceedings of the 19th international conference on Intelligent User Interfaces, pages 163–168, Haifa, Israel, February 2014.

Abstract: Time-offset interaction is a new technology that allows for two-way communication with a person who is not available for conversation in real time: a large set of statements are prepared in advance, and users access these statements through natural conversation that mimics face-to-face interaction. Conversational reactions to user questions are retrieved through a statistical classifier, using technology that is similar to previous interactive systems with synthetic characters; however, all of the retrieved utterances are genuine statements by a real person. Recordings of answers, listening and idle behaviors, and blending techniques are used to create a persistent visual image of the person throughout the interaction. A proof-of-concept has been implemented using the likeness of Pinchas Gutter, a Holocaust survivor, enabling short conversations about his family, his religious views, and resistance. This proof-of-concept has been shown to dozens of people, from school children to Holocaust scholars, with many commenting on the impact of the experience and potential for this kind of interface.

PDF paper (729KB)

William Swartout, Ron Artstein, Eric Forbell, Susan Foutz, H. Chad Lane, Belinda Lange, Jacquelyn Morie, Dan Noren, Skip Rizzo, and David Traum. Virtual humans for learning. AI Magazine 34(4): 13–30, 2013.

Abstract: Virtual humans are computer-generated characters designed to look and behave like real people. Studies have shown that virtual humans can mimic many of the social effects that one finds in human-human interactions such as creating rapport, and people respond to virtual humans in ways that are similar to how they respond to real people. We believe that virtual humans represent a new metaphor for interacting with computers, one in which working with a computer becomes much like interacting with a person, and this can bring social elements to the interaction that are not easily supported with conventional interfaces. We present two systems that embody these ideas. The first, the Twins, are virtual docents in the Museum of Science, Boston, designed to engage visitors and raise their awareness and knowledge of science. The second, SimCoach, uses an empathetic virtual human to provide veterans and their families with information about PTSD and depression.

Lauren Faust and Ron Artstein. People hesitate more, talk less to virtual interviewers than to human interviewers. In Semdial 2013 DialDam: Proceedings of the 17th Workshop on the Semantics and Pragmatics of Dialogue, pages 35–43, Amsterdam, December 2013.

Abstract: In a series of screening interviews for psychological distress, conducted separately by a human interviewer and by an animated virtual character controlled by a human, participants talked substantially less and produced twice as many filled pauses when talking to the virtual character. This contrasts with earlier findings, where people were less disfluent when talking to a computer dialogue system. The results suggest that the characteristics of computer-directed speech vary depending on the type of dialogue system used.

PDF paper (1.5MB)

David DeVault, Kallirroi Georgila, Ron Artstein, Fabrizio Morbini, David Traum, Stefan Scherer, Albert (Skip) Rizzo, and Louis-Philippe Morency. Verbal indicators of psychological distress in interactive dialogue with a virtual human. In Proceedings of the SIGDIAL 2013 Conference: the 14th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 193–202. Metz, France, August 2013.

Abstract: We explore the presence of indicators of psychological distress in the linguistic behavior of subjects in a corpus of semi-structured virtual human interviews. At the level of aggregate dialogue-level features, we identify several significant differences between subjects with depression and PTSD when compared to non-distressed subjects. At a more fine-grained level, we show that significant differences can also be found among features that represent subject behavior during specific moments in the dialogues. Finally, we present statistical classification results that suggest the potential for automatic assessment of psychological distress in individual interactions with a virtual human dialogue system.

PDF paper (922K)

Fabrizio Morbini, Kartik Audhkhasi, Kenji Sagae, Ron Artstein, Doğan Can, Panayiotis Georgiou, Shri Narayanan, Anton Leuski and David Traum. Which ASR should I choose for my dialogue system? In Proceedings of the SIGDIAL 2013 Conference: the 14th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 394–403. Metz, France, August 2013.

Abstract: We present an analysis of several publicly available automatic speech recognizers (ASRs) in terms of their suitability for use in different types of dialogue systems. We focus in particular on cloud-based ASRs that recently have become available to the community. We include features of ASR systems and desiderata and requirements for different dialogue systems, taking into account the dialogue genre, type of user, and other features. We then present speech recognition results for six different dialogue systems. The most interesting result is that different ASR systems perform best on different data sets. We also show that there is an improvement over a previous generation of recognizers on some of these data sets. We also investigate language understanding (NLU) on the ASR output, and explore the relationship between ASR and NLU performance.

PDF paper (176K)

Fabrizio Morbini, Kartik Audhkhasi, Ron Artstein, Maarten Van Segbroeck, Kenji Sagae, Panayiotis Georgiou, David R. Traum, and Shri Narayanan. A reranking approach for recognition and classification of speech input in conversational dialogue systems. In Fourth IEEE Workshop on Spoken Language Technology (SLT). Miami Beach, Florida, December 2012.

Abstract: We address the challenge of interpreting spoken input in a conversational dialogue system with an approach that aims to exploit the close relationship between the tasks of speech recognition and language understanding through joint modeling of these two tasks. Instead of using a standard pipeline approach where the output of a speech recognizer is the input of a language understanding module, we merge multiple speech recognition and utterance classification hypotheses into one list to be processed by a joint reranking model. We obtain substantially improved performance in language understanding in experiments with thousands of user utterances collected from a deployed spoken dialogue system.

Sunghyun Park, Gelareh Mohammadi, Ron Artstein, and Louis-Philippe Morency. Crowdsourcing micro-level multimedia annotations: The challenges of evaluation and interface. To appear in International ACM Workshop on Crowdsourcing for Multimedia (CrowdMM). Nara, Japan, October 2012.

Abstract: This paper presents a new evaluation procedure and tool for crowdsourcing micro-level multimedia annotations and shows that such annotations can achieve a quality comparable to that of expert annotations. We propose a new evaluation procedure, called MM-Eval (Micro-level Multimedia Evaluation), which compares fine time-aligned annotations using Krippendorff’s alpha metric, and introduce two new metrics to evaluate the types of disagreement between coders. We also introduce OCTAB (Online Crowdsourcing Tool for Annotations of Behaviors), a web-based annotation tool that allows precise and convenient multimedia behavior annotations, directly from the Amazon Mechanical Turk interface. With an experiment using the above tool and evaluation procedure, we show that a majority vote among annotations from 3 crowdsource workers leads to a quality comparable to that of local expert annotations.
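
The majority-vote step mentioned in this abstract can be illustrated with a minimal sketch. It assumes each worker's annotation has already been discretized into per-frame labels (for example, 1 where a behavior is present and 0 where it is absent). The function name `majority_vote` and the frame representation are assumptions made for illustration; they are not taken from MM-Eval or OCTAB.

```python
from collections import Counter

def majority_vote(annotations):
    """Combine per-frame labels from several workers by majority vote.

    annotations: list of equal-length label sequences, one per worker.
    Returns one label per frame; on a tie, the label that appears
    first among the workers for that frame is chosen.
    """
    if not annotations:
        raise ValueError("need at least one annotation")
    length = len(annotations[0])
    if any(len(a) != length for a in annotations):
        raise ValueError("all annotations must cover the same frames")
    merged = []
    for frame_labels in zip(*annotations):
        # most_common(1) yields the label with the highest count
        merged.append(Counter(frame_labels).most_common(1)[0][0])
    return merged

# Hypothetical data: three workers, six frames each
workers = [
    [0, 1, 1, 1, 0, 0],
    [0, 0, 1, 1, 1, 0],
    [0, 1, 1, 0, 0, 0],
]
consensus = majority_vote(workers)
```

With an odd number of workers and binary labels, ties cannot occur, which may be one practical reason to use three workers.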

David Traum, Priti Aggarwal, Ron Artstein, Susan Foutz, Jillian Gerten, Athanasios Katsamanis, Anton Leuski, Dan Noren, and William Swartout. Ada and Grace: Direct interaction with museum visitors. In Intelligent Virtual Agents: 12th International Conference, IVA 2012, Santa Cruz, CA, USA, September 12–14, 2012 Proceedings (Lecture Notes in Artificial Intelligence 7502), pages 245–251. Springer, Heidelberg, September 2012.

Abstract: We report on our efforts to prepare Ada and Grace, virtual guides in the Museum of Science, Boston, to interact directly with museum visitors, including children. We outline the challenges in extending the exhibit to support this usage, mostly relating to the processing of speech from a broad population, especially child speech. We also present the summative evaluation, showing success in all the intended impacts of the exhibit: that children ages 7–14 will increase their awareness of, engagement in, interest in, positive attitude about, and knowledge of computer science and technology.

Xuchen Yao, Emma Tosch, Grace Chen, Elnaz Nouri, Ron Artstein, Anton Leuski, Kenji Sagae, and David Traum. Creating conversational characters using question generation tools. Dialogue and Discourse 3(2): 125–146, 2012.

Abstract: This article describes a new tool for extracting question-answer pairs from text articles, and reports three experiments which investigate how suitable this technique is for supplying knowledge to conversational characters. Experiment 1 demonstrates the feasibility of our method by creating characters for 14 distinct topics and evaluating them using hand-authored questions. Experiment 2 evaluates three of these characters using questions collected from naive participants, showing that the generated characters provide full or partial answers to about half of the questions asked. Experiment 3 adds automatically extracted knowledge to an existing, hand-authored character, demonstrating that augmented characters can answer questions about new topics but with some degradation of the ability to answer questions about topics that the original character was trained to answer. Overall, the results show that question generation is a promising method for creating or augmenting a question answering conversational character using an existing text.

PDF article (193K)

William Yang Wang, Ron Artstein, Anton Leuski, and David Traum. Improving spoken dialogue understanding using phonetic mixture models. In Cross-Disciplinary Advances in Applied Natural Language Processing: Issues and Approaches, edited by Chutima Boonthum-Denecke, Philip M. McCarthy, and Travis A. Lamkin, chapter 15, pages 225–238. IGI Global, Hershey, Pennsylvania, 2012.

Abstract: Reasoning about sound similarities improves the performance of a Natural Language Understanding component that interprets speech recognizer output: the authors observed a 5% to 7% reduction in errors when they augmented the word strings with a phonetic representation, derived from the words by means of a dictionary. The best performance comes from mixture models incorporating both word and phone features. Since the phonetic representation is derived from a dictionary, the method can be applied easily without the need for integration with a specific speech recognizer. The method has similarities with autonomous (or bottom-up) psychological models of lexical access, where contextual information is not integrated at the stage of auditory perception but rather later.

Sin-Hwa Kang, Jonathan Gratch, Candy Sidner, Ron Artstein, Lixing Huang, and Louis-Philippe Morency. Towards building a virtual counselor: Modeling nonverbal behavior during intimate self-disclosure. In Eleventh International Conference on Autonomous Agents and Multiagent Systems (AAMAS). Valencia, Spain, June 2012.

Abstract: Nonverbal behavior is considered critical for indicating intimacy and is important when designing a social virtual agent such as a counselor. One key research question is how to properly express intimate self-disclosure. In this paper we present an extensive study of human nonverbal behavior during intimate self-disclosure. This is an important milestone in creating a virtual counselor. A study of video interactions between human participants demonstrated that people displayed more head tilts and pauses when they revealed highly intimate information about themselves; they presented more head nods and eye gazes during less intimate sharing. An implementation of these behaviors in a virtual agent suggests that people tend to perceive head tilts, pauses and gaze aversion by the agent as conveying intimate self-disclosure. These findings are important for future research with virtual counselors and other social agents.

PDF (557KB)

Priti Aggarwal, Ron Artstein, Jillian Gerten, Athanasios Katsamanis, Shrikanth Narayanan, Angela Nazarian, and David Traum. The Twins corpus of museum visitor questions. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC 2012), pages 2355–2361. Istanbul, Turkey, May 2012.

Abstract: The Twins corpus is a collection of utterances spoken in interactions with two virtual characters who serve as guides at the Museum of Science in Boston. The corpus contains about 200,000 spoken utterances from museum visitors (primarily children) as well as from trained handlers who work at the museum. In addition to speech recordings, the corpus contains the outputs of speech recognition performed at the time of utterance as well as the system interpretation of the utterances. Parts of the corpus have been manually transcribed and annotated for question interpretation. The corpus has been used for improving performance of the museum characters and for a variety of research projects, such as phonetic-based Natural Language Understanding, creation of conversational characters from text resources, dialogue policy learning, and research on patterns of user interaction. It has the potential to be used for research on children’s speech and on language used when talking to a virtual human.

PDF (745KB)

Elnaz Nouri, Ron Artstein, Anton Leuski and David Traum. Augmenting Conversational Characters with Generated Question-Answer Pairs. In Question Generation: Papers from the AAAI Fall Symposium, pages 49–52. Arlington, Virginia, November 2011.

Abstract: We take a conversational character trained on a set of linked question-answer pairs authored by hand, and augment its training data by adding sets of question-answer pairs which are generated automatically from texts on different topics. The augmented characters can answer questions about the new topics, at the cost of some performance loss on questions about the topics that the original character was trained to answer.

Priti Aggarwal, Kevin Feeley, Fabrizio Morbini, Ron Artstein, Anton Leuski, David Traum, and Julia Kim. Interactive characters for cultural training of small military units. In Intelligent Virtual Agents: 11th International Conference, IVA 2011, Reykjavik, Iceland, September 15–17, 2011 Proceedings (Lecture Notes in Artificial Intelligence 6895), pages 426–427. Springer, Heidelberg, 2011. (Poster)

Abstract: CHAOS, the Combat Hunter Action and Observation Simulation, is an immersive simulation training environment which gives small military units the experience of interacting with local Afghan villagers during a patrol. It is a physical build-out of a housing compound in a mock Afghan village, with several life-size reactive and interactive animated Pashto-speaking virtual characters. The exercise requires an infantry squad to locate and interview a character named Omar, communicating through a live human interpreter and attending to proper protocol regarding Omar’s family. Character animation and behavior is based on extensive interviews with Afghan experts to provide a realistic setting of the intended locale. The system combines virtual human technology, story engineering, and physical set building to provide a compelling training environment that can handle a full squad, requiring trainees to integrate tasks such as working with an interpreter, dealing with non-English speakers from another culture, and assessing information and disposition to make decisions in a mission context.

PDF preprint (110KB)
PDF poster (2MB)

Sin-Hwa Kang, Candy Sidner, Jonathan Gratch, Ron Artstein, Lixing Huang, and Louis-Philippe Morency. Modeling nonverbal behavior of a virtual counselor during intimate self-disclosure. In Intelligent Virtual Agents: 11th International Conference, IVA 2011, Reykjavik, Iceland, September 15–17, 2011 Proceedings (Lecture Notes in Artificial Intelligence 6895), pages 455–457. Springer, Heidelberg, 2011. (Poster)

Abstract: Humans often share personal information with others in order to create social connections. Sharing personal information is especially important in counseling interactions. Research studying the relationship between intimate self-disclosure and human behavior critically informs the development of virtual agents that create rapport with human interaction partners. One significant example of this application is using virtual agents as counselors in psychotherapeutic situations. The capability of expressing different intimacy levels is key to a successful virtual counselor to reciprocally induce disclosure in clients. Nonverbal behavior is considered critical for indicating intimacy and is important when designing a social virtual agent such as a counselor. One key research question is how to properly express intimate self-disclosure. In this study, our main goal is to find what types of interviewees’ nonverbal behavior are associated with different intimacy levels of verbal self-disclosure. Thus, we investigated humans’ nonverbal behavior associated with self-disclosure in an interview setting (with intimate topics).

Ron Artstein, Michael Rushforth, Sudeep Gandhe, David Traum and Aram Donigian. Limits of Simple Dialogue Acts for Tactical Questioning Dialogues. Proceedings of the 7th IJCAI workshop on knowledge and reasoning in practical dialogue systems, pages 1–8. Barcelona, Spain, July 2011.

Abstract: A set of dialogue acts, generated automatically by applying a dialogue act scheme to a domain representation designed for easy scenario authoring, covers approximately 72%–76% of user utterances spoken in live interaction with a tactical questioning simulation trainer. The domain is represented as facts of the form <object, attribute, value> and conversational actions of the form <character, action>. User utterances from the corpus that fall outside the scope of the scheme include questions about temporal relations, relations between facts and relations between objects, questions about reason and evidence, assertions by the user, conditional offers, attempts to set the topic of conversation, and compound utterances. These utterance types constitute the limits of the simple dialogue act scheme.
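
As a rough illustration of how dialogue acts might be generated from such a domain representation, the following sketch enumerates acts from <object, attribute, value> facts and <character, action> pairs. The act labels (`whq`, `assert`), the class names, and the example domain are all hypothetical; the actual dialogue act scheme is described in the paper and is not reproduced here.

```python
from typing import NamedTuple

class Fact(NamedTuple):
    obj: str
    attribute: str
    value: str

class Action(NamedTuple):
    character: str
    action: str

def generate_dialogue_acts(facts, actions):
    """Enumerate candidate dialogue acts for a domain (illustrative only)."""
    acts = []
    for f in facts:
        # A question asks for the value of an attribute of an object
        acts.append(("whq", f.obj, f.attribute))
        # An assertion states the full fact
        acts.append(("assert", f.obj, f.attribute, f.value))
    for a in actions:
        # Conversational actions are tied to the character performing them
        acts.append((a.action, a.character))
    # Remove duplicates while keeping the original order
    return list(dict.fromkeys(acts))

# Hypothetical toy domain
facts = [
    Fact("market", "location", "town center"),
    Fact("market", "owner", "unknown"),
]
actions = [Action("user", "greeting"), Action("character", "greeting")]
acts = generate_dialogue_acts(facts, actions)
```

A scheme of this shape has no place for utterances that relate two facts or two objects to each other, which is consistent with the kinds of out-of-scope utterances listed in the abstract.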

PDF (193KB)

Ron Artstein. Error Return Plots. Proceedings of the SIGDIAL 2011 Conference: the 12th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 319–324. Portland, Oregon, June 2011. (Poster)

Abstract: Error-return plots show the rate of error (misunderstanding) against the rate of non-return (non-understanding) for Natural Language Processing systems. They are a useful visual tool for judging system performance when other measures such as recall/precision and detection-error tradeoff are less informative, specifically when a system is judged on the correctness of its responses, but may elect to not return a response.
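
A minimal sketch of how the points of such a plot could be computed is given below. It assumes the system attaches a confidence score to each candidate response and withholds responses scoring below a threshold. The function name, the input format, and the choice to express both rates as fractions of all inputs are assumptions of this sketch rather than details confirmed by the abstract.

```python
def error_return_points(results, thresholds):
    """Compute (non_return_rate, error_rate) pairs, one per threshold.

    results: list of (confidence, is_correct) pairs, one per input.
    A response is returned only if its confidence is at or above the
    threshold. Both rates are fractions of all inputs (an assumption).
    """
    total = len(results)
    if total == 0:
        raise ValueError("need at least one result")
    points = []
    for t in thresholds:
        returned = [ok for conf, ok in results if conf >= t]
        non_return = (total - len(returned)) / total
        errors = sum(1 for ok in returned if not ok) / total
        points.append((non_return, errors))
    return points

# Hypothetical system output: (confidence, whether the response was correct)
results = [(0.9, True), (0.8, True), (0.6, False), (0.4, True), (0.2, False)]
points = error_return_points(results, [0.0, 0.5, 0.85])
```

Plotting the non-return rate on one axis against the error rate on the other, across many thresholds, traces the trade-off curve: raising the threshold withholds more responses and lowers the share of erroneous ones.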

PDF paper (324KB)
PDF poster (508KB)

Kallirroi Georgila, Ron Artstein, Angela Nazarian, Michael Rushforth, David Traum, and Katia Sycara. An annotation scheme for cross-cultural argumentation and persuasion dialogues. Proceedings of the SIGDIAL 2011 Conference: the 12th Annual Meeting of the Special Interest Group on Discourse and Dialogue, pages 272–278. Portland, Oregon, June 2011. (Poster)

Abstract: We present a novel annotation scheme for cross-cultural argumentation and persuasion dialogues. This scheme is an adaptation of existing coding schemes on negotiation, following a review of literature on cross-cultural differences in negotiation styles. The scheme has been refined through application to coding both two-party and multi-party negotiation dialogues in three different domains, and is general enough to be applicable to different domains with minor or no modifications at all. Dialogues annotated with the scheme have been used to successfully learn culture-specific dialogue policies for argumentation and persuasion.

PDF version (119KB)

William Yang Wang, Ron Artstein, Anton Leuski, and David Traum. Improving spoken dialogue understanding using phonetic mixture models. Proceedings of the Twenty-Fourth International Florida Artificial Intelligence Research Society Conference, pages 329–334. Palm Beach, Florida, May 2011. Finalist for best paper award

Abstract: Augmenting word tokens with a phonetic representation, derived from a dictionary, improves the performance of a Natural Language Understanding component that interprets speech recognizer output: we observed a 5% to 7% reduction in errors across a wide range of response return rates. The best performance comes from mixture models incorporating both word and phone features. Since the phonetic representation is derived from a dictionary, the method can be applied easily without the need for integration with a specific speech recognizer. The method has similarities with autonomous (or bottom-up) psychological models of lexical access, where contextual information is not integrated at the stage of auditory perception but rather later.

PDF version (1MB)

Grace Chen, Emma Tosch, Ron Artstein, Anton Leuski, and David Traum. Evaluating conversational characters created through question generation. Proceedings of the Twenty-Fourth International Florida Artificial Intelligence Research Society Conference, pages 343–344. Palm Beach, Florida, May 2011. (Poster) Best poster award

Abstract: Question generation tools can be used to extract a question-answer database from text articles. We investigate how suitable this technique is for giving domain-specific knowledge to conversational characters. We tested these characters by collecting questions and answers from naive participants, running the questions through the character, and comparing the system responses to the participant answers. Characters gave a full or partial answer to 53% of the user questions which had an answer available in the source text, and 43% of all questions asked. Performance was better for questions asked after the user had read the source text, and also varied by question type: the best results were answers to who questions, while answers to yes/no questions were among the poorer performers. The results show that question generation is a promising method for creating a question answering conversational character from an existing text.

PDF paper (57K)
PDF poster (135KB)

Julia Campbell, Mark Core, Ron Artstein, Lindsay Armstrong, Arno Hartholt, Cyrus Wilson, Kallirroi Georgila, Fabrizio Morbini, Edward Haynes, Dave Gomboc, Mike Birch, Jonathan Bobrow, H. Chad Lane, Jillian Gerten, Anton Leuski, David Traum, Matthew Trimmer, Rich DiNinni, Matthew Bosack, Timothy Jones, Richard E. Clark, and Kenneth A. Yates. Developing INOTS to support interpersonal skills practice. 2011 IEEE Aerospace Conference, Big Sky, Montana, March 2011.

Abstract: The Immersive Naval Officer Training System (INOTS) is a blended learning environment that merges traditional classroom instruction with a mixed reality training setting. INOTS supports the instruction, practice and assessment of interpersonal communication skills. The goal of INOTS is to provide a consistent training experience to supplement interpersonal skills instruction for Naval officer candidates without sacrificing trainee throughput and instructor control. We developed an instructional design from cognitive task analysis interviews with experts to serve as a framework for system development. We also leveraged commercial student response technology and research technologies including natural language recognition, virtual humans, realistic graphics, intelligent tutoring and automated instructor support tools. In this paper, we describe our methodologies for developing a blended learning environment, and our challenges adding mixed reality and virtual human technologies to a traditional classroom to support interpersonal skills training.

Antonio Roque, Kallirroi Georgila, Ron Artstein, Kenji Sagae, and David Traum. Natural language processing for joint fire observer training. 27th Army Science Conference, Orlando, Florida, December 2010.

Abstract: We describe recent research to enhance a training system which interprets Call for Fire (CFF) radio artillery requests. The research explores the feasibility of extending the system to also understand calls for Close Air Support (CAS). This work includes automated analysis of complex language behavior in CAS missions, evaluation of speech recognition performance, and simulation of speech recognition errors.

PDF paper (164K)
PDF poster (359K)

Jenny Brusk, Ron Artstein, and David Traum. Don’t tell anyone! Two experiments on gossip conversations. Proceedings of the SIGdial 2010 Conference, pages 193–200. Tokyo, Japan, September 2010.

Abstract: The purpose of this study is to get a working definition that matches people’s intuitive notion of gossip and is sufficiently precise for computational implementation. We conducted two experiments investigating what type of conversations people intuitively understand and interpret as gossip, and whether they could identify three proposed constituents of gossip conversations: third person focus, pejorative evaluation and substantiating behavior. The results show that (1) conversations are very likely to be considered gossip if all elements are present, no intimate relationships exist between the participants, and the person in focus is unambiguous. (2) Conversations that have at most one gossip element are not considered gossip. (3) Conversations that lack one or two elements or have an ambiguous element lead to inconsistent judgments.

PDF version (244K)

William Swartout, David Traum, Ron Artstein, Dan Noren, Paul Debevec, Kerry Bronnenkant, Josh Williams, Anton Leuski, Shrikanth Narayanan, Diane Piepol, Chad Lane, Jacquelyn Morie, Priti Aggarwal, Matt Liewer, Jen-Yuan Chiang, Jillian Gerten, Selina Chu, and Kyle White. Virtual museum guides demonstration. In 2010 IEEE Spoken Language Technology Workshop, pages 163–164. Berkeley, California, December 2010.

Abstract: The Virtual Museum Guides are two virtual humans set in an exhibit at the Museum of Science, Boston, designed to promote interest in Science, Technology, Engineering and Mathematics (STEM). The primary audience is children between ages 7 and 14, in particular females and other groups under-represented in STEM.

William Swartout, David Traum, Ron Artstein, Dan Noren, Paul Debevec, Kerry Bronnenkant, Josh Williams, Anton Leuski, Shrikanth Narayanan, Diane Piepol, Chad Lane, Jacquelyn Morie, Priti Aggarwal, Matt Liewer, Jen-Yuan Chiang, Jillian Gerten, Selina Chu, and Kyle White. Ada and Grace: Toward realistic and engaging virtual museum guides. In Intelligent Virtual Agents: 10th International Conference, IVA 2010, Philadelphia, PA, USA, September 20–22, 2010 Proceedings (Lecture Notes in Artificial Intelligence 6356), pages 286–300. Springer, Heidelberg, 2010.

Abstract: To increase the interest and engagement of middle school students in science and technology, the InterFaces project has created virtual museum guides that are in use at the Museum of Science, Boston. The characters use natural language interaction and have near photoreal appearance to increase engagement. The paper presents an evaluation of natural language performance and presents reports from museum staff on visitor reaction.

Xuchen Yao, Pravin Bhutada, Kallirroi Georgila, Kenji Sagae, Ron Artstein, and David Traum. Practical evaluation of speech recognizers for virtual human dialogue systems. LREC 2010, Valletta, Malta, May 2010. (Poster)

Abstract: We perform a large-scale evaluation of multiple off-the-shelf speech recognizers across diverse domains for virtual human dialogue systems. Our evaluation is aimed at speech recognition consumers and potential consumers with limited experience with readily available recognizers. We focus on practical factors to determine what levels of performance can be expected from different available recognizers in various projects featuring different types of conversational utterances. Our results show that there is no single recognizer that outperforms all other recognizers in all domains. The performance of each recognizer may vary significantly depending on the domain, the size and perplexity of the corpus, the out-of-vocabulary rate, and whether acoustic and language model adaptation has been used or not. We expect that our evaluation will prove useful to other speech recognition consumers, especially in the dialogue community, and will shed some light on the key problem in spoken dialogue systems of selecting the most suitable available speech recognition system for a particular application, and what impact training will have.

PDF version (400K)

Michael Rushforth, Sudeep Gandhe, Ron Artstein, Antonio Roque, Sarrah Ali, Nicolle Whitman, and David Traum. Varying personality in spoken dialogue with a virtual human. In Intelligent Virtual Agents: 9th International Conference, IVA 2009, Amsterdam, The Netherlands, September 14–16, 2009 Proceedings (Lecture Notes in Artificial Intelligence 5773), pages 541–542. Springer, Heidelberg, 2009. (Poster)

Abstract: This poster reports the results of two experiments to test a personality framework for virtual characters. We use the Tactical Questioning dialogue system architecture (TACQ) as a testbed for this effort. Characters built using the TACQ architecture can be used by trainees to practice their questioning skills by engaging in a role-play with a virtual human. The architecture supports advanced behavior in a questioning setting, including deceptive behavior, simple negotiations about whether to answer, tracking subdialogues for offers/threats, grounding behavior, and maintenance of the affective state of the virtual human. Trainees can use different questioning tactics in their sessions. In order for the questioning training to be effective, trainees should have experience of interacting with virtual humans with different personalities, who react in different ways to the same questioning tactics.

PDF Abstract (68K)
Extended version: Technical Report ICT-TR-03-2009 (PDF, 228K)

Sudeep Gandhe, Nicolle Whitman, David Traum and Ron Artstein. An integrated authoring tool for tactical questioning dialogue systems. In 6th Workshop on Knowledge and Reasoning in Practical Dialogue Systems, Pasadena, California, July 2009.

Abstract: We present an integrated authoring tool for rapid prototyping of dialogue systems for virtual humans taking part in tactical questioning simulations. The tool helps domain experts, who may have little or no knowledge of linguistics or computer science, to build virtual characters that can play the role of the interviewee. Working in a top-down fashion, the authoring process begins with specifying a domain of knowledge for the character; the authoring tool generates all relevant dialogue acts and allows authors to assign the language that will be used to refer to the domain elements. The authoring tool can also be used to manipulate some aspects of the dialogue strategies employed by the virtual characters, and it also supports re-using some of the authored content across different characters.

PDF version (577K)

Ron Artstein, Sudeep Gandhe, Michael Rushforth and David Traum. Viability of a Simple Dialogue Act Scheme for a Tactical Questioning Dialogue System. In DiaHolmia 2009: Proceedings of the 13th Workshop on the Semantics and Pragmatics of Dialogue, pages 43–50. Stockholm, Sweden, June 2009.

Abstract: User utterances in a spoken dialogue system for tactical questioning simulation were matched to a set of dialogue acts generated automatically from a representation of facts as <object, attribute, value> triples and actions as <character, action> pairs. The representation currently covers about 50% of user utterances, and we show that a few extensions can increase coverage to 80% or more. This demonstrates the viability of simple schemes for representing question-answering dialogues in implemented systems.

PDF version (244K)

Ron Artstein, Sudeep Gandhe, Jillian Gerten, Anton Leuski and David Traum. Semi-formal evaluation of conversational characters. In Languages: From Formal to Natural. Essays Dedicated to Nissim Francez on the Occasion of His 65th Birthday (Lecture Notes in Computer Science 5533), edited by Orna Grumberg, Michael Kaminski, Shmuel Katz and Shuly Wintner, pages 22–35. Springer, Heidelberg, 2009.

Abstract: Conversational dialogue systems cannot be evaluated in a fully formal manner, because dialogue is heavily dependent on context and current dialogue theory is not precise enough to specify a target output ahead of time. Instead, we evaluate dialogue systems in a semi-formal manner, using human judges to rate the coherence of a conversational character and correlating these judgments with measures extracted from within the system. We present a series of three evaluations of a single conversational character over the course of a year, demonstrating how this kind of evaluation helps bring about an improvement in overall dialogue coherence.

PDF preprint (1.05M)

Ron Artstein, Jacob Cannon, Sudeep Gandhe, Jillian Gerten, Joe Henderer, Anton Leuski and David Traum. Coherence of off-topic responses for a virtual character. 26th Army Science Conference, Orlando, Florida, December 2008.

Abstract: We demonstrate three classes of off-topic responses which allow a virtual question-answering character to handle cases where it does not understand the user’s input: ask for clarification, indicate misunderstanding, and move on with the conversation. While falling short of full dialogue management, a combination of such responses together with prompts to change the topic can improve overall dialogue coherence.

PDF paper (460K)
PDF poster (1.7M)

Sudeep Gandhe, David DeVault, Antonio Roque, Bilyana Martinovski, Ron Artstein, Anton Leuski, Jillian Gerten, and David Traum. From domain specification to virtual humans: An integrated approach to authoring tactical questioning characters. Interspeech 2008, Brisbane, Australia, September 2008.

Abstract: We present a new approach for rapidly developing dialogue capabilities for virtual humans. Starting from domain specification, an integrated authoring interface automatically generates dialogue acts with all possible contents. These dialogue acts are linked to example utterances in order to provide training data for natural language understanding and generation. The virtual human dialogue system contains a dialogue manager following the information-state approach, using finite-state machines and SCXML to manage local coherence, as well as explicit modeling of emotions and compliance level and a grounding component based on evidence of understanding. Using the authoring tools, we design and implement a version of the virtual human Hassan and compare to previous architectures for the character.

PDF version (427K)

David DeVault, David Traum and Ron Artstein. Making grammar-based generation easier to deploy in dialogue systems. Proceedings of the 9th SIGdial Workshop on Discourse and Dialogue, pages 198–207. Columbus, Ohio, June 2008.

Abstract: We present a development pipeline and associated algorithms designed to make grammar-based generation easier to deploy in implemented dialogue systems. Our approach realizes a practical trade-off between the capabilities of a system’s generation component and the authoring and maintenance burdens imposed on the generation content author for a deployed system. To evaluate our approach, we performed a human rating study with system builders who work on a common large-scale spoken dialogue system. Our results demonstrate the viability of our approach and illustrate authoring/performance trade-offs between hand-authored text, our grammar-based approach, and a competing shallow statistical NLG technique.

PDF version (845K)

David DeVault, David Traum and Ron Artstein. Practical grammar-based NLG from examples. Proceedings of the Fifth International Natural Language Generation Conference, pages 77–85. Salt Fork, Ohio, June 2008.

Abstract: We present a technique that opens up grammar-based generation to a wider range of practical applications by dramatically reducing the development costs and linguistic expertise that are required. Our method infers the grammatical resources needed for generation from a set of declarative examples that link surface expressions directly to the application’s available semantic representations. The same examples further serve to optimize a run-time search strategy that generates the best output that can be found within an application-specific time frame. Our method offers substantially lower development costs than hand-crafted grammars for application-specific NLG, while maintaining high output quality and diversity.

PDF version (376K)

Massimo Poesio and Ron Artstein. Anaphoric annotation in the ARRAU corpus. LREC 2008, Marrakech, Morocco, May 2008.

Abstract: Arrau is a new corpus annotated for anaphoric relations, with information about agreement and explicit representation of multiple antecedents for ambiguous anaphoric expressions and discourse antecedents for expressions which refer to abstract entities such as events, actions and plans. The corpus contains texts from different genres: task-oriented dialogues from the Trains-91 and Trains-93 corpus, narratives from the English Pear Stories corpus, newspaper articles from the Wall Street Journal portion of the Penn Treebank, and mixed text from the Gnome corpus.

PDF version (191K)

Ron Artstein, Sudeep Gandhe, Anton Leuski and David Traum. Field Testing of an interactive question-answering character. Proceedings of the ELRA workshop on evaluation, pages 36–40. Marrakech, Morocco, May 2008.

Abstract: We tested a life-size embodied question-answering character at a convention where he responded to questions from the audience. The character’s responses were then rated for coherence. The ratings, combined with speech transcripts, speech recognition results and the character’s responses, allowed us to identify where the character needs to improve, namely in speech recognition and providing off-topic responses.

PDF version (1.2M)

Ron Artstein and Massimo Poesio. Identifying reference to abstract objects in dialogue. brandial 2006 proceedings, Potsdam, Germany, September 2006.

Abstract: In two experiments, many annotators marked antecedents for discourse deixis as unconstrained regions of text. The experiments show that annotators do converge on the identity of these text regions, though much of what they do can be captured by a simple model. Demonstrative pronouns are more likely than definite descriptions to be marked with discourse antecedents. We suggest that our methodology is suitable for the systematic study of discourse deixis.

PDF version (58K)

Massimo Poesio, Patrick Sturt, Ron Artstein, and Ruth Filik. Underspecification and Anaphora: Theoretical Issues and Preliminary Evidence. Discourse Processes 42(2): 157-175, 2006.

Distributed as Technical report CSM-438, University of Essex Department of Computer Science, October 2005.

Abstract: Much experimental work in psycholinguistics suggests that fully specified syntactic and semantic interpretations are obtained incrementally. The finding that interpretation takes place incrementally is very robust and underlies our own view of sentence processing as well; however, most of this work tends to test very simple interpretive judgments, and using materials which have very clean-cut interpretations, which makes the view expressed above more questionable when applied to semantic interpretation. This article discusses a class of anaphoric expressions that do not appear to have a clear antecedent, using both corpus analysis and psychological experiments. We argue that these cases of anaphora are similar to cases of lexical polysemy, and propose an explicit semantic representation for such cases.

Published version
Article preprint (PDF, 150K)
Technical report (PDF, 143K)

Ron Artstein and Massimo Poesio. Inter-coder agreement for computational linguistics (survey article). Computational Linguistics 34(4): 555-596, 2008.

Abstract: This article is a survey of methods for measuring agreement among corpus annotators. It exposes the mathematics and underlying assumptions of agreement coefficients, covering Krippendorff’s alpha as well as Scott’s pi and Cohen’s kappa; discusses the use of coefficients in several annotation tasks; and argues that weighted, alpha-like coefficients, traditionally less used than kappa-like measures in Computational Linguistics, may be more appropriate for many corpus annotation tasks – but that their use makes the interpretation of the value of the coefficient even harder.
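As a rough illustration of the two-annotator coefficients named in this abstract, the following Python sketch computes Scott’s pi and Cohen’s kappa from two label sequences. The function names and the toy labels are assumptions made for this example and are not taken from the article. Both coefficients have the form (A_o - A_e) / (1 - A_e) and differ only in how expected agreement A_e is estimated: pi pools the labels of both annotators into one distribution, while kappa uses a separate distribution for each annotator.

```python
from collections import Counter


def observed_agreement(a, b):
    """Proportion of items on which the two annotators chose the same label."""
    return sum(x == y for x, y in zip(a, b)) / len(a)


def scotts_pi(a, b):
    """Scott's pi: expected agreement from the pooled label distribution."""
    n = len(a)
    pooled = Counter(a) + Counter(b)
    expected = sum((count / (2 * n)) ** 2 for count in pooled.values())
    observed = observed_agreement(a, b)
    return (observed - expected) / (1 - expected)


def cohens_kappa(a, b):
    """Cohen's kappa: expected agreement from each annotator's own distribution."""
    n = len(a)
    dist_a, dist_b = Counter(a), Counter(b)
    labels = dist_a.keys() | dist_b.keys()
    expected = sum((dist_a[k] / n) * (dist_b[k] / n) for k in labels)
    observed = observed_agreement(a, b)
    return (observed - expected) / (1 - expected)


# Hypothetical toy data: two annotators labelling four items.
annotator_1 = ["y", "y", "n", "n"]
annotator_2 = ["y", "n", "n", "n"]
print(scotts_pi(annotator_1, annotator_2))
print(cohens_kappa(annotator_1, annotator_2))
```

On this toy data the annotators have different label distributions, so kappa comes out slightly higher than pi; with identical distributions the two values coincide.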

Journal version (PDF, 304K)
Extended version (PDF, 367K)

Ron Artstein and Nissim Francez. Plurality and temporal modification. Linguistics and Philosophy 29(3): 251-276, 2006.

Abstract: A semantics with plural entities and plural times accounts for cumulative relations between plural arguments and temporal expressions. The semantics equips nominal, verbal and sentential meanings with temporal context variables and treats temporal modifiers as temporal generalized quantifiers; cumulative conjunction, however, takes place at types lower than generalized quantifiers. The mediation of temporal context variables allows cumulative relations to percolate between an argument in a main clause and one in a temporal clause, in apparent violation of locality restrictions. Plural times form a semilattice structure imposed on the set of intervals; no interaction is observed between this and the internal temporal structure of intervals.

Published version
PDF preprint (85K)

Ron Artstein and Massimo Poesio. Bias decreases in proportion to the number of annotators. In Gerhard Jaeger, Paola Monachesi, Gerald Penn, James Rogers, and Shuly Wintner (eds.), Proceedings of FG-MoL 2005, pages 141-150. Edinburgh, August 2005.

Abstract: The effect of the individual biases of corpus annotators on the value of reliability coefficients is inversely proportional to the number of annotators (less one). As the number of annotators increases, the effect of their individual preferences becomes more similar to random noise. This suggests using multiple annotators as a means to control individual biases.

PDF version (110K)

Massimo Poesio and Ron Artstein. Annotating (anaphoric) ambiguity. Corpus linguistics, Birmingham, England, July 2005.

Abstract: We report the results of a preliminary study attempting to identify ambiguous expressions in spoken language dialogues. In this study we developed methods for marking explicit ambiguity, and generalized previous proposals by Passonneau concerning a distance metric for anaphora to be used with the α coefficient to allow for ambiguous annotations.
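To make the role of a distance metric in the α coefficient concrete, here is a minimal Python sketch of Krippendorff’s α for complete data (every annotator labels every item), written as 1 - D_o / D_e with a pluggable distance function. All names are assumptions for this example. In particular, `jaccard_distance` is only a hypothetical set-based distance showing how set-valued (ambiguous) annotations could be compared; it is not the metric proposed by Passonneau or the generalization described in the paper.

```python
from itertools import permutations


def nominal_distance(x, y):
    """Distance for plain categorical labels: 0 if identical, 1 otherwise."""
    return 0.0 if x == y else 1.0


def jaccard_distance(x, y):
    """Hypothetical distance between two sets of candidate antecedents."""
    union = x | y
    if not union:
        return 0.0
    return 1.0 - len(x & y) / len(union)


def krippendorff_alpha(items, distance=nominal_distance):
    """Alpha for complete data.

    items: one inner list per annotated item, holding the label chosen by
    each annotator (all inner lists have the same length, at least 2).
    """
    num_items = len(items)
    num_coders = len(items[0])
    # Observed disagreement: mean distance over ordered pairs of judgments
    # made by different annotators on the same item.
    observed = sum(
        distance(x, y) for labels in items for x, y in permutations(labels, 2)
    ) / (num_items * num_coders * (num_coders - 1))
    # Expected disagreement: mean distance over ordered pairs of all
    # judgments pooled together, regardless of item.
    pooled = [label for labels in items for label in labels]
    total = len(pooled)
    expected = sum(
        distance(x, y) for x, y in permutations(pooled, 2)
    ) / (total * (total - 1))
    return 1.0 - observed / expected


# Hypothetical toy data: four items, two annotators each.
data = [["y", "y"], ["y", "n"], ["n", "n"], ["n", "n"]]
print(krippendorff_alpha(data))
```

Passing `distance=jaccard_distance` with `frozenset` labels gives partial credit when two annotators’ sets overlap, which is the general idea behind using a weighted coefficient for ambiguous annotations.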

PDF version (64K)

Massimo Poesio and Ron Artstein. The reliability of anaphoric annotation, reconsidered: Taking ambiguity into account. Proceedings of the Workshop on Frontiers in Corpus Annotation II: Pie in the Sky, pages 76-83. Ann Arbor, June 2005.

Abstract: We report the results of a study of the reliability of anaphoric annotation which (i) involved a substantial number of naive subjects, (ii) used Krippendorff’s α instead of κ to measure agreement, as recently proposed by Passonneau, and (iii) allowed annotators to mark anaphoric expressions as ambiguous.

PDF version (71K)

Ron Artstein. Quantificational arguments in temporal adjunct clauses. Linguistics and Philosophy 28(5): 541-597, 2005.

Abstract: Quantificational arguments can take scope outside of temporal adjunct clauses, in an apparent violation of locality restrictions: the sentence few secretaries cried after each executive resigned allows the quantificational NP each executive to take scope above few secretaries. I show how this scope relation is the result of local operations: the adjunct clause is a temporal generalized quantifier, which takes scope over the main clause (Pratt and Francez 2001), and within the adjunct clause, the quantificational argument takes scope above the implicit determiner which forms the temporal generalized quantifier. The paper explores various relations among quantificational arguments across clause boundaries, including temporal clauses that are modified internally by a temporal adverbial and temporal clauses with embedded sentential complements.

Published version
PDF preprint (172K)
PostScript preprint (300K)

Ron Artstein. Coordination of parts of words. Lingua 115(4): 359-393, 2005.

Abstract: Coordination of parts of words, as in ortho and periodontists, has to be interpreted at the level of the word parts because the above NP can felicitously describe a pair of one orthodontist and one periodontist. This paper develops a theory of denotations for arbitrary word parts, in which the coordinate word parts denote their own sound, and the rest of the word is a function from sounds to word meanings. This yields the correct interpretation for number in coordinate constructions. The paper also explores phonological constraints on coordinate structures, and shows how certain ungrammatical structures that can be interpreted by the semantics are ruled out on phonological grounds.

Published version
PDF preprint (150K)
PostScript preprint (238K)

Ron Artstein. Focus below the word level. Natural Language Semantics 12(1): 1-22, 2004.

Abstract: Intonational focus can be observed on parts of words that appear to lack intrinsic meaning, and triggers alternatives that are similar in form. In order to provide a unified treatment of focus above and below the word level (they do, after all, behave the same in most respects), I develop a theory of denotations for arbitrary word parts in which focused word parts denote their own sound and the unfocused parts are functions from sounds to word meanings. This allows focus theories to generalize below the word level; any differences with focus above the word level are located in the semantics of word parts. The paper also explores phonological constraints on focus placement, and shows that the focusability of a word part depends solely on its prosodic status, not on any semantic factors.

Published version
PDF preprint (135K)
PostScript preprint (165K)

Ron Artstein. A focus semantics for echo questions. In Ágnes Bende-Farkas and Arndt Riester (eds.), Workshop on Information Structure in Context, pp. 98-107. IMS, University of Stuttgart, 2002.

Abstract: Echo questions are interpreted through focus semantics. Echo questions must be entailed by previous discourse; focus is therefore not needed to mark givenness, and instead it is used to compute the question denotation: the questioned element, marked with a pitch accent, is a focus constituent, and the alternative set of the echo question is its question denotation, i.e. the set of possible answers. The focus strategy exempts echo questions from locality restrictions (“islands”), allows echo questions on parts of words, and allows second-order echo questions which denote sets of questions.

PDF version (88K)
PostScript version (112K)

Ron Artstein. Person, animacy and null subjects. In Tina Cambier-Langeveld, Anikó Lipták, Michael Redford and Erik Jan van der Torre (eds.), Proceedings of Console VII, pp. 1-15. SOLE, Leiden, 1999.

Abstract: Licensing of null subjects can be contingent on person and animacy specification. For example, Hebrew allows null subjects if they are first or second person, but not if they are third person. This follows from a general typology that is based on the universal person/animacy hierarchy: if a subject of a certain person or animacy specification may be null, then every subject higher on the hierarchy may be null as well. The above typology, in turn, follows from the general way abstract hierarchies interact in the grammar: elements that appear on the high end of one hierarchy and the low end of another give rise to marked configurations. The mechanism of alignment in Optimality Theory gives a formalization of these universal properties of hierarchies.

PDF version (75K)

Ron Artstein. The incompatibility of underspecification and markedness in Optimality Theory. In Ron Artstein and Madeline Holler (eds.), RuLing Papers 1: Working Papers from Rutgers University, pp. 7-13. Rutgers University Department of Linguistics, New Brunswick, NJ, 1998.

Abstract: Underspecification in the underlying representation cannot give rise to marked structure on the surface, because Optimality Theory grammars force an output to be equally or less marked than the input. Underspecification can still account for alternations involving unmarked structure, but it is only useful when such alternations exist along with forms that do not alternate. The evidence for the existence of such grammatical systems is not very convincing, casting doubts about the usefulness of underspecification in general.

PDF version (25K)

Ron Artstein. Group events as means for representing collectivity. In Benjamin Bruening (ed.), MITWPL 31: Proceedings of the Eighth Student Conference in Linguistics, pp. 41-51. MIT Working Papers in Linguistics, Cambridge, MA, 1997.

Abstract: In this paper I argue in favor of the introduction of "group" events into a framework of event semantics; these mirror the "group" individuals introduced by Landman (1989), and give the domain of events a structure similar to that of the domain of individuals. Group events are used in order to capture collectivity effects that cannot be represented through the domain of individuals, as in the case of predicate conjunction. An attempt to extend the notion of group events and to use them for counting with adverbials such as three times proves at the very least troublesome.

PDF version (39K)
PostScript version (132K)