Conversational Agents (CAs) are now ubiquitous, appearing in many aspects of our daily lives, including computers (e.g., Cortana), smartphones (e.g., Siri), homes, and websites. These systems may be rule-based or built on statistical machine learning models; in the latter case, CAs are trained on conversation corpora that are either gathered in real-life settings or created using the Wizard-of-Oz method. When corpora are gathered from real-life sources, such as online platforms like Reddit, the large variety of data helps CA responses generalize, but it may also introduce implicit or explicit biases related to racial, sexual, political, or gender matters. The presence of such biases in the data poses a challenge to the construction of CAs, as it may amplify the risks that biases pose to society. It is therefore crucial to ensure that CAs exhibit responsible and safe behavior in their interactions with users.
The aim of this workshop is to address the challenges posed by biased data in both Machine Learning and society. We invite researchers to compare different chatbots/corpora with respect to sexual, political, racial, or gender topics; study methods to avoid introducing or to mitigate bias during the construction of datasets; define approaches to assess and/or remove bias present in corpora; or handle bias at the chatbot level through NLP or Machine Learning techniques.
We also welcome submissions that take a theoretical approach to addressing the issue of bias in CAs, without necessarily involving the creation of a corpus, implementation of an agent, or exploration of Machine Learning techniques.
The workshop solicits contributions including, but not limited to, the following topics:
Authors are invited to submit original, previously unpublished research papers.
We encourage the submission of:
Abstracts and papers must be written in English and formatted according to the Springer LNCS guidelines. Author instructions, style files, and the copyright form can be downloaded here. All papers must be converted to PDF prior to electronic submission.
All papers need to be ‘best-effort’ anonymized. We strongly encourage making code and data available anonymously (e.g., in an anonymous GitHub repository via Anonymous GitHub or in a Dropbox folder). Authors may have a (non-anonymous) pre-print published online, but it should not be cited in the submitted paper, to preserve anonymity; reviewers will be asked not to search for it.
At least one author of each accepted paper must have a full registration and attend in person to present the paper. Papers without a full registration or an in-person presentation will not be included in the post-workshop Springer proceedings.
When submitting, please select the track ‘Biased Data in Conversational Agents’.