Research Vision

Progress toward truly open-domain social dialogue agents has been hindered by the limited diversity, scale, and quality of available training corpora. SODA is the first million-scale, high-quality dialogue dataset covering a wide range of social interactions.


SODA: Million-scale Dialogue Distillation with Social Commonsense Contextualization

Hyunwoo Kim, Jack Hessel, Liwei Jiang, Ximing Lu, Youngjae Yu, Pei Zhou, and 5 more... arXiv 2022