ConversationalIR

Overview of venues, research themes and datasets relevant for conversational search.

Install / Use

/learn @chauff/ConversationalIR

README

Workshops and conferences

Please make a pull request if you find a venue to be missing.

2017

2018

2019

2020

2021

2022

Research themes

This categorization was created in a bottom-up manner based on the 300+ papers accepted by the workshops/conferences listed above (up to and including the MICROS 2021 workshop). Each paper was included in only one category or sub-category. This is of course purely subjective; ask 10 different researchers to come up with categories and you get 10 different results ...

The bracketed numbers starting with → indicate the total number of papers in that branch of the tree; e.g., a total of 48 papers fall into some node inside the Domains category. A bracketed number after a sub-category without → indicates the number of papers I assigned to that particular node. For instance, chitchat (4) means that 4 papers fell into this category, while Alexa Prize (3) means that three different papers fell into this more specific sub-category.
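The counting convention can be sketched as a small tree walk: a node's → total is its own paper count plus the counts of all its descendants. The node names and numbers below are illustrative only, not the actual categorization:

```python
def branch_total(node):
    """Return the arrow (→) total: papers assigned directly to this
    node plus the totals of all of its descendant nodes."""
    return node.get("count", 0) + sum(branch_total(c) for c in node.get("children", []))

# Illustrative fragment of a category tree (not the real data):
chitchat = {
    "name": "chitchat",
    "count": 4,  # 4 papers assigned directly to 'chitchat'
    "children": [
        {"name": "Alexa Prize", "count": 3},  # 3 different papers in the sub-category
    ],
}

print(branch_total(chitchat))  # → 7, i.e. 4 + 3
```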

The mindmap was created with markmap, a neat markdown-to-mindmap tool. The SVG of the mindmap is also available: open it in your favourite browser to experience an unpixelated mindmap. If you want to reuse the categories, or alter/update/edit them, take the markdown file as a starting point and then head over to markmap!
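Roughly speaking, markmap's input is just nested markdown: headings and list indentation become branches of the mindmap. A sketch of the shape (with invented headings, not the actual file) looks like this:

```markdown
# research themes

## Domains
- e-commerce
- healthcare

## chitchat
- Alexa Prize
```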

[Mindmap: research themes]

Interestingly, if you check out the report published about the Dagstuhl Conversational Search seminar in late 2019 (where many researchers came together to define and work on a roadmap), you will find that the themes chosen there are not well covered by the research lines above. Dagstuhl themes:

  • Defining conversational search
  • Evaluating conversational search
  • Modeling in conversational search
  • Argumentation and explanation
  • Scenarios that invite conversational search
  • Conversational search for learning technologies
  • Common conversational community prototype

Relevant datasets/benchmarks 🗂 and leaderboards 🚴

  • 🚴 XOR-TyDi QA
    • "XOR-TyDi QA brings together for the first time information-seeking questions, open-retrieval QA, and multilingual QA to create a multilingual open-retrieval QA dataset that enables cross-lingual answer retrieval. It consists of questions written by information-seeking native speakers in 7 typologically diverse languages and answer annotations that are retrieved from multilingual document collections."
  • 🚴 Leaderboard for multi-turn response selection
    • "Multi-turn response selection in retrieval-based chatbots is a task which aims to select the best-matched response from a set of candidates, given the context of a conversation. This task is attracting more and more attention in academia and industry. However, no one has maintained a leaderboard and a collection of popular papers and datasets yet. The main objective of this repository is to provide the reader with a quick overview of benchmark datasets and the state-of-the-art studies on this task, which serves as a stepping stone for further research."
  • 🚴 Papers with code leaderboard for conversational response selection
  • 🗂 MANtIS: a Multi-Domain Information Seeking Dialogues Dataset.
    • "Unlike previous information-seeking dialogue datasets that focus on only one domain, MANtIS has more than 80K diverse conversations from 14 different Stack Exchange sites, such as physics, travel and worldbuilding. Additionally, all dialogues have a url, providing grounding to the conversations. It can be used for the following tasks: conversation response ranking/generation and user intent prediction. We provide manually annotated user intent labels for more than 1300 dialogues, resulting in a total of 6701 labeled utterances."
  • 🗂 CANARD
    • "CANARD is a dataset for question-in-context rewriting that consists of questions each given in a dialog context together with a context-independent rewriting of the question."
    • "CANARD is constructed by crowdsourcing question rewritings using Amazon Mechanical Turk. We apply several automatic and manual quality controls to ensure the quality of the data collection process. The dataset consists of 40,527 questions with different context lengths."
  • 🗂 QReCC
    • "We introduce QReCC (Question Rewriting in Conversational Context), an end-to-end open-domain question answering dataset comprising 14K conversations with 81K question-answer pairs. The goal of this dataset is to provide a challenging benchmark for end-to-end conversational question answering that includes the individual subtasks of question rewriting, passage retrieval and reading comprehension."
  • 🗂 CAsT-19: A Dataset for Conversational Information Seeking
    • "The corpus is 38,426,252 passages from the TREC Complex Answer Retrieval (CAR) and Microsoft MAchine Reading COmprehension (MARCO) datasets. Eighty information seeking dialogues (30 train, 50 test) are on average 9 to 10 questions long. A dialogue may explore a topic broadly or drill down into subtopics." (source)
  • 🗂 FIRE 2020 task: Retrieval From Conversational Dialogues (RCD-2020)
    • "Task 1: Given an excerpt of a dialogue act, output the span of text indicating a potential piece of info
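To make the shape of the multi-turn response selection task listed above concrete, here is a toy sketch: given the conversation context, score every candidate response and return the best match. Real systems use trained neural matchers; the token-overlap scorer below is only an illustrative stand-in:

```python
def overlap_score(context, candidate):
    """Toy matcher: count tokens shared between the conversation
    context (a list of utterances) and a candidate response."""
    ctx_tokens = set(" ".join(context).lower().split())
    cand_tokens = set(candidate.lower().split())
    return len(ctx_tokens & cand_tokens)

def select_response(context, candidates):
    """Pick the candidate with the highest match score."""
    return max(candidates, key=lambda c: overlap_score(context, c))

context = ["my laptop battery drains fast", "it is two years old"]
candidates = [
    "try replacing the laptop battery",
    "the weather is nice today",
]
print(select_response(context, candidates))  # → "try replacing the laptop battery"
```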

View on GitHub

GitHub stars: 146
Forks: 19
Category: Education
Updated: 4 months ago

Security Score

82/100

Audited on Nov 28, 2025; no findings.