A Survey of Surveys (NLP & ML)

In this document, we survey hundreds of survey papers on Natural Language Processing (NLP) and Machine Learning (ML). We categorize these papers by popular topic and report simple counts for a few questions of interest. We also list the papers with their URLs (1063 papers).

:new: A list of LLM surveys is released! Link

Categorization

We follow the ACL and ICML submission guidelines of recent years, which cover a broad range of areas in NLP and ML. The categorization is as follows:

Natural Language Processing
- <a href="#computational-social-science-and-social-media">Computational Social Science and Social Media</a>
- <a href="#dialogue-and-interactive-systems">Dialogue and Interactive Systems</a>
- <a href="#generation">Generation</a>
- <a href="#information-extraction">Information Extraction</a>
- <a href="#information-retrieval-and-text-mining">Information Retrieval and Text Mining</a>
- <a href="#interpretability-and-analysis-of-models-for-nlp">Interpretability and Analysis of Models for NLP</a>
- <a href="#knowledge-graph">Knowledge Graph</a>
- <a href="#language-grounding-to-vision-robotics-and-beyond">Language Grounding to Vision, Robotics and Beyond</a>
- <a href="#large-language-models">Large Language Models</a>
- <a href="#linguistic-theories-cognitive-modeling-and-psycholinguistics">Linguistic Theories, Cognitive Modeling and Psycholinguistics</a>
- <a href="#machine-learning-for-nlp">Machine Learning for NLP</a>
- <a href="#machine-translation">Machine Translation</a>
- <a href="#named-entity-recognition">Named Entity Recognition</a>
- <a href="#natural-language-inference">Natural Language Inference</a>
- <a href="#natural-language-processing">Natural Language Processing</a>
- <a href="#nlp-applications">NLP Applications</a>
- <a href="#pre-trained-models">Pre-trained Models</a>
- <a href="#prompt">Prompt</a>
- <a href="#question-answering">Question Answering</a>
- <a href="#reading-comprehension">Reading Comprehension</a>
- <a href="#recommender-systems">Recommender Systems</a>
- <a href="#resources-and-evaluation">Resources and Evaluation</a>
- <a href="#semantics">Semantics</a>
- <a href="#sentiment-analysis-stylistic-analysis-and-argument-mining">Sentiment Analysis, Stylistic Analysis and Argument Mining</a>
- <a href="#speech-and-multimodality">Speech and Multimodality</a>
- <a href="#summarization">Summarization</a>
- <a href="#tagging-chunking-syntax-and-parsing">Tagging, Chunking, Syntax and Parsing</a>
- <a href="#text-classification">Text Classification</a>

Machine Learning
- <a href="#architectures">Architectures</a>
- <a href="#automl">AutoML</a>
- <a href="#bayesian-methods">Bayesian Methods</a>
- <a href="#classification-clustering-and-regression">Classification, Clustering and Regression</a>
- <a href="#computer-vision">Computer Vision</a>
- <a href="#contrastive-learning">Contrastive Learning</a>
- <a href="#curriculum-learning">Curriculum Learning</a>
- <a href="#data-augmentation">Data Augmentation</a>
- <a href="#deep-learning-general-methods">Deep Learning General Methods</a>
- <a href="#deep-reinforcement-learning">Deep Reinforcement Learning</a>
- <a href="#diffusion-models">Diffusion Models</a>
- <a href="#federated-learning">Federated Learning</a>
- <a href="#few-shot-and-zero-shot-learning">Few-Shot and Zero-Shot Learning</a>
- <a href="#general-machine-learning">General Machine Learning</a>
- <a href="#generative-adversarial-networks">Generative Adversarial Networks</a>
- <a href="#graph-neural-networks">Graph Neural Networks</a>
- <a href="#interpretability-and-analysis">Interpretability and Analysis</a>
- <a href="#knowledge-distillation">Knowledge Distillation</a>
- <a href="#meta-learning">Meta Learning</a>
- <a href="#metric-learning">Metric Learning</a>
- <a href="#ml-and-dl-applications">ML and DL Applications</a>
- <a href="#model-compression-and-acceleration">Model Compression and Acceleration</a>
- <a href="#multi-label-learning">Multi-Label Learning</a>
- <a href="#multi-task-and-multi-view-learning">Multi-Task and Multi-View Learning</a>
- <a href="#online-learning">Online Learning</a>
- <a href="#optimization">Optimization</a>
- <a href="#semi-supervised-weakly-supervised-and-unsupervised-learning">Semi-Supervised, Weakly-Supervised and Unsupervised Learning</a>
- <a href="#transfer-learning">Transfer Learning</a>
- <a href="#trustworthy-machine-learning">Trustworthy Machine Learning</a>

To reduce class imbalance, we split some popular sub-topics out of the original ACL and ICML categorization. For example, Named Entity Recognition is a first-level area in our categorization because several surveys focus on it.

Statistics

We show the number of papers in each area in Figures 1-2.

<img src="https://s2.loli.net/2023/05/26/DUa43miWf5NFlZx.png" width="70%" height="70%"/>
Figure 1: # of papers in each NLP area.

<img src="https://s2.loli.net/2023/05/26/z3PslUXbZFd6qrB.png" width="70%" height="70%"/>
Figure 2: # of papers in each ML area.

We also plot the number of papers by publication year (see Figure 3).

<img src="https://s2.loli.net/2023/05/26/7tMmcRO1lK9N5hF.png" width="70%" height="70%"/> Figure 3: # of papers vs publication year.
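The counts behind Figures 1-3 can be approximated with a few lines of Python. This is only a minimal sketch, not the script used to produce the figures: it assumes each entry ends with `YEAR paper bib`, as the entries in the paper list do, and the names `ENTRIES`, `publication_year`, and `count_by_year` are illustrative.

```python
import re
from collections import Counter

# A few entries copied from the paper list, in the "Title. Venue YEAR paper bib" shape.
ENTRIES = [
    "A Comprehensive Survey on Community Detection with Deep Learning. arXiv 2021 paper bib",
    "A Survey on Computational Propaganda Detection. IJCAI 2020 paper bib",
    "Computational Sociolinguistics: A Survey. Comput. Linguistics 2016 paper bib",
    "Societal Biases in Language Generation: Progress and Challenges. ACL 2021 paper bib",
]

# Assumption: the four-digit year sits directly before the trailing "paper bib" links.
YEAR_RE = re.compile(r"\b((?:19|20)\d{2})\s+paper\s+bib\s*$")


def publication_year(entry):
    """Return the publication year of one entry, or None if it cannot be found."""
    match = YEAR_RE.search(entry)
    return int(match.group(1)) if match else None


def count_by_year(entries):
    """Count entries per publication year, skipping entries without a year."""
    years = (publication_year(entry) for entry in entries)
    return Counter(year for year in years if year is not None)


counts = count_by_year(ENTRIES)
for year in sorted(counts):
    print(year, counts[year])
```

The same `Counter` approach gives per-area counts if each entry is paired with the heading it appears under.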

In addition, we generate word clouds to show hot topics in these surveys (see Figures 4-5).

<img src="https://s2.loli.net/2023/05/26/6RqNCKBwsEZtA3H.png" width="60%" height="60%"/>
Figure 4: The word cloud for NLP.

<img src="https://s2.loli.net/2023/05/26/zln92QYvmGLWMUE.png" width="60%" height="60%"/>
Figure 5: The word cloud for ML.
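A word cloud is driven by word frequencies, so the underlying numbers can be sketched as below. This is an illustration under stated assumptions rather than the procedure used for Figures 4-5: it assumes the words come from paper titles, and the stopword list and the name `title_word_counts` are hypothetical choices.

```python
import re
from collections import Counter

# Titles copied from the paper list; using titles as the text source is an assumption.
TITLES = [
    "A Survey of Fake News: Fundamental Theories, Detection Methods, and Opportunities",
    "A Survey on Computational Propaganda Detection",
    "Tackling Online Abuse: A Survey of Automated Abuse Detection Methods",
]

# Hypothetical stopword list: common function words plus "survey", which appears
# in nearly every title and would otherwise dominate the cloud.
STOPWORDS = {"a", "an", "and", "of", "on", "the", "in", "for", "to", "with",
             "survey", "surveys"}


def title_word_counts(titles, stopwords=STOPWORDS):
    """Count lowercase alphabetic words across titles, ignoring stopwords."""
    counts = Counter()
    for title in titles:
        for word in re.findall(r"[a-z]+", title.lower()):
            if word not in stopwords:
                counts[word] += 1
    return counts


word_counts = title_word_counts(TITLES)
print(word_counts.most_common(3))
```

The resulting frequencies can be handed to any plotting or word-cloud tool; larger counts correspond to larger words in the cloud.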

The NLP Paper List

Computational Social Science and Social Media

- A Comprehensive Survey on Community Detection with Deep Learning. arXiv 2021 paper bib
  Xing Su, Shan Xue, Fanzhen Liu, Jia Wu, Jian Yang, Chuan Zhou, Wenbin Hu, Cécile Paris, Surya Nepal, Di Jin, Quan Z. Sheng, Philip S. Yu
- A Survey of Fake News: Fundamental Theories, Detection Methods, and Opportunities. ACM Comput. Surv. 2021 paper bib
  Xinyi Zhou, Reza Zafarani
- A Survey of Race, Racism, and Anti-Racism in NLP. ACL 2021 paper bib
  Anjalie Field, Su Lin Blodgett, Zeerak Waseem, Yulia Tsvetkov
- A Survey on Computational Propaganda Detection. IJCAI 2020 paper bib
  Giovanni Da San Martino, Stefano Cresci, Alberto Barrón-Cedeño, Seunghak Yu, Roberto Di Pietro, Preslav Nakov
- A Survey on Trust Prediction in Online Social Networks. IEEE Access 2020 paper bib
  Seyed Mohssen Ghafari, Amin Beheshti, Aditya Joshi, Cécile Paris, Adnan Mahmood, Shahpar Yakhchi, Mehmet A. Orgun
- Computational Sociolinguistics: A Survey. Comput. Linguistics 2016 paper bib
  Dong Nguyen, A. Seza Dogruöz, Carolyn P. Rosé, Franciska de Jong
- Confronting Abusive Language Online: A Survey from the Ethical and Human Rights Perspective. J. Artif. Intell. Res. 2021 paper bib
  Svetlana Kiritchenko, Isar Nejadgholi, Kathleen C. Fraser
- From Symbols to Embeddings: A Tale of Two Representations in Computational Social Science. J. Soc. Comput. 2021 paper bib
  Huimin Chen, Cheng Yang, Xuanming Zhang, Zhiyuan Liu, Maosong Sun, Jianbin Jin
- Language (Technology) is Power: A Critical Survey of "Bias" in NLP. ACL 2020 paper bib
  Su Lin Blodgett, Solon Barocas, Hal Daumé III, Hanna M. Wallach
- Societal Biases in Language Generation: Progress and Challenges. ACL 2021 paper bib
  Emily Sheng, Kai-Wei Chang, Prem Natarajan, Nanyun Peng
- Tackling Online Abuse: A Survey of Automated Abuse Detection Methods. arXiv 2019 paper bib
  Pushkar Mishra, Helen Yannakoudakis, Ekaterina Shutova
- When do Word Embeddings Accurately Reflect Surveys on our Beliefs About People? ACL 2020 paper bib
