ABigSurvey
A collection of 1000+ survey papers on Natural Language Processing (NLP) and Machine Learning (ML).
A Survey of Surveys (NLP & ML)
In this document, we survey over a thousand survey papers on Natural Language Processing (NLP) and Machine Learning (ML). We categorize these papers into popular topics and report simple counts for some interesting questions. In addition, we list the papers with their URLs (1063 papers).
:new: A list of LLM surveys is released! Link
Categorization
We follow the ACL and ICML submission guidelines of recent years, covering a broad range of areas in NLP and ML. The categorization is as follows:
- Natural Language Processing
- <a href="#computational-social-science-and-social-media">Computational Social Science and Social Media</a>
- <a href="#dialogue-and-interactive-systems">Dialogue and Interactive Systems</a>
- <a href="#generation">Generation</a>
- <a href="#information-extraction">Information Extraction</a>
- <a href="#information-retrieval-and-text-mining">Information Retrieval and Text Mining</a>
- <a href="#interpretability-and-analysis-of-models-for-nlp">Interpretability and Analysis of Models for NLP</a>
- <a href="#knowledge-graph">Knowledge Graph</a>
- <a href="#language-grounding-to-vision-robotics-and-beyond">Language Grounding to Vision, Robotics and Beyond</a>
- <a href="#large-language-models">Large Language Models</a>
- <a href="#linguistic-theories-cognitive-modeling-and-psycholinguistics">Linguistic Theories, Cognitive Modeling and Psycholinguistics</a>
- <a href="#machine-learning-for-nlp">Machine Learning for NLP</a>
- <a href="#machine-translation">Machine Translation</a>
- <a href="#named-entity-recognition">Named Entity Recognition</a>
- <a href="#natural-language-inference">Natural Language Inference</a>
- <a href="#natural-language-processing">Natural Language Processing</a>
- <a href="#nlp-applications">NLP Applications</a>
- <a href="#pre-trained-models">Pre-trained Models</a>
- <a href="#prompt">Prompt</a>
- <a href="#question-answering">Question Answering</a>
- <a href="#reading-comprehension">Reading Comprehension</a>
- <a href="#recommender-systems">Recommender Systems</a>
- <a href="#resources-and-evaluation">Resources and Evaluation</a>
- <a href="#semantics">Semantics</a>
- <a href="#sentiment-analysis-stylistic-analysis-and-argument-mining">Sentiment Analysis, Stylistic Analysis and Argument Mining</a>
- <a href="#speech-and-multimodality">Speech and Multimodality</a>
- <a href="#summarization">Summarization</a>
- <a href="#tagging-chunking-syntax-and-parsing">Tagging, Chunking, Syntax and Parsing</a>
- <a href="#text-classification">Text Classification</a>
- Machine Learning
- <a href="#architectures">Architectures</a>
- <a href="#automl">AutoML</a>
- <a href="#bayesian-methods">Bayesian Methods</a>
- <a href="#classification-clustering-and-regression">Classification, Clustering and Regression</a>
- <a href="#computer-vision">Computer Vision</a>
- <a href="#contrastive-learning">Contrastive Learning</a>
- <a href="#curriculum-learning">Curriculum Learning</a>
- <a href="#data-augmentation">Data Augmentation</a>
- <a href="#deep-learning-general-methods">Deep Learning General Methods</a>
- <a href="#deep-reinforcement-learning">Deep Reinforcement Learning</a>
- <a href="#diffusion-models">Diffusion Models</a>
- <a href="#federated-learning">Federated Learning</a>
- <a href="#few-shot-and-zero-shot-learning">Few-Shot and Zero-Shot Learning</a>
- <a href="#general-machine-learning">General Machine Learning</a>
- <a href="#generative-adversarial-networks">Generative Adversarial Networks</a>
- <a href="#graph-neural-networks">Graph Neural Networks</a>
- <a href="#interpretability-and-analysis">Interpretability and Analysis</a>
- <a href="#knowledge-distillation">Knowledge Distillation</a>
- <a href="#meta-learning">Meta Learning</a>
- <a href="#metric-learning">Metric Learning</a>
- <a href="#ml-and-dl-applications">ML and DL Applications</a>
- <a href="#model-compression-and-acceleration">Model Compression and Acceleration</a>
- <a href="#multi-label-learning">Multi-Label Learning</a>
- <a href="#multi-task-and-multi-view-learning">Multi-Task and Multi-View Learning</a>
- <a href="#online-learning">Online Learning</a>
- <a href="#optimization">Optimization</a>
- <a href="#semi-supervised-weakly-supervised-and-unsupervised-learning">Semi-Supervised, Weakly-Supervised and Unsupervised Learning</a>
- <a href="#transfer-learning">Transfer Learning</a>
- <a href="#trustworthy-machine-learning">Trustworthy Machine Learning</a>
To reduce class imbalance, we separate some of the hot sub-topics from the original categorization of ACL and ICML submissions. For example, Named Entity Recognition is a first-level area in our categorization because it is the focus of several surveys.
Statistics
We show the number of papers in each area in Figures 1-2.
<p align="center"><img src="https://s2.loli.net/2023/05/26/DUa43miWf5NFlZx.png" width="70%" height="70%"/></p> <p align="center">Figure 1: # of papers in each NLP area.</p> <p align="center"><img src="https://s2.loli.net/2023/05/26/z3PslUXbZFd6qrB.png" width="70%" height="70%"/></p> <p align="center">Figure 2: # of papers in each ML area.</p>

Also, we plot the number of papers as a function of publication year (see Figure 3).
<p align="center"><img src="https://s2.loli.net/2023/05/26/7tMmcRO1lK9N5hF.png" width="70%" height="70%"/></p> <p align="center">Figure 3: # of papers vs publication year.</p>

In addition, we generate word clouds to show hot topics in these surveys (see Figures 4-5).
<p align="center"><img src="https://s2.loli.net/2023/05/26/6RqNCKBwsEZtA3H.png" width="60%" height="60%"/></p> <p align="center">Figure 4: The word cloud for NLP.</p> <p align="center"><img src="https://s2.loli.net/2023/05/26/zln92QYvmGLWMUE.png" width="60%" height="60%"/></p> <p align="center">Figure 5: The word cloud for ML.</p>

The NLP Paper List
Computational Social Science and Social Media
- A Comprehensive Survey on Community Detection with Deep Learning. arXiv 2021 paper bib
  Xing Su, Shan Xue, Fanzhen Liu, Jia Wu, Jian Yang, Chuan Zhou, Wenbin Hu, Cécile Paris, Surya Nepal, Di Jin, Quan Z. Sheng, Philip S. Yu
- A Survey of Fake News: Fundamental Theories, Detection Methods, and Opportunities. ACM Comput. Surv. 2021 paper bib
  Xinyi Zhou, Reza Zafarani
- A Survey of Race, Racism, and Anti-Racism in NLP. ACL 2021 paper bib
  Anjalie Field, Su Lin Blodgett, Zeerak Waseem, Yulia Tsvetkov
- A Survey on Computational Propaganda Detection. IJCAI 2020 paper bib
  Giovanni Da San Martino, Stefano Cresci, Alberto Barrón-Cedeño, Seunghak Yu, Roberto Di Pietro, Preslav Nakov
- A Survey on Trust Prediction in Online Social Networks. IEEE Access 2020 paper bib
  Seyed Mohssen Ghafari, Amin Beheshti, Aditya Joshi, Cécile Paris, Adnan Mahmood, Shahpar Yakhchi, Mehmet A. Orgun
- Computational Sociolinguistics: A Survey. Comput. Linguistics 2016 paper bib
  Dong Nguyen, A. Seza Dogruöz, Carolyn P. Rosé, Franciska de Jong
- Confronting Abusive Language Online: A Survey from the Ethical and Human Rights Perspective. J. Artif. Intell. Res. 2021 paper bib
  Svetlana Kiritchenko, Isar Nejadgholi, Kathleen C. Fraser
- From Symbols to Embeddings: A Tale of Two Representations in Computational Social Science. J. Soc. Comput. 2021 paper bib
  Huimin Chen, Cheng Yang, Xuanming Zhang, Zhiyuan Liu, Maosong Sun, Jianbin Jin
- Language (Technology) is Power: A Critical Survey of "Bias" in NLP. ACL 2020 paper bib
  Su Lin Blodgett, Solon Barocas, Hal Daumé III, Hanna M. Wallach
- Societal Biases in Language Generation: Progress and Challenges. ACL 2021 paper bib
  Emily Sheng, Kai-Wei Chang, Prem Natarajan, Nanyun Peng
- Tackling Online Abuse: A Survey of Automated Abuse Detection Methods. arXiv 2019 paper bib
  Pushkar Mishra, Helen Yannakoudakis, Ekaterina Shutova
- When do Word Embeddings Accurately Reflect Surveys on our Beliefs About People? ACL 2020 paper bib
