ICSFSurvey
Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasoning elevation🍓 and hallucination alleviation🍄.
[!IMPORTANT]
- Consider giving our repository a 🌟, so you will receive the latest news (paper list updates, new comments, etc.);
- If you want to cite our work, here is our bibtex entry: CITATION.bib.
📰 News
- 2024/10/26 We have created a relevant WeChat Group (微信群) for discussing reasoning and hallucination in LLMs.
- 2024/09/18 Paper v3.0 and a relevant Twitter thread.
- 2024/08/24 Updated paper list for better user experience. Link. Ongoing updates.
- 2024/07/22 Our paper ranks first on Hugging Face Daily Papers! Link.
- 2024/07/21 Our paper is now available on arXiv. Link.
🎉 Introduction
Welcome to the GitHub repository for our survey paper titled "Internal Consistency and Self-Feedback in Large Language Models: A Survey." The survey's goal is to provide a unified perspective on the self-evaluation and self-updating mechanisms in LLMs, encapsulated within the frameworks of Internal Consistency and Self-Feedback.

This repository includes three key resources:
- expt-consistency-types: Code and results for measuring consistency at different levels.
- expt-gpt4o-responses: Results from five different GPT-4o responses to the same query.
- Paper List: A comprehensive list of references related to our survey.

Table of Contents
- 📰 News
- 🎉 Introduction
- 📚 Paper List
- 📝 Citation
📚 Paper List
Here we list the most important references cited in our survey, as well as the papers we consider worth noting. This list will be updated regularly.
Related Survey Papers
These are some of the most relevant surveys related to our paper.
- A Survey on the Honesty of Large Language Models
  CUHK, arXiv, 2024 [Paper] [Code]
- Awesome LLM Reasoning
  NTU, GitHub, 2024 [Code]
- Awesome LLM Strawberry
  NVIDIA, GitHub, 2024 [Code]
- Extrinsic Hallucinations in LLMs
  OpenAI, Blog, 2024 [Paper]
- When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs
  PSU, arXiv, 2024 [Paper]
- A Survey on Self-Evolution of Large Language Models
  PKU, arXiv, 2024 [Paper] [Code]
- Demystifying Chains, Trees, and Graphs of Thoughts
  ETH, arXiv, 2024 [Paper]
- Automatically Correcting Large Language Models: Surveying the Landscape of Diverse Automated Correction Strategies
  UCSB, TACL, 2024 [Paper] [Code]
- Uncertainty in Natural Language Processing: Sources, Quantification, and Applications
  Nankai, arXiv, 2023 [Paper]
Section IV: Consistency Signal Acquisition
From the different forms of expression an LLM produces, we can obtain correspondingly different consistency signals, which in turn help update those expressions more effectively.
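As a rough illustration of what one such signal might look like, the sketch below treats the agreement rate among several sampled answers to the same query as a consistency signal. It is not taken from the survey or from this repository's code: the function name `consistency_signal`, the string normalization, and the sample answers are all hypothetical, and it assumes answers are short strings that can be compared after trimming and lowercasing.

```python
from collections import Counter


def consistency_signal(responses):
    """Return (majority_answer, agreement) for a list of sampled answers.

    `agreement` is the fraction of samples that match the majority answer.
    Hypothetical helper for illustration only.
    """
    if not responses:
        raise ValueError("need at least one response")
    # Normalize so trivial formatting differences do not count as disagreement.
    normalized = [r.strip().lower() for r in responses]
    counts = Counter(normalized)
    answer, votes = counts.most_common(1)[0]
    return answer, votes / len(normalized)


# Five made-up answers standing in for repeated samples of one query.
samples = ["Paris", "paris", "Paris ", "Lyon", "Paris"]
answer, agreement = consistency_signal(samples)
print(answer, agreement)  # prints: paris 0.8
```

A low agreement value would flag the response as a candidate for further checking or updating. Free-form answers would need a semantic comparison rather than exact string matching.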
Confidence Estimation
- Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs
  NUS, ICLR, 2024 [Paper] [Code]
- Linguistic Calibration of Long-Form Generations
  Stanford, ICML, 2024 [Paper] [Code]
- InternalInspector I2: Robust Confidence Estimation in LLMs through Internal States
  VT, arXiv, 2024 [Paper]
- Cycles of Thought: Measuring LLM Confidence through Stable Explanations
  UCLA, arXiv, 2024 [Paper]
- TrustScore: Reference-Free Evaluation of LLM Response Trustworthiness
  UoEdin, arXiv, 2024 [Paper] [Code]
- Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation
  Oxford, ICLR, 2023 [Paper] [Code]
- Quantifying Uncertainty in Answers from any Language Model and Enhancing their Trustworthiness
  UMD, arXiv, 2023 [Paper]
- Teaching models to express their uncertainty in words
  Oxford, TMLR, 2022 [Paper] [Code]
- Language Models (Mostly) Know What They Know
  Anthropic, arXiv, 2022 [Paper]
Hallucination Detection
- **Investigating Factuality in Long-Form Text Generation: The Roles of Self-Known and Self-
