
ICSFSurvey

Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasoning elevation🍓 and hallucination alleviation🍄.

Install / Use

/learn @IAAR-Shanghai/ICSFSurvey

README

<h2 align="center">Internal Consistency and Self-Feedback in Large Language Models: A Survey</h2>
<p align="center">
  <i>Explore concepts like Self-Correct, Self-Refine, Self-Improve, Self-Contradict, Self-Play, and Self-Knowledge, alongside o1-like reasoning elevation🍓 and hallucination alleviation🍄.</i>
</p>
<p align="center">
  <!-- arXiv -->
  <a href="https://arxiv.org/abs/2407.14507"><img src="https://img.shields.io/badge/Paper-red?style=flat&logo=arxiv"></a>
  <!-- GitHub -->
  <a href="https://github.com/IAAR-Shanghai/ICSFSurvey"><img src="https://img.shields.io/badge/Code-black?style=flat&logo=github"></a>
  <!-- Hugging Face -->
  <a href="https://huggingface.co/papers/2407.14507"><img src="https://img.shields.io/badge/-%F0%9F%A4%97%20Page-orange?style=flat"/></a>
</p>
<p align="center">
  <a href="https://scholar.google.com/citations?user=d0E7YlcAAAAJ">Xun Liang</a><sup>1*</sup>,
  <a href="https://ki-seki.github.io/">Shichao Song</a><sup>1*</sup>,
  <a href="https://github.com/fan2goa1">Zifan Zheng</a><sup>2*</sup>,
  <a href="https://github.com/MarrytheToilet">Hanyu Wang</a><sup>1</sup>,
  <a href="https://github.com/Duguce">Qingchen Yu</a><sup>2</sup>,
  <a href="https://xkli-allen.github.io/">Xunkai Li</a><sup>3</sup>,
  <a href="https://ronghuali.github.io/index.html">Rong-Hua Li</a><sup>3</sup>,
  Yi Wang<sup>4</sup>,
  Zhonghao Wang<sup>4</sup>,
  <a href="https://scholar.google.com/citations?user=GOKgLdQAAAAJ">Feiyu Xiong</a><sup>2</sup>,
  <a href="https://www.semanticscholar.org/author/Zhiyu-Li/2268429641">Zhiyu Li</a><sup>2†</sup>
</p>
<p align="center">
  <small>
    <sup>1</sup><a href="https://en.ruc.edu.cn/">RUC</a>,
    <sup>2</sup><a href="https://www.iaar.ac.cn/">IAAR</a>,
    <sup>3</sup><a href="https://english.bit.edu.cn/">BIT</a>,
    <sup>4</sup><a href="https://english.news.cn/">Xinhua</a>
    <br>
    <sup>*</sup>Equal contribution, <sup>†</sup>Corresponding author (lizy@iaar.ac.cn)
  </small>
</p>

> [!IMPORTANT]
>
> • Consider giving our repository a 🌟 so that you receive the latest news (paper list updates, new comments, etc.).
> • If you want to cite our work, our BibTeX entry is in CITATION.bib.

📰 News

  • 2024/10/26 We created a WeChat group (微信群) for discussing reasoning and hallucination in LLMs.
  • 2024/09/18 Released paper v3.0, along with a Twitter thread.
  • 2024/08/24 Updated the paper list for a better reading experience. Link. Ongoing updates.
  • 2024/07/22 Our paper ranked first on Hugging Face Daily Papers! Link.
  • 2024/07/21 Our paper is now available on arXiv. Link.

🎉 Introduction

Welcome to the GitHub repository for our survey paper titled "Internal Consistency and Self-Feedback in Large Language Models: A Survey." The survey's goal is to provide a unified perspective on the self-evaluation and self-updating mechanisms in LLMs, encapsulated within the frameworks of Internal Consistency and Self-Feedback.
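
To make the framework concrete, the Self-Feedback loop can be pictured as "express, self-evaluate, self-update." The Python below is a minimal sketch of that loop, not code from this repository; `generate`, `evaluate`, and `refine` are hypothetical stand-ins for whatever prompting or decoding strategy a given paper uses.

```python
from typing import Callable, Tuple

def self_feedback(
    generate: Callable[[str], str],            # prompt -> initial answer
    evaluate: Callable[[str, str], Tuple[str, float]],  # -> (feedback, score in [0, 1])
    refine: Callable[[str, str, str], str],    # (prompt, answer, feedback) -> new answer
    prompt: str,
    max_rounds: int = 3,
    threshold: float = 0.9,
) -> str:
    """Sketch of a Self-Feedback loop: express, self-evaluate, self-update."""
    answer = generate(prompt)                       # initial expression
    for _ in range(max_rounds):
        feedback, score = evaluate(prompt, answer)  # consistency signal
        if score >= threshold:                      # internally consistent enough
            return answer
        answer = refine(prompt, answer, feedback)   # self-update step
    return answer
```

Methods such as Self-Refine and Self-Correct can be read as instances of this loop that differ mainly in how the evaluate and refine steps are implemented.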

This repository includes three key resources:

<details><summary>Click Me to Show the Table of Contents</summary> </details>

📚 Paper List

Here we list the most important references cited in our survey, along with other papers we consider noteworthy. The list is updated regularly.

Related Survey Papers

These are the surveys most closely related to our paper.

  • A Survey on the Honesty of Large Language Models
    CUHK, arXiv, 2024 [Paper] [Code]

  • Awesome LLM Reasoning
    NTU, GitHub, 2024 [Code]

  • Awesome LLM Strawberry
    NVIDIA, GitHub, 2024 [Code]

  • Extrinsic Hallucinations in LLMs
    OpenAI, Blog, 2024 [Paper]

  • When Can LLMs Actually Correct Their Own Mistakes? A Critical Survey of Self-Correction of LLMs
    PSU, arXiv, 2024 [Paper]

  • A Survey on Self-Evolution of Large Language Models
    PKU, arXiv, 2024 [Paper] [Code]

  • Demystifying Chains, Trees, and Graphs of Thoughts
    ETH, arXiv, 2024 [Paper]

  • Automatically Correcting Large Language Models: Surveying the Landscape of Diverse Automated Correction Strategies
    UCSB, TACL, 2024 [Paper] [Code]

  • Uncertainty in Natural Language Processing: Sources, Quantification, and Applications
    Nankai, arXiv, 2023 [Paper]

Section IV: Consistency Signal Acquisition

From an LLM's various forms of expression, we can derive corresponding consistency signals, which in turn guide how those expressions are updated.
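
One widely used consistency signal is agreement across sampled responses: the more often independent samples converge on the same answer, the stronger the signal. A minimal sketch follows, assuming a hypothetical `sample(prompt)` callable that queries the model once and returns a normalized answer string; none of these names come from the survey itself.

```python
from collections import Counter
from typing import Callable, Tuple

def agreement_signal(sample: Callable[[str], str], prompt: str, k: int = 10) -> Tuple[str, float]:
    """Sample k answers and use the majority fraction as a consistency signal.

    `sample` should query the model with temperature > 0 so that
    independent calls can disagree.
    """
    answers = [sample(prompt) for _ in range(k)]
    top, count = Counter(answers).most_common(1)[0]
    return top, count / k  # e.g. 0.8 means 8 of 10 samples agreed

# A low agreement fraction suggests the answer should be revised or flagged.
```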

Confidence Estimation

  • Can LLMs Express Their Uncertainty? An Empirical Evaluation of Confidence Elicitation in LLMs
    NUS, ICLR, 2024 [Paper] [Code]

  • Linguistic Calibration of Long-Form Generations
    Stanford, ICML, 2024 [Paper] [Code]

  • InternalInspector I2: Robust Confidence Estimation in LLMs through Internal States
    VT, arXiv, 2024 [Paper]

  • Cycles of Thought: Measuring LLM Confidence through Stable Explanations
    UCLA, arXiv, 2024 [Paper]

  • TrustScore: Reference-Free Evaluation of LLM Response Trustworthiness
    UoEdin, arXiv, 2024 [Paper] [Code]

  • Semantic Uncertainty: Linguistic Invariances for Uncertainty Estimation in Natural Language Generation
    Oxford, ICLR, 2023 [Paper] [Code]

  • Quantifying Uncertainty in Answers from any Language Model and Enhancing their Trustworthiness
    UMD, arXiv, 2023 [Paper]

  • Teaching models to express their uncertainty in words
    Oxford, TMLR, 2022 [Paper] [Code]

  • Language Models (Mostly) Know What They Know
    Anthropic, arXiv, 2022 [Paper]
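
Several of the papers above derive confidence from the model's own token probabilities. As a rough illustration of that family of methods (not any single paper's approach), the geometric mean of token probabilities can serve as a simple confidence score, assuming the inference API exposes per-token log-probabilities:

```python
import math
from typing import List

def mean_logprob_confidence(token_logprobs: List[float]) -> float:
    """Map a generation's average token log-probability into (0, 1].

    `token_logprobs` is the per-token log-probability list that many
    inference APIs can return alongside the generated tokens.
    """
    avg = sum(token_logprobs) / len(token_logprobs)
    return math.exp(avg)  # geometric mean of token probabilities

# Example: logprobs [-0.1, -0.3, -0.2] -> exp(-0.2) ≈ 0.82 confidence.
```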

Hallucination Detection

  • Investigating Factuality in Long-Form Text Generation: The Roles of Self-Known and Self-Unknown
    arXiv, 2024 [Paper]