GlitchMiner

[AAAI 2026] Code of the paper "GlitchMiner: Mining Glitch Tokens in Large Language Models via Gradient-based Discrete Optimization"

GlitchMiner: Mining Glitch Tokens in Large Language Models via Gradient-based Discrete Optimization

Python HuggingFace Transformers Contributions welcome

Update

  • 2025/11/8 🎉🎉🎉 Our paper has been accepted by AAAI 2026!
  • 2024/11/7 We added the reproduction code for the two baseline methods from the paper.

Read our paper for detailed insights.


🔍 Introduction

GlitchMiner is a framework for detecting glitch tokens—tokens that trigger unexpected behaviors in large language models (LLMs). Such anomalies can severely distort model outputs, particularly in sensitive applications such as healthcare or finance. GlitchMiner identifies them efficiently through gradient-based discrete optimization.

<img width="1018" height="525" alt="image" src="https://github.com/user-attachments/assets/35dde1fa-c932-45b6-8b65-ce0fbdc092cc" />

  • Left: The pipeline.

  • Right: Visualization of GlitchMiner's local search strategy.
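The local search strategy sketched above explores the `k` nearest tokens in embedding space, ranked by cosine similarity. A toy NumPy sketch of that neighbor lookup, using random vectors in place of a real embedding matrix (this is illustrative only, not GlitchMiner's actual implementation):

```python
import numpy as np

def top_k_neighbors(embeddings, token_id, k):
    """Ids of the k tokens most cosine-similar to `token_id` (self excluded)."""
    v = embeddings[token_id]
    denom = np.linalg.norm(embeddings, axis=1) * np.linalg.norm(v)
    sims = embeddings @ v / np.maximum(denom, 1e-12)
    sims[token_id] = -np.inf  # never return the query token itself
    return np.argsort(sims)[::-1][:k]

rng = np.random.default_rng(0)
emb = rng.standard_normal((100, 16))  # toy "vocabulary": 100 tokens, dim 16
neighbors = top_k_neighbors(emb, token_id=7, k=5)
print(neighbors)
```

In GlitchMiner the corresponding neighborhood size is set by the `k` parameter (default 32), and the candidates are then scored with the gradient-based objective.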


🛠️ Getting Started

Install GlitchMiner with pip:

    pip install git+https://github.com/wooozihui/GlitchMiner.git

Usage Example

    from transformers import AutoTokenizer, AutoModelForCausalLM
    from glitchminer import GlitchMiner
    import torch

    if __name__ == "__main__":
        model_path = "Qwen/Qwen2.5-7B-Instruct"
        tokenizer = AutoTokenizer.from_pretrained(model_path)
        model = AutoModelForCausalLM.from_pretrained(
            model_path,
            device_map="cuda",
            torch_dtype=torch.bfloat16,
        )

        # Run GlitchMiner for glitch token detection
        glitch_tokens, glitch_token_ids = GlitchMiner(
            model,
            tokenizer,
            num_iterations=125,
            batch_size=8,
            k=32,
            if_print=True,
            print_language="CN",
        )

Strict Glitch Token Verification

To eliminate false positives, we recommend using the strictly_glitch_verification function for cross-validation.

    from glitchminer import strictly_glitch_verification
    glitch_count, verified_glitch_ids = strictly_glitch_verification(model, tokenizer, glitch_token_ids)
    print(glitch_count)

⚙️ GlitchMiner Parameters

Here are the configurable parameters for GlitchMiner, with explanations of their purpose and usage:

| Parameter | Type | Default Value | Description |
|-------------------|-----------|---------------|--------------------------------------------------------------------------------------|
| model | Model | Required | A Hugging Face AutoModelForCausalLM model used for glitch token detection. |
| tokenizer | Tokenizer | Required | A Hugging Face AutoTokenizer for encoding and decoding tokens. |
| num_iterations | int | 125 | Number of iterations to run the glitch token search. |
| batch_size | int | 8 | Number of tokens processed per batch during the search. |
| k | int | 32 | Number of top similar tokens (by cosine similarity) evaluated in each iteration. |
| if_print | bool | True | If True, prints detailed progress and results during execution. |
| print_language | str | "CN" | Output language for printed messages: "CN" (Chinese) or "ENG" (English). |
| skip_tokens | list | None | Optional list of token IDs to exclude from the glitch detection process. |
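The `skip_tokens` parameter lets you keep known non-glitch ids, such as the tokenizer's special tokens, out of the search. A minimal sketch of assembling such a list; the `build_skip_tokens` helper is hypothetical (not part of GlitchMiner), and in practice `special_ids` would come from the tokenizer's `all_special_ids` attribute:

```python
def build_skip_tokens(special_ids, extra_ids=()):
    """Merge special-token ids with manually chosen ids, deduplicated, order kept."""
    seen, out = set(), []
    for tid in list(special_ids) + list(extra_ids):
        if tid not in seen:
            seen.add(tid)
            out.append(tid)
    return out

# The literal ids below are placeholders for tokenizer.all_special_ids.
skip_tokens = build_skip_tokens([151643, 151644, 151645], extra_ids=[0, 151643])
print(skip_tokens)  # [151643, 151644, 151645, 0]
```

The resulting list can then be passed as `skip_tokens=...` in the GlitchMiner call shown in the usage example.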


🌟 Citing

If you find GlitchMiner helpful in your research, please consider giving us a star or citing our paper:

    @misc{wu2024glitchminerminingglitchtokens,
          title={GlitchMiner: Mining Glitch Tokens in Large Language Models via Gradient-based Discrete Optimization},
          author={Zihui Wu and Haichang Gao and Ping Wang and Shudong Zhang and Zhaoxiang Liu and Shiguo Lian},
          year={2024},
          eprint={2410.15052},
          archivePrefix={arXiv},
          primaryClass={cs.AI},
          url={https://arxiv.org/abs/2410.15052},
    }
