GLiNER2: Unified Schema-Based Information Extraction and Text Classification
Extract entities, classify text, parse structured data, and extract relations—all in one efficient model.
GLiNER2 unifies Named Entity Recognition, Text Classification, Structured Data Extraction, and Relation Extraction into a single 205M parameter model. It provides efficient CPU-based inference without requiring complex pipelines or external API dependencies.
✨ Why GLiNER2?
- 🎯 One Model, Four Tasks: Entities, classification, structured data, and relations in a single forward pass
- 💻 CPU First: Fast inference on standard CPU hardware; no GPU required
- 🛡️ Privacy: 100% local processing with no external API calls
🚀 Installation & Quick Start
pip install gliner2
from gliner2 import GLiNER2
# Load model once, use everywhere
extractor = GLiNER2.from_pretrained("fastino/gliner2-base-v1")
# Extract entities in one line
text = "Apple CEO Tim Cook announced iPhone 15 in Cupertino yesterday."
result = extractor.extract_entities(text, ["company", "person", "product", "location"])
print(result)
# {'entities': {'company': ['Apple'], 'person': ['Tim Cook'], 'product': ['iPhone 15'], 'location': ['Cupertino']}}
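The result is a plain dictionary keyed by label, so it can be post-processed with ordinary Python. The sketch below flattens it into (label, text) pairs. `flatten_entities` is a hypothetical helper, not part of GLiNER2, and the sample dictionary is copied from the output above rather than produced by a live model call.

```python
from typing import Dict, List, Tuple


def flatten_entities(result: Dict[str, Dict[str, List[str]]]) -> List[Tuple[str, str]]:
    """Turn {'entities': {label: [text, ...]}} into a flat list of (label, text) pairs."""
    pairs = []
    for label, mentions in result.get("entities", {}).items():
        for mention in mentions:
            pairs.append((label, mention))
    return pairs


# Sample copied from the quick-start output above (not a live model call).
sample = {
    "entities": {
        "company": ["Apple"],
        "person": ["Tim Cook"],
        "product": ["iPhone 15"],
        "location": ["Cupertino"],
    }
}

for label, mention in flatten_entities(sample):
    print(f"{label}: {mention}")
```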
Quantization and Compilation
Enable fp16 and/or torch.compile for faster inference — no extra dependencies required.
# fp16 (half precision), enabled with quantize=True
model = GLiNER2.from_pretrained("fastino/gliner2-base-v1", map_location="cuda", quantize=True)
# torch.compile (fused GPU kernels, first call triggers tracing)
model = GLiNER2.from_pretrained("fastino/gliner2-base-v1", map_location="cuda", compile=True)
# Both
model = GLiNER2.from_pretrained("fastino/gliner2-base-v1", map_location="cuda", quantize=True, compile=True)
# Or after loading
model.quantize()
model.compile()
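Because the first call after `compile=True` triggers tracing, a fair speed comparison should time the warm-up call separately from steady-state calls. The following is a minimal timing sketch under that assumption. `time_calls` is a hypothetical helper, and the lambda is a stand-in for a real call such as `model.extract_entities(text, labels)`.

```python
import time
from statistics import mean
from typing import Callable, Dict


def time_calls(fn: Callable[[], object], warmup: int = 1, runs: int = 5) -> Dict[str, float]:
    """Time `fn`, reporting warm-up calls separately from steady-state calls."""
    warmup_times = []
    for _ in range(warmup):
        start = time.perf_counter()
        fn()
        warmup_times.append(time.perf_counter() - start)

    run_times = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        run_times.append(time.perf_counter() - start)

    return {
        "warmup_s": sum(warmup_times),  # total time spent in warm-up calls
        "mean_s": mean(run_times),      # average steady-state call time
        "min_s": min(run_times),        # fastest steady-state call
    }


# Stand-in workload; replace with e.g. lambda: model.extract_entities(text, labels).
stats = time_calls(lambda: sum(range(10_000)))
print(stats)
```

Running the same helper against a plain, a quantized, and a compiled model gives comparable numbers for the hardware at hand.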
🌐 API Access: GLiNER XL 1B
Our largest and most powerful model, GLiNER XL 1B, is available exclusively via API. It needs no GPU and no model download. Get your API key at gliner.pioneer.ai.
from gliner2 import GLiNER2
# Access GLiNER XL 1B via API
extractor = GLiNER2.from_api() # Uses PIONEER_API_KEY env variable
result = extractor.extract_entities(
"OpenAI CEO Sam Altman announced GPT-5 at their San Francisco headquarters.",
["company", "person", "product", "location"]
)
# {'entities': {'company': ['OpenAI'], 'person': ['Sam Altman'], 'product': ['GPT-5'], 'location': ['San Francisco']}}
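`from_api()` reads the key from the `PIONEER_API_KEY` environment variable, so it can help to fail early with a clear message when the variable is unset. Below is a small sketch of such a check. `require_api_key` is a hypothetical helper, and how GLiNER2 itself reacts to a missing key is not covered here.

```python
import os
from typing import Mapping, Optional


def require_api_key(env: Optional[Mapping[str, str]] = None,
                    name: str = "PIONEER_API_KEY") -> str:
    """Return the API key from the environment, or raise a descriptive error."""
    env = os.environ if env is None else env
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before calling GLiNER2.from_api().")
    return key


# Demonstration with an explicit mapping so no real key is needed.
print(require_api_key({"PIONEER_API_KEY": "example-key"}))
```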
📦 Available Models
| Model | Parameters | Description | Use Case |
|-------|------------|-------------|--------------------------------------------------|
| fastino/gliner2-base-v1 | 205M | base size | Extraction / classification |
| fastino/gliner2-large-v1 | 340M | large size | Extraction / classification |
The models are available on Hugging Face.
📚 Documentation & Tutorials
Comprehensive guides for all GLiNER2 features:
Core Features
- Text Classification - Single and multi-label classification with confidence scores
- Entity Extraction - Named entity recognition with descriptions and spans
- Structured Data Extraction - Parse complex JSON structures from text
- Combined Schemas - Multi-task extraction in a single pass
- Regex Validators - Filter and validate extracted spans
- Relation Extraction - Extract relationships between entities
- API Access - Use GLiNER2 via cloud API
Training & Customization
- Training Data Format - Complete guide to preparing training data
- Model Training - Train custom models for your domain
- LoRA Adapters - Parameter-efficient fine-tuning
- Adapter Switching - Switch between domain adapters
🎯 Core Capabilities
1. Entity Extraction
Extract named entities with optional descriptions for precision:
# Basic entity extraction
entities = extractor.extract_entities(
"Patient received 400mg ibuprofen for severe headache at 2 PM.",
["medication", "dosage", "symptom", "time"]
)
# Output: {'entities': {'medication': ['ibuprofen'], 'dosage': ['400mg'], 'symptom': ['severe headache'], 'time': ['2 PM']}}
# Enhanced with descriptions for medical accuracy
entities = extractor.extract_entities(
"Patient received 400mg ibuprofen for severe headache at 2 PM.",
{
"medication": "Names of drugs, medications, or pharmaceutical substances",
"dosage": "Specific amounts like '400mg', '2 tablets', or '5ml'",
"symptom": "Medical symptoms, conditions, or patient complaints",
"time": "Time references like '2 PM', 'morning', or 'after lunch'"
}
)
# Same output format; the descriptions give the model extra context, which can improve accuracy
# With confidence scores
entities = extractor.extract_entities(
"Apple Inc. CEO Tim Cook announced iPhone 15 in Cupertino.",
["company", "person", "product", "location"],
include_confidence=True
)
# Output: {
# 'entities': {
# 'company': [{'text': 'Apple Inc.', 'confidence': 0.95}],
# 'person': [{'text': 'Tim Cook', 'confidence': 0.92}],
# 'product': [{'text': 'iPhone 15', 'confidence': 0.88}],
# 'location': [{'text': 'Cupertino', 'confidence': 0.90}]
# }
# }
# With character positions (spans)
entities = extractor.extract_entities(
"Apple Inc. CEO Tim Cook announced iPhone 15 in Cupertino.",
["company", "person", "product"],
include_spans=True
)
# Output: {
# 'entities': {
# 'company': [{'text': 'Apple Inc.', 'start': 0, 'end': 10}],
# 'person': [{'text': 'Tim Cook', 'start': 15, 'end': 23}],
# 'product': [{'text': 'iPhone 15', 'start': 34, 'end': 43}]
# }
# }
# With both confidence and spans
entities = extractor.extract_entities(
"Apple Inc. CEO Tim Cook announced iPhone 15 in Cupertino.",
["company", "person", "product"],
include_confidence=True,
include_spans=True
)
# Output: {
# 'entities': {
# 'company': [{'text': 'Apple Inc.', 'confidence': 0.95, 'start': 0, 'end': 10}],
# 'person': [{'text': 'Tim Cook', 'confidence': 0.92, 'start': 15, 'end': 23}],
# 'product': [{'text': 'iPhone 15', 'confidence': 0.88, 'start': 34, 'end': 43}]
# }
# }
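When both flags are enabled, each entity carries `text`, `confidence`, `start`, and `end`. That makes two sanity checks easy: dropping low-confidence hits and confirming that each span slices back to its text. The sketch below assumes `end` is an exclusive offset (the Python slice convention), which is an assumption rather than something stated here. `filter_and_check` and `make_entity` are hypothetical helpers, the confidence values are illustrative, and the sample offsets are computed with `str.index` instead of coming from the model.

```python
from typing import Dict, List

Entity = Dict[str, object]


def filter_and_check(text: str,
                     result: Dict[str, Dict[str, List[Entity]]],
                     min_confidence: float = 0.9) -> Dict[str, List[Entity]]:
    """Keep entities at or above `min_confidence` whose span slices back to their text.

    Assumes `end` is exclusive, i.e. text[start:end] == entity['text'].
    """
    kept: Dict[str, List[Entity]] = {}
    for label, entities in result.get("entities", {}).items():
        for ent in entities:
            if ent["confidence"] < min_confidence:
                continue
            if text[ent["start"]:ent["end"]] != ent["text"]:
                raise ValueError(f"span mismatch for {ent['text']!r}")
            kept.setdefault(label, []).append(ent)
    return kept


def make_entity(text: str, mention: str, confidence: float) -> Entity:
    """Build a sample entity with offsets derived from the text itself."""
    start = text.index(mention)
    return {"text": mention, "confidence": confidence,
            "start": start, "end": start + len(mention)}


text = "Apple Inc. CEO Tim Cook announced iPhone 15 in Cupertino."
sample = {
    "entities": {
        "company": [make_entity(text, "Apple Inc.", 0.95)],
        "person": [make_entity(text, "Tim Cook", 0.92)],
        "product": [make_entity(text, "iPhone 15", 0.88)],
    }
}

print(filter_and_check(text, sample, min_confidence=0.9))
```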
2. Text Classification
Single or multi-label classification with a configurable confidence threshold:
# Sentiment analysis
result = extractor.classify_text(
"This laptop has amazing performance but terrible battery life!",
{"sentiment": ["positive", "negative", "neutral"]}
)
# Output: {'sentiment': 'negative'}
# Multi-aspect classification
result = extractor.classify_text(
"Great camera quality, decent performance, but poor battery life.",
{
"aspects": {
"labels": ["camera", "performance", "battery", "display", "price"],
"multi_label": True,
"cls_threshold": 0.4
}
}
)
# Output: {'aspects': ['camera', 'performance', 'battery']}
# With confidence scores
result = extractor.classify_text(
"This laptop has amazing performance but terrible battery life!",
{"sentiment": ["positive", "negative", "neutral"]},
include_confidence=True
)
# Output: {'sentiment': {'label': 'negative', 'confidence': 0.82}}
# Multi-label with confidence
schema = extractor.create_schema().classification(
"topics",
["technology", "business", "health", "politics", "sports"],
multi_label=True,
cls_threshold=0.3
)
text = "Apple announced new health monitoring features in their latest smartwatch, boosting their stock price."
results = extractor.extract(text, schema, include_confidence=True)
# Output: {
# 'topics': [
# {'label': 'technology', 'confidence': 0.92},
# {'label': 'business', 'confidence': 0.78},
# {'label': 'health', 'confidence': 0.65}
# ]
# }
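With `include_confidence=True`, a multi-label result is a list of `{'label', 'confidence'}` dictionaries, so a stricter cutoff can be applied afterwards without re-running the model. Here is a short sketch. `top_labels` is a hypothetical helper, and the sample values are copied from the output above.

```python
from typing import Dict, List


def top_labels(scored: List[Dict[str, object]], min_confidence: float = 0.5) -> List[str]:
    """Return labels at or above `min_confidence`, highest confidence first."""
    kept = [item for item in scored if item["confidence"] >= min_confidence]
    kept.sort(key=lambda item: item["confidence"], reverse=True)
    return [item["label"] for item in kept]


# Sample copied from the multi-label output above (not a live model call).
results = {
    "topics": [
        {"label": "technology", "confidence": 0.92},
        {"label": "business", "confidence": 0.78},
        {"label": "health", "confidence": 0.65},
    ]
}

print(top_labels(results["topics"], min_confidence=0.7))
# ['technology', 'business']
```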
3. Structured Data Extraction
Parse complex structured information with field-level control:
# Product information extraction
text = "iPhone 15 Pro Max with 256GB storage, A17 Pro chip, priced at $1199. Available in titanium and black colors."
result = extractor.extract_json(
text,
{
"product": [
"name::str::Full product name and model",
"storage::str::Storage capacity like 256GB or 1TB",
"processor::str::Chip or processor information",
"price::str::Product price with currency",
"colors::list::Available color options"
]
}
)
# Output: {
# 'product': [{
# 'name': 'iPhone 15 Pro Max',
# 'storage': '256GB',
# 'processor': 'A17 Pro chip',
# 'price': '$1199',
# 'colors': ['titanium', 'black']
# }]
# }
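The field strings in the schema above follow a `name::type::description` pattern. To make that layout explicit, here is a small parser sketch. `FieldSpec` and `parse_field` are hypothetical names used for illustration and are not GLiNER2's own parsing code. The fallback for a missing type or description (`str` and an empty string) is an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldSpec:
    name: str
    dtype: str
    description: str


def parse_field(spec: str) -> FieldSpec:
    """Split 'name::type::description' into its three parts."""
    # maxsplit=2 keeps any later '::' inside the description intact.
    parts = spec.split("::", 2)
    name = parts[0].strip()
    if not name:
        raise ValueError(f"field spec has no name: {spec!r}")
    dtype = parts[1].strip() if len(parts) > 1 else "str"  # assumed default
    description = parts[2].strip() if len(parts) > 2 else ""
    return FieldSpec(name, dtype, description)


fields = [
    "name::str::Full product name and model",
    "colors::list::Available color options",
]
for spec in fields:
    print(parse_field(spec))
```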
# Multiple structured entities
text = "Apple Inc. headquarters in Cupertino launched iPhone 15 for $999 and MacBook Air for $1299."
result = extractor.extract_json(
text,
{
"company": [
"name::str::Company name",
"location::str::Company headquarters or office location"
],
"products": [
"name::str::Product name and model",
"price::str::Product retail price"
]
}
)
