# BatchLLM

Batch processing for LLM APIs. CSV, JSONL, or plain text in; processed results out.
Feed BatchLLM a file of inputs and it fires them through any OpenAI-compatible API with concurrent requests, automatic retries, rate limiting, checkpointing, and cost tracking. You get a clean output file with results, token counts, and latency stats.
No more writing the same async retry loop for the hundredth time.
## Why?
Every data scientist who's done bulk LLM processing has written some version of this:
- Async request queue with semaphore-based concurrency
- Exponential backoff with jitter on rate limit errors
- Checkpoint/resume so you don't re-process 10k items after a crash
- Token counting and cost estimation before committing to a big job
BatchLLM packages all of this into a single CLI command and Python API.
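Condensed, that boilerplate is a semaphore to cap concurrency plus exponential backoff with jitter. A minimal sketch of the pattern (the API call is a stand-in, not BatchLLM's internals):

```python
import asyncio
import random

MAX_CONCURRENT = 10
MAX_RETRIES = 3

sem = asyncio.Semaphore(MAX_CONCURRENT)

async def call_with_retry(item: str) -> str:
    """One guarded request: the semaphore caps parallelism, backoff handles rate limits."""
    async with sem:
        for attempt in range(MAX_RETRIES + 1):
            try:
                return await fake_api_call(item)
            except RuntimeError:  # stand-in for a rate-limit error
                if attempt == MAX_RETRIES:
                    raise
                # exponential backoff with jitter: 1s, 2s, 4s... plus random noise
                await asyncio.sleep(2 ** attempt + random.random())

async def fake_api_call(item: str) -> str:
    await asyncio.sleep(0)   # pretend network latency
    return item.upper()      # pretend model output

async def main(items: list[str]) -> list[str]:
    return await asyncio.gather(*(call_with_retry(i) for i in items))

results = asyncio.run(main(["a", "b", "c"]))
print(results)  # ['A', 'B', 'C']
```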
## Features
- Concurrent processing — configurable parallelism with asyncio semaphore
- Automatic retries — exponential backoff, configurable max retries
- Checkpoint/resume — crash-safe JSONL checkpoints, pick up where you left off
- Cost tracking — real-time token counting with pricing for 30+ models
- Cost estimation — estimate tokens and cost before running (uses tiktoken)
- Multiple input formats — CSV, JSONL, plain text
- Any OpenAI-compatible API — OpenAI, Anthropic (via proxy), DeepSeek, local models, etc.
- Prompt templates — customize prompts with an `{input}` placeholder
- Rich progress bar — live progress with throughput and ETA
## Installation

```bash
pip install batchllm
```
## Quick Start

### CLI

```bash
# Basic: process a CSV file
batchllm run data.csv -m gpt-4o-mini

# With system prompt and template
batchllm run data.csv -m gpt-4o-mini \
  -s "You are a translator" \
  -t "Translate to French: {input}"

# Higher concurrency with custom output path
batchllm run data.csv -m gpt-4o-mini -c 20 -o results.csv

# Resume from checkpoint after interruption
batchllm run data.csv -m gpt-4o-mini --checkpoint data.ckpt

# Estimate cost before running
batchllm estimate data.csv -m gpt-4o

# Use any OpenAI-compatible API
batchllm run data.csv -m deepseek-chat \
  --base-url https://api.deepseek.com/v1 \
  --api-key $DEEPSEEK_API_KEY
```
### Python API

```python
import asyncio

from batchllm import BatchProcessor, BatchConfig

config = BatchConfig(
    model="gpt-4o-mini",
    system_prompt="Classify the sentiment as positive, negative, or neutral.",
    max_concurrent=15,
    max_retries=3,
)

processor = BatchProcessor(config)

items = [
    "This product is amazing!",
    "Worst purchase ever.",
    "It's okay I guess.",
]

results = asyncio.run(processor.process_items(items))

for r in results:
    print(f"{r.input_text[:30]}... -> {r.output_text}")
    print(f"  tokens: {r.tokens_in}+{r.tokens_out}, latency: {r.latency_ms:.0f}ms")
```
### File Processing

```python
import asyncio

from batchllm import BatchProcessor, BatchConfig

config = BatchConfig(
    model="gpt-4o-mini",
    prompt_template="Summarize in one sentence: {input}",
    max_concurrent=10,
    input_column="text",
    output_column="summary",
)

processor = BatchProcessor(config)

results = asyncio.run(
    processor.process_file(
        "articles.csv",
        output_path="summaries.csv",
        checkpoint_path="articles.ckpt",
    )
)
```
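The checkpoint mechanism is conceptually append-only JSONL: write one line per finished item, and on resume skip ids already present. A sketch of the idea (not BatchLLM's exact file layout):

```python
import json
import os

CKPT = "demo.ckpt"

def load_done(path: str) -> set[int]:
    """Ids already processed, read back from the append-only checkpoint."""
    if not os.path.exists(path):
        return set()
    with open(path) as f:
        return {json.loads(line)["id"] for line in f if line.strip()}

def process(items: list[str]) -> None:
    done = load_done(CKPT)
    with open(CKPT, "a") as f:
        for i, item in enumerate(items):
            if i in done:
                continue              # resume: skip work already finished
            result = item.upper()     # stand-in for the real API call
            f.write(json.dumps({"id": i, "output": result}) + "\n")
            f.flush()                 # one durable line per item

process(["hello", "world"])
done = load_done(CKPT)   # both items are now checkpointed
os.remove(CKPT)          # clean up the demo file
print(sorted(done))  # [0, 1]
```

A crash between two `write` calls loses at most the in-flight item; everything already flushed is skipped on the next run.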
## Input Formats

**CSV** — reads from a configurable column (default: `input`):

```csv
input,category
"This movie was great",review
"Terrible service",complaint
```

**JSONL** — reads from a configurable field:

```jsonl
{"input": "This movie was great", "category": "review"}
{"input": "Terrible service", "category": "complaint"}
```

**Plain text** — one item per line:

```text
This movie was great
Terrible service
```
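Normalizing all three formats to a flat list of strings takes only the standard library. A sketch of the idea (BatchLLM's own loader is the configurable one, via `input_column`):

```python
import csv
import io
import json

def load_items(text: str, fmt: str, column: str = "input") -> list[str]:
    """Normalize CSV / JSONL / plain-text input to a list of strings."""
    if fmt == "csv":
        return [row[column] for row in csv.DictReader(io.StringIO(text))]
    if fmt == "jsonl":
        return [json.loads(line)[column] for line in text.splitlines() if line.strip()]
    # plain text: one item per non-empty line
    return [line for line in text.splitlines() if line.strip()]

csv_text = 'input,category\n"This movie was great",review\n"Terrible service",complaint\n'
print(load_items(csv_text, "csv"))  # ['This movie was great', 'Terrible service']
```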
## Output Format

Output mirrors the input format with added columns:

```csv
input,output,error,tokens_in,tokens_out,latency_ms
"This movie was great","Positive sentiment","",15,3,234.5
"Terrible service","Negative sentiment","",12,3,198.2
```
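Since failures land in the `error` column instead of aborting the run, post-processing is a plain filter. A sketch over the columns above, using synthetic data with one failed row:

```python
import csv
import io

output_csv = '''input,output,error,tokens_in,tokens_out,latency_ms
"This movie was great","Positive sentiment","",15,3,234.5
"Terrible service","","rate limit exceeded",12,0,198.2
'''

rows = list(csv.DictReader(io.StringIO(output_csv)))
ok = [r for r in rows if not r["error"]]                 # rows that succeeded
failed = [r["input"] for r in rows if r["error"]]        # inputs worth retrying
total_tokens = sum(int(r["tokens_in"]) + int(r["tokens_out"]) for r in rows)

print(len(ok), failed, total_tokens)  # 1 ['Terrible service'] 30
```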
## Cost Estimation

Estimate before running to avoid surprises:

```text
$ batchllm estimate data.csv -m gpt-4o
┌────────────────────┬────────────┐
│ Metric             │ Value      │
├────────────────────┼────────────┤
│ Items              │ 10,000     │
│ Est. Input Tokens  │ 1,250,000  │
│ Est. Output Tokens │ ~1,250,000 │
│ Model              │ gpt-4o     │
│ Est. Cost          │ $15.63     │
└────────────────────┴────────────┘
```
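The estimate itself is simple arithmetic: estimated token counts (BatchLLM uses tiktoken; a chars/4 heuristic is a common fallback) times per-million-token prices. A back-of-envelope sketch (the prices below are illustrative placeholders, not BatchLLM's pricing table):

```python
# Illustrative per-1M-token prices in USD; real rates come from your provider.
PRICES = {"gpt-4o": {"input": 2.50, "output": 10.00}}

def estimate_cost(model: str, tokens_in: int, tokens_out: int) -> float:
    """Cost = tokens * price per token, summed over input and output."""
    p = PRICES[model]
    return (tokens_in * p["input"] + tokens_out * p["output"]) / 1_000_000

# 10,000 items at ~125 input tokens each, assuming similar-sized outputs
cost = estimate_cost("gpt-4o", 1_250_000, 1_250_000)
print(cost)  # 15.625, i.e. the $15.63 shown in the table above
```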
## Supported Models (Cost Tracking)

Includes pricing for 30+ models:

| Provider | Models |
|----------|--------|
| OpenAI | gpt-4o, gpt-4o-mini, gpt-4-turbo, gpt-3.5-turbo, o1, o1-mini, o3-mini |
| Anthropic | claude-opus-4-6, claude-sonnet-4-6, claude-haiku-4-5, claude-3.5-sonnet, claude-3-haiku |
| Google | gemini-2.0-flash, gemini-1.5-pro, gemini-1.5-flash |
| DeepSeek | deepseek-chat, deepseek-reasoner |
| Mistral | mistral-large, mistral-small |
Custom pricing can be passed via the Python API.
## Configuration
| Option | CLI Flag | Default | Description |
|--------|----------|---------|-------------|
| model | -m | gpt-4o-mini | Model name |
| system_prompt | -s | None | System prompt |
| prompt_template | -t | {input} | Prompt template |
| max_concurrent | -c | 10 | Max parallel requests |
| max_retries | --max-retries | 3 | Retries per item |
| max_tokens | --max-tokens | None | Max output tokens |
| temperature | --temperature | None | Sampling temperature |
| api_key | --api-key | env OPENAI_API_KEY | API key |
| base_url | --base-url | env OPENAI_BASE_URL | API base URL |
## Contributing

```bash
git clone https://github.com/he-yufeng/BatchLLM.git
cd BatchLLM
pip install -e ".[dev]"
pytest
```
## License
MIT