Summarize
Point at any URL/YouTube/Podcast or file. Get the gist. CLI and Chrome Extension.
Summarize 📝 — Chrome Side Panel + CLI
Fast summaries from URLs, files, and media. Works in the terminal, a Chrome Side Panel and Firefox Sidebar.
Highlights
- Chrome Side Panel chat (streaming agent + history) inside the sidebar.
- YouTube slides: screenshots + OCR + transcript cards, timestamped seek, OCR/Transcript toggle.
- Media-aware summaries: auto‑detect video/audio vs page content.
- Streaming Markdown + metrics + cache‑aware status.
- CLI supports URLs, files, podcasts, YouTube, audio/video, PDFs.
Feature overview
- URLs, files, and media: web pages, PDFs, images, audio/video, YouTube, podcasts, RSS.
- Slide extraction for video sources (YouTube/direct media) with OCR + timestamped cards.
- Transcript-first media flow: published transcripts when available, then Groq/ONNX/whisper.cpp/AssemblyAI/Gemini/OpenAI/FAL transcription fallback when not.
- Streaming output with Markdown rendering, metrics, and cache-aware status.
- Local, paid, and free models: OpenAI‑compatible local endpoints, paid providers, plus an OpenRouter free preset.
- Output modes: Markdown/text, JSON diagnostics, extract-only, metrics, timing, and cost estimates.
- Smart default: if content is shorter than the requested length, we return it as-is (use --force-summary to override).
Get the extension (recommended)

One‑click summarizer for the current tab. Chrome Side Panel + Firefox Sidebar + local daemon for streaming Markdown.
Chrome Web Store: Summarize Side Panel
YouTube slide screenshots (from the browser):

Beginner quickstart (extension)
- Install the CLI (choose one):
  - npm (cross‑platform):
    npm i -g @steipete/summarize
  - Homebrew (macOS arm64):
    brew install steipete/tap/summarize
- Install the extension (Chrome Web Store link above) and open the Side Panel.
- The panel shows a token + install command. Run it in Terminal:
summarize daemon install --token <TOKEN>
Why a daemon/service?
- The extension can’t run heavy extraction inside the browser. It talks to a local background service on 127.0.0.1 for fast streaming and media tools (yt‑dlp, ffmpeg, OCR, transcription).
- The service autostarts (launchd/systemd/Scheduled Task) so the Side Panel is always ready.
If you only want the CLI, you can skip the daemon install entirely.
Notes:
- Summarization only runs when the Side Panel is open.
- Auto mode summarizes on navigation (incl. SPAs); otherwise use the button.
- Daemon is localhost-only and requires a shared token; rerunning summarize daemon install --token <TOKEN> adds another paired browser token instead of invalidating the old one.
- Autostart: macOS (launchd), Linux (systemd user), Windows (Scheduled Task).
- Tip: configure free via summarize refresh-free (needs OPENROUTER_API_KEY). Add --set-default to set model=free.
More:
- Step-by-step install: apps/chrome-extension/README.md
- Architecture + troubleshooting: docs/chrome-extension.md
- Firefox compatibility notes: apps/chrome-extension/docs/firefox.md
Slides (extension)
- Select Video + Slides in the Summarize picker.
- Slides render at the top; expand to full‑width cards with timestamps.
- Click a slide to seek the video; toggle Transcript/OCR when OCR is significant.
- Requirements: yt-dlp + ffmpeg for extraction; tesseract for OCR. Missing tools show an in‑panel notice.
Advanced (unpacked / dev)
- Build + load the extension (unpacked):
  - Chrome:
    pnpm -C apps/chrome-extension build
    - chrome://extensions → Developer mode → Load unpacked
    - Pick: apps/chrome-extension/.output/chrome-mv3
  - Firefox:
    pnpm -C apps/chrome-extension build:firefox
    - about:debugging#/runtime/this-firefox → Load Temporary Add-on
    - Pick: apps/chrome-extension/.output/firefox-mv3/manifest.json
- Open Side Panel/Sidebar → copy token.
- Install daemon in dev mode:
pnpm summarize daemon install --token <TOKEN> --dev
CLI

Install
Requires Node 22+.
- npx (no install):
npx -y @steipete/summarize "https://example.com"
- npm (global):
npm i -g @steipete/summarize
- npm (library / minimal deps):
npm i @steipete/summarize-core
import { createLinkPreviewClient } from "@steipete/summarize-core/content";
- Homebrew (custom tap):
brew install steipete/tap/summarize
Homebrew availability depends on the current tap formula for your architecture. If Homebrew install fails on Intel/x64, use the npm global install above.
Optional local dependencies
Install these if you want media-heavy features:
- ffmpeg: required for --slides and many local media/transcription flows
- yt-dlp: required for YouTube slide extraction and some remote media flows
- tesseract: optional OCR for --slides-ocr
- Optional cloud transcription providers:
  - GROQ_API_KEY
  - ASSEMBLYAI_API_KEY
  - GEMINI_API_KEY / GOOGLE_GENERATIVE_AI_API_KEY / GOOGLE_API_KEY
  - OPENAI_API_KEY
  - FAL_KEY
macOS (Homebrew):
brew install ffmpeg yt-dlp
brew install tesseract # optional, for --slides-ocr
If --slides is enabled and these tools are missing, Summarize warns and continues without slides.
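The warn-and-continue rule above can be pictured as a small planning step. The following is a minimal sketch, not Summarize's actual code: planSlides, SlideFlags, SlidePlan, and the hasTool callback are hypothetical names. The mapping of tools to features follows the dependency list above, and the handling of a missing tesseract (OCR off, slides kept) is an assumption.

```typescript
// Hypothetical sketch: none of these names come from the Summarize codebase.
type SlideFlags = { slides: boolean; slidesOcr: boolean; isYouTube: boolean };
type SlidePlan = { slides: boolean; ocr: boolean; warnings: string[] };

function planSlides(flags: SlideFlags, hasTool: (name: string) => boolean): SlidePlan {
  const warnings: string[] = [];
  if (!flags.slides) return { slides: false, ocr: false, warnings };

  // ffmpeg is always needed; yt-dlp only for YouTube sources (per the list above).
  const required = flags.isYouTube ? ["ffmpeg", "yt-dlp"] : ["ffmpeg"];
  const missing = required.filter((tool) => !hasTool(tool));
  if (missing.length > 0) {
    // Warn and continue without slides instead of failing the whole run.
    warnings.push(`slides disabled: missing ${missing.join(", ")}`);
    return { slides: false, ocr: false, warnings };
  }

  // Assumption: a missing tesseract only disables OCR, not slide extraction.
  let ocr = flags.slidesOcr;
  if (ocr && !hasTool("tesseract")) {
    warnings.push("OCR disabled: missing tesseract");
    ocr = false;
  }
  return { slides: true, ocr, warnings };
}
```

In a real CLI, hasTool would probe the PATH; keeping it as a callback makes the decision logic easy to test without touching the system.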
CLI vs extension
- CLI only: just install via npm/Homebrew and run summarize ... (no daemon needed).
- Chrome/Firefox extension: install the CLI and run summarize daemon install --token <TOKEN> so the Side Panel can stream results and use local tools.
Quickstart
summarize "https://example.com"
Inputs
URLs or local paths:
summarize "/path/to/file.pdf" --model google/gemini-3-flash
summarize "https://example.com/report.pdf" --model google/gemini-3-flash
summarize "/path/to/audio.mp3"
summarize "/path/to/video.mp4"
Stdin (pipe content using -):
echo "content" | summarize -
pbpaste | summarize -
# binary stdin also works (PDF/image/audio/video bytes)
cat /path/to/file.pdf | summarize -
Notes:
- Stdin has a 50MB size limit
- The - argument tells summarize to read from standard input
- Text stdin is treated as UTF-8 text (whitespace-only input is rejected as empty)
- Binary stdin is preserved as raw bytes and file type is auto-detected when possible
- Useful for piping clipboard content or command output
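The stdin rules above (size limit, UTF-8 text vs raw bytes, empty-input rejection) could be sketched roughly as follows. This is an illustration under assumptions, not the CLI's implementation: classifyStdin and StdinInput are hypothetical names, "50MB" is read as 50 MiB, and treating a NUL byte as a binary marker is a guess at how text and binary might be told apart.

```typescript
// Hypothetical sketch of the stdin rules listed above; not the CLI's actual code.
const MAX_STDIN_BYTES = 50 * 1024 * 1024; // assumption: "50MB" read as 50 MiB

type StdinInput =
  | { kind: "text"; text: string }
  | { kind: "binary"; bytes: Uint8Array };

function classifyStdin(bytes: Uint8Array): StdinInput {
  if (bytes.byteLength > MAX_STDIN_BYTES) {
    throw new Error("stdin exceeds the 50MB limit");
  }
  let text: string;
  try {
    // fatal: true makes decode() throw on byte sequences that are not valid UTF-8.
    text = new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch {
    return { kind: "binary", bytes }; // keep the raw bytes untouched
  }
  // Assumption: a NUL byte marks content as binary even if it decodes cleanly.
  if (text.includes("\u0000")) return { kind: "binary", bytes };
  // Whitespace-only (or zero-length) text is rejected as empty.
  if (text.trim().length === 0) throw new Error("stdin is empty");
  return { kind: "text", text };
}
```

File-type detection for the binary branch (PDF, image, audio, video) is left out here; it would typically inspect the leading magic bytes.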
YouTube (supports youtube.com and youtu.be):
summarize "https://youtu.be/dQw4w9WgXcQ" --youtube auto
Podcast RSS (transcribes latest enclosure):
summarize "https://feeds.npr.org/500005/podcast.xml"
Apple Podcasts episode page:
summarize "https://podcasts.apple.com/us/podcast/2424-jelly-roll/id360084272?i=1000740717432"
Spotify episode page (best-effort; may fail for exclusives):
summarize "https://open.spotify.com/episode/5auotqWAXhhKyb9ymCuBJY"
Output length
--length controls how much output we ask for (guideline), not a hard cap.
summarize "https://example.com" --length long
summarize "https://example.com" --length 20k
- Presets: short|medium|long|xl|xxl
- Character targets: 1500, 20k, 20000
- Optional hard cap: --max-output-tokens <count> (e.g. 2000, 2k)
  - Provider/model APIs still enforce their own maximum output limits.
  - If omitted, no max token parameter is sent (provider default).
  - Prefer --length unless you need a hard cap.
- Short content: when extracted content is shorter than the requested length, the CLI returns the content as-is.
  - Override with --force-summary to always run the LLM.
- Minimums: --length numeric values must be >= 50 chars; --max-output-tokens must be >= 16.
- Preset targets (source of truth: packages/core/src/prompts/summary-lengths.ts):
  - short: target ~900 chars (range 600-1,200)
  - medium: target ~1,800 chars (range 1,200-2,500)
  - long: target ~4,200 chars (range 2,500-6,000)
  - xl: target ~9,000 chars (range 6,000-14,000)
  - xxl: target ~17,000 chars (range 14,000-22,000)
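To make the --length rules concrete, here is a minimal sketch of how a value might be resolved to a character target and how the short-content rule could be applied. It is not taken from summary-lengths.ts: resolveLengthTarget and shouldSummarize are hypothetical names, the preset numbers are the approximate targets listed above, and comparing content length against the target (rather than the range) is an assumption.

```typescript
// Hypothetical sketch; the real logic lives in the Summarize codebase and may differ.
const PRESET_TARGETS = new Map<string, number>([
  ["short", 900],
  ["medium", 1800],
  ["long", 4200],
  ["xl", 9000],
  ["xxl", 17000],
]);

const MIN_LENGTH_CHARS = 50; // documented minimum for numeric --length values

// Resolve a --length value ("long", "1500", "20k") to a target character count.
function resolveLengthTarget(value: string): number {
  const v = value.trim().toLowerCase();
  const preset = PRESET_TARGETS.get(v);
  if (preset !== undefined) return preset;

  // Digits with an optional "k" suffix, so "20k" and "20000" mean the same thing.
  const match = /^(\d+)(k?)$/.exec(v);
  if (!match) throw new Error(`invalid --length value: ${value}`);
  const chars = Number(match[1]) * (match[2] === "k" ? 1000 : 1);
  if (chars < MIN_LENGTH_CHARS) {
    throw new Error(`--length must be >= ${MIN_LENGTH_CHARS} chars`);
  }
  return chars;
}

// Short-content rule: skip the LLM when the content is already shorter than the
// target, unless the caller passed the equivalent of --force-summary.
function shouldSummarize(contentChars: number, target: number, forceSummary: boolean): boolean {
  return forceSummary || contentChars >= target;
}
```

For example, a 300-character page with --length short resolves to a 900-character target, so the sketch would return the page as-is unless forceSummary is set.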
What file types work?
Best effort and provider-dependent. These usually work well:
- text/* and common structured text (.txt, .md, .json, .yaml, .xml, ...)
  - Text-like files are inlined into the prompt for better provider compatibility.
- PDFs: application/pdf (provider support varies; Google is the most reliable here)
- Images: image/jpeg, image/png, image/webp, image/gif
- Audio/Video: audio/*, video/* (local audio/video files MP3/WAV/M4A/OGG/FLAC/MP4/MOV/WEBM automatically transcribed, when supported by the model)
Notes:
- If a provider rejects a media type, the CLI fails fast with a friendly message.
- xAI models do not support attaching generic files (like PDFs) via the AI SDK; use Google/OpenAI/Anthropic for those.
Model ids
Use gateway-style ids: <provider>/<model>.
Examples:
- openai/gpt-5-mini
- anthropic/claude-sonnet-4-5
- xai/grok-4-fast-non-reasoning
- google/gemini-3-flash
- zai/glm-4.7
- openrouter/openai/gpt-5-mini (force OpenRouter)
Note: some models/providers do not support streaming or certain file media types. When that happens, the CLI prints a friendly error (or auto-disables streaming for that model when supported by the provider).
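A gateway-style id splits on the first slash only, which is why an OpenRouter id can carry a second slash in its model part. The sketch below illustrates that reading; parseModelId and ModelRef are hypothetical names, not part of the Summarize API, and special values such as auto or free are deliberately not handled.

```typescript
// Hypothetical sketch: parseModelId is not part of the Summarize API.
type ModelRef = { provider: string; model: string };

function parseModelId(id: string): ModelRef {
  const slash = id.indexOf("/");
  // Reject ids with no slash, an empty provider, or an empty model part.
  if (slash <= 0 || slash === id.length - 1) {
    throw new Error(`expected <provider>/<model>, got "${id}"`);
  }
  // Split on the first slash only, so "openrouter/openai/gpt-5-mini"
  // keeps "openai/gpt-5-mini" as the model part.
  return { provider: id.slice(0, slash), model: id.slice(slash + 1) };
}
```

With this reading, openai/gpt-5-mini yields provider openai, while openrouter/openai/gpt-5-mini yields provider openrouter and model openai/gpt-5-mini.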
Limits
- Text inputs over 10 MB are rejected before tokenization.
- Text prompts are preflighted against the model input limit (LiteLLM catalog), using a GPT tokenizer.
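The two limits above run in order: a cheap size check first, then a token count against the model's input limit. The following is a rough sketch of that ordering only. preflightText and Preflight are hypothetical names, "10 MB" is read as 10 MiB of UTF-8 bytes, and the tokenizer is passed in as a callback because the actual tokenizer and LiteLLM catalog lookup are not shown in this document.

```typescript
// Hypothetical sketch of the preflight order; not the CLI's actual code.
const MAX_TEXT_BYTES = 10 * 1024 * 1024; // assumption: "10 MB" read as 10 MiB

type Preflight = { ok: true; tokens: number } | { ok: false; reason: string };

function preflightText(
  text: string,
  maxInputTokens: number,
  countTokens: (text: string) => number, // caller supplies a real tokenizer
): Preflight {
  // Size check first, so oversized input is never handed to the tokenizer.
  const bytes = new TextEncoder().encode(text).byteLength;
  if (bytes > MAX_TEXT_BYTES) return { ok: false, reason: "text exceeds 10 MB" };

  const tokens = countTokens(text);
  if (tokens > maxInputTokens) {
    return { ok: false, reason: `prompt is ${tokens} tokens; model limit is ${maxInputTokens}` };
  }
  return { ok: true, tokens };
}
```

Rejecting on size before tokenizing keeps the failure fast, since tokenizing a very large input is the expensive step.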
Common flags
summarize <input> [flags]
Use summarize --help or summarize help for the full help text.
- --model <provider/model>: which model to use (defaults to auto)
  - --model auto: a
