Summarize 📝 — Chrome Side Panel + CLI

Fast summaries from URLs, files, and media. Works in the terminal, a Chrome Side Panel and Firefox Sidebar.

Highlights

  • Chrome Side Panel chat (streaming agent + history) inside the sidebar.
  • YouTube slides: screenshots + OCR + transcript cards, timestamped seek, OCR/Transcript toggle.
  • Media-aware summaries: auto‑detect video/audio vs page content.
  • Streaming Markdown + metrics + cache‑aware status.
  • CLI supports URLs, files, podcasts, YouTube, audio/video, PDFs.

Feature overview

  • URLs, files, and media: web pages, PDFs, images, audio/video, YouTube, podcasts, RSS.
  • Slide extraction for video sources (YouTube/direct media) with OCR + timestamped cards.
  • Transcript-first media flow: published transcripts when available, then Groq/ONNX/whisper.cpp/AssemblyAI/Gemini/OpenAI/FAL transcription fallback when not.
  • Streaming output with Markdown rendering, metrics, and cache-aware status.
  • Local, paid, and free models: OpenAI‑compatible local endpoints, paid providers, plus an OpenRouter free preset.
  • Output modes: Markdown/text, JSON diagnostics, extract-only, metrics, timing, and cost estimates.
  • Smart default: if content is shorter than the requested length, we return it as-is (use --force-summary to override).

Get the extension (recommended)

One‑click summarizer for the current tab. Chrome Side Panel + Firefox Sidebar + local daemon for streaming Markdown.

Chrome Web Store: Summarize Side Panel

YouTube slide screenshots (from the browser):

Summarize YouTube slide screenshots

Beginner quickstart (extension)

  1. Install the CLI (choose one):
    • npm (cross‑platform): npm i -g @steipete/summarize
    • Homebrew (macOS arm64): brew install steipete/tap/summarize
  2. Install the extension (Chrome Web Store link above) and open the Side Panel.
  3. The panel shows a token + install command. Run it in Terminal:
    • summarize daemon install --token <TOKEN>

Why a daemon/service?

  • The extension can’t run heavy extraction inside the browser. It talks to a local background service on 127.0.0.1 for fast streaming and media tools (yt‑dlp, ffmpeg, OCR, transcription).
  • The service autostarts (launchd/systemd/Scheduled Task) so the Side Panel is always ready.

If you only want the CLI, you can skip the daemon install entirely.

Notes:

  • Summarization only runs when the Side Panel is open.
  • Auto mode summarizes on navigation (incl. SPAs); otherwise use the button.
  • Daemon is localhost-only and requires a shared token; rerunning summarize daemon install --token <TOKEN> adds another paired browser token instead of invalidating the old one.
  • Autostart: macOS (launchd), Linux (systemd user), Windows (Scheduled Task).
  • Tip: configure free via summarize refresh-free (needs OPENROUTER_API_KEY). Add --set-default to set model=free.
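The pairing behavior described above (each install adds a token rather than replacing the old one) can be sketched roughly as follows. This is a minimal illustration, not the daemon's actual code: the `PairedTokens` class, the `Bearer` header format, and the loopback check are all assumptions made for the example.

```typescript
// Hypothetical sketch of additive token pairing; not the daemon's real implementation.
class PairedTokens {
  private readonly tokens = new Set<string>();

  // Pairing a new browser adds its token; previously paired tokens stay valid.
  pair(token: string): void {
    if (token.trim() === "") throw new Error("token must not be empty");
    this.tokens.add(token);
  }

  // Assumption: requests carry "Authorization: Bearer <token>" and must come from loopback.
  isAuthorized(remoteAddress: string, authorizationHeader: string | undefined): boolean {
    if (remoteAddress !== "127.0.0.1") return false;
    if (!authorizationHeader || !authorizationHeader.startsWith("Bearer ")) return false;
    return this.tokens.has(authorizationHeader.slice("Bearer ".length));
  }
}

const paired = new PairedTokens();
paired.pair("chrome-token");
paired.pair("firefox-token"); // second install: the first token still works
```

A production service would likely also compare tokens in constant time; that detail is left out here to keep the sketch short.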

Slides (extension)

  • Select Video + Slides in the Summarize picker.
  • Slides render at the top; expand to full‑width cards with timestamps.
  • Click a slide to seek the video; toggle Transcript/OCR when OCR is significant.
  • Requirements: yt-dlp + ffmpeg for extraction; tesseract for OCR. Missing tools show an in‑panel notice.

Advanced (unpacked / dev)

  1. Build + load the extension (unpacked):
    • Chrome: pnpm -C apps/chrome-extension build
      • chrome://extensions → Developer mode → Load unpacked
      • Pick: apps/chrome-extension/.output/chrome-mv3
    • Firefox: pnpm -C apps/chrome-extension build:firefox
      • about:debugging#/runtime/this-firefox → Load Temporary Add-on
      • Pick: apps/chrome-extension/.output/firefox-mv3/manifest.json
  2. Open Side Panel/Sidebar → copy token.
  3. Install daemon in dev mode:
    • pnpm summarize daemon install --token <TOKEN> --dev

CLI

Install

Requires Node 22+.

  • npx (no install):
npx -y @steipete/summarize "https://example.com"
  • npm (global):
npm i -g @steipete/summarize
  • npm (library / minimal deps):
npm i @steipete/summarize-core
import { createLinkPreviewClient } from "@steipete/summarize-core/content";
  • Homebrew (custom tap):
brew install steipete/tap/summarize

Homebrew availability depends on the current tap formula for your architecture. If Homebrew install fails on Intel/x64, use the npm global install above.

Optional local dependencies

Install these if you want media-heavy features:

  • ffmpeg: required for --slides and many local media/transcription flows
  • yt-dlp: required for YouTube slide extraction and some remote media flows
  • tesseract: optional OCR for --slides-ocr
  • Optional cloud transcription providers:
    • GROQ_API_KEY
    • ASSEMBLYAI_API_KEY
    • GEMINI_API_KEY / GOOGLE_GENERATIVE_AI_API_KEY / GOOGLE_API_KEY
    • OPENAI_API_KEY
    • FAL_KEY
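
To check which cloud transcription providers your environment would enable, something like the sketch below may help. It only maps the environment variables listed above to provider names. The provider labels, the function name, and the ordering (taken from the order this README lists them in) are assumptions, not the CLI's internal selection logic.

```typescript
// Hypothetical helper: names and ordering are assumptions, not the CLI's internals.
type Env = Record<string, string | undefined>;

// Cloud providers with the env vars from the list above that enable each one.
const CLOUD_PROVIDERS: ReadonlyArray<{ name: string; keys: string[] }> = [
  { name: "groq", keys: ["GROQ_API_KEY"] },
  { name: "assemblyai", keys: ["ASSEMBLYAI_API_KEY"] },
  { name: "gemini", keys: ["GEMINI_API_KEY", "GOOGLE_GENERATIVE_AI_API_KEY", "GOOGLE_API_KEY"] },
  { name: "openai", keys: ["OPENAI_API_KEY"] },
  { name: "fal", keys: ["FAL_KEY"] },
];

// Returns the providers that have at least one non-empty key set, in list order.
function configuredCloudProviders(env: Env): string[] {
  return CLOUD_PROVIDERS
    .filter((p) => p.keys.some((k) => (env[k] ?? "").trim() !== ""))
    .map((p) => p.name);
}
```

In Node you would pass `process.env` as the argument.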

macOS (Homebrew):

brew install ffmpeg yt-dlp
brew install tesseract # optional, for --slides-ocr

If --slides is enabled and these tools are missing, Summarize warns and continues without slides.
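
The warn-and-continue behavior can be pictured with the sketch below. It is an illustration under assumptions: `planSlides`, its option names, and the injected `hasTool` check are hypothetical, and the real CLI may decide differently (for example, which tools it requires for non-YouTube sources).

```typescript
// Hypothetical sketch of "warn and continue without slides"; not the CLI's code.
type ToolCheck = (tool: string) => boolean;

interface SlidePlan {
  slides: boolean;
  ocr: boolean;
  warnings: string[];
}

function planSlides(
  opts: { slides: boolean; slidesOcr: boolean; isYouTube: boolean },
  hasTool: ToolCheck,
): SlidePlan {
  const warnings: string[] = [];
  if (!opts.slides) return { slides: false, ocr: false, warnings };

  // ffmpeg is needed for slides; yt-dlp additionally for YouTube sources.
  const required = opts.isYouTube ? ["ffmpeg", "yt-dlp"] : ["ffmpeg"];
  const missing = required.filter((tool) => !hasTool(tool));
  if (missing.length > 0) {
    warnings.push(`slides disabled: missing ${missing.join(", ")}`);
    return { slides: false, ocr: false, warnings };
  }

  // tesseract is optional: without it, slides still work but OCR is skipped.
  let ocr = opts.slidesOcr;
  if (ocr && !hasTool("tesseract")) {
    warnings.push("OCR disabled: missing tesseract");
    ocr = false;
  }
  return { slides: true, ocr, warnings };
}
```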

CLI vs extension

  • CLI only: just install via npm/Homebrew and run summarize ... (no daemon needed).
  • Chrome/Firefox extension: install the CLI and run summarize daemon install --token <TOKEN> so the Side Panel can stream results and use local tools.

Quickstart

summarize "https://example.com"

Inputs

URLs or local paths:

summarize "/path/to/file.pdf" --model google/gemini-3-flash
summarize "https://example.com/report.pdf" --model google/gemini-3-flash
summarize "/path/to/audio.mp3"
summarize "/path/to/video.mp4"

Stdin (pipe content using -):

echo "content" | summarize -
pbpaste | summarize -
# binary stdin also works (PDF/image/audio/video bytes)
cat /path/to/file.pdf | summarize -

Notes:

  • Stdin has a 50MB size limit
  • The - argument tells summarize to read from standard input
  • Text stdin is treated as UTF-8 text (whitespace-only input is rejected as empty)
  • Binary stdin is preserved as raw bytes and file type is auto-detected when possible
  • Useful for piping clipboard content or command output
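
As a rough mental model of the stdin rules above, consider the sketch below. It is not the CLI's implementation: `classifyStdin` is a hypothetical name, "50MB" is read as 50 MiB, and the text-versus-binary test (strict UTF-8 decoding plus a NUL-byte check) is an assumed heuristic.

```typescript
// Hypothetical sketch of the stdin rules; the binary heuristic is an assumption.
const STDIN_LIMIT_BYTES = 50 * 1024 * 1024; // assumption: "50MB" read as 50 MiB

type StdinInput =
  | { kind: "text"; text: string }
  | { kind: "binary"; bytes: Uint8Array };

function classifyStdin(bytes: Uint8Array): StdinInput {
  if (bytes.byteLength > STDIN_LIMIT_BYTES) {
    throw new Error("stdin exceeds the 50MB limit");
  }
  let text: string;
  try {
    // fatal: true makes invalid UTF-8 throw instead of inserting U+FFFD.
    text = new TextDecoder("utf-8", { fatal: true }).decode(bytes);
  } catch {
    return { kind: "binary", bytes }; // raw bytes are preserved untouched
  }
  if (text.includes("\u0000")) return { kind: "binary", bytes };
  if (text.trim() === "") throw new Error("stdin is empty");
  return { kind: "text", text };
}
```

Real file-type detection would inspect magic bytes (for example `%PDF-`), which this sketch leaves out.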

YouTube (supports youtube.com and youtu.be):

summarize "https://youtu.be/dQw4w9WgXcQ" --youtube auto

Podcast RSS (transcribes latest enclosure):

summarize "https://feeds.npr.org/500005/podcast.xml"

Apple Podcasts episode page:

summarize "https://podcasts.apple.com/us/podcast/2424-jelly-roll/id360084272?i=1000740717432"

Spotify episode page (best-effort; may fail for exclusives):

summarize "https://open.spotify.com/episode/5auotqWAXhhKyb9ymCuBJY"

Output length

--length controls how much output we ask for (guideline), not a hard cap.

summarize "https://example.com" --length long
summarize "https://example.com" --length 20k

  • Presets: short|medium|long|xl|xxl
  • Character targets: 1500, 20k, 20000
  • Optional hard cap: --max-output-tokens <count> (e.g. 2000, 2k)
    • Provider/model APIs still enforce their own maximum output limits.
    • If omitted, no max token parameter is sent (provider default).
    • Prefer --length unless you need a hard cap.
  • Short content: when extracted content is shorter than the requested length, the CLI returns the content as-is.
    • Override with --force-summary to always run the LLM.
  • Minimums: --length numeric values must be >= 50 chars; --max-output-tokens must be >= 16.
  • Preset targets (source of truth: packages/core/src/prompts/summary-lengths.ts):
    • short: target ~900 chars (range 600-1,200)
    • medium: target ~1,800 chars (range 1,200-2,500)
    • long: target ~4,200 chars (range 2,500-6,000)
    • xl: target ~9,000 chars (range 6,000-14,000)
    • xxl: target ~17,000 chars (range 14,000-22,000)
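
The length rules above can be summarized in a short sketch. The preset targets and minimums are copied from this list. The function names, the integer-only `k` suffix parsing, and the choice to compare content length against the target character count are assumptions for illustration, not the code in `summary-lengths.ts`.

```typescript
// Hypothetical sketch of --length handling; targets are copied from the list above.
const PRESET_TARGETS = { short: 900, medium: 1800, long: 4200, xl: 9000, xxl: 17000 } as const;
type Preset = keyof typeof PRESET_TARGETS;

// Parses "1500", "20000", or "20k" (assumption: integers only, optional k suffix).
function parseCount(raw: string): number {
  const match = /^(\d+)(k?)$/i.exec(raw.trim());
  if (!match) throw new Error(`invalid count: ${raw}`);
  return Number(match[1]) * (match[2] ? 1000 : 1);
}

// Resolves a --length value to a target character count.
function resolveLengthTarget(raw: string): number {
  const key = raw.trim().toLowerCase();
  if (Object.prototype.hasOwnProperty.call(PRESET_TARGETS, key)) {
    return PRESET_TARGETS[key as Preset];
  }
  const chars = parseCount(key);
  if (chars < 50) throw new Error("--length must be >= 50 chars");
  return chars;
}

// Resolves --max-output-tokens, which has its own minimum.
function resolveMaxOutputTokens(raw: string): number {
  const tokens = parseCount(raw);
  if (tokens < 16) throw new Error("--max-output-tokens must be >= 16");
  return tokens;
}

// Short content is returned as-is unless --force-summary is set.
function shouldSummarize(contentChars: number, targetChars: number, forceSummary = false): boolean {
  return forceSummary || contentChars >= targetChars;
}
```

For example, a 500-character page with `--length short` would be returned unchanged under this model, while the same page with `--force-summary` would still go to the LLM.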

What file types work?

Best effort and provider-dependent. These usually work well:

  • text/* and common structured text (.txt, .md, .json, .yaml, .xml, ...)
    • Text-like files are inlined into the prompt for better provider compatibility.
  • PDFs: application/pdf (provider support varies; Google is the most reliable here)
  • Images: image/jpeg, image/png, image/webp, image/gif
  • Audio/Video: audio/*, video/* (local MP3, WAV, M4A, OGG, FLAC, MP4, MOV, and WEBM files are transcribed automatically when the model supports it)

Notes:

  • If a provider rejects a media type, the CLI fails fast with a friendly message.
  • xAI models do not support attaching generic files (like PDFs) via the AI SDK; use Google/OpenAI/Anthropic for those.
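
One way to picture the categories above is a small media-type classifier. This is an illustrative sketch only: the `classifyMediaType` name, the `InputKind` labels, and the exact set of structured-text types are assumptions, and real provider support still varies as noted.

```typescript
// Hypothetical classifier for the categories listed above; not the CLI's code.
type InputKind = "text" | "pdf" | "image" | "media" | "unsupported";

// Assumption: a few common structured-text types that are not under text/*.
const STRUCTURED_TEXT = new Set([
  "application/json",
  "application/xml",
  "application/yaml",
  "application/x-yaml",
]);
const IMAGES = new Set(["image/jpeg", "image/png", "image/webp", "image/gif"]);

function classifyMediaType(mediaType: string): InputKind {
  // Drop parameters such as "; charset=utf-8" before comparing.
  const type = (mediaType.split(";")[0] ?? "").trim().toLowerCase();
  if (type.startsWith("text/") || STRUCTURED_TEXT.has(type)) return "text";
  if (type === "application/pdf") return "pdf";
  if (IMAGES.has(type)) return "image";
  if (type.startsWith("audio/") || type.startsWith("video/")) return "media";
  return "unsupported";
}
```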

Model ids

Use gateway-style ids: <provider>/<model>.

Examples:

  • openai/gpt-5-mini
  • anthropic/claude-sonnet-4-5
  • xai/grok-4-fast-non-reasoning
  • google/gemini-3-flash
  • zai/glm-4.7
  • openrouter/openai/gpt-5-mini (force OpenRouter)

Note: some models/providers do not support streaming or certain file media types. When that happens, the CLI prints a friendly error (or auto-disables streaming for that model when supported by the provider).
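
For scripts that build these ids, the `<provider>/<model>` shape can be split as in the sketch below. `parseModelId` is a hypothetical helper, not part of the CLI. It assumes the provider is everything before the first slash, so `openrouter/openai/gpt-5-mini` yields provider `openrouter` and model `openai/gpt-5-mini`. Special values such as `auto` are not gateway-style ids and are rejected here.

```typescript
// Hypothetical helper for gateway-style ids; special values like "auto" are out of scope.
interface ModelId {
  provider: string;
  model: string;
}

function parseModelId(id: string): ModelId {
  const trimmed = id.trim();
  const slash = trimmed.indexOf("/");
  // Require a non-empty provider before the first slash and a non-empty model after it.
  if (slash <= 0 || slash === trimmed.length - 1) {
    throw new Error(`expected <provider>/<model>, got "${id}"`);
  }
  return { provider: trimmed.slice(0, slash), model: trimmed.slice(slash + 1) };
}
```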

Limits

  • Text inputs over 10 MB are rejected before tokenization.
  • Text prompts are preflighted against the model input limit (LiteLLM catalog), using a GPT tokenizer.
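
The order of these two checks (size first, tokens second) can be sketched as follows. This is an approximation under stated assumptions: "10 MB" is read as 10 MiB, `estimateTokens` is a crude four-characters-per-token stand-in for a real GPT tokenizer, and `modelInputLimit` is supplied by the caller rather than looked up in the LiteLLM catalog.

```typescript
// Hypothetical preflight sketch; the token estimate is a stand-in, not a real tokenizer.
const MAX_TEXT_BYTES = 10 * 1024 * 1024; // assumption: "10 MB" read as 10 MiB

type Preflight =
  | { ok: true; tokens: number }
  | { ok: false; reason: "too-large" | "over-input-limit" };

// Stand-in: roughly 4 characters per token. A real check would use a GPT tokenizer.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

function preflight(text: string, modelInputLimit: number): Preflight {
  // Size check runs first so oversized input is never tokenized.
  if (new TextEncoder().encode(text).byteLength > MAX_TEXT_BYTES) {
    return { ok: false, reason: "too-large" };
  }
  const tokens = estimateTokens(text);
  if (tokens > modelInputLimit) return { ok: false, reason: "over-input-limit" };
  return { ok: true, tokens };
}
```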

Common flags

summarize <input> [flags]

Use summarize --help or summarize help for the full help text.

  • --model <provider/model>: which model to use (defaults to auto)
  • --model auto: a