Yank
Yank is a fast CLI downloader that fetches multiple files concurrently from URLs, with progress tracking, resume support, and automatic retries. Read URLs from a file or arguments and parallelize downloads efficiently.
A simple, fast downloader with resume and progress. Use the standalone binary or Docker — no Haskell setup required.
Why Yank
- Built for flaky networks: resume via HTTP Range and per-file retries
- Clear progress: percent, file count, bytes, aggregate speed
- Simple inputs: positional URLs or -f file.txt
Highlights
- ⚡ Concurrent downloads with configurable worker pool
- 🔄 Automatic retries
- 📊 Progress line with % complete, per-file size, and aggregate speed
- ⏸️ Resume partial downloads (HTTP Range)
- 📂 Read URLs from file (-f file.txt, one URL per line)
- 🎯 Type-safe CLI (optparse-applicative) and robust error reporting
- 🔗 Google Drive support: Download public files and folders directly, with automatic URL conversion and original filenames preserved
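
The resume highlight relies on HTTP Range requests. The Python sketch below illustrates the general technique only; it is not taken from Yank's source, and the helper names `resume_headers` and `write_mode` are hypothetical. The idea is to ask the server for the bytes that are still missing, then append or restart depending on whether the server honored the range.

```python
def resume_headers(bytes_on_disk: int) -> dict:
    """Build request headers for resuming a download.

    With nothing on disk, a plain GET is enough, so no Range header is sent.
    Otherwise "bytes=N-" asks for everything from offset N to the end.
    """
    if bytes_on_disk <= 0:
        return {}
    return {"Range": f"bytes={bytes_on_disk}-"}


def write_mode(status: int) -> str:
    """Choose how to open the local file from the response status code.

    206 Partial Content: the server honored the range, so append.
    200 OK: the server ignored the range and is sending the whole file,
    so overwrite what is on disk instead of appending to it.
    """
    if status == 206:
        return "ab"
    if status == 200:
        return "wb"
    raise ValueError(f"unexpected status for a download: {status}")


# Example: 1024 bytes already saved, server replies 206.
print(resume_headers(1024), write_mode(206))
```

Checking the status code matters because a server without Range support replies 200 with the full body, and appending that to the partial file would corrupt it.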
Why not a shell script?
- Built-in resume and retries with HTTP Range handling (shell curl loops often restart from scratch).
- Live progress UI (TUI or quiet summary) across multiple files, not per-process logs.
- Concurrency with backpressure via worker pool instead of ad-hoc xargs/background jobs.
- Consistent summary (success/fail/bytes/time) and exit codes for automation.
- Single static binary or Docker image; no bespoke dependencies or fragile per-URL logic.
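
To make the worker-pool and retry points concrete, here is a minimal Python sketch of bounded concurrency with per-file retries. It shows the pattern in general terms and is not Yank's implementation (Yank is a Haskell program). The names `fetch_with_retries` and `download_all` are hypothetical, and `fetch` stands in for any function that downloads one URL and returns a byte count.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Iterable, Union


def fetch_with_retries(fetch: Callable[[str], int], url: str, retries: int) -> int:
    """Call fetch(url), retrying up to `retries` extra times on OSError."""
    last_error: Exception = RuntimeError("fetch was never attempted")
    for _attempt in range(max(retries, 0) + 1):
        try:
            return fetch(url)
        except OSError as err:  # ConnectionError, TimeoutError, etc.
            last_error = err
    raise last_error


def download_all(
    fetch: Callable[[str], int],
    urls: Iterable[str],
    workers: int = 4,
    retries: int = 1,
) -> Dict[str, Union[int, Exception]]:
    """Run fetch over urls with at most `workers` downloads in flight.

    Returns a dict mapping each URL to its byte count, or to the
    exception that remained after the last retry.
    """
    results: Dict[str, Union[int, Exception]] = {}
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {
            url: pool.submit(fetch_with_retries, fetch, url, retries)
            for url in urls
        }
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except OSError as err:
                results[url] = err
    return results
```

Collecting one result per URL, including failures, is what makes a final success/fail summary and a meaningful exit code possible.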
Comparison with Other Tools
vs. wget/curl
| Feature | Yank | wget | curl |
|---------|------|------|------|
| Concurrent downloads | ✅ Built-in worker pool | ❌ Manual scripting needed | ⚠️ --parallel (-Z) in curl 7.66+ |
| Resume support | ✅ HTTP Range with -r flag | ✅ -c flag | ✅ -C - flag |
| Progress tracking | ✅ Aggregate multi-file TUI | ⚠️ Per-file only | ⚠️ Per-file only |
| Automatic retries | ✅ Configurable per-file | ⚠️ Limited (--tries) | ⚠️ Limited (--retry) |
| Google Drive files | ✅ Auto-detect & convert URLs | ❌ No support | ❌ No support |
| Google Drive folders | ✅ Extract & download all files | ❌ No support | ❌ No support |
| URL list from file | ✅ -f file.txt | ✅ -i file.txt | ❌ Requires xargs |
| Summary report | ✅ Success/fail/bytes/time | ❌ No aggregate stats | ❌ No aggregate stats |
vs. gdown
| Feature | Yank | gdown |
|---------|------|-------|
| Google Drive files | ✅ Auto-detect share links | ✅ Supports file IDs |
| Google Drive folders | ✅ All files, no limit | ⚠️ Up to 50 files only |
| Folder structure | ✅ Creates named subfolders | ❌ Downloads to current dir |
| Mixed URL types | ✅ Drive + HTTP/HTTPS in one run | ❌ Drive-only |
| Concurrent downloads | ✅ Configurable workers | ❌ Sequential only |
| Resume support | ✅ HTTP Range for all sources | ❌ No resume |
| Progress tracking | ✅ Real-time TUI with speed | ⚠️ Basic progress bar |
| Automatic retries | ✅ Configurable per-file | ❌ No retries |
| Multiple URLs | ✅ From CLI or file | ⚠️ Single file at a time |
| Authentication | ❌ Public links only | ⚠️ Limited auth support |
Key advantages of Yank:
- No 50-file limit for Google Drive folders (gdown limitation)
- Mixed sources: Download from Google Drive, GitHub releases, direct URLs in one command
- Production-ready: Concurrent downloads, retries, resume, and comprehensive error handling
- Folder organization: Automatically creates subfolders matching Drive folder names
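
The README does not say how Yank converts Drive share links, so the following Python sketch only illustrates one widely used convention: extract the file ID from a share URL and rebuild it as a `uc?export=download` link. The function names and the file IDs in the usage lines are hypothetical, and large files may still require an extra confirmation step that this sketch does not handle.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Matches share links of the form /file/d/<ID>/view
_FILE_PATH = re.compile(r"^/file/d/([A-Za-z0-9_-]+)")


def drive_file_id(url: str) -> Optional[str]:
    """Return the Drive file ID in a share link, or None if there is none."""
    parsed = urlparse(url)
    if parsed.netloc != "drive.google.com":
        return None
    match = _FILE_PATH.match(parsed.path)
    if match:
        return match.group(1)
    # Older links carry the ID in the query string: /open?id=<ID>
    ids = parse_qs(parsed.query).get("id")
    return ids[0] if ids else None


def to_direct_url(url: str) -> str:
    """Rewrite a Drive file share link; leave every other URL untouched."""
    file_id = drive_file_id(url)
    if file_id is None:
        return url
    return f"https://drive.google.com/uc?export=download&id={file_id}"


# Hypothetical file ID, for illustration only.
print(to_direct_url("https://drive.google.com/file/d/abc123_XY-z/view?usp=sharing"))
print(to_direct_url("https://example.com/data.csv"))
```

Leaving non-Drive URLs unchanged is what allows Drive links and ordinary HTTP/HTTPS links to be mixed in a single list.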
Downloads
- Prebuilt archives are published on GitHub Releases (tar.gz/zip per OS/arch).
- After extracting, place the binary on your PATH (e.g., ~/.local/bin on Linux/macOS or %USERPROFILE%\bin on Windows).
Download and Install the Binary
Head over to the Releases page and grab the latest Linux archive (usually named yank-vX.Y.Z-linux-amd64.tar.gz).
Once you have the file on your machine or server, follow these steps to install it:
# Extract the archive
tar -xzvf yank-vX.Y.Z-linux-amd64.tar.gz
# Make the binary executable
chmod +x yank
# Move it to your local bin path so you can run it from anywhere
sudo mv yank /usr/local/bin/
Verify the installation:
# Check that yank is accessible from anywhere
which yank
# Display the version to confirm it's working
yank --version
If which yank doesn't show /usr/local/bin/yank, ensure /usr/local/bin is in your $PATH:
echo $PATH
# Should include /usr/local/bin
# If not, add it to your shell profile (~/.bashrc, ~/.zshrc, etc.):
export PATH="/usr/local/bin:$PATH"
Examples
Basic Usage: Parallel Data Pulls
A common use case, for ML engineers in particular, is downloading a list of URLs from a text file. Suppose you have a datasets.txt containing links to your CSVs or other files:
# Publicly available direct-download CSV, ZIP and GZ files
# Last checked functional around early 2026
# All links are from open government, academic or well-known open data sources
# No login / payment required
# ────────────────────────────────────────────────
# Plain .csv files (direct download, no compression)
# ────────────────────────────────────────────────
https://people.sc.fsu.edu/~jburkardt/data/csv/airtravel.csv
# Very small classic dataset: monthly international airline passengers 1958–1960
https://people.sc.fsu.edu/~jburkardt/data/csv/addresses.csv
# Tiny example file with fake names and addresses
https://people.sc.fsu.edu/~jburkardt/data/csv/biostats.csv
# Very small: office workers height, weight, age, etc.
https://img.exim.gov/s3fs-public/dataset/vbhv-d8am/Data.Gov_-_FY25_Q3.csv
# Example export-related data snapshot from export.gov / data.gov
https://edg.epa.gov/EPADataCommons/public/OA/EPA_SmartLocationDatabase_V3_Jan_2021_Final.csv
# EPA Smart Location Database – large (~ hundreds of MB), good for urban / transport analysis
# ────────────────────────────────────────────────
# .zip files (archives, usually containing CSVs inside)
# ────────────────────────────────────────────────
https://edg.epa.gov/EPADataCommons/public/OA/WalkabilityIndex.zip
# EPA National Walkability Index dataset
https://galaxy-zoo-1.s3.amazonaws.com/GalaxyZoo1_DR_table5.csv.zip
# Galaxy Zoo citizen science project – classifications table
# ────────────────────────────────────────────────
# .gz / .gzip files (compressed, usually TSV or CSV inside)
# ────────────────────────────────────────────────
https://datasets.imdbws.com/title.basics.tsv.gz
# IMDb – title basics (movies, series, episodes, etc.) – very popular, ~1 GB uncompressed
https://datasets.imdbws.com/title.ratings.tsv.gz
# IMDb – user ratings and vote counts
https://datasets.imdbws.com/title.akas.tsv.gz
# IMDb – alternate titles / regions / languages
https://galaxy-zoo-1.s3.amazonaws.com/GalaxyZoo1_DR_table2.csv.gz
# Galaxy Zoo – another classifications / demographics table
https://ftp.ncbi.nlm.nih.gov/geo/series/GSE147nnn/GSE147507/suppl/GSE147507_counts_processed_ENSEMBL.txt.gz
# NCBI GEO – RNA-seq count matrix example (SARS-CoV-2 infection study)
# ────────────────────────────────────────────────
# Quick test / funny / tiny files
# ────────────────────────────────────────────────
https://people.sc.fsu.edu/~jburkardt/data/csv/cities.csv
# Very small list of world cities with population & country
https://people.sc.fsu.edu/~jburkardt/data/csv/oscar_age_male.csv
# Small: ages of Best Actor Oscar winners
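
A list like the one above mixes URLs with `#` comment lines. The README does not spell out how Yank treats those lines, so the Python sketch below assumes that blank lines and lines starting with `#` are ignored. The helper name `read_url_list` is hypothetical. It can serve as a quick sanity check of how many URLs a list file contains before starting a run.

```python
from typing import List


def read_url_list(text: str) -> List[str]:
    """Return the URLs in a list file's text.

    Assumption: blank lines and lines starting with '#' are not URLs.
    """
    urls = []
    for line in text.splitlines():
        entry = line.strip()
        if not entry or entry.startswith("#"):
            continue
        urls.append(entry)
    return urls


sample = """\
# Plain .csv files (direct download, no compression)
https://people.sc.fsu.edu/~jburkardt/data/csv/airtravel.csv
# Very small classic dataset

https://datasets.imdbws.com/title.ratings.tsv.gz
"""

for url in read_url_list(sample):
    print(url)
```

To check a real file, pass its contents instead of `sample`, for example `read_url_list(open("datasets.txt").read())`.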
Then run:
yank -c 5 -o downloads -f datasets.txt --no-tui --retries 1
Example output:
🚀 Yank v0.2.1 - Concurrent Downloader
📦 Downloading 15 files with concurrency 5
✅ [ 6% | 1/15] 18 MB (18 MB/s) Data.Gov_-_FY25_Q3.csv
✅ [ 13% | 2/15] 321 B (18 MB/s) airtravel.csv
✅ [ 20% | 3/15] 328 B (18 MB/s) addresses.csv
✅ [ 26% | 4/15] 849 B (18 MB/s) biostats.csv
✅ [ 33% | 5/15] 3 MB (5 MB/s) GalaxyZoo1_DR_table5.csv.zip
✅ [ 40% | 6/15] 7 MB (5 MB/s) title.ratings.tsv.gz
✅ [ 46% | 7/15] 206 MB (39 MB/s) title.basics.tsv.gz
✅ [ 53% | 8/15] 19 MB (21 MB
