A simple, fast downloader with resume and progress. Use the standalone binary or Docker — no Haskell setup required.

Why Yank

Built for flaky networks: resume via HTTP Range and per-file retries
Clear progress: percent, file count, bytes, aggregate speed
Simple inputs: positional URLs or -f file.txt

Highlights

⚡ Concurrent downloads with configurable worker pool
🔄 Automatic retries
📊 Progress line with % complete, per-file size, and aggregate speed
⏸️ Resume partial downloads (HTTP Range)
📂 Read URLs from file (-f file.txt, one URL per line)
🎯 Type-safe CLI (optparse-applicative) and robust error reporting
🔗 Google Drive support: Download public files and folders directly with automatic URL conversion and preserve original filenames

Why not a shell script?

Built-in resume and retries with HTTP Range handling (shell curl loops often restart from scratch).
Live progress UI (TUI or quiet summary) across multiple files, not per-process logs.
Concurrency with backpressure via worker pool instead of ad-hoc xargs/background jobs.
Consistent summary (success/fail/bytes/time) and exit codes for automation.
Single static binary or Docker image; no bespoke dependencies or fragile per-URL logic.

Comparison with Other Tools

vs. wget/curl

| Feature | Yank | wget | curl | |---------|------|------|------| | Concurrent downloads | ✅ Built-in worker pool | ❌ Manual scripting needed | ❌ Manual scripting needed | | Resume support | ✅ HTTP Range with -r flag | ✅ -c flag | ✅ -C - flag | | Progress tracking | ✅ Aggregate multi-file TUI | ⚠️ Per-file only | ⚠️ Per-file only | | Automatic retries | ✅ Configurable per-file | ⚠️ Limited (--tries) | ❌ Manual implementation | | Google Drive files | ✅ Auto-detect & convert URLs | ❌ No support | ❌ No support | | Google Drive folders | ✅ Extract & download all files | ❌ No support | ❌ No support | | URL list from file | ✅ -f file.txt | ✅ -i file.txt | ❌ Requires xargs | | Summary report | ✅ Success/fail/bytes/time | ❌ No aggregate stats | ❌ No aggregate stats |

vs. gdown

| Feature | Yank | gdown | |---------|------|-------| | Google Drive files | ✅ Auto-detect share links | ✅ Supports file IDs | | Google Drive folders | ✅ All files, no limit | ⚠️ Up to 50 files only | | Folder structure | ✅ Creates named subfolders | ❌ Downloads to current dir | | Mixed URL types | ✅ Drive + HTTP/HTTPS in one run | ❌ Drive-only | | Concurrent downloads | ✅ Configurable workers | ❌ Sequential only | | Resume support | ✅ HTTP Range for all sources | ❌ No resume | | Progress tracking | ✅ Real-time TUI with speed | ⚠️ Basic progress bar | | Automatic retries | ✅ Configurable per-file | ❌ No retries | | Multiple URLs | ✅ From CLI or file | ⚠️ Single file at a time | | Authentication | ❌ Public links only | ⚠️ Limited auth support |

Key advantages of Yank:

No 50-file limit for Google Drive folders (gdown limitation)
Mixed sources: Download from Google Drive, GitHub releases, direct URLs in one command
Production-ready: Concurrent downloads, retries, resume, and comprehensive error handling
Folder organization: Automatically creates subfolders matching Drive folder names

Downloads

Prebuilt archives are published on GitHub Releases (tar.gz/zip per OS/arch).
After extracting, place the binary on your PATH (e.g., ~/.local/bin on Linux/macOS or %USERPROFILE%\bin on Windows).

Download and Install the Binary

Head over to the Releases page and grab the latest Linux archive (usually named yank-vX.Y.Z-linux-amd64.tar.gz).

Once you have the file on your machine or server, follow these steps to install it:

# Unzip the archive
tar -xzvf yank-vX.Y.Z-linux-amd64.tar.gz

# Make the binary executable
chmod +x yank

# Move it to your local bin path so you can run it from anywhere
sudo mv yank /usr/local/bin/

Verify the installation:

# Check that yank is accessible from anywhere
which yank

# Display the version to confirm it's working
yank --version

If which yank doesn't show /usr/local/bin/yank, ensure /usr/local/bin is in your $PATH:

echo $PATH
# Should include /usr/local/bin

# If not, add it to your shell profile (~/.bashrc, ~/.zshrc, etc.):
export PATH="/usr/local/bin:$PATH"

Examples

Basic Usage: Parallel Data Pulls

The most common use case for ML engineers is downloading a list of URLs from a text file. If you have a datasets.txt containing links to your CSVs or other types of files:

# Publicly available direct-download CSV, ZIP and GZ files
# Last checked functional around early 2026
# All links are from open government, academic or well-known open data sources
# No login / payment required

# ────────────────────────────────────────────────
# Plain .csv files (direct download, no compression)
# ────────────────────────────────────────────────

https://people.sc.fsu.edu/~jburkardt/data/csv/airtravel.csv
# Very small classic dataset: monthly international airline passengers 1958–1960

https://people.sc.fsu.edu/~jburkardt/data/csv/addresses.csv
# Tiny example file with fake names and addresses

https://people.sc.fsu.edu/~jburkardt/data/csv/biostats.csv
# Very small: office workers height, weight, age, etc.

https://img.exim.gov/s3fs-public/dataset/vbhv-d8am/Data.Gov_-_FY25_Q3.csv
# Example export-related data snapshot from export.gov / data.gov

https://edg.epa.gov/EPADataCommons/public/OA/EPA_SmartLocationDatabase_V3_Jan_2021_Final.csv
# EPA Smart Location Database – large (~ hundreds of MB), good for urban / transport analysis

# ────────────────────────────────────────────────
# .zip files (archives, usually containing CSVs inside)
# ────────────────────────────────────────────────

https://edg.epa.gov/EPADataCommons/public/OA/WalkabilityIndex.zip
# EPA National Walkability Index dataset

https://galaxy-zoo-1.s3.amazonaws.com/GalaxyZoo1_DR_table5.csv.zip
# Galaxy Zoo citizen science project – classifications table

# ────────────────────────────────────────────────
# .gz / .gzip files (compressed, usually TSV or CSV inside)
# ────────────────────────────────────────────────

https://datasets.imdbws.com/title.basics.tsv.gz
# IMDb – title basics (movies, series, episodes, etc.) – very popular, ~1 GB uncompressed

https://datasets.imdbws.com/title.ratings.tsv.gz
# IMDb – user ratings and vote counts

https://datasets.imdbws.com/title.akas.tsv.gz
# IMDb – alternate titles / regions / languages

https://galaxy-zoo-1.s3.amazonaws.com/GalaxyZoo1_DR_table2.csv.gz
# Galaxy Zoo – another classifications / demographics table

https://ftp.ncbi.nlm.nih.gov/geo/series/GSE147nnn/GSE147507/suppl/GSE147507_counts_processed_ENSEMBL.txt.gz
# NCBI GEO – single-cell RNA-seq count matrix example (COVID-19 PBMC study)

# ────────────────────────────────────────────────
# Quick test / funny / tiny files
# ────────────────────────────────────────────────

https://people.sc.fsu.edu/~jburkardt/data/csv/cities.csv
# Very small list of world cities with population & country

https://people.sc.fsu.edu/~jburkardt/data/csv/oscar_age_male.csv
# Small: ages of Best Actor Oscar winners

Then run:

yank -c 5 -o downloads -f datasets.txt --no-tui --retries 1

Example output:

🚀 Yank v0.2.1 - Concurrent Downloader
📦 Downloading 15 files with concurrency 5

✅ [  6% | 1/15] 18 MB (18 MB/s) Data.Gov_-_FY25_Q3.csv
✅ [ 13% | 2/15] 321 B (18 MB/s) airtravel.csv
✅ [ 20% | 3/15] 328 B (18 MB/s) addresses.csv
✅ [ 26% | 4/15] 849 B (18 MB/s) biostats.csv
✅ [ 33% | 5/15] 3 MB (5 MB/s) GalaxyZoo1_DR_table5.csv.zip
✅ [ 40% | 6/15] 7 MB (5 MB/s) title.ratings.tsv.gz
✅ [ 46% | 7/15] 206 MB (39 MB/s) title.basics.tsv.gz
✅ [ 53% | 8/15] 19 MB (21 MB

Yank

Install / Use

README