# sqlxport

Modular CLI + API tool to extract data from PostgreSQL, Redshift, SQLite (and more), exporting to formats such as Parquet and CSV, with optional S3/MinIO upload and Athena integration.
## ✅ Features

- 🔄 Run custom SQL queries against PostgreSQL, Redshift, and SQLite
- 📦 Export to Parquet or CSV (`--format`)
- 🪣 Upload results to S3 or MinIO
- 🔄 Redshift `UNLOAD` support (`--export-mode redshift-unload`)
- 🧹 Partition output by column
- 📜 Generate Athena `CREATE TABLE` DDL
- 🔍 Preview local or remote Parquet/CSV files
- ⚙️ `.env` support for convenient config
- 🐍 Reusable Python API
## ❓ Why sqlxport?

sqlxport simplifies data export workflows and is designed for automation:

- ✅ One command takes you from SQL to Parquet/CSV to S3
- 🧱 Works locally, in CI, or inside Docker
- 🪢 Connects easily to Athena, MinIO, and Redshift
- 🔌 Clean plugin model for formats and databases
- 🧪 Fully tested, scriptable, and production-ready
## 📦 Installation

```bash
pip install .
# or for development
pip install -e .
```
## 🚀 Usage

### Choose an Export Mode

| `--export-mode`   | Compatible DB URLs             | Description                 |
|-------------------|--------------------------------|-----------------------------|
| `postgres-query`  | `postgresql://`, `postgres://` | `SELECT` + local export     |
| `redshift-unload` | `redshift://`                  | `UNLOAD` to S3              |
| `sqlite-query`    | `sqlite:///path.db`            | Local/lightweight testing   |
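The table pairs each URL scheme with a mode; that pairing can be sketched as a small helper (a hypothetical `guess_export_mode`, not part of sqlxport's API — its real dispatch logic may differ):

```python
from urllib.parse import urlparse

# Illustrative scheme-to-mode mapping mirroring the table above.
MODE_FOR_SCHEME = {
    "postgresql": "postgres-query",
    "postgres": "postgres-query",
    "redshift": "redshift-unload",
    "sqlite": "sqlite-query",
}

def guess_export_mode(db_url: str) -> str:
    """Pick an --export-mode value from the DB URL's scheme."""
    return MODE_FOR_SCHEME[urlparse(db_url).scheme]

print(guess_export_mode("sqlite:///path.db"))  # sqlite-query
```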
### CLI Examples

#### Basic Export

```bash
sqlxport run \
  --export-mode postgres-query \
  --db-url postgresql://user:pass@localhost:5432/mydb \
  --query "SELECT * FROM users" \
  --output-file users.parquet \
  --format parquet
```
#### S3 Upload

```bash
sqlxport run \
  --export-mode postgres-query \
  --db-url postgresql://... \
  --query "..." \
  --output-file users.parquet \
  --s3-bucket my-bucket \
  --s3-key users.parquet \
  --s3-access-key AKIA... \
  --s3-secret-key ... \
  --s3-endpoint https://s3.amazonaws.com
```
#### Partitioned Export

```bash
sqlxport run \
  --export-mode postgres-query \
  --db-url postgresql://... \
  --query "..." \
  --output-dir output/ \
  --partition-by group_column \
  --format csv
```
#### Redshift UNLOAD Mode

```bash
sqlxport run \
  --export-mode redshift-unload \
  --db-url redshift://... \
  --query "SELECT * FROM large_table" \
  --s3-output-prefix s3://bucket/unload/ \
  --iam-role arn:aws:iam::123456789012:role/MyUnloadRole
```
## 🐍 Python API

```python
from sqlxport.api.export import run_export, ExportJobConfig

config = ExportJobConfig(
    db_url="sqlite:///test.db",
    query="SELECT * FROM users",
    format="csv",
    output_file="out.csv",
    export_mode="sqlite-query",
)
run_export(config)
```
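To try the API example end-to-end, the `test.db` it queries can be seeded with Python's built-in `sqlite3` module (the `users` schema below is illustrative):

```python
import sqlite3

# Create and populate the test.db queried by the API example above.
conn = sqlite3.connect("test.db")
conn.execute("DROP TABLE IF EXISTS users")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users (name) VALUES (?)", [("ada",), ("bob",)])
conn.commit()
count = conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]
conn.close()
print(count)  # 2
```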
## 🧪 Running Tests

```bash
pytest tests/unit/
pytest tests/integration/
pytest tests/e2e/
```
## 🔧 Environment Variables

sqlxport reads configuration from a `.env` file or exported shell variables:

```bash
DB_URL=postgresql://username:password@localhost:5432/mydb
S3_BUCKET=my-bucket
S3_KEY=data/users.parquet
S3_ACCESS_KEY=...
S3_SECRET_KEY=...
S3_ENDPOINT=https://s3.amazonaws.com
IAM_ROLE=arn:aws:iam::123456789012:role/MyUnloadRole
```

Generate a template with:

```bash
sqlxport run --generate-env-template
```
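A `.env` file is just `KEY=VALUE` lines. A minimal loader sketch is below; sqlxport's own loading mechanism may differ (for example, it may rely on python-dotenv):

```python
import os
import tempfile

def load_env(path: str) -> None:
    """Read simple KEY=VALUE lines into os.environ, skipping blanks and comments."""
    with open(path) as fh:
        for line in fh:
            line = line.strip()
            if not line or line.startswith("#") or "=" not in line:
                continue
            key, _, value = line.partition("=")
            os.environ.setdefault(key.strip(), value.strip())

# Demo against a throwaway file
env_path = os.path.join(tempfile.mkdtemp(), ".env")
with open(env_path, "w") as fh:
    fh.write("S3_BUCKET=my-bucket\n# a comment\nDB_URL=sqlite:///test.db\n")
load_env(env_path)
print(os.environ["S3_BUCKET"])  # my-bucket
```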
## 🛠 Roadmap

- ✅ Modular export modes
- ✅ CSV and partitioned output
- ⏳ `jsonl` and `xlsx` formats
- ⏳ Plugin system for writers/loaders
- ⏳ SaaS mode / UI platform
- ⏳ Kafka/Kinesis streaming support
## 🔐 Security

- Don't commit `.env` files
- Use credential vaults when possible
## 👨‍💼 Author

Vahid Saber

Built with ❤️ for data engineers and developers.

## 📄 License

MIT License
