# Petaly

Python open-source ETL tool for seamless data movement across PostgreSQL, MySQL, Redshift, BigQuery, S3, GCS, and CSV files, with YAML/JSON-based configuration.


## Overview

Petaly is an open-source ETL/ELT (Extract, Load, Transform) tool created by and for data professionals. Our mission is to simplify data movement across different platforms with a tool that truly understands the needs of the data community.
## Key Features

- **Multiple Data Sources**: support for various endpoints:
  - PostgreSQL
  - MySQL
  - BigQuery
  - Redshift
  - Google Cloud Storage (GCS bucket)
  - S3 bucket
  - Local CSV files
- **Features**:
  - Source-to-target schema evaluation and mapping
  - CSV file load with column-type recognition
  - Target table structure generation
  - Configurable type mapping between different databases
  - Full table unload/load in CSV format
- **User-Friendly**: no programming knowledge required
- **YAML/JSON Configuration**: easy pipeline setup
- **Cloud Ready**: full support for AWS and GCP
**[EXPERIMENTAL]**

Petaly went agentic! The AI Agent can create and run pipelines using natural-language prompts. If you're interested in exploring, check out the experimental branch: `petaly-ai-agent`. Feedback is welcome!
## Quick Start

### System Requirements
- Python 3.10 - 3.12
- Operating system:
  - Linux
  - macOS

> Note: Petaly may work on other operating systems and Python versions, but these haven't been tested yet.
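Before installing, you can confirm that your interpreter falls inside the supported range. This is a minimal sketch using only the standard library; the version bounds come from the requirements above:

```python
import sys

# Petaly supports Python 3.10 through 3.12 (per the requirements above).
supported = (3, 10) <= sys.version_info[:2] <= (3, 12)
print(f"Python {sys.version_info.major}.{sys.version_info.minor} supported: {supported}")
```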
### Installation

#### Basic Installation

```shell
# Create and activate a virtual environment
mkdir petaly
cd petaly
python3 -m venv .venv
source .venv/bin/activate

# Install Petaly
python3 -m pip install petaly
```
#### Cloud Provider Support

**GCP Support**

```shell
# Install with GCP support
python3 -m pip install petaly[gcp]
```

Prerequisites:

- Install the Google Cloud SDK
- Configure access to your Google Cloud project
- Set up service-account authentication

**AWS Support**

```shell
# Install with AWS support
python3 -m pip install petaly[aws]
```

Prerequisites:

- Install the AWS CLI
- Configure AWS credentials

#### Full Installation

```shell
# Install all features, including AWS and GCP
python3 -m pip install petaly[all]
```
#### From Source

```shell
# Clone the repository
git clone https://github.com/petaly-labs/petaly.git
cd petaly

# Create and activate a virtual environment
python3 -m venv .venv
source .venv/bin/activate

# Install dependencies
pip3 install -r requirements.txt

# Install in editable mode (recommended)
pip install -e .

# Alternative: add src to PYTHONPATH
export PYTHONPATH=$PYTHONPATH:$(pwd)/src
```
## Configuration

### 1. Initialize Configuration

```shell
# Create petaly.ini in the default location (~/.petaly/petaly.ini)
python3 -m petaly init

# Or specify a custom location
python3 -m petaly -c /absolute-path-to-your-config-dir/petaly.ini init
```

### 2. Set Environment Variable (Optional)

```shell
# Set the environment variable if the config folder differs from the default location
export PETALY_CONFIG_DIR=/absolute-path-to-your-config-dir

# Alternatively, pass the main config path with -c on every run
python3 -m petaly -c /absolute-path-to-your-config-dir/petaly.ini [command]
```
### 3. Initialize Workspace

Configure `petaly.ini`:

```ini
[workspace_config]
pipeline_dir_path=/home/user/petaly/pipelines
logs_dir_path=/home/user/petaly/logs
output_dir_path=/home/user/petaly/output

[global_settings]
logging_mode=INFO
pipeline_format=yaml
```

Create the workspace:

```shell
python3 -m petaly init --workspace
```
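Because `petaly.ini` is standard INI syntax, it can be inspected with Python's built-in `configparser` — handy for sanity-checking workspace paths from a script. A minimal sketch; the section and key names are taken from the example above:

```python
import configparser
from pathlib import Path

# Write the example config from above to a file for demonstration.
ini_text = """\
[workspace_config]
pipeline_dir_path=/home/user/petaly/pipelines
logs_dir_path=/home/user/petaly/logs
output_dir_path=/home/user/petaly/output

[global_settings]
logging_mode=INFO
pipeline_format=yaml
"""
path = Path("petaly_example.ini")
path.write_text(ini_text)

# Parse it back and read individual settings.
config = configparser.ConfigParser()
config.read(path)

print(config["workspace_config"]["pipeline_dir_path"])  # /home/user/petaly/pipelines
print(config["global_settings"]["pipeline_format"])     # yaml
```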
## Create Pipeline

Initialize a new pipeline:

```shell
python3 -m petaly init -p my_pipeline
```

Follow the wizard to configure your pipeline. For detailed configuration options, see the Pipeline Configuration Guide.
## Run Pipeline

Execute your pipeline:

```shell
python3 -m petaly run -p my_pipeline
```

### Run Specific Operations

```shell
# Extract data from the source only
python3 -m petaly run -p my_pipeline --source_only

# Load data to the target only
python3 -m petaly run -p my_pipeline --target_only

# Run specific objects only
python3 -m petaly run -p my_pipeline -o object1,object2
```
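If you trigger these runs from Python (for example, from a scheduler wrapper), the invocations above can be assembled with the standard library. `petaly_run_argv` is a hypothetical helper, not part of Petaly:

```python
import shlex
import subprocess
import sys

def petaly_run_argv(pipeline: str, *extra_args: str) -> list[str]:
    """Build the argv for `python -m petaly run -p <pipeline>` (hypothetical helper)."""
    return [sys.executable, "-m", "petaly", "run", "-p", pipeline, *extra_args]

argv = petaly_run_argv("my_pipeline", "--source_only")
print(shlex.join(argv))

# To actually execute (requires Petaly to be installed):
# subprocess.run(argv, check=True)
```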
## Tutorial: CSV to PostgreSQL

### Prerequisites

- Petaly installed and workspace initialized
- PostgreSQL server running

### Steps

1. Initialize the pipeline:

   ```shell
   python3 -m petaly init -p csv_to_postgres
   ```

2. Download the test data:

   ```shell
   # Download and extract test files
   gunzip options.csv.gz
   gunzip stocks.csv.gz
   ```
3. Configure the pipeline:

   - Use `csv` as source
   - Use `postgres` as target
   - Configure the database connection details

4. Run the pipeline:

   ```shell
   python3 -m petaly run -p csv_to_postgres
   ```
### Example Configuration

```yaml
pipeline:
  pipeline_attributes:
    pipeline_name: csv_to_postgres
    is_enabled: true
  source_attributes:
    connector_type: csv
  target_attributes:
    connector_type: postgres
    database_user: root
    database_password: db-password
    database_host: localhost
    database_port: 5432
    database_name: petalydb
    database_schema: petaly_tutorial
  data_attributes:
    use_data_objects_spec: only
    object_default_settings:
      header: true
      columns_delimiter: ","
      columns_quote: none
```
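Since `pipeline_format` in `petaly.ini` also accepts `json`, the same pipeline can presumably be expressed as JSON. The sketch below builds the tutorial configuration as a plain Python dict and serializes it with the standard library; the nesting mirrors the YAML example above, and the assumption that Petaly accepts an identical JSON layout is ours:

```python
import json

# The tutorial pipeline from above as a Python dict (nesting mirrors the
# YAML example; identical JSON layout in Petaly is an assumption).
pipeline = {
    "pipeline": {
        "pipeline_attributes": {
            "pipeline_name": "csv_to_postgres",
            "is_enabled": True,
        },
        "source_attributes": {"connector_type": "csv"},
        "target_attributes": {
            "connector_type": "postgres",
            "database_user": "root",
            "database_password": "db-password",
            "database_host": "localhost",
            "database_port": 5432,
            "database_name": "petalydb",
            "database_schema": "petaly_tutorial",
        },
    }
}

print(json.dumps(pipeline, indent=2))
```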
## Documentation

## Contributing

We welcome contributions! Please see our Contributing Guide for details.

## License

Petaly is licensed under the Apache License 2.0. See the LICENSE file for details.