S3mirror
Production-ready Python utility for mirroring buckets and objects between S3-compatible endpoints with parallel transfers, comprehensive logging, and automation-friendly CLI.
S3 Mirror 🪞
S3 Mirror is a production-ready Python utility for synchronizing buckets and objects between S3-compatible endpoints. Built on boto3, it provides enterprise-grade reliability with comprehensive logging, parallelized transfers, and automation-friendly operation.
Table of Contents
- Overview
- Key Features
- Prerequisites
- Installation
- Configuration
- Usage
- Logging
- Safety Considerations
- Continuous Integration
- Project Structure
- Contributing
- License
Overview
Motivation: While MinIO's mc client has served as a capable mirroring tool, recent upstream changes and deprecation of essential features created concerns about long-term reliability and availability. S3 Mirror addresses this gap by providing:
- Complete independence from proprietary tooling ecosystems
- Foundation on boto3, the industry-standard AWS SDK for Python
- Full transparency and auditability of synchronization operations
- Universal S3 compatibility across AWS, MinIO, Ceph, Backblaze B2, Wasabi, and other providers
This tool is designed for infrastructure engineers and DevOps teams requiring dependable, scriptable S3 replication without vendor lock-in.
Key Features
✅ Multi-Endpoint Synchronization – Mirror buckets and objects between any S3-compatible services
✅ Performance Optimization – Configurable parallelization with multipart upload support
✅ True Mirroring – Optional deletion of extraneous destination objects for exact replication
✅ Flexible Configuration – YAML/JSON config files with CLI flag overrides
✅ Production Logging – Multiple logging modes including cron-friendly file output with silent console operation
✅ Automation Ready – Idempotent design for reliable scheduled execution
✅ CI/CD Validated – Automated linting and formatting across Python 3.10–3.13
✅ Dependency Management – Automated security updates via Dependabot
Prerequisites
- Python 3.10 or higher (tested through 3.13)
- S3 Credentials: AWS access keys or IAM credentials for both source and destination endpoints
- Network Access: Connectivity to both S3 endpoints (including proxy/firewall configuration if required)
Installation
Clone the repository and set up the Python environment:
git clone https://github.com/soakes/s3mirror.git
cd s3mirror
# Create and activate virtual environment
python -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
# Install dependencies
pip install -r requirements.txt
Configuration
S3 Mirror uses YAML or JSON configuration files to define connection parameters and synchronization behavior. Create a config file based on the template below:
source:
  endpoint_url: "https://s3.source.example.com"
  aws_access_key_id: "SOURCE_ACCESS_KEY"
  aws_secret_access_key: "SOURCE_SECRET_KEY"
  region_name: "us-east-1"
  verify_ssl: false
destination:
  endpoint_url: "https://s3.destination.example.com"
  aws_access_key_id: "DEST_ACCESS_KEY"
  aws_secret_access_key: "DEST_SECRET_KEY"
  region_name: "us-east-1"
  verify_ssl: false
performance:
  max_workers: 20               # Parallel transfer threads
  multipart_threshold: 8388608  # 8 MB - files larger trigger multipart upload
  multipart_chunksize: 8388608  # 8 MB - chunk size for multipart uploads
  max_concurrency: 10           # Concurrent S3 operations per thread
  max_pool_connections: 50      # HTTP connection pool size
sync:
  delete_extraneous: true       # Remove objects in destination not present in source
  exclude_buckets: []           # Bucket names to skip during mirroring
Configuration Parameters
Source/Destination Blocks:
- endpoint_url: S3-compatible API endpoint URL
- aws_access_key_id / aws_secret_access_key: Authentication credentials
- region_name: AWS region identifier (required even for non-AWS endpoints)
- verify_ssl: SSL certificate verification (disable for self-signed certificates)
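As a rough illustration of how one source or destination block might be handed to boto3, the sketch below turns a block into keyword arguments for boto3.client. The helper name client_kwargs is hypothetical and not taken from s3mirror.py, and the mapping of verify_ssl onto boto3's verify argument is an assumption about how the script wires things up.

```python
def client_kwargs(block: dict) -> dict:
    """Translate one source/destination block into boto3.client("s3", ...) kwargs."""
    required = (
        "endpoint_url",
        "aws_access_key_id",
        "aws_secret_access_key",
        "region_name",
    )
    missing = [key for key in required if not block.get(key)]
    if missing:
        raise ValueError("missing config keys: " + ", ".join(missing))

    kwargs = {key: block[key] for key in required}
    # Assumption: verify_ssl maps onto boto3's `verify` argument.
    # Verification stays on unless the config explicitly disables it.
    kwargs["verify"] = bool(block.get("verify_ssl", True))
    return kwargs


source_block = {
    "endpoint_url": "https://s3.source.example.com",
    "aws_access_key_id": "SOURCE_ACCESS_KEY",
    "aws_secret_access_key": "SOURCE_SECRET_KEY",
    "region_name": "us-east-1",
    "verify_ssl": False,
}
kwargs = client_kwargs(source_block)
print(kwargs["endpoint_url"], kwargs["verify"])
# With boto3 installed, the client could then be created with:
#   boto3.client("s3", **kwargs)
```

Keeping the translation in a small pure function makes it possible to check a config file for missing keys before any network connection is attempted.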
Performance Tuning:
- Adjust max_workers based on available CPU cores and network bandwidth
- Set multipart_threshold and multipart_chunksize according to typical object sizes
- Increase max_pool_connections for high-throughput scenarios
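To get a feel for how multipart_threshold and multipart_chunksize interact with object size, the following sketch estimates how many parts an upload would be split into. The function names are hypothetical, and the rule that objects at or above the threshold use multipart is an assumption about the transfer layer; the 10,000-part ceiling is the documented S3 multipart limit.

```python
import math

MIB = 1024 * 1024
MAX_PARTS = 10_000  # S3 multipart uploads allow at most 10,000 parts


def part_count(object_size: int, threshold: int = 8 * MIB, chunksize: int = 8 * MIB) -> int:
    """Estimate the number of parts for an upload (1 means a single PUT)."""
    # Assumption: objects at or above the threshold are uploaded as multipart.
    if object_size < threshold:
        return 1
    return max(1, math.ceil(object_size / chunksize))


def exceeds_part_limit(object_size: int, chunksize: int = 8 * MIB) -> bool:
    """True when the chunk size is too small to fit the object in 10,000 parts."""
    return math.ceil(object_size / chunksize) > MAX_PARTS


print(part_count(5 * MIB))              # below the threshold: single request
print(part_count(1024 * MIB))           # 1 GiB in 8 MiB chunks
print(exceeds_part_limit(100 * 1024 * MIB))  # 100 GiB in 8 MiB chunks
```

If most objects are far larger than the configured chunk size, raising multipart_chunksize reduces the number of requests per object; if most are small, the threshold matters more than the chunk size.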
Sync Behavior:
- delete_extraneous: Enable true mirroring by removing destination-only objects
- exclude_buckets: Skip specific buckets (useful for test/temporary buckets)
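The effect of exclude_buckets can be pictured with a small sketch that parses a JSON config and filters a list of bucket names. Both helper names are hypothetical, the bucket names are made up for illustration, and only the JSON form is shown because it needs nothing beyond the standard library (a YAML file would need PyYAML).

```python
import json


def load_json_config(text: str) -> dict:
    """Parse a JSON config and check that both endpoint sections are present."""
    config = json.loads(text)
    for section in ("source", "destination"):
        if section not in config:
            raise ValueError(f"config is missing the '{section}' section")
    return config


def buckets_to_mirror(source_buckets: list, config: dict) -> list:
    """Return the source buckets that are not listed in sync.exclude_buckets."""
    excluded = set(config.get("sync", {}).get("exclude_buckets", []))
    return [name for name in source_buckets if name not in excluded]


config = load_json_config(
    """
    {
      "source": {"endpoint_url": "https://s3.source.example.com"},
      "destination": {"endpoint_url": "https://s3.destination.example.com"},
      "sync": {"delete_extraneous": true, "exclude_buckets": ["scratch"]}
    }
    """
)
print(buckets_to_mirror(["backups", "scratch", "media"], config))
```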
Usage
Basic Operation
Execute a synchronization using your configuration file:
./s3mirror.py --config config.yaml
Command-Line Options
# Silent mode (console shows errors only)
./s3mirror.py --config config.yaml --quiet
# Log to file with silent console (ideal for cron jobs)
./s3mirror.py --config config.yaml --log-file /var/log/s3mirror.log
# Debug mode with verbose output
./s3mirror.py --config config.yaml --debug
# Disable deletion of extraneous objects
./s3mirror.py --config config.yaml --no-delete
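The way a CLI flag can override a config value is sketched below with argparse. This is not the parser from s3mirror.py: only the flag names come from the examples above, while the help strings, the helper effective_delete, and the choice to treat a missing delete_extraneous key as false are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Build a parser that accepts the flags shown in the examples above."""
    parser = argparse.ArgumentParser(prog="s3mirror.py")
    parser.add_argument("--config", required=True, help="path to a YAML or JSON config file")
    parser.add_argument("--quiet", action="store_true", help="console shows errors only")
    parser.add_argument("--debug", action="store_true", help="verbose console output")
    parser.add_argument("--log-file", help="write a detailed log to this file")
    parser.add_argument("--no-delete", action="store_true", help="never delete destination objects")
    return parser


def effective_delete(config: dict, args: argparse.Namespace) -> bool:
    """The CLI flag wins over the config file when deciding whether to delete."""
    if args.no_delete:
        return False
    # Assumption: a missing key is treated as "do not delete".
    return bool(config.get("sync", {}).get("delete_extraneous", False))


args = build_parser().parse_args(["--config", "config.yaml", "--no-delete"])
config = {"sync": {"delete_extraneous": True}}
print(effective_delete(config, args))
```

Resolving the override in one place means the rest of the code only ever consults a single boolean, regardless of where the setting came from.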
Cron Automation Example
Add to your crontab for scheduled synchronization:
# Run daily at 2:00 AM with file logging
0 2 * * * /path/to/s3mirror/.venv/bin/python /path/to/s3mirror/s3mirror.py --config /path/to/config.yaml --log-file /var/log/s3mirror.log --quiet
Logging
S3 Mirror provides multiple logging modes tailored to different operational contexts:
| Mode | Console Output | File Output | Use Case |
|------|----------------|-------------|----------|
| Normal | Human-readable progress messages | None | Interactive execution |
| Debug | Colorized [LEVEL] messages with details | None | Troubleshooting |
| File Log | Errors only | Full DEBUG with timestamps | Production automation |
| Quiet | None (unless errors occur) | None | Minimal output scenarios |
Recommendation: For production cron jobs, use --log-file with --quiet to maintain detailed logs while preventing unnecessary console output.
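A minimal sketch of how the modes in the table could be set up with the standard logging module follows. The function configure_logging, the logger name, and the format strings are hypothetical; the sketch only mirrors the behaviour described in the table and makes no claim about how s3mirror.py implements it.

```python
import logging
import sys


def configure_logging(log_file=None, quiet=False, debug=False) -> logging.Logger:
    """Attach a console handler and, optionally, a DEBUG-level file handler."""
    logger = logging.getLogger("s3mirror_sketch")
    logger.handlers.clear()
    logger.setLevel(logging.DEBUG)
    logger.propagate = False

    console = logging.StreamHandler(sys.stderr)
    if log_file or quiet:
        console.setLevel(logging.ERROR)  # silent console unless something fails
    elif debug:
        console.setLevel(logging.DEBUG)
    else:
        console.setLevel(logging.INFO)
    console.setFormatter(logging.Formatter("[%(levelname)s] %(message)s"))
    logger.addHandler(console)

    if log_file:
        file_handler = logging.FileHandler(log_file)
        file_handler.setLevel(logging.DEBUG)  # full detail with timestamps
        file_handler.setFormatter(
            logging.Formatter("%(asctime)s [%(levelname)s] %(message)s")
        )
        logger.addHandler(file_handler)
    return logger


logger = configure_logging(quiet=True)
logger.info("not shown on the console in quiet mode")
```

Setting the level per handler rather than on the logger is what lets the file receive DEBUG output while the console stays limited to errors.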
Safety Considerations
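⚠️ Deletion Behavior: When delete_extraneous is set to true, S3 Mirror removes objects from the destination that do not exist in the source. This keeps the destination an exact copy of the source, but it also means that an object deleted from the source, or a source endpoint that is misconfigured, leads to deletions on the destination at the next run.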
Best Practices:
- Test in non-production environments first to validate configuration
- Enable deletion only when true mirroring is required (vs. one-way copying)
- Use exclude_buckets to protect specific buckets from synchronization
- Review logs regularly to identify unexpected deletions or errors
- Maintain backup copies of critical data before enabling deletion
To disable deletion while still copying new/changed objects:
./s3mirror.py --config config.yaml --no-delete
Or set delete_extraneous: false in the configuration file.
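The difference between mirroring and one-way copying can be sketched as a planning step over two object listings. The function plan_sync and the object keys are hypothetical, and comparing objects by a (size, ETag) pair is an assumption made for illustration; s3mirror.py may decide what counts as changed differently.

```python
def plan_sync(source: dict, destination: dict, delete_extraneous: bool = True):
    """Return (keys_to_copy, keys_to_delete) for one bucket.

    Both arguments map an object key to a fingerprint, here (size, etag).
    """
    # Copy anything that is missing from the destination or differs from it.
    to_copy = sorted(key for key, meta in source.items() if destination.get(key) != meta)
    # Only a true mirror removes destination objects the source does not have.
    to_delete = sorted(set(destination) - set(source)) if delete_extraneous else []
    return to_copy, to_delete


source = {"a.txt": (10, "e1"), "b.txt": (20, "e2")}
destination = {"a.txt": (10, "e1"), "b.txt": (20, "old"), "stale.txt": (5, "e9")}

print(plan_sync(source, destination))                           # mirror mode
print(plan_sync(source, destination, delete_extraneous=False))  # copy-only mode
```

Printing or logging such a plan before acting on it is one way to review pending deletions when testing a new configuration.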
Continuous Integration
Every commit and pull request is automatically validated through GitHub Actions across Python 3.10 through 3.13:
Code Quality Workflows
Linting (lint.yml):
- Pylint: Static code analysis for code quality and standards compliance
- Cross-version testing: Validates compatibility across all supported Python versions
Formatting (format.yml):
- Black: Code formatting verification (PEP 8 conformance)
- isort: Import statement organization
- Consistent style enforcement across the entire codebase
Dependency Management
Dependabot (dependabot.yml):
- Automated dependency updates for security patches and version bumps
- Weekly scanning of Python packages and GitHub Actions
- Auto-merge workflow (dependabot-auto-merge.yml) for patch and minor updates
The CI pipeline checks linting, formatting, and cross-version compatibility on every change, and Dependabot keeps dependencies current.
Project Structure
s3mirror/
├── .github/
│   ├── dependabot.yml                  # Dependabot configuration
│   └── workflows/
│       ├── dependabot-auto-merge.yml   # Auto-merge for dependency updates
│       ├── format.yml                  # Code formatting checks (Black, isort)
│       └── lint.yml                    # Linting workflow (Pylint)
├── .pylintrc                           # Pylint configuration and standards
├── LICENSE                             # MIT License
├── README.md                           # This documentation
├── requirements.txt                    # Python dependencies (boto3, PyYAML, etc.)
└── s3mirror.py                         # Main synchronization script
Contributing
Contributions are welcome and appreciated. To contribute:
- Fork the repository on GitHub
- Create a feature branch (git checkout -b feature/your-feature)
- Implement your changes with appropriate tests
- Ensure CI passes (run pylint, black, and isort locally)
- Submit a pull request
