# s5cmd

Parallel S3 and local filesystem execution tool.

## Overview
s5cmd is a very fast S3 and local filesystem execution tool. It supports a
multitude of operations, including tab completion and wildcard support for
files, which can be very handy in object storage workflows that involve large
numbers of files.

There are already other utilities for working with S3 and similar object
storage services, so it is natural to wonder what s5cmd has to offer that
others don't. In short: speed.

Thanks to Joshua Robinson for his study and experimentation on s5cmd; to quote
his Medium post:

> For uploads, s5cmd is 32x faster than s3cmd and 12x faster than aws-cli. For
> downloads, s5cmd can saturate a 40Gbps link (~4.3 GB/s), whereas s3cmd and
> aws-cli can only reach 85 MB/s and 375 MB/s respectively.

If you would like to know more about the performance of s5cmd and the reasons
for its speed, refer to the benchmarks section.
## Features

s5cmd supports a wide range of object management tasks, both for cloud
storage services and local filesystems:
- List buckets and objects
- Upload, download or delete objects
- Move, copy or rename objects
- Set Server Side Encryption using AWS Key Management Service (KMS)
- Set Access Control List (ACL) for objects/files on upload, copy, and move
- Print object contents to stdout
- Select JSON records from objects using SQL expressions
- Create or remove buckets
- Summarize object sizes, grouping by storage class
- Wildcard support for all operations
- Multiple arguments support for delete operation
- Command file support to run commands in batches at very high execution speeds
- Dry run support
- S3 Transfer Acceleration support
- Google Cloud Storage (and any other S3 API compatible service) support
- Structured logging for querying command outputs
- Shell auto-completion
- S3 ListObjects API backward compatibility
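The structured-logging feature above is exposed through the global `--json`
flag, which switches command output to machine-readable JSON. A minimal
sketch (the bucket name and output file are placeholders):

```shell
# Emit JSON log lines instead of plain text, one object per line,
# so the listing can be queried with tools like jq
s5cmd --json ls s3://bucket/ > listing.jsonl
```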
## Installation

### Official Releases

#### Binaries

The Releases page provides pre-built binaries for Linux, macOS and Windows.

#### Homebrew

For macOS, a homebrew tap is provided:

```sh
brew install peak/tap/s5cmd
```
### Unofficial Releases (by Community)

> ⚠️ These releases are maintained by the community. They might be out of date
> compared to the official releases.
#### MacPorts

You can also install s5cmd from MacPorts on macOS:

```sh
sudo port selfupdate
sudo port install s5cmd
```
#### Conda

s5cmd is included in the conda-forge channel. To install it, first add
conda-forge to your channels:

```sh
conda config --add channels conda-forge
conda config --set channel_priority strict
```

Once the conda-forge channel has been enabled, s5cmd can be installed with:

```sh
conda install s5cmd
```

(Quoted from the s5cmd feedstock; you can find further instructions in its
README.)
#### FreeBSD

On FreeBSD you can install s5cmd as a package:

```sh
pkg install s5cmd
```

or via ports:

```sh
cd /usr/ports/net/s5cmd
make install clean
```
### Build from source

You can build s5cmd from source if you have Go 1.19+ installed:

```sh
go install github.com/peak/s5cmd/v2@master
```

> ⚠️ Please note that building from `master` is not guaranteed to be stable,
> since development happens on the `master` branch.
### Docker

#### Hub

```sh
$ docker pull peakcom/s5cmd
$ docker run --rm -v ~/.aws:/root/.aws peakcom/s5cmd <S3 operation>
```

> ℹ️ The `/aws` directory is the working directory of the image. Mounting your
> current working directory to it allows you to run s5cmd as if it were
> installed on your system:

```sh
docker run --rm -v $(pwd):/aws -v ~/.aws:/root/.aws peakcom/s5cmd <S3 operation>
```
#### Build

```sh
$ git clone https://github.com/peak/s5cmd && cd s5cmd
$ docker build -t s5cmd .
$ docker run --rm -v ~/.aws:/root/.aws s5cmd <S3 operation>
```
## Usage

s5cmd supports multi-level wildcards for all S3 operations. This is achieved
by listing all S3 objects with the prefix up to the first wildcard, then
filtering the results in memory. For example, for the following command:

```sh
s5cmd cp 's3://bucket/logs/2020/03/*' .
```

a ListObjects request is sent first, then the copy operation is executed
against each matching object, in parallel.
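Because a wildcard can match more objects than expected, it can be useful to
preview the expansion first with the dry-run support mentioned in the
features list. A minimal sketch (bucket and prefix are placeholders):

```shell
# Print the copy commands that would run, without transferring anything
s5cmd --dry-run cp 's3://bucket/logs/2020/03/*' .
```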
### Specifying credentials

s5cmd uses the official AWS SDK to access S3. The SDK requires credentials to
sign requests to AWS. Credentials can be provided in a variety of ways:

- Command line options: `--profile` to use a named profile, and the
  `--credentials-file` flag to use the specified credentials file:

  ```sh
  # Use your company profile in the AWS default credential file
  s5cmd --profile my-work-profile ls s3://my-company-bucket/

  # Use your company profile in your own credential file
  s5cmd --credentials-file ~/.your-credentials-file --profile my-work-profile ls s3://my-company-bucket/
  ```

- Environment variables:

  ```sh
  # Export your AWS access key and secret pair
  export AWS_ACCESS_KEY_ID='<your-access-key-id>'
  export AWS_SECRET_ACCESS_KEY='<your-secret-access-key>'
  export AWS_PROFILE='<your-profile-name>'
  export AWS_REGION='<your-bucket-region>'
  s5cmd ls s3://your-bucket/
  ```

- If s5cmd runs on an Amazon EC2 instance, the EC2 IAM role.

- If s5cmd runs on EKS, the Kube IAM role.

- Or, you can send requests anonymously with the `--no-sign-request` option:

  ```sh
  # List objects in a public bucket
  s5cmd --no-sign-request ls s3://public-bucket/
  ```
### Region detection

While executing commands, s5cmd detects the region according to the following
order of priority:

1. The `--source-region` or `--destination-region` flags of the `cp` command.
2. The `AWS_REGION` environment variable.
3. The region section of the AWS profile.
4. Auto detection from the bucket region (via a `HeadBucket` API call).
5. `us-east-1` as the default region.
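As a sketch of the highest-priority case, the `cp` flags override every other
detection source on their respective sides of the transfer (bucket names and
regions below are placeholders):

```shell
# Server-side copy across regions, overriding region detection
# for both the source and the destination bucket
s5cmd cp --source-region eu-west-1 --destination-region us-west-2 \
  's3://src-bucket/data/*' s3://dst-bucket/data/
```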
## Examples

### Check if a bucket exists

```sh
s5cmd head s3://bucket/
```

### Print a remote object's metadata

```sh
s5cmd head s3://bucket/object.gz
```

### Download a single S3 object

```sh
s5cmd cp s3://bucket/object.gz .
```
### Download multiple S3 objects

Suppose we have the following objects:

```
s3://bucket/logs/2020/03/18/file1.gz
s3://bucket/logs/2020/03/19/file2.gz
s3://bucket/logs/2020/03/19/originals/file3.gz
```

```sh
s5cmd cp 's3://bucket/logs/2020/03/*' logs/
```
s5cmd will match the given wildcards and arguments by doing an efficient
search against the given prefixes. All matching objects will be downloaded in
parallel. s5cmd will create the destination directory if it is missing.
The `logs/` directory content will look like:

```
$ tree
.
└── logs
    ├── 18
    │   └── file1.gz
    └── 19
        ├── file2.gz
        └── originals
            └── file3.gz

4 directories, 3 files
```
> ℹ️ s5cmd preserves the source directory structure by default. If you want to
> flatten the source directory structure, use the `--flatten` flag.

```sh
s5cmd cp --flatten 's3://bucket/logs/2020/03/*' logs/
```

The `logs/` directory content will look like:

```
$ tree
.
└── logs
    ├── file1.gz
    ├── file2.gz
    └── file3.gz

1 directory, 3 files
```
### Upload a file to S3

```sh
s5cmd cp object.gz s3://bucket/
```

Setting server-side encryption (AWS KMS) of the file:

```sh
s5cmd cp -sse aws:kms -sse-kms-key-id <your-kms-key-id> object.gz s3://bucket/
```

Setting the Access Control List (ACL) policy of the object:

```sh
s5cmd cp -acl bucket-owner-full-control object.gz s3://bucket/
```

### Upload multiple files to S3

```sh
s5cmd cp directory/ s3://bucket/
```

This will upload all files in the given directory to S3 while keeping the
folder hierarchy of the source.
### Stream stdin to S3

You can upload objects by piping stdin to s5cmd:

```sh
curl https://github.com/peak/s5cmd/ | s5cmd pipe s3://bucket/s5cmd.html
```

Or you can compress the data before uploading:

```sh
gzip -c file | s5cmd pipe s3://bucket/file.gz
```
### Delete an S3 object

```sh
s5cmd rm s3://bucket/logs/2020/03/18/file1.gz
```

### Delete multiple S3 objects

```sh
s5cmd rm 's3://bucket/logs/2020/03/19/*'
```

This will remove all matching objects:

```
s3://bucket/logs/2020/03/19/file2.gz
s3://bucket/logs/2020/03/19/originals/file3.gz
```
s5cmd utilizes the S3 delete batch API: if there are up to 1000 matching
objects, they'll be deleted in a single request. However, note that commands
such as

```sh
s5cmd rm s3://bucket-foo/object s3://bucket-bar/object
```

are not supported by s5cmd and result in an error, since the objects live in
two different buckets, which is at odds with performing batch delete
requests. If needed, you can use s5cmd's run mode for this case:

```sh
$ s5cmd run
rm s3://bucket-foo/object
rm s3://bucket-bar/object
```

More details and examples on `s5cmd run` are presented in a later section.
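Run mode also accepts a command file, which is the usual way to drive large
batches at high execution speeds, per the command-file support in the
features list. A minimal sketch (the file name is a placeholder):

```shell
# Write one s5cmd command per line, then execute them in parallel
cat > commands.txt <<'EOF'
rm s3://bucket-foo/object
rm s3://bucket-bar/object
EOF
s5cmd run commands.txt
```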
### Copy objects from S3 to S3

s5cmd supports copying objects on the server side as well:

```sh
s5cmd cp 's3://bucket/logs/2020/*' s3://
```
