Yet Another S3-backed File System: yas3fs

Join the chat at https://gitter.im/danilop/yas3fs

YAS3FS (Yet Another S3-backed File System) is a Filesystem in Userspace (FUSE) interface to Amazon S3. It was inspired by s3fs but rewritten from scratch to implement a distributed cache synchronized by Amazon SNS notifications. A web console is provided to easily monitor the nodes of a cluster through the YAS3FS Console project.

If you use YAS3FS please share your experience on the wiki, thanks!

  • It lets you mount an S3 bucket (or a part of it, if you specify a path) as a local folder.
  • It works on Linux and Mac OS X.
  • For maximum speed, all data read from S3 is cached locally on the node, in memory or on disk, depending on the file size.
  • Parallel multi-part downloads are used if there are reads in the middle of the file (e.g. for streaming).
  • Parallel multi-part uploads are used for files larger than a specified size.
  • With buffering enabled (the default) files can be accessed during the download from S3 (e.g. for streaming).
  • It can be used on more than one node to create a "shared" file system (i.e. a yas3fs "cluster").
  • SNS notifications are used to update other nodes in the cluster that something has changed on S3 and they need to invalidate their cache.
  • Notifications can be received through HTTP or SQS endpoints.
  • If the cache grows to its maximum size, the least recently accessed files are removed.
  • Signed URLs are provided through Extended file attributes (xattr).
  • AWS credentials can be passed using AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables.
  • On an EC2 instance an IAM role can be used to give access to S3/SNS/SQS resources.
  • It is written in Python (2.6) using boto and fusepy.
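
To make the cache eviction rule from the list above concrete, here is a minimal sketch of a size-bounded cache that drops the least recently accessed entries first. It is not the yas3fs implementation: the class name `LRUFileCache` and its methods are hypothetical, it keeps everything in memory, and it is written for Python 3 for readability.

```python
from collections import OrderedDict


class LRUFileCache:
    """Toy size-bounded cache (hypothetical, not the yas3fs code).

    Evicts the least recently accessed entry first when over budget.
    """

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._entries = OrderedDict()  # path -> bytes, oldest access first

    def get(self, path):
        data = self._entries.get(path)
        if data is not None:
            self._entries.move_to_end(path)  # mark as most recently accessed
        return data

    def put(self, path, data):
        if path in self._entries:
            self.used_bytes -= len(self._entries.pop(path))
        self._entries[path] = data
        self.used_bytes += len(data)
        # Evict from the least recently accessed end until the cache fits again.
        while self.used_bytes > self.max_bytes and len(self._entries) > 1:
            _, evicted = self._entries.popitem(last=False)
            self.used_bytes -= len(evicted)

    def invalidate(self, path):
        """Drop an entry, e.g. after a notification that the object changed."""
        data = self._entries.pop(path, None)
        if data is not None:
            self.used_bytes -= len(data)


cache = LRUFileCache(max_bytes=10)
cache.put("/a", b"aaaa")
cache.put("/b", b"bbbb")
cache.get("/a")            # "/a" becomes the most recently accessed entry
cache.put("/c", b"cccc")   # 12 bytes would exceed the limit, so "/b" is evicted
```

The `invalidate` method stands in for what a node would do when a notification tells it that an object changed on S3.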

This is a personal project. No relation whatsoever exists between this project and my employer.

License

Copyright (c) 2012-2014 Danilo Poccia, http://danilop.net

This code is licensed under the MIT License (MIT). Please see the LICENSE file that accompanies this project for the terms of use.

Introduction

This is the logical architecture of yas3fs:

[Figure: yas3fs Logical Architecture]

I strongly suggest starting yas3fs for the first time with the -df (debug + foreground) options, to check for errors. When everything works it can be interrupted (with ^C) and restarted to run in the background (the default when the -f option is not given).

To mount an S3 bucket without using SNS (i.e. for a single node):

yas3fs s3://bucket/path /path/to/mount
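
The first argument combines a bucket name with an optional path inside it. As an illustration only, this sketch splits such an argument into its two parts; the helper name `split_s3_path` is hypothetical and yas3fs may parse the argument differently.

```python
from urllib.parse import urlparse


def split_s3_path(s3_path):
    """Split 's3://bucket/path' into (bucket, key_prefix)."""
    parsed = urlparse(s3_path)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError("expected s3://bucket[/path], got %r" % s3_path)
    # The path part is optional; strip the surrounding slashes.
    return parsed.netloc, parsed.path.strip("/")


bucket, prefix = split_s3_path("s3://bucket/path")
```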

To persist file system metadata such as attr/xattr, yas3fs uses S3 User Metadata. To mount an S3 bucket without actually writing metadata in it, e.g. because it is a bucket you mainly use as a repository and not as a file system, you can use the --no-metadata option.

To mount an S3 bucket using SNS and listening to an SQS endpoint:

yas3fs s3://bucket/path /path/to/mount --topic TOPIC-ARN --new-queue

To mount an S3 bucket using SNS and listening to an HTTP endpoint (on EC2):

yas3fs s3://bucket/path /path/to/mount --topic TOPIC-ARN --use-ec2-hostname --port N

On EC2 the security group must allow inbound traffic from SNS on the selected port.

On EC2 the command line doesn't need any information on the actual server and can easily be used within an Auto Scaling group.
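
If you launch yas3fs from a provisioning script, the three mount modes above can be assembled programmatically. This is a sketch under assumptions: the function `build_mount_command` is hypothetical, and it uses the `--use-ec2-hostname` spelling listed in the Full Usage output. It only builds the argument list and does not run anything.

```python
def build_mount_command(s3_path, mount_point, topic_arn=None, http_port=None):
    """Return the yas3fs argument list for one of the three mount modes."""
    cmd = ["yas3fs", s3_path, mount_point]
    if topic_arn is None:
        if http_port is not None:
            raise ValueError("an HTTP port only makes sense with an SNS topic")
        return cmd  # single node, no SNS
    cmd += ["--topic", topic_arn]
    if http_port is None:
        cmd.append("--new-queue")  # listen on a new SQS queue
    else:
        # listen on an HTTP endpoint using the EC2 hostname
        cmd += ["--use-ec2-hostname", "--port", str(http_port)]
    return cmd


single = build_mount_command("s3://bucket/path", "/path/to/mount")
with_sqs = build_mount_command("s3://bucket/path", "/path/to/mount",
                               topic_arn="TOPIC-ARN")
with_http = build_mount_command("s3://bucket/path", "/path/to/mount",
                                topic_arn="TOPIC-ARN", http_port=8080)
```

The resulting list can be passed to `subprocess.call`. The port 8080 is just a placeholder for the N in the command above.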

Quick Installation

WARNING: PIP installation is no longer supported. Use "git clone" instead.

Requires Python 2.6 or higher. Install using pip.

pip install yas3fs

If it fails, check the CentOS 6 installation steps below.

If you want to do a quick test here's the installation procedure depending on the OS flavor (Linux or Mac):

  • Create an S3 bucket in the AWS region you prefer.
  • You don't need to create anything in the bucket as the initial path (if any) is created by the tool on the first mount.
  • If you want to use an existing S3 bucket you can use the --no-metadata option to not use user metadata to persist file system attr/xattr.
  • If you want to have more than one node in sync, create an SNS topic in the same region as the S3 bucket and write down the full topic ARN (you need it to run the tool if more than one client is connected to the same bucket/path).
  • Create an IAM role that gives access to the S3 and SNS/SQS resources you need, or pass the AWS credentials to the tool using environment variables (see -h).
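
When no IAM role is available, the credentials go in the AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY environment variables. A minimal sketch of preparing such an environment for a child process follows; the helper name `env_with_credentials` and the key values are placeholders.

```python
import os


def env_with_credentials(access_key_id, secret_access_key, base_env=None):
    """Return a copy of the environment with the two AWS variables set."""
    env = dict(os.environ if base_env is None else base_env)
    env["AWS_ACCESS_KEY_ID"] = access_key_id
    env["AWS_SECRET_ACCESS_KEY"] = secret_access_key
    return env


# Placeholder values; never hard-code real credentials in a script.
env = env_with_credentials("EXAMPLE-KEY-ID", "EXAMPLE-SECRET",
                           base_env={"PATH": "/usr/bin"})
# The result could then be used as: subprocess.call(["yas3fs", ...], env=env)
```

Copying the environment instead of modifying `os.environ` keeps the credentials scoped to the yas3fs process only.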

On Amazon Linux

sudo yum -y install fuse fuse-libs
sudo easy_install pip
sudo pip install yas3fs # assume root installation
sudo sed -i'' 's/^# *user_allow_other/user_allow_other/' /etc/fuse.conf # uncomment user_allow_other
yas3fs -h # See the usage
mkdir LOCAL-PATH
# For single host mount
yas3fs s3://BUCKET/PATH LOCAL-PATH
# For multiple hosts mount
yas3fs s3://BUCKET/PATH LOCAL-PATH --topic TOPIC-ARN --new-queue

On Ubuntu Linux

sudo apt-get update
sudo apt-get -y install fuse python-pip 
sudo pip install yas3fs # assume root installation
sudo sed -i'' 's/^# *user_allow_other/user_allow_other/' /etc/fuse.conf # uncomment user_allow_other
sudo chmod a+r /etc/fuse.conf # make it readable by anybody, it is not the default on Ubuntu
yas3fs -h # See the usage
mkdir LOCAL-PATH
# For single host mount
yas3fs s3://BUCKET/PATH LOCAL-PATH
# For multiple hosts mount
yas3fs s3://BUCKET/PATH LOCAL-PATH --topic TOPIC-ARN --new-queue
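
The sed command in the Linux steps above uncomments user_allow_other in /etc/fuse.conf. To verify the result from a script, a check along these lines may help. It is a sketch, and the function name `user_allow_other_enabled` is hypothetical.

```python
import re

# Matches a line that contains only "user_allow_other" (not commented out).
_ENABLED = re.compile(r"^\s*user_allow_other\s*$", re.MULTILINE)


def user_allow_other_enabled(fuse_conf_text):
    """Return True if the given fuse.conf text enables user_allow_other."""
    return _ENABLED.search(fuse_conf_text) is not None


before = "# mount_max = 1000\n# user_allow_other\n"
after = "# mount_max = 1000\nuser_allow_other\n"
```

In practice the text would come from reading /etc/fuse.conf, which is why the file has to be readable by the user running the check.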

On a Mac with OS X

Install FUSE for OS X from http://osxfuse.github.com.

sudo pip install yas3fs # assume root installation
mkdir LOCAL-PATH
# For single host mount
yas3fs s3://BUCKET/PATH LOCAL-PATH
# For multiple hosts mount
yas3fs s3://BUCKET/PATH LOCAL-PATH --topic TOPIC-ARN --new-queue

On CentOS 6

sudo yum -y install fuse fuse-libs centos-release-scl
sudo yum -y install python27
# upgrade setuptools
scl enable python27 -- pip install setuptools --upgrade
# grab the latest sources
git clone https://github.com/danilop/yas3fs.git
cd yas3fs
scl enable python27 -- python setup.py install
scl enable python27 -- yas3fs -h # See the usage
mkdir LOCAL-PATH
# For single host mount
scl enable python27 -- yas3fs s3://BUCKET/PATH LOCAL-PATH
# For multiple hosts mount
scl enable python27 -- yas3fs s3://BUCKET/PATH LOCAL-PATH --topic TOPIC-ARN --new-queue

/etc/fstab support

# Copy contrib/mount.yas3fs to /usr/local/sbin and create the symlink
chmod +x /usr/local/sbin/mount.yas3fs
cd /sbin; sudo ln -s /usr/local/sbin/mount.yas3fs.centos6 # replace centos6 with amzn1 for an Amazon Linux installation
# Add the contents of contrib/fstab.snippet to /etc/fstab and modify accordingly
# Try to mount
mount /mnt/mybucket

Workaround to unmount yas3fs correctly during host shutdown or reboot

sudo cp contrib/unmount-yas3fs.init.d /etc/init.d/unmount-yas3fs
sudo chmod +x /etc/init.d/unmount-yas3fs
sudo chkconfig --add unmount-yas3fs
sudo chkconfig unmount-yas3fs on
sudo /etc/init.d/unmount-yas3fs start

To listen to SNS HTTP notifications on a Mac (I usually suggest using SQS instead) you need to install the Python M2Crypto module. Download the most suitable "egg" from http://chandlerproject.org/Projects/MeTooCrypto#Downloads.

sudo easy_install M2Crypto-*.egg

If something does not work as expected you can use the -df options to run in foreground and in debug mode.

Unmount

To unmount the file system on Linux:

fusermount -u LOCAL-PATH
or
umount LOCAL-PATH

The latter works if the /etc/fstab support steps (see above) were completed.

To unmount the file system on a Mac you can use umount.
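
A script that has to run on both platforms could pick the unmount command as sketched below. The function name `unmount_command` is hypothetical. On Linux it uses fusermount -u, since plain umount depends on the /etc/fstab setup.

```python
import sys


def unmount_command(mount_point, platform=None):
    """Return the unmount argument list for Linux or Mac."""
    platform = sys.platform if platform is None else platform
    if platform.startswith("linux"):
        return ["fusermount", "-u", mount_point]
    if platform == "darwin":
        return ["umount", mount_point]
    raise ValueError("unsupported platform: %s" % platform)


linux_cmd = unmount_command("LOCAL-PATH", platform="linux")
mac_cmd = unmount_command("LOCAL-PATH", platform="darwin")
```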

rsync usage

Use rsync's --inplace option to avoid S3 busy events.
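
By default rsync writes each file to a temporary name and renames it when the transfer completes; with --inplace it writes directly to the destination file. A small sketch of a wrapper that always adds the option follows. The function name `rsync_command` is hypothetical and the paths are placeholders.

```python
def rsync_command(source, destination, extra_args=()):
    """Return an rsync argument list that always includes --inplace."""
    return ["rsync", "--inplace"] + list(extra_args) + [source, destination]


cmd = rsync_command("SOURCE-DIR/", "LOCAL-PATH/", extra_args=["-a"])
```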

Full Usage

yas3fs -h

usage: yas3fs [-h] [--region REGION] [--topic ARN] [--new-queue]
              [--new-queue-with-hostname] [--queue NAME] 
              [--queue-wait N] [--queue-polling N] [--nonempty]
              [--hostname HOSTNAME] [--use-ec2-hostname] [--port N]
              [--cache-entries N] [--cache-mem-size N] [--cache-disk-size N]
              [--cache-path PATH] [--recheck-s3] [--cache-on-disk N] [--cache-check N]
              [--s3-num N] [--download-num N] [--prefetch-num N] [--st-blksize N]
              [--buffer-size N] [--buffer-prefetch N] [--no-metadata]
              [--prefetch] [--mp-size N] [--mp-num N] [--mp-retries N]
              [--s3-retries N] [--s3-retries-sleep N] 
              [--s3-use-sigv4] [--s3-endpoint URI]
              [--aws-managed-encryption] 
              [--no-allow-other]
              [--download-retries-num N] [--download-retries-sleep N]
              [--read-retries-num N] [--read-retries-sleep N]
              [--id ID] [--mkdir] [--uid N] [--gid N] [--umask MASK]
              [--read-only] [--expiration N] [--requester-pays]
              [--with-plugin-file FILE] [--with-plugin-class CLASS]
              [-l FILE] 
              [--log-mb-size N] [--log-backup-count N] [--log-backup-gzip]
              [-f] [-d] [-V]
              S3Path LocalPath

YAS3FS (Y