Sfs

The legacy distributed object storage server developed by PitchPoint Solutions can store billions of large and small files using minimal resources. Object data is stored in replicated volumes implemented like Facebook's Haystack object store. Object metadata, which essentially maps an object name to a volume position, is stored in an Elasticsearch index. (Development by PitchPoint Solutions has been discontinued.)

Simple File Server

Overview

  • Sfs aims to be a file server that can serve and securely store billions of large and small files using minimal resources.
  • The HTTP API implements common features of the OpenStack Swift HTTP API so that it can be used with existing tools.

Features

  • Object metadata is indexed in elasticsearch which allows for quick reads, updates, deletes and listings even when containers contain millions of objects
  • Objects are versioned and the number of revisions stored can be configured per container. This means that you'll never lose an object by overwriting or deleting it (unless you force a deletion).
  • Each object can have a TTL set on it so that manual purging is not required
  • Object data is stored in data files that are themselves replicated and healed when necessary (the object data replication level can be controlled by container independently of the object index settings). If the object size is very small it's stored with the object metadata instead of the data file. This is useful if you're storing token type data.
  • Object data files do not need to be compacted since the block ranges are recycled. The range allocator attempts to be intelligent about which ranges object data is written to so that writes are sequential.
  • Maximum object size is 5GB. It's defined at build time and can be changed.
  • Objects of many terabytes are supported through the openstack swift dynamic large object functionality
  • Each container gets its own index so that object metadata sharding and replication can be controlled on a container level.
  • Object data is encrypted at rest using AES256-GCM if the container is configured to encrypt by default or the object upload request includes the "X-Server-Side-Encryption" http header
  • Master keys are automatically generated, rotated and stored on redundant key management services (Amazon KMS and Azure KMS). You will need accounts on both services, but since sfs uses only a small number of master keys the charges are minimal.
  • Container encryption keys are automatically generated, rotated and not stored in plain text anywhere. Once sfs starts it initializes the master keys and when a container key needs to be decrypted it uses the appropriate master key.
  • A container can be exported into a file and imported into another container. The dynamic large object manifest will also be updated if it references objects in the container that was exported. The container export format is independent of the index and data file format so that an export can be imported into any sfs version that supports the export file format.
  • Container exports can be compressed and encrypted using AES256-GCM if the appropriate http headers are supplied to the http export api
  • Adding new sfs nodes to the cluster is as simple as starting the docker image on another server. New data will always be written to the nodes with the most available storage space. Existing data will not be rebalanced.
  • The entire implementation is event driven and non-blocking. Built using Vert.x.
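
The block-range recycling mentioned in the features above can be pictured with a small free-range allocator. This is only an illustrative sketch, not the sfs implementation (which is written in Java and not described in detail here): the class name `RangeAllocator`, the lowest-offset first-fit policy and the merge-on-release behaviour are all assumptions chosen to show why recycled ranges make compaction unnecessary.

```python
import bisect


class RangeAllocator:
    """Toy free-range allocator: lowest-offset first fit, merging on release.

    Hypothetical illustration only; not taken from the sfs source code.
    """

    def __init__(self, size):
        # Sorted list of free [start, end) ranges; initially the whole file.
        self.free = [(0, size)]

    def allocate(self, length):
        # Scan from the lowest offset so writes tend to progress forward.
        for i, (start, end) in enumerate(self.free):
            if end - start >= length:
                if end - start == length:
                    del self.free[i]
                else:
                    self.free[i] = (start + length, end)
                return start
        raise ValueError("no free range large enough")

    def release(self, start, length):
        # Put the range back in sorted position and merge touching neighbours.
        end = start + length
        i = bisect.bisect_left(self.free, (start, end))
        if i > 0 and self.free[i - 1][1] == start:
            start = self.free[i - 1][0]
            del self.free[i - 1]
            i -= 1
        if i < len(self.free) and self.free[i][0] == end:
            end = self.free[i][1]
            del self.free[i]
        self.free.insert(i, (start, end))
```

Because a released range goes straight back into the free list and is merged with its neighbours, the space of a deleted object can be reused by a later write without rewriting the data file.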

Mailing Lists

Latest release

The most recent release of sfs is release-1.20170106133707.

To run release-1.20170106133707

docker run ... pitchpointsolutions/simple-file-server:release-1.20170106133707

To run the latest release-1

docker pull pitchpointsolutions/simple-file-server:release-1
docker run ... pitchpointsolutions/simple-file-server:release-1

Snapshots

Snapshots of sfs are built from the master branch

To run the latest snapshot

docker pull pitchpointsolutions/simple-file-server:latest
docker run ... pitchpointsolutions/simple-file-server:latest

Quickstart on Linux

docker run -d -P --name sfs_example_elasticsearch elasticsearch:2.4.1 -Des.cluster.name=sfs_example_elasticsearch
HOSTNAME=`hostname` && export HOST_IP=`ping -c1 -n ${HOSTNAME} | head -n1 | sed "s/.*(\([0-9]*\.[0-9]*\.[0-9]*\.[0-9]*\)).*/\1/g"`;
export DOCKER_ES_PORT=`docker port sfs_example_elasticsearch 9300/tcp | sed -E 's/(.+):(.+)/\2/'`
docker run -d -P --add-host localhost:127.0.0.1 \
  -e SFS_HTTP_LISTEN_ADDRESSES=0.0.0.0:80 \
  -e SFS_HTTP_PUBLISH_ADDRESSES=127.0.0.1:80 \
  -e SFS_REMOTENODE_SECRET=YWJjMTIzCg== \
  -e SFS_KEYSTORE_AWS_KMS_ENDPOINT=https://kms.us-east-1.amazonaws.com \
  -e SFS_KEYSTORE_AWS_KMS_KEY_ID=arn:aws:kms:us-east-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab \
  -e SFS_KEYSTORE_AWS_KMS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE \
  -e SFS_KEYSTORE_AWS_KMS_SECRET_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYAWSEXAMPLEKEY \
  -e SFS_KEYSTORE_AZURE_KMS_ENDPOINT=https://yourvaultname.vault.azure.net \
  -e SFS_KEYSTORE_AZURE_KMS_KEY_ID=6603bbb5-cf2e-4367-8327-43ba49ba74b0 \
  -e SFS_KEYSTORE_AZURE_KMS_ACCESS_KEY_ID=a14970c2-397c-4af2-867e-b3480f9eaac6 \
  -e SFS_KEYSTORE_AZURE_KMS_SECRET_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYAZUREEXAMPLEKEY \
  -e SFS_ELASTICSEARCH_CLUSTER_NAME=sfs_example_elasticsearch \
  -e SFS_ELASTICSEARCH_NODE_NAME=${HOST_IP}:${DOCKER_ES_PORT} \
  -e SFS_ELASTICSEARCH_DISCOVERY_ZEN_PING_UNICAST_HOSTS=${HOST_IP}:${DOCKER_ES_PORT} \
  -e SFS_ELASTICSEARCH_DISCOVERY_ZEN_PING_MULTICAST_ENABLED=false \
  -e SFS_ELASTICSEARCH_DISCOVERY_ZEN_PING_UNICAST_ENABLED=true \
  --name sfs_example_middlware \
  pitchpointsolutions/simple-file-server
export DOCKER_SFS_PORT=`docker port sfs_example_middlware 80/tcp | sed -E 's/(.+):(.+)/\2/'`
curl -v -XGET "http://localhost:${DOCKER_SFS_PORT}/admin/001/healthcheck"
curl -XPOST -u admin:admin "http://localhost:${DOCKER_SFS_PORT}/openstackswift001/my_account"
curl -XPUT -u admin:admin "http://localhost:${DOCKER_SFS_PORT}/openstackswift001/my_account/my-container"
curl -XPUT -u admin:admin "http://localhost:${DOCKER_SFS_PORT}/openstackswift001/my_account/my-container/my_object" -d 'abc123'
curl -XGET -u admin:admin "http://localhost:${DOCKER_SFS_PORT}/openstackswift001/my_account/my-container/my_object"

Quickstart using Docker Machine

docker run -d -P --name sfs_example_elasticsearch elasticsearch:2.4.1 -Des.cluster.name=sfs_example_elasticsearch
export HOST_IP=`docker-machine ip`;
export DOCKER_ES_PORT=`docker port sfs_example_elasticsearch 9300/tcp | sed -E 's/(.+):(.+)/\2/'`
docker run -d -P --add-host localhost:127.0.0.1 \
  -e SFS_HTTP_LISTEN_ADDRESSES=0.0.0.0:80 \
  -e SFS_HTTP_PUBLISH_ADDRESSES=127.0.0.1:80 \
  -e SFS_REMOTENODE_SECRET=YWJjMTIzCg== \
  -e SFS_KEYSTORE_AWS_KMS_ENDPOINT=https://kms.us-east-1.amazonaws.com \
  -e SFS_KEYSTORE_AWS_KMS_KEY_ID=arn:aws:kms:us-east-1:111122223333:key/1234abcd-12ab-34cd-56ef-1234567890ab \
  -e SFS_KEYSTORE_AWS_KMS_ACCESS_KEY_ID=AKIAIOSFODNN7EXAMPLE \
  -e SFS_KEYSTORE_AWS_KMS_SECRET_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYAWSEXAMPLEKEY \
  -e SFS_KEYSTORE_AZURE_KMS_ENDPOINT=https://yourvaultname.vault.azure.net \
  -e SFS_KEYSTORE_AZURE_KMS_KEY_ID=6603bbb5-cf2e-4367-8327-43ba49ba74b0 \
  -e SFS_KEYSTORE_AZURE_KMS_ACCESS_KEY_ID=a14970c2-397c-4af2-867e-b3480f9eaac6 \
  -e SFS_KEYSTORE_AZURE_KMS_SECRET_KEY=wJalrXUtnFEMI/K7MDENG/bPxRfiCYAZUREEXAMPLEKEY \
  -e SFS_ELASTICSEARCH_CLUSTER_NAME=sfs_example_elasticsearch \
  -e SFS_ELASTICSEARCH_NODE_NAME=${HOST_IP}:${DOCKER_ES_PORT} \
  -e SFS_ELASTICSEARCH_DISCOVERY_ZEN_PING_UNICAST_HOSTS=${HOST_IP}:${DOCKER_ES_PORT} \
  -e SFS_ELASTICSEARCH_DISCOVERY_ZEN_PING_MULTICAST_ENABLED=false \
  -e SFS_ELASTICSEARCH_DISCOVERY_ZEN_PING_UNICAST_ENABLED=true \
  --name sfs_example_middlware \
  pitchpointsolutions/simple-file-server
export DOCKER_SFS_PORT=`docker port sfs_example_middlware 80/tcp | sed -E 's/(.+):(.+)/\2/'`
curl -v -XGET "http://${HOST_IP}:${DOCKER_SFS_PORT}/admin/001/healthcheck"
curl -XPOST -u admin:admin "http://${HOST_IP}:${DOCKER_SFS_PORT}/openstackswift001/my_account"
curl -XPUT -u admin:admin "http://${HOST_IP}:${DOCKER_SFS_PORT}/openstackswift001/my_account/my-container"
curl -XPUT -u admin:admin "http://${HOST_IP}:${DOCKER_SFS_PORT}/openstackswift001/my_account/my-container/my_object" -d 'abc123'
curl -XGET -u admin:admin "http://${HOST_IP}:${DOCKER_SFS_PORT}/openstackswift001/my_account/my-container/my_object"

Master Keys

  • On first use, a master key is generated if one does not exist
  • A new master key is generated every 30 days
  • Master keys are re-encrypted every 30 days
  • The primary copy of the master key is encrypted using Amazon's KMS
  • The backup copy of the master key is encrypted using Azure's KMS
  • If the primary copy of the master key can't be decrypted the backup copy will be used to repair the primary copy
  • If the backup copy of the master key can't be decrypted the primary copy will be used to repair the backup copy
  • When a master key is needed it is decrypted using the kms and stored encrypted locally in memory so that every use of the master key doesn't require a call to the kms. This means after a restart each master key that is used will only call the kms to decrypt the key once.
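
The primary/backup repair rules above can be sketched as follows. This is a minimal illustration under stated assumptions, not the sfs code: `load_master_key`, the `record` dictionary and `make_fake_kms` are hypothetical names, the "KMS" here is a fake that merely prefixes the data, and the in-memory caching described in the last bullet is left out.

```python
def make_fake_kms(tag):
    """Return (encrypt, decrypt) stand-ins for a KMS. NOT real encryption."""
    prefix = tag.encode() + b":"

    def encrypt(plaintext):
        return prefix + plaintext

    def decrypt(ciphertext):
        # A real KMS call would fail on a damaged ciphertext; here we return None.
        if ciphertext.startswith(prefix):
            return ciphertext[len(prefix):]
        return None

    return encrypt, decrypt


def load_master_key(record, decrypt_primary, decrypt_backup,
                    encrypt_primary, encrypt_backup):
    """Return the plaintext master key, repairing whichever copy is unreadable.

    `record` holds the two stored ciphertexts under "primary" and "backup".
    """
    key = decrypt_primary(record["primary"])
    if key is not None:
        # Primary is readable: rebuild the backup if it cannot be decrypted.
        if decrypt_backup(record["backup"]) != key:
            record["backup"] = encrypt_backup(key)
        return key
    key = decrypt_backup(record["backup"])
    if key is None:
        raise RuntimeError("neither copy of the master key could be decrypted")
    # Backup is readable but primary is not: rebuild the primary copy.
    record["primary"] = encrypt_primary(key)
    return key


enc_primary, dec_primary = make_fake_kms("primary")
enc_backup, dec_backup = make_fake_kms("backup")
record = {"primary": b"damaged", "backup": enc_backup(b"master-key-bytes")}
key = load_master_key(record, dec_primary, dec_backup, enc_primary, enc_backup)
```

After the call, `record["primary"]` has been rewritten from the backup copy, so the next load succeeds from the primary again.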

Container Keys

  • When a container key is needed a new one is generated if none exists.
  • If the container key has been active for more than 30 days, a new one is generated.
  • The container key is encrypted using AES256-GCM w/ nonce using the latest master key and stored encrypted as part of the container metadata.
  • When object data needs to be encrypted the latest container key is fetched and decrypted using the appropriate master key.
  • When object data needs to be decrypted the appropriate container key is fetched and decrypted using the appropriate master key.
  • Container keys are re-encrypted every 30 days
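
The 30-day rotation rule above can be sketched like this. It is a simplified illustration, not the sfs implementation: `current_container_key` and the list-of-dicts layout are hypothetical, keeping older keys in the list is an assumption made so that previously written objects stay decryptable, and the wrapping of each key with the master key is omitted.

```python
import os
from datetime import datetime, timedelta, timezone

ROTATE_AFTER = timedelta(days=30)  # rotation interval stated above


def current_container_key(keys, now=None):
    """Return the newest key entry, generating one if missing or too old.

    `keys` is a list of {"created": datetime, "key": bytes}, oldest first.
    """
    now = now or datetime.now(timezone.utc)
    if not keys or now - keys[-1]["created"] > ROTATE_AFTER:
        # 32 random bytes give a 256-bit key. In sfs the key would be stored
        # encrypted under the latest master key; that step is not shown here.
        keys.append({"created": now, "key": os.urandom(32)})
    return keys[-1]


keys = []
start = datetime(2017, 1, 1, tzinfo=timezone.utc)
first = current_container_key(keys, start)
same = current_container_key(keys, start + timedelta(days=10))
rotated = current_container_key(keys, start + timedelta(days=31))
```

Here `same` is the entry created on day 0, while `rotated` is a fresh entry appended after the 30-day window has passed.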

Object Encryption

  • Each object is encrypted using AES256-GCM w/ nonce

Metadata Replication

  • Metadata replication uses the elasticsearch settings

Object Data Replication

  • Object streams are replicated in real time. This means that if you upload a 5GB file you don't have to wait for the data to be copied from the primary volume to the replicas after the write to the primary has finished.
  • This is implemented by cloning the data stream to each volume
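
The idea of cloning the data stream to each volume can be sketched as below. This is a synchronous toy version for illustration only: sfs itself is event driven and built on Vert.x, and the function name `replicate_stream` and the use of in-memory `BytesIO` objects as "volumes" are assumptions.

```python
import io


def replicate_stream(source, volumes, chunk_size=64 * 1024):
    """Copy `source` to every volume chunk by chunk; return bytes written."""
    total = 0
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        # Each chunk is written to the primary and to every replica before the
        # next chunk is read, so no volume waits for a full copy afterwards.
        for volume in volumes:
            volume.write(chunk)
        total += len(chunk)
    return total


payload = b"abc123" * 1000
primary, replica_1, replica_2 = io.BytesIO(), io.BytesIO(), io.BytesIO()
written = replicate_stream(io.BytesIO(payload), [primary, replica_1, replica_2],
                           chunk_size=1024)
```

When the upload finishes, all three buffers already hold the full payload, which mirrors the point that no separate copy step from the primary is needed.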
