fasts3 is a fast S3 utility that provides faster versions of s3cmd's ls, get, and cp functions. It is designed for buckets containing millions of keys.
Installation
go get -u github.com/metaverse/fasts3
This should install the binary under $GOPATH/bin/
Configuration
Run the aws configure command from the AWS CLI tool (https://aws.amazon.com/cli/), which creates the necessary configuration in ~/.aws/credentials.
Add your region to your credentials file to support tab-completion of buckets:
[default]
aws_access_key_id=xxxx
aws_secret_access_key=xxxx
region=us-east-1
Alternatively, you can set these environment variables, which take precedence over the credentials file:
export AWS_ACCESS_KEY_ID=<access_key>
export AWS_SECRET_ACCESS_KEY=<secret_key>
export AWS_REGION=us-east-1
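The precedence rule above (environment first, credentials file second) can be sketched in Go. This is only an illustration of the lookup order, not the code fasts3 or the AWS SDK actually runs: resolveSetting is a hypothetical helper, and the literal "us-east-1" stands in for a value read from ~/.aws/credentials.

```go
package main

import (
	"fmt"
	"os"
)

// resolveSetting returns envValue when it is non-empty and falls back to
// fileValue otherwise. Hypothetical helper for illustration only.
func resolveSetting(envValue, fileValue string) string {
	if envValue != "" {
		return envValue
	}
	return fileValue
}

func main() {
	// "us-east-1" stands in for the region read from the credentials file.
	region := resolveSetting(os.Getenv("AWS_REGION"), "us-east-1")
	fmt.Println("region:", region)
}
```

With AWS_REGION unset the sketch falls back to the file value; exporting AWS_REGION changes the result without touching the file.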
Usage
For a list of commands and the options each one accepts, run:
fasts3 --help
fasts3 <cmd> --help
Using search depth to go faster
You often know the structure of your S3 bucket ahead of time, and this can be used to optimize listings. Say you have a structure like this:
fasts3 ls s3://mybuck/logs/
DIR s3://mybuck/logs/2011/
DIR s3://mybuck/logs/2012/
DIR s3://mybuck/logs/2013/
DIR s3://mybuck/logs/2014/
DIR s3://mybuck/logs/2015/
Running fasts3 ls -r s3://mybuck/logs/ reads all keys under logs sequentially. Adding the --search-depth 1 flag makes this faster: each of the underlying directories gets its own thread, which increases throughput.
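The idea behind the search depth can be sketched in Go: find the prefixes that sit a given number of levels below the starting prefix, then list each one in its own goroutine and merge the results. This is a minimal sketch over an in-memory key list, not fasts3's implementation. prefixesAtDepth and listParallel are hypothetical names, and the inner loop stands in for a paginated S3 list call.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
	"sync"
)

// prefixesAtDepth returns the distinct "directory" prefixes found depth levels
// below base. Keys that sit above that depth are skipped here; a real tool
// would need to list them separately.
func prefixesAtDepth(keys []string, base string, depth int) []string {
	if depth < 1 {
		return []string{base}
	}
	seen := map[string]bool{}
	for _, k := range keys {
		if !strings.HasPrefix(k, base) {
			continue
		}
		parts := strings.Split(strings.TrimPrefix(k, base), "/")
		if len(parts) <= depth {
			continue // key is not nested deeply enough
		}
		seen[base+strings.Join(parts[:depth], "/")+"/"] = true
	}
	out := make([]string, 0, len(seen))
	for p := range seen {
		out = append(out, p)
	}
	sort.Strings(out)
	return out
}

// listParallel lists every prefix in its own goroutine and merges the results.
func listParallel(keys, prefixes []string) []string {
	var (
		mu  sync.Mutex
		wg  sync.WaitGroup
		out []string
	)
	for _, p := range prefixes {
		wg.Add(1)
		go func(prefix string) {
			defer wg.Done()
			for _, k := range keys { // stand-in for a paginated S3 list call
				if strings.HasPrefix(k, prefix) {
					mu.Lock()
					out = append(out, k)
					mu.Unlock()
				}
			}
		}(p)
	}
	wg.Wait()
	sort.Strings(out)
	return out
}

func main() {
	keys := []string{
		"logs/2011/a.log", "logs/2011/b.log",
		"logs/2012/a.log", "logs/2013/a.log",
	}
	prefixes := prefixesAtDepth(keys, "logs/", 1)
	fmt.Println(prefixes)
	fmt.Println(listParallel(keys, prefixes))
}
```

A deeper search depth yields more, smaller prefixes and therefore more parallel listings, which only helps when the keys are spread reasonably evenly across those prefixes.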
Concurrency
The concurrency level of S3 command execution can be tuned to your usage needs. By default, 4*NumCPU S3 commands are executed concurrently, which our benchmarks showed to be ideal. To override this value, set GOMAXPROCS in your environment. For example, GOMAXPROCS=64 fasts3 ls -r s3://mybuck/logs/ executes 64 S3 commands concurrently.
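A bounded level of concurrency like the one described above is commonly built in Go with a buffered channel used as a semaphore. The sketch below shows that general pattern with the stated 4*NumCPU default. It is an assumption about the technique, not fasts3's code: defaultLimit and runBounded are hypothetical names, and how fasts3 maps GOMAXPROCS onto its limit is not shown.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// defaultLimit mirrors the default stated above: 4*NumCPU.
func defaultLimit() int {
	return 4 * runtime.NumCPU()
}

// runBounded runs every task with at most limit running at once and returns
// the peak number of tasks that were in flight together.
func runBounded(tasks []func(), limit int) int {
	if limit < 1 {
		limit = 1 // a zero-capacity semaphore would block forever
	}
	sem := make(chan struct{}, limit)
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex
		inFlight int
		peak     int
	)
	for _, task := range tasks {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit tasks are already running
		go func(run func()) {
			defer wg.Done()
			defer func() { <-sem }() // free the slot once the task is done
			mu.Lock()
			inFlight++
			if inFlight > peak {
				peak = inFlight
			}
			mu.Unlock()
			run()
			mu.Lock()
			inFlight--
			mu.Unlock()
		}(task)
	}
	wg.Wait()
	return peak
}

func main() {
	tasks := make([]func(), 100)
	for i := range tasks {
		tasks[i] = func() {} // stand-in for one S3 request
	}
	limit := defaultLimit()
	fmt.Printf("limit=%d peak=%d\n", limit, runBounded(tasks, limit))
}
```

Because S3 requests spend most of their time waiting on the network, a limit well above the CPU count is a reasonable starting point, which is consistent with the 4*NumCPU default.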
Examples
# ls
fasts3 ls s3://mybucket/ # lists top level directories and keys
fasts3 ls -r s3://mybucket/ # lists all keys in the bucket
fasts3 ls -r --search-depth 1 s3://mybucket/ # lists all keys in the bucket, giving each directory 1 level down its own thread
fasts3 ls -r s3://mybucket/ | awk '{s += $1}END{print s}' # sums the sizes of all objects in the bucket
# get
fasts3 get s3://mybuck/logs/ # fetches all logs in the prefix
# stream
fasts3 stream s3://mybuck/logs/ # streams all logs under the prefix to stdout
fasts3 stream --key-regex ".*2015-01-01" s3://mybuck/logs/ # streams all logs with 2015-01-01 in the key name to stdout
# cp
fasts3 cp -r s3://mybuck/logs/ s3://otherbuck/ # copies all subdirectories to another bucket
fasts3 cp -r -f s3://mybuck/logs/ s3://otherbuck/all-logs/ # copies all source files into the same destination directory
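To see which keys a pattern such as the --key-regex value above would select, the filtering step can be sketched with Go's regexp package. This is a sketch under assumptions: filterKeys is a hypothetical helper, and whether fasts3 anchors the pattern or matches anywhere in the key is not stated here. Go's MatchString, used below, matches anywhere in the string.

```go
package main

import (
	"fmt"
	"regexp"
)

// filterKeys returns the keys that match pattern. Matching is unanchored,
// which is an assumption about how a key regex would be applied.
func filterKeys(keys []string, pattern string) ([]string, error) {
	re, err := regexp.Compile(pattern)
	if err != nil {
		return nil, err
	}
	var out []string
	for _, k := range keys {
		if re.MatchString(k) {
			out = append(out, k)
		}
	}
	return out, nil
}

func main() {
	keys := []string{
		"logs/2015/app-2015-01-01.gz",
		"logs/2015/app-2015-01-02.gz",
	}
	matched, err := filterKeys(keys, ".*2015-01-01")
	if err != nil {
		fmt.Println("bad pattern:", err)
		return
	}
	fmt.Println(matched)
}
```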
Completion
Bash and ZSH completion are available.
To install for bash:
For zsh:
