AWS has announced Mountpoint for Amazon S3, an open-source file client that provides high-performance access to Amazon S3, InfoQ reports. Currently in alpha, the local mount point delivers high transfer rates from a single instance and is designed primarily for data lake applications.
Mountpoint for Amazon S3 translates local file system API calls to S3 object API calls such as GET and LIST. It supports random and sequential reads of files, as well as listing of files and directories. The alpha version does not support writes (PUT), and future versions are expected to support only sequential writes to new objects.
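The translation described above can be illustrated with a minimal Python sketch. It is not Mountpoint's actual implementation; the function names and the bucket and key values are hypothetical, and the sketch only builds the parameters that a ranged GET and a delimiter-based LIST request would carry, without contacting S3.

```python
from typing import Dict


def read_to_get_params(bucket: str, key: str, offset: int, length: int) -> Dict[str, str]:
    """Map a file read of `length` bytes at `offset` to GET-style parameters.

    Hypothetical helper for illustration; not part of Mountpoint.
    """
    if offset < 0 or length <= 0:
        raise ValueError("offset must be >= 0 and length must be > 0")
    last_byte = offset + length - 1  # HTTP byte ranges are inclusive
    return {"Bucket": bucket, "Key": key, "Range": f"bytes={offset}-{last_byte}"}


def listdir_to_list_params(bucket: str, directory: str) -> Dict[str, str]:
    """Map a directory listing to LIST-style parameters.

    The "/" delimiter groups keys below the prefix so they look like
    subdirectories. Hypothetical helper for illustration.
    """
    prefix = directory.strip("/")
    if prefix:
        prefix += "/"
    return {"Bucket": bucket, "Prefix": prefix, "Delimiter": "/"}


if __name__ == "__main__":
    # A random read in the middle of an object becomes a ranged GET.
    print(read_to_get_params("example-bucket", "logs/2023/events.parquet", 1024, 4096))
    # Listing a directory becomes a LIST with a prefix and a delimiter.
    print(listdir_to_list_params("example-bucket", "logs/2023"))
```

The parameter names (`Bucket`, `Key`, `Range`, `Prefix`, `Delimiter`) mirror those used by the S3 GetObject and ListObjectsV2 operations, so the dictionaries could in principle be passed to an S3 SDK client.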
“Mountpoint is designed for large-scale analytics applications that read and generate large amounts of S3 data in parallel but don’t require the ability to write to the middle of existing objects. Mountpoint allows you to map S3 buckets or prefixes into your instance’s file system namespace, traverse the contents of your buckets as if they were local files, and achieve high throughput access to objects,”
said James Bornholt, AWS scientist and associate professor at the University of Texas, Devabrat Kumar, senior product manager at AWS, and Andy Warfield, AWS engineer.
The open-source client does not emulate operations such as directory renames, which would require many S3 API calls, nor POSIX file system features that the S3 APIs do not support.
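To see why a directory rename is expensive on S3, consider the following Python sketch. S3 has no rename operation, so a client that emulated one would have to copy and then delete every object under the prefix. The function name and the keys are hypothetical, and the sketch only plans the calls rather than issuing them.

```python
from typing import Iterable, List, Tuple


def plan_directory_rename(
    keys: Iterable[str], old_prefix: str, new_prefix: str
) -> List[Tuple[str, str, str]]:
    """Return the (operation, source key, destination key) steps an
    emulated rename of `old_prefix` to `new_prefix` would require.

    Hypothetical helper for illustration; not part of Mountpoint.
    """
    operations = []
    for key in keys:
        if not key.startswith(old_prefix):
            continue  # object lies outside the renamed "directory"
        new_key = new_prefix + key[len(old_prefix):]
        operations.append(("CopyObject", key, new_key))
        operations.append(("DeleteObject", key, ""))
    return operations


if __name__ == "__main__":
    keys = ["data/a.csv", "data/sub/b.csv", "other/c.csv"]
    for step in plan_directory_rename(keys, "data/", "archive/"):
        print(step)
```

Each object under the prefix costs at least two requests, so the number of calls grows linearly with the size of the directory, and the sequence as a whole is not atomic.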
Mountpoint for S3 is not the first client to present S3 as a file system: Goofys and s3fs are popular open-source options for mounting a bucket via FUSE. Some developers on Reddit question the need for a new client and worry that it will be used outside the data lake space.
Mountpoint is published under the Apache 2.0 license and is not yet ready for production workloads. The initial alpha version and public roadmap are available on GitHub.