AWS announced a distributed map for Step Functions, a solution for large-scale parallel data processing. Optimized for S3, the new feature of the AWS orchestration service targets interactive and highly parallel serverless data processing workflows.

The Map state in Step Functions runs the same processing steps for each item in a dataset. The existing Map state is limited to 40 parallel iterations at a time, which makes it difficult to scale data processing workloads to thousands of items or more. Before this release, achieving higher parallelism required complex workarounds built on the existing Map state.

The new distributed map state lets you write Step Functions workflows that coordinate large-scale parallel workloads within your serverless applications. You can now iterate over millions of objects, such as logs, images, or .csv files stored in Amazon Simple Storage Service (Amazon S3). The distributed map state can run up to ten thousand parallel workflows to process that data.
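To make the idea concrete, here is a minimal Python sketch that builds a state machine definition with a Map state in distributed mode reading object listings from S3. It is an illustration under stated assumptions rather than a reference implementation: the bucket name, prefix, Lambda ARN, state names, and the helper `build_distributed_map_state` are all hypothetical placeholders.

```python
import json


def build_distributed_map_state(bucket, prefix, lambda_arn, max_concurrency=1000):
    """Return a Map state (as a dict) that runs in distributed mode over S3 objects.

    All argument values are supplied by the caller; nothing here talks to AWS.
    """
    return {
        "Type": "Map",
        # Read the items to process from an S3 object listing.
        "ItemReader": {
            "Resource": "arn:aws:states:::s3:listObjectsV2",
            "Parameters": {"Bucket": bucket, "Prefix": prefix},
        },
        # Each item is handled by a child workflow execution.
        "ItemProcessor": {
            "ProcessorConfig": {"Mode": "DISTRIBUTED", "ExecutionType": "STANDARD"},
            "StartAt": "ProcessObject",
            "States": {
                "ProcessObject": {
                    "Type": "Task",
                    "Resource": "arn:aws:states:::lambda:invoke",
                    "Parameters": {"FunctionName": lambda_arn, "Payload.$": "$"},
                    "End": True,
                }
            },
        },
        # Upper bound on child workflows running at the same time.
        "MaxConcurrency": max_concurrency,
        "End": True,
    }


# Hypothetical names, used only for illustration.
definition = {
    "StartAt": "ProcessAllObjects",
    "States": {
        "ProcessAllObjects": build_distributed_map_state(
            bucket="example-log-bucket",
            prefix="logs/",
            lambda_arn="arn:aws:lambda:us-east-1:123456789012:function:example-processor",
        )
    },
}

print(json.dumps(definition, indent=2))
```

The resulting JSON could be used as the definition of a state machine; the sketch only assembles the document and does not deploy anything.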

The Step Functions distributed map supports up to 10,000 parallel executions, well above the concurrency that many other AWS services support. Use the distributed map's maximum concurrency setting to make sure you do not exceed the concurrency of a downstream service. Two factors matter when working with other services: first, the maximum concurrency the service supports for your account; second, the burst and ramp rates, which determine how quickly you can reach that maximum concurrency.
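The two factors above can be turned into a rough back-of-the-envelope calculation. The Python sketch below is only an illustration: the helper names, the headroom percentage, and the burst and ramp figures are hypothetical and should be replaced with the actual quotas of the downstream service in your account.

```python
# Upper bound on parallel executions for a distributed map, as stated in the article.
DISTRIBUTED_MAP_LIMIT = 10_000


def safe_max_concurrency(downstream_limit, headroom_percent=20):
    """Pick a maximum concurrency value that stays below a downstream limit.

    headroom_percent reserves part of the downstream capacity for other callers.
    """
    usable = downstream_limit * (100 - headroom_percent) // 100
    return max(1, min(DISTRIBUTED_MAP_LIMIT, usable))


def minutes_to_reach(target, initial_burst, ramp_per_minute):
    """Estimate whole minutes needed to ramp from the initial burst to the target."""
    if target <= initial_burst:
        return 0
    remaining = target - initial_burst
    # Integer ceiling division.
    return (remaining + ramp_per_minute - 1) // ramp_per_minute


# Hypothetical downstream service: 1,000 concurrent requests allowed,
# an initial burst of 500, then 100 more per minute.
cap = safe_max_concurrency(downstream_limit=1_000)
print("max concurrency to configure:", cap)
print("minutes to reach it:", minutes_to_reach(cap, initial_burst=500, ramp_per_minute=100))
```

Keeping some headroom is a design choice, not a requirement: it leaves capacity for other workloads that call the same downstream service.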

The new feature is generally available in a subset of AWS Regions, including US East (Ohio and N. Virginia), Asia Pacific (Singapore), and Europe (Frankfurt and Ireland).

 

Editor @ DevStyleR