AWS has announced the general availability of Amazon Omics, a service for storing, processing, and analyzing genomic, transcriptomic, and other omics data. The service is designed to help healthcare and life sciences organizations improve patient care and advance research, and it supports large-scale analytics and collaborative research.
Amazon Omics makes it easier to store, search, and analyze omics data and to draw conclusions from it. It simplifies and speeds up the storage and analysis of multiomic information for research and clinical applications, so that teams can focus first on extracting insights from the data.
Amazon Omics storage supports bioinformatics file formats such as FASTQ, BAM, and CRAM and lets you store, discover, and share this data efficiently and at low cost. Files in these formats are stored as read-set objects within a sequence store. You can also store reference genomes in the FASTA format. Data is imported as immutable objects with unique identifiers to support workloads that require strict data provenance. Access to individual data objects, including references and read-set objects, can be controlled with tags and attribute-based access control through AWS Identity and Access Management (IAM).
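The tag-based access control described above can be sketched as an IAM policy document. This is a minimal illustration rather than an official policy: the tag key `project`, the helper name `build_abac_policy`, and the specific `omics:` action names are assumptions that should be checked against the service's IAM reference before use.

```python
import json


def build_abac_policy(tag_key: str = "project") -> dict:
    """Build an IAM policy that allows reads only when the resource's tag
    matches the same tag on the calling principal (attribute-based access).

    The action names below are assumptions for illustration.
    """
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Sid": "ReadOmicsObjectsWithMatchingTag",
                "Effect": "Allow",
                "Action": [
                    "omics:GetReadSet",
                    "omics:GetReadSetMetadata",
                    "omics:GetReference",
                    "omics:GetReferenceMetadata",
                ],
                "Resource": "*",
                "Condition": {
                    "StringEquals": {
                        # Resource tag value must equal the caller's tag value.
                        f"aws:ResourceTag/{tag_key}": f"${{aws:PrincipalTag/{tag_key}}}"
                    }
                },
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_abac_policy(), indent=2))
```

With a policy of this shape, a researcher whose IAM principal carries `project=cohort-a` could read only references and read sets tagged with the same value, without a separate policy per data object.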
Amazon Omics helps you run bioinformatics workflows at scale. You specify your workflow definition, the tools you want to use, and the data to analyze, and Amazon Omics provisions the underlying infrastructure and runs the workflow. It supports workflow definitions that comply with the WDL 1.1 and Nextflow 22.10.0 DSL2 specifications. Workflows use OCI-compliant containerized tooling stored in private registries in Amazon Elastic Container Registry (ECR).
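As a rough sketch of the workflow steps above, the snippet below checks that a tool image lives in a private ECR registry and assembles the request for a workflow run. The helper names are hypothetical, and the request fields follow my understanding of the `start_run` operation on the boto3 `omics` client; treat them as assumptions and confirm them against the SDK documentation.

```python
import re
import uuid

# Private ECR image URIs in the standard partition look like:
#   <12-digit-account>.dkr.ecr.<region>.amazonaws.com/<repo>[:tag][@sha256:digest]
ECR_PRIVATE_URI = re.compile(
    r"^\d{12}\.dkr\.ecr\.[a-z0-9-]+\.amazonaws\.com/"
    r"[a-z0-9._/-]+(?::[\w.-]+)?(?:@sha256:[a-f0-9]{64})?$"
)


def is_private_ecr_image(uri: str) -> bool:
    """Return True if the image URI points at a private ECR registry."""
    return ECR_PRIVATE_URI.match(uri) is not None


def build_run_request(workflow_id: str, role_arn: str,
                      output_uri: str, parameters: dict) -> dict:
    """Assemble keyword arguments for starting a workflow run (assumed field names)."""
    return {
        "workflowId": workflow_id,
        "roleArn": role_arn,
        "outputUri": output_uri,
        "parameters": parameters,
        "requestId": str(uuid.uuid4()),  # idempotency token
    }


def submit_run(request: dict) -> str:
    """Submit the run; needs boto3 and AWS credentials, so it is not called here."""
    import boto3  # imported lazily so the helpers above work without the SDK

    client = boto3.client("omics")
    return client.start_run(**request)["id"]
```

Keeping request construction separate from submission makes it possible to review or log exactly what will be sent before any infrastructure is provisioned.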
Analysis at scale
With Amazon Omics, you can quickly ingest genomics data in formats such as (g)VCF, GFF3, and TSV/CSV and transform it into Apache Parquet. The data then becomes accessible through analytics services such as Amazon Athena. You can transform both variant data (data from an individual sample) and annotation data (known information about positions in the genome). You can control access to analytics stores with AWS Lake Formation, which makes it easier to query across diverse data sources while enforcing fine-grained access controls.
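Once variant data is queryable through Amazon Athena, a region lookup might look like the sketch below. The table and column names (`sample_id`, `contig`, `pos`, `ref`, `alt`) and the helper names are hypothetical; the actual schema of an analytics store should be inspected first. Only the Athena `start_query_execution` call is a real SDK operation.

```python
import re

_IDENT = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")
_CONTIG = re.compile(r"^[A-Za-z0-9_.]+$")


def build_variant_query(table: str, contig: str, start: int, end: int,
                        limit: int = 100) -> str:
    """Build a SQL query for variants in a genomic interval.

    Inputs are validated because they are interpolated into the SQL text.
    Column names are placeholders, not the service's real schema.
    """
    if not _IDENT.match(table):
        raise ValueError(f"invalid table name: {table!r}")
    if not _CONTIG.match(contig):
        raise ValueError(f"invalid contig name: {contig!r}")
    if not 0 <= int(start) <= int(end):
        raise ValueError("expected 0 <= start <= end")
    return (
        "SELECT sample_id, contig, pos, ref, alt\n"
        f'FROM "{table}"\n'
        f"WHERE contig = '{contig}' AND pos BETWEEN {int(start)} AND {int(end)}\n"
        f"LIMIT {int(limit)}"
    )


def run_query(sql: str, database: str, output_location: str) -> str:
    """Start the query in Athena; needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the query builder works without the SDK

    athena = boto3.client("athena")
    response = athena.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]
```

Because access is governed through AWS Lake Formation, the same query may return different rows or columns depending on the permissions granted to the caller.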
Data collaboration and provenance
Amazon Omics makes it easier for researchers to tag collaborators, set up their permissions, and share data securely with them. This makes it simpler to make your omics data findable, accessible, interoperable, and reusable (FAIR). With domain-specific metadata, you can link Amazon Omics data stores with other omics and healthcare data to facilitate multiomic and multimodal analysis.
Security, privacy, and compliance
Amazon Omics is HIPAA eligible. You can apply attribute-based controls to define fine-grained data access and governance. Comprehensive logging and provenance capture are built in, so you know what data was accessed, who accessed it, and when.
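To illustrate the "what, who, and when" view of access logs, here is a small sketch that groups log records by user. It assumes the records are shaped like the events returned by AWS CloudTrail's `lookup_events` call (`Username`, `EventName`, `EventTime`, `Resources`); the announcement does not say where these logs surface, so that shape and the helper name `summarize_access` are assumptions.

```python
from collections import defaultdict
from datetime import datetime


def summarize_access(events: list) -> dict:
    """Group access events by user.

    Returns {who: [(when, what_action, which_resource), ...]} with each
    user's entries sorted by time. Records without a Username are grouped
    under "unknown"; records without resources get an empty resource name.
    """
    by_user = defaultdict(list)
    for event in events:
        who = event.get("Username", "unknown")
        resources = [r.get("ResourceName", "") for r in event.get("Resources", [])]
        for resource in resources or [""]:
            by_user[who].append((event["EventTime"], event["EventName"], resource))
    for entries in by_user.values():
        entries.sort(key=lambda entry: entry[0])
    return dict(by_user)
```

A summary like this can feed a periodic review of who touched which read sets or references, alongside the attribute-based controls that decide who is allowed to.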