A high level view of the architecture you will need to run workflows is shown is below.
This section of the guide details the common components required for job execution and data storage. This includes the following:
- A place to store your input data and generated results
- Access controls to your data and compute resources
- Code and artifacts used to provision compute resources
- Containerized task scheduling and execution
The above is referred to here as the "Genomics Workflows Core". To launch this core in your AWS account, use the Cloudformation template below.
|Genomics Workflow Core||Create EC2 Launch Templates, AWS Batch Job Queues and Compute Environments, a secure Amazon S3 bucket, and IAM policies and roles within an existing VPC. NOTE: You must provide VPC ID, and subnet IDs.||cloud_download||play_arrow|
The core is agnostic of the workflow orchestrator you intended to use, and can be installed multiple times in your account if needed (e.g. for use by different projects). Each installation uses a
Namespace value to group resources accordingly. By default, the
Namespace is set to the stack name, which must be unique within an AWS region.
To create all of the resources described, the Cloudformation template above uses Nested Stacks. This is a way to modularize complex stacks and enable reuse. The individual nested stack templates are intended to be run from a parent or "root" template. On the following pages, the individual nested stack templates are available for viewing only.