Cloud Data Delivery Service
Introduction
The Sequence Read Archive (SRA) can deliver many file types through the SRA Toolkit, but it cannot deliver every original file submitted to SRA. Because the SRA Toolkit does not provide access to these files, SRA has created a cloud data delivery service that delivers source files and other file types from NCBI cold storage buckets to individual data consumers' buckets in AWS and GCP. The service is available for both public and authorized-access (dbGaP) data.
To use this service, you must be logged in to your MyNCBI account and have a bucket in either AWS or GCP. You can create a bucket in a few minutes while making your data delivery request, or simply enter the name of an existing bucket on the page.
For AWS, ensure that your bucket is in the US East (N. Virginia) region; for GCP, ensure that it is in the us-east1 (South Carolina) region.
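The region requirement can be checked before you register a bucket. The following is a minimal sketch, not part of the NCBI service: the function name `bucket_region_ok` and the provider keys "aws" and "gcp" are illustrative assumptions. It relies on "us-east-1" being the AWS identifier for US East (N. Virginia) and on the GCP identifier "us-east1" named above.

```python
# Region each provider's bucket must be in for the delivery service.
# "us-east-1" is the AWS identifier for US East (N. Virginia);
# "us-east1" is the GCP identifier for South Carolina.
REQUIRED_REGION = {"aws": "us-east-1", "gcp": "us-east1"}


def bucket_region_ok(provider: str, region: str) -> bool:
    """Return True if `region` is the required region for `provider`.

    `provider` is a hypothetical key ("aws" or "gcp"), not an NCBI term.
    """
    required = REQUIRED_REGION.get(provider.strip().lower())
    if required is None:
        raise ValueError(f"unsupported provider: {provider!r}")
    return region.strip().lower() == required
```

For example, `bucket_region_ok("aws", "us-east-1")` returns True, while `bucket_region_ok("gcp", "us-east-1")` returns False because the GCP identifier has no hyphen before the digit.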
Review these resources to learn how to create an AWS bucket or a GCP bucket.
The cloud delivery service will help you create and attach the relevant permissions to your bucket.
Run Selector
To request delivery of original files through the cloud data delivery service, begin in SRA Run Selector. There are two ways to provide SRA accessions to Run Selector:
- Copy and paste a comma-separated list of accessions into Run Selector
- Send search results from Entrez to Run Selector using the 'send to' link that appears at the top right of the results page once you refine the search to fewer than 20,000 results
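If your accessions come from a spreadsheet or a script, a small helper can build the comma-separated list for pasting. This is a sketch under stated assumptions: the function name `to_accession_list` is hypothetical, the accessions in the usage note are placeholders, and upper-casing and de-duplicating the input is a convenience choice rather than a documented Run Selector requirement.

```python
def to_accession_list(accessions):
    """Join accessions into one comma-separated string.

    Blank entries and repeated accessions are dropped, and the
    original order is preserved. Upper-casing is an assumption
    made for consistency, not a Run Selector rule.
    """
    seen = set()
    cleaned = []
    for acc in accessions:
        acc = acc.strip().upper()
        if acc and acc not in seen:
            seen.add(acc)
            cleaned.append(acc)
    return ",".join(cleaned)
```

Calling `to_accession_list([" srr000001", "SRR000002", "", "SRR000001"])` produces `SRR000001,SRR000002`, which can be pasted into Run Selector as a single line.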
Once Run Selector has loaded the results, select accessions of interest with the checkboxes and press the Deliver Data button.
In the data delivery application, there are four easy steps to submit your request:
- Review selected run accessions
- Register a new bucket for your account or select a previously registered bucket on the data delivery page
- Select the source file type(s) you wish to receive
- Review the details of your request: number of files, file size, destination
Your destination bucket may include subfolders, but the bucket name cannot end with a forward slash ("/") character. It may take up to 48 hours to deliver data to your bucket.
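The destination rule can be checked before you submit a request. The sketch below assumes the destination is written as a bucket name followed by optional subfolders separated by "/"; the function name `split_destination` and the example bucket name are hypothetical and not part of the NCBI service.

```python
def split_destination(destination: str):
    """Split 'bucket/sub/folder' into (bucket, prefix).

    Raises ValueError for an empty destination, a missing bucket
    name, or a destination that ends with a forward slash.
    """
    destination = destination.strip()
    if not destination:
        raise ValueError("destination is empty")
    if destination.endswith("/"):
        raise ValueError("destination must not end with '/'")
    # Everything before the first "/" is the bucket; the rest is the subfolder path.
    bucket, _, prefix = destination.partition("/")
    if not bucket:
        raise ValueError("destination must start with a bucket name")
    return bucket, prefix
```

With this sketch, `split_destination("my-bucket/sra/run1")` returns `("my-bucket", "sra/run1")`, and `split_destination("my-bucket/")` raises a `ValueError`.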
NCBI limits how much data can be delivered from cold storage and through general delivery within a 30-day period. There are two limits:
- 5TB limit on data delivery to your cloud bucket from cold storage
- 20TB limit for general delivery to your cloud bucket
Thirty days after you submit a data delivery request, it no longer counts against your limit. Public and dbGaP delivery requests have separate limits: each user receives 5TB/20TB for public files and 5TB/20TB for dbGaP files per 30-day period.
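The 30-day accounting can be illustrated with a short calculation. This is a sketch only: it assumes each request is recorded as a (request time, size in TB) pair, that a request stops counting exactly 30 days after it was made, and that the cold storage and general limits are tracked independently. NCBI's actual bookkeeping is not described here, and the names `remaining_tb`, `COLD_STORAGE_LIMIT_TB`, and `GENERAL_LIMIT_TB` are hypothetical.

```python
from datetime import datetime, timedelta

WINDOW = timedelta(days=30)      # requests older than this no longer count
COLD_STORAGE_LIMIT_TB = 5.0      # limit on delivery from cold storage
GENERAL_LIMIT_TB = 20.0          # limit on general delivery


def remaining_tb(requests, now, limit_tb):
    """Return the TB still available under `limit_tb` at time `now`.

    `requests` is an iterable of (datetime, size_tb) pairs. Only
    requests made less than 30 days before `now` count as used.
    """
    used = sum(size for when, size in requests if now - when < WINDOW)
    return max(limit_tb - used, 0.0)
```

For example, with requests of 2.0 TB and 1.5 TB made within the last 30 days and a 4.0 TB request made earlier than that, `remaining_tb(requests, now, COLD_STORAGE_LIMIT_TB)` leaves 1.5 TB, because the older request has dropped out of the window. Public and dbGaP requests would be tallied in separate lists, since each has its own limits.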
Contact [email protected] with any questions about the cloud data delivery service or the NCBI limits.
Engage
NCBI wants your feedback on SRA in the Cloud. Contact [email protected] with questions or if you would like to provide input on new functionality.