Data Storage Model
Improving How SRA Data Is Distributed
As SRA continues to mature in the cloud and grow in size, we recognize the need to streamline our data storage model while keeping such a prolific archive sustainable and easy to use. We are kicking off this effort, which will take 4-7 months to fully realize. As market and technical conditions change, we will continue to monitor and refine our model, keeping data movements as transparent as we can.
Benefits
- Improving user experience while keeping any user-facing costs at a minimum
- Reducing costs by removing duplicate data across repositories
- Advancing SRA's accuracy in tracking data status and location
- Improving auditing, technical, and curation processes
- Improving data management by simplifying data storage
- Maintaining our full archive plus a disaster recovery copy while minimizing the taxpayer burden over the next five years and into the future
SRA Availability
As we transition to the revised data storage model (see Diagram), the entire corpus of SRA will remain available to users in all data formats:
- Submitted raw files
- SRA Normalized
- SRA Lite
Submitted Raw Files
Raw submitted files can be easily retrieved from cold storage using our CDDS service.
SRA Normalized Files
SRA Normalized files will remain accessible through AWS ODP.
SRA Lite
Broadening distribution of the SRA Lite format benefits most users the vast majority of the time.
SRA Lite supports reliable, faster data transfer, downloads, and analysis using current tools.
- A complete copy of SRA, with some mix of SRA Normalized and SRA Lite formats, will continue to be available in Google Cloud as we transition GCP entirely from SRA Normalized to SRA Lite.
- Our NCBI servers will also hold a mix of SRA Normalized and SRA Lite as we convert NCBI-stored files to SRA Lite format, which we plan to complete around spring 2023. At that time, we expect NCBI storage to have transitioned completely to SRA Lite.
SRA Toolkit
If you are using the SRA Toolkit, you may continue to set your location and file format preferences and let the toolkit optimize for speed and cost on your behalf.
Questions?
If you have any questions, please contact us at [email protected] for assistance.
Read our blog post on Improving How SRA Data is Distributed