Getting started

Welcome to NCBI Datasets

NCBI Datasets is a new resource that lets you easily gather data from across NCBI databases. You can get the data through three interfaces: the web interface, the API, and the command-line tool.

[Figure: schematic showing the available NCBI Datasets interfaces: web, API, and command-line tool]

Data delivery

How is the data delivered?

NCBI Datasets delivers data and metadata as a cohesive data package contained in a zip archive. Once the archive is unzipped, the files are in the folder ncbi_dataset/data.
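
As a rough illustration of working with a downloaded package, the sketch below uses only Python's standard zipfile module to list and extract the files stored under ncbi_dataset/data. The function names and the archive name in the usage comment (ncbi_dataset.zip) are hypothetical choices for this example, not part of NCBI Datasets itself.

```python
import zipfile
from pathlib import Path

# Folder inside the archive where the package files live (from the text above).
DATA_PREFIX = "ncbi_dataset/data/"


def list_package_files(zip_path):
    """Return the sorted archive paths of all files under ncbi_dataset/data/."""
    with zipfile.ZipFile(zip_path) as archive:
        return sorted(
            name
            for name in archive.namelist()
            # Skip directory entries and anything outside the data folder.
            if name.startswith(DATA_PREFIX) and not name.endswith("/")
        )


def extract_package(zip_path, dest_dir):
    """Unzip the whole package into dest_dir and return the data folder path."""
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(dest_dir)
    return Path(dest_dir) / "ncbi_dataset" / "data"


# Example usage (the archive name is an assumption; use your own download):
#   for name in list_package_files("ncbi_dataset.zip"):
#       print(name)
#   data_dir = extract_package("ncbi_dataset.zip", "my_package")
```

Listing before extracting is a convenient way to check which files a package contains without writing anything to disk.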

What do we mean by “cohesive”?

Every data package can include multiple files associated with the requested accession. For example, users downloading the human reference genome assembly can, in the same request, also select from transcript, protein, GFF, GTF, GBFF, and metadata files. For more information about data packages and their contents, please see our data packages page.

Where can I learn more about NCBI Datasets?

You can read more about how to use NCBI Datasets in our How-to guides, which give instructions for downloading data and metadata for genomes, genes, ortholog sets, and viruses. We also have an extensive documentation page for our API and detailed information about our command-line tools in our command-line tool reference.

Generated November 25, 2024