RefSeq Announcements for 2016
January 7, 2016: Announcing RefSeq Release 74
This full release incorporates genomic, transcript, and protein data available as of January 11, 2016, and includes 89,458,499 records, 58,496,614 proteins, 13,719,136 RNAs, and sequences from 57,993 organisms. Additional information is available in the Release Notes.
Changes since the previous release:
[1] A list of updated organisms and dbSNP annotation summary is available here:
<ftp://ftp.ncbi.nih.gov/snp/release-notes/RefSeq/refseq74.snp.rpt>
[2] Future change: GI sequence identifiers to be removed from some file formats
As of June 15, 2016, the integer sequence identifiers known as "GIs" will no longer be included in the GenBank, GenPept, and FASTA formats supported by NCBI for the display of sequence records.
Please refer to the FTP release notes for additional details.
March 14, 2016: Announcing RefSeq Release 75
This full release incorporates genomic, transcript, and protein data available as of March 7, 2016, and includes 92,936,289 records, 61,034,675 proteins, 14,035,988 RNAs, and sequences from 58,776 organisms. Additional information is available in the Release Notes.
Changes since the previous release:
[1] A list of updated organisms and dbSNP annotation summary is available here:
<ftp://ftp.ncbi.nih.gov/snp/release-notes/RefSeq/refseq75.snp.rpt>
[2] Decrease in number of plasmid records:
A number of plasmid records were suppressed, either to reduce redundancy or because their assemblies were suppressed for not meeting current RefSeq quality criteria.
[3] Future change: GI sequence identifiers to be removed from some file formats
As of September 2016, the integer sequence identifiers known as "GIs" will no longer be included in the GenBank, GenPept, and FASTA formats supported by NCBI for the display of sequence records. In addition, the FASTA format will no longer include the database source abbreviation. Please refer to the NCBI News Announcement posting for more detail: http://www.ncbi.nlm.nih.gov/news/03-02-2016-phase-out-of-GI-numbers/
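For pipelines that parse FASTA definition lines, the change described in this item means a header may arrive either in the older GI-prefixed form or in the plain accession.version form. The following is a minimal sketch, not an NCBI tool, of a parser that accepts both. The function name parse_defline, the regular expressions, and the GI value 12345 are illustrative assumptions; only the general shape of the two header styles comes from the announcement.

```python
import re

# Older style (assumed shape): >gi|<GI>|<db>|<accession.version>| description
LEGACY_DEFLINE = re.compile(
    r"^>gi\|(?P<gi>\d+)\|(?P<db>[a-z]+)\|(?P<acc>[^|\s]+)\|\s*(?P<desc>.*)$"
)
# Newer style (assumed shape): >accession.version description
PLAIN_DEFLINE = re.compile(r"^>(?P<acc>[^|\s]+)\s*(?P<desc>.*)$")


def parse_defline(line):
    """Return accession, GI, database abbreviation, and description.

    GI and database abbreviation are None when the header omits them.
    """
    line = line.rstrip("\r\n")
    match = LEGACY_DEFLINE.match(line)
    if match:
        return {
            "accession": match.group("acc"),
            "gi": int(match.group("gi")),
            "db": match.group("db"),
            "description": match.group("desc"),
        }
    match = PLAIN_DEFLINE.match(line)
    if match:
        return {
            "accession": match.group("acc"),
            "gi": None,
            "db": None,
            "description": match.group("desc"),
        }
    raise ValueError("not a FASTA definition line: %r" % line)


# The GI below is a placeholder, not the real identifier for this record.
old_style = ">gi|12345|ref|NC_012920.1| Homo sapiens mitochondrion, complete genome"
new_style = ">NC_012920.1 Homo sapiens mitochondrion, complete genome"

print(parse_defline(old_style)["accession"])
print(parse_defline(new_style)["accession"])
```

Keying downstream lookups on the accession.version rather than the GI keeps such code working with both header styles. The sketch does not handle other pipe-delimited header variants, which would need their own patterns.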
May 16, 2016: Announcing RefSeq Release 76
This full release incorporates genomic, transcript, and protein data available as of May 9, 2016, and includes 97,792,976 records, 63,971,766 proteins, 14,965,826 RNAs, and sequences from 59,995 organisms. Additional information is available in the Release Notes.
Changes since the previous release:
[1] A list of updated organisms and dbSNP annotation summary is available here:
<ftp://ftp.ncbi.nih.gov/snp/release-notes/RefSeq/refseq76.snp.rpt>
[2] Re-annotation of human reference genome (GRCh38.p7) currently in progress
The known human RefSeq accessions (NM_, NR_, NG_, NP_) found in this release have been used as an input reagent for RefSeq's human genome annotation release 108, which is in progress at the time of this release.
[3] Future update of human reference mitochondrion NC_012920.1
Based on requests we have received, we are planning to add RefSeq transcript accessions to the human reference mitochondrion (NC_012920.1). This requires some development work; the update should be available in RefSeq release 77 or 78.
[4] Future change: GI sequence identifiers to be removed from some file formats
As of September 2016, the integer sequence identifiers known as "GIs" will no longer be included in the GenBank, GenPept, and FASTA formats supported by NCBI for the display of sequence records. In addition, the FASTA format will no longer include the database source abbreviation. Please refer to the NCBI News Announcement posting for more detail: http://www.ncbi.nlm.nih.gov/news/03-02-2016-phase-out-of-GI-numbers/
July 7, 2016: Announcing RefSeq Release 77
This full release incorporates genomic, transcript, and protein data available as of June 29, 2016, and includes 100,678,438 records, 65,964,245 proteins, 15,563,994 RNAs, and sequences from 60,892 organisms. Additional information is available in the Release Notes.
Changes since the previous release:
[1] A list of updated organisms and dbSNP annotation summary is available here:
<ftp://ftp.ncbi.nih.gov/snp/release-notes/RefSeq/refseq77.snp.rpt>
[2] This release includes updated genome annotation for the human, mouse, and zebrafish reference genomes.
Additional information is available here:
Homo sapiens annotation release 108 (GRCh38.p7): www.ncbi.nlm.nih.gov/genome/annotation_euk/Homo_sapiens/108/
Genes and pseudogenes: 54,220
Mus musculus annotation release 106 (GRCm38.p4): www.ncbi.nlm.nih.gov/genome/annotation_euk/Mus_musculus/106/
Genes and pseudogenes: 46,062
Danio rerio annotation release 105 (GRCz10): www.ncbi.nlm.nih.gov/genome/annotation_euk/Danio_rerio/105/
Genes and pseudogenes: 42,154
[3] Bacterial antimicrobial resistance genes and proteins data set:
NCBI has started a new project to provide a curated RefSeq data set of antimicrobial resistance genes and proteins. The first data release is included in RefSeq release 77. A description of the project and links to access the nucleotide Gene records and protein records are available in NCBI's BioProject resource: www.ncbi.nlm.nih.gov/bioproject/PRJNA313047. More genes and proteins will be added to this data set in the coming months.
[4] Re-annotation of genomes from three prokaryotic clades:
As part of ongoing work to improve genome annotation of antimicrobial resistance loci, Escherichia coli, Shigella, and Klebsiella pneumoniae genomes were re-annotated using NCBI's prokaryotic genome annotation pipeline (PGAP). PGAP uses curated reference proteins and protein families to apply annotation at different levels of specificity depending on sequence similarity.
More information about PGAP is available in a recent open access publication in Nucleic Acids Research: www.ncbi.nlm.nih.gov/pubmed/27342282
[5] Update of human reference mitochondrion NC_012920.1:
This update is not included in this release but will be publicly available very shortly after the RefSeq release is deployed. The human reference mitochondrion has been updated to include RefSeq transcript records. In making this change we re-accessioned all of the protein records. The original accessions (such as YP_003024026.1) are tracked as secondary accessions on the replacement record (for example, NP_001315102.1).
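Because the re-accessioning described above changes both the prefix and the number of the affected protein records, code that handles these identifiers may find it useful to split an accession.version string into its parts. The following is a minimal sketch under stated assumptions: the names split_accession and describe are hypothetical, the pattern covers only two-letter-prefix accessions of the kind mentioned in these announcements, and the prefix table is a small subset of RefSeq prefixes, not a complete list.

```python
import re

# Matches identifiers such as NP_001315102.1 or NC_012920.1.
ACCESSION = re.compile(r"^(?P<prefix>[A-Z]{2})_(?P<number>\d+)\.(?P<version>\d+)$")

# Subset of RefSeq accession prefixes appearing in these announcements.
PREFIX_TYPES = {
    "NC": "complete genomic molecule",
    "NG": "genomic region",
    "NM": "mRNA",
    "NR": "non-coding RNA",
    "NP": "protein",
    "YP": "protein",
}


def split_accession(identifier):
    """Split 'NP_001315102.1' into ('NP', '001315102', 1).

    The number is kept as a string so leading zeros are preserved.
    """
    match = ACCESSION.match(identifier.strip())
    if match is None:
        raise ValueError("unrecognized accession: %r" % identifier)
    return match.group("prefix"), match.group("number"), int(match.group("version"))


def describe(identifier):
    """Return the molecule type for a known prefix, else 'unknown'."""
    prefix, _, _ = split_accession(identifier)
    return PREFIX_TYPES.get(prefix, "unknown")


for acc in ("YP_003024026.1", "NP_001315102.1", "NC_012920.1"):
    print(acc, split_accession(acc), describe(acc))
```

Note that the sketch does not resolve an old accession to its replacement; that mapping is recorded as a secondary accession on the replacement record itself.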