Updating Information on GenBank Records

You can update your existing GenBank records at any time using the file types described below. If you are updating multiple records, list all accessions to be updated at the top of your request. To ensure accuracy, updates must include the GenBank accession numbers if accessions have been assigned. Updates submitted with SUB numbers or BankIt numbers cannot be processed after accessions have been assigned. Save each update file as plain text and email it as an attachment to [email protected].

Do not submit a new file to update an existing record, as this will create unnecessary duplication in the database.

If you submitted to our collaborators at ENA or DDBJ, please see their instructions for update formats. Prokaryotic and eukaryotic genomes, TSA and SRA should be updated as described on the linked pages. Updates to BioProject and BioSample should be sent to [email protected].

Editing Source Information

Send updates to the source information (e.g., strain, cultivar, geo_loc_name, specimen_voucher) in a multi-column tab-delimited (.tsv) table, for example:

acc. num.       strain  geo_loc_name    organism
MHxxxx02        82      USA             Escherichia coli
MHxxxx03        ABC     Canada          Bacillus subtilis
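
The table above can also be generated programmatically, which avoids stray spaces where tabs are expected. The following is a minimal sketch, assuming Python's standard csv module; the function name write_source_table and the in-memory buffer are hypothetical and used only for illustration.

```python
import csv
import io


def write_source_table(rows, columns, out):
    """Write a tab-delimited source-update table to the file-like object `out`.

    `rows` is a list of dicts keyed by column name; `columns` fixes the
    column order, with the accession column first. (Hypothetical helper.)
    """
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    writer.writerow(columns)
    for row in rows:
        # Missing values are written as empty cells so columns stay aligned.
        writer.writerow([row.get(col, "") for col in columns])


columns = ["acc. num.", "strain", "geo_loc_name", "organism"]
rows = [
    {"acc. num.": "MHxxxx02", "strain": "82",
     "geo_loc_name": "USA", "organism": "Escherichia coli"},
    {"acc. num.": "MHxxxx03", "strain": "ABC",
     "geo_loc_name": "Canada", "organism": "Bacillus subtilis"},
]

buf = io.StringIO()
write_source_table(rows, columns, buf)
print(buf.getvalue(), end="")
```

To produce a file for attachment, pass an open text file instead of the buffer, for example one opened with open("source_update.tsv", "w", newline="") (the file name is only an example).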

Editing/Adding Project Links

To include BioProject, BioSample, and/or SRA run accessions, send the information in a tab-delimited (.tsv) table. Please provide the assigned accessions and not the temporary SUB numbers. For example:

acc. num.       BioProject      BioSample       SRA run accession
MHxxxx02        PRJNAxxxxxx     SAMNxxxxxx      SRRxxxxxx
MHxxxx03        PRJNAxxxxxx     SAMNxxxxxx      SRRxxxxxx

The Project Link information can be combined with the source table if both are being updated.
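
One way to combine the two tables is to merge their rows on the accession column. The sketch below assumes both tables have already been read into lists of dicts; merge_by_accession and the sample values are hypothetical, and the placeholder accessions are copied from the examples above.

```python
import csv
import io

KEY = "acc. num."  # accession column header, as used in the example tables


def merge_by_accession(source_rows, link_rows):
    """Combine two lists of row dicts into one row per accession."""
    merged = {}
    for row in list(source_rows) + list(link_rows):
        merged.setdefault(row[KEY], {}).update(row)
    return list(merged.values())


source_rows = [
    {"acc. num.": "MHxxxx02", "strain": "82", "geo_loc_name": "USA"},
]
link_rows = [
    {"acc. num.": "MHxxxx02", "BioProject": "PRJNAxxxxxx",
     "BioSample": "SAMNxxxxxx"},
]
combined = merge_by_accession(source_rows, link_rows)

# Column order: accession first, then every other column in first-seen order.
columns = [KEY]
for row in combined:
    for name in row:
        if name not in columns:
            columns.append(name)

out = io.StringIO()
writer = csv.DictWriter(out, fieldnames=columns, delimiter="\t",
                        lineterminator="\n", restval="")
writer.writeheader()
writer.writerows(combined)
print(out.getvalue(), end="")
```

Accessions that appear in only one of the two tables keep empty cells for the columns they lack.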

Updating Publication Information

[a] If the PMID or DOI is publicly available, please send the information as a tab-delimited (.tsv) table as follows:

    acc. num.   PMID 
    MHXXXX01    29980901
    MHXXXX02    29980901

    acc. num.   DOI
    MHXXXX01    10.1000/xyz123
    MHXXXX02    https://doi.org/10.1000/xyz123doi

[b] For all other updates, please provide the revised information in a tab-delimited (.tsv) table. You must replace any non-ASCII characters (for example, characters with accents and umlauts) with the appropriate English letters.

Provide the complete list of revised author names in the following format: first_initial middle_initial surname, and so on for each author. For example:

    acc. num.   authors               title
    MHXXXX01    J. A. Smith           Identification of gene A
    MHXXXX02    X. P. Weng, J. Doe    Identification of gene B

[c] For affiliation updates, send the correct affiliation in the text portion of an email.
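
The non-ASCII replacement required in [b] can be partly automated. The sketch below assumes that stripping accents and umlauts is an acceptable replacement (for example, "ü" becomes "u", not "ue"); that choice and the function name to_ascii are assumptions, not GenBank requirements. Characters that have no accent-stripped form, such as "ø", are reported so they can be replaced by hand instead of being silently dropped.

```python
import unicodedata


def to_ascii(text):
    """Return `text` with accents and umlauts removed.

    Raises ValueError if a character has no plain-ASCII base letter,
    so that it can be replaced manually. (Hypothetical helper.)
    """
    # NFKD splits an accented letter into base letter + combining mark.
    decomposed = unicodedata.normalize("NFKD", text)
    stripped = "".join(ch for ch in decomposed
                       if not unicodedata.combining(ch))
    try:
        stripped.encode("ascii")
    except UnicodeEncodeError as err:
        raise ValueError(f"needs a manual replacement: {text!r}") from err
    return stripped


print(to_ascii("J. Müller"))
print(to_ascii("É. Pérez"))
```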

Nucleotide Sequence Update

If you are updating the current nucleotide sequence, send the complete new sequence(s) in fasta format:

>MHxxxx02
cggtaataatggaccttggaccccggcaaagcggagagac
>MHxxxx03
ggaccttggaccccggcaaagcggagagaccggtaataat

Do not include non-IUPAC characters within the sequence. Use n's for unknown nucleotides within the sequence.
If updating multiple sequences, send all the sequences in a single fasta file.
Do not include source information in the fasta defline. Source changes need to be sent as a tab-delimited table as described above.
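
Before sending, the fasta file can be checked against the rules above. This is a minimal sketch, not an official validator: it assumes the standard IUPAC nucleotide codes (including U) are the permitted characters, and it treats any defline with more than one whitespace-separated token as carrying extra information. The names parse_fasta and find_problems are hypothetical.

```python
# Standard IUPAC nucleotide codes; whether every code is accepted in a
# given update is an assumption of this sketch.
IUPAC_NUCLEOTIDES = set("ACGTURYSWKMBDHVN")


def parse_fasta(text):
    """Return {defline: sequence} from fasta-formatted text."""
    records = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            current = line[1:].strip()
            records[current] = []
        elif current is None:
            raise ValueError("sequence data found before the first defline")
        else:
            records[current].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


def find_problems(records):
    """Return a list of human-readable problems; empty if none found."""
    problems = []
    for defline, seq in records.items():
        if len(defline.split()) != 1:
            problems.append(f"{defline!r}: defline should hold only the accession")
        bad = sorted(set(seq.upper()) - IUPAC_NUCLEOTIDES)
        if bad:
            problems.append(f"{defline!r}: non-IUPAC characters {bad}")
    return problems


example = """>MHxxxx02
cggtaataatggaccttggaccccggcaaagcggagagac
>MHxxxx03
ggaccttggaccccggcaaagcggagagaccggtaataat
"""
print(find_problems(parse_fasta(example)))
```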

Feature (Annotation) Update

If you are adding annotation or changing locations of features, then send us the features as a tab-delimited 5-column Feature table.

a. If the record is publicly released and has annotation, you can download the existing annotation .tbl file by retrieving the record and clicking on the 'Send to' option. Choose 'File' as the Destination and 'Feature table' as the Format. Edit this table and send it to us via email.

b. If the record is not yet publicly released, let us know, and we will send you a 5-column table with the current annotation for you to edit and return to us.

Please maintain the tab structure of the table when editing.

For example:

>Feature gb|MHxxxxxx
<1      400     gene
                        gene            ENO1
<1      30      CDS
70      300
                        product         enolase
                        note            homodimer
<1      30      mRNA
70      400
                        product         enolase
<1      30      exon
                        number          1
70      400     exon
                        number          2
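
To check that an edited table has kept its tab structure, it can be read back column by column. The sketch below assumes the layout shown in the example: feature lines carry start, stop, and feature key in columns 1 to 3; continuation lines carry only start and stop; qualifier lines leave columns 1 to 3 empty and carry the qualifier and its value in columns 4 and 5. The function parse_feature_table is hypothetical and covers only these cases.

```python
def parse_feature_table(text):
    """Parse a tab-delimited 5-column feature table into a list of dicts."""
    features = []
    for number, raw in enumerate(text.splitlines(), start=1):
        if not raw.strip() or raw.startswith(">Feature"):
            continue
        cols = raw.split("\t")
        if cols[0]:
            # Interval line: start, stop, and (for a new feature) the key.
            if len(cols) < 2:
                raise ValueError(f"line {number}: missing stop position")
            interval = (cols[0], cols[1])
            if len(cols) >= 3 and cols[2]:
                features.append({"key": cols[2], "intervals": [interval],
                                 "qualifiers": []})
            elif features:
                features[-1]["intervals"].append(interval)
            else:
                raise ValueError(f"line {number}: interval before any feature")
        else:
            # Qualifier line: three empty columns, then qualifier and value.
            if len(cols) < 4 or not features:
                raise ValueError(f"line {number}: tab structure is broken")
            value = cols[4] if len(cols) > 4 else ""
            features[-1]["qualifiers"].append((cols[3], value))
    return features


table = (
    ">Feature gb|MHxxxxxx\n"
    "<1\t400\tgene\n"
    "\t\t\tgene\tENO1\n"
    "<1\t30\tCDS\n"
    "70\t300\n"
    "\t\t\tproduct\tenolase\n"
)
for feature in parse_feature_table(table):
    print(feature["key"], feature["intervals"], feature["qualifiers"])
```

A line whose tabs were replaced by spaces fails this check, because it no longer splits into the expected columns.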
