Eukaryotic Genome Submission Examples

Figure 1: Sample FASTA-formatted sequence

>HTE831 [organism=Drosophila yakuba] [strain=HTE831]
tagagcaaaaaatagacattttaatggcgctaatcatacaaggaaggaataataacactg
acatggatacatccacttaatctacatttgcttattcctatcttgactatatctatatcc
[etc.]
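
The definition line in Figure 1 carries the sequence identifier followed by source modifiers in square brackets. As a rough illustration only, the following Python sketch splits such a line into its parts. The function name `parse_defline` is hypothetical and is not part of any NCBI tool. The sketch assumes every modifier has the form `[name=value]`.

```python
import re

def parse_defline(defline):
    """Split a FASTA definition line into (seq_id, modifiers).

    Hypothetical helper for illustration; assumes modifiers look like [name=value].
    """
    if not defline.startswith(">"):
        raise ValueError("a FASTA definition line must start with '>'")
    body = defline[1:].strip()
    # The sequence identifier is the first whitespace-delimited token.
    seq_id = body.split()[0]
    # Collect every [name=value] pair that appears on the line.
    pairs = re.findall(r"\[([^=\]]+)=([^\]]*)\]", body)
    modifiers = {name.strip(): value.strip() for name, value in pairs}
    return seq_id, modifiers

seq_id, modifiers = parse_defline(
    ">HTE831 [organism=Drosophila yakuba] [strain=HTE831]"
)
print(seq_id)
print(modifiers)
```

The identifier returned here is the same one that the feature table in Figure 2 names on its `>Feature` line.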

Figure 2: Feature table format

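This mock example of a feature table file includes a multi-exon gene with mRNA and CDS features, an alternatively spliced gene on the complementary strand, a partial gene, tRNA and rRNA genes (one tRNA gene is marked pseudo), a misc_feature, and several repeat_region features.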

Note that the relative order of the features in the file does not matter. The misc_feature and repeat_region features do not have a corresponding gene feature, and so do not have a locus_tag.

See the flatfile view of this file in Figure 3.

>Feature HTE831
63574  87173   gene
            locus_tag       Ngs_17131
63574    63907   mRNA
75690   75730
84396   85536
85598   85773
85836   86109
86173   86467
86555   86670
86731   87173
            product  hypothetical protein
            protein_id  gnl|ncbi|Ngs_17131
            transcript_id   gnl|ncbi|Ngs_mrna17131
84402   85536   CDS
85598   85773
85836   86109
86173   86467
86555   86670
86731   86882
            product hypothetical protein
            protein_id  gnl|ncbi|Ngs_17131
            transcript_id   gnl|ncbi|Ngs_mrna17131
            inference   similar to RNA sequence, mRNA:INSD:AY123455.2
102664  100872  gene
            locus_tag       Ngs_3038
            gene    TpnI
102664   102502  mRNA
102400  102234
102168  100872
            product troponin isoform B
            protein_id  gnl|ncbi|Ngs_3038B
            transcript_id   gnl|ncbi|Ngs_mrna3038B
            note    transcript variant B; alternatively spliced
102655   102234  mRNA
102168  100872
            product troponin isoform A
            protein_id  gnl|ncbi|Ngs_3038A
            transcript_id   gnl|ncbi|Ngs_mrna3038A
            note    transcript variant A; alternatively spliced
102503  102502  CDS
102400  102234
102168  101261
            product troponin isoform B
            protein_id  gnl|ncbi|Ngs_3038B
            transcript_id   gnl|ncbi|Ngs_mrna3038B
            note    encoded by transcript variant B; alternatively spliced
102492  102234  CDS
102168  101261
            product troponin isoform A
            protein_id  gnl|ncbi|Ngs_3038A
            transcript_id   gnl|ncbi|Ngs_mrna3038A
            note    encoded by transcript variant A; alternatively spliced
<112616  >115107  gene
            locus_tag       Ngs_2945
<112616   112646  mRNA
112703  113463
113584  113762
113821  114249
114302  114464
114804  114902
114964  >115107
            product bifunctional methylenetetrahydrofolate dehydrogenase (NADP+)/methenyltetrahydrofolate cyclohydrolase
            protein_id  gnl|ncbi|Ngs_2945
            transcript_id   gnl|ncbi|Ngs_mrna2945
112616  112646  CDS
112703  113463
113584  113762
113821  114249
114302  114464
114804  114902
114964  115107
            product bifunctional methylenetetrahydrofolate dehydrogenase (NADP+)/methenyltetrahydrofolate cyclohydrolase
            EC_number       1.5.1.5
            EC_number       3.5.4.9
            note    bifunctional
            experiment  Western blot
            protein_id  gnl|ncbi|Ngs_2945
            transcript_id   gnl|ncbi|Ngs_mrna2945
101    180 gene
            locus_tag       Ngs_10111
            gene    trnL
101  180 tRNA
            product Leu
45111  45190   gene
            locus_tag       Ngs_10112
            pseudo
45111    45190   tRNA
            product Xxx
2103   400 gene
            locus_tag       Ngs_11232
2103 400 rRNA
            product 18S ribosomal RNA
60101    60567   misc_feature
            note    similar to ABC transporters
43027   43136   repeat_region
            mobile_element  retrotransposon:mini-me-Dpse-like{}4773
56408   56558   repeat_region
            mobile_element  retrotransposon:INE-1{}4674
62077   62147   repeat_region
            mobile_element  retrotransposon:P-T-Damb-like{}4769
63111   63154   repeat_region
            note    at-rich
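
The layout of Figure 2 can be read as three kinds of lines: a line with start, stop and feature key opens a feature, a line with only start and stop adds another interval to it, and any other line is a qualifier with an optional value. The Python sketch below reads a table that way. It is an illustration under assumptions, not an NCBI parser: the name `parse_feature_table` is hypothetical, and the sketch splits on any whitespace so that it also accepts text where the column separators have been turned into spaces, as in the listing above. The sample lines are copied from Figure 2.

```python
import re

# A coordinate is a number, optionally preceded by a partial marker (< or >).
COORD = re.compile(r"^[<>]?\d+$")

def parse_feature_table(text):
    """Parse feature table text into a list of feature dictionaries."""
    features = []
    seq_id = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">Feature"):
            seq_id = line.split()[1]
            continue
        tokens = line.split()
        is_interval = (
            len(tokens) >= 2 and COORD.match(tokens[0]) and COORD.match(tokens[1])
        )
        if is_interval and len(tokens) >= 3:
            # start, stop, key: a new feature begins here.
            features.append({
                "seq_id": seq_id,
                "key": tokens[2],
                "intervals": [(tokens[0], tokens[1])],
                "qualifiers": [],
            })
        elif is_interval:
            # start, stop only: an additional interval of the current feature.
            features[-1]["intervals"].append((tokens[0], tokens[1]))
        else:
            # Qualifier line; flag qualifiers such as "pseudo" have no value.
            parts = line.split(None, 1)
            value = parts[1].strip() if len(parts) > 1 else None
            features[-1]["qualifiers"].append((parts[0], value))
    return features

sample = """>Feature HTE831
45111  45190   gene
            locus_tag       Ngs_10112
            pseudo
2103 400 rRNA
            product 18S ribosomal RNA
102492  102234  CDS
102168  101261
            product troponin isoform A
60101    60567   misc_feature
            note    similar to ABC transporters
"""

features = parse_feature_table(sample)
for feature in features:
    print(feature["key"], feature["intervals"], feature["qualifiers"])
```

Coordinates are kept as strings so that the partial markers `<` and `>` survive parsing. Because each feature is collected independently, the result does not depend on the order in which features appear in the file.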

Figure 3: GenBank flatfile

This is part of the flatfile view of the .sqn file made from the .fsa file (Fig. 1) and .tbl file (Fig. 2).

     source          1..116100
                     /organism="Drosophila yakuba"
                     /mol_type="genomic DNA"
                     /strain="HTE831"
                     /db_xref="taxon:7245"
     gene            101..180
                     /gene="trnL"
                     /locus_tag="Ngs_10111"
     tRNA            101..180
                     /gene="trnL"
                     /locus_tag="Ngs_10111"
                     /product="tRNA-Leu"
     gene            complement(400..2103)
                     /locus_tag="Ngs_11232"
     rRNA            complement(400..2103)
                     /locus_tag="Ngs_11232"
                     /product="18S ribosomal RNA"
     repeat_region   43027..43136
                     /mobile_element="retrotransposon:mini-me-Dpse-like{}4773"
     gene            45111..45190
                     /locus_tag="Ngs_10112"
                     /pseudo
     tRNA            45111..45190
                     /locus_tag="Ngs_10112"
                     /product="tRNA-OTHER"
                     /pseudo
     repeat_region   56408..56558
                     /mobile_element="retrotransposon:INE-1{}4674"
     misc_feature    60101..60567
                     /note="similar to ABC transporters"
     repeat_region   62077..62147
                     /mobile_element="retrotransposon:P-T-Damb-like{}4769"
     repeat_region   63111..63154
                     /note="at-rich"
     gene            63574..87173
                     /locus_tag="Ngs_17131"
     mRNA            join(63574..63907,75690..75730,84396..85536,85598..85773,
                     85836..86109,86173..86467,86555..86670,86731..87173)
                     /locus_tag="Ngs_17131"
                     /product="hypothetical protein"
     CDS             join(84402..85536,85598..85773,85836..86109,86173..86467,
                     86555..86670,86731..86882)
                     /locus_tag="Ngs_17131"
                     /inference="similar to RNA sequence, mRNA:INSD:AY123455.2"
                     /codon_start=1
                     /product="hypothetical protein"
                     /translation="MQSTQSKSDRSSMHRGPLLLCAVMVVLVTLPEQINARMAFEKLT
                     DFDFPGNTYYSVKNLSLYECQGWCREEADCQAAAFSFVVNPLSPSQETHCQLQNDSSA
                     ANPSAAPQRSANMYYMIKLQLRSENVCHRPWSFERVPNKVIRGLDNALIYTSTKEACL
                     SACLNERRFVCRSVEYDYNNMKCVLSDSDRRSSGQFVQLVDAQGTDYFENLCLKPAQA
                     CKNNRSFGNSQKMGVSEEKVAQYVGLHYYTDKELQVTSESACRLACEIESEFLCRSFL
                     YLGQPQGSQYNCRLYHLDHKTLPDGPSTYLNHERPLIDHGEPIGQYFENQCEKAAGLG
                     AGSPPGTLDKIDTLPVSLDTIEDPNLTNLTRNDVNCDKTGTCYDVSVHCKDTRIAVQV
                     RTNKPFNGRIYALGRSETCNIDVINSDAFRLDLTMAGQDCNTQSVTGVYSNTVVLQHH
                     SVVMTKADKIYKVKCTYDMSSKNITFGMMPIRDPEMIHINSSPEAPPPRIRILDTRQR
                     EVETVRIGDRLNFRIEIPEDTPYGIFARSCVAMAKDARTSFKIIDDDGCPTDPTIFPG
                     FTADGNALQSTYEAFRFTESYGVIFQCNVKYCLGPCEPAVCEWNMDSFESLGRRRRRS
                     IESNDTKSEDDMNISQEILVLDFGDEKREFFKADPSTDFAKDKTVTIIEPCPTKTSVL
                     ALAVTCALMILLYISTLFCYYMKKWMQPHKIVA"
     gene            complement(100872..102664)
                     /gene="TpnI"
                     /locus_tag="Ngs_3038"
     mRNA            complement(join(100872..102168,102234..102400,
                     102502..102664))
                     /gene="TpnI"
                     /locus_tag="Ngs_3038"
                     /product="troponin isoform B"
                     /note="transcript variant B; alternatively spliced"
     mRNA            complement(join(100872..102168,102234..102655))
                     /gene="TpnI"
                     /locus_tag="Ngs_3038"
                     /product="troponin isoform A"
                     /note="transcript variant A; alternatively spliced"
     CDS             complement(join(101261..102168,102234..102400,
                     102502..102503))
                     /gene="TpnI"
                     /locus_tag="Ngs_3038"
                     /note="encoded by transcript variant B; alternatively spliced"
                     /codon_start=1
                     /product="troponin isoform B"
                     /translation="MDSSQSRKNGFLLHLPLETKRNPSNPNTPLSNLLNLTDFHYLLA
                     SNVCRKAKRELLAVLIVTSYAGHDALRSAHRQAIPQSKLEEMGLRRVFLLAALPSREH
                     FISQDQLASEQNRFGDLLQGNFIEDYRNLSYKHVMGLKWVSEECKKQAKFIIKLDDDI
                     IYDVFHLRRYLETLEVREPGLATSSTLLSGYVLDAKPPIRLRANKWYVSKKEYPQALY
                     PAYLSGWLYVTNVPTAERIVAEAERMSFFWIDDTWLTGVVRTRLGIPLERHNDWFSAN
                     AEFIDCCVRDLKKHNYECEYSVGPNGGDDRLLVEFLHNVEKCYFDECVKRPVGKSLKE
                     TCLAAAKSRPPKHGFPEIKALRLR"
     CDS             complement(join(101261..102168,102234..102492))
                     /gene="TpnI"
                     /locus_tag="Ngs_3038"
                     /note="encoded by transcript variant A; alternatively spliced"
                     /codon_start=1
                     /product="troponin isoform A"
                     /translation="MRMRGRRLLPIILSLLLIVLLSLCYFSNHLRDSSQSRKNGFLLH
                     LPLETKRNPSNPNTPLSNLLNLTDFHYLLASNVCRKAKRELLAVLIVTSYAGHDALRS
                     AHRQAIPQSKLEEMGLRRVFLLAALPSREHFISQDQLASEQNRFGDLLQGNFIEDYRN
                     LSYKHVMGLKWVSEECKKQAKFIIKLDDDIIYDVFHLRRYLETLEVREPGLATSSTLL
                     SGYVLDAKPPIRLRANKWYVSKKEYPQALYPAYLSGWLYVTNVPTAERIVAEAERMSF
                     FWIDDTWLTGVVRTRLGIPLERHNDWFSANAEFIDCCVRDLKKHNYECEYSVGPNGGD
                     DRLLVEFLHNVEKCYFDECVKRPVGKSLKETCLAAAKSRPPKHGFPEIKALRLR"
     gene            <112616..>115107
                     /locus_tag="Ngs_2945"
     mRNA            join(<112616..112646,112703..113463,113584..113762,
                     113821..114249,114302..114464,114804..114902,
                     114964..>115107)
                     /locus_tag="Ngs_2945"
                     /product="bifunctional methylenetetrahydrofolate dehydrogenase (NADP+)/methenyltetrahydrofolate cyclohydrolase"
     CDS             join(112616..112646,112703..113463,113584..113762,
                     113821..114249,114302..114464,114804..114902,
                     114964..115107)
                     /locus_tag="Ngs_2945"
                     /EC_number="3.5.4.9"
                     /EC_number="1.5.1.5"
                     /experiment="Western blot"
                     /codon_start=1
                     /product="bifunctional methylenetetrahydrofolate dehydrogenase (NADP+)/methenyltetrahydrofolate cyclohydrolase"
                     /translation="MESITFGVLTISDTCWQEPEKDTSGPILRQLIGETFANTQVIGN
                     IVPDEKDIIQQELRKWIDREELRVILTTGGTGFAPRDVTPEATRQLLEKECPQLSMYI
                     TLESIKQTQYAALSRGLCGIAGNTLILNLPGSEKAVKECFQTISALLPHAVHLIGDDV
                     SLVRKTHAEVQGSAQKSHICPHKTGTGTDSDRNSPYPMLPVQEVLSIIFNTVQKTANL
                     NKILLEMNAPVNIPPFRASIKDGYAMKSTGFSGTKRVLGCIAAGDSPNSLPLAEDECY
                     KINTGAPLPLEADCVVQVEDTKLLQLDKNGQESLVDILVEPQAGLDVRPVGYDLSTND
                     RIFPALDPSPVVVKSLLASVGNRLILSKPKVAIVSTGSELCSPRNQLTPGKIFDSNTT
                     MLTELLVYFGFNCMHTCVLSDSFQRTKESLLELFEVVDFVICSGGVSMGDKDFVKSVL
                     EDLQFRIHCGRVNIKPGKPMTFASRKDKYFFGLPGNPVSAFVTFHLFALPAIRFAAGW
                     DRCKCSLSVLNVKLLNDFSLDSRPEFVRASVISKSGELYASVNGNQISSRLQSIVGAD
                     VLINLPARTSDRPLAKAGEIFPASVLRFDFISKYE"
ORIGIN
        1 tagagcaaaa aatagacatt ttaatggcgc taatcataca aggaaggaat aataacactg
       61 acatggatac atccacttaa tctacatttg cttattccta tcttgactat atctatatcc
       [etc.]
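
Comparing Figure 2 with Figure 3 shows how table intervals map to flatfile locations: an interval whose start is greater than its stop appears inside `complement(...)` with the coordinates in ascending order, and a feature with several intervals appears inside `join(...)`. The Python sketch below reproduces that mapping for the examples on this page. The name `to_location` is hypothetical. The handling of partial markers on the complementary strand (swapping `<` and `>`) is an assumption, since the figures only show partial features on the forward strand. The sketch also returns the location on one line, whereas the flatfile wraps long locations.

```python
def parse_coord(token):
    """Return (partial_marker, position) for a token such as '<112616' or '87173'."""
    marker = token[0] if token[0] in "<>" else ""
    return marker, int(token.lstrip("<>"))

def to_location(intervals):
    """Build a flatfile-style location string from (start, stop) token pairs."""
    parsed = [(parse_coord(start), parse_coord(stop)) for start, stop in intervals]
    # A start greater than its stop marks the complementary strand.
    minus = any(start[1] > stop[1] for start, stop in parsed)
    # Assumption: partial markers swap sides when coordinates are reordered.
    flip = {"<": ">", ">": "<", "": ""}
    spans = []
    for (m_start, p_start), (m_stop, p_stop) in parsed:
        if minus:
            spans.append(f"{flip[m_stop]}{p_stop}..{flip[m_start]}{p_start}")
        else:
            spans.append(f"{m_start}{p_start}..{m_stop}{p_stop}")
    if minus:
        # The flatfile lists complement-strand spans in ascending order.
        spans.reverse()
    location = spans[0] if len(spans) == 1 else "join(" + ",".join(spans) + ")"
    return f"complement({location})" if minus else location

rrna = to_location([("2103", "400")])
cds_b = to_location([("102503", "102502"), ("102400", "102234"), ("102168", "101261")])
partial_gene = to_location([("<112616", ">115107")])
print(rrna)
print(cds_b)
print(partial_gene)
```

The three results can be checked against the rRNA, troponin isoform B CDS, and Ngs_2945 gene entries in Figure 3.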
Last updated: 2017-11-09T23:39:24Z