Conserved domains on [gi|116309068|emb|CAH66177|]
OSIGBa0130O15.1 [Oryza sativa]

List of domain hits

Name Accession Description Interval E-value
RVT_2 pfam07727
Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually ...
949-1192 1.19e-145

Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. This Pfam entry includes reverse transcriptases not recognized by the pfam00078 model.



Pssm-ID: 400190  Cd Length: 243  Bit Score: 443.57  E-value: 1.19e-145
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 116309068   949 NKTWRLVDLPSGHRPIGLKWVYKLKKDAQgVVVKHKARLVAKGYVQRAGIDFDEVFAPVARLDSVRLLLALAAQEGWMVH 1028
Cdd:pfam07727    1 NETWTLVKLPKNVKPIGTTWVHTHKINDL-KEVQYKARLVAQGFRQIAGEDYDKVFSPVIRLSSVRLLLAIAAEYEWPVH 79
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 116309068  1029 HMDVKSAFLNGELIEEVYVVQPPGFEIDGQENKVYRLDKALYGLRQAPRAWNTKLDCTLKKLGFKQSPLEHGLYARGDGS 1108
Cdd:pfam07727   80 HMDVSSAFLNGDIDEEIYVKQPPGFNIDNESGKVWQLNKSLYGLKQAPYMWNTCITKVLMDLNFEPDTAESGMYCRGFGE 159
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 116309068  1109 GRLLVGVYVDDLVIVGGDSGMIKGFKEQMKAEFKMSDLGPLSFYLGIEVHQEAGIITLKQAAYASRIVEKAGLTGCNPCA 1188
Cdd:pfam07727  160 NKLIVGLYVDDMFITGSDITIINDFKLELAKHFKMKDLGDISEFLGIEFIQIAGGIRLSQHNYLNSVIKKFNLTNNNGKY 239

                   ....
gi 116309068  1189 TPME 1192
Cdd:pfam07727  240 TPII 243
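Each hit carries a one-line statistics summary (Pssm-ID, Cd Length, Bit Score, E-value). As a minimal sketch of reading such a line programmatically, the snippet below parses the RVT_2 summary and computes how much of the model the alignment covers. The helper name `parse_hit_stats` is hypothetical and not part of any NCBI tool; the model range 1-243 is read off the alignment rows above.

```python
import re

# Statistics line copied from the RVT_2 hit above.
line = "Pssm-ID: 400190  Cd Length: 243  Bit Score: 443.57  E-value: 1.19e-145"

# Hypothetical helper (an assumption for illustration, not an NCBI API).
STATS_PATTERN = re.compile(
    r"Pssm-ID:\s*(\d+).*?Cd Length:\s*(\d+)\s+"
    r"Bit Score:\s*([\d.]+)\s+E-value:\s*(\S+)"
)

def parse_hit_stats(text):
    """Return the four statistics of a CD-Search hit summary line as a dict."""
    m = STATS_PATTERN.search(text)
    if m is None:
        raise ValueError("not a CD-Search statistics line")
    return {
        "pssm_id": int(m.group(1)),
        "cd_length": int(m.group(2)),
        "bit_score": float(m.group(3)),
        "e_value": float(m.group(4)),
    }

stats = parse_hit_stats(line)

# The alignment rows run from model position 1 to 243.
model_from, model_to = 1, 243
coverage = (model_to - model_from + 1) / stats["cd_length"]
print(stats, coverage)
```

The non-greedy `.*?` after the Pssm-ID lets the same pattern accept lines that carry a `[Multi-domain]` tag.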
RNase_HI_RT_Ty1 cd09272
Ty1/Copia family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) ...
1276-1415 5.39e-73

Ty1/Copia family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea, and eukaryotes. RNase HI has also been observed as an adjunct domain to the reverse transcriptase gene in retroviruses and in long terminal repeat (LTR)-bearing and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acids and one glutamic acid (DEDD), are invariant across all RNase H domains. Phylogenetic analysis classifies RNase HI of LTR retroelements into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1, and the vertebrate retroviruses. The Ty1/Copia family is widely distributed among the genomes of plants, fungi, and animals. RNase H has been explored as an anti-HIV drug target because its inactivation inhibits reverse transcription.



Pssm-ID: 260004  Cd Length: 140  Bit Score: 239.29  E-value: 5.39e-73
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 116309068 1276 QGYSDSDMAGDIDTRKSTTGVIFFLGKNPVSWQSQKQRVVALSSCESEYIAAATAACQGIWLARLLGDLRNAATEVVDLR 1355
Cdd:cd09272     1 EGYSDADWAGDPDDRRSTSGYVFFLGGGPISWKSKKQTTVALSSTEAEYIALAEAAKEALWLRRLLEELGIPLDGPTTIY 80
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|
gi 116309068 1356 VDNQSALALMKNPVFHDRSKHIQTKFHFIREAVENGEITPSYIGTEGQLADILTKPLSRI 1415
Cdd:cd09272    81 CDNQSAIALAKNPVFHSRTKHIDIRYHFIREKVEKGEIKVEYVPTEDQLADILTKPLPRP 140
gag_pre-integrs pfam13976
GAG-pre-integrase domain; This domain is found associated with retroviral insertion elements ...
487-554 1.64e-22

GAG-pre-integrase domain; This domain is found associated with retroviral insertion elements and lies just upstream of the integrase region on the polyproteins.



Pssm-ID: 372857  Cd Length: 67  Bit Score: 92.43  E-value: 1.64e-22
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 116309068   487 RLYMLDI-NLARPVCLAAHADEDAWRWHARLGHINFRALCKMGKEELVRGLPCLSqvDQVCEACLAGKH 554
Cdd:pfam13976    1 GLYLLDLsSVANSSIAVASKDDETWLWHRRLGHPSFKGLKKLVKKGLLPGLPISK--DLVCESCQLGKQ 67
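The start and end coordinates printed on each alignment row can be cross-checked against the row itself. The sketch below does this for the gag_pre-integrs alignment above, assuming the usual CD-Search display convention that `-` marks a gap and that lowercase letters are unaligned residues which still advance the coordinate. The function name `residue_count` is an illustrative choice.

```python
def residue_count(aligned_row):
    """Count residues in one alignment row; gap characters ('-') are skipped,
    lowercase (unaligned) residues are counted."""
    return sum(1 for ch in aligned_row if ch.isalpha())

# Rows and coordinates copied from the gag_pre-integrs alignment above.
query_row = "RLYMLDI-NLARPVCLAAHADEDAWRWHARLGHINFRALCKMGKEELVRGLPCLSqvDQVCEACLAGKH"
model_row = "GLYLLDLsSVANSSIAVASKDDETWLWHRRLGHPSFKGLKKLVKKGLLPGLPISK--DLVCESCQLGKQ"
query_start, query_end = 487, 554
model_start, model_end = 1, 67

# Both rows occupy the same number of alignment columns.
assert len(query_row) == len(model_row)

# Residues on each row must account for the printed coordinate span.
assert residue_count(query_row) == query_end - query_start + 1
assert residue_count(model_row) == model_end - model_start + 1
print(residue_count(query_row), residue_count(model_row))
```

The query row has one gap and the model row two, which is why 69 alignment columns correspond to 68 query residues and 67 model positions.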
transpos_IS481 super family cl41329
IS481 family transposase;
568-682 4.84e-16

IS481 family transposase;


The actual alignment was detected with superfamily member NF033577:

Pssm-ID: 468094 [Multi-domain]  Cd Length: 283  Bit Score: 80.33  E-value: 4.84e-16
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 116309068  568 DVPLALLHGDLCGpITPATPSGNRYFLLLVDDYSRYMWVALLS--TKDAAPAAIKRTQAAAerksGRKLRALRTDRGGEF 645
Cdd:NF033577  125 AHPGELWHIDIKK-LGRIPDVGRLYLHTAIDDHSRFAYAELYPdeTAETAADFLRRAFAEH----GIPIRRVLTDNGSEF 199
                          90       100       110
                  ....*....|....*....|....*....|....*....
gi 116309068  646 TS--TQFAEYCAELGMRRELTAPYSPQQNGVVERRNQSV 682
Cdd:NF033577  200 RSraHGFELALAELGIEHRRTRPYHPQTNGKVERFHRTL 238
Retrotran_gag_2 super family cl26047
gag-polypeptide of LTR copia-type; This family is found in plants and fungi, and contains ...
90-220 6.70e-10

gag-polypeptide of LTR copia-type; This family is found in plants and fungi, and contains LTR polyproteins, or retrotransposons of the copia type.


The actual alignment was detected with superfamily member pfam14223:

Pssm-ID: 464108  Cd Length: 130  Bit Score: 58.40  E-value: 6.70e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 116309068    90 LDAICSAVPPEMIGTLATKASAREAWECIKTMRVGNDRIRKASAQKvraEYESLAFRGDETVEDFALRLTTIVNQLATLG 169
Cdd:pfam14223    1 LALIVLSLSDSLLRLVRNADTAKEAWDKLESTYERKSPANKLTLRR---QLHSLKMKEGESVLEHINKFEELVNKLSALG 77
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|.
gi 116309068   170 DPEPADKVVEKYLRVARPRFNQLILSIETLLDisTLSMEEVTGRLKAAEDS 220
Cdd:pfam14223   78 VEISDEDLVVKLLRSLPESYENFVTAIESSSD--KITLEELISKLLDEEER 126
ZnF_C2HC smart00343
zinc finger;
292-308 4.52e-05

zinc finger;



Pssm-ID: 197667 [Multi-domain]  Cd Length: 17  Bit Score: 41.66  E-value: 4.52e-05
                            10
                    ....*....|....*..
gi 116309068    292 KCRNCGKLGHWAKDCRS 308
Cdd:smart00343    1 KCYNCGKEGHIARDCPS 17
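CD-Search reports a bit score and an E-value for each hit but no identity figure. As a rough illustration of how similar the query is to the model consensus, the sketch below computes percent identity over the gap-free columns of the ZnF_C2HC alignment above. This is one of several common identity conventions and is an assumption of this example, not a value taken from the results page.

```python
def percent_identity(row_a, row_b):
    """Percent of gap-free alignment columns where both rows carry the same residue."""
    if len(row_a) != len(row_b):
        raise ValueError("alignment rows must have equal length")
    pairs = [(a, b) for a, b in zip(row_a, row_b) if a != "-" and b != "-"]
    matches = sum(1 for a, b in pairs if a.upper() == b.upper())
    return 100.0 * matches / len(pairs)

# Rows copied from the ZnF_C2HC alignment above (query 292-308, model 1-17).
query_row = "KCRNCGKLGHWAKDCRS"
model_row = "KCYNCGKEGHIARDCPS"

identity = percent_identity(query_row, model_row)
print(f"{identity:.1f}% identity over {len(query_row)} columns")
```

For a 17-residue model, a high identity still yields only a modest bit score (41.66 here), which is consistent with the comparatively weak E-value of this hit.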
DUF4219 super family cl47255
Domain of unknown function (DUF4219); This domain is very short and is found at the N-terminus ...
49-75 1.02e-04

Domain of unknown function (DUF4219); This domain is very short and is found at the N-terminus of many Gag-pol polyproteins and related proteins. There is a highly conserved YxxWxxxM sequence motif.


The actual alignment was detected with superfamily member pfam13961:

Pssm-ID: 433608  Cd Length: 27  Bit Score: 40.57  E-value: 1.02e-04
                           10        20
                   ....*....|....*....|....*..
gi 116309068    49 LTKTNYNDWALLMKIKLQARCLWAAIE 75
Cdd:pfam13961    1 LDGDNYETWKLRMKLYLQAQDLWEVVE 27
 
Name Accession Description Interval E-value
RVT_2 pfam07727
Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually ...
949-1192 1.19e-145

Reverse transcriptase (RNA-dependent DNA polymerase); A reverse transcriptase gene is usually indicative of a mobile element such as a retrotransposon or retrovirus. Reverse transcriptases occur in a variety of mobile elements, including retrotransposons, retroviruses, group II introns, bacterial msDNAs, hepadnaviruses, and caulimoviruses. This Pfam entry includes reverse transcriptases not recognized by the pfam00078 model.


Pssm-ID: 400190  Cd Length: 243  Bit Score: 443.57  E-value: 1.19e-145
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 116309068   949 NKTWRLVDLPSGHRPIGLKWVYKLKKDAQgVVVKHKARLVAKGYVQRAGIDFDEVFAPVARLDSVRLLLALAAQEGWMVH 1028
Cdd:pfam07727    1 NETWTLVKLPKNVKPIGTTWVHTHKINDL-KEVQYKARLVAQGFRQIAGEDYDKVFSPVIRLSSVRLLLAIAAEYEWPVH 79
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 116309068  1029 HMDVKSAFLNGELIEEVYVVQPPGFEIDGQENKVYRLDKALYGLRQAPRAWNTKLDCTLKKLGFKQSPLEHGLYARGDGS 1108
Cdd:pfam07727   80 HMDVSSAFLNGDIDEEIYVKQPPGFNIDNESGKVWQLNKSLYGLKQAPYMWNTCITKVLMDLNFEPDTAESGMYCRGFGE 159
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 116309068  1109 GRLLVGVYVDDLVIVGGDSGMIKGFKEQMKAEFKMSDLGPLSFYLGIEVHQEAGIITLKQAAYASRIVEKAGLTGCNPCA 1188
Cdd:pfam07727  160 NKLIVGLYVDDMFITGSDITIINDFKLELAKHFKMKDLGDISEFLGIEFIQIAGGIRLSQHNYLNSVIKKFNLTNNNGKY 239

                   ....
gi 116309068  1189 TPME 1192
Cdd:pfam07727  240 TPII 243
RNase_HI_RT_Ty1 cd09272
Ty1/Copia family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) ...
1276-1415 5.39e-73

Ty1/Copia family of RNase HI in long terminal repeat retroelements; Ribonuclease H (RNase H) enzymes are divided into two major families, Type 1 and Type 2, based on amino acid sequence similarities and biochemical properties. RNase H is an endonuclease that cleaves the RNA strand of an RNA/DNA hybrid in a sequence non-specific manner in the presence of divalent cations. RNase H is widely present in various organisms, including bacteria, archaea, and eukaryotes. RNase HI has also been observed as an adjunct domain to the reverse transcriptase gene in retroviruses and in long terminal repeat (LTR)-bearing and non-LTR retrotransposons. RNase HI in LTR retrotransposons performs degradation of the original RNA template, generation of a polypurine tract (the primer for plus-strand DNA synthesis), and final removal of RNA primers from newly synthesized minus and plus strands. The catalytic residues for RNase H enzymatic activity, three aspartic acids and one glutamic acid (DEDD), are invariant across all RNase H domains. Phylogenetic analysis classifies RNase HI of LTR retroelements into five major families: Ty3/Gypsy, Ty1/Copia, Bel/Pao, DIRS1, and the vertebrate retroviruses. The Ty1/Copia family is widely distributed among the genomes of plants, fungi, and animals. RNase H has been explored as an anti-HIV drug target because its inactivation inhibits reverse transcription.


Pssm-ID: 260004  Cd Length: 140  Bit Score: 239.29  E-value: 5.39e-73
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 116309068 1276 QGYSDSDMAGDIDTRKSTTGVIFFLGKNPVSWQSQKQRVVALSSCESEYIAAATAACQGIWLARLLGDLRNAATEVVDLR 1355
Cdd:cd09272     1 EGYSDADWAGDPDDRRSTSGYVFFLGGGPISWKSKKQTTVALSSTEAEYIALAEAAKEALWLRRLLEELGIPLDGPTTIY 80
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|
gi 116309068 1356 VDNQSALALMKNPVFHDRSKHIQTKFHFIREAVENGEITPSYIGTEGQLADILTKPLSRI 1415
Cdd:cd09272    81 CDNQSAIALAKNPVFHSRTKHIDIRYHFIREKVEKGEIKVEYVPTEDQLADILTKPLPRP 140
gag_pre-integrs pfam13976
GAG-pre-integrase domain; This domain is found associated with retroviral insertion elements ...
487-554 1.64e-22

GAG-pre-integrase domain; This domain is found associated with retroviral insertion elements and lies just upstream of the integrase region on the polyproteins.


Pssm-ID: 372857  Cd Length: 67  Bit Score: 92.43  E-value: 1.64e-22
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 116309068   487 RLYMLDI-NLARPVCLAAHADEDAWRWHARLGHINFRALCKMGKEELVRGLPCLSqvDQVCEACLAGKH 554
Cdd:pfam13976    1 GLYLLDLsSVANSSIAVASKDDETWLWHRRLGHPSFKGLKKLVKKGLLPGLPISK--DLVCESCQLGKQ 67
transpos_IS481 NF033577
IS481 family transposase;
568-682 4.84e-16

IS481 family transposase;


Pssm-ID: 468094 [Multi-domain]  Cd Length: 283  Bit Score: 80.33  E-value: 4.84e-16
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 116309068  568 DVPLALLHGDLCGpITPATPSGNRYFLLLVDDYSRYMWVALLS--TKDAAPAAIKRTQAAAerksGRKLRALRTDRGGEF 645
Cdd:NF033577  125 AHPGELWHIDIKK-LGRIPDVGRLYLHTAIDDHSRFAYAELYPdeTAETAADFLRRAFAEH----GIPIRRVLTDNGSEF 199
                          90       100       110
                  ....*....|....*....|....*....|....*....
gi 116309068  646 TS--TQFAEYCAELGMRRELTAPYSPQQNGVVERRNQSV 682
Cdd:NF033577  200 RSraHGFELALAELGIEHRRTRPYHPQTNGKVERFHRTL 238
rve pfam00665
Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into ...
570-669 2.71e-14

Integrase core domain; Integrase mediates integration of a DNA copy of the viral genome into the host chromosome. Integrase is composed of three domains. The amino-terminal domain is a zinc-binding domain (pfam02022). This domain is the central catalytic domain. The carboxyl-terminal domain is a non-specific DNA-binding domain (pfam00552). The catalytic domain acts as an endonuclease when two nucleotides are removed from the 3' ends of the blunt-ended viral DNA made by reverse transcription. This domain also catalyzes the DNA strand transfer reaction of the 3' ends of the viral DNA to the 5' ends of the integration site.


Pssm-ID: 459897 [Multi-domain]  Cd Length: 98  Bit Score: 70.04  E-value: 2.71e-14
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 116309068   570 PLALLHGDLCgPITPATPSGNRYFLLLVDDYSRYMWVALLSTKDAAPAAIKRTQAAAERKSGRkLRALRTDRGGEFTSTQ 649
Cdd:pfam00665    1 PNQLWQGDFT-YIRIPGGGGKLYLLVIVDDFSREILAWALSSEMDAELVLDALERAIAFRGGV-PLIIHSDNGSEYTSKA 78
                           90       100
                   ....*....|....*....|
gi 116309068   650 FAEYCAELGMRRELTAPYSP 669
Cdd:pfam00665   79 FREFLKDLGIKPSFSRPGNP 98
Tra5 COG2801
Transposase InsO and inactivated derivatives [Mobilome: prophages, transposons];
585-682 4.31e-13

Transposase InsO and inactivated derivatives [Mobilome: prophages, transposons];


Pssm-ID: 442053 [Multi-domain]  Cd Length: 309  Bit Score: 71.72  E-value: 4.31e-13
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 116309068  585 ATPSGNRYFLLLVDDYSRYmWVAL-----LSTKDAApAAIKrtQAAAERKSGRKLRaLRTDRGGEFTSTQFAEYCAELGM 659
Cdd:COG2801   160 PTAEGWLYLAAVIDLFSRE-IVGWsvsdsMDAELVV-DALE--MAIERRGPPKPLI-LHSDNGSQYTSKAYQELLKKLGI 234
                          90       100
                  ....*....|....*....|...
gi 116309068  660 RRELTAPYSPQQNGVVERRNQSV 682
Cdd:COG2801   235 TQSMSRPGNPQDNAFIESFFGTL 257
transpos_IS3 NF033516
IS3 family transposase;
550-682 6.86e-11

IS3 family transposase;


Pssm-ID: 468052 [Multi-domain]  Cd Length: 369  Bit Score: 65.66  E-value: 6.86e-11
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 116309068  550 LAGKHRRSPFPRQALCRSDVPLA--LLHGDL---------CGPITP-ATPSGNRYFLLLVDDYSRYMwVAL-----LSTK 612
Cdd:NF033516  180 LLARRRRKRRPYTTDSGHVHPVApnLLNRQFtatrpnqvwVTDITYiRTAEGWLYLAVVLDLFSREI-VGWsvstsMSAE 258
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 116309068  613 DAApAAIKrtQAAAERKSGRKLRaLRTDRGGEFTSTQFAEYCAELGMRRELTAPYSPQQNGVVERRNQSV 682
Cdd:NF033516  259 LVL-DALE--MAIEWRGKPEGLI-LHSDNGSQYTSKAYREWLKEHGITQSMSRPGNCWDNAVAESFFGTL 324
Tra8 COG2826
Transposase and inactivated derivatives, IS30 family [Mobilome: prophages, transposons];
575-725 1.28e-10

Transposase and inactivated derivatives, IS30 family [Mobilome: prophages, transposons];


Pssm-ID: 442074 [Multi-domain]  Cd Length: 325  Bit Score: 64.52  E-value: 1.28e-10
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 116309068  575 HGDLcgpITPATpsGNRYFLLLVDDYSRYMWVALLSTKDAAPAAikRTQAAAERKSGRKLR-ALRTDRGGEFTstQFAEY 653
Cdd:COG2826   176 EGDL---IIGKR--GKSALLTLVERKSRFVILLKLPDKTAESVA--DALIRLLRKLPAFLRkSITTDNGKEFA--DHKEI 246
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 116309068  654 CAELGMRRELTAPYSPQQNGVVERRNqsvvGTARSMLkAKG------LPgmfwgEAINTAVYLLNRSSSKGIGGKTPY 725
Cdd:COG2826   247 EAALGIKVYFADPYSPWQRGTNENTN----GLLRQYF-PKGtdfstvTQ-----EELDAIADRLNNRPRKCLGYKTPA 314
Retrotran_gag_2 pfam14223
gag-polypeptide of LTR copia-type; This family is found in plants and fungi, and contains ...
90-220 6.70e-10

gag-polypeptide of LTR copia-type; This family is found in plants and fungi, and contains LTR polyproteins, or retrotransposons of the copia type.


Pssm-ID: 464108  Cd Length: 130  Bit Score: 58.40  E-value: 6.70e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 116309068    90 LDAICSAVPPEMIGTLATKASAREAWECIKTMRVGNDRIRKASAQKvraEYESLAFRGDETVEDFALRLTTIVNQLATLG 169
Cdd:pfam14223    1 LALIVLSLSDSLLRLVRNADTAKEAWDKLESTYERKSPANKLTLRR---QLHSLKMKEGESVLEHINKFEELVNKLSALG 77
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|.
gi 116309068   170 DPEPADKVVEKYLRVARPRFNQLILSIETLLDisTLSMEEVTGRLKAAEDS 220
Cdd:pfam14223   78 VEISDEDLVVKLLRSLPESYENFVTAIESSSD--KITLEELISKLLDEEER 126
ZnF_C2HC smart00343
zinc finger;
292-308 4.52e-05

zinc finger;


Pssm-ID: 197667 [Multi-domain]  Cd Length: 17  Bit Score: 41.66  E-value: 4.52e-05
                            10
                    ....*....|....*..
gi 116309068    292 KCRNCGKLGHWAKDCRS 308
Cdd:smart00343    1 KCYNCGKEGHIARDCPS 17
zf-CCHC pfam00098
Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following ...
291-308 8.92e-05

Zinc knuckle; The zinc knuckle is a zinc binding motif composed of the following: CX2CX4HX4C, where X can be any amino acid. The motifs are mostly from retroviral gag proteins (nucleocapsid). The prototype structure is from HIV. The family also contains members involved in eukaryotic gene regulation, such as C. elegans GLH-1. The structure is an 18-residue zinc finger.


Pssm-ID: 395050 [Multi-domain]  Cd Length: 18  Bit Score: 40.59  E-value: 8.92e-05
                           10
                   ....*....|....*...
gi 116309068   291 DKCRNCGKLGHWAKDCRS 308
Cdd:pfam00098    1 GKCYNCGEPGHIARDCPK 18
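The CX2CX4HX4C pattern in the zf-CCHC description translates directly into a regular expression. The sketch below, an illustration rather than part of CD-Search, locates the motif in the query segment aligned above and maps it back to protein coordinates; the variable names are hypothetical.

```python
import re

# CX2CX4HX4C: Cys, 2 of any residue, Cys, 4 of any, His, 4 of any, Cys.
ZINC_KNUCKLE = re.compile(r"C.{2}C.{4}H.{4}C")

# Query segment from the zf-CCHC alignment above; its first residue is 291.
segment = "DKCRNCGKLGHWAKDCRS"
segment_start = 291

match = ZINC_KNUCKLE.search(segment)
if match is None:
    raise SystemExit("motif not found")

# Convert 0-based string offsets into 1-based protein coordinates.
motif_start = segment_start + match.start()
motif_end = segment_start + match.end() - 1
print(match.group(), motif_start, motif_end)
```

The zinc-coordinating residues are the three cysteines and the histidine at the fixed positions of the pattern; the intervening residues are unconstrained.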
DUF4219 pfam13961
Domain of unknown function (DUF4219); This domain is very short and is found at the N-terminus ...
49-75 1.02e-04

Domain of unknown function (DUF4219); This domain is very short and is found at the N-terminus of many Gag-pol polyproteins and related proteins. There is a highly conserved YxxWxxxM sequence motif.


Pssm-ID: 433608  Cd Length: 27  Bit Score: 40.57  E-value: 1.02e-04
                           10        20
                   ....*....|....*....|....*..
gi 116309068    49 LTKTNYNDWALLMKIKLQARCLWAAIE 75
Cdd:pfam13961    1 LDGDNYETWKLRMKLYLQAQDLWEVVE 27
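The YxxWxxxM motif named in the DUF4219 description can be checked against both rows of the alignment above. This sketch assumes nothing beyond the two rows as printed and confirms that the motif sits in the same alignment columns of the query and the model consensus.

```python
import re

# YxxWxxxM: Tyr, 2 of any residue, Trp, 3 of any, Met.
DUF4219_MOTIF = re.compile(r"Y.{2}W.{3}M")

# Rows copied from the DUF4219 alignment above (query 49-75, model 1-27).
query_row = "LTKTNYNDWALLMKIKLQARCLWAAIE"
model_row = "LDGDNYETWKLRMKLYLQAQDLWEVVE"
query_start = 49

q = DUF4219_MOTIF.search(query_row)
m = DUF4219_MOTIF.search(model_row)

# The alignment has no gaps, so equal string offsets mean equal columns.
same_column = q is not None and m is not None and q.start() == m.start()

# 1-based protein coordinates of the motif in the query.
motif_span = (query_start + q.start(), query_start + q.end() - 1)
print(q.group(), m.group(), same_column, motif_span)
```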
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
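All twelve hits in the most complete list above fall below the E-value threshold of 0.01, yet the first list shows only seven of them. One plausible reading is that the shorter list keeps the best-scoring hit for each region of the query. The sketch below tests that reading with a simple greedy filter; the filter is an assumed simplification for illustration, not NCBI's documented algorithm. Intervals and E-values are copied from the hit lists above.

```python
# (name, start, end, e_value) for every hit in the most complete list above.
hits = [
    ("RVT_2", 949, 1192, 1.19e-145),
    ("RNase_HI_RT_Ty1", 1276, 1415, 5.39e-73),
    ("gag_pre-integrs", 487, 554, 1.64e-22),
    ("transpos_IS481", 568, 682, 4.84e-16),
    ("rve", 570, 669, 2.71e-14),
    ("Tra5", 585, 682, 4.31e-13),
    ("transpos_IS3", 550, 682, 6.86e-11),
    ("Tra8", 575, 725, 1.28e-10),
    ("Retrotran_gag_2", 90, 220, 6.70e-10),
    ("ZnF_C2HC", 292, 308, 4.52e-05),
    ("zf-CCHC", 291, 308, 8.92e-05),
    ("DUF4219", 49, 75, 1.02e-04),
]

E_VALUE_THRESHOLD = 0.01  # from the search parameters above
passing = [h for h in hits if h[3] <= E_VALUE_THRESHOLD]

def overlaps(a, b):
    """True when two closed intervals on the query share at least one residue."""
    return a[1] <= b[2] and b[1] <= a[2]

def best_non_overlapping(candidates):
    """Greedy filter (assumed): walk hits from best to worst E-value and keep
    a hit only if it overlaps none of the hits already kept."""
    kept = []
    for hit in sorted(candidates, key=lambda h: h[3]):
        if not any(overlaps(hit, k) for k in kept):
            kept.append(hit)
    return kept

kept_names = [h[0] for h in best_non_overlapping(passing)]
print(len(passing), kept_names)
```

Under this assumption the filter reproduces the seven names of the first list: the integrase-region hits (rve, Tra5, transpos_IS3, Tra8) are displaced by transpos_IS481, and zf-CCHC by ZnF_C2HC.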
