Conserved domains on  [gi|1958645862|ref|XP_038969065|]

teneurin-4 isoform X2 [Rattus norvegicus]

List of domain hits

Name Accession Description Interval E-value
Ten_N pfam06484
Teneurin Intracellular Region; This family is found in the intracellular N-terminal region of ...
154-553 1.23e-177

Teneurin Intracellular Region; This family is found in the intracellular N-terminal region of the Teneurin family of proteins. These proteins are encoded by 'pair-rule' genes and are involved in tissue patterning, probably neural patterning in particular. The intracellular domain is cleaved in response to homophilic interaction of the extracellular domain, and translocates to the nucleus. There it probably carries out some transcriptional regulatory activity. The length of this region and its conservation suggest that there may be two structural domains here (personal obs: C Yeats).



Pssm-ID: 461932 [Multi-domain]  Cd Length: 367  Bit Score: 549.96  E-value: 1.23e-177
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862  154 SLT-RRRDAERRYTSSSADSEEGKGP-QKSYSSSETLKAYDQDARLAYGSRVKDMVPQESEEFCRTGTNFTLRELGLGEM 231
Cdd:pfam06484    1 SLTkRRRDKERRYTSSSADSEECRVPtQKSYSSSETLKAFDHDSRMLYGNRVKDMVHKEADEFSRQGQNFSLRELGICEP 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862  232 TPPHGTLYRTDIGLPHCGYSMGASSDADLEADTVLSPEHPVRLWGRSTRSGRSSCLSSRANSNLTLTDTEHENT---ETG 308
Cdd:pfam06484   81 SPRHGLAYCTEMGLPHRGYSISTGSDADTETDGPMSPEHAVRLWGRGTKSGRSSCLSSRSNSALTLTDTEHENKsdnENG 160
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862  309 APLHCSSASSTPIEQSPSPPPSPpaNESQRRLLGNGVAQPTPDSDSEEEFVPNSFLVKSGSASLGVAAnDHPSGLQNHPR 388
Cdd:pfam06484  161 PPIPPSSSSSSPVEQHSPPPPSL--NENQRPLLGNNASHPILDSDPDEEFSPNSYLVRTGSGPQSAPS-EQPPNFQNHSR 237
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862  389 LRTPPPPLPHAHTPNQHHAASINSLNRGNFTPRSNPSPAPTdHSLSGEPPagSAQEPTHAQDNWLLNSNIPLETRnlgkq 468
Cdd:pfam06484  238 LRTPPPPLPPPHKQNQHHHPSINSLNRSSLTNRRNPSPAPT-ASLPAELQ--STQESVQLQDSWVLNSNVPLETR----- 309
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862  469 pflgtlqdnliemdilsasrhdgaysdgHFLFKPG-GTSPLFCTTSPGYPLTSSTVYSPPPRPLPRSTFSRPAFNLKKPS 547
Cdd:pfam06484  310 ----------------------------HFLFKTGtGTTPLFCTASPGYPLTSGTVYSPPPRPLPRNTFSRPAFKLKKPY 361

                   ....*.
gi 1958645862  548 KYCNWK 553
Cdd:pfam06484  362 KYCSWK 367
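
The "Pssm-ID" summary line that precedes each alignment can be read programmatically. The following is a minimal Python sketch, not part of the NCBI output: the function names parse_summary and model_coverage are hypothetical, and the regular expression assumes the field order shown on this page. It also computes how much of the domain model an alignment spans, using the model coordinates printed at the start and end of the "Cdd:" rows.

```python
import re

# Summary line copied from the Ten_N (pfam06484) hit on this page.
SUMMARY = ("Pssm-ID: 461932 [Multi-domain]  Cd Length: 367  "
           "Bit Score: 549.96  E-value: 1.23e-177")

# Assumed field order: Pssm-ID, optional [Multi-domain] tag, Cd Length, Bit Score, E-value.
_PATTERN = re.compile(
    r"Pssm-ID:\s*(\d+).*?Cd Length:\s*(\d+)\s+"
    r"Bit Score:\s*([\d.]+)\s+E-value:\s*(\S+)"
)

def parse_summary(line):
    """Return the numeric fields of one summary line as a dict."""
    match = _PATTERN.search(line)
    if match is None:
        raise ValueError("not a Pssm-ID summary line: %r" % line)
    return {
        "pssm_id": int(match.group(1)),
        "multi_domain": "[Multi-domain]" in line,
        "cd_length": int(match.group(2)),
        "bit_score": float(match.group(3)),
        "e_value": float(match.group(4)),
    }

def model_coverage(model_start, model_end, cd_length):
    """Fraction of the domain model spanned by the alignment (1-based, inclusive)."""
    return (model_end - model_start + 1) / cd_length

if __name__ == "__main__":
    info = parse_summary(SUMMARY)
    # The Ten_N alignment runs from model position 1 to 367 of 367.
    print(info, model_coverage(1, 367, info["cd_length"]))
```

By the same calculation, the RhsA (COG3209) alignment on this page spans model positions 185 to 1020 of 1103, so it covers only part of that model.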
NHL super family cl18310
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in ...
1435-1769 3.00e-46

NHL repeat unit of beta-propeller proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically six. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have catalytic activity in peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat, and this interaction is mediated by the NHL repeats.


The actual alignment was detected with superfamily member cd14953:

Pssm-ID: 302697 [Multi-domain]  Cd Length: 323  Bit Score: 170.79  E-value: 3.00e-46
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1435 GLADGNKLLA----PVALTCGSDGSLYVGDF--NYIRRIFPSGNVTNIL--------------EMSHSPahkYYLATDPm 1494
Cdd:cd14953     11 GFSGGGGTAArfnsPSGVAVDAAGNLYVADRgnHRIRKITPDGVVTTVAgtgtagfadgggaaAQFNTP---SGVAVDA- 86
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1495 SGAVFLSDTNSRRVFKIKSTTVVKdlvknseVVAGTGDQclpfddtRCGDGGKATEATLTNPRGITVDKFGLIYFVDGT- 1573
Cdd:cd14953     87 AGNLYVADTGNHRIRKITPDGVVS-------TLAGTGTA-------GFSDDGGATAAQFNYPTGVAVDAAGNLYVADTGn 152
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1574 -MIRRVDQNGIISTLLGsndlTSARPLSCDSVMdiSQVRLEWPTDLAINPMDNsLYVLD--NNVVLQISENHQVRIVAGR 1650
Cdd:cd14953    153 hRIRKITPDGVVTTVAG----TGGAGYAGDGPA--TAAQFNNPTGVAVDAAGN-LYVADrgNHRIRKITPDGVVTTVAGT 225
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1651 pmhcqvpGIDHFLLSKVAIHATLESATALAVSHNGVLYIAETDEkkiNRIRQVTTSGEISLVAGAPSGcdckndancdcF 1730
Cdd:cd14953    226 -------GTAGFSGDGGATAAQLNNPTGVAVDAAGNLYVADSGN---HRIRKITPAGVVTTVAGGGAG-----------F 284
                          330       340       350
                   ....*....|....*....|....*....|....*....
gi 1958645862 1731 SGDDGYAKDAKLNTPSSLAVCADGELYVADLGNIRIRFI 1769
Cdd:cd14953    285 SGDGGPATSAQFNNPTGVAVDAAGNLYVADTGNNRIRKI 323
Tox-GHH pfam15636
GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH ...
2891-2968 3.37e-37

GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH/Endonuclease VII fold present in bacterial polymorphic toxin systems with a characteristic sG[HQ]H signature motif. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 6, type 7 or TcdB/TcaC-type secretion system. The metazoan teneurin proteins possess an inactive version of this domain at their C-terminus.



Pssm-ID: 464783  Cd Length: 78  Bit Score: 135.43  E-value: 3.37e-37
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1958645862 2891 EEKVRVLELARQRAVRQAWAREQQRLREGEEGLRAWTDGEKQQVLNTGRVQGYDGFFVTSVEQYPELSDSANNIHFMR 2968
Cdd:pfam15636    1 EERKRLLEHAKKRAVREAWHRERQLLRNGLPGSRDWTDEEKEELLSTGSVPGYDGEYIHPVEQYPELADDPSNIRFRK 78
RhsA COG3209
Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction ...
1793-2667 8.15e-34

Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction only];



Pssm-ID: 442442 [Multi-domain]  Cd Length: 1103  Bit Score: 143.74  E-value: 8.15e-34
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1793 YLFDTSGKHLYTQSLPTGDYLYNFTYTGDGDITHITDNNGNMVNVRRDSTGMPLWLVVPDGQVYWVTMGTNSALRSVTTQ 1872
Cdd:COG3209    185 GTGAVTLATGLAGSALLALGSGAILGGLAGAYSGSATTATGTALGTPASVAATVTGSATGAAGAGAAVATAATTLGGTTG 264
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1873 GHELAMMTYHGNSGLLATKSNENgWTTFYEYDSFGRLTNVTFPTGQVSSFRSDTDSSVHVQVETSSKDDVTITTNLSASG 1952
Cdd:COG3209    265 AGTGASGAGLDASTGTGGAGGSN-AAATAGGLGGAGLGSGGAGGGGTAGGTTTAAGTTGTAAVSGAADAGTTTTTGTGTG 343
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1953 AFYTLLQDQVRNSYYIGADGSLRLLLANGMEVALQSEPHLLAGTVNPTVGKRNVTLPIDNGLNLVEWRQRKEQARGQVTV 2032
Cdd:COG3209    344 GTTTTVGGGGSLTLGGYGAAGGLTTSVGAGGGGSTSGSTTTVGGGGTATGSGGGSSTTGVGAGTTTTSTTGGDGGPATAA 423
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2033 FGRRLRVHNRNLLSLDFDRVTRTEKIYDDHRKFTLRILYDQAGRPSLWSPSSRLNGVNVTYSPGGHIAGIQRGIMSERME 2112
Cdd:COG3209    424 GALTAGGTATGTGTGGGGTTAGTDATTTTGGAGASGTLTTTGGAATGATTGGGTEAGTGGGTLTSGSAGATTLGTDTTLD 503
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2113 YDQAGRITSRIFADgkmWSYTYLEKSMVLHLHSQRQYIFEFDKNDRLSSVTMPNVARQTLETIRSVGYYRNIYQPPEGNA 2192
Cdd:COG3209    504 DTLGGTTTTTAGAR---GLVVTTGTTLTLGTTTTATLSATDATGTGDTTTTGTVGTGTSTGTGGTGTVTTTGDGTGGAST 580
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2193 SVIQDFTEDGHLLHTFYLGTGRRVIYKYGKLSKLAETLYDTTKVSFTYDESAGMLKTVNLQNEGFTCTIRYRQIGPLIDR 2272
Cdd:COG3209    581 TTGTTGGTATTTTVTTTTTTSTAGTTTTTTSGYTRAGLTLTLGTGTASGLERATASTGSTTGGTTGTGVTTTGTTTTRAT 660
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2273 QIFRFTEEGMVNARFDYNYDNSFRVTSmQAVINETPLPIDLYRYDDVSGKTEQFGKFGVIYYDINQIITTAVMTHTKHFD 2352
Cdd:COG3209    661 GTTGTGTGVTAGLTTLATGGTTVGGGT-GTTSTATTGATTGGTETGTTVTTLAGGTTTRLGTTTTGGGGGTTTDGTGTGG 739
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2353 AYGRMKEVQYEIFRSLMYWmTVQYDNMGRVVKKELKVGPYANTTRYSYEYDADGQLQTVSINDKPLWRYSYDLNGNLHLL 2432
Cdd:COG3209    740 TTGTLTTTSTTTTTTAGAL-TYTYDALGRLTSETTPGGVTQGTYTTRYTYDALGRLTSVTYPDGETVTYTYDALGRLTSV 818
                          650       660       670       680       690       700       710       720
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2433 SPGNSARLTPL-----RYDLRDRITRLgdvqykmdEDGFLRQRGGDVFEYNSAGLLIKAYNRASGWsvRYRYDGLGRRVS 2507
Cdd:COG3209    819 ITVGSGGGTDLqdrtyTYDAAGNITSI--------TDALRAGTLTQTYTYDALGRLTSATDPGTTE--SYTYDANGNLTS 888
                          730       740       750       760       770       780       790       800
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2508 SKSSHSHHLQFFYADLtnPTKVTHlynhSSSEITSLYYDLQGHlfamelssgdefyiaCDNIGTPLAVFSGTGLMIKQIL 2587
Cdd:COG3209    889 RTDGGTTTYTYDALGR--LVSVTK----PDGTTTTYTYDALGH---------------TDHLGSVRALTDASGQVVWRYD 947
                          810       820       830       840       850       860       870       880
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2588 YTAYGEIYMDTNPNFQIIIGYHGGLYDPLTKLVHMGRRDYDVLAGRWTSPDhelwkrlsssSIVP---FHLYMFKNNNPI 2664
Cdd:COG3209    948 YDPFGNLLAETSGAAANPLRFTGQEYDAETGLYYNGARYYDPALGRFLSPD----------PIGLaggLNLYAYVGNNPV 1017

                   ...
gi 1958645862 2665 SNS 2667
Cdd:COG3209   1018 NYV 1020
acid_disulf_rpt NF033662
acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with ...
1046-1076 2.33e-09

acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with four nearly invariant Cys residues in a repeat length of about 35 amino acids.



Pssm-ID: 411265 [Multi-domain]  Cd Length: 32  Bit Score: 54.83  E-value: 2.33e-09
                           10        20        30
                   ....*....|....*....|....*....|.
gi 1958645862 1046 SMETACGDSKDNDGDGLVDCMDPDCCLQPLC 1076
Cdd:NF033662     2 ATDTTCSDGIDNDGDGLTDCADPDCAGNPVC 32
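
The description above mentions four nearly invariant Cys residues, and the short gap-free alignment makes this easy to check. The following Python sketch is an illustration only: the helper names are hypothetical, and it assumes both rows are the same length and contain no gaps, which holds for this alignment but not for most others on this page. It counts identical positions and reports, in query coordinates, the cysteines shared by query and model.

```python
# Aligned rows copied from the acid_disulf_rpt (NF033662) alignment above.
QUERY = "SMETACGDSKDNDGDGLVDCMDPDCCLQPLC"   # query residues 1046-1076
MODEL = "ATDTTCSDGIDNDGDGLTDCADPDCAGNPVC"   # model positions 2-32
QUERY_START = 1046

def count_identities(query, model):
    """Count columns where both rows carry the same uppercase residue."""
    if len(query) != len(model):
        raise ValueError("aligned rows must have equal length")
    return sum(1 for q, m in zip(query, model) if q == m and q.isupper())

def shared_cysteines(query, model, query_start):
    """Query coordinates of columns where both rows have a Cys.

    Assumes a gap-free alignment, so column index maps directly to residue number.
    """
    return [query_start + i
            for i, (q, m) in enumerate(zip(query, model))
            if q == "C" and m == "C"]

if __name__ == "__main__":
    identical = count_identities(QUERY, MODEL)
    print("%d of %d positions identical" % (identical, len(QUERY)))
    print("shared Cys at query residues:", shared_cysteines(QUERY, MODEL, QUERY_START))
```

For alignments that contain gaps ("-") or lowercase unaligned residues, the coordinate mapping would need to skip those columns; that case is deliberately left out of this sketch.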
DSL super family cl19567
Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain ...
959-1002 4.33e-05

Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain defined by structure.


The actual alignment was detected with superfamily member pfam01414:

Pssm-ID: 473190  Cd Length: 46  Bit Score: 43.00  E-value: 4.33e-05
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....*..
gi 1958645862  959 CEDGWMGAACDqRACHPRCAE--HGTC-RDGKCECSPGWNGEHCTIA 1002
Cdd:pfam01414    1 CDENYYGSTCS-KFCRPRDDKfgHYTCdANGNKVCLPGWTGPYCDKP 46
Keratin_B2 super family cl37504
Keratin, high sulfur B2 protein; High sulfur proteins are cysteine-rich proteins synthesized ...
840-988 1.97e-03

Keratin, high sulfur B2 protein; High sulfur proteins are cysteine-rich proteins synthesized during the differentiation of hair matrix cells, and form hair fibres in association with hair keratin intermediate filaments. This family has been divided up into four regions, with the second region containing 8 copies of a short repeat. This family is also known as B2 or KAP1.


The actual alignment was detected with superfamily member pfam01500:

Pssm-ID: 366678 [Multi-domain]  Cd Length: 161  Bit Score: 41.70  E-value: 1.97e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862  840 TNQCIDVACSSHGTCimGTCICNPGYKGESCEEVDCMDPTCS----SRGVCVRGECHCSVgwgGTNCETPraTCLDQCS- 914
Cdd:pfam01500    6 TSFCGFPTCSTGGTC--GSGCCQPCCCQSSCCRPSCCQTSCCqpttFQSSCCRPTCQPCC---QTSCCQP--TCCQTSSc 78
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862  915 -------GHGTfLPDTGLCNCDPSWTGHDCSIEICAADCGGHGVCVGGTCrCEDGWMGAACdqraCHPRCAEHGTCRDGK 987
Cdd:pfam01500   79 qtgcggiGYGQ-EGSSGAVSSRTRWCRPDCRVEGTCLPPCCVVSCTPPTC-CQLHHAQASC----CRPSYCGQSCCRPAC 152

                   .
gi 1958645862  988 C 988
Cdd:pfam01500  153 C 153
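
The hits in the list above are ordered by E-value, not by position. The following Python sketch, offered as an illustration with hypothetical names (HITS, parse_hit), takes the name, interval and E-value of each hit as printed in that list, sorts the hits along the protein, and flags intervals that overlap. The short names stand in for the full "super family" labels.

```python
# (name, interval, E-value) copied from the first hit list on this page.
HITS = [
    ("Ten_N", "154-553", "1.23e-177"),
    ("NHL", "1435-1769", "3.00e-46"),
    ("Tox-GHH", "2891-2968", "3.37e-37"),
    ("RhsA", "1793-2667", "8.15e-34"),
    ("acid_disulf_rpt", "1046-1076", "2.33e-09"),
    ("DSL", "959-1002", "4.33e-05"),
    ("Keratin_B2", "840-988", "1.97e-03"),
]

def parse_hit(name, interval, e_value):
    """Turn one printed hit into a dict with numeric start, end and E-value."""
    start, end = (int(part) for part in interval.split("-"))
    return {"name": name, "start": start, "end": end, "e_value": float(e_value)}

# Sort by start coordinate to recover the N- to C-terminal domain order.
hits = sorted((parse_hit(*row) for row in HITS), key=lambda h: h["start"])
order = [h["name"] for h in hits]

# With hits sorted by start, a later hit overlaps an earlier one
# exactly when it begins at or before the earlier hit's end.
overlaps = [(a["name"], b["name"])
            for i, a in enumerate(hits)
            for b in hits[i + 1:]
            if b["start"] <= a["end"]]

if __name__ == "__main__":
    print("order along the protein:", " > ".join(order))
    print("overlapping hits:", overlaps)
```

On this page the only overlap is between the Keratin_B2 hit (840-988) and the DSL hit (959-1002), both of which fall in the cysteine-rich region.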
 
Name Accession Description Interval E-value
Ten_N pfam06484
Teneurin Intracellular Region; This family is found in the intracellular N-terminal region of ...
154-553 1.23e-177

Teneurin Intracellular Region; This family is found in the intracellular N-terminal region of the Teneurin family of proteins. These proteins are encoded by 'pair-rule' genes and are involved in tissue patterning, probably neural patterning in particular. The intracellular domain is cleaved in response to homophilic interaction of the extracellular domain, and translocates to the nucleus. There it probably carries out some transcriptional regulatory activity. The length of this region and its conservation suggest that there may be two structural domains here (personal obs: C Yeats).


Pssm-ID: 461932 [Multi-domain]  Cd Length: 367  Bit Score: 549.96  E-value: 1.23e-177
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862  154 SLT-RRRDAERRYTSSSADSEEGKGP-QKSYSSSETLKAYDQDARLAYGSRVKDMVPQESEEFCRTGTNFTLRELGLGEM 231
Cdd:pfam06484    1 SLTkRRRDKERRYTSSSADSEECRVPtQKSYSSSETLKAFDHDSRMLYGNRVKDMVHKEADEFSRQGQNFSLRELGICEP 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862  232 TPPHGTLYRTDIGLPHCGYSMGASSDADLEADTVLSPEHPVRLWGRSTRSGRSSCLSSRANSNLTLTDTEHENT---ETG 308
Cdd:pfam06484   81 SPRHGLAYCTEMGLPHRGYSISTGSDADTETDGPMSPEHAVRLWGRGTKSGRSSCLSSRSNSALTLTDTEHENKsdnENG 160
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862  309 APLHCSSASSTPIEQSPSPPPSPpaNESQRRLLGNGVAQPTPDSDSEEEFVPNSFLVKSGSASLGVAAnDHPSGLQNHPR 388
Cdd:pfam06484  161 PPIPPSSSSSSPVEQHSPPPPSL--NENQRPLLGNNASHPILDSDPDEEFSPNSYLVRTGSGPQSAPS-EQPPNFQNHSR 237
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862  389 LRTPPPPLPHAHTPNQHHAASINSLNRGNFTPRSNPSPAPTdHSLSGEPPagSAQEPTHAQDNWLLNSNIPLETRnlgkq 468
Cdd:pfam06484  238 LRTPPPPLPPPHKQNQHHHPSINSLNRSSLTNRRNPSPAPT-ASLPAELQ--STQESVQLQDSWVLNSNVPLETR----- 309
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862  469 pflgtlqdnliemdilsasrhdgaysdgHFLFKPG-GTSPLFCTTSPGYPLTSSTVYSPPPRPLPRSTFSRPAFNLKKPS 547
Cdd:pfam06484  310 ----------------------------HFLFKTGtGTTPLFCTASPGYPLTSGTVYSPPPRPLPRNTFSRPAFKLKKPY 361

                   ....*.
gi 1958645862  548 KYCNWK 553
Cdd:pfam06484  362 KYCSWK 367
NHL_like_1 cd14953
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat ...
1435-1769 3.00e-46

Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically six. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.


Pssm-ID: 271323 [Multi-domain]  Cd Length: 323  Bit Score: 170.79  E-value: 3.00e-46
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1435 GLADGNKLLA----PVALTCGSDGSLYVGDF--NYIRRIFPSGNVTNIL--------------EMSHSPahkYYLATDPm 1494
Cdd:cd14953     11 GFSGGGGTAArfnsPSGVAVDAAGNLYVADRgnHRIRKITPDGVVTTVAgtgtagfadgggaaAQFNTP---SGVAVDA- 86
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1495 SGAVFLSDTNSRRVFKIKSTTVVKdlvknseVVAGTGDQclpfddtRCGDGGKATEATLTNPRGITVDKFGLIYFVDGT- 1573
Cdd:cd14953     87 AGNLYVADTGNHRIRKITPDGVVS-------TLAGTGTA-------GFSDDGGATAAQFNYPTGVAVDAAGNLYVADTGn 152
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1574 -MIRRVDQNGIISTLLGsndlTSARPLSCDSVMdiSQVRLEWPTDLAINPMDNsLYVLD--NNVVLQISENHQVRIVAGR 1650
Cdd:cd14953    153 hRIRKITPDGVVTTVAG----TGGAGYAGDGPA--TAAQFNNPTGVAVDAAGN-LYVADrgNHRIRKITPDGVVTTVAGT 225
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1651 pmhcqvpGIDHFLLSKVAIHATLESATALAVSHNGVLYIAETDEkkiNRIRQVTTSGEISLVAGAPSGcdckndancdcF 1730
Cdd:cd14953    226 -------GTAGFSGDGGATAAQLNNPTGVAVDAAGNLYVADSGN---HRIRKITPAGVVTTVAGGGAG-----------F 284
                          330       340       350
                   ....*....|....*....|....*....|....*....
gi 1958645862 1731 SGDDGYAKDAKLNTPSSLAVCADGELYVADLGNIRIRFI 1769
Cdd:cd14953    285 SGDGGPATSAQFNNPTGVAVDAAGNLYVADTGNNRIRKI 323
Tox-GHH pfam15636
GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH ...
2891-2968 3.37e-37

GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH/Endonuclease VII fold present in bacterial polymorphic toxin systems with a characteristic sG[HQ]H signature motif. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 6, type 7 or TcdB/TcaC-type secretion system. The metazoan teneurin proteins possess an inactive version of this domain at their C-terminus.


Pssm-ID: 464783  Cd Length: 78  Bit Score: 135.43  E-value: 3.37e-37
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1958645862 2891 EEKVRVLELARQRAVRQAWAREQQRLREGEEGLRAWTDGEKQQVLNTGRVQGYDGFFVTSVEQYPELSDSANNIHFMR 2968
Cdd:pfam15636    1 EERKRLLEHAKKRAVREAWHRERQLLRNGLPGSRDWTDEEKEELLSTGSVPGYDGEYIHPVEQYPELADDPSNIRFRK 78
RhsA COG3209
Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction ...
1793-2667 8.15e-34

Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction only];


Pssm-ID: 442442 [Multi-domain]  Cd Length: 1103  Bit Score: 143.74  E-value: 8.15e-34
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1793 YLFDTSGKHLYTQSLPTGDYLYNFTYTGDGDITHITDNNGNMVNVRRDSTGMPLWLVVPDGQVYWVTMGTNSALRSVTTQ 1872
Cdd:COG3209    185 GTGAVTLATGLAGSALLALGSGAILGGLAGAYSGSATTATGTALGTPASVAATVTGSATGAAGAGAAVATAATTLGGTTG 264
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1873 GHELAMMTYHGNSGLLATKSNENgWTTFYEYDSFGRLTNVTFPTGQVSSFRSDTDSSVHVQVETSSKDDVTITTNLSASG 1952
Cdd:COG3209    265 AGTGASGAGLDASTGTGGAGGSN-AAATAGGLGGAGLGSGGAGGGGTAGGTTTAAGTTGTAAVSGAADAGTTTTTGTGTG 343
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1953 AFYTLLQDQVRNSYYIGADGSLRLLLANGMEVALQSEPHLLAGTVNPTVGKRNVTLPIDNGLNLVEWRQRKEQARGQVTV 2032
Cdd:COG3209    344 GTTTTVGGGGSLTLGGYGAAGGLTTSVGAGGGGSTSGSTTTVGGGGTATGSGGGSSTTGVGAGTTTTSTTGGDGGPATAA 423
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2033 FGRRLRVHNRNLLSLDFDRVTRTEKIYDDHRKFTLRILYDQAGRPSLWSPSSRLNGVNVTYSPGGHIAGIQRGIMSERME 2112
Cdd:COG3209    424 GALTAGGTATGTGTGGGGTTAGTDATTTTGGAGASGTLTTTGGAATGATTGGGTEAGTGGGTLTSGSAGATTLGTDTTLD 503
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2113 YDQAGRITSRIFADgkmWSYTYLEKSMVLHLHSQRQYIFEFDKNDRLSSVTMPNVARQTLETIRSVGYYRNIYQPPEGNA 2192
Cdd:COG3209    504 DTLGGTTTTTAGAR---GLVVTTGTTLTLGTTTTATLSATDATGTGDTTTTGTVGTGTSTGTGGTGTVTTTGDGTGGAST 580
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2193 SVIQDFTEDGHLLHTFYLGTGRRVIYKYGKLSKLAETLYDTTKVSFTYDESAGMLKTVNLQNEGFTCTIRYRQIGPLIDR 2272
Cdd:COG3209    581 TTGTTGGTATTTTVTTTTTTSTAGTTTTTTSGYTRAGLTLTLGTGTASGLERATASTGSTTGGTTGTGVTTTGTTTTRAT 660
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2273 QIFRFTEEGMVNARFDYNYDNSFRVTSmQAVINETPLPIDLYRYDDVSGKTEQFGKFGVIYYDINQIITTAVMTHTKHFD 2352
Cdd:COG3209    661 GTTGTGTGVTAGLTTLATGGTTVGGGT-GTTSTATTGATTGGTETGTTVTTLAGGTTTRLGTTTTGGGGGTTTDGTGTGG 739
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2353 AYGRMKEVQYEIFRSLMYWmTVQYDNMGRVVKKELKVGPYANTTRYSYEYDADGQLQTVSINDKPLWRYSYDLNGNLHLL 2432
Cdd:COG3209    740 TTGTLTTTSTTTTTTAGAL-TYTYDALGRLTSETTPGGVTQGTYTTRYTYDALGRLTSVTYPDGETVTYTYDALGRLTSV 818
                          650       660       670       680       690       700       710       720
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2433 SPGNSARLTPL-----RYDLRDRITRLgdvqykmdEDGFLRQRGGDVFEYNSAGLLIKAYNRASGWsvRYRYDGLGRRVS 2507
Cdd:COG3209    819 ITVGSGGGTDLqdrtyTYDAAGNITSI--------TDALRAGTLTQTYTYDALGRLTSATDPGTTE--SYTYDANGNLTS 888
                          730       740       750       760       770       780       790       800
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2508 SKSSHSHHLQFFYADLtnPTKVTHlynhSSSEITSLYYDLQGHlfamelssgdefyiaCDNIGTPLAVFSGTGLMIKQIL 2587
Cdd:COG3209    889 RTDGGTTTYTYDALGR--LVSVTK----PDGTTTTYTYDALGH---------------TDHLGSVRALTDASGQVVWRYD 947
                          810       820       830       840       850       860       870       880
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2588 YTAYGEIYMDTNPNFQIIIGYHGGLYDPLTKLVHMGRRDYDVLAGRWTSPDhelwkrlsssSIVP---FHLYMFKNNNPI 2664
Cdd:COG3209    948 YDPFGNLLAETSGAAANPLRFTGQEYDAETGLYYNGARYYDPALGRFLSPD----------PIGLaggLNLYAYVGNNPV 1017

                   ...
gi 1958645862 2665 SNS 2667
Cdd:COG3209   1018 NYV 1020
Vgb COG4257
Streptogramin lyase [Defense mechanisms];
1445-1769 1.58e-10

Streptogramin lyase [Defense mechanisms];


Pssm-ID: 443399 [Multi-domain]  Cd Length: 270  Bit Score: 64.66  E-value: 1.58e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1445 PVALTCGSDGSLYVGDF--NYIRRIFP-SGNVTNILEMSHSPAHKyyLATDPmSGAVFLSDTNSRRVFKIKSTTvvkdlv 1521
Cdd:COG4257     19 PRDVAVDPDGAVWFTDQggGRIGRLDPaTGEFTEYPLGGGSGPHG--IAVDP-DGNLWFTDNGNNRIGRIDPKT------ 89
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1522 KNSEVVAGTGDQCLPFddtrcgdggkateatltnprGITVDKFGLIYFVDGT--MIRRVD-QNGIISTLLGsnDLTSARp 1598
Cdd:COG4257     90 GEITTFALPGGGSNPH--------------------GIAFDPDGNLWFTDQGgnRIGRLDpATGEVTEFPL--PTGGAG- 146
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1599 lscdsvmdisqvrlewPTDLAINPmDNSLYVLDNnvvlqisENHQVRIVAGRPMHcqvpgidhflLSKVAIHATLESATA 1678
Cdd:COG4257    147 ----------------PYGIAVDP-DGNLWVTDF-------GANAIGRIDPDTGT----------LTEYALPTPGAGPRG 192
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1679 LAVSHNGVLYIAETDEKKINRIRqvTTSGEISLVAGAPSGCDckndancdcfsgddgyakdaklntPSSLAVCADGELYV 1758
Cdd:COG4257    193 LAVDPDGNLWVADTGSGRIGRFD--PKTGTVTEYPLPGGGAR------------------------PYGVAVDGDGRVWF 246
                          330
                   ....*....|.
gi 1958645862 1759 ADLGNIRIRFI 1769
Cdd:COG4257    247 AESGANRIVRF 257
Rhs_assc_core TIGR03696
RHS repeat-associated core domain; This model represents a conserved unique core sequence ...
2588-2667 2.66e-10

RHS repeat-associated core domain; This model represents a conserved unique core sequence shared by large numbers of proteins. It is occasional in the Archaea (Methanosarcina barkeri) but common in bacteria and eukaryotes. Most fall into two large classes. One class consists of long proteins in which two classes of repeats are abundant: an FG-GAP repeat (pfam01839) class, and an RHS repeat (pfam05593) or YD repeat (TIGR01643). This class includes secreted bacterial insecticidal toxins and intercellular signalling proteins such as the teneurins in animals. The other class consists of uncharacterized proteins shorter than 400 amino acids, where this core domain of about 75 amino acids tends to occur in the N-terminal half. Over twenty such proteins are found in Pseudomonas putida alone; little sequence similarity or repeat structure is found among these proteins outside the region modeled by this domain.


Pssm-ID: 274730 [Multi-domain]  Cd Length: 77  Bit Score: 58.67  E-value: 2.66e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2588 YTAYGEIyMDTNPNFQIIIGYHGGLYDPLTKLVHMGRRDYDVLAGRWTSPDhelwkrlssssivPF------HLYMFKNN 2661
Cdd:TIGR03696    1 YDPYGEV-LSESGAAPNPLRFTGQYYDAETGLYYNGARYYDPELGRFLSPD-------------PIglggglNLYAYVGN 66

                   ....*.
gi 1958645862 2662 NPISNS 2667
Cdd:TIGR03696   67 NPVNWV 72
acid_disulf_rpt NF033662
acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with ...
1046-1076 2.33e-09

acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with four nearly invariant Cys residues in a repeat length of about 35 amino acids.


Pssm-ID: 411265 [Multi-domain]  Cd Length: 32  Bit Score: 54.83  E-value: 2.33e-09
                           10        20        30
                   ....*....|....*....|....*....|.
gi 1958645862 1046 SMETACGDSKDNDGDGLVDCMDPDCCLQPLC 1076
Cdd:NF033662     2 ATDTTCSDGIDNDGDGLTDCADPDCAGNPVC 32
RHS_core NF041261
RHS element core protein;
2086-2504 4.48e-09

RHS element core protein;


Pssm-ID: 469161 [Multi-domain]  Cd Length: 1261  Bit Score: 62.71  E-value: 4.48e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2086 LNGVNVTYSPGGhiAGIQRGIMSE-------RMEYDQAGRITSRIFADGKMWSYTyleksmvLHLHSqrqyifefdknDR 2158
Cdd:NF041261   401 LNRREVLHTEGE--GGLKRVVKKEhadgsvtRSGYDAAGRLTAQTDAAGRRTEYS-------LNVVS-----------GD 460
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2159 LSSVTMPNVarqtletiRSVGYYRNiyqppegnasviqdfteDGHLLHTFYLGTGRRVIYKYGKLSKL-AETLYDTTKVS 2237
Cdd:NF041261   461 ITDITTPDG--------RETKFYYN-----------------DGNQLTSVTSPDGLESRREYDEPGRLvSETSRSGETTR 515
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2238 FTYDESAGMLKTVNLQNEGFTCTIRYRQIGplidrQIFRFTEEGMVNARFDYNydnsfRVTSMQAVINETPlpIDLYR-Y 2316
Cdd:NF041261   516 YRYDDPHSELPATTTDATGSTKQMTWSRYG-----QLLAFTDCSGYQTRYEYD-----RFGQMTAVHREEG--ISTYRrY 583
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2317 DD----VSGKTEQfGKFGVIYY----DINQIITTAVMTHTKHFDAYGR-MKEVQYEIFRSLmywmtvQYDNMGRVVKKEL 2387
Cdd:NF041261   584 DNrgqlTSVKDAQ-GRETRYEYnaagDLTAVITPDGNRSETQYDAWGKaVSTTQGGLTRSM------EYDAAGRITTLTN 656
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2388 KvgpyaNTTRYSYEYDADGQLQTVSINDKPLWRYSYDLNGNlhLLSPGNSARLTPLRYDLRDRITRL---GDV--QYKMD 2462
Cdd:NF041261   657 E-----NGSHSTFLYDALDRLVQQRGFDGRTQRYHYDLTGK--LTQSEDEGLVTLWHYDESDRITHRtvnGEPaeQWQYD 729
                          410       420       430       440
                   ....*....|....*....|....*....|....*....|..
gi 1958645862 2463 EDGFLRQrggdvFEYNSAGLLIkaynrasgwSVRYRYDGLGR 2504
Cdd:NF041261   730 EHGWLTD-----ISHLSEGHRV---------AVHYGYDDKGR 757
RHS_core NF041261
RHS element core protein;
2314-2642 3.66e-06

RHS element core protein;


Pssm-ID: 469161 [Multi-domain]  Cd Length: 1261  Bit Score: 53.08  E-value: 3.66e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2314 YRYDDVSGKTEQFGKFGVIY---YDINQIITTAVMT-----HT----------KHFDAYGRMKEVQYEIFRSLmywmTVQ 2375
Cdd:NF041261   367 YRYDDTGRVTEQLNPAGLSYryqYEQDRITITDSLNrrevlHTegegglkrvvKKEHADGSVTRSGYDAAGRL----TAQ 442
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2376 YDNMGRVVKKELKV---------GPYANTTRYSYeyDADGQLQTVSINDKPLWRYSYDLNGNLhLLSPGNSARLTPLRYD 2446
Cdd:NF041261   443 TDAAGRRTEYSLNVvsgditditTPDGRETKFYY--NDGNQLTSVTSPDGLESRREYDEPGRL-VSETSRSGETTRYRYD 519
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2447 lrDRITRLGDVqyKMDEDGFLRQrggdvFEYNSAGLLIkAYNRASGWSVRYRYDGLGRRVSSKSSHSHHLqffYADLTNP 2526
Cdd:NF041261   520 --DPHSELPAT--TTDATGSTKQ-----MTWSRYGQLL-AFTDCSGYQTRYEYDRFGQMTAVHREEGIST---YRRYDNR 586
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2527 TKVTHLYNHSSSEiTSLYYDLQGHLFAMELSSGDEFYIACDNIGTPLAVFSGtGLMiKQILYTAYGEIYMDTNPNfqiii 2606
Cdd:NF041261   587 GQLTSVKDAQGRE-TRYEYNAAGDLTAVITPDGNRSETQYDAWGKAVSTTQG-GLT-RSMEYDAAGRITTLTNEN----- 658
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|....*...
gi 1958645862 2607 GYHGG-LYDPLTKLVHMG-------RRDYDvLAGRWTSPDHE----LW 2642
Cdd:NF041261   659 GSHSTfLYDALDRLVQQRgfdgrtqRYHYD-LTGKLTQSEDEglvtLW 705
PLN02919 PLN02919
haloacid dehalogenase-like hydrolase family protein
1489-1773 2.83e-05

haloacid dehalogenase-like hydrolase family protein


Pssm-ID: 215497 [Multi-domain]  Cd Length: 1057  Bit Score: 49.85  E-value: 2.83e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1489 LATDPMSGAVFLSDTNSRRVfkiksttVVKDLVKNSEV-VAGTGDQCL---PFDDtrcgdggkateATLTNPRGITVDKF 1564
Cdd:PLN02919   573 LAIDLLNNRLFISDSNHNRI-------VVTDLDGNFIVqIGSTGEEGLrdgSFED-----------ATFNRPQGLAYNAK 634
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1565 GLIYFVDGT---MIRRVD-QNGIISTLLGS----NDLTSARPLScdsvmdiSQVrLEWPTDLAINPMDNSLYV------- 1629
Cdd:PLN02919   635 KNLLYVADTenhALREIDfVNETVRTLAGNgtkgSDYQGGKKGT-------SQV-LNSPWDVCFEPVNEKVYIamagqhq 706
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1630 -----LDNNVVLQISENHQVRIVAGR----PMHCQVPGI------DHFLLSK---------------------------- 1666
Cdd:PLN02919   707 iweynISDGVTRVFSGDGYERNLNGSsgtsTSFAQPSGIslspdlKELYIADsesssiraldlktggsrllaggdptfsd 786
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1667 ----------VAIHATLESATALAVSHNGVLYIAETDEKKINRIRQVTtsGEISLVAGAPSGcdckndancdcfSGDDGY 1736
Cdd:PLN02919   787 nlfkfgdhdgVGSEVLLQHPLGVLCAKDGQIYVADSYNHKIKKLDPAT--KRVTTLAGTGKA------------GFKDGK 852
                          330       340       350
                   ....*....|....*....|....*....|....*..
gi 1958645862 1737 AKDAKLNTPSSLAVCADGELYVADLGNIRIRFIRKNK 1773
Cdd:PLN02919   853 ALKAQLSEPAGLALGENGRLFVADTNNSLIRYLDLNK 889
DSL pfam01414
Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain ...
959-1002 4.33e-05

Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain defined by structure.


Pssm-ID: 460202  Cd Length: 46  Bit Score: 43.00  E-value: 4.33e-05
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....*..
gi 1958645862  959 CEDGWMGAACDqRACHPRCAE--HGTC-RDGKCECSPGWNGEHCTIA 1002
Cdd:pfam01414    1 CDENYYGSTCS-KFCRPRDDKfgHYTCdANGNKVCLPGWTGPYCDKP 46
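
Each alignment row carries a start coordinate, the aligned residues, and an end coordinate. A minimal sketch, assuming rows are laid out exactly as in this listing, for checking that the number of non-gap residues agrees with the coordinates. The helper names and the regular expression are assumptions made for illustration.

```python
import re

# Rows copied from the pfam01414 hit above: label, start, aligned residues, end.
QUERY_ROW = "gi 1958645862  959 CEDGWMGAACDqRACHPRCAE--HGTC-RDGKCECSPGWNGEHCTIA 1002"
MODEL_ROW = "Cdd:pfam01414    1 CDENYYGSTCS-KFCRPRDDKfgHYTCdANGNKVCLPGWTGPYCDKP 46"

ROW_PATTERN = re.compile(
    r"^(?P<label>.+?)\s+(?P<start>\d+)\s+(?P<aligned>[A-Za-z-]+)\s+(?P<end>\d+)$"
)


def parse_row(line: str) -> tuple[str, int, str, int]:
    """Split one alignment row into (label, start, aligned string, end)."""
    match = ROW_PATTERN.match(line.strip())
    if match is None:
        raise ValueError(f"not an alignment row: {line!r}")
    return (
        match.group("label"),
        int(match.group("start")),
        match.group("aligned"),
        int(match.group("end")),
    )


def residue_count(aligned: str) -> int:
    """Count residues; '-' is a gap, lowercase letters still count as residues."""
    return sum(1 for ch in aligned if ch != "-")


def coordinates_consistent(line: str) -> bool:
    """True when end - start + 1 equals the number of non-gap residues."""
    _, start, aligned, end = parse_row(line)
    return end - start + 1 == residue_count(aligned)


if __name__ == "__main__":
    for row in (QUERY_ROW, MODEL_ROW):
        print(parse_row(row)[0], coordinates_consistent(row))
```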
NHL_like_2 cd14957
Uncharacterized NHL-repeat domain in bacterial and archaeal proteins; The NHL (NCL-1, HT2A and ...
1679-1865 1.51e-04

Uncharacterized NHL-repeat domain in bacterial and archaeal proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.


Pssm-ID: 271327 [Multi-domain]  Cd Length: 280  Bit Score: 46.49  E-value: 1.51e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1679 LAVSHNGVLYIAETDEkkiNRIRQVTTSGEISLVAGapsgcdckndancdcfSGDDGyakDAKLNTPSSLAVCADGELYV 1758
Cdd:cd14957     23 IAVDSAGNIYVADTGN---NRIQVFTSSGVYSYSIG----------------SGGTG---SGQFNSPYGIAVDSNGNIYV 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1759 ADLGNIRIRfirknkpVLNTQNMYElsspidqelYLFDTSGkhlytQSLPTGDYLYNFTYTGDGDItHITDNNGNMVNVr 1838
Cdd:cd14957     81 ADTDNNRIQ-------VFNSSGVYQ---------YSIGTGG-----SGDGQFNGPYGIAVDSNGNI-YVADTGNHRIQV- 137
                          170       180
                   ....*....|....*....|....*..
gi 1958645862 1839 RDSTGmplwlvvpdgqVYWVTMGTNSA 1865
Cdd:cd14957    138 FTSSG-----------TFSYSIGSGGT 153
RHS_repeat pfam05593
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ...
1886-1918 5.54e-04

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 39.50  E-value: 5.54e-04
                           10        20        30
                   ....*....|....*....|....*....|...
gi 1958645862 1886 GLLATKSNENGWTTFYEYDSFGRLTNVTFPTGQ 1918
Cdd:pfam05593    5 GRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDGT 37
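
The "Pssm-ID" summary line of each hit packs several fields into one line. The sketch below is one way to pull them apart and to estimate how much of the model an alignment spans (here model positions 5-37 of a 37-column model). The `HitSummary` class and the function names are hypothetical, and the "[Multi-domain]" tag is treated as optional because some hits in this listing omit it.

```python
import re
from dataclasses import dataclass

# Summary line copied from the pfam05593 hit above.
SUMMARY = "Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 39.50  E-value: 5.54e-04"


@dataclass
class HitSummary:
    pssm_id: int
    multi_domain: bool
    cd_length: int
    bit_score: float
    e_value: float


SUMMARY_PATTERN = re.compile(
    r"Pssm-ID:\s*(?P<pssm>\d+)\s*(?P<multi>\[Multi-domain\])?\s*"
    r"Cd Length:\s*(?P<length>\d+)\s*"
    r"Bit Score:\s*(?P<bits>[\d.]+)\s*"
    r"E-value:\s*(?P<evalue>[\d.eE+-]+)"
)


def parse_summary(line: str) -> HitSummary:
    """Parse one 'Pssm-ID: ...' summary line into a HitSummary record."""
    match = SUMMARY_PATTERN.search(line)
    if match is None:
        raise ValueError(f"not a summary line: {line!r}")
    return HitSummary(
        pssm_id=int(match.group("pssm")),
        multi_domain=match.group("multi") is not None,
        cd_length=int(match.group("length")),
        bit_score=float(match.group("bits")),
        e_value=float(match.group("evalue")),
    )


def model_coverage(model_start: int, model_end: int, cd_length: int) -> float:
    """Fraction of model columns spanned by the aligned model positions."""
    return (model_end - model_start + 1) / cd_length


if __name__ == "__main__":
    hit = parse_summary(SUMMARY)
    print(hit)
    print(f"model coverage: {model_coverage(5, 37, hit.cd_length):.2f}")
```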
Keratin_B2 pfam01500
Keratin, high sulfur B2 protein; High sulfur proteins are cysteine-rich proteins synthesized ...
840-988 1.97e-03

Keratin, high sulfur B2 protein; High sulfur proteins are cysteine-rich proteins synthesized during the differentiation of hair matrix cells, and form hair fibres in association with hair keratin intermediate filaments. This family has been divided up into four regions, with the second region containing 8 copies of a short repeat. This family is also known as B2 or KAP1.


Pssm-ID: 366678 [Multi-domain]  Cd Length: 161  Bit Score: 41.70  E-value: 1.97e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862  840 TNQCIDVACSSHGTCimGTCICNPGYKGESCEEVDCMDPTCS----SRGVCVRGECHCSVgwgGTNCETPraTCLDQCS- 914
Cdd:pfam01500    6 TSFCGFPTCSTGGTC--GSGCCQPCCCQSSCCRPSCCQTSCCqpttFQSSCCRPTCQPCC---QTSCCQP--TCCQTSSc 78
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862  915 -------GHGTfLPDTGLCNCDPSWTGHDCSIEICAADCGGHGVCVGGTCrCEDGWMGAACdqraCHPRCAEHGTCRDGK 987
Cdd:pfam01500   79 qtgcggiGYGQ-EGSSGAVSSRTRWCRPDCRVEGTCLPPCCVVSCTPPTC-CQLHHAQASC----CRPSYCGQSCCRPAC 152

                   .
gi 1958645862  988 C 988
Cdd:pfam01500  153 C 153
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
30-320 2.67e-03

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 43.62  E-value: 2.67e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862   30 GLRSPRELLLVSPELSSEPRPARSWAPLSNSESGGVSGTVPRLSAVlVPASPA---VAACSHESKPPCPLGSDGLGEGAA 106
Cdd:PHA03307    76 GTEAPANESRSTPTWSLSTLAPASPAREGSPTPPGPSSPDPPPPTP-PPASPPpspAPDLSEMLRPVGSPGPPPAASPPA 154
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862  107 GPADTNSSQSAGMEPDHSALSAAraqfVDVEEREPEAMDVKERKPYRSLTRRRDAERRYTSSSADSEEGKGPqksyssse 186
Cdd:PHA03307   155 AGASPAAVASDAASSRQAALPLS----SPEETARAPSSPPAEPPPSTPPAAASPRPPRRSSPISASASSPAP-------- 222
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862  187 tLKAYDQDARLayGSRVKDMVPQESEEFCRTGTNFTLRELGLGEMTPphgTLYRTDIG--LPHCGYSMGASSDADLEADT 264
Cdd:PHA03307   223 -APGRSAADDA--GASSSDSSSSESSGCGWGPENECPLPRPAPITLP---TRIWEASGwnGPSSRPGPASSSSSPRERSP 296
                          250       260       270       280       290       300
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862  265 VLSPEHPvRLWGRSTRSGRSSCLSSRANSNLTLTDTEHENTE----TGAPLHCSSASSTP 320
Cdd:PHA03307   297 SPSPSSP-GSGPAPSSPRASSSSSSSRESSSSSTSSSSESSRgaavSPGPSPSRSPSPSR 355
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
848-871 3.87e-03

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.


Pssm-ID: 238011  Cd Length: 38  Bit Score: 37.23  E-value: 3.87e-03
                           10        20
                   ....*....|....*....|....*...
gi 1958645862  848 CSSHGTCIMG----TCICNPGYKGESCE 871
Cdd:cd00054     11 CQNGGTCVNTvgsyRCSCPPGYTGRNCE 38
 
NHL_like_1 cd14953
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat ...
1435-1769 3.00e-46

Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.


Pssm-ID: 271323 [Multi-domain]  Cd Length: 323  Bit Score: 170.79  E-value: 3.00e-46
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1435 GLADGNKLLA----PVALTCGSDGSLYVGDF--NYIRRIFPSGNVTNIL--------------EMSHSPahkYYLATDPm 1494
Cdd:cd14953     11 GFSGGGGTAArfnsPSGVAVDAAGNLYVADRgnHRIRKITPDGVVTTVAgtgtagfadgggaaAQFNTP---SGVAVDA- 86
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1495 SGAVFLSDTNSRRVFKIKSTTVVKdlvknseVVAGTGDQclpfddtRCGDGGKATEATLTNPRGITVDKFGLIYFVDGT- 1573
Cdd:cd14953     87 AGNLYVADTGNHRIRKITPDGVVS-------TLAGTGTA-------GFSDDGGATAAQFNYPTGVAVDAAGNLYVADTGn 152
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1574 -MIRRVDQNGIISTLLGsndlTSARPLSCDSVMdiSQVRLEWPTDLAINPMDNsLYVLD--NNVVLQISENHQVRIVAGR 1650
Cdd:cd14953    153 hRIRKITPDGVVTTVAG----TGGAGYAGDGPA--TAAQFNNPTGVAVDAAGN-LYVADrgNHRIRKITPDGVVTTVAGT 225
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1651 pmhcqvpGIDHFLLSKVAIHATLESATALAVSHNGVLYIAETDEkkiNRIRQVTTSGEISLVAGAPSGcdckndancdcF 1730
Cdd:cd14953    226 -------GTAGFSGDGGATAAQLNNPTGVAVDAAGNLYVADSGN---HRIRKITPAGVVTTVAGGGAG-----------F 284
                          330       340       350
                   ....*....|....*....|....*....|....*....
gi 1958645862 1731 SGDDGYAKDAKLNTPSSLAVCADGELYVADLGNIRIRFI 1769
Cdd:cd14953    285 SGDGGPATSAQFNNPTGVAVDAAGNLYVADTGNNRIRKI 323
Tox-GHH pfam15636
GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH ...
2891-2968 3.37e-37

GHH signature containing HNH/Endo VII superfamily nuclease toxin; A predicted toxin of the HNH/Endonuclease VII fold present in bacterial polymorphic toxin systems with a characteristic sG[HQ]H signature motif. In bacterial polymorphic toxin systems, the toxin is exported by the type 2, type 6, type 7 or TcdB/TcaC-type secretion system. The metazoan teneurin proteins possess an inactive version of this domain at their C-terminus.


Pssm-ID: 464783  Cd Length: 78  Bit Score: 135.43  E-value: 3.37e-37
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1958645862 2891 EEKVRVLELARQRAVRQAWAREQQRLREGEEGLRAWTDGEKQQVLNTGRVQGYDGFFVTSVEQYPELSDSANNIHFMR 2968
Cdd:pfam15636    1 EERKRLLEHAKKRAVREAWHRERQLLRNGLPGSRDWTDEEKEELLSTGSVPGYDGEYIHPVEQYPELADDPSNIRFRK 78
NHL_like_1 cd14953
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat ...
1526-1769 9.59e-35

Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.


Pssm-ID: 271323 [Multi-domain]  Cd Length: 323  Bit Score: 137.28  E-value: 9.59e-35
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1526 VVAGTGdqclpfddTRCGDGGKATEATLTNPRGITVDKFGLIYFVDGT--MIRRVDQNGIISTLLG------SNDLTSAR 1597
Cdd:cd14953      3 TVAGSG--------TAGFSGGGGTAARFNSPSGVAVDAAGNLYVADRGnhRIRKITPDGVVTTVAGtgtagfADGGGAAA 74
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1598 PLSCdsvmdisqvrlewPTDLAINPMDNsLYVLD--NNVVLQISENHQVRIVAGRpmhcqvpGIDHFLLSKVAIHATLES 1675
Cdd:cd14953     75 QFNT-------------PSGVAVDAAGN-LYVADtgNHRIRKITPDGVVSTLAGT-------GTAGFSDDGGATAAQFNY 133
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1676 ATALAVSHNGVLYIAETDEkkiNRIRQVTTSGEISLVAGAPSGcdckndancdcFSGDDGYAKDAKLNTPSSLAVCADGE 1755
Cdd:cd14953    134 PTGVAVDAAGNLYVADTGN---HRIRKITPDGVVTTVAGTGGA-----------GYAGDGPATAAQFNNPTGVAVDAAGN 199
                          250
                   ....*....|....
gi 1958645862 1756 LYVADLGNIRIRFI 1769
Cdd:cd14953    200 LYVADRGNHRIRKI 213
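
Several hits in this listing cover overlapping stretches of the query, for example the two NHL_like_1 hits at 1435-1769 and 1526-1769. A small sketch, under the assumption that interval lines always have the form "start-end E-value" as shown here, for parsing such a line and measuring the overlap between two hits. The function names are illustrative only.

```python
def parse_interval_line(line: str) -> tuple[int, int, float]:
    """Parse a line such as '1526-1769 9.59e-35' into (start, end, e_value)."""
    interval, e_value = line.split()
    start, end = interval.split("-")
    return int(start), int(end), float(e_value)


def overlap(a: tuple[int, int], b: tuple[int, int]) -> int:
    """Number of query residues shared by two inclusive intervals (0 if disjoint)."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


if __name__ == "__main__":
    first = parse_interval_line("1435-1769 3.00e-46")
    second = parse_interval_line("1526-1769 9.59e-35")
    shared = overlap(first[:2], second[:2])
    print(f"hits share {shared} query residues")
```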
RhsA COG3209
Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction ...
1793-2667 8.15e-34

Uncharacterized conserved protein RhsA, contains 28 RHS repeats [General function prediction only];


Pssm-ID: 442442 [Multi-domain]  Cd Length: 1103  Bit Score: 143.74  E-value: 8.15e-34
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1793 YLFDTSGKHLYTQSLPTGDYLYNFTYTGDGDITHITDNNGNMVNVRRDSTGMPLWLVVPDGQVYWVTMGTNSALRSVTTQ 1872
Cdd:COG3209    185 GTGAVTLATGLAGSALLALGSGAILGGLAGAYSGSATTATGTALGTPASVAATVTGSATGAAGAGAAVATAATTLGGTTG 264
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1873 GHELAMMTYHGNSGLLATKSNENgWTTFYEYDSFGRLTNVTFPTGQVSSFRSDTDSSVHVQVETSSKDDVTITTNLSASG 1952
Cdd:COG3209    265 AGTGASGAGLDASTGTGGAGGSN-AAATAGGLGGAGLGSGGAGGGGTAGGTTTAAGTTGTAAVSGAADAGTTTTTGTGTG 343
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1953 AFYTLLQDQVRNSYYIGADGSLRLLLANGMEVALQSEPHLLAGTVNPTVGKRNVTLPIDNGLNLVEWRQRKEQARGQVTV 2032
Cdd:COG3209    344 GTTTTVGGGGSLTLGGYGAAGGLTTSVGAGGGGSTSGSTTTVGGGGTATGSGGGSSTTGVGAGTTTTSTTGGDGGPATAA 423
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2033 FGRRLRVHNRNLLSLDFDRVTRTEKIYDDHRKFTLRILYDQAGRPSLWSPSSRLNGVNVTYSPGGHIAGIQRGIMSERME 2112
Cdd:COG3209    424 GALTAGGTATGTGTGGGGTTAGTDATTTTGGAGASGTLTTTGGAATGATTGGGTEAGTGGGTLTSGSAGATTLGTDTTLD 503
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2113 YDQAGRITSRIFADgkmWSYTYLEKSMVLHLHSQRQYIFEFDKNDRLSSVTMPNVARQTLETIRSVGYYRNIYQPPEGNA 2192
Cdd:COG3209    504 DTLGGTTTTTAGAR---GLVVTTGTTLTLGTTTTATLSATDATGTGDTTTTGTVGTGTSTGTGGTGTVTTTGDGTGGAST 580
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2193 SVIQDFTEDGHLLHTFYLGTGRRVIYKYGKLSKLAETLYDTTKVSFTYDESAGMLKTVNLQNEGFTCTIRYRQIGPLIDR 2272
Cdd:COG3209    581 TTGTTGGTATTTTVTTTTTTSTAGTTTTTTSGYTRAGLTLTLGTGTASGLERATASTGSTTGGTTGTGVTTTGTTTTRAT 660
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2273 QIFRFTEEGMVNARFDYNYDNSFRVTSmQAVINETPLPIDLYRYDDVSGKTEQFGKFGVIYYDINQIITTAVMTHTKHFD 2352
Cdd:COG3209    661 GTTGTGTGVTAGLTTLATGGTTVGGGT-GTTSTATTGATTGGTETGTTVTTLAGGTTTRLGTTTTGGGGGTTTDGTGTGG 739
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2353 AYGRMKEVQYEIFRSLMYWmTVQYDNMGRVVKKELKVGPYANTTRYSYEYDADGQLQTVSINDKPLWRYSYDLNGNLHLL 2432
Cdd:COG3209    740 TTGTLTTTSTTTTTTAGAL-TYTYDALGRLTSETTPGGVTQGTYTTRYTYDALGRLTSVTYPDGETVTYTYDALGRLTSV 818
                          650       660       670       680       690       700       710       720
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2433 SPGNSARLTPL-----RYDLRDRITRLgdvqykmdEDGFLRQRGGDVFEYNSAGLLIKAYNRASGWsvRYRYDGLGRRVS 2507
Cdd:COG3209    819 ITVGSGGGTDLqdrtyTYDAAGNITSI--------TDALRAGTLTQTYTYDALGRLTSATDPGTTE--SYTYDANGNLTS 888
                          730       740       750       760       770       780       790       800
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2508 SKSSHSHHLQFFYADLtnPTKVTHlynhSSSEITSLYYDLQGHlfamelssgdefyiaCDNIGTPLAVFSGTGLMIKQIL 2587
Cdd:COG3209    889 RTDGGTTTYTYDALGR--LVSVTK----PDGTTTTYTYDALGH---------------TDHLGSVRALTDASGQVVWRYD 947
                          810       820       830       840       850       860       870       880
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2588 YTAYGEIYMDTNPNFQIIIGYHGGLYDPLTKLVHMGRRDYDVLAGRWTSPDhelwkrlsssSIVP---FHLYMFKNNNPI 2664
Cdd:COG3209    948 YDPFGNLLAETSGAAANPLRFTGQEYDAETGLYYNGARYYDPALGRFLSPD----------PIGLaggLNLYAYVGNNPV 1017

                   ...
gi 1958645862 2665 SNS 2667
Cdd:COG3209   1018 NYV 1020
NHL cd05819
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in ...
1440-1769 2.35e-18

NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have a catalytic activity in Peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats.


Pssm-ID: 271320 [Multi-domain]  Cd Length: 269  Bit Score: 87.76  E-value: 2.35e-18
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1440 NKLLAPVALTCGSDGSLYVGDF--NYIRRIFPSGN-VTNI------LEMSHSPAHkyyLATDPmSGAVFLSDTNSRRVFK 1510
Cdd:cd05819      5 GELNNPQGIAVDSSGNIYVADTgnNRIQVFDPDGNfITSFgsfgsgDGQFNEPAG---VAVDS-DGNLYVADTGNHRIQK 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1511 IKSTTVVKDlvknseVVAGTGDQCLPFDdtrcgdggkateatltNPRGITVDKFGLIYFVDgTM---IRRVDQNGIISTL 1587
Cdd:cd05819     81 FDPDGNFLA------SFGGSGDGDGEFN----------------GPRGIAVDSSGNIYVAD-TGnhrIQKFDPDGEFLTT 137
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1588 LGSNDLTSARplscdsvmdisqvrLEWPTDLAINPmDNSLYVLDnnvvlqiSENHQVRIVAgrpmhcqvPGiDHFLL--- 1664
Cdd:cd05819    138 FGSGGSGPGQ--------------FNGPTGVAVDS-DGNIYVAD-------TGNHRIQVFD--------PD-GNFLTtfg 186
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1665 SKVAIHATLESATALAVSHNGVLYIAETDEkkiNRIRQVTTSGEISLVAGapsgcdckndancdcfsgdDGYAKDAKLNT 1744
Cdd:cd05819    187 STGTGPGQFNYPTGIAVDSDGNIYVADSGN---NRVQVFDPDGAGFGGNG-------------------NFLGSDGQFNR 244
                          330       340
                   ....*....|....*....|....*
gi 1958645862 1745 PSSLAVCADGELYVADLGNIRIRFI 1769
Cdd:cd05819    245 PSGLAVDSDGNLYVADTGNNRIQVF 269
NHL cd05819
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in ...
1553-1863 2.98e-17

NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have a catalytic activity in Peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats.


Pssm-ID: 271320 [Multi-domain]  Cd Length: 269  Bit Score: 84.68  E-value: 2.98e-17
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1553 LTNPRGITVDKFGLIYFVDGTM--IRRVDQNGIISTLLGSNDltsarplscdsvmdISQVRLEWPTDLAINPmDNSLYVL 1630
Cdd:cd05819      7 LNNPQGIAVDSSGNIYVADTGNnrIQVFDPDGNFITSFGSFG--------------SGDGQFNEPAGVAVDS-DGNLYVA 71
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1631 D--NNVVLQISENHQVRIVAGRPmhcqvpGIDHFLLSkvaihatleSATALAVSHNGVLYIAETDEkkiNRIRQVTTSGE 1708
Cdd:cd05819     72 DtgNHRIQKFDPDGNFLASFGGS------GDGDGEFN---------GPRGIAVDSSGNIYVADTGN---HRIQKFDPDGE 133
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1709 ISLVAGAPSGCDckndancdcfsgddgyakdAKLNTPSSLAVCADGELYVADLGNIRIRFIrknkpvlntqnmyelsSPI 1788
Cdd:cd05819    134 FLTTFGSGGSGP-------------------GQFNGPTGVAVDSDGNIYVADTGNHRIQVF----------------DPD 178
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1789 DQELYLFDTSGKHLYTQSLPTG------DYLYnFTYTGDGDITHITDN------NGNmVNVRRDSTGMPLWLVV-PDGQV 1855
Cdd:cd05819    179 GNFLTTFGSTGTGPGQFNYPTGiavdsdGNIY-VADSGNNRVQVFDPDgagfggNGN-FLGSDGQFNRPSGLAVdSDGNL 256

                   ....*...
gi 1958645862 1856 YWVTMGTN 1863
Cdd:cd05819    257 YVADTGNN 264
NHL_like_1 cd14953
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat ...
1432-1638 4.54e-15

Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.


Pssm-ID: 271323 [Multi-domain]  Cd Length: 323  Bit Score: 79.11  E-value: 4.54e-15
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1432 SCNGLADGNKLLAPVALTCGSDGSLYVGDF--NYIRRIFPSGNVTNilemshspahkyylatdpmsgavflsdtnsrrvf 1509
Cdd:cd14953    176 AGDGPATAAQFNNPTGVAVDAAGNLYVADRgnHRIRKITPDGVVTT---------------------------------- 221
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1510 kiksttvvkdlvknsevVAGTGDQclPFddtrcGDGGKATEATLTNPRGITVDKFGLIYFVD---GTmIRRVDQNGIIST 1586
Cdd:cd14953    222 -----------------VAGTGTA--GF-----SGDGGATAAQLNNPTGVAVDAAGNLYVADsgnHR-IRKITPAGVVTT 276
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....
gi 1958645862 1587 LLGSndlTSARPLSCDSVmdiSQVRLEWPTDLAINPmDNSLYVLD--NNVVLQI 1638
Cdd:cd14953    277 VAGG---GAGFSGDGGPA---TSAQFNNPTGVAVDA-AGNLYVADtgNNRIRKI 323
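
Each hit in this list carries a one-line summary (Pssm-ID, optional [Multi-domain] flag, Cd Length, Bit Score, E-value) ahead of its alignment. The following is a minimal sketch of reading that line, assuming the plain-text layout shown on this page; the function name parse_summary and the returned field names are illustrative and not part of any NCBI tool.

```python
import re

# Pattern for the per-hit summary line as it appears in this listing.
# The "[Multi-domain]" flag is optional (some hits omit it).
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the summary fields as a dict, or None if the line does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m["pssm_id"]),
        "multi_domain": m["flag"] == "Multi-domain",
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "e_value": float(m["e_value"]),
    }


if __name__ == "__main__":
    line = ("Pssm-ID: 271323 [Multi-domain]  Cd Length: 323  "
            "Bit Score: 79.11  E-value: 4.54e-15")
    print(parse_summary(line))
```

Parsed this way, hits can be sorted or filtered by E-value without re-reading the alignment blocks.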
NHL cd05819
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in ...
1438-1700 8.97e-14

NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have a catalytic activity in Peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats.


Pssm-ID: 271320 [Multi-domain]  Cd Length: 269  Bit Score: 74.28  E-value: 8.97e-14
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1438 DGNKLLAPVALTCGSDGSLYVGDF--NYIRRIFPSGNVT-------NILEMSHSPahkYYLATDPmSGAVFLSDTNSRRV 1508
Cdd:cd05819     50 GDGQFNEPAGVAVDSDGNLYVADTgnHRIQKFDPDGNFLasfggsgDGDGEFNGP---RGIAVDS-SGNIYVADTGNHRI 125
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1509 FKIKSttvvkdlvkNSEVVAGTGdqclpfddtrcgdGGKATEATLTNPRGITVDKFGLIYFVDGT--MIRRVDQNGIIST 1586
Cdd:cd05819    126 QKFDP---------DGEFLTTFG-------------SGGSGPGQFNGPTGVAVDSDGNIYVADTGnhRIQVFDPDGNFLT 183
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1587 LLGSNDLTSArplscdsvmdisqvRLEWPTDLAINPMDNsLYVLD--NNVVLQISENHQVRIVAGrpmhcqvpgidhfll 1664
Cdd:cd05819    184 TFGSTGTGPG--------------QFNYPTGIAVDSDGN-IYVADsgNNRVQVFDPDGAGFGGNG--------------- 233
                          250       260       270
                   ....*....|....*....|....*....|....*.
gi 1958645862 1665 SKVAIHATLESATALAVSHNGVLYIAETDEKKINRI 1700
Cdd:cd05819    234 NFLGSDGQFNRPSGLAVDSDGNLYVADTGNNRIQVF 269
Vgb COG4257
Streptogramin lyase [Defense mechanisms];
1445-1769 1.58e-10

Streptogramin lyase [Defense mechanisms];


Pssm-ID: 443399 [Multi-domain]  Cd Length: 270  Bit Score: 64.66  E-value: 1.58e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1445 PVALTCGSDGSLYVGDF--NYIRRIFP-SGNVTNILEMSHSPAHKyyLATDPmSGAVFLSDTNSRRVFKIKSTTvvkdlv 1521
Cdd:COG4257     19 PRDVAVDPDGAVWFTDQggGRIGRLDPaTGEFTEYPLGGGSGPHG--IAVDP-DGNLWFTDNGNNRIGRIDPKT------ 89
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1522 KNSEVVAGTGDQCLPFddtrcgdggkateatltnprGITVDKFGLIYFVDGT--MIRRVD-QNGIISTLLGsnDLTSARp 1598
Cdd:COG4257     90 GEITTFALPGGGSNPH--------------------GIAFDPDGNLWFTDQGgnRIGRLDpATGEVTEFPL--PTGGAG- 146
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1599 lscdsvmdisqvrlewPTDLAINPmDNSLYVLDNnvvlqisENHQVRIVAGRPMHcqvpgidhflLSKVAIHATLESATA 1678
Cdd:COG4257    147 ----------------PYGIAVDP-DGNLWVTDF-------GANAIGRIDPDTGT----------LTEYALPTPGAGPRG 192
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1679 LAVSHNGVLYIAETDEKKINRIRqvTTSGEISLVAGAPSGCDckndancdcfsgddgyakdaklntPSSLAVCADGELYV 1758
Cdd:COG4257    193 LAVDPDGNLWVADTGSGRIGRFD--PKTGTVTEYPLPGGGAR------------------------PYGVAVDGDGRVWF 246
                          330
                   ....*....|.
gi 1958645862 1759 ADLGNIRIRFI 1769
Cdd:COG4257    247 AESGANRIVRF 257
Rhs_assc_core TIGR03696
RHS repeat-associated core domain; This model represents a conserved unique core sequence ...
2588-2667 2.66e-10

RHS repeat-associated core domain; This model represents a conserved unique core sequence shared by large numbers of proteins. It is occasional in the Archaea (Methanosarcina barkeri) but common in bacteria and eukaryotes. Most fall into two large classes. One class consists of long proteins in which two classes of repeats are abundant: an FG-GAP repeat (pfam01839) class, and an RHS repeat (pfam05593) or YD repeat (TIGR01643). This class includes secreted bacterial insecticidal toxins and intercellular signalling proteins such as the teneurins in animals. The other class consists of uncharacterized proteins shorter than 400 amino acids, where this core domain of about 75 amino acids tends to occur in the N-terminal half. Over twenty such proteins are found in Pseudomonas putida alone; little sequence similarity or repeat structure is found among these proteins outside the region modeled by this domain.


Pssm-ID: 274730 [Multi-domain]  Cd Length: 77  Bit Score: 58.67  E-value: 2.66e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2588 YTAYGEIyMDTNPNFQIIIGYHGGLYDPLTKLVHMGRRDYDVLAGRWTSPDhelwkrlssssivPF------HLYMFKNN 2661
Cdd:TIGR03696    1 YDPYGEV-LSESGAAPNPLRFTGQYYDAETGLYYNGARYYDPELGRFLSPD-------------PIglggglNLYAYVGN 66

                   ....*.
gi 1958645862 2662 NPISNS 2667
Cdd:TIGR03696   67 NPVNWV 72
acid_disulf_rpt NF033662
acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with ...
1046-1076 2.33e-09

acidic double-disulfide repeat; The acidic double-disulfide repeat is an Asp-rich repeat with four nearly invariant Cys residues in a repeat length of about 35 amino acids.


Pssm-ID: 411265 [Multi-domain]  Cd Length: 32  Bit Score: 54.83  E-value: 2.33e-09
                           10        20        30
                   ....*....|....*....|....*....|.
gi 1958645862 1046 SMETACGDSKDNDGDGLVDCMDPDCCLQPLC 1076
Cdd:NF033662     2 ATDTTCSDGIDNDGDGLTDCADPDCAGNPVC 32
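
The paired rows in each alignment block (query "gi" row above, "Cdd:" model row below) can be compared column by column. The sketch below counts identical residues over aligned columns. It assumes that "-" marks a gap and that lowercase letters mark unaligned residues, which is how these rows appear to be used on this page; the function name alignment_identity is hypothetical.

```python
def alignment_identity(query_row, model_row):
    """Count identical residues over aligned columns of two equal-length rows.

    Columns containing a gap ("-") or a lowercase (assumed unaligned)
    residue in either row are skipped.
    Returns (identical_columns, aligned_columns).
    """
    if len(query_row) != len(model_row):
        raise ValueError("alignment rows must have equal length")
    identical = 0
    aligned = 0
    for q, m in zip(query_row, model_row):
        if q == "-" or m == "-":
            continue  # gap column
        if q.islower() or m.islower():
            continue  # assumed unaligned residue
        aligned += 1
        if q == m:
            identical += 1
    return identical, aligned


if __name__ == "__main__":
    # Residue strings copied from the acid_disulf_rpt (NF033662) block above.
    query = "SMETACGDSKDNDGDGLVDCMDPDCCLQPLC"
    model = "ATDTTCSDGIDNDGDGLTDCADPDCAGNPVC"
    identical, aligned = alignment_identity(query, model)
    print(f"{identical}/{aligned} identical columns")
```

Only the residue strings are passed in; the leading identifier and the start and end coordinates on each row have to be stripped first.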
NHL_PKND_like cd14952
NHL repeat domain of the protein kinase PknD; PknD is a mycobacterial transmembrane protein ...
1489-1766 3.00e-09

NHL repeat domain of the protein kinase PknD; PknD is a mycobacterial transmembrane protein with a cytosolic kinase domain and an extracellular sensor domain that contains NHL repeats. It plays a key role in the development of central nervous system tuberculosis, by mediating the invasion of host brain endothelia. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.


Pssm-ID: 271322 [Multi-domain]  Cd Length: 247  Bit Score: 60.30  E-value: 3.00e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1489 LATDPmSGAVFLSDTNSRRVFKiksttvvkdlvknseVVAGTGDQC-LPFDDtrcgdggkateatLTNPRGITVDKFGLI 1567
Cdd:cd14952     15 VAVDA-AGNVYVADSGNNRVLK---------------LAAGSTTQTvLPFTG-------------LYQPQGVAVDAAGTV 65
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1568 YFVDGtmirrvDQNGIISTLLGSNDLTsarPLSCDSvmdisqvrLEWPTDLAINPMDNsLYVLD--NNVVLqisenhqvR 1645
Cdd:cd14952     66 YVTDF------GNNRVLKLAAGSTTQT---VLPFTG--------LNDPTGVAVDAAGN-VYVADtgNNRVL--------K 119
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1646 IVAGRPMHCQVPgidhFllskvaihATLESATALAVSHNGVLYIAETDEkkiNRIRQvttsgeisLVAGA------Psgc 1719
Cdd:cd14952    120 LAAGSNTQTVLP----F--------TGLSNPDGVAVDGAGNVYVTDTGN---NRVLK--------LAAGSttqtvlP--- 173
                          250       260       270       280
                   ....*....|....*....|....*....|....*....|....*..
gi 1958645862 1720 dckndancdcFSGddgyakdakLNTPSSLAVCADGELYVADLGNIRI 1766
Cdd:cd14952    174 ----------FTG---------LNSPSGVAVDTAGNVYVTDHGNNRV 201
RHS_core NF041261
RHS element core protein;
2086-2504 4.48e-09

RHS element core protein;


Pssm-ID: 469161 [Multi-domain]  Cd Length: 1261  Bit Score: 62.71  E-value: 4.48e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2086 LNGVNVTYSPGGhiAGIQRGIMSE-------RMEYDQAGRITSRIFADGKMWSYTyleksmvLHLHSqrqyifefdknDR 2158
Cdd:NF041261   401 LNRREVLHTEGE--GGLKRVVKKEhadgsvtRSGYDAAGRLTAQTDAAGRRTEYS-------LNVVS-----------GD 460
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2159 LSSVTMPNVarqtletiRSVGYYRNiyqppegnasviqdfteDGHLLHTFYLGTGRRVIYKYGKLSKL-AETLYDTTKVS 2237
Cdd:NF041261   461 ITDITTPDG--------RETKFYYN-----------------DGNQLTSVTSPDGLESRREYDEPGRLvSETSRSGETTR 515
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2238 FTYDESAGMLKTVNLQNEGFTCTIRYRQIGplidrQIFRFTEEGMVNARFDYNydnsfRVTSMQAVINETPlpIDLYR-Y 2316
Cdd:NF041261   516 YRYDDPHSELPATTTDATGSTKQMTWSRYG-----QLLAFTDCSGYQTRYEYD-----RFGQMTAVHREEG--ISTYRrY 583
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2317 DD----VSGKTEQfGKFGVIYY----DINQIITTAVMTHTKHFDAYGR-MKEVQYEIFRSLmywmtvQYDNMGRVVKKEL 2387
Cdd:NF041261   584 DNrgqlTSVKDAQ-GRETRYEYnaagDLTAVITPDGNRSETQYDAWGKaVSTTQGGLTRSM------EYDAAGRITTLTN 656
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2388 KvgpyaNTTRYSYEYDADGQLQTVSINDKPLWRYSYDLNGNlhLLSPGNSARLTPLRYDLRDRITRL---GDV--QYKMD 2462
Cdd:NF041261   657 E-----NGSHSTFLYDALDRLVQQRGFDGRTQRYHYDLTGK--LTQSEDEGLVTLWHYDESDRITHRtvnGEPaeQWQYD 729
                          410       420       430       440
                   ....*....|....*....|....*....|....*....|..
gi 1958645862 2463 EDGFLRQrggdvFEYNSAGLLIkaynrasgwSVRYRYDGLGR 2504
Cdd:NF041261   730 EHGWLTD-----ISHLSEGHRV---------AVHYGYDDKGR 757
Vgb COG4257
Streptogramin lyase [Defense mechanisms];
1445-1579 4.72e-07

Streptogramin lyase [Defense mechanisms];


Pssm-ID: 443399 [Multi-domain]  Cd Length: 270  Bit Score: 53.87  E-value: 4.72e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1445 PVALTCGSDGSLYVGDF--NYIRRIFPSGNVTNILEMSHSPAHKYYLATDPmSGAVFLSDTNSRRVFKIKSTTvvkdlvk 1522
Cdd:COG4257    147 PYGIAVDPDGNLWVTDFgaNAIGRIDPDTGTLTEYALPTPGAGPRGLAVDP-DGNLWVADTGSGRIGRFDPKT------- 218
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1958645862 1523 nsevvagtgdqclpfddtrcgdgGKATEATLTN----PRGITVDKFGLIYFVDGT--MIRRVD 1579
Cdd:COG4257    219 -----------------------GTVTEYPLPGggarPYGVAVDGDGRVWFAESGanRIVRFD 258
NHL_like_3 cd14956
Uncharacterized NHL-repeat domain in bacterial proteins; The NHL (NCL-1, HT2A and LIN-41) ...
1545-1766 4.82e-07

Uncharacterized NHL-repeat domain in bacterial proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.


Pssm-ID: 271326 [Multi-domain]  Cd Length: 274  Bit Score: 54.21  E-value: 4.82e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1545 GGKATEA-TLTNPRGITVDKFGLIYFVDGT--MIRRVDQNGIISTLLGSN-----DLTSARPLSCDS-----VMD----- 1606
Cdd:cd14956     50 GTTGDGPgQFGRPRGLAVDKDGWLYVADYWgdRIQVFTLTGELQTIGGSSgsgpgQFNAPRGVAVDAdgnlyVADfgnqr 129
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1607 ISQVRLE------------------WPTDLAINPmDNSLYVLDnnvvlqiSENHQVrivagrpmhcQVPGIDHFLLSKVA 1668
Cdd:cd14956    130 IQKFDPDgsflrqwggtgiepgsfnYPRGVAVDP-DGTLYVAD-------TYNDRI----------QVFDNDGAFLRKWG 191
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1669 IHAT----LESATALAVSHNGVLYIAETDEkkiNRIRQVTTSGEISLVAGAPSGcdckndancdcfsgddgyaKDAKLNT 1744
Cdd:cd14956    192 GRGTgpgqFNYPYGIAIDPDGNVFVADFGN---NRIQKFTADGTFLTSWGSPGT-------------------GPGQFKN 249
                          250       260
                   ....*....|....*....|..
gi 1958645862 1745 PSSLAVCADGELYVADLGNIRI 1766
Cdd:cd14956    250 PWGVVVDADGTVYVADSNNNRV 271
RHS_core NF041261
RHS element core protein;
2314-2642 3.66e-06

RHS element core protein;


Pssm-ID: 469161 [Multi-domain]  Cd Length: 1261  Bit Score: 53.08  E-value: 3.66e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2314 YRYDDVSGKTEQFGKFGVIY---YDINQIITTAVMT-----HT----------KHFDAYGRMKEVQYEIFRSLmywmTVQ 2375
Cdd:NF041261   367 YRYDDTGRVTEQLNPAGLSYryqYEQDRITITDSLNrrevlHTegegglkrvvKKEHADGSVTRSGYDAAGRL----TAQ 442
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2376 YDNMGRVVKKELKV---------GPYANTTRYSYeyDADGQLQTVSINDKPLWRYSYDLNGNLhLLSPGNSARLTPLRYD 2446
Cdd:NF041261   443 TDAAGRRTEYSLNVvsgditditTPDGRETKFYY--NDGNQLTSVTSPDGLESRREYDEPGRL-VSETSRSGETTRYRYD 519
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2447 lrDRITRLGDVqyKMDEDGFLRQrggdvFEYNSAGLLIkAYNRASGWSVRYRYDGLGRRVSSKSSHSHHLqffYADLTNP 2526
Cdd:NF041261   520 --DPHSELPAT--TTDATGSTKQ-----MTWSRYGQLL-AFTDCSGYQTRYEYDRFGQMTAVHREEGIST---YRRYDNR 586
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 2527 TKVTHLYNHSSSEiTSLYYDLQGHLFAMELSSGDEFYIACDNIGTPLAVFSGtGLMiKQILYTAYGEIYMDTNPNfqiii 2606
Cdd:NF041261   587 GQLTSVKDAQGRE-TRYEYNAAGDLTAVITPDGNRSETQYDAWGKAVSTTQG-GLT-RSMEYDAAGRITTLTNEN----- 658
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|....*...
gi 1958645862 2607 GYHGG-LYDPLTKLVHMG-------RRDYDvLAGRWTSPDHE----LW 2642
Cdd:NF041261   659 GSHSTfLYDALDRLVQQRgfdgrtqRYHYD-LTGKLTQSEDEglvtLW 705
Vgb COG4257
Streptogramin lyase [Defense mechanisms];
1551-1769 8.60e-06

Streptogramin lyase [Defense mechanisms];


Pssm-ID: 443399 [Multi-domain]  Cd Length: 270  Bit Score: 50.02  E-value: 8.60e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1551 ATLTNPRGITVDKFGLIYFVD--GTMIRRVD-QNGIISTllgsndltsarplscdsvmdISQVRLEWPTDLAINPmDNSL 1627
Cdd:COG4257     14 APGSGPRDVAVDPDGAVWFTDqgGGRIGRLDpATGEFTE--------------------YPLGGGSGPHGIAVDP-DGNL 72
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1628 YVLD--NNVVLQIS-ENHQVRIVAGrpmhcqvPGIDHFLlskvaihatlesaTALAVSHNGVLYIAETDekkINRIRQVT 1704
Cdd:COG4257     73 WFTDngNNRIGRIDpKTGEITTFAL-------PGGGSNP-------------HGIAFDPDGNLWFTDQG---GNRIGRLD 129
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1705 T-SGEISLV-----AGAPSGCDCKND---------ANC-DCFSGDDG----YAKDAKLNTPSSLAVCADGELYVADLGNI 1764
Cdd:COG4257    130 PaTGEVTEFplptgGAGPYGIAVDPDgnlwvtdfgANAiGRIDPDTGtlteYALPTPGAGPRGLAVDPDGNLWVADTGSG 209

                   ....*
gi 1958645862 1765 RIRFI 1769
Cdd:COG4257    210 RIGRF 214
NHL_like_2 cd14957
Uncharacterized NHL-repeat domain in bacterial and archaeal proteins; The NHL (NCL-1, HT2A and ...
1556-1863 2.46e-05

Uncharacterized NHL-repeat domain in bacterial and archaeal proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.


Pssm-ID: 271327 [Multi-domain]  Cd Length: 280  Bit Score: 48.80  E-value: 2.46e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1556 PRGITVDKFGLIYFVD--GTMIRRVDQNGIISTLLGSNDltsarplscdsvmdISQVRLEWPTDLAINPMDNsLYVLDnn 1633
Cdd:cd14957     20 PRGIAVDSAGNIYVADtgNNRIQVFTSSGVYSYSIGSGG--------------TGSGQFNSPYGIAVDSNGN-IYVAD-- 82
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1634 vvlqiSENHQVRIvagrpmhcqvpgidhFLLSKVAIHA---------TLESATALAVSHNGVLYIAETDEkkiNRIrQVT 1704
Cdd:cd14957     83 -----TDNNRIQV---------------FNSSGVYQYSigtggsgdgQFNGPYGIAVDSNGNIYVADTGN---HRI-QVF 138
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1705 TSgeislvAGAPsgcdckndancdCFSGDDGYAKDAKLNTPSSLAVCADGELYVADLGNIRIRfirknkpVLNTQNMYel 1784
Cdd:cd14957    139 TS------SGTF------------SYSIGSGGTGPGQFNGPQGIAVDSDGNIYVADTGNHRIQ-------VFTSSGTF-- 191
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1785 sspidqeLYLFDTSGkhlytqslpTGDYLYNFTY----TGDGDItHITDNNGNMVNVrRDSTGmplwlvvpdgqVYWVTM 1860
Cdd:cd14957    192 -------QYTFGSSG---------SGPGQFSDPYgiavDSDGNI-YVADTGNHRIQV-FTSSG-----------AYQYSI 242

                   ...
gi 1958645862 1861 GTN 1863
Cdd:cd14957    243 GTS 245
PLN02919 PLN02919
haloacid dehalogenase-like hydrolase family protein
1489-1773 2.83e-05

haloacid dehalogenase-like hydrolase family protein


Pssm-ID: 215497 [Multi-domain]  Cd Length: 1057  Bit Score: 49.85  E-value: 2.83e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1489 LATDPMSGAVFLSDTNSRRVfkiksttVVKDLVKNSEV-VAGTGDQCL---PFDDtrcgdggkateATLTNPRGITVDKF 1564
Cdd:PLN02919   573 LAIDLLNNRLFISDSNHNRI-------VVTDLDGNFIVqIGSTGEEGLrdgSFED-----------ATFNRPQGLAYNAK 634
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1565 GLIYFVDGT---MIRRVD-QNGIISTLLGS----NDLTSARPLScdsvmdiSQVrLEWPTDLAINPMDNSLYV------- 1629
Cdd:PLN02919   635 KNLLYVADTenhALREIDfVNETVRTLAGNgtkgSDYQGGKKGT-------SQV-LNSPWDVCFEPVNEKVYIamagqhq 706
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1630 -----LDNNVVLQISENHQVRIVAGR----PMHCQVPGI------DHFLLSK---------------------------- 1666
Cdd:PLN02919   707 iweynISDGVTRVFSGDGYERNLNGSsgtsTSFAQPSGIslspdlKELYIADsesssiraldlktggsrllaggdptfsd 786
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1667 ----------VAIHATLESATALAVSHNGVLYIAETDEKKINRIRQVTtsGEISLVAGAPSGcdckndancdcfSGDDGY 1736
Cdd:PLN02919   787 nlfkfgdhdgVGSEVLLQHPLGVLCAKDGQIYVADSYNHKIKKLDPAT--KRVTTLAGTGKA------------GFKDGK 852
                          330       340       350
                   ....*....|....*....|....*....|....*..
gi 1958645862 1737 AKDAKLNTPSSLAVCADGELYVADLGNIRIRFIRKNK 1773
Cdd:PLN02919   853 ALKAQLSEPAGLALGENGRLFVADTNNSLIRYLDLNK 889
DSL pfam01414
Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain ...
959-1002 4.33e-05

Delta serrate ligand; This family has been redefined to correspond to the EGF-like domain defined by structure.


Pssm-ID: 460202  Cd Length: 46  Bit Score: 43.00  E-value: 4.33e-05
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....*..
gi 1958645862  959 CEDGWMGAACDqRACHPRCAE--HGTC-RDGKCECSPGWNGEHCTIA 1002
Cdd:pfam01414    1 CDENYYGSTCS-KFCRPRDDKfgHYTCdANGNKVCLPGWTGPYCDKP 46
NHL_like_1 cd14953
Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat ...
1709-1769 7.82e-05

Uncharacterized NHL-repeat domain in bacterial proteins; This bacterial family of NHL-repeat domains is found in a variety of domain architectures. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.


Pssm-ID: 271323 [Multi-domain]  Cd Length: 323  Bit Score: 47.52  E-value: 7.82e-05
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1958645862 1709 ISLVAGAPSGcdckndancdcfSGDDGYAKDAKLNTPSSLAVCADGELYVADLGNIRIRFI 1769
Cdd:cd14953      1 VSTVAGSGTA------------GFSGGGGTAARFNSPSGVAVDAAGNLYVADRGNHRIRKI 49
Vgb COG4257
Streptogramin lyase [Defense mechanisms];
1440-1517 1.34e-04

Streptogramin lyase [Defense mechanisms];


Pssm-ID: 443399 [Multi-domain]  Cd Length: 270  Bit Score: 46.55  E-value: 1.34e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1440 NKLLAPVALTCGSDGSLYVGDF--NYIRRIFP-SGNVTNIlEMSHSPAHKYYLATDPmSGAVFLSDTNSRRVFKIKSTTV 1516
Cdd:COG4257    185 TPGAGPRGLAVDPDGNLWVADTgsGRIGRFDPkTGTVTEY-PLPGGGARPYGVAVDG-DGRVWFAESGANRIVRFDPDTE 262

                   .
gi 1958645862 1517 V 1517
Cdd:COG4257    263 L 263
NHL_like_2 cd14957
Uncharacterized NHL-repeat domain in bacterial and archaeal proteins; The NHL (NCL-1, HT2A and ...
1679-1865 1.51e-04

Uncharacterized NHL-repeat domain in bacterial and archaeal proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.


Pssm-ID: 271327 [Multi-domain]  Cd Length: 280  Bit Score: 46.49  E-value: 1.51e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1679 LAVSHNGVLYIAETDEkkiNRIRQVTTSGEISLVAGapsgcdckndancdcfSGDDGyakDAKLNTPSSLAVCADGELYV 1758
Cdd:cd14957     23 IAVDSAGNIYVADTGN---NRIQVFTSSGVYSYSIG----------------SGGTG---SGQFNSPYGIAVDSNGNIYV 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1759 ADLGNIRIRfirknkpVLNTQNMYElsspidqelYLFDTSGkhlytQSLPTGDYLYNFTYTGDGDItHITDNNGNMVNVr 1838
Cdd:cd14957     81 ADTDNNRIQ-------VFNSSGVYQ---------YSIGTGG-----SGDGQFNGPYGIAVDSNGNI-YVADTGNHRIQV- 137
                          170       180
                   ....*....|....*....|....*..
gi 1958645862 1839 RDSTGmplwlvvpdgqVYWVTMGTNSA 1865
Cdd:cd14957    138 FTSSG-----------TFSYSIGSGGT 153
YD_repeat_2x TIGR01643
YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular ...
1882-1922 2.41e-04

YD repeat (two copies); This model describes two tandem copies of a 21-residue extracellular repeat found in Gram-negative, Gram-positive, and animal proteins. The repeat is named for a YD dipeptide, the most strongly conserved motif of the repeat. These repeats appear in general to be involved in binding carbohydrate; the chicken teneurin-1 YD-repeat region has been shown to bind heparin.


Pssm-ID: 273728 [Multi-domain]  Cd Length: 42  Bit Score: 40.65  E-value: 2.41e-04
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|.
gi 1958645862 1882 HGNSGLLATKSNENGWTTFYEYDSFGRLTNVTFPTGQVSSF 1922
Cdd:TIGR01643    1 YDAAGRLTGSTDADGTTTRYTYDAAGRLVEITDADGGSTRY 41
NHL-2_like cd14951
NHL repeat domain of NHL repeat-containing protein 2 and similar proteins; NHL ...
1680-1770 3.53e-04

NHL repeat domain of NHL repeat-containing protein 2 and similar proteins; NHL repeat-containing protein 2 (NHLRC2) and related bacterial proteins; members of this eukaryotic and bacterial family are uncharacterized, the NHL repeat domain is found C-terminally of a thioredoxin domain. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.


Pssm-ID: 271321 [Multi-domain]  Cd Length: 334  Bit Score: 45.65  E-value: 3.53e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1680 AVSHNGVLYIAETdekKINRIRQV-TTSGEISLVAGapsgcdckndancdcfSGDDGYA-KDAKLNTPSSLAVCADGELY 1757
Cdd:cd14951    202 AALPDGSVYVADT---YNHKIKRVdPATGEVSTLAG----------------TGKAGYKdLEAQFSEPSGLVVDGDGRLY 262
                           90
                   ....*....|...
gi 1958645862 1758 VADLGNIRIRFIR 1770
Cdd:cd14951    263 VADTNNHRIRRLD 275
RHS_repeat pfam05593
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ...
1886-1918 5.54e-04

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 39.50  E-value: 5.54e-04
                           10        20        30
                   ....*....|....*....|....*....|...
gi 1958645862 1886 GLLATKSNENGWTTFYEYDSFGRLTNVTFPTGQ 1918
Cdd:pfam05593    5 GRLTSVTDPDGRVTTYTYDAAGRLTAVTDPDGT 37
NHL_like_5 cd14963
Uncharacterized NHL-repeat domain in bacterial proteins; The NHL (NCL-1, HT2A and LIN-41) ...
1441-1692 1.76e-03

Uncharacterized NHL-repeat domain in bacterial proteins; The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.


Pssm-ID: 271333 [Multi-domain]  Cd Length: 268  Bit Score: 43.05  E-value: 1.76e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1441 KLLAPVALTCGSDGSLYVGDFnYIRRI--F-PSGNVTNILemshspAHKYYLATDPMSGAVFLSDTNsrrVFkiksttvV 1517
Cdd:cd14963     54 EFKYPYGIAVDSDGNIYVADL-YNGRIqvFdPDGKFLKYF------PEKKDRVKLISPAGLAIDDGK---LY-------V 116
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1518 KDLVKNSEVVagtgdqclpFDDT-----RCGDGGKAtEATLTNPRGITVDKFGLIYFVDgTMIRRV---DQNG-IISTLL 1588
Cdd:cd14963    117 SDVKKHKVIV---------FDLEgklllEFGKPGSE-PGELSYPNGIAVDEDGNIYVAD-SGNGRIqvfDKNGkFIKELN 185
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1589 GSNDLTSArplscdsvmdisqvrLEWPTDLAINPmDNSLYVLDN--NVVLQISENHQVRIVAGRpmhcqvPGIDhfllsk 1666
Cdd:cd14963    186 GSPDGKSG---------------FVNPRGIAVDP-DGNLYVVDNlsHRVYVFDEQGKELFTFGG------RGKD------ 237
                          250       260
                   ....*....|....*....|....*.
gi 1958645862 1667 vaiHATLESATALAVSHNGVLYIAET 1692
Cdd:cd14963    238 ---DGQFNLPNGLFIDDDGRLYVTDR 260
Keratin_B2 pfam01500
Keratin, high sulfur B2 protein; High sulfur proteins are cysteine-rich proteins synthesized ...
840-988 1.97e-03

Keratin, high sulfur B2 protein; High sulfur proteins are cysteine-rich proteins synthesized during the differentiation of hair matrix cells, and form hair fibres in association with hair keratin intermediate filaments. This family has been divided up into four regions, with the second region containing 8 copies of a short repeat. This family is also known as B2 or KAP1.


Pssm-ID: 366678 [Multi-domain]  Cd Length: 161  Bit Score: 41.70  E-value: 1.97e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862  840 TNQCIDVACSSHGTCimGTCICNPGYKGESCEEVDCMDPTCS----SRGVCVRGECHCSVgwgGTNCETPraTCLDQCS- 914
Cdd:pfam01500    6 TSFCGFPTCSTGGTC--GSGCCQPCCCQSSCCRPSCCQTSCCqpttFQSSCCRPTCQPCC---QTSCCQP--TCCQTSSc 78
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862  915 -------GHGTfLPDTGLCNCDPSWTGHDCSIEICAADCGGHGVCVGGTCrCEDGWMGAACdqraCHPRCAEHGTCRDGK 987
Cdd:pfam01500   79 qtgcggiGYGQ-EGSSGAVSSRTRWCRPDCRVEGTCLPPCCVVSCTPPTC-CQLHHAQASC----CRPSYCGQSCCRPAC 152

                   .
gi 1958645862  988 C 988
Cdd:pfam01500  153 C 153
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
30-320 2.67e-03



Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 43.62  E-value: 2.67e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862   30 GLRSPRELLLVSPELSSEPRPARSWAPLSNSESGGVSGTVPRLSAVlVPASPA---VAACSHESKPPCPLGSDGLGEGAA 106
Cdd:PHA03307    76 GTEAPANESRSTPTWSLSTLAPASPAREGSPTPPGPSSPDPPPPTP-PPASPPpspAPDLSEMLRPVGSPGPPPAASPPA 154
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862  107 GPADTNSSQSAGMEPDHSALSAAraqfVDVEEREPEAMDVKERKPYRSLTRRRDAERRYTSSSADSEEGKGPqksyssse 186
Cdd:PHA03307   155 AGASPAAVASDAASSRQAALPLS----SPEETARAPSSPPAEPPPSTPPAAASPRPPRRSSPISASASSPAP-------- 222
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862  187 tLKAYDQDARLayGSRVKDMVPQESEEFCRTGTNFTLRELGLGEMTPphgTLYRTDIG--LPHCGYSMGASSDADLEADT 264
Cdd:PHA03307   223 -APGRSAADDA--GASSSDSSSSESSGCGWGPENECPLPRPAPITLP---TRIWEASGwnGPSSRPGPASSSSSPRERSP 296
                          250       260       270       280       290       300
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862  265 VLSPEHPvRLWGRSTRSGRSSCLSSRANSNLTLTDTEHENTE----TGAPLHCSSASSTP 320
Cdd:PHA03307   297 SPSPSSP-GSGPAPSSPRASSSSSSSRESSSSSTSSSSESSRgaavSPGPSPSRSPSPSR 355
EGF_2 pfam07974
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins.
977-999 3.65e-03



Pssm-ID: 400365  Cd Length: 26  Bit Score: 36.94  E-value: 3.65e-03
                           10        20
                   ....*....|....*....|....*
gi 1958645862  977 CAEHGTCRD--GKCECSPGWNGEHC 999
Cdd:pfam07974    2 CSGRGTCVNqcGKCVCDSGYQGATC 26
NHL-2_like cd14951
NHL repeat domain of NHL repeat-containing protein 2 and similar proteins; NHL ...
1546-1648 3.79e-03

NHL repeat domain of NHL repeat-containing protein 2 and similar proteins; NHL repeat-containing protein 2 (NHLRC2) and related bacterial proteins. Members of this eukaryotic and bacterial family are uncharacterized; the NHL repeat domain is found C-terminal to a thioredoxin domain. The NHL (NCL-1, HT2A and LIN-41) repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures.


Pssm-ID: 271321 [Multi-domain]  Cd Length: 334  Bit Score: 42.18  E-value: 3.79e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1546 GKATEATLTNPRGITVDKFGLIYFVDgTM---IRRVD-QNGIISTLLGSNDLTSarplscdsvmDISQVRLEWPTDLAIN 1621
Cdd:cd14951    188 GPGAEALLQHPLGVAALPDGSVYVAD-TYnhkIKRVDpATGEVSTLAGTGKAGY----------KDLEAQFSEPSGLVVD 256
                           90       100
                   ....*....|....*....|....*..
gi 1958645862 1622 PmDNSLYVLDNNvvlqiseNHQVRIVA 1648
Cdd:cd14951    257 G-DGRLYVADTN-------NHRIRRLD 275
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
848-871 3.87e-03

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.


Pssm-ID: 238011  Cd Length: 38  Bit Score: 37.23  E-value: 3.87e-03
                           10        20
                   ....*....|....*....|....*...
gi 1958645862  848 CSSHGTCIMG----TCICNPGYKGESCE 871
Cdd:cd00054     11 CQNGGTCVNTvgsyRCSCPPGYTGRNCE 38
EGF_2 pfam07974
EGF-like domain; This family contains EGF domains found in a variety of extracellular proteins.
847-870 6.09e-03



Pssm-ID: 400365  Cd Length: 26  Bit Score: 36.56  E-value: 6.09e-03
                           10        20
                   ....*....|....*....|....*.
gi 1958645862  847 ACSSHGTCIM--GTCICNPGYKGESC 870
Cdd:pfam07974    1 ICSGRGTCVNqcGKCVCDSGYQGATC 26
NHL cd05819
NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in ...
1738-1837 8.46e-03

NHL repeat unit of beta-propeller proteins; The NHL(NCL-1, HT2A and LIN-41)-repeat is found in multiple tandem copies, typically as 6 instances. It is about 40 residues long and resembles the WD repeat and other beta-propeller structures. The repeats have a catalytic activity in Peptidyl-glycine alpha-amidating monooxygenase; proteolysis has shown that the Peptidyl-alpha-hydroxyglycine alpha-amidating lyase (PAL) activity is localized to the repeats. Tripartite motif-containing protein 32 interacts with the activation domain of Tat. This interaction is mediated by the NHL repeats.


Pssm-ID: 271320 [Multi-domain]  Cd Length: 269  Bit Score: 40.76  E-value: 8.46e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1958645862 1738 KDAKLNTPSSLAVCADGELYVADLGNIRIRfirknkpVLNTQNMYelsspidqeLYLFDTSGKHLYTQSLPTGDYLynft 1817
Cdd:cd05819      3 GPGELNNPQGIAVDSSGNIYVADTGNNRIQ-------VFDPDGNF---------ITSFGSFGSGDGQFNEPAGVAV---- 62
                           90       100
                   ....*....|....*....|
gi 1958645862 1818 yTGDGDItHITDNNGNMVNV 1837
Cdd:cd05819     63 -DSDGNL-YVADTGNHRIQK 80
RHS_repeat pfam05593
RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be ...
2477-2506 8.98e-03

RHS Repeat; RHS proteins contain extended repeat regions. These repeats often appear to be involved in ligand binding. Note that this model may not find all the repeats in a protein and that it covers two RHS repeats. The 3D structure of an RHS-repeat-containing protein (the B and C components of an ABC toxin complex) has been determined. The RHS repeats form an extended strip of beta-sheet that spirals around to form a hollow shell, encapsulating the variable C-terminal domain.


Pssm-ID: 461685 [Multi-domain]  Cd Length: 37  Bit Score: 36.04  E-value: 8.98e-03
                           10        20        30
                   ....*....|....*....|....*....|
gi 1958645862 2477 YNSAGLLIKAYNrASGWSVRYRYDGLGRRV 2506
Cdd:pfam05593    1 YDAAGRLTSVTD-PDGRVTTYTYDAAGRLT 29
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
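
The hit list above reports each domain hit with an interval and an E-value (for example "840-988 1.97e-03"), and the search used an E-value threshold of 0.01. The following is a minimal Python sketch of how such summary lines could be parsed and checked against that threshold. The names Hit, parse_summary, and passes_threshold are hypothetical helpers introduced for illustration, not part of any NCBI tool, and treating a hit exactly at the threshold as passing is an assumption.

```python
import re
from typing import NamedTuple, Optional


class Hit(NamedTuple):
    """One domain hit summary: query interval and E-value (hypothetical type)."""
    start: int
    end: int
    evalue: float

    @property
    def length(self) -> int:
        # Interval endpoints are inclusive residue positions on the query.
        return self.end - self.start + 1


# Matches an interval/E-value summary line such as "840-988 1.97e-03".
SUMMARY_RE = re.compile(r"^(\d+)-(\d+)\s+(\d+(?:\.\d+)?[eE][-+]?\d+)$")


def parse_summary(line: str) -> Optional[Hit]:
    """Return a Hit for an interval/E-value line, or None for any other line."""
    match = SUMMARY_RE.match(line.strip())
    if match is None:
        return None
    return Hit(int(match.group(1)), int(match.group(2)), float(match.group(3)))


def passes_threshold(hit: Hit, threshold: float = 0.01) -> bool:
    """Assumption: a hit is kept when its E-value is at or below the threshold."""
    return hit.evalue <= threshold
```

For instance, parse_summary("977-999 3.65e-03") yields a hit covering 23 residues that passes the 0.01 threshold, while lines that are not interval/E-value summaries (names, descriptions, alignment rows) return None.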

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D):384-8.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D):265-8.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D):200-3.