Conserved domains on [gi|293358604|ref|XP_001071128|]

multimerin-1 isoform X1 [Rattus norvegicus]

Protein Classification

calcium-binding EGF-like domain-containing protein (domain architecture ID 13728361)

This calcium-binding epidermal growth factor (EGF)-like domain-containing protein may play a crucial role in numerous protein-protein interactions.


List of domain hits

Name Accession Description Interval E-value
C1q super family cl23878
C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement ...
1076-1210 1.11e-39

C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system.


The actual alignment was detected with superfamily member smart00110:

Pssm-ID: 420072  Cd Length: 135  Bit Score: 143.60  E-value: 1.11e-39
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   1076 SYRYAPMVAFFVSHTHGMTAPG-PILFNDLSVNYGASYNPRTGKFRIPYLGVYIFKYTIESFSAHISGFFVVDGVDKLRF 1154
Cdd:smart00110    1 NYKAQPRSAFSVIRSNRPPPPGqPIRFDKVLYNQQGHYDPRTGKFTCPVPGVYYFSYHVESKGRNVKVSLMKNGIQVMST 80
                            90       100       110       120       130
                    ....*....|....*....|....*....|....*....|....*....|....*...
gi 293358604   1155 ESENtdsEIHCDRVLTGDALFELNYGQEVWLRL--VKGTIPIKYPPVTTFSGYLLYRT 1210
Cdd:smart00110   81 YDEY---QKGLYDVASGGALLQLRQGDQVWLELpdEKNGLYAGEYVDSTFSGFLLFPD 135
EMI pfam07546
EMI domain; The Pfam alignment is truncated at the C-terminus and does not include the final ...
194-263 2.34e-12

EMI domain; The Pfam alignment is truncated at the C-terminus and does not include the final cysteine defined in Callebaut et al. This is to stop the family overlapping with other domains.



Pssm-ID: 462204  Cd Length: 69  Bit Score: 63.21  E-value: 2.34e-12
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 293358604   194 KNWCAHvhtKLSPTVILDTHGSNVNSGR----GSCGWPSgLCSRRsQKSSNAVYRMQHKIVTSLEWRCCPGYIG 263
Cdd:pfam07546    1 RNVCAY---KVVSCVVVTGTESYVQPVYkpylTWCAGHR-RCSTY-RTTYRPAYRQVYKTVTRLEWRCCPGWGG 69
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
1027-1059 1.52e-08

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.



Pssm-ID: 238011  Cd Length: 38  Bit Score: 51.48  E-value: 1.52e-08
                          10        20        30
                  ....*....|....*....|....*....|....
gi 293358604 1027 CSSF-PCQNGGTCISGRSSFICACRHPFMGDTCT 1059
Cdd:cd00054     5 CASGnPCQNGGTCVNTVGSYRCSCPPGYTGRNCE 38
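The EGF_CA description above mentions six conserved core cysteines. As a minimal sketch, the two alignment rows can be compared column by column to count identical residues and to locate the cysteines shared by the query and the cd00054 model. The function name `compare_alignment` is hypothetical, and the code assumes that uppercase letters are aligned residues while lowercase letters and `-` mark unaligned residues and gaps.

```python
def compare_alignment(query: str, subject: str):
    """Return (identical, aligned, conserved_cys_columns) for two alignment rows.

    Assumption: uppercase letters are aligned residues; lowercase letters and
    '-' mark unaligned residues or gaps, so those columns are skipped.
    """
    if len(query) != len(subject):
        raise ValueError("alignment rows must have equal length")
    identical = 0
    aligned = 0
    cys_columns = []
    for col, (q, s) in enumerate(zip(query, subject), start=1):
        if not (q.isupper() and s.isupper()):
            continue  # gap or unaligned residue in one of the rows
        aligned += 1
        if q == s:
            identical += 1
            if q == "C":
                cys_columns.append(col)
    return identical, aligned, cys_columns


# Rows copied from the EGF_CA (cd00054) alignment, query residues 1027-1059.
QUERY = "CSSF-PCQNGGTCISGRSSFICACRHPFMGDTCT"
SUBJECT = "CASGnPCQNGGTCVNTVGSYRCSCPPGYTGRNCE"

identical, aligned, cys_columns = compare_alignment(QUERY, SUBJECT)
print(f"{identical}/{aligned} aligned columns are identical")
print("columns with a cysteine in both rows:", cys_columns)
```

Under these assumptions, six alignment columns carry a cysteine in both rows, which agrees with the six core cysteines named in the domain description.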
CCDC158 super family cl37899
Coiled-coil domain-containing protein 158; CCDC158 is a family of proteins found in eukaryotes. ...
360-707 1.38e-07

Coiled-coil domain-containing protein 158; CCDC158 is a family of proteins found in eukaryotes. The function is not known.


The actual alignment was detected with superfamily member pfam15921:

Pssm-ID: 464943 [Multi-domain]  Cd Length: 1112  Bit Score: 56.28  E-value: 1.38e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   360 LKALKSKSiddllKNIVKDQFKVFQDDMQETIAQLFKTVSSLSKDLESTRQAVLQVNQSFVSVTAQKDSalirENQPTWK 439
Cdd:pfam15921  247 LEALKSES-----QNKIELLLQQHQDRIEQLISEHEVEITGLTEKASSARSQANSIQSQLEIIQEQARN----QNSMYMR 317
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   440 DITELKNSITDIRQEmaltcekpLKElvAKQSHlEGALEQEHSQIVLYHQSLNETLS----------NMQEAHTQLLSIL 509
Cdd:pfam15921  318 QLSDLESTVSQLRSE--------LRE--AKRMY-EDKIEELEKQLVLANSELTEARTerdqfsqesgNLDDQLQKLLADL 386
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   510 QVSGTE-----------------NVATEESV-------NSNVTKYVSVLQETASK-QGLMLLQMLSDLHVQES--KISNL 562
Cdd:pfam15921  387 HKREKElslekeqnkrlwdrdtgNSITIDHLrrelddrNMEVQRLEALLKAMKSEcQGQMERQMAAIQGKNESleKVSSL 466
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   563 TILLEMEKESARGECEEMLSKcrhdfKFQLKDTEENLHVLNQTLTEV----------IFPMDIKVDKMSEQLNDLTYDME 632
Cdd:pfam15921  467 TAQLESTKEMLRKVVEELTAK-----KMTLESSERTVSDLTASLQEKeraieatnaeITKLRSRVDLKLQELQHLKNEGD 541
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   633 ILQPLLEQRSLLQQQIIHEPKEDTVTRRELQNLIGAVNQ-------LNV----LTKELTKRH------NLLRNEVQSRSE 695
Cdd:pfam15921  542 HLRNVQTECEALKLQMAEKDKVIEILRQQIENMTQLVGQhgrtagaMQVekaqLEKEINDRRlelqefKILKDKKDAKIR 621
                          410
                   ....*....|..
gi 293358604   696 AFERRISDHALE 707
Cdd:pfam15921  622 ELEARVSDLELE 633
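The hit list above is ordered by E-value, not by position. A small sketch, using only the intervals from that list, sorts the hits from N-terminus to C-terminus and reports the number of residues between consecutive domains. The `Hit` type and the `architecture` helper are hypothetical names introduced for illustration.

```python
from typing import NamedTuple


class Hit(NamedTuple):
    name: str
    start: int  # first query residue of the hit (1-based, inclusive)
    end: int    # last query residue of the hit (inclusive)


# Intervals copied from the hit list above.
HITS = [
    Hit("C1q", 1076, 1210),
    Hit("EMI", 194, 263),
    Hit("EGF_CA", 1027, 1059),
    Hit("CCDC158", 360, 707),
]


def architecture(hits):
    """Sort hits by start position and measure the gaps between neighbours."""
    ordered = sorted(hits, key=lambda h: h.start)
    linkers = [b.start - a.end - 1 for a, b in zip(ordered, ordered[1:])]
    return ordered, linkers


ordered, linkers = architecture(HITS)
print(" - ".join(h.name for h in ordered))
print("residues between consecutive hits:", linkers)
```

Read this way, the EMI domain sits near the N-terminus, the coiled-coil region follows, and the EGF_CA and C1q domains lie close together at the C-terminal end.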
 
Name Accession Description Interval E-value
C1Q smart00110
Complement component C1q domain; Globular domain found in many collagens and eponymously in ...
1076-1210 1.11e-39

Complement component C1q domain; Globular domain found in many collagens and eponymously in complement C1q. When part of full length proteins these domains form a 'bouquet' due to the multimerization of heterotrimers. The C1q fold is similar to that of tumour necrosis factor.


Pssm-ID: 128420  Cd Length: 135  Bit Score: 143.60  E-value: 1.11e-39
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   1076 SYRYAPMVAFFVSHTHGMTAPG-PILFNDLSVNYGASYNPRTGKFRIPYLGVYIFKYTIESFSAHISGFFVVDGVDKLRF 1154
Cdd:smart00110    1 NYKAQPRSAFSVIRSNRPPPPGqPIRFDKVLYNQQGHYDPRTGKFTCPVPGVYYFSYHVESKGRNVKVSLMKNGIQVMST 80
                            90       100       110       120       130
                    ....*....|....*....|....*....|....*....|....*....|....*...
gi 293358604   1155 ESENtdsEIHCDRVLTGDALFELNYGQEVWLRL--VKGTIPIKYPPVTTFSGYLLYRT 1210
Cdd:smart00110   81 YDEY---QKGLYDVASGGALLQLRQGDQVWLELpdEKNGLYAGEYVDSTFSGFLLFPD 135
C1q pfam00386
C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement ...
1084-1207 1.61e-31

C1q domain; C1q is a subunit of the C1 enzyme complex that activates the serum complement system.


Pssm-ID: 395310 [Multi-domain]  Cd Length: 126  Bit Score: 119.70  E-value: 1.61e-31
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604  1084 AFFVSHTHGMTAPG--PILFNDLSVNYGASYNPRTGKFRIPYLGVYIFKYTIEsfSAHISGFFV---VDGVDKLRFESEN 1158
Cdd:pfam00386    1 AFSAGRTTGLTAPNeqPVRFDKVLTNIGGHYDPATGKFTCPVPGVYYFSYHIT--TVDGKSLYVslvKNGQEVVSFYDQP 78
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|.
gi 293358604  1159 TDSEihcDRVLTGDALFELNYGQEVWLRL--VKGTIPIKYPPVTTFSGYLL 1207
Cdd:pfam00386   79 QKGS---LDVASGSVVLELQRGDEVWLQLtgYNGLYYDGSDTDSTFSGFLL 126
EMI pfam07546
EMI domain; The Pfam alignment is truncated at the C-terminus and does not include the final ...
194-263 2.34e-12

EMI domain; The Pfam alignment is truncated at the C-terminus and does not include the final cysteine defined in Callebaut et al. This is to stop the family overlapping with other domains.


Pssm-ID: 462204  Cd Length: 69  Bit Score: 63.21  E-value: 2.34e-12
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 293358604   194 KNWCAHvhtKLSPTVILDTHGSNVNSGR----GSCGWPSgLCSRRsQKSSNAVYRMQHKIVTSLEWRCCPGYIG 263
Cdd:pfam07546    1 RNVCAY---KVVSCVVVTGTESYVQPVYkpylTWCAGHR-RCSTY-RTTYRPAYRQVYKTVTRLEWRCCPGWGG 69
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
1027-1059 1.52e-08

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.


Pssm-ID: 238011  Cd Length: 38  Bit Score: 51.48  E-value: 1.52e-08
                          10        20        30
                  ....*....|....*....|....*....|....
gi 293358604 1027 CSSF-PCQNGGTCISGRSSFICACRHPFMGDTCT 1059
Cdd:cd00054     5 CASGnPCQNGGTCVNTVGSYRCSCPPGYTGRNCE 38
CCDC158 pfam15921
Coiled-coil domain-containing protein 158; CCDC158 is a family of proteins found in eukaryotes. ...
360-707 1.38e-07

Coiled-coil domain-containing protein 158; CCDC158 is a family of proteins found in eukaryotes. The function is not known.


Pssm-ID: 464943 [Multi-domain]  Cd Length: 1112  Bit Score: 56.28  E-value: 1.38e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   360 LKALKSKSiddllKNIVKDQFKVFQDDMQETIAQLFKTVSSLSKDLESTRQAVLQVNQSFVSVTAQKDSalirENQPTWK 439
Cdd:pfam15921  247 LEALKSES-----QNKIELLLQQHQDRIEQLISEHEVEITGLTEKASSARSQANSIQSQLEIIQEQARN----QNSMYMR 317
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   440 DITELKNSITDIRQEmaltcekpLKElvAKQSHlEGALEQEHSQIVLYHQSLNETLS----------NMQEAHTQLLSIL 509
Cdd:pfam15921  318 QLSDLESTVSQLRSE--------LRE--AKRMY-EDKIEELEKQLVLANSELTEARTerdqfsqesgNLDDQLQKLLADL 386
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   510 QVSGTE-----------------NVATEESV-------NSNVTKYVSVLQETASK-QGLMLLQMLSDLHVQES--KISNL 562
Cdd:pfam15921  387 HKREKElslekeqnkrlwdrdtgNSITIDHLrrelddrNMEVQRLEALLKAMKSEcQGQMERQMAAIQGKNESleKVSSL 466
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   563 TILLEMEKESARGECEEMLSKcrhdfKFQLKDTEENLHVLNQTLTEV----------IFPMDIKVDKMSEQLNDLTYDME 632
Cdd:pfam15921  467 TAQLESTKEMLRKVVEELTAK-----KMTLESSERTVSDLTASLQEKeraieatnaeITKLRSRVDLKLQELQHLKNEGD 541
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   633 ILQPLLEQRSLLQQQIIHEPKEDTVTRRELQNLIGAVNQ-------LNV----LTKELTKRH------NLLRNEVQSRSE 695
Cdd:pfam15921  542 HLRNVQTECEALKLQMAEKDKVIEILRQQIENMTQLVGQhgrtagaMQVekaqLEKEINDRRlelqefKILKDKKDAKIR 621
                          410
                   ....*....|..
gi 293358604   696 AFERRISDHALE 707
Cdd:pfam15921  622 ELEARVSDLELE 633
EGF pfam00008
EGF-like domain; There is no clear separation between noise and signal. pfam00053 is very ...
1027-1057 5.02e-07

EGF-like domain; There is no clear separation between noise and signal. pfam00053 is very similar, but has 8 instead of 6 conserved cysteines. Includes some cytokine receptors. The EGF domain misses the N-terminus regions of the Ca2+ binding EGF domains (this is the main reason of discrepancy between swiss-prot domain start/end and Pfam). The family is hard to model due to many similar but different sub-types of EGF domains. Pfam certainly misses a number of EGF domains.


Pssm-ID: 394967  Cd Length: 31  Bit Score: 46.99  E-value: 5.02e-07
                           10        20        30
                   ....*....|....*....|....*....|.
gi 293358604  1027 CSSFPCQNGGTCISGRSSFICACRHPFMGDT 1057
Cdd:pfam00008    1 CAPNPCSNGGTCVDTPGGYTCICPEGYTGKR 31
Mplasa_alph_rch TIGR04523
helix-rich Mycoplasma protein; Members of this family occur strictly within a subset of ...
311-851 3.09e-06

helix-rich Mycoplasma protein; Members of this family occur strictly within a subset of Mycoplasma species. Members average 750 amino acids in length, including signal peptide. Sequences are predicted (Jpred 3) to be almost entirely alpha-helical. These sequences show strong periodicity (consistent with long alpha helical structures) and low complexity rich in D,E,N,Q, and K. Genes encoding these proteins are often found in tandem. The function is unknown.


Pssm-ID: 275316 [Multi-domain]  Cd Length: 745  Bit Score: 51.56  E-value: 3.09e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   311 EQLSQQERKLMLLQKKVDNVSLAAGDLRNAYLSLEEKVSkdNSKEFQSFLKALKSKsIDDLLK--NIVKDQFKVFQDDMQ 388
Cdd:TIGR04523  166 KQKEELENELNLLEKEKLNIQKNIDKIKNKLLKLELLLS--NLKKKIQKNKSLESQ-ISELKKqnNQLKDNIEKKQQEIN 242
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   389 ETIAQLFKTVSSLSKDLESTRQAV--LQVNQSFVSVTAQKDSALirENQptwkdITELKNSITDIRQEMALTCEKPLKEL 466
Cdd:TIGR04523  243 EKTTEISNTQTQLNQLKDEQNKIKkqLSEKQKELEQNNKKIKEL--EKQ-----LNQLKSEISDLNNQKEQDWNKELKSE 315
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   467 VAKQshlEGALEQEHSQIVlyhQSlNETLSNMQEAHTQLLSILQVSGTENVATEESVNSNVTKYVSVLQETASKqglmlL 546
Cdd:TIGR04523  316 LKNQ---EKKLEEIQNQIS---QN-NKIISQLNEQISQLKKELTNSESENSEKQRELEEKQNEIEKLKKENQSY-----K 383
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   547 QMLSDLHVQ----ESKISNLTIL----------LEMEKESARGECEEMLSKcRHDFKFQLKDTEENLHVLNQTLTEvifp 612
Cdd:TIGR04523  384 QEIKNLESQindlESKIQNQEKLnqqkdeqikkLQQEKELLEKEIERLKET-IIKNNSEIKDLTNQDSVKELIIKN---- 458
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   613 MDIKVDKMSEQLNDLTYDMEILQPLLEQrslLQQQIihepKEDTvtrRELQNLIGAVNQLNVLTKELTKRHNLLRNEVQ- 691
Cdd:TIGR04523  459 LDNTRESLETQLKVLSRSINKIKQNLEQ---KQKEL----KSKE---KELKKLNEEKKELEEKVKDLTKKISSLKEKIEk 528
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   692 --SRSEAFERRISDhaleTEDGLNKtmivinnaIDFVQDNYVLKETLSAKpynpkvcecNQNmdailsfISEFQHLNDSi 769
Cdd:TIGR04523  529 leSEKKEKESKISD----LEDELNK--------DDFELKKENLEKEIDEK---------NKE-------IEELKQTQKS- 579
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   770 qtLVNDNQKynfilqiakaltaipKDEKLSQLNFQKvyqmyNETTSQVSKCQQNVSYLKEHMLAVKKNTKEFETRLQGIE 849
Cdd:TIGR04523  580 --LKKKQEE---------------KQELIDQKEKEK-----KDLIKEIEEKEKKISSLEKELEKAKKENEKLSSIIKNIK 637

                   ..
gi 293358604   850 SK 851
Cdd:TIGR04523  638 SK 639
EGF_CA smart00179
Calcium-binding EGF-like domain;
1027-1059 4.11e-06

Calcium-binding EGF-like domain;


Pssm-ID: 214542 [Multi-domain]  Cd Length: 39  Bit Score: 44.55  E-value: 4.11e-06
                            10        20        30
                    ....*....|....*....|....*....|....*
gi 293358604   1027 CSSF-PCQNGGTCISGRSSFICACRHPFM-GDTCT 1059
Cdd:smart00179    5 CASGnPCQNGGTCVNTVGSYRCECPPGYTdGRNCE 39
ClyA_Cry6Aa-like cd22656
Bacillus thuringiensis crystal 6Aa (Cry6Aa) toxin, and similar proteins; This model includes ...
333-505 6.75e-03

Bacillus thuringiensis crystal 6Aa (Cry6Aa) toxin, and similar proteins; This model includes pesticidal Cry6Aa toxin from Bacillus thuringiensis, one of the many parasporal crystal (Cry) toxins produced during the sporulation phase of growth. Many of these proteins are toxic to numerous insect species and have been effectively used as proteinaceous insecticides to directly kill insect pests; some have been used to control insect growth on transgenic agricultural plants. Cry6Aa exists as a protoxin, which is activated by cleavage using trypsin. Structure studies for Cry6Aa support a mechanism of action by pore formation, similar to cytolysin A (ClyA)-type alpha pore-forming toxins (alpha-PFTs) such as HblB, and bioassay and mutation studies show that Cry6Aa is an active pore-forming toxin. Cry6Aa shows atypical features compared to other members of alpha-PFTs, including internal repeat sequences and small loop regions within major alpha helices.


Pssm-ID: 439154 [Multi-domain]  Cd Length: 309  Bit Score: 40.05  E-value: 6.75e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604  333 AAGDLRNAYLSLEEKVSKDNSKEFQSFLKALKsKSIDDLLKNIVKDQfkvfqDDMQETIAQLFKTVSSLSKDLESTRQAV 412
Cdd:cd22656    85 AGGTIDSYYAEILELIDDLADATDDEELEEAK-KTIKALLDDLLKEA-----KKYQDKAAKVVDKLTDFENQTEKDQTAL 158
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604  413 LQVNQSFVSVTAQKDSALIRenqptwKDITELKNSITDIRQEMALTCEKPLKELVAKQSHLEGALEQEHSQIVLYH---Q 489
Cdd:cd22656   159 ETLEKALKDLLTDEGGAIAR------KEIKDLQKELEKLNEEYAAKLKAKIDELKALIADDEAKLAAALRLIADLTaadT 232
                         170
                  ....*....|....*.
gi 293358604  490 SLNETLSNMQEAHTQL 505
Cdd:cd22656   233 DLDNLLALIGPAIPAL 248
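Several hits in this list cover the same stretch of the query (for example, three EGF models at 1027-1059). The sketch below groups overlapping intervals into regions and keeps the lowest E-value hit in each region. This is a simple illustrative heuristic, not a description of how CD-Search itself selects representative hits, and `best_per_region` is a hypothetical helper name.

```python
# (accession, start, end, E-value) copied from the hit list above.
HITS = [
    ("smart00110", 1076, 1210, 1.11e-39),
    ("pfam00386", 1084, 1207, 1.61e-31),
    ("pfam07546", 194, 263, 2.34e-12),
    ("cd00054", 1027, 1059, 1.52e-08),
    ("pfam15921", 360, 707, 1.38e-07),
    ("pfam00008", 1027, 1057, 5.02e-07),
    ("TIGR04523", 311, 851, 3.09e-06),
    ("smart00179", 1027, 1059, 4.11e-06),
    ("cd22656", 333, 505, 6.75e-03),
]


def best_per_region(hits):
    """Merge overlapping intervals, then pick the lowest E-value hit per region."""
    regions = []  # each entry: [region_start, region_end, member_hits]
    for hit in sorted(hits, key=lambda h: h[1]):
        _, start, end, _ = hit
        if regions and start <= regions[-1][1]:
            # Overlaps the current region: extend it and record the hit.
            regions[-1][1] = max(regions[-1][1], end)
            regions[-1][2].append(hit)
        else:
            regions.append([start, end, [hit]])
    return [(s, e, min(members, key=lambda h: h[3])[0]) for s, e, members in regions]


for start, end, accession in best_per_region(HITS):
    print(f"{start}-{end}: {accession}")
```

With the values listed here, the nine hits collapse into four regions of the query sequence.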
 
Name Accession Description Interval E-value
CCDC158 pfam15921
Coiled-coil domain-containing protein 158; CCDC158 is a family of proteins found in eukaryotes. ...
337-703 1.60e-07

Coiled-coil domain-containing protein 158; CCDC158 is a family of proteins found in eukaryotes. The function is not known.


Pssm-ID: 464943 [Multi-domain]  Cd Length: 1112  Bit Score: 55.89  E-value: 1.60e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   337 LRNAYLSLEEKVSK-DNSKEFQSFLKALKSKSIDDLlknivkdqfkvfQDDMQETIAQLfKTVSSLSKD-LESTRQAVLQ 414
Cdd:pfam15921  108 LRQSVIDLQTKLQEmQMERDAMADIRRRESQSQEDL------------RNQLQNTVHEL-EAAKCLKEDmLEDSNTQIEQ 174
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   415 VNQSFVS---VTAQKDSALIRENQPTWKDITELKNSITDIRQEMALTCEKPLKELVAKQSHLEGALEQEHSQI-VLYHQS 490
Cdd:pfam15921  175 LRKMMLShegVLQEIRSILVDFEEASGKKIYEHDSMSTMHFRSLGSAISKILRELDTEISYLKGRIFPVEDQLeALKSES 254
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   491 LNET---LSNMQEAHTQLLSI--LQVSG-TENVATEESVNSNVTKYVSVLQETASKQGLMLLQMLSDLhvqESKISNLTI 564
Cdd:pfam15921  255 QNKIellLQQHQDRIEQLISEheVEITGlTEKASSARSQANSIQSQLEIIQEQARNQNSMYMRQLSDL---ESTVSQLRS 331
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   565 LL------------EMEKE---------SARGEcEEMLSKCRHDFKFQLKDTEENLHVLNQTLT------EVIFPMD--- 614
Cdd:pfam15921  332 ELreakrmyedkieELEKQlvlanseltEARTE-RDQFSQESGNLDDQLQKLLADLHKREKELSlekeqnKRLWDRDtgn 410
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   615 -IKVDKMSEQLNDLTYDMEILQPLLEQ-RSLLQQQIihepkedtvtRRELQNLIG---AVNQLNVLTKELTKRHNLLRNE 689
Cdd:pfam15921  411 sITIDHLRRELDDRNMEVQRLEALLKAmKSECQGQM----------ERQMAAIQGkneSLEKVSSLTAQLESTKEMLRKV 480
                          410       420
                   ....*....|....*....|
gi 293358604   690 VQSRS------EAFERRISD 703
Cdd:pfam15921  481 VEELTakkmtlESSERTVSD 500
EGF cd00053
Epidermal growth factor domain, found in epidermal growth factor (EGF) presents in a large ...
1027-1059 1.42e-06

Epidermal growth factor domain, found in epidermal growth factor (EGF) presents in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; Subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.


Pssm-ID: 238010  Cd Length: 36  Bit Score: 45.93  E-value: 1.42e-06
                          10        20        30
                  ....*....|....*....|....*....|....*
gi 293358604 1027 CSSF-PCQNGGTCISGRSSFICACRHPFMGD-TCT 1059
Cdd:cd00053     2 CAASnPCSNGGTCVNTPGSYRCVCPPGYTGDrSCE 36
Mplasa_alph_rch TIGR04523
helix-rich Mycoplasma protein; Members of this family occur strictly within a subset of ...
311-851 3.09e-06

helix-rich Mycoplasma protein; Members of this family occur strictly within a subset of Mycoplasma species. Members average 750 amino acids in length, including signal peptide. Sequences are predicted (Jpred 3) to be almost entirely alpha-helical. These sequences show strong periodicity (consistent with long alpha helical structures) and low complexity rich in D,E,N,Q, and K. Genes encoding these proteins are often found in tandem. The function is unknown.


Pssm-ID: 275316 [Multi-domain]  Cd Length: 745  Bit Score: 51.56  E-value: 3.09e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   311 EQLSQQERKLMLLQKKVDNVSLAAGDLRNAYLSLEEKVSkdNSKEFQSFLKALKSKsIDDLLK--NIVKDQFKVFQDDMQ 388
Cdd:TIGR04523  166 KQKEELENELNLLEKEKLNIQKNIDKIKNKLLKLELLLS--NLKKKIQKNKSLESQ-ISELKKqnNQLKDNIEKKQQEIN 242
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   389 ETIAQLFKTVSSLSKDLESTRQAV--LQVNQSFVSVTAQKDSALirENQptwkdITELKNSITDIRQEMALTCEKPLKEL 466
Cdd:TIGR04523  243 EKTTEISNTQTQLNQLKDEQNKIKkqLSEKQKELEQNNKKIKEL--EKQ-----LNQLKSEISDLNNQKEQDWNKELKSE 315
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   467 VAKQshlEGALEQEHSQIVlyhQSlNETLSNMQEAHTQLLSILQVSGTENVATEESVNSNVTKYVSVLQETASKqglmlL 546
Cdd:TIGR04523  316 LKNQ---EKKLEEIQNQIS---QN-NKIISQLNEQISQLKKELTNSESENSEKQRELEEKQNEIEKLKKENQSY-----K 383
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   547 QMLSDLHVQ----ESKISNLTIL----------LEMEKESARGECEEMLSKcRHDFKFQLKDTEENLHVLNQTLTEvifp 612
Cdd:TIGR04523  384 QEIKNLESQindlESKIQNQEKLnqqkdeqikkLQQEKELLEKEIERLKET-IIKNNSEIKDLTNQDSVKELIIKN---- 458
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   613 MDIKVDKMSEQLNDLTYDMEILQPLLEQrslLQQQIihepKEDTvtrRELQNLIGAVNQLNVLTKELTKRHNLLRNEVQ- 691
Cdd:TIGR04523  459 LDNTRESLETQLKVLSRSINKIKQNLEQ---KQKEL----KSKE---KELKKLNEEKKELEEKVKDLTKKISSLKEKIEk 528
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   692 --SRSEAFERRISDhaleTEDGLNKtmivinnaIDFVQDNYVLKETLSAKpynpkvcecNQNmdailsfISEFQHLNDSi 769
Cdd:TIGR04523  529 leSEKKEKESKISD----LEDELNK--------DDFELKKENLEKEIDEK---------NKE-------IEELKQTQKS- 579
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   770 qtLVNDNQKynfilqiakaltaipKDEKLSQLNFQKvyqmyNETTSQVSKCQQNVSYLKEHMLAVKKNTKEFETRLQGIE 849
Cdd:TIGR04523  580 --LKKKQEE---------------KQELIDQKEKEK-----KDLIKEIEEKEKKISSLEKELEKAKKENEKLSSIIKNIK 637

                   ..
gi 293358604   850 SK 851
Cdd:TIGR04523  638 SK 639
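
For a long multi-domain model it can help to see how much of the model the alignment spans. In the alignment above the Cdd rows run from model position 166 to 639, and the summary line gives a Cd Length of 745. The sketch below is an illustration only: the helper name span_fraction is hypothetical, and the span includes gapped columns, so it is an upper bound on the number of aligned model positions.

```python
def span_fraction(first, last, total_length):
    """Fraction of a model of total_length spanned by positions first..last (inclusive)."""
    if not 1 <= first <= last <= total_length:
        raise ValueError("positions must satisfy 1 <= first <= last <= total_length")
    return (last - first + 1) / total_length


# Values read from the TIGR04523 alignment and its summary line above.
MODEL_FIRST, MODEL_LAST, MODEL_LENGTH = 166, 639, 745

if __name__ == "__main__":
    fraction = span_fraction(MODEL_FIRST, MODEL_LAST, MODEL_LENGTH)
    print(f"model span: {fraction:.1%}")
```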
EGF_CA smart00179
Calcium-binding EGF-like domain;
1027-1059 4.11e-06

Calcium-binding EGF-like domain;


Pssm-ID: 214542 [Multi-domain]  Cd Length: 39  Bit Score: 44.55  E-value: 4.11e-06
                            10        20        30
                    ....*....|....*....|....*....|....*
gi 293358604   1027 CSSF-PCQNGGTCISGRSSFICACRHPFM-GDTCT 1059
Cdd:smart00179    5 CASGnPCQNGGTCVNTVGSYRCECPPGYTdGRNCE 39
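
A rough identity count can be taken from the two aligned rows above. This is a minimal Python sketch under one stated assumption: columns holding a gap ('-') or a lowercase letter in either row are treated as unaligned and skipped. The function name identity is illustrative.

```python
# The two rows of the smart00179 alignment, copied from the listing above.
QUERY = "CSSF-PCQNGGTCISGRSSFICACRHPFM-GDTCT"
MODEL = "CASGnPCQNGGTCVNTVGSYRCECPPGYTdGRNCE"


def identity(query_row, model_row):
    """Return (identical, aligned) column counts for two equal-length rows.

    Assumption: a column counts as aligned only when both characters are
    uppercase letters; gaps and lowercase residues are skipped.
    """
    if len(query_row) != len(model_row):
        raise ValueError("rows must have the same length")
    identical = 0
    aligned = 0
    for q, m in zip(query_row, model_row):
        if not (q.isupper() and m.isupper()):
            continue
        aligned += 1
        if q == m:
            identical += 1
    return identical, aligned


if __name__ == "__main__":
    same, total = identity(QUERY, MODEL)
    print(f"{same}/{total} aligned columns identical ({same / total:.1%})")
```

Percent identity is not what CD-Search ranks by; the bit score and E-value on the summary line come from the position-specific scoring matrix.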
CCDC158 pfam15921
Coiled-coil domain-containing protein 158; CCDC158 is a family of proteins found in eukaryotes. ...
269-727 5.32e-06

Coiled-coil domain-containing protein 158; CCDC158 is a family of proteins found in eukaryotes. The function is not known.


Pssm-ID: 464943 [Multi-domain]  Cd Length: 1112  Bit Score: 50.89  E-value: 5.32e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   269 KAEE-RQQLVHSNQ--AESHTAVDQGTaqqqkQDSGD-PAMIHKMAEQLSQQERKLMLlqKKVDNVSLAAGDLRNA---- 340
Cdd:pfam15921  343 KIEElEKQLVLANSelTEARTERDQFS-----QESGNlDDQLQKLLADLHKREKELSL--EKEQNKRLWDRDTGNSitid 415
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   341 YLSLEEKVSKDNSKEFQSFLKALKSKSIDDLLKNIVKDQFKvfqddmQETIAQlfktVSSLSKDLESTRQAVLQVNQsfv 420
Cdd:pfam15921  416 HLRRELDDRNMEVQRLEALLKAMKSECQGQMERQMAAIQGK------NESLEK----VSSLTAQLESTKEMLRKVVE--- 482
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   421 SVTAQKDSalIRENQPTWKDIT----------ELKNS-ITDIRQEMALTCEKpLKELVAKQSHLEG------ALEQEHSQ 483
Cdd:pfam15921  483 ELTAKKMT--LESSERTVSDLTaslqekeraiEATNAeITKLRSRVDLKLQE-LQHLKNEGDHLRNvqteceALKLQMAE 559
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   484 IVLYHQSLNETLSNMQE---AHTQLLSILQVsgtENVATEESVNSNVT--KYVSVLQETASKQGLMLLQMLSDLHVQESK 558
Cdd:pfam15921  560 KDKVIEILRQQIENMTQlvgQHGRTAGAMQV---EKAQLEKEINDRRLelQEFKILKDKKDAKIRELEARVSDLELEKVK 636
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   559 ISN-----LTILLEMEKEsaRGECEEMLSKCRHDfkfqLKDTEENLHVLNQTLTEVIFPMDIKVDKMSEQLNDLTYDMEI 633
Cdd:pfam15921  637 LVNagserLRAVKDIKQE--RDQLLNEVKTSRNE----LNSLSEDYEVLKRNFRNKSEEMETTTNKLKMQLKSAQSELEQ 710
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   634 LQPLLEQRS-----------LLQQQIIHEPKEDTVTRRELQNLIGAVNQLNVLTKELTKRHNLLRNEVQS---------- 692
Cdd:pfam15921  711 TRNTLKSMEgsdghamkvamGMQKQITAKRGQIDALQSKIQFLEEAMTNANKEKHFLKEEKNKLSQELSTvateknkmag 790
                          490       500       510       520
                   ....*....|....*....|....*....|....*....|
gi 293358604   693 -----RSEafERRISDHALETEDGLNKTMIVINNAIDFVQ 727
Cdd:pfam15921  791 elevlRSQ--ERRLKEKVANMEVALDKASLQFAECQDIIQ 828
235kDa-fam TIGR01612
reticulocyte binding/rhoptry protein; This model represents a group of paralogous families in ...
325-865 1.17e-05

reticulocyte binding/rhoptry protein; This model represents a group of paralogous families in plasmodium species alternately annotated as reticulocyte binding protein, 235-kDa family protein and rhoptry protein. Rhoptry protein is localized on the cell surface and is extremely large (although apparently lacking in repeat structure) and is important for the process of invasion of the RBCs by the parasite. These proteins are found in P. falciparum, P. vivax and P. yoelii.


Pssm-ID: 130673 [Multi-domain]  Cd Length: 2757  Bit Score: 50.05  E-value: 1.17e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   325 KKVDNVSLAAGDLRNAYLSLEEKVSK--DNSKEFQSFLKALKSKSIDDLLkNIVKDQFKVFQDDMQETIAQLFKTVSSLS 402
Cdd:TIGR01612 1774 ETVSKEPITYDEIKNTRINAQNEFLKiiEIEKKSKSYLDDIEAKEFDRII-NHFKKKLDHVNDKFTKEYSKINEGFDDIS 1852
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   403 KDLE----STRQAVL-----QVNQSFVSVTAQKDSALIRENQPTWKDITELKNSIT-DIRQEMALTCEKPLKelVAKQSH 472
Cdd:TIGR01612 1853 KSIEnvknSTDENLLfdilnKTKDAYAGIIGKKYYSYKDEAEKIFINISKLANSINiQIQNNSGIDLFDNIN--IAILSS 1930
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   473 LEGALEqEHSQIVLYHQSLNETLSNMQEAHTQLLSILQVSGTENVATEESVN---SNVTKYVSVLQETASKQglmllqML 549
Cdd:TIGR01612 1931 LDSEKE-DTLKFIPSPEKEPEIYTKIRDSYDTLLDIFKKSQDLHKKEQDTLNiifENQQLYEKIQASNELKD------TL 2003
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   550 SDLHVQESKISN-LTILLEMEKESARGEC-----EEMLSKCRHDfkfQLKDTEENLHVLNQTlteviFPMDIKVDKMSEQ 623
Cdd:TIGR01612 2004 SDLKYKKEKILNdVKLLLHKFDELNKLSCdsqnyDTILELSKQD---KIKEKIDNYEKEKEK-----FGIDFDVKAMEEK 2075
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   624 LNDLTYDMEilqplleqrsLLQQQIIHEPKEDTVTRRELQNLIGAVNQLNVLTkeltkrhNLLRNEVqsrseafeRRISD 703
Cdd:TIGR01612 2076 FDNDIKDIE----------KFENNYKHSEKDNHDFSEEKDNIIQSKKKLKELT-------EAFNTEI--------KIIED 2130
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   704 HALETEDGLNKTMIVINNAIDFVQDNYVlkETLSAK--PYNPKVCECNQNMDAILSFISEFQH-LNDSIQTL---VNDNQ 777
Cdd:TIGR01612 2131 KIIEKNDLIDKLIEMRKECLLFSYATLV--ETLKSKviNHSEFITSAAKFSKDFFEFIEDISDsLNDDIDALqikYNLNQ 2208
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   778 KYNFILQIAKALTA-----IPKDEKLSQL--NFQKVYQM-----------YNETT-----SQVSKCQQNVSYLKEHMLAV 834
Cdd:TIGR01612 2209 TKKHMISILADATKdhnnlIEKEKEATKIinNLTELFTIdfnnadadilhNNKIQiiyfnSELHKSIESIKKLYKKINAF 2288
                          570       580       590       600
                   ....*....|....*....|....*....|....*....|....*...
gi 293358604   835 KKN------------TKEFETRLQGIESKVTKAL-----IPYYISFKK 865
Cdd:TIGR01612 2289 KLLnishinekyfdiSKEFDNIIQLQKHKLTENLndlkeIDQYISDKK 2336
Mplasa_alph_rch TIGR04523
helix-rich Mycoplasma protein; Members of this family occur strictly within a subset of ...
314-854 1.75e-05

helix-rich Mycoplasma protein; Members of this family occur strictly within a subset of Mycoplasma species. Members average 750 amino acids in length, including signal peptide. Sequences are predicted (Jpred 3) to be almost entirely alpha-helical. These sequences show strong periodicity (consistent with long alpha helical structures) and low complexity rich in D,E,N,Q, and K. Genes encoding these proteins are often found in tandem. The function is unknown.


Pssm-ID: 275316 [Multi-domain]  Cd Length: 745  Bit Score: 49.25  E-value: 1.75e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   314 SQQERKLMLLQKKVDNVSLAAGDLRNAYLSLEEKV--SKDNSKEFQSFLKALKSK------SIDDLLKNIVKdqFKVFQD 385
Cdd:TIGR04523   36 KQLEKKLKTIKNELKNKEKELKNLDKNLNKDEEKInnSNNKIKILEQQIKDLNDKlkknkdKINKLNSDLSK--INSEIK 113
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   386 DMQETIAQLFKTVSSLSKDLESTRQAVLQVN------QSFVSVTAQKDSALIRENQPTWKDITELKNSITDIRQEMALTC 459
Cdd:TIGR04523  114 NDKEQKNKLEVELNKLEKQKKENKKNIDKFLteikkkEKELEKLNNKYNDLKKQKEELENELNLLEKEKLNIQKNIDKIK 193
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   460 EKPLKelvakqshlegaLEQEHSQIVLY---HQSLNETLSNMQEAHTQLLSILQVSGTENVATEESVNSNVTKY--VSVL 534
Cdd:TIGR04523  194 NKLLK------------LELLLSNLKKKiqkNKSLESQISELKKQNNQLKDNIEKKQQEINEKTTEISNTQTQLnqLKDE 261
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   535 QETASKQglmLLQMLSDLHVQESKISNLTILL---EMEKESARGECEEMLSKcrhDFKFQLKDTEENLHVLNQTLTE--- 608
Cdd:TIGR04523  262 QNKIKKQ---LSEKQKELEQNNKKIKELEKQLnqlKSEISDLNNQKEQDWNK---ELKSELKNQEKKLEEIQNQISQnnk 335
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   609 VIFPMDIKVDKMSEQLNDLTYDMEILQPLLEQRsllQQQIIHEPKEDTVTRRELQNLIGAVNQLNVLTKELTKRHNLLRN 688
Cdd:TIGR04523  336 IISQLNEQISQLKKELTNSESENSEKQRELEEK---QNEIEKLKKENQSYKQEIKNLESQINDLESKIQNQEKLNQQKDE 412
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   689 EV---QSRSEAFERRIS-------------------DHALETE-DGLNKTMIVINNAIDFVQDNY-VLKETLSAKpynpk 744
Cdd:TIGR04523  413 QIkklQQEKELLEKEIErlketiiknnseikdltnqDSVKELIiKNLDNTRESLETQLKVLSRSInKIKQNLEQK----- 487
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   745 VCECNQNMDAILSFISEFQHLNDSIQTLvndNQKYNFILQIAKALTA--IPKDEKLSQLNFQKVYQMYNETTS----QVS 818
Cdd:TIGR04523  488 QKELKSKEKELKKLNEEKKELEEKVKDL---TKKISSLKEKIEKLESekKEKESKISDLEDELNKDDFELKKEnlekEID 564
                          570       580       590
                   ....*....|....*....|....*....|....*.
gi 293358604   819 KCQQNVSYLKEHMLAVKKNTKEFETRLQGIESKVTK 854
Cdd:TIGR04523  565 EKNKEIEELKQTQKSLKKKQEEKQELIDQKEKEKKD 600
SMC_prok_B TIGR02168
chromosome segregation protein SMC, common bacterial type; SMC (structural maintenance of ...
431-691 2.48e-05

chromosome segregation protein SMC, common bacterial type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. This family represents the SMC protein of most bacteria. The smc gene is often associated with scpB (TIGR00281) and scpA genes, where scp stands for segregation and condensation protein. SMC was shown (in Caulobacter crescentus) to be induced early in S phase but present and bound to DNA throughout the cell cycle. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


Pssm-ID: 274008 [Multi-domain]  Cd Length: 1179  Bit Score: 48.90  E-value: 2.48e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   431 IRENqptWKDITELKNSITDIRQEMAlTCEKPLKELVAKQSHLEGALEQEHSQIVLYHQSLnETLSNMQEAHTQLLSILQ 510
Cdd:TIGR02168  679 IEEL---EEKIEELEEKIAELEKALA-ELRKELEELEEELEQLRKELEELSRQISALRKDL-ARLEAEVEQLEERIAQLS 753
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   511 VSGTENVATEESVNSNVTKYVSVLQETASKQGLM---LLQMLSDLHVQESKISNLTILLEMEKESAR--GECEEMLSKCR 585
Cdd:TIGR02168  754 KELTELEAEIEELEERLEEAEEELAEAEAEIEELeaqIEQLKEELKALREALDELRAELTLLNEEAAnlRERLESLERRI 833
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   586 HDFKFQLKDTEENLHVLNQT---LTEVIFPMDIKVDKMSEQLNDltydmeilqpLLEQRSLLQQQIIHEPKEDTVTRREL 662
Cdd:TIGR02168  834 AATERRLEDLEEQIEELSEDiesLAAEIEELEELIEELESELEA----------LLNERASLEEALALLRSELEELSEEL 903
                          250       260
                   ....*....|....*....|....*....
gi 293358604   663 QNLIGAVNQLNVLTKELTKRHNLLRNEVQ 691
Cdd:TIGR02168  904 RELESKRSELRRELEELREKLAQLELRLE 932
EGF smart00181
Epidermal growth factor-like domain;
1027-1058 7.23e-04

Epidermal growth factor-like domain;


Pssm-ID: 214544  Cd Length: 35  Bit Score: 38.27  E-value: 7.23e-04
                            10        20        30
                    ....*....|....*....|....*....|....
gi 293358604   1027 CSSF-PCQNGgTCISGRSSFICACRHPFMGD-TC 1058
Cdd:smart00181    2 CASGgPCSNG-TCINTPGSYTCSCPPGYTGDkRC 34
hEGF pfam12661
Human growth factor-like EGF; hEGF, or human growth factor-like EGF, domains have six ...
1032-1049 1.79e-03

Human growth factor-like EGF; hEGF, or human growth factor-like EGF, domains have six conserved residues disulfide-bonded into the characteristic 'ababcc' pattern. They are involved in growth and proliferation of cells, in proteins of the Notch/Delta pathway, neuregulin and selectins. hEGFs are also found in mosaic proteins with four-disulfide laminin EGFs such as aggrecan and perlecan. The core fold of the EGF domain consists of two small beta-hairpins packed against each other. Two major structural variants have been identified based on the structural context of the C-terminal Cys residue of disulfide 'c' in the C-terminal hairpin: hEGFs and cEGFs. In hEGFs the C-terminal thiol resides in the beta-turn, resulting in shorter loop-lengths between the Cys residues of disulfide 'c', typically C[8-9]XC. These shorter loop-lengths are also typical of the four-disulfide EGF domains, laminin and integrin. Tandem hEGF domains have six linking residues between terminal cysteines of adjacent domains. hEGF domains may or may not bind calcium in the linker region. hEGF domains with the consensus motif CXD4X[F,Y]XCXC are hydroxylated exclusively in the Asp residue.


Pssm-ID: 463660  Cd Length: 22  Bit Score: 36.93  E-value: 1.79e-03
                           10
                   ....*....|....*...
gi 293358604  1032 CQNGGTCISGRSSFICAC 1049
Cdd:pfam12661    1 CQNGGTCVDGVNGYKCQC 18
SMC_prok_B TIGR02168
chromosome segregation protein SMC, common bacterial type; SMC (structural maintenance of ...
269-502 2.54e-03

chromosome segregation protein SMC, common bacterial type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. This family represents the SMC protein of most bacteria. The smc gene is often associated with scpB (TIGR00281) and scpA genes, where scp stands for segregation and condensation protein. SMC was shown (in Caulobacter crescentus) to be induced early in S phase but present and bound to DNA throughout the cell cycle. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


Pssm-ID: 274008 [Multi-domain]  Cd Length: 1179  Bit Score: 42.35  E-value: 2.54e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   269 KAEERQQLVHSNQAESHTaVDQGTAQQQKQDSGDPAMIHKMAEQLSQQERKLMLLQKKVDNVSLAAGDLRNAYLSLEEKV 348
Cdd:TIGR02168  282 EIEELQKELYALANEISR-LEQQKQILRERLANLERQLEELEAQLEELESKLDELAEELAELEEKLEELKEELESLEAEL 360
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604   349 skdnsKEFQSFLKALKSKsIDDLLKNIvkDQFKVFQDDMQETIAQLFKTVSSLSKDLESTRQAVLQVNQSFVSVTAQKDS 428
Cdd:TIGR02168  361 -----EELEAELEELESR-LEELEEQL--ETLRSKVAQLELQIASLNNEIERLEARLERLEDRRERLQQEIEELLKKLEE 432
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 293358604   429 ALIRENQPTWKDITELKNSITDIRQEMALTCEKPLKELVAKQSHLEGALEQEHSqivlyHQSLNETLSNMQEAH 502
Cdd:TIGR02168  433 AELKELQAELEELEEELEELQEELERLEEALEELREELEEAEQALDAAERELAQ-----LQARLDSLERLQENL 501
ClyA_Cry6Aa-like cd22656
Bacillus thuringiensis crystal 6Aa (Cry6Aa) toxin, and similar proteins; This model includes ...
333-505 6.75e-03

Bacillus thuringiensis crystal 6Aa (Cry6Aa) toxin, and similar proteins; This model includes pesticidal Cry6Aa toxin from Bacillus thuringiensis, one of the many parasporal crystal (Cry) toxins produced during the sporulation phase of growth. Many of these proteins are toxic to numerous insect species and have been effectively used as proteinaceous insecticides to directly kill insect pests; some have been used to control insect growth on transgenic agricultural plants. Cry6Aa exists as a protoxin, which is activated by cleavage using trypsin. Structure studies for Cry6Aa support a mechanism of action by pore formation, similar to cytolysin A (ClyA)-type alpha pore-forming toxins (alpha-PFTs) such as HblB, and bioassay and mutation studies show that Cry6Aa is an active pore-forming toxin. Cry6Aa shows atypical features compared to other members of alpha-PFTs, including internal repeat sequences and small loop regions within major alpha helices.


Pssm-ID: 439154 [Multi-domain]  Cd Length: 309  Bit Score: 40.05  E-value: 6.75e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604  333 AAGDLRNAYLSLEEKVSKDNSKEFQSFLKALKsKSIDDLLKNIVKDQfkvfqDDMQETIAQLFKTVSSLSKDLESTRQAV 412
Cdd:cd22656    85 AGGTIDSYYAEILELIDDLADATDDEELEEAK-KTIKALLDDLLKEA-----KKYQDKAAKVVDKLTDFENQTEKDQTAL 158
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 293358604  413 LQVNQSFVSVTAQKDSALIRenqptwKDITELKNSITDIRQEMALTCEKPLKELVAKQSHLEGALEQEHSQIVLYH---Q 489
Cdd:cd22656   159 ETLEKALKDLLTDEGGAIAR------KEIKDLQKELEKLNEEYAAKLKAKIDELKALIADDEAKLAAALRLIADLTaadT 232
                         170
                  ....*....|....*.
gi 293358604  490 SLNETLSNMQEAHTQL 505
Cdd:cd22656   233 DLDNLLALIGPAIPAL 248
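
Many of the hits listed in this section cover overlapping stretches of the query. Merging their intervals shows which regions of the protein are covered at all. The sketch below uses the Interval values from the hits in this section; the list name HITS and the function merge_intervals are illustrative, and treating directly adjacent intervals as one region is a design choice, not a CD-Search rule.

```python
# (accession, first query residue, last query residue) for the hits in this section.
HITS = [
    ("cd00053", 1027, 1059),
    ("TIGR04523", 311, 851),
    ("smart00179", 1027, 1059),
    ("pfam15921", 269, 727),
    ("TIGR01612", 325, 865),
    ("TIGR04523", 314, 854),
    ("TIGR02168", 431, 691),
    ("smart00181", 1027, 1058),
    ("pfam12661", 1032, 1049),
    ("TIGR02168", 269, 502),
    ("cd22656", 333, 505),
]


def merge_intervals(intervals):
    """Merge overlapping or directly adjacent inclusive (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1] + 1:
            previous_start, previous_end = merged[-1]
            merged[-1] = (previous_start, max(previous_end, end))
        else:
            merged.append((start, end))
    return merged


if __name__ == "__main__":
    regions = merge_intervals([(first, last) for _, first, last in HITS])
    for first, last in regions:
        print(f"{first}-{last}")
```

For these hits the merge collapses to two regions: one long helix-rich stretch and the short EGF-like segment.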
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
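
Each hit in the list carries a summary line of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...", and the search used an E-value threshold of 0.01. The following Python sketch parses such a line and applies the threshold. The regular expression is written against the lines as they appear in this listing, not against any documented NCBI output format, and treating the threshold as inclusive is an assumption.

```python
import re

# Pattern for the summary lines as they appear in this listing (an assumption,
# not a documented format). The "[Multi-domain]" flag is optional.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)

E_VALUE_THRESHOLD = 0.01  # from the search parameters above


def parse_summary(line):
    """Parse one summary line into a dict, or return None if it does not match."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    return {
        "pssm_id": int(match.group("pssm_id")),
        "multi_domain": match.group("flag") == "Multi-domain",
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


def passes_threshold(hit, threshold=E_VALUE_THRESHOLD):
    """True when the hit's E-value is at or below the threshold (assumed inclusive)."""
    return hit["e_value"] <= threshold


if __name__ == "__main__":
    example = "Pssm-ID: 439154 [Multi-domain]  Cd Length: 309  Bit Score: 40.05  E-value: 6.75e-03"
    hit = parse_summary(example)
    print(hit, passes_threshold(hit))
```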
