Conserved domains on  [gi|442625924|ref|NP_001260040|]

dumpy, isoform Y [Drosophila melanogaster]


List of domain hits

Name Accession Description Interval E-value
PHA03247 super family cl33720
large tegument protein UL36; Provisional
14019-14660 1.68e-35

large tegument protein UL36; Provisional


The actual alignment was detected with superfamily member PHA03247:

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 153.17  E-value: 1.68e-35
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14019 PTSQRPVFITSPGNLSPTPQPGVINIPSVSQPGYPTPQSPIYDANYPTTQSPIPQQ----PGVVNIPSVPSPSYPAPNPP 14094
Cdd:PHA03247  2478 PVYRRPAEARFPFAAGAAPDPGGGGPPDPDAPPAPSRLAPAILPDEPVGEPVHPRMltwiRGLEELASDDAGDPPPPLPP 2557
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14095 VNYPTQPSPQIPvqpgviniPSAPLPTtpPQHPPVFIPSPEspspapkpgviniPSVThPEYPTSQVPVYDVNYSTTPSP 14174
Cdd:PHA03247  2558 AAPPAAPDRSVP--------PPRPAPR--PSEPAVTSRARR-------------PDAP-PQSARPRAPVDDRGDPRGPAP 2613
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14175 ipqkpgvvniPSAPQPVHPAPNPPVhefnyPTPPAVPQQPGvlNIPSYPTPVAPTPQSPIYIPSQEQPKPTTRPSVINVP 14254
Cdd:PHA03247  2614 ----------PSPLPPDTHAPDPPP-----PSPSPAANEPD--PHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQA 2676
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14255 SVP-----QPAYPTPQAPVYDVNYPTSPSVIPHQPGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVNIP-SVAQ 14328
Cdd:PHA03247  2677 SSPpqrprRRAARPTVGSLTSLADPPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPgGPAR 2756
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14329 PVHP--TYQPPVVERPAIYDVYYPPPPSRPGVINIpSPPRPVYPVPQQPIYVPAPVlhiPAPRPVIhNIPSVPQPTYPhr 14406
Cdd:PHA03247  2757 PARPptTAGPPAPAPPAAPAAGPPRRLTRPAVASL-SESRESLPSPWDPADPPAAV---LAPAAAL-PPAASPAGPLP-- 2829
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14407 nPPiqdvTYPAPQPSPPvpgivniPSLPQPVSTPTSG-------VINIPSQASPPISVPTPGIVNIPSIPQPTPQRPSPG 14479
Cdd:PHA03247  2830 -PP----TSAQPTAPPP-------PPGPPPPSLPLGGsvapggdVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTES 2897
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14480 IINVPSVPQPIPTAPSPgiinipsvPQPLPSPTPGViniPQQPTPPPlvQQPGIINIPSVQQPSTPTTQHPiQDVQYETQ 14559
Cdd:PHA03247  2898 FALPPDQPERPPQPQAP--------PPPQPQPQPPP---PPQPQPPP--PPPPRPQPPLAPTTDPAGAGEP-SGAVPQPW 2963
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14560 RPQPTPGVINIP----SVSQPTYPTQKPSyqdTSYPTVQPKPPVSG-----IINIPSVPQPV--------------PSLT 14616
Cdd:PHA03247  2964 LGALVPGRVAVPrfrvPQPAPSREAPASS---TPPLTGHSLSRVSSwasslALHEETDPPPVslkqtlwppddtedSDAD 3040
                          650       660       670       680
                   ....*....|....*....|....*....|....*....|....
gi 442625924 14617 PGVINLPSEPSYSAPIPKPGIINVPSIPEPIPSIPQNPVQEVYH 14660
Cdd:PHA03247  3041 SLFDSDSERSDLEALDPLPPEPHDPFAHEPDPATPEAGARESPS 3084
ZP super family cl42957
Zona pellucida (ZP) domain; ZP proteins are responsible for sperm adhesion to the zona ...
17722-17957 9.63e-17

Zona pellucida (ZP) domain; ZP proteins are responsible for sperm adhesion to the zona pellucida. ZP domains are also present in multidomain transmembrane proteins such as glycoprotein GP2, uromodulin and TGF-beta receptor type III (betaglycan).


The actual alignment was detected with superfamily member smart00241:

Pssm-ID: 214579  Cd Length: 252  Bit Score: 85.13  E-value: 9.63e-17
                             10        20        30        40        50        60        70        80
                     ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924   17722 CLADGVQVEIHiTEPGFNGVLYVKGHS-KDEECRRVVNLAGETVPRTEifrVHFGSCGM--QAVKDVA--SFVLVIQKHP 17796
Cdd:smart00241     2 CGEDQMVVSVS-TDLLFPGGINVKGLTlGDPSCRPQFTDATSAFVSFE---VPLNGCGTrrQVNPDGIvySNTLVVSPFH 77
                             90       100       110       120       130       140       150       160
                     ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924   17797 KLVTYKAQ--AYNIKCVYQTGEKnVTLGFNVSMLTTAGTIANTGPPPICQMRIITNEGE----EINSAEIGDNLKLQVDV 17870
Cdd:smart00241    78 PGFITRDDraAYHFQCFYPENEK-VSLNLDVSTIPPTELSSVSEGPLTCSYRLYKDDSFgspyQSADYVLGDPVYHEWEC 156
                            170       180       190       200       210       220       230       240
                     ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924   17871 EPATI--YGGFARSCIAKTMEDNVQNEYLVTDENGCATDTSIFGNWEYNPDTNSLL-ASFNAFKFPSSDNIRFQCNIRVC 17947
Cdd:smart00241   157 DGADDppLGLLVDNCYATPGPDPSSGPKYFIIDNGCPVDGYLDSTIPYNSNPLHRArFSVKVFKFADRSLVYFHCQIRLC 236
                            250
                     ....*....|....
gi 442625924   17948 ----FGRCQPVNCG 17957
Cdd:smart00241   237 dkddGSSCDGPACS 250
Atrophin-1 super family cl38111
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
13798-14227 9.68e-17

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


The actual alignment was detected with superfamily member pfam03154:

Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 90.21  E-value: 9.68e-17
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  13798 PVPIIQESPLTPCDPSPCGPNAQCHPSLNEAVCSCLPEfyGTPPNCRPECTLNSECA-----YDKACVH-------HKCV 13865
Cdd:pfam03154   172 PVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQ--GSPATSQPPNQTQSTAAphtliQQTPTLHpqrlpspHPPL 249
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  13866 DPCPGICGINADCRVHYHSPicyciSSHTGDPftrcyETPKPVR--PQIYDTPSPPYPVAIPdlvyvqqQQPGIVNIPSA 13943
Cdd:pfam03154   250 QPMTQPPPPSQVSPQPLPQP-----SLHGQMP-----PMPHSLQtgPSHMQHPVPPQPFPLT-------PQSSQSQVPPG 312
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  13944 PQPIYPTPQSPQYNVNYPSPQPANPQKPGVVNIPSVPQPVyPSPQPPvydvnyPTTPVSQHPGvvniPSAPRLvPPTSQR 14023
Cdd:pfam03154   313 PSPAAPGQSQQRIHTPPSQSQLQSQQPPREQPLPPAPLSM-PHIKPP------PTTPIPQLPN----PQSHKH-PPHLSG 380
                           250       260       270       280       290       300       310       320
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14024 PVFITSPGNLSPTPQPGVINIPSVSQPGYPTPQSPIYDANYPTTQSPIPQQPGVVNIPSVPSPSYPAPNPPVNYPTQPSP 14103
Cdd:pfam03154   381 PSPFQMNSNLPPPPALKPLSSLSTHHPPSAHPPPLQLMPQSQQLPPPPAQPPVLTQSQSLPPPAASHPPTSGLHQVPSQS 460
                           330       340       350       360       370       380       390       400
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14104 QIPVQPgviNIPSAPLPTTPPQHPPvfipspespspapkpgviniPSVThPEYPTSQVPVYDVNYSTTPSPipqkpgvvN 14183
Cdd:pfam03154   461 PFPQHP---FVPGGPPPITPPSGPP--------------------TSTS-SAMPGIQPPSSASVSSSGPVP--------A 508
                           410       420       430       440
                    ....*....|....*....|....*....|....*....|....*....
gi 442625924  14184 IPSAPQPVHPAPNPPVHEFNYPTPPAVPQ-----QPGVLNIPSYPTPVA 14227
Cdd:pfam03154   509 AVSCPLPPVQIKEEALDEAEEPESPPPPPrspspEPTVVNTPSHASQSA 557
EGF_CA smart00179
Calcium-binding EGF-like domain;
255-286 8.29e-06

Calcium-binding EGF-like domain;



Pssm-ID: 214542 [Multi-domain]  Cd Length: 39  Bit Score: 46.86  E-value: 8.29e-06
                             10        20        30
                     ....*....|....*....|....*....|..
gi 442625924     255 DVDECSYPNVCGPGAICTNLEGSYRCDCPPGY 286
Cdd:smart00179     1 DIDECASGNPCQNGGTCVNTVGSYRCECPPGY 32
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
338-373 3.22e-05

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.



Pssm-ID: 238011  Cd Length: 38  Bit Score: 45.32  E-value: 3.22e-05
                           10        20        30
                   ....*....|....*....|....*....|....*.
gi 442625924   338 DVDECATNNPCGLGAECVNLGGSFQCRCPSGFVLEH 373
Cdd:cd00054      1 DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYTGRN 36
EGF_3 pfam12947
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes ...
137-166 1.20e-04

EGF domain; This family includes a variety of EGF-like domain homologs. This family includes the C-terminal domain of the malaria parasite MSP1 protein.



Pssm-ID: 463759 [Multi-domain]  Cd Length: 36  Bit Score: 43.74  E-value: 1.20e-04
                            10        20        30
                    ....*....|....*....|....*....|
gi 442625924    137 PCDVFAHCTNTLGSFTCTCFPGYRGNGFHC 166
Cdd:pfam12947     7 GCHPNATCTNTGGSFTCTCNDGYTGDGVTC 36
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
212-247 1.63e-04

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.



Pssm-ID: 238011  Cd Length: 38  Bit Score: 43.39  E-value: 1.63e-04
                           10        20        30
                   ....*....|....*....|....*....|....*.
gi 442625924   212 DVDECRNPENCGPNALCTNTPGNYTCSCPDGYVGNN 247
Cdd:cd00054      1 DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYTGRN 36
EGF_CA smart00179
Calcium-binding EGF-like domain;
1022-1056 1.18e-03

Calcium-binding EGF-like domain;



Pssm-ID: 214542 [Multi-domain]  Cd Length: 39  Bit Score: 41.08  E-value: 1.18e-03
                             10        20        30
                     ....*....|....*....|....*....|....*
gi 442625924    1022 DVDECEERGaqLCAFGAQCVNKPGSYSCHCPEGYQ 1056
Cdd:smart00179     1 DIDECASGN--PCQNGGTCVNTVGSYRCECPPGYT 33
EGF_3 pfam12947
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes ...
461-490 4.93e-03

EGF domain; This family includes a variety of EGF-like domain homologs. This family includes the C-terminal domain of the malaria parasite MSP1 protein.



Pssm-ID: 463759 [Multi-domain]  Cd Length: 36  Bit Score: 39.12  E-value: 4.93e-03
                            10        20        30
                    ....*....|....*....|....*....|..
gi 442625924    461 CQDNP--CGENAICTDTVGSFVCTCKPDYTGD 490
Cdd:pfam12947     1 CSDNNggCHPNATCTNTGGSFTCTCNDGYTGD 32
EGF_3 pfam12947
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes ...
676-702 5.77e-03

EGF domain; This family includes a variety of EGF-like domain homologs. This family includes the C-terminal domain of the malaria parasite MSP1 protein.



Pssm-ID: 463759 [Multi-domain]  Cd Length: 36  Bit Score: 39.12  E-value: 5.77e-03
                            10        20
                    ....*....|....*....|....*..
gi 442625924    676 GSCGQNATCTNSAGGFTCACPPGFSGD 702
Cdd:pfam12947     6 GGCHPNATCTNTGGSFTCTCNDGYTGD 32
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
298-331 8.46e-03

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.



Pssm-ID: 238011  Cd Length: 38  Bit Score: 38.39  E-value: 8.46e-03
                           10        20        30
                   ....*....|....*....|....*....|....*
gi 442625924   298 DQDECA-RTPCGRNADCLNTDGSFRCLCPDGYSGD 331
Cdd:cd00054      1 DIDECAsGNPCQNGGTCVNTVGSYRCSCPPGYTGR 35
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
413-456 9.61e-03

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.



Pssm-ID: 238011  Cd Length: 38  Bit Score: 38.39  E-value: 9.61e-03
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....
gi 442625924   413 DIDECNQPDGvakCGTNAKCINFPGSYRCLCPSGFQGQgylHCE 456
Cdd:cd00054      1 DIDECASGNP---CQNGGTCVNTVGSYRCSCPPGYTGR---NCE 38
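
Each hit in the list above carries two machine-readable lines: an interval line (for example `14019-14660 1.68e-35`) and a summary line (`Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...`). The following is a minimal sketch of how those two lines could be parsed from this plain-text listing. The function names `parse_summary` and `parse_interval` are hypothetical, and the field layout is assumed from the lines shown here rather than from any official NCBI format specification.

```python
import re

# Assumed layout, taken from the summary lines in this listing:
#   Pssm-ID: <int> [optional flag]  Cd Length: <int>  Bit Score: <float>  E-value: <float>
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)

# Assumed layout of the interval line: "<start>-<end> <e-value>"
INTERVAL_RE = re.compile(r"^(?P<start>\d+)-(?P<end>\d+)\s+(?P<e_value>[\d.eE+-]+)$")


def parse_summary(line):
    """Return the fields of a 'Pssm-ID: ...' line as a dict, or None if it does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m["pssm_id"]),
        "multi_domain": m["flag"] == "Multi-domain",
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "e_value": float(m["e_value"]),
    }


def parse_interval(line):
    """Return start, end, inclusive length and E-value of an interval line, or None."""
    m = INTERVAL_RE.match(line.strip())
    if m is None:
        return None
    start, end = int(m["start"]), int(m["end"])
    return {
        "start": start,
        "end": end,
        "length": end - start + 1,  # residue coordinates are inclusive
        "e_value": float(m["e_value"]),
    }
```

Applied to the first hit, `parse_interval("14019-14660 1.68e-35")` yields an inclusive query span of 642 residues, which can then be compared against the `Cd Length` of the matched model.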
 
Name Accession Description Interval E-value
PHA03247 PHA03247
large tegument protein UL36; Provisional
14019-14660 1.68e-35

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 153.17  E-value: 1.68e-35
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14019 PTSQRPVFITSPGNLSPTPQPGVINIPSVSQPGYPTPQSPIYDANYPTTQSPIPQQ----PGVVNIPSVPSPSYPAPNPP 14094
Cdd:PHA03247  2478 PVYRRPAEARFPFAAGAAPDPGGGGPPDPDAPPAPSRLAPAILPDEPVGEPVHPRMltwiRGLEELASDDAGDPPPPLPP 2557
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14095 VNYPTQPSPQIPvqpgviniPSAPLPTtpPQHPPVFIPSPEspspapkpgviniPSVThPEYPTSQVPVYDVNYSTTPSP 14174
Cdd:PHA03247  2558 AAPPAAPDRSVP--------PPRPAPR--PSEPAVTSRARR-------------PDAP-PQSARPRAPVDDRGDPRGPAP 2613
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14175 ipqkpgvvniPSAPQPVHPAPNPPVhefnyPTPPAVPQQPGvlNIPSYPTPVAPTPQSPIYIPSQEQPKPTTRPSVINVP 14254
Cdd:PHA03247  2614 ----------PSPLPPDTHAPDPPP-----PSPSPAANEPD--PHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQA 2676
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14255 SVP-----QPAYPTPQAPVYDVNYPTSPSVIPHQPGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVNIP-SVAQ 14328
Cdd:PHA03247  2677 SSPpqrprRRAARPTVGSLTSLADPPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPgGPAR 2756
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14329 PVHP--TYQPPVVERPAIYDVYYPPPPSRPGVINIpSPPRPVYPVPQQPIYVPAPVlhiPAPRPVIhNIPSVPQPTYPhr 14406
Cdd:PHA03247  2757 PARPptTAGPPAPAPPAAPAAGPPRRLTRPAVASL-SESRESLPSPWDPADPPAAV---LAPAAAL-PPAASPAGPLP-- 2829
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14407 nPPiqdvTYPAPQPSPPvpgivniPSLPQPVSTPTSG-------VINIPSQASPPISVPTPGIVNIPSIPQPTPQRPSPG 14479
Cdd:PHA03247  2830 -PP----TSAQPTAPPP-------PPGPPPPSLPLGGsvapggdVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTES 2897
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14480 IINVPSVPQPIPTAPSPgiinipsvPQPLPSPTPGViniPQQPTPPPlvQQPGIINIPSVQQPSTPTTQHPiQDVQYETQ 14559
Cdd:PHA03247  2898 FALPPDQPERPPQPQAP--------PPPQPQPQPPP---PPQPQPPP--PPPPRPQPPLAPTTDPAGAGEP-SGAVPQPW 2963
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14560 RPQPTPGVINIP----SVSQPTYPTQKPSyqdTSYPTVQPKPPVSG-----IINIPSVPQPV--------------PSLT 14616
Cdd:PHA03247  2964 LGALVPGRVAVPrfrvPQPAPSREAPASS---TPPLTGHSLSRVSSwasslALHEETDPPPVslkqtlwppddtedSDAD 3040
                          650       660       670       680
                   ....*....|....*....|....*....|....*....|....
gi 442625924 14617 PGVINLPSEPSYSAPIPKPGIINVPSIPEPIPSIPQNPVQEVYH 14660
Cdd:PHA03247  3041 SLFDSDSERSDLEALDPLPPEPHDPFAHEPDPATPEAGARESPS 3084
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
14048-14503 3.64e-29

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 131.04  E-value: 3.64e-29
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14048 SQPGYPTPQSPIYDANYPTTQSPIPQQPGVVNipsVPSPSYPAPNPPVNYPTQPSPQIPvQPGVINIPSAPLPTTPPqhP 14127
Cdd:pfam03154   144 TSPSIPSPQDNESDSDSSAQQQILQTQPPVLQ---AQSGAASPPSPPPPGTTQAATAGP-TPSAPSVPPQGSPATSQ--P 217
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14128 PVFIPSPESPSPAPKPGviniPSVTHPEYPTSQVPVydvnystTPSPIPQKPGVVNIPSAPQPVHPAPNPPVHEfnyptp 14207
Cdd:pfam03154   218 PNQTQSTAAPHTLIQQT----PTLHPQRLPSPHPPL-------QPMTQPPPPSQVSPQPLPQPSLHGQMPPMPH------ 280
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14208 pavPQQPGVLNIPsYPTPVAPTPQSPIYIPSQEQPKPTtrpsvinvPSVPQPAYPTPQAPvydvnyPTSPSVIPHQPGVV 14287
Cdd:pfam03154   281 ---SLQTGPSHMQ-HPVPPQPFPLTPQSSQSQVPPGPS--------PAAPGQSQQRIHTP------PSQSQLQSQQPPRE 342
                           250       260       270       280       290       300       310       320
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14288 N-IPSVPLPAPPVKQRPVfvpSPVHPTPAPQ----PGVVNIPSVAQpVHPTYQPPVVERPAIYDVYYPPPPSRPgvinip 14362
Cdd:pfam03154   343 QpLPPAPLSMPHIKPPPT---TPIPQLPNPQshkhPPHLSGPSPFQ-MNSNLPPPPALKPLSSLSTHHPPSAHP------ 412
                           330       340       350       360       370       380       390       400
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14363 sPPRPVYPVPQQpiyVPAPvlhiPAPRPVIHNIPSVPQPTYPHRNP------PIQDvTYPAPQPSPPVPGIVNIPSLPQP 14436
Cdd:pfam03154   413 -PPLQLMPQSQQ---LPPP----PAQPPVLTQSQSLPPPAASHPPTsglhqvPSQS-PFPQHPFVPGGPPPITPPSGPPT 483
                           410       420       430       440       450       460
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 442625924  14437 VSTPTSGVINIPSQASPPISVPTPGIVNIPSIPQPTPQRPsPGIINVPSVPQPIPTAPS--PGIINIPS 14503
Cdd:pfam03154   484 STSSAMPGIQPPSSASVSSSGPVPAAVSCPLPPVQIKEEA-LDEAEEPESPPPPPRSPSpePTVVNTPS 551
ZP smart00241
Zona pellucida (ZP) domain; ZP proteins are responsible for sperm adhesion to the zona ...
17722-17957 9.63e-17

Zona pellucida (ZP) domain; ZP proteins are responsible for sperm adhesion to the zona pellucida. ZP domains are also present in multidomain transmembrane proteins such as glycoprotein GP2, uromodulin and TGF-beta receptor type III (betaglycan).


Pssm-ID: 214579  Cd Length: 252  Bit Score: 85.13  E-value: 9.63e-17
                             10        20        30        40        50        60        70        80
                     ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924   17722 CLADGVQVEIHiTEPGFNGVLYVKGHS-KDEECRRVVNLAGETVPRTEifrVHFGSCGM--QAVKDVA--SFVLVIQKHP 17796
Cdd:smart00241     2 CGEDQMVVSVS-TDLLFPGGINVKGLTlGDPSCRPQFTDATSAFVSFE---VPLNGCGTrrQVNPDGIvySNTLVVSPFH 77
                             90       100       110       120       130       140       150       160
                     ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924   17797 KLVTYKAQ--AYNIKCVYQTGEKnVTLGFNVSMLTTAGTIANTGPPPICQMRIITNEGE----EINSAEIGDNLKLQVDV 17870
Cdd:smart00241    78 PGFITRDDraAYHFQCFYPENEK-VSLNLDVSTIPPTELSSVSEGPLTCSYRLYKDDSFgspyQSADYVLGDPVYHEWEC 156
                            170       180       190       200       210       220       230       240
                     ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924   17871 EPATI--YGGFARSCIAKTMEDNVQNEYLVTDENGCATDTSIFGNWEYNPDTNSLL-ASFNAFKFPSSDNIRFQCNIRVC 17947
Cdd:smart00241   157 DGADDppLGLLVDNCYATPGPDPSSGPKYFIIDNGCPVDGYLDSTIPYNSNPLHRArFSVKVFKFADRSLVYFHCQIRLC 236
                            250
                     ....*....|....
gi 442625924   17948 ----FGRCQPVNCG 17957
Cdd:smart00241   237 dkddGSSCDGPACS 250
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
13798-14227 9.68e-17

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 90.21  E-value: 9.68e-17
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  13798 PVPIIQESPLTPCDPSPCGPNAQCHPSLNEAVCSCLPEfyGTPPNCRPECTLNSECA-----YDKACVH-------HKCV 13865
Cdd:pfam03154   172 PVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQ--GSPATSQPPNQTQSTAAphtliQQTPTLHpqrlpspHPPL 249
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  13866 DPCPGICGINADCRVHYHSPicyciSSHTGDPftrcyETPKPVR--PQIYDTPSPPYPVAIPdlvyvqqQQPGIVNIPSA 13943
Cdd:pfam03154   250 QPMTQPPPPSQVSPQPLPQP-----SLHGQMP-----PMPHSLQtgPSHMQHPVPPQPFPLT-------PQSSQSQVPPG 312
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  13944 PQPIYPTPQSPQYNVNYPSPQPANPQKPGVVNIPSVPQPVyPSPQPPvydvnyPTTPVSQHPGvvniPSAPRLvPPTSQR 14023
Cdd:pfam03154   313 PSPAAPGQSQQRIHTPPSQSQLQSQQPPREQPLPPAPLSM-PHIKPP------PTTPIPQLPN----PQSHKH-PPHLSG 380
                           250       260       270       280       290       300       310       320
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14024 PVFITSPGNLSPTPQPGVINIPSVSQPGYPTPQSPIYDANYPTTQSPIPQQPGVVNIPSVPSPSYPAPNPPVNYPTQPSP 14103
Cdd:pfam03154   381 PSPFQMNSNLPPPPALKPLSSLSTHHPPSAHPPPLQLMPQSQQLPPPPAQPPVLTQSQSLPPPAASHPPTSGLHQVPSQS 460
                           330       340       350       360       370       380       390       400
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14104 QIPVQPgviNIPSAPLPTTPPQHPPvfipspespspapkpgviniPSVThPEYPTSQVPVYDVNYSTTPSPipqkpgvvN 14183
Cdd:pfam03154   461 PFPQHP---FVPGGPPPITPPSGPP--------------------TSTS-SAMPGIQPPSSASVSSSGPVP--------A 508
                           410       420       430       440
                    ....*....|....*....|....*....|....*....|....*....
gi 442625924  14184 IPSAPQPVHPAPNPPVHEFNYPTPPAVPQ-----QPGVLNIPSYPTPVA 14227
Cdd:pfam03154   509 AVSCPLPPVQIKEEALDEAEEPESPPPPPrspspEPTVVNTPSHASQSA 557
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
14033-14401 3.63e-16

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 87.52  E-value: 3.63e-16
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14033 LSPTPQPGVINIPSVSQPGYPTPQSPIYDAnyPTTQsPIPQqpgvvniPSVPSPSYPAPNPPvnyPTQPSPQIPVQPGVI 14112
Cdd:NF033839   147 SSSSSSSGSSTKPETPQPENPEHQKPTTPA--PDTK-PSPQ-------PEGKKPSVPDINQE---KEKAKLAVATYMSKI 213
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14113 --NIPSAPLPTTPPQHPPVFIPSPESPSPAPKPGVINIPSVTHPEYPTSQV----PVYDVNYSTTPSPIPQKPGVVNIPS 14186
Cdd:NF033839   214 ldDIQKHHLQKEKHRQIVALIKELDELKKQALSEIDNVNTKVEIENTVHKIfadmDAVVTKFKKGLTQDTPKEPGNKKPS 293
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14187 APQP-VHPAPNPPVHEfnyPTPPAVPQQPGVLNIPSYPTP-VAPTPQS--PIYIPSQEQPKPTTRPSvinvPSVPQPAY- 14261
Cdd:NF033839   294 APKPgMQPSPQPEKKE---VKPEPETPKPEVKPQLEKPKPeVKPQPEKpkPEVKPQLETPKPEVKPQ----PEKPKPEVk 366
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14262 PTPQAPvydvnyptSPSVIPhQPGVvnipsvplPAPPVKQRPVfVPSP-VHPTP-APQPGVVNIPSVAQP-VHPTYQPPv 14338
Cdd:NF033839   367 PQPEKP--------KPEVKP-QPET--------PKPEVKPQPE-KPKPeVKPQPeKPKPEVKPQPEKPKPeVKPQPEKP- 427
                          330       340       350       360       370       380
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 442625924 14339 veRPaiyDVYYPPPPSRPGVINIPSPPRP-VYPVPQQPiyVPAPVLHIPAPRPVIHNIPSVPQP 14401
Cdd:NF033839   428 --KP---EVKPQPEKPKPEVKPQPEKPKPeVKPQPETP--KPEVKPQPEKPKPEVKPQPEKPKP 484
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
13941-14278 8.64e-16

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 86.36  E-value: 8.64e-16
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13941 PSAPQPIYPTPQSPQYNV--NYPSPQP--ANPQKPGVVNIPSVPQP-VYPSPQPPVYDVNYPTTPVSQHPGVVNIPSA-- 14013
Cdd:NF033839   159 PETPQPENPEHQKPTTPApdTKPSPQPegKKPSVPDINQEKEKAKLaVATYMSKILDDIQKHHLQKEKHRQIVALIKEld 238
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14014 --------------PRLVPPTSQRPVFIT--------SPGNLSPTPQPGVINIPSVSQPGY-PTPQSPIydanypTTQSP 14070
Cdd:NF033839   239 elkkqalseidnvnTKVEIENTVHKIFADmdavvtkfKKGLTQDTPKEPGNKKPSAPKPGMqPSPQPEK------KEVKP 312
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14071 IPQQPGVVNIPSVPSPSyPAPNPPvnyPTQPSPQIPVQPGVINIPSAPLPTTP-PQHPPvfipspesPSPAPKPGVINIP 14149
Cdd:NF033839   313 EPETPKPEVKPQLEKPK-PEVKPQ---PEKPKPEVKPQLETPKPEVKPQPEKPkPEVKP--------QPEKPKPEVKPQP 380
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14150 SVTHPEY-PTSQVPVYDVNysttPSPIPQKPGVVNIPSAPQP-VHPAPNPPVHEFNyPTPPAvpQQPGVLNIPSYPTP-V 14226
Cdd:NF033839   381 ETPKPEVkPQPEKPKPEVK----PQPEKPKPEVKPQPEKPKPeVKPQPEKPKPEVK-PQPEK--PKPEVKPQPEKPKPeV 453
                          330       340       350       360       370
                   ....*....|....*....|....*....|....*....|....*....|....
gi 442625924 14227 APTPQSPI--YIPSQEQPKPTTRPSvinvPSVPQPAYPTPQApvyDVNYPTSPS 14278
Cdd:NF033839   454 KPQPETPKpeVKPQPEKPKPEVKPQ----PEKPKPDNSKPQA---DDKKPSTPN 500
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
14354-14692 5.51e-15

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 83.66  E-value: 5.51e-15
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14354 SRPGVINIPSPPRPVYPVPQQPIyVPAPVLHiPAPRPVIHNiPSVPQPTYPHRNPPIQDVTYPAPQPSPPVPGIVNIPSL 14433
Cdd:NF033839   151 SSSGSSTKPETPQPENPEHQKPT-TPAPDTK-PSPQPEGKK-PSVPDINQEKEKAKLAVATYMSKILDDIQKHHLQKEKH 227
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14434 PQPVSTPTSgVINIPSQASPPISVPTPGIVnipsiPQPTPQRPSPGIINVPSVPQP--IPTAPSPGIINIPSVPQPL--P 14509
Cdd:NF033839   228 RQIVALIKE-LDELKKQALSEIDNVNTKVE-----IENTVHKIFADMDAVVTKFKKglTQDTPKEPGNKKPSAPKPGmqP 301
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14510 SPTPGVINIPQQPTPP-----PLVQQPGiiniPSVQ-QPSTPTtqhPIQDVQYETQRPQ-------PTPGVINIPSVSQP 14576
Cdd:NF033839   302 SPQPEKKEVKPEPETPkpevkPQLEKPK----PEVKpQPEKPK---PEVKPQLETPKPEvkpqpekPKPEVKPQPEKPKP 374
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14577 TYPTQ----KPSYQ---DTSYPTVQPKPPVSGIINIPSVPQPVPSLTPGVINLPSEPSYSAPIPKPGIINVPSIPEP-IP 14648
Cdd:NF033839   375 EVKPQpetpKPEVKpqpEKPKPEVKPQPEKPKPEVKPQPEKPKPEVKPQPEKPKPEVKPQPEKPKPEVKPQPEKPKPeVK 454
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|....
gi 442625924 14649 SIPQNPVQEVYHDTQKPQaiPGVVNVPSAPQPTPGRPYYDVAKP 14692
Cdd:NF033839   455 PQPETPKPEVKPQPEKPK--PEVKPQPEKPKPDNSKPQADDKKP 496
Streccoc_I_II NF033804
antigen I/II family LPXTG-anchored adhesin; Members of the antigen I/II family are adhesins ...
14157-14365 1.60e-13

antigen I/II family LPXTG-anchored adhesin; Members of the antigen I/II family are adhesins with a glucan-binding domain, two types of repetitive regions, an isopeptide bond-forming domain associated with shear resistance, and a C-terminal LPXTG motif for anchoring to the cell wall. They occur in oral Streptococci, and tend to be major cell surface adhesins. Members of this family include SspA and SspB from Streptococcus gordonii, antigen I/II from S. mutans, etc.


Pssm-ID: 468188 [Multi-domain]  Cd Length: 1552  Bit Score: 79.98  E-value: 1.60e-13
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14157 PTSQVPVYDVNYSTTPspipQKPGV----------VNIPSAPQ-----PVHP-APNPPVHEFNYPTPPAvpqqPGVLNIP 14220
Cdd:NF033804   791 PSDEMPAVPGRDNTEG----KKPNIwyslngkiraVNVPKITKekptpPVAPtAPQAPTYEVEKPLEPA----PVAPTYE 862
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14221 SYPTPVAPTPQspiyipsQEQPKPTTRPSVinvpSVPQPAYPTPQAPVYDvNYPTSPSVIPHQPgvvnIPSVPLPAPPVK 14300
Cdd:NF033804   863 NEPTPPVKTPD-------QPEPSKPEEPTY----ETEKPLEPAPVAPTYE-NEPTPPVKTPDQP----EPSKPEEPTYET 926
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 442625924 14301 QRPVfVPSPVHPT----PAPQPGVVNIPSVAQPVHPTYQPpvverpaiydvyYPPPPSRPGVINIPSPP 14365
Cdd:NF033804   927 EKPL-EPAPVAPSyenePTPPVKTPDQPEPSKPVEPTYDP------------LPTPPVAPTPKQLPTPP 982
PBP1 COG5180
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ...
14020-14528 1.18e-10

PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];


Pssm-ID: 444064 [Multi-domain]  Cd Length: 548  Bit Score: 69.71  E-value: 1.18e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14020 TSQRPVFITSPGNLS-PTPQPGVINIPSVSQPGYPTPQSPIYDANYPTTQSPipqqPGVVNIPSvPSPSYPA-------- 14090
Cdd:COG5180      2 RKATILEIRLLATVPiPPNAARPVLSPELWAAANNDAVSQGDRSALASSPTR----PYARKIFE-PLDIKLAlgkpqlps 76
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14091 -PNPPVNYPTQP---SPQIPVQP--GVINIPSAPLPTTPPQHPPVFIPSPESPSpapkpgVINIPSVTHPEYPTSQVPVY 14164
Cdd:COG5180     77 vAEPEAYLDPAPpksSPDTPEEQlgAPAGDLLVLPAAKTPELAAGALPAPAAAA------ALPKAKVTREATSASAGVAL 150
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14165 DVNYSTTPSPIPQKPGVVNIPSAPQPVHPAPN-----PPVHEFNYPTP---PAVPQQPGVLNIPSYPTPVAPTPQsPIYI 14236
Cdd:COG5180    151 AAALLQRSDPILAKDPDGDSASTLPPPAEKLDkvltePRDALKDSPEKldrPKVEVKDEAQEEPPDLTGGADHPR-PEAA 229
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14237 PSQEQPKPTTRPSVINVPSVPQPAYPTPQAPVYDvnyptspsviPHQPGVVNIPSVPLPAPPV---KQRPVFV-PSPVHP 14312
Cdd:COG5180    230 SSPKVDPPSTSEARSRPATVDAQPEMRPPADAKE----------RRRAAIGDTPAAEPPGLPVleaGSEPQSDaPEAETA 299
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14313 TPAPQPGVVNIPSVAQPVHPT---------YQPPVVERPAIYDVYYPPPPSRPGVINIPSPPRPVYPVPQQpiyVPAPVl 14383
Cdd:COG5180    300 RPIDVKGVASAPPATRPVRPPggardpgtpRPGQPTERPAGVPEAASDAGQPPSAYPPAEEAVPGKPLEQG---APRPG- 375
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14384 HIPAPRPVIHNIPSVPQPTYPHRNPPIQDVTYPAPQPSPPVPGIVNIPSLPQPVSTPTS------GVINIPSQASPPISV 14457
Cdd:COG5180    376 SSGGDGAPFQPPNGAPQPGLGRRGAPGPPMGAGDLVQAALDGGGRETASLGGAAGGAGQgpkadfVPGDAESVSGPAGLA 455
                          490       500       510       520       530       540       550
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 442625924 14458 PTPGIVNIPSIPQPTPQRPSPGIINVPSVPQPIPTAPSPGiinIPSVPQPLPSPTPgVINIPQQPTPPPLV 14528
Cdd:COG5180    456 DQAGAAASTAMADFVAPVTDATPVDVADVLGVRPDAILGG---NVAPASGLDAETR-IIEAEGAPATEDFV 522
SP2_N cd22540
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins ...
14066-14586 4.04e-09

N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate, or in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.
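
The description above writes the consensus motif in nucleotide ambiguity codes (N is any nucleotide, R is A/G, Y is C/T). As an illustrative sketch only, such a motif can be expanded into a regular expression and tested against a candidate 9-mer. The helper names `iupac_to_regex` and `matches_motif` and the example sequences are hypothetical, the check is a literal same-strand string match, and it says nothing about actual SP or KLF binding.

```python
import re

# Subset of the nucleotide ambiguity codes used in the description above.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any nucleotide
}


def iupac_to_regex(motif):
    """Expand an ambiguity-coded motif into a regular expression string."""
    return "".join(IUPAC[base] for base in motif.upper())


CONSENSUS = re.compile(iupac_to_regex("NNRCRCCYY"))


def matches_motif(seq):
    """True if seq matches the consensus exactly, on the given strand only."""
    return CONSENSUS.fullmatch(seq.upper()) is not None


# Hypothetical 9-mers chosen only to exercise the pattern.
print(matches_motif("AAGCACCTT"))
print(matches_motif("AAGCACCAA"))
```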


Pssm-ID: 411776 [Multi-domain]  Cd Length: 511  Bit Score: 64.56  E-value: 4.04e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14066 TTQSPIPQQPGVVNIP-SVPSPsyPAPNPPVnypTQPSPQIPVQPGVINIPSAPLPTTPPQHPPVFIPSPESpspapkpg 14144
Cdd:cd22540     18 TTQDSQPSPLALLAATcSKIGP--PAVEAAV---TPPAPPQPTPRKLVPIKPAPLPLGPGKNSIGFLSAKGN-------- 84
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14145 VINI-PSVTHPEYPTSQVPVYDVN-------YSTTPSPIPQKPGVVNIPSAPQP-------VHPAPNPpvhefNYPTPPA 14209
Cdd:cd22540     85 IIQLqGSQLSSSAPGGQQVFAIQNptmiikgSQTRSSTNQQYQISPQIQAAGQInnsgqiqIIPGTNQ-----AIITPVQ 159
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14210 VPQQPgvlNIPSYPTPVAPTPQSPIYIPSQEQPKPTTrpsVINVPSVPQPAYPTPQAPVYDVNYPTSPSVIPHQPGVVN- 14288
Cdd:cd22540    160 VLQQP---QQAHKPVPIKPAPLQTSNTNSASLQVPGN---VIKLQSGGNVALTLPVNNLVGTQDGATQLQLAAAPSKPSk 233
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14289 -----IPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGvVNIPSVAQPvhPTYQPPVVERpaiydVYYPPPPSRPGVINIps 14363
Cdd:cd22540    234 kirkkSAQAAQPAVTVAEQVETVLIETTADNIIQAG-NNLLIVQSP--GTGQPAVLQQ-----VQVLQPKQEQQVVQI-- 303
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14364 pprpvypvPQQPIYVpapvlhipaPRPVIHNIPSVPQPtyPHRNPPIQdvtypapqpsppvpgivNIPSLPQPV--STPT 14441
Cdd:cd22540    304 --------PQQALRV---------VQAASATLPTVPQK--PLQNIQIQ-----------------NSEPTPTQVyiKTPS 347
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14442 SGVINIPSQASPPISVPTPgivniPSIPQPTPQRPSPGIINVPSVPQPIPTAPspgiinipsvPQPLPSPTPGVI--NIP 14519
Cdd:cd22540    348 GEVQTVLLQEAPAATATPS-----SSTSTVQQQVTANNGTGTSKPNYNVRKER----------TLPKIAPAGGIIslNAA 412
                          490       500       510       520       530       540
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 442625924 14520 QQPTPPPLVQQpgiINIPSVQQPSTPTTQhpiqdvqyeTQRP-QPTPGVINIPSVSQPTYPTQKPSYQ 14586
Cdd:cd22540    413 QLAAAAQAIQT---ININGVQVQGVPVTI---------TNAGgQQQLTVQTVSSNNLTISGLSPTQIQ 468
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
13903-14122 5.00e-09

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 64.40  E-value: 5.00e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13903 ETPKP-VRPQiydtPSPPYPVAIPDLvyvQQQQPGIVNIPSAPQP-IYPTPQSPQYNVnypSPQPANPqKPGVVnipsvP 13980
Cdd:NF033839   326 EKPKPeVKPQ----PEKPKPEVKPQL---ETPKPEVKPQPEKPKPeVKPQPEKPKPEV---KPQPETP-KPEVK-----P 389
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13981 QPVYP----SPQPPVYDVNYPTTPVSQHPGVVNIPSAPRL-VPPTSQRPvfitspgNLSPTPQPGVINiPSV-SQPGYPT 14054
Cdd:NF033839   390 QPEKPkpevKPQPEKPKPEVKPQPEKPKPEVKPQPEKPKPeVKPQPEKP-------KPEVKPQPEKPK-PEVkPQPETPK 461
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 442625924 14055 PQ-SPIYDANYPTTQsPIPQQPGVVNipSVPSPSYPAPNPPVNYP--TQPSPQIPVQPGVINIPSAPLPTT 14122
Cdd:NF033839   462 PEvKPQPEKPKPEVK-PQPEKPKPDN--SKPQADDKKPSTPNNLSkdKQPSNQASTNEKATNKPKKSLPST 529
PRK10263 PRK10263
DNA translocase FtsK; Provisional
13907-14025 8.08e-08

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 61.25  E-value: 8.08e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13907 PVRPQIYDTPSPPY------PVAIPDLVYVQQQQPGIVNIPSAP-----QPIYPTPQSPQYNVNY----PSPQPANPQKP 13971
Cdd:PRK10263   731 PMKALLDDGPHEPLftpivePVQQPQQPVAPQQQYQQPQQPVAPqpqyqQPQQPVAPQPQYQQPQqpvaPQPQYQQPQQP 810
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13972 -------GVVNIPSVPQPVYPSPQPPVYD-------------------VNYPTTPvsqhpgvvnIPSAPRLVPPTSQ-RP 14024
Cdd:PRK10263   811 vapqpqyQQPQQPVAPQPQYQQPQQPVAPqpqdtllhpllmrngdsrpLHKPTTP---------LPSLDLLTPPPSEvEP 881

                   .
gi 442625924 14025 V 14025
Cdd:PRK10263   882 V 882
Amelogenin smart00818
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ...
14483-14626 2.38e-06

Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9 bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.


Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 52.48  E-value: 2.38e-06
                             10        20        30        40        50        60        70        80
                     ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924   14483 VPSVPQPIPTAPSPGIINIPSVPQPLPSptpgvinIPQQPtpppLVQQPGiinipsvQQPSTPTTQHPIQDVQYETQRPQ 14562
Cdd:smart00818    40 IPVSQQHPPTHTLQPHHHIPVLPAQQPV-------VPQQP----LMPVPG-------QHSMTPTQHHQPNLPQPAQQPFQ 101
                             90       100       110       120       130       140
                     ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 442625924   14563 PTPgviniPSVSQPTYPTQKPsyqdtsyPTVQPKPPVSGIINIPSVP--QPVPSLTPgviNLPSEP 14626
Cdd:smart00818   102 PQP-----LQPPQPQQPMQPQ-------PPVHPIPPLPPQPPLPPMFpmQPLPPLLP---DLPLEA 152
EGF_CA smart00179
Calcium-binding EGF-like domain;
255-286 8.29e-06

Calcium-binding EGF-like domain;


Pssm-ID: 214542 [Multi-domain]  Cd Length: 39  Bit Score: 46.86  E-value: 8.29e-06
                             10        20        30
                     ....*....|....*....|....*....|..
gi 442625924     255 DVDECSYPNVCGPGAICTNLEGSYRCDCPPGY 286
Cdd:smart00179     1 DIDECASGNPCQNGGTCVNTVGSYRCECPPGY 32
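
The query and Cdd rows of an alignment block can be compared column by column. The sketch below counts identical residues for the EGF_CA (smart00179) hit above, assuming the two rows have already been extracted as equal-length strings and that gaps are written as "-". The function name `aligned_identity` is a hypothetical helper, and this simple count is not how the reported bit score or E-value is derived.

```python
def aligned_identity(query, subject, gap="-"):
    """Count identical residues over columns where neither row has a gap.

    Returns (identical, compared_columns).
    """
    if len(query) != len(subject):
        raise ValueError("alignment rows must have equal length")
    identical = 0
    compared = 0
    for q, s in zip(query, subject):
        if q == gap or s == gap:
            continue  # skip gapped columns
        compared += 1
        if q == s:
            identical += 1
    return identical, compared


# Rows copied from the smart00179 alignment covering query residues 255-286.
query_row = "DVDECSYPNVCGPGAICTNLEGSYRCDCPPGY"
cdd_row = "DIDECASGNPCQNGGTCVNTVGSYRCECPPGY"

identical, compared = aligned_identity(query_row, cdd_row)
print(f"{identical} of {compared} aligned residues are identical")
```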
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
255-289 1.44e-05

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.


Pssm-ID: 238011  Cd Length: 38  Bit Score: 46.48  E-value: 1.44e-05
                           10        20        30
                   ....*....|....*....|....*....|....*
gi 442625924   255 DVDECSYPNVCGPGAICTNLEGSYRCDCPPGYDGD 289
Cdd:cd00054      1 DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYTGR 35
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
338-373 3.22e-05

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.


Pssm-ID: 238011  Cd Length: 38  Bit Score: 45.32  E-value: 3.22e-05
                           10        20        30
                   ....*....|....*....|....*....|....*.
gi 442625924   338 DVDECATNNPCGLGAECVNLGGSFQCRCPSGFVLEH 373
Cdd:cd00054      1 DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYTGRN 36
EGF_CA smart00179
Calcium-binding EGF-like domain;
338-369 7.53e-05

Calcium-binding EGF-like domain;


Pssm-ID: 214542 [Multi-domain]  Cd Length: 39  Bit Score: 44.16  E-value: 7.53e-05
                             10        20        30
                     ....*....|....*....|....*....|..
gi 442625924     338 DVDECATNNPCGLGAECVNLGGSFQCRCPSGF 369
Cdd:smart00179     1 DIDECASGNPCQNGGTCVNTVGSYRCECPPGY 32
PBP1 COG5180
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ...
13903-14128 9.23e-05

PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];


Pssm-ID: 444064 [Multi-domain]  Cd Length: 548  Bit Score: 50.45  E-value: 9.23e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13903 ETPKPVRPQIYDTPSPPYPVAIPDLVYvQQQQPGIVNIPSAPQPIYPTPQSPQYNVNYP---------SPQPANP--QKP 13971
Cdd:COG5180    274 AAEPPGLPVLEAGSEPQSDAPEAETAR-PIDVKGVASAPPATRPVRPPGGARDPGTPRPgqpterpagVPEAASDagQPP 352
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13972 GVVNIPSVPQPVYPSPQ--PPVYDVNYPTTPV----------SQHPGVVN-IPSAPRLVPPTSQRPVFIT-------SPG 14031
Cdd:COG5180    353 SAYPPAEEAVPGKPLEQgaPRPGSSGGDGAPFqppngapqpgLGRRGAPGpPMGAGDLVQAALDGGGRETaslggaaGGA 432
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14032 NLSPTPQPGVINIPSVSQPGYPTPQSPIydanyptTQSPIPQQPGVV--NIPSVPSPSYPAPNPPVNYPTQPSPQIPVQP 14109
Cdd:COG5180    433 GQGPKADFVPGDAESVSGPAGLADQAGA-------AASTAMADFVAPvtDATPVDVADVLGVRPDAILGGNVAPASGLDA 505
                          250
                   ....*....|....*....
gi 442625924 14110 GVINIPSAPLPTTPPQHPP 14128
Cdd:COG5180    506 ETRIIEAEGAPATEDFVAA 524
EGF_3 pfam12947
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes ...
137-166 1.20e-04

EGF domain; This family includes a variety of EGF-like domain homologs. This family includes the C-terminal domain of the malaria parasite MSP1 protein.


Pssm-ID: 463759 [Multi-domain]  Cd Length: 36  Bit Score: 43.74  E-value: 1.20e-04
                            10        20        30
                    ....*....|....*....|....*....|
gi 442625924    137 PCDVFAHCTNTLGSFTCTCFPGYRGNGFHC 166
Cdd:pfam12947     7 GCHPNATCTNTGGSFTCTCNDGYTGDGVTC 36
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
212-247 1.63e-04

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.


Pssm-ID: 238011  Cd Length: 38  Bit Score: 43.39  E-value: 1.63e-04
                           10        20        30
                   ....*....|....*....|....*....|....*.
gi 442625924   212 DVDECRNPENCGPNALCTNTPGNYTCSCPDGYVGNN 247
Cdd:cd00054      1 DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYTGRN 36
Zona_pellucida pfam00100
Zona pellucida-like domain;
17722-17947 1.85e-04

Zona pellucida-like domain;


Pssm-ID: 459673 [Multi-domain]  Cd Length: 254  Bit Score: 48.37  E-value: 1.85e-04
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  17722 CLADGVQVEIHITEPGFNGVLY--VKGHSKDEECRRVVNLAGETVprtEIFRVHFGSCG--MQAVKDVA--SFVLVIQKH 17795
Cdd:pfam00100     1 CTPDTMTVSISKCLLVPSGLLSslSLLGGLDPSCKPVSNTNGSPA---VLFEFPLTGCGttVQVNGTHIiySNTLYSSTD 77
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  17796 PKLVTYK---AQAYNIKCVYQTGEkNVTLGFNVSMLTTAGTIANTGPPPIcQMRIITNE------GEEINSAEIGDNLKL 17866
Cdd:pfam00100    78 LRSGIIRrtiTRRLPFSCSYPRSS-LVSLLVVAPPSPVPITVSGSGVFLV-SMDLYYDSsytspySPYPVTVLLGDPLYV 155
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  17867 QVDVEPAT--IYGGFARSCIAkTMEDNVQNEYLVTD-ENGCATDTSIFGNWEYNPDTNSLLA--SFNAFKF--PSSDNIR 17939
Cdd:pfam00100   156 EVSLLSRTdpNLVLVLDNCWA-TPSPNPTSSPQYQLiVNGCPNDGDSTYPVSSLSNGPSHYVrfSFKAFRFvgSSISQVY 234

                    ....*...
gi 442625924  17940 FQCNIRVC 17947
Cdd:pfam00100   235 LHCSVSVC 242
EGF_CA smart00179
Calcium-binding EGF-like domain;
212-243 7.50e-04

Calcium-binding EGF-like domain;


Pssm-ID: 214542 [Multi-domain]  Cd Length: 39  Bit Score: 41.46  E-value: 7.50e-04
                             10        20        30
                     ....*....|....*....|....*....|..
gi 442625924     212 DVDECRNPENCGPNALCTNTPGNYTCSCPDGY 243
Cdd:smart00179     1 DIDECASGNPCQNGGTCVNTVGSYRCECPPGY 32
EGF_3 pfam12947
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes ...
218-246 8.66e-04

EGF domain; This family includes a variety of EGF-like domain homologs. This family includes the C-terminal domain of the malaria parasite MSP1 protein.


Pssm-ID: 463759 [Multi-domain]  Cd Length: 36  Bit Score: 41.43  E-value: 8.66e-04
                            10        20
                    ....*....|....*....|....*....
gi 442625924    218 NPENCGPNALCTNTPGNYTCSCPDGYVGN 246
Cdd:pfam12947     4 NNGGCHPNATCTNTGGSFTCTCNDGYTGD 32
EGF_CA smart00179
Calcium-binding EGF-like domain;
1022-1056 1.18e-03

Calcium-binding EGF-like domain;


Pssm-ID: 214542 [Multi-domain]  Cd Length: 39  Bit Score: 41.08  E-value: 1.18e-03
                             10        20        30
                     ....*....|....*....|....*....|....*
gi 442625924    1022 DVDECEERGaqLCAFGAQCVNKPGSYSCHCPEGYQ 1056
Cdd:smart00179     1 DIDECASGN--PCQNGGTCVNTVGSYRCECPPGYT 33
f2_encap_cargo1 NF041166
family 2A encapsulin nanocompartment cargo protein cysteine desulfurase; Capsid-like ...
14451-14667 1.26e-03

family 2A encapsulin nanocompartment cargo protein cysteine desulfurase; Capsid-like encapsulin nanocompartments are commonly found in bacteria and archaea. Encapsulin nanocompartments, which are assembled from shell proteins, encapsulate various cargo proteins, typically peroxidases or ferritin-like proteins, to protect cells from oxidative stress caused by peroxide. Proteins of this family are cysteine desulfurases with an additional N-terminal encapsulation targeting sequence (~200 aa) that is necessary and sufficient for compartmentalization.


Pssm-ID: 469077 [Multi-domain]  Cd Length: 623  Bit Score: 47.16  E-value: 1.26e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14451 ASPPISVPTPGI---VNIPSIPQPTPQRPSPGIINV-PSVPQ-PIPTAPSPGIINIPSVPQPLPSPTPGVinipqqPTPP 14525
Cdd:NF041166    33 SALPGEAPAPGLpaaPPAAPAPPGSNPAPAAGPGGLgAGVPGaALPQGLVPGANLLPSAPSPVGALGASA------PALA 106
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14526 PLVQQPgIINIPSVQQPSTPTTQHPIQDVQY-------ETQRPQPTPGVINIPSVSQPTYPTQKPSYQDTSyPTVQPKPP 14598
Cdd:NF041166   107 PHAAAG-NVGLPDAVVAVAPAEPRAGGAALPvglpqapVPAAPSAAAAPPDLVAPQAFGLPGEDAALRALL-PAASPAPP 184
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14599 VSgiiniPSVPQPVPS---LTPGVINLPSEPSYSAPIPKPG---IINVPSIPE--PIpsipqnpVQE-------VYHD-- 14661
Cdd:NF041166   185 SA-----PSAAAAESSyyfLDERAAPSPAAAPPGSPPALASahpPFDVNAVRRdfPI-------LQErvngkplVWFDna 252

                   ....*...
gi 442625924 14662 --TQKPQA 14667
Cdd:NF041166   253 atTQKPQA 260
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
1022-1058 4.33e-03

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.


Pssm-ID: 238011  Cd Length: 38  Bit Score: 39.16  E-value: 4.33e-03
                           10        20        30
                   ....*....|....*....|....*....|....*..
gi 442625924  1022 DVDECEERGaqLCAFGAQCVNKPGSYSCHCPEGYQGD 1058
Cdd:cd00054      1 DIDECASGN--PCQNGGTCVNTVGSYRCSCPPGYTGR 35
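
The interval reported for a hit can be cross-checked against its alignment rows: the number of non-gap residues in the query row should equal end - start + 1. The following is a minimal sketch, assuming the query row has the layout "gi <id> <start> <sequence> <end>" as shown above; the helper names parse_query_row and ungapped_length are hypothetical and are not part of any NCBI tool.

```python
import re

# Assumed row layout: "gi <gi-number> <start> <aligned sequence> <end>"
QUERY_ROW = re.compile(r"^gi\s+(\d+)\s+(\d+)\s+(\S+)\s+(\d+)$")

def parse_query_row(line):
    """Return (start, aligned_sequence, end) for one query alignment row."""
    match = QUERY_ROW.match(line.strip())
    if match is None:
        raise ValueError("not a query alignment row: %r" % line)
    _gi, start, seq, end = match.groups()
    return int(start), seq, int(end)

def ungapped_length(seq):
    """Count residues, ignoring '-' gap characters."""
    return sum(1 for ch in seq if ch != "-")

# Query row copied from the EGF_CA hit at interval 1022-1058.
row = "gi 442625924  1022 DVDECEERGaqLCAFGAQCVNKPGSYSCHCPEGYQGD 1058"
start, seq, end = parse_query_row(row)
consistent = ungapped_length(seq) == end - start + 1
```

For multi-block alignments the same check would have to be applied per block, since each 80-column block carries its own start and end coordinates.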
EGF_3 pfam12947
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes ...
461-490 4.93e-03

EGF domain; This family includes a variety of EGF-like domain homologs. This family includes the C-terminal domain of the malaria parasite MSP1 protein.


Pssm-ID: 463759 [Multi-domain]  Cd Length: 36  Bit Score: 39.12  E-value: 4.93e-03
                            10        20        30
                    ....*....|....*....|....*....|..
gi 442625924    461 CQDNP--CGENAICTDTVGSFVCTCKPDYTGD 490
Cdd:pfam12947     1 CSDNNggCHPNATCTNTGGSFTCTCNDGYTGD 32
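
Each hit carries a one-line summary (Pssm-ID, optional [Multi-domain] flag, Cd Length, Bit Score, E-value). The sketch below shows one way such a line might be turned into typed fields for sorting or filtering; the regular expression is an assumption based only on the lines visible on this page, and parse_hit_summary is a hypothetical helper name.

```python
import re

# Pattern inferred from the summary lines on this page (an assumption,
# not a documented format).
SUMMARY = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>\S+)"
)

def parse_hit_summary(line):
    """Parse one 'Pssm-ID: ...' summary line into a dict of typed values."""
    match = SUMMARY.search(line)
    if match is None:
        raise ValueError("not a hit summary line: %r" % line)
    return {
        "pssm_id": int(match.group("pssm_id")),
        "multi_domain": match.group("flag") == "Multi-domain",
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }

# Summary line copied from the EGF_3 hit above.
hit = parse_hit_summary(
    "Pssm-ID: 463759 [Multi-domain]  Cd Length: 36  Bit Score: 39.12  E-value: 4.93e-03"
)
```

With the hits parsed this way, a list of them could be ordered by the "e_value" key to reproduce the ranking used in the table.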
Amelogenin smart00818
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ...
13907-13991 4.96e-03

Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.


Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 42.85  E-value: 4.96e-03
                             10        20        30        40        50        60        70        80
                     ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924   13907 PVRPQIYDTPSPPYPVAIPdlvyVQQQQPGIVNIPSAP-QPIYPTPQSPQYNVNYPSPQ-PANPQKPgvvniPSVPQPVY 13984
Cdd:smart00818    66 PVVPQQPLMPVPGQHSMTP----TQHHQPNLPQPAQQPfQPQPLQPPQPQQPMQPQPPVhPIPPLPP-----QPPLPPMF 136

                     ....*...
gi 442625924   13985 P-SPQPPV 13991
Cdd:smart00818   137 PmQPLPPL 144
EGF_3 pfam12947
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes ...
676-702 5.77e-03

EGF domain; This family includes a variety of EGF-like domain homologs. This family includes the C-terminal domain of the malaria parasite MSP1 protein.


Pssm-ID: 463759 [Multi-domain]  Cd Length: 36  Bit Score: 39.12  E-value: 5.77e-03
                            10        20
                    ....*....|....*....|....*..
gi 442625924    676 GSCGQNATCTNSAGGFTCACPPGFSGD 702
Cdd:pfam12947     6 GGCHPNATCTNTGGSFTCTCNDGYTGD 32
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
457-490 6.55e-03

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.


Pssm-ID: 238011  Cd Length: 38  Bit Score: 38.77  E-value: 6.55e-03
                           10        20        30
                   ....*....|....*....|....*....|....*
gi 442625924   457 NINECQD-NPCGENAICTDTVGSFVCTCKPDYTGD 490
Cdd:cd00054      1 DIDECASgNPCQNGGTCVNTVGSYRCSCPPGYTGR 35
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
298-331 8.46e-03

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.


Pssm-ID: 238011  Cd Length: 38  Bit Score: 38.39  E-value: 8.46e-03
                           10        20        30
                   ....*....|....*....|....*....|....*
gi 442625924   298 DQDECA-RTPCGRNADCLNTDGSFRCLCPDGYSGD 331
Cdd:cd00054      1 DIDECAsGNPCQNGGTCVNTVGSYRCSCPPGYTGR 35
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
413-456 9.61e-03

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.


Pssm-ID: 238011  Cd Length: 38  Bit Score: 38.39  E-value: 9.61e-03
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....
gi 442625924   413 DIDECNQPDGvakCGTNAKCINFPGSYRCLCPSGFQGQgylHCE 456
Cdd:cd00054      1 DIDECASGNP---CQNGGTCVNTVGSYRCSCPPGYTGR---NCE 38
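
The EGF_CA description states that six conserved core cysteines form three disulfide bridges. A quick way to see whether an aligned segment spans all six is to count the cysteines in each row. This is a minimal sketch; count_cysteines is a hypothetical helper, and it treats lowercase letters (which the display appears to use for residues outside the aligned blocks) the same as uppercase ones.

```python
def count_cysteines(aligned_row):
    """Count cysteine residues in an alignment row, ignoring gaps and case."""
    return sum(1 for ch in aligned_row if ch.upper() == "C")

# Rows copied from the EGF_CA hit at interval 413-456.
query_row = "DIDECNQPDGvakCGTNAKCINFPGSYRCLCPSGFQGQgylHCE"
model_row = "DIDECASGNP---CQNGGTCVNTVGSYRCSCPPGYTGR---NCE"

query_cys = count_cysteines(query_row)
model_cys = count_cysteines(model_row)
```

Here both rows contain six cysteines. The EGF_CA hits above whose alignments end at model position 35 stop before the final cysteine of the model and therefore show only five.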
 
Name Accession Description Interval E-value
PHA03247 PHA03247
large tegument protein UL36; Provisional
14019-14660 1.68e-35

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 153.17  E-value: 1.68e-35
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14019 PTSQRPVFITSPGNLSPTPQPGVINIPSVSQPGYPTPQSPIYDANYPTTQSPIPQQ----PGVVNIPSVPSPSYPAPNPP 14094
Cdd:PHA03247  2478 PVYRRPAEARFPFAAGAAPDPGGGGPPDPDAPPAPSRLAPAILPDEPVGEPVHPRMltwiRGLEELASDDAGDPPPPLPP 2557
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14095 VNYPTQPSPQIPvqpgviniPSAPLPTtpPQHPPVFIPSPEspspapkpgviniPSVThPEYPTSQVPVYDVNYSTTPSP 14174
Cdd:PHA03247  2558 AAPPAAPDRSVP--------PPRPAPR--PSEPAVTSRARR-------------PDAP-PQSARPRAPVDDRGDPRGPAP 2613
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14175 ipqkpgvvniPSAPQPVHPAPNPPVhefnyPTPPAVPQQPGvlNIPSYPTPVAPTPQSPIYIPSQEQPKPTTRPSVINVP 14254
Cdd:PHA03247  2614 ----------PSPLPPDTHAPDPPP-----PSPSPAANEPD--PHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQA 2676
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14255 SVP-----QPAYPTPQAPVYDVNYPTSPSVIPHQPGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVNIP-SVAQ 14328
Cdd:PHA03247  2677 SSPpqrprRRAARPTVGSLTSLADPPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPgGPAR 2756
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14329 PVHP--TYQPPVVERPAIYDVYYPPPPSRPGVINIpSPPRPVYPVPQQPIYVPAPVlhiPAPRPVIhNIPSVPQPTYPhr 14406
Cdd:PHA03247  2757 PARPptTAGPPAPAPPAAPAAGPPRRLTRPAVASL-SESRESLPSPWDPADPPAAV---LAPAAAL-PPAASPAGPLP-- 2829
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14407 nPPiqdvTYPAPQPSPPvpgivniPSLPQPVSTPTSG-------VINIPSQASPPISVPTPGIVNIPSIPQPTPQRPSPG 14479
Cdd:PHA03247  2830 -PP----TSAQPTAPPP-------PPGPPPPSLPLGGsvapggdVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTES 2897
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14480 IINVPSVPQPIPTAPSPgiinipsvPQPLPSPTPGViniPQQPTPPPlvQQPGIINIPSVQQPSTPTTQHPiQDVQYETQ 14559
Cdd:PHA03247  2898 FALPPDQPERPPQPQAP--------PPPQPQPQPPP---PPQPQPPP--PPPPRPQPPLAPTTDPAGAGEP-SGAVPQPW 2963
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14560 RPQPTPGVINIP----SVSQPTYPTQKPSyqdTSYPTVQPKPPVSG-----IINIPSVPQPV--------------PSLT 14616
Cdd:PHA03247  2964 LGALVPGRVAVPrfrvPQPAPSREAPASS---TPPLTGHSLSRVSSwasslALHEETDPPPVslkqtlwppddtedSDAD 3040
                          650       660       670       680
                   ....*....|....*....|....*....|....*....|....
gi 442625924 14617 PGVINLPSEPSYSAPIPKPGIINVPSIPEPIPSIPQNPVQEVYH 14660
Cdd:PHA03247  3041 SLFDSDSERSDLEALDPLPPEPHDPFAHEPDPATPEAGARESPS 3084
PHA03247 PHA03247
large tegument protein UL36; Provisional
13946-14600 3.78e-34

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 148.55  E-value: 3.78e-34
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13946 PIYPTP---QSPQYNVNYPSPQPANPQKPGVVNIPSVPQPVYPSPQPpvydVNYPTTP------------VSQHPGVVNI 14010
Cdd:PHA03247  2478 PVYRRPaeaRFPFAAGAAPDPGGGGPPDPDAPPAPSRLAPAILPDEP----VGEPVHPrmltwirgleelASDDAGDPPP 2553
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14011 PSAPRLVPPTSQRPVfitspgnlsPTPQPGVINI-PSVS----QPGYP----TPQSPIYDANYPTTQSPipqqpgvvniP 14081
Cdd:PHA03247  2554 PLPPAAPPAAPDRSV---------PPPRPAPRPSePAVTsrarRPDAPpqsaRPRAPVDDRGDPRGPAP----------P 2614
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14082 SVPSPSYPAPNPPVNYPTqPSPQIPVQPGVINIPSAPLPTTPPQHPPVFIPSPESPSPAPKPGVINIPSVTHPEYPTSQV 14161
Cdd:PHA03247  2615 SPLPPDTHAPDPPPPSPS-PAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPTVG 2693
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14162 PVYDVNYSTTPSPIPQKPGVVNIPSAPQPVHPA------PNPPVHefnyPTPPAVPQQP----GVLNIPSYPTPVAPTPQ 14231
Cdd:PHA03247  2694 SLTSLADPPPPPPTPEPAPHALVSATPLPPGPAaarqasPALPAA----PAPPAVPAGPatpgGPARPARPPTTAGPPAP 2769
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14232 SPIYIPSQEQPKPTTRPSVINVpSVPQPAYPTPQAPvydvnyptSPSVIPHQPGVVNIPSVPLPAPPVKQRPVFVPSPVH 14311
Cdd:PHA03247  2770 APPAAPAAGPPRRLTRPAVASL-SESRESLPSPWDP--------ADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPP 2840
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14312 PTPAPQPGVVNIPSVAQPVHPTYQPPVVERPAIydvyYPPPPSRPGVINIPSPprpvyPVPQQPIYVPAPVLHIPAPRPv 14391
Cdd:PHA03247  2841 PPPGPPPPSLPLGGSVAPGGDVRRRPPSRSPAA----KPAAPARPPVRRLARP-----AVSRSTESFALPPDQPERPPQ- 2910
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14392 ihniPSVPQPTYPHRNPPiqdvtypapqpsppvpgivnIPSLPQPvSTPTSGvinIPSQASPPISVPTPGIVNIPSIPQP 14471
Cdd:PHA03247  2911 ----PQAPPPPQPQPQPP--------------------PPPQPQP-PPPPPP---RPQPPLAPTTDPAGAGEPSGAVPQP 2962
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14472 TPQRPSPGIINVPS--VPQPIPTAPSPGiiniPSVPQPLPSPTPGV------INIPQQPTPPPlVQQPGIINIPSVQQPS 14543
Cdd:PHA03247  2963 WLGALVPGRVAVPRfrVPQPAPSREAPA----SSTPPLTGHSLSRVsswassLALHEETDPPP-VSLKQTLWPPDDTEDS 3037
                          650       660       670       680       690
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 442625924 14544 TPTTQHPIQDVQYETQRPQPTPGVINIPSVSQPTYPTQKPSYQDTSYPTVQPkPPVS 14600
Cdd:PHA03247  3038 DADSLFDSDSERSDLEALDPLPPEPHDPFAHEPDPATPEAGARESPSSQFGP-PPLS 3093
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
14048-14503 3.64e-29

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 131.04  E-value: 3.64e-29
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14048 SQPGYPTPQSPIYDANYPTTQSPIPQQPGVVNipsVPSPSYPAPNPPVNYPTQPSPQIPvQPGVINIPSAPLPTTPPqhP 14127
Cdd:pfam03154   144 TSPSIPSPQDNESDSDSSAQQQILQTQPPVLQ---AQSGAASPPSPPPPGTTQAATAGP-TPSAPSVPPQGSPATSQ--P 217
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14128 PVFIPSPESPSPAPKPGviniPSVTHPEYPTSQVPVydvnystTPSPIPQKPGVVNIPSAPQPVHPAPNPPVHEfnyptp 14207
Cdd:pfam03154   218 PNQTQSTAAPHTLIQQT----PTLHPQRLPSPHPPL-------QPMTQPPPPSQVSPQPLPQPSLHGQMPPMPH------ 280
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14208 pavPQQPGVLNIPsYPTPVAPTPQSPIYIPSQEQPKPTtrpsvinvPSVPQPAYPTPQAPvydvnyPTSPSVIPHQPGVV 14287
Cdd:pfam03154   281 ---SLQTGPSHMQ-HPVPPQPFPLTPQSSQSQVPPGPS--------PAAPGQSQQRIHTP------PSQSQLQSQQPPRE 342
                           250       260       270       280       290       300       310       320
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14288 N-IPSVPLPAPPVKQRPVfvpSPVHPTPAPQ----PGVVNIPSVAQpVHPTYQPPVVERPAIYDVYYPPPPSRPgvinip 14362
Cdd:pfam03154   343 QpLPPAPLSMPHIKPPPT---TPIPQLPNPQshkhPPHLSGPSPFQ-MNSNLPPPPALKPLSSLSTHHPPSAHP------ 412
                           330       340       350       360       370       380       390       400
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14363 sPPRPVYPVPQQpiyVPAPvlhiPAPRPVIHNIPSVPQPTYPHRNP------PIQDvTYPAPQPSPPVPGIVNIPSLPQP 14436
Cdd:pfam03154   413 -PPLQLMPQSQQ---LPPP----PAQPPVLTQSQSLPPPAASHPPTsglhqvPSQS-PFPQHPFVPGGPPPITPPSGPPT 483
                           410       420       430       440       450       460
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 442625924  14437 VSTPTSGVINIPSQASPPISVPTPGIVNIPSIPQPTPQRPsPGIINVPSVPQPIPTAPS--PGIINIPS 14503
Cdd:pfam03154   484 STSSAMPGIQPPSSASVSSSGPVPAAVSCPLPPVQIKEEA-LDEAEEPESPPPPPRSPSpePTVVNTPS 551
PHA03247 PHA03247
large tegument protein UL36; Provisional
14153-14685 4.34e-29

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 131.60  E-value: 4.34e-29
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14153 HPEY---PTSQVPvydvnYSTTPSPIPQKPGVVNIPSAPQPVHPAP--------NPPVH----------------EFNYP 14205
Cdd:PHA03247  2477 APVYrrpAEARFP-----FAAGAAPDPGGGGPPDPDAPPAPSRLAPailpdepvGEPVHprmltwirgleelasdDAGDP 2551
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14206 TPPAVPQQPgvlniPSYPTPVAPTPQspiYIPSQEQPKPTTRPSVINVPsvPQPAypTPQAPVYDVNYPTSPSVIPHQPG 14285
Cdd:PHA03247  2552 PPPLPPAAP-----PAAPDRSVPPPR---PAPRPSEPAVTSRARRPDAP--PQSA--RPRAPVDDRGDPRGPAPPSPLPP 2619
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14286 VVNIPSVPLPAP------PVKQRPVFVPSPVHPTPAPQPGVVNIPS-VAQPVHPTYQPPVVERPAiydvyypPPPSRPGV 14358
Cdd:PHA03247  2620 DTHAPDPPPPSPspaanePDPHPPPTVPPPERPRDDPAPGRVSRPRrARRLGRAAQASSPPQRPR-------RRAARPTV 2692
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14359 INIPSPPRPvyPVPQQPiyvPAPvlhipAPRPVIHNIPSVPQPTYPHRN---PPIQDVTYPAPQPSPPVPGIVNIPSLPQ 14435
Cdd:PHA03247  2693 GSLTSLADP--PPPPPT---PEP-----APHALVSATPLPPGPAAARQAspaLPAAPAPPAVPAGPATPGGPARPARPPT 2762
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14436 PVSTPTSGVINIPSQASPPISVPTPGIVNIPSIPQ-PTPQRPSPGIINVPSVPQPIPTAPSPGiiniPSVPQPlPSPTPG 14514
Cdd:PHA03247  2763 TAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESlPSPWDPADPPAAVLAPAAALPPAASPA----GPLPPP-TSAQPT 2837
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14515 VINIPQQPTPPPLVQQPGIInipsvqqPSTPTTQHPiqdvqyETQRPQPTPGVINIPSVSQPTYPTQKPSYQDTSYPTVQ 14594
Cdd:PHA03247  2838 APPPPPGPPPPSLPLGGSVA-------PGGDVRRRP------PSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQ 2904
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14595 PKPPvsgiinipsvPQPVPSLTPgvinLPSEPSYSAPIPKPgiinvpsiPEPIPSIPQNPVQEVYHDTQKPQA------- 14667
Cdd:PHA03247  2905 PERP----------PQPQAPPPP----QPQPQPPPPPQPQP--------PPPPPPRPQPPLAPTTDPAGAGEPsgavpqp 2962
                          570       580
                   ....*....|....*....|....*
gi 442625924 14668 -----IPGVVNVPS--APQPTPGRP 14685
Cdd:PHA03247  2963 wlgalVPGRVAVPRfrVPQPAPSRE 2987
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
14167-14698 1.08e-24

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 116.40  E-value: 1.08e-24
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14167 NYSTTPS-PIPQKPGVVNIPSAPQPVHPApNPPVheFNYPTPPAVPQQPGVLNIPSYPTPvAPTPQSPiYIPSQEQPkPT 14245
Cdd:pfam03154   141 NRSTSPSiPSPQDNESDSDSSAQQQILQT-QPPV--LQAQSGAASPPSPPPPGTTQAATA-GPTPSAP-SVPPQGSP-AT 214
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14246 TRPSVINVPSVPqPAYPTPQAPvyDVNYPTSPSviPHQPgvvnIPSVPLPAPPVKQRPVFVPSPVHPTPAPqPGvvniPS 14325
Cdd:pfam03154   215 SQPPNQTQSTAA-PHTLIQQTP--TLHPQRLPS--PHPP----LQPMTQPPPPSQVSPQPLPQPSLHGQMP-PM----PH 280
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14326 VAQPVHPTYQPPVVERPAiydvyypPPPSRPGVINIPSPPRPVYPVPQQPiyvpapVLHIPAPRPVihnipsvPQPTYPH 14405
Cdd:pfam03154   281 SLQTGPSHMQHPVPPQPF-------PLTPQSSQSQVPPGPSPAAPGQSQQ------RIHTPPSQSQ-------LQSQQPP 340
                           250       260       270       280       290       300       310       320
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14406 RNPPIQdvtypapqpsppvpgivnipslPQPVSTPtsgvinipsQASPPISVPTPGIVNIPSIPQPtPQRPSPGIINVPS 14485
Cdd:pfam03154   341 REQPLP----------------------PAPLSMP---------HIKPPPTTPIPQLPNPQSHKHP-PHLSGPSPFQMNS 388
                           330       340       350       360       370       380       390       400
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14486 vPQPIPTAPSPgIINIPSVPQPLPSPTPGVINIPQQPTPPPLVQQPGIINIPSVqqpSTPTTQHPiqdvqyetqrpqPTP 14565
Cdd:pfam03154   389 -NLPPPPALKP-LSSLSTHHPPSAHPPPLQLMPQSQQLPPPPAQPPVLTQSQSL---PPPAASHP------------PTS 451
                           410       420       430       440       450       460       470       480
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14566 GVINIPSvsQPTYPTQkpSYQDTSYPTVQPkppvsgiiniPSVPQP-VPSLTPGvINLPSEPSYSAPIPKPgiiNVPSIP 14644
Cdd:pfam03154   452 GLHQVPS--QSPFPQH--PFVPGGPPPITP----------PSGPPTsTSSAMPG-IQPPSSASVSSSGPVP---AAVSCP 513
                           490       500       510       520       530       540
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 442625924  14645 EPIPSIPQNPVQEVYH------DTQKPQAIPGVVNVPSAPQPTP------GRPYYDVAKPDFEFNP 14698
Cdd:pfam03154   514 LPPVQIKEEALDEAEEpespppPPRSPSPEPTVVNTPSHASQSArfykhlDRGYNSCARTDLYFMP 579
PHA03247 PHA03247
large tegument protein UL36; Provisional
13897-14375 1.68e-23

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 113.11  E-value: 1.68e-23
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13897 PFTRCyeTPKPVRPQIYDTPSPPYPVAIPDLVYVQQQQPGIVNIPSAPQPIYPTPQSPQYNVNYPSPQPANPQKPGVVNI 13976
Cdd:PHA03247  2569 PPPRP--APRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPSPAANEPDPHPPPTV 2646
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13977 PSVPQPvYPSPQPPVYDVNYPTTPVSQHPGVVNIPSAPR--LVPPTSQRPVFITSPGNLSPTPQPGviniPSVSQPGYPT 14054
Cdd:PHA03247  2647 PPPERP-RDDPAPGRVSRPRRARRLGRAAQASSPPQRPRrrAARPTVGSLTSLADPPPPPPTPEPA----PHALVSATPL 2721
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14055 PQSPIY--DANYPTTQSPIPQQP--------GVVNIPSVPSPSYP-APNPPVNYPTQPSPQIPVQPGV---INIPSAPLP 14120
Cdd:PHA03247  2722 PPGPAAarQASPALPAAPAPPAVpagpatpgGPARPARPPTTAGPpAPAPPAAPAAGPPRRLTRPAVAslsESRESLPSP 2801
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14121 TTPPQHP-PVFIPSPESPSPAPKPGVINIPSVTHPEYPT--------------SQVPVYDVNYSTTPSPIPQKPGVVNIP 14185
Cdd:PHA03247  2802 WDPADPPaAVLAPAAALPPAASPAGPLPPPTSAQPTAPPpppgppppslplggSVAPGGDVRRRPPSRSPAAKPAAPARP 2881
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14186 SAPQPVHPAPNPPVHEFNYPTPPAVPQQPGVLNIPSYPTPVAPTPQSPiyipsQEQPKPTTRPsviNVPSVPQPAYPTPQ 14265
Cdd:PHA03247  2882 PVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQP-----QPPPPPPPRP---QPPLAPTTDPAGAG 2953
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14266 APVYDVNYPTSPSVIPHQPGVVNIpSVPLPAPPvkqRPVFVPSPVHPTPAPQPGV--------VNIPSVAQPVH--PTYQ 14335
Cdd:PHA03247  2954 EPSGAVPQPWLGALVPGRVAVPRF-RVPQPAPS---REAPASSTPPLTGHSLSRVsswasslaLHEETDPPPVSlkQTLW 3029
                          490       500       510       520
                   ....*....|....*....|....*....|....*....|.
gi 442625924 14336 PPVVERPAIYDVYYPPPPSRPGVINI-PSPPRPVYPVPQQP 14375
Cdd:PHA03247  3030 PPDDTEDSDADSLFDSDSERSDLEALdPLPPEPHDPFAHEP 3070
PHA03378 PHA03378
EBNA-3B; Provisional
13898-14608 2.26e-19

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 98.99  E-value: 2.26e-19
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13898 FTRCYETPKPVRPQIydtpsPPYPVAIP----------DLVYVQQQQPGIVNIPSAPQPIYPTPQSPQYNVNYPS-PQPA 13966
Cdd:PHA03378   300 FRQCTGRPRPTKPWL-----RAHPVAVPyddpltseeiDLAYARGLAMEIEAVRLPDDPIIVEDDDESEEIESECdPDED 374
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13967 NPQKPGVVNIP-SVPQPVYPSPQPPVYDVNYPTTPVSQHPGVVNIPSAPRLVPPTSQRPvfitspgNLSPTPQPgvinip 14045
Cdd:PHA03378   375 KSGAEALASIPqTLPDPPTVYGRPKVFARKADLKSTKKCRAIVTDPSVIKAIEEEHRKK-------KAARTEQP------ 441
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14046 svsQPGyPTPQSPIYDANYPTTQspipQQPGVVNIPSVPSPSypAPNPPVNYPT-------QPSPQIPVQPGVI------ 14112
Cdd:PHA03378   442 ---RAT-PHSQAPTVVLHRPPTQ----PLEGPTGPLSVQAPL--EPWQPLPHPQvtpvilhQPPAQGVQAHGSMldllek 511
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14113 -------NIPSAPLPTTPPQ------HPPVFipspespspapkPGVINIPSvthpEYPTSQVPVYD-VNYSTTPSPIPQK 14178
Cdd:PHA03378   512 ddedmeqRVMATLLPPSPPQpragrrAPCVY------------TEDLDIES----DEPASTEPVHDqLLPAPGLGPLQIQ 575
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14179 PGVVNIPSAPQPVHPA----PNPPVHEFNYPTPPAVPQQPGVLNIP-SYPTPVAPTPQSPIYIpsqeqpkpttRPSVINV 14253
Cdd:PHA03378   576 PLTSPTTSQLASSAPSyaqtPWPVPHPSQTPEPPTTQSHIPETSAPrQWPMPLRPIPMRPLRM----------QPITFNV 645
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14254 PSVPQPAYPTPQAPVYDVNYPTSPSVIPHQPGVVNI-PSVPLPAPPVKQRpvfvPSPVHPTPAPQPGVvnipsvaqpvhp 14332
Cdd:PHA03378   646 LVFPTPHQPPQVEITPYKPTWTQIGHIPYQPSPTGAnTMLPIQWAPGTMQ----PPPRAPTPMRPPAA------------ 709
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14333 tyqPPV-VERPAIYDVYYPPPPSRPGVINIPSPPRPVYPVPQQPIYVPAPVLHIPAPRPVIHNIPSVPQPTYPHRNPPIq 14411
Cdd:PHA03378   710 ---PPGrAQRPAAATGRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPPAAAPGAPTPQPPPQAPPA- 785
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14412 dvtypapqpsppvpgivnipslpqPVSTPTSGviniPSQASPPISVPTPGIVNIPSIP---QPTPQRPSPGIINVPSVPQ 14488
Cdd:PHA03378   786 ------------------------PQQRPRGA----PTPQPPPQAGPTSMQLMPRAAPgqqGPTKQILRQLLTGGVKRGR 837
                          650       660       670       680       690       700       710       720
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14489 PIPTAPSPGIINIPSVPQPLPSPTPGViNIPQQPT--PPPL--VQQPGIINIPSVQQPSTpTTQHPiqdvqyeTQRPQPT 14564
Cdd:PHA03378   838 PSLKKPAALERQAAAGPTPSPGSGTSD-KIVQAPVfyPPVLqpIQVMRQLGSVRAAAAST-VTQAP-------TEYTGER 908
                          730       740       750       760
                   ....*....|....*....|....*....|....*....|....
gi 442625924 14565 PGVINIPSVSQPtyptqkPSYQDTSYPTVQPKPPVSGIINIPSV 14608
Cdd:PHA03378   909 RGVGPMHPTDIP------PSKRAKTDAYVESQPPHGGQSHSFSV 946
PRK10263 PRK10263
DNA translocase FtsK; Provisional
14145-14647 4.40e-17

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 91.69  E-value: 4.40e-17
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14145 VINIPSVTHPEYPTSQvPVydVNYSTTPSPIPQKPGVVNIPS--APQPVHPAPNPPVHEfnyptppavP-QQPGVLNIPS 14221
Cdd:PRK10263   340 VTQTPPVASVDVPPAQ-PT--VAWQPVPGPQTGEPVIAPAPEgyPQQSQYAQPAVQYNE---------PlQQPVQPQQPY 407
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14222 YPTPVAPTPQSPIYIPSQEQPKPTTRPS-VINVPSVPQPAYPTPQAPVYDvnypTSPSVIPHQPGVVnipsvPLPAPPVK 14300
Cdd:PRK10263   408 YAPAAEQPAQQPYYAPAPEQPAQQPYYApAPEQPVAGNAWQAEEQQSTFA----PQSTYQTEQTYQQ-----PAAQEPLY 478
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14301 QRPVFVPSPvhPTPAPQPGVVNIPSVAQPVHptYQPPVVERPA----IYDVYYPPPPsRPgvINIPSPPRPVYPVPQQPI 14376
Cdd:PRK10263   479 QQPQPVEQQ--PVVEPEPVVEETKPARPPLY--YFEEVEEKRArereQLAAWYQPIP-EP--VKEPEPIKSSLKAPSVAA 551
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14377 yVPaPVLHIPAPRPVIHNIPS--VPQPTYPHRNPPIQDVTYPAPQPSPPVPGIVniPSLPQP--VSTPT-----SGVINI 14447
Cdd:PRK10263   552 -VP-PVEAAAAVSPLASGVKKatLATGAAATVAAPVFSLANSGGPRPQVKEGIG--PQLPRPkrIRVPTrrelaSYGIKL 627
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14448 PSQASPPISVPTPGIVNIPSIPQPTP------------------------------------------------------ 14473
Cdd:PRK10263   628 PSQRAAEEKAREAQRNQYDSGDQYNDdeidamqqdelarqfaqtqqqrygeqyqhdvpvnaedadaaaeaelarqfaqtq 707
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14474 -QRPS---PGIINVPSVP----QPI-------PTAP--SPGIINIpSVPQPLPSPTPGViNIPQQPTPPPLV-QQPgiin 14535
Cdd:PRK10263   708 qQRYSgeqPAGANPFSLDdfefSPMkallddgPHEPlfTPIVEPV-QQPQQPVAPQQQY-QQPQQPVAPQPQyQQP---- 781
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14536 ipsvQQPSTPTTQH--PIQDVQYETQRPQPTPGVINIPSVSQPTYPTQ-KPSYQdtsyptvQPKPPVSgiinipsvPQPV 14612
Cdd:PRK10263   782 ----QQPVAPQPQYqqPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVApQPQYQ-------QPQQPVA--------PQPQ 842
                          570       580       590       600
                   ....*....|....*....|....*....|....*....|
gi 442625924 14613 PSLTPGVI--NLPSEPSYSAPIPKPG---IINVPSIPEPI 14647
Cdd:PRK10263   843 DTLLHPLLmrNGDSRPLHKPTTPLPSldlLTPPPSEVEPV 882
ZP smart00241
Zona pellucida (ZP) domain; ZP proteins are responsible for sperm-adhesion of the zona ...
17722-17957 9.63e-17

Zona pellucida (ZP) domain; ZP proteins are responsible for sperm-adhesion of the zona pellucida. ZP domains are also present in multidomain transmembrane proteins such as glycoprotein GP2, uromodulin and TGF-beta receptor type III (betaglycan).


Pssm-ID: 214579  Cd Length: 252  Bit Score: 85.13  E-value: 9.63e-17
                             10        20        30        40        50        60        70        80
                     ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924   17722 CLADGVQVEIHiTEPGFNGVLYVKGHS-KDEECRRVVNLAGETVPRTEifrVHFGSCGM--QAVKDVA--SFVLVIQKHP 17796
Cdd:smart00241     2 CGEDQMVVSVS-TDLLFPGGINVKGLTlGDPSCRPQFTDATSAFVSFE---VPLNGCGTrrQVNPDGIvySNTLVVSPFH 77
                             90       100       110       120       130       140       150       160
                     ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924   17797 KLVTYKAQ--AYNIKCVYQTGEKnVTLGFNVSMLTTAGTIANTGPPPICQMRIITNEGE----EINSAEIGDNLKLQVDV 17870
Cdd:smart00241    78 PGFITRDDraAYHFQCFYPENEK-VSLNLDVSTIPPTELSSVSEGPLTCSYRLYKDDSFgspyQSADYVLGDPVYHEWEC 156
                            170       180       190       200       210       220       230       240
                     ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924   17871 EPATI--YGGFARSCIAKTMEDNVQNEYLVTDENGCATDTSIFGNWEYNPDTNSLL-ASFNAFKFPSSDNIRFQCNIRVC 17947
Cdd:smart00241   157 DGADDppLGLLVDNCYATPGPDPSSGPKYFIIDNGCPVDGYLDSTIPYNSNPLHRArFSVKVFKFADRSLVYFHCQIRLC 236
                            250
                     ....*....|....
gi 442625924   17948 ----FGRCQPVNCG 17957
Cdd:smart00241   237 dkddGSSCDGPACS 250
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
13798-14227 9.68e-17

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 90.21  E-value: 9.68e-17
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  13798 PVPIIQESPLTPCDPSPCGPNAQCHPSLNEAVCSCLPEfyGTPPNCRPECTLNSECA-----YDKACVH-------HKCV 13865
Cdd:pfam03154   172 PVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQ--GSPATSQPPNQTQSTAAphtliQQTPTLHpqrlpspHPPL 249
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  13866 DPCPGICGINADCRVHYHSPicyciSSHTGDPftrcyETPKPVR--PQIYDTPSPPYPVAIPdlvyvqqQQPGIVNIPSA 13943
Cdd:pfam03154   250 QPMTQPPPPSQVSPQPLPQP-----SLHGQMP-----PMPHSLQtgPSHMQHPVPPQPFPLT-------PQSSQSQVPPG 312
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  13944 PQPIYPTPQSPQYNVNYPSPQPANPQKPGVVNIPSVPQPVyPSPQPPvydvnyPTTPVSQHPGvvniPSAPRLvPPTSQR 14023
Cdd:pfam03154   313 PSPAAPGQSQQRIHTPPSQSQLQSQQPPREQPLPPAPLSM-PHIKPP------PTTPIPQLPN----PQSHKH-PPHLSG 380
                           250       260       270       280       290       300       310       320
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14024 PVFITSPGNLSPTPQPGVINIPSVSQPGYPTPQSPIYDANYPTTQSPIPQQPGVVNIPSVPSPSYPAPNPPVNYPTQPSP 14103
Cdd:pfam03154   381 PSPFQMNSNLPPPPALKPLSSLSTHHPPSAHPPPLQLMPQSQQLPPPPAQPPVLTQSQSLPPPAASHPPTSGLHQVPSQS 460
                           330       340       350       360       370       380       390       400
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14104 QIPVQPgviNIPSAPLPTTPPQHPPvfipspespspapkpgviniPSVThPEYPTSQVPVYDVNYSTTPSPipqkpgvvN 14183
Cdd:pfam03154   461 PFPQHP---FVPGGPPPITPPSGPP--------------------TSTS-SAMPGIQPPSSASVSSSGPVP--------A 508
                           410       420       430       440
                    ....*....|....*....|....*....|....*....|....*....
gi 442625924  14184 IPSAPQPVHPAPNPPVHEFNYPTPPAVPQ-----QPGVLNIPSYPTPVA 14227
Cdd:pfam03154   509 AVSCPLPPVQIKEEALDEAEEPESPPPPPrspspEPTVVNTPSHASQSA 557
PHA03378 PHA03378
EBNA-3B; Provisional
13944-14523 1.39e-16

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 89.74  E-value: 1.39e-16
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13944 PQPIyPTPQSPQYNVNYPSPQP-ANPQKPGVVNIPSVPQPVYPSPQ-PPVYDVNYPTTPVSQHPGVVNIPSA-------- 14013
Cdd:PHA03378   441 PRAT-PHSQAPTVVLHRPPTQPlEGPTGPLSVQAPLEPWQPLPHPQvTPVILHQPPAQGVQAHGSMLDLLEKddedmeqr 519
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14014 --PRLVPPTSQRPVfitsPGNLSPTPQPGVINIPSvsqpgyptpqspiydaNYPTTQSPIPQQPgvvnipsvpspsYPAP 14091
Cdd:PHA03378   520 vmATLLPPSPPQPR----AGRRAPCVYTEDLDIES----------------DEPASTEPVHDQL------------LPAP 567
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14092 NPpvnyptqpsPQIPVQPgVINIPSAPLPTTPPQHppvfipspespspAPKPGVINIPSvTHPEYPTSQvpvydvnystT 14171
Cdd:PHA03378   568 GL---------GPLQIQP-LTSPTTSQLASSAPSY-------------AQTPWPVPHPS-QTPEPPTTQ----------S 613
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14172 PSPIPQKPGVVNIPSAPQPVHPAPNPPVhEFNYPTPPAVPQQPGVlnipsYPTPVAPTPQSPIYIPSqeQPKPTTRPSVI 14251
Cdd:PHA03378   614 HIPETSAPRQWPMPLRPIPMRPLRMQPI-TFNVLVFPTPHQPPQV-----EITPYKPTWTQIGHIPY--QPSPTGANTML 685
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14252 NVPSVPQPAYPTPQAPVydvnyPTSPsviphqpgvvnipsvPLPAPPVKQRPVFVPSPVHPtPAPQPGVVNIPSVAQPVH 14331
Cdd:PHA03378   686 PIQWAPGTMQPPPRAPT-----PMRP---------------PAAPPGRAQRPAAATGRARP-PAAAPGRARPPAAAPGRA 744
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14332 PTYQ--PPVVERPAIYDVYYPPPPSRPGVINiPSPPRPVYPVP-QQPIYVPAPVLHIPA-PRPVIHNIPSVPQPTYPHRN 14407
Cdd:PHA03378   745 RPPAaaPGRARPPAAAPGRARPPAAAPGAPT-PQPPPQAPPAPqQRPRGAPTPQPPPQAgPTSMQLMPRAAPGQQGPTKQ 823
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14408 PPIQDVTYPA----PQPSPPVPGIVNIPSLPQPvsTPTSGVINIPSQAS---PPISVPT--PGIVNIP------SIPQPT 14472
Cdd:PHA03378   824 ILRQLLTGGVkrgrPSLKKPAALERQAAAGPTP--SPGSGTSDKIVQAPvfyPPVLQPIqvMRQLGSVraaaasTVTQAP 901
                          570       580       590       600       610
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 442625924 14473 PQRPSPGIINVPSVPQPIPTAPSPGIINI--PSVPQPLPSPTPGVI----NIPQQPT 14523
Cdd:PHA03378   902 TEYTGERRGVGPMHPTDIPPSKRAKTDAYveSQPPHGGQSHSFSVIwenvSQGQQQT 958
PHA03247 PHA03247
large tegument protein UL36; Provisional
14295-14705 2.14e-16

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 89.61  E-value: 2.14e-16
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14295 PAPPVKQRPVFVPSPVHPTPAPQPGVVNIPSVAQPVHPTYQPPVV---------------------ERPAIYDVYYPPPP 14353
Cdd:PHA03247  2475 PGAPVYRRPAEARFPFAAGAAPDPGGGGPPDPDAPPAPSRLAPAIlpdepvgepvhprmltwirglEELASDDAGDPPPP 2554
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14354 SRPGVI------NIPSP---PRPVYP----------VPQQPIYVPAPVLHIPAPRPVIHNIPSVPQPTYPHRNPPiqdvT 14414
Cdd:PHA03247  2555 LPPAAPpaapdrSVPPPrpaPRPSEPavtsrarrpdAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPPP----S 2630
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14415 YPAPQPSPPVPGIVNIPSLPQPVSTPTsgviniPSQASPPISVPTPGIVNIPSIPQPTPQRPS--PGIINVPSVPQPipt 14492
Cdd:PHA03247  2631 PSPAANEPDPHPPPTVPPPERPRDDPA------PGRVSRPRRARRLGRAAQASSPPQRPRRRAarPTVGSLTSLADP--- 2701
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14493 aPSPGiinipsvPQPLPSPTPGVINIPQQPTPPplvqqpgiinipSVQQPSTPTTQHPIqdvqyetqrPQPTPgviNIPS 14572
Cdd:PHA03247  2702 -PPPP-------PTPEPAPHALVSATPLPPGPA------------AARQASPALPAAPA---------PPAVP---AGPA 2749
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14573 VsqPTYPTQKPSYQDTSYPT--VQPKPPVSGiiniPSVPQPVPSLTPGVINLPSEPSYSAPIPKPGIINVPSIPEPIPSi 14650
Cdd:PHA03247  2750 T--PGGPARPARPPTTAGPPapAPPAAPAAG----PPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAA- 2822
                          410       420       430       440       450
                   ....*....|....*....|....*....|....*....|....*....|....*
gi 442625924 14651 pqnpvqevyhdtqkpqaipgvvnVPSAPQPTPGRPYYDVAKPDFEFNPCYPSPCG 14705
Cdd:PHA03247  2823 -----------------------SPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGG 2854
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
14033-14401 3.63e-16

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 87.52  E-value: 3.63e-16
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14033 LSPTPQPGVINIPSVSQPGYPTPQSPIYDAnyPTTQsPIPQqpgvvniPSVPSPSYPAPNPPvnyPTQPSPQIPVQPGVI 14112
Cdd:NF033839   147 SSSSSSSGSSTKPETPQPENPEHQKPTTPA--PDTK-PSPQ-------PEGKKPSVPDINQE---KEKAKLAVATYMSKI 213
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14113 --NIPSAPLPTTPPQHPPVFIPSPESPSPAPKPGVINIPSVTHPEYPTSQV----PVYDVNYSTTPSPIPQKPGVVNIPS 14186
Cdd:NF033839   214 ldDIQKHHLQKEKHRQIVALIKELDELKKQALSEIDNVNTKVEIENTVHKIfadmDAVVTKFKKGLTQDTPKEPGNKKPS 293
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14187 APQP-VHPAPNPPVHEfnyPTPPAVPQQPGVLNIPSYPTP-VAPTPQS--PIYIPSQEQPKPTTRPSvinvPSVPQPAY- 14261
Cdd:NF033839   294 APKPgMQPSPQPEKKE---VKPEPETPKPEVKPQLEKPKPeVKPQPEKpkPEVKPQLETPKPEVKPQ----PEKPKPEVk 366
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14262 PTPQAPvydvnyptSPSVIPhQPGVvnipsvplPAPPVKQRPVfVPSP-VHPTP-APQPGVVNIPSVAQP-VHPTYQPPv 14338
Cdd:NF033839   367 PQPEKP--------KPEVKP-QPET--------PKPEVKPQPE-KPKPeVKPQPeKPKPEVKPQPEKPKPeVKPQPEKP- 427
                          330       340       350       360       370       380
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 442625924 14339 veRPaiyDVYYPPPPSRPGVINIPSPPRP-VYPVPQQPiyVPAPVLHIPAPRPVIHNIPSVPQP 14401
Cdd:NF033839   428 --KP---EVKPQPEKPKPEVKPQPEKPKPeVKPQPETP--KPEVKPQPEKPKPEVKPQPEKPKP 484
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
13941-14278 8.64e-16

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 86.36  E-value: 8.64e-16
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13941 PSAPQPIYPTPQSPQYNV--NYPSPQP--ANPQKPGVVNIPSVPQP-VYPSPQPPVYDVNYPTTPVSQHPGVVNIPSA-- 14013
Cdd:NF033839   159 PETPQPENPEHQKPTTPApdTKPSPQPegKKPSVPDINQEKEKAKLaVATYMSKILDDIQKHHLQKEKHRQIVALIKEld 238
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14014 --------------PRLVPPTSQRPVFIT--------SPGNLSPTPQPGVINIPSVSQPGY-PTPQSPIydanypTTQSP 14070
Cdd:NF033839   239 elkkqalseidnvnTKVEIENTVHKIFADmdavvtkfKKGLTQDTPKEPGNKKPSAPKPGMqPSPQPEK------KEVKP 312
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14071 IPQQPGVVNIPSVPSPSyPAPNPPvnyPTQPSPQIPVQPGVINIPSAPLPTTP-PQHPPvfipspesPSPAPKPGVINIP 14149
Cdd:NF033839   313 EPETPKPEVKPQLEKPK-PEVKPQ---PEKPKPEVKPQLETPKPEVKPQPEKPkPEVKP--------QPEKPKPEVKPQP 380
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14150 SVTHPEY-PTSQVPVYDVNysttPSPIPQKPGVVNIPSAPQP-VHPAPNPPVHEFNyPTPPAvpQQPGVLNIPSYPTP-V 14226
Cdd:NF033839   381 ETPKPEVkPQPEKPKPEVK----PQPEKPKPEVKPQPEKPKPeVKPQPEKPKPEVK-PQPEK--PKPEVKPQPEKPKPeV 453
                          330       340       350       360       370
                   ....*....|....*....|....*....|....*....|....*....|....
gi 442625924 14227 APTPQSPI--YIPSQEQPKPTTRPSvinvPSVPQPAYPTPQApvyDVNYPTSPS 14278
Cdd:NF033839   454 KPQPETPKpeVKPQPEKPKPEVKPQ----PEKPKPDNSKPQA---DDKKPSTPN 500
PRK10263 PRK10263
DNA translocase FtsK; Provisional
13907-14381 1.72e-15

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 86.29  E-value: 1.72e-15
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13907 PVRPQiydTPSPPYPVAIPDLvyvqqQQPGIvnipsAPQPIyPTPQSPQyNVNYPSPQPANPQkpgvvniPSVPQPVYPS 13986
Cdd:PRK10263   336 PVEPV---TQTPPVASVDVPP-----AQPTV-----AWQPV-PGPQTGE-PVIAPAPEGYPQQ-------SQYAQPAVQY 393
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13987 PQPpvYDVNYPTTPVSQHPGVVNIPSAPRLVPPTSQRPVFitspgnLSPTPQPgviNIPSVSQPGYPTPQSPIYDAN--- 14063
Cdd:PRK10263   394 NEP--LQQPVQPQQPYYAPAAEQPAQQPYYAPAPEQPAQQ------PYYAPAP---EQPVAGNAWQAEEQQSTFAPQsty 462
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14064 --YPTTQSPIPQQPGVVNIPSVPSPSYPAPNP------PVNYPT---------------------QPSPQiPVQPGVINI 14114
Cdd:PRK10263   463 qtEQTYQQPAAQEPLYQQPQPVEQQPVVEPEPvveetkPARPPLyyfeeveekrarereqlaawyQPIPE-PVKEPEPIK 541
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14115 PSAPlPTTPPQHPPVfipSPESPSPAPKPGVINIPSVTHPEyPTSQVPVYDVNYSTTPSPI------PQ--KPGVVNIPS 14186
Cdd:PRK10263   542 SSLK-APSVAAVPPV---EAAAAVSPLASGVKKATLATGAA-ATVAAPVFSLANSGGPRPQvkegigPQlpRPKRIRVPT 616
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14187 ------------------------------------------------APQ-----------------PVHPAPNPPVHE 14201
Cdd:PRK10263   617 rrelasygiklpsqraaeekareaqrnqydsgdqynddeidamqqdelARQfaqtqqqrygeqyqhdvPVNAEDADAAAE 696
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14202 FNYPTPPAVPQQ-------PGVLNIPSYP----TP----VAPTPQSPIYIPSQEqpkPTTRPSVinvPSVPQPAYPTPQA 14266
Cdd:PRK10263   697 AELARQFAQTQQqrysgeqPAGANPFSLDdfefSPmkalLDDGPHEPLFTPIVE---PVQQPQQ---PVAPQQQYQQPQQ 770
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14267 PV---YDVNYPTSPSVIPHQPGVVNIPSVPLPAPPVKQRPVfvpspvhpTPAPQPGVVNIPSVAQPVHPTYQPPVVERPA 14343
Cdd:PRK10263   771 PVapqPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPV--------APQPQYQQPQQPVAPQPQYQQPQQPVAPQPQ 842
                          570       580       590       600
                   ....*....|....*....|....*....|....*....|.
gi 442625924 14344 ---IYDVYYPPPPSRPgvinipsPPRPVYPVPQQPIYVPAP 14381
Cdd:PRK10263   843 dtlLHPLLMRNGDSRP-------LHKPTTPLPSLDLLTPPP 876
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
14354-14692 5.51e-15

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 83.66  E-value: 5.51e-15
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14354 SRPGVINIPSPPRPVYPVPQQPIyVPAPVLHiPAPRPVIHNiPSVPQPTYPHRNPPIQDVTYPAPQPSPPVPGIVNIPSL 14433
Cdd:NF033839   151 SSSGSSTKPETPQPENPEHQKPT-TPAPDTK-PSPQPEGKK-PSVPDINQEKEKAKLAVATYMSKILDDIQKHHLQKEKH 227
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14434 PQPVSTPTSgVINIPSQASPPISVPTPGIVnipsiPQPTPQRPSPGIINVPSVPQP--IPTAPSPGIINIPSVPQPL--P 14509
Cdd:NF033839   228 RQIVALIKE-LDELKKQALSEIDNVNTKVE-----IENTVHKIFADMDAVVTKFKKglTQDTPKEPGNKKPSAPKPGmqP 301
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14510 SPTPGVINIPQQPTPP-----PLVQQPGiiniPSVQ-QPSTPTtqhPIQDVQYETQRPQ-------PTPGVINIPSVSQP 14576
Cdd:NF033839   302 SPQPEKKEVKPEPETPkpevkPQLEKPK----PEVKpQPEKPK---PEVKPQLETPKPEvkpqpekPKPEVKPQPEKPKP 374
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14577 TYPTQ----KPSYQ---DTSYPTVQPKPPVSGIINIPSVPQPVPSLTPGVINLPSEPSYSAPIPKPGIINVPSIPEP-IP 14648
Cdd:NF033839   375 EVKPQpetpKPEVKpqpEKPKPEVKPQPEKPKPEVKPQPEKPKPEVKPQPEKPKPEVKPQPEKPKPEVKPQPEKPKPeVK 454
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|....
gi 442625924 14649 SIPQNPVQEVYHDTQKPQaiPGVVNVPSAPQPTPGRPYYDVAKP 14692
Cdd:NF033839   455 PQPETPKPEVKPQPEKPK--PEVKPQPEKPKPDNSKPQADDKKP 496
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
14373-14708 5.40e-14

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 81.35  E-value: 5.40e-14
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14373 QQPIYVPAPVLHIPAPrPVIHNIPSVPQPTYPHRNPPIQDVTYPAPQPSPPvpgIVNIPSLPQPVSTPTSGVINIPSQAS 14452
Cdd:pfam03154   164 QQILQTQPPVLQAQSG-AASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPA---TSQPPNQTQSTAAPHTLIQQTPTLHP 239
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14453 PPISVPTPGIVNIPSIPQPT---PQRPSPGIINVPSVPQPIPTAPSPGIINIPSVPQPLPSPTPGViniPQQPTPPPLVQ 14529
Cdd:pfam03154   240 QRLPSPHPPLQPMTQPPPPSqvsPQPLPQPSLHGQMPPMPHSLQTGPSHMQHPVPPQPFPLTPQSS---QSQVPPGPSPA 316
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14530 QPGiiniPSVQQPSTPTTQHPIQDVQYETQRPQPtPGVINIPSVS-QPTYP-TQKPSYQDTSYPTVQPKP-PVSGIINIP 14606
Cdd:pfam03154   317 APG----QSQQRIHTPPSQSQLQSQQPPREQPLP-PAPLSMPHIKpPPTTPiPQLPNPQSHKHPPHLSGPsPFQMNSNLP 391
                           250       260       270       280       290       300       310       320
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14607 svpqPVPSLTPgvinLPSEPSYSAPIPKPGIINVPSIPEPIPSIP-QNPVQevyhdTQKPQAIPGVVNVP--SAPQPTPG 14683
Cdd:pfam03154   392 ----PPPALKP----LSSLSTHHPPSAHPPPLQLMPQSQQLPPPPaQPPVL-----TQSQSLPPPAASHPptSGLHQVPS 458
                           330       340
                    ....*....|....*....|....*
gi 442625924  14684 RPYYdvakPDFEFNPCYPSPCGPYS 14708
Cdd:pfam03154   459 QSPF----PQHPFVPGGPPPITPPS 479
PHA03379 PHA03379
EBNA-3A; Provisional
14117-14575 7.94e-14

EBNA-3A; Provisional


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 80.87  E-value: 7.94e-14
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14117 APLPTTPPQHPPVFIPSPESPSPAPkpgvinipsvTHPEYPTSQVPVYDVNYSTTPSPIPQKPGVvnipsAPQPVhpAPN 14196
Cdd:PHA03379   408 ASEPTYGTPRPPVEKPRPEVPQSLE----------TATSHGSAQVPEPPPVHDLEPGPLHDQHSM-----APCPV--AQL 470
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14197 PPVhefnyPTPPAVP--QQPGVLNIPS-YPTPVaPTPQSPIYIPSqeQPKPTTRPSVINVPSVPQPA----YPTPQAPVY 14269
Cdd:PHA03379   471 PPG-----PLQDLEPgdQLPGVVQDGRpACAPV-PAPAGPIVRPW--EASLSQVPGVAFAPVMPQPMpvepVPVPTVALE 542
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14270 DVNYPTSPSVIPHQPGVvnipsvplPAPPVKQRPVFVPSPVHPTPAPQPGVVNI---PSVAQPVHPTYQPPV-VERPAIY 14345
Cdd:PHA03379   543 RPVCPAPPLIAMQGPGE--------TSGIVRVRERWRPAPWTPNPPRSPSQMSVrdrLARLRAEAQPYQASVeVQPPQLT 614
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14346 DVYYPPPPSRPGVINipSPPRPVYPVPQQPIYVPAPvlHIPAPRPvihnipsvpqptyPHRNPPIQDVTYPAPQPSPPVP 14425
Cdd:PHA03379   615 QVSPQQPMEYPLEPE--QQMFPGSPFSQVADVMRAG--GVPAMQP-------------QYFDLPLQQPISQGAPLAPLRA 677
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14426 GIVNIPslPQPVSTPTSGVINIpsqaSPPISVPTPGIVNIPSIPQPTPQRPSPGIINVPSVPQPI-PTAPSPGIINIPsV 14504
Cdd:PHA03379   678 SMGPVP--PVPATQPQYFDIPL----TEPINQGASAAHFLPQQPMEGPLVPERWMFQGATLSQSVrPGVAQSQYFDLP-L 750
                          410       420       430       440       450       460       470
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 442625924 14505 PQPLPSPTPGVINIPQQPTPPPLVQQPGIINIPSVqQPSTPTTQHPIQDVQYeTQRPQPTPGVINIPSVSQ 14575
Cdd:PHA03379   751 TQPINHGAPAAHFLHQPPMEGPWVPEQWMFQGAPP-SQGTDVVQHQLDALGY-VLHVLNHPGVPVSPAVNQ 819
PRK10263 PRK10263
DNA translocase FtsK; Provisional
13966-14410 1.33e-13

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 80.13  E-value: 1.33e-13
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13966 ANPQKPgVVNIPSVPQPVYPSPQPpvyDVNYPTTPVSQHPGVVnIPSAPRLVPPTSQRPVfiTSPGNLSPTPQPGVINIP 14045
Cdd:PRK10263   334 AAPVEP-VTQTPPVASVDVPPAQP---TVAWQPVPGPQTGEPV-IAPAPEGYPQQSQYAQ--PAVQYNEPLQQPVQPQQP 406
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14046 SVSQPGYPTPQSPIYDANYPTTQ-----SPIPQQPGVVNIPSVPSPSYPAPNPPVNYPTQPSPQipvqpgvinipsaPLP 14120
Cdd:PRK10263   407 YYAPAAEQPAQQPYYAPAPEQPAqqpyyAPAPEQPVAGNAWQAEEQQSTFAPQSTYQTEQTYQQ-------------PAA 473
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14121 TTPPQHPPVFIPSpespspapkpgviniPSVTHPEYPTSQV-----PVY---DVNYSTT-----------PSPIPQKPGV 14181
Cdd:PRK10263   474 QEPLYQQPQPVEQ---------------QPVVEPEPVVEETkparpPLYyfeEVEEKRArereqlaawyqPIPEPVKEPE 538
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14182 VNIPSAPqPVHPAPNPPVHEFNYPTPPAVPQQPGVLNIPSYPTPVAPT------------------PQSP---------- 14233
Cdd:PRK10263   539 PIKSSLK-APSVAAVPPVEAAAAVSPLASGVKKATLATGAAATVAAPVfslansggprpqvkegigPQLPrpkrirvptr 617
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14234 -------IYIPSQEQPKPTTRPSVINVPSVPQPAY----------------PTPQAPVYDVNYPTSPSVIP--------- 14281
Cdd:PRK10263   618 relasygIKLPSQRAAEEKAREAQRNQYDSGDQYNddeidamqqdelarqfAQTQQQRYGEQYQHDVPVNAedadaaaea 697
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14282 ----------------HQPGVVNIPSVP-LPAPPVK-------QRPVFVPSpVHPTPAPQPGVVNIPSVAQPVHPTYQPP 14337
Cdd:PRK10263   698 elarqfaqtqqqrysgEQPAGANPFSLDdFEFSPMKallddgpHEPLFTPI-VEPVQQPQQPVAPQQQYQQPQQPVAPQP 776
                          490       500       510       520       530       540       550
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 442625924 14338 VVERPAiydvyYP-PPPSRPGVINIPSPPRPVYPVPQQPIyVPAPVLHIPAPrpvihniPSVPQPTYPHRNPPI 14410
Cdd:PRK10263   777 QYQQPQ-----QPvAPQPQYQQPQQPVAPQPQYQQPQQPV-APQPQYQQPQQ-------PVAPQPQYQQPQQPV 837
Streccoc_I_II NF033804
antigen I/II family LPXTG-anchored adhesin; Members of the antigen I/II family are adhesins ...
14157-14365 1.60e-13

antigen I/II family LPXTG-anchored adhesin; Members of the antigen I/II family are adhesins with a glucan-binding domain, two types of repetitive regions, an isopeptide bond-forming domain associated with shear resistance, and a C-terminal LPXTG motif for anchoring to the cell wall. They occur in oral Streptococci, and tend to be major cell surface adhesins. Members of this family include SspA and SspB from Streptococcus gordonii, antigen I/II from S. mutans, etc.


Pssm-ID: 468188 [Multi-domain]  Cd Length: 1552  Bit Score: 79.98  E-value: 1.60e-13
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14157 PTSQVPVYDVNYSTTPspipQKPGV----------VNIPSAPQ-----PVHP-APNPPVHEFNYPTPPAvpqqPGVLNIP 14220
Cdd:NF033804   791 PSDEMPAVPGRDNTEG----KKPNIwyslngkiraVNVPKITKekptpPVAPtAPQAPTYEVEKPLEPA----PVAPTYE 862
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14221 SYPTPVAPTPQspiyipsQEQPKPTTRPSVinvpSVPQPAYPTPQAPVYDvNYPTSPSVIPHQPgvvnIPSVPLPAPPVK 14300
Cdd:NF033804   863 NEPTPPVKTPD-------QPEPSKPEEPTY----ETEKPLEPAPVAPTYE-NEPTPPVKTPDQP----EPSKPEEPTYET 926
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 442625924 14301 QRPVfVPSPVHPT----PAPQPGVVNIPSVAQPVHPTYQPpvverpaiydvyYPPPPSRPGVINIPSPP 14365
Cdd:NF033804   927 EKPL-EPAPVAPSyenePTPPVKTPDQPEPSKPVEPTYDP------------LPTPPVAPTPKQLPTPP 982
PHA03377 PHA03377
EBNA-3C; Provisional
14151-14651 2.15e-13

EBNA-3C; Provisional


Pssm-ID: 177614 [Multi-domain]  Cd Length: 1000  Bit Score: 79.33  E-value: 2.15e-13
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14151 VTHPEYPTSQVPVYDVNYSTTPSPIPQKPGvvnipsapqpvhPAPNPPVhefnyPTPPAVPQQPGvlnipsYPTPVA-PT 14229
Cdd:PHA03377   425 KTHPVKRTLVKTSGRSDEAEQAQSTPERPG------------PSDQPSV-----PVEPAHLTPVE------HTTVILhQP 481
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14230 PQSPIYIPSQEQPKPTTRPS------------VINV------PSVPQPAYPTpqapvydvnypTSPSVIPHQPGVVNIPS 14291
Cdd:PHA03377   482 PQSPPTVAIKPAPPPSRRRRgacvvydddiieVIDVetteeeESVTQPAKPH-----------RKVQDGFQRSGRRQKRA 550
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14292 VPLPAPPVKQRPVFVPSPVHPTPAPQPGVVNIPSVAqpvhPTYQPPVVERPAIYDVYYPPPPSrpgvinipSPPRPVYPV 14371
Cdd:PHA03377   551 TPPKVSPSDRGPPKASPPVMAPPSTGPRVMATPSTG----PRDMAPPSTGPRQQAKCKDGPPA--------SGPHEKQPP 618
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14372 PQQPIYVPAPVLHI------------PAPRP-----VIHNIPSVPQPTYPHRNPPIQDVTYPAPQpsppvpgivnIPSLP 14434
Cdd:PHA03377   619 SSAPRDMAPSVVRMflrerlleqstgPKPKSfwemrAGRDGSGIQQEPSSRRQPATQSTPPRPSW----------LPSVF 688
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14435 QPVSTPTSGVINIPSQASPPISVPTPgivnIPSIPQPT---PQRPSPGIINVPSVPQPIPTAPSPGiiniPSVPQPLPSP 14511
Cdd:PHA03377   689 VLPSVDAGRAQPSEESHLSSMSPTQP----ISHEEQPRyedPDDPLDLSLHPDQAPPPSHQAPYSG----HEEPQAQQAP 760
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14512 TPGVinipQQPTPPPL----VQQP-----GIINIPSVQQPSTPTTQHPIQDVQYETQRPQPTPGVINIPSVSQPTY--PT 14580
Cdd:PHA03377   761 YPGY----WEPRPPQApylgYQEPqaqgvQVSSYPGYAGPWGLRAQHPRYRHSWAYWSQYPGHGHPQGPWAPRPPHlpPQ 836
                          490       500       510       520       530       540       550
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 442625924 14581 QKPSY-----QDTSYPTVQPK--PPVSGIINIPSVPQPVPSLTpgvinlPSEPSYSAPIPKPGIINVPSiPEPIPSIP 14651
Cdd:PHA03377   837 WDGSAghgqdQVSQFPHLQSEtgPPRLQLSQVPQLPYSQTLVS------SSAPSWSSPQPRAPIRPIPT-RFPPPPMP 907
PRK10263 PRK10263
DNA translocase FtsK; Provisional
14057-14545 2.16e-13

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 79.36  E-value: 2.16e-13
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14057 SPIYDANYPTTQSPI---PQQPgVVNIPSVPSPSYPAPNPPVNYPTQPSPQIPvQPGVinipsAPLPTTPPQHPPVFIPS 14133
Cdd:PRK10263   318 EPVAVAAAATTATQSwaaPVEP-VTQTPPVASVDVPPAQPTVAWQPVPGPQTG-EPVI-----APAPEGYPQQSQYAQPA 390
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14134 PESPSPAPKPGVINIPSVTHPEYPTSQVPVYDVNYST-----TPSPIPQKPGVVNIPSAPQPVHPAPNPPVHEFNYPTPP 14208
Cdd:PRK10263   391 VQYNEPLQQPVQPQQPYYAPAAEQPAQQPYYAPAPEQpaqqpYYAPAPEQPVAGNAWQAEEQQSTFAPQSTYQTEQTYQQ 470
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14209 AVPQQPGVLNIPSYPTPVAPTPQspiyiPSQEQPKPTtRPSVINVPSVPQP-AYPTPQAPVYdvnYPTSPSviPHQPGVV 14287
Cdd:PRK10263   471 PAAQEPLYQQPQPVEQQPVVEPE-----PVVEETKPA-RPPLYYFEEVEEKrAREREQLAAW---YQPIPE--PVKEPEP 539
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14288 NIPSVPLPAPPVkqrpvfVPsPVHPTPAPQP-------GVVNIPSVAQPVHPTYQPPV--VERPAIYDVYYP--PPPSRP 14356
Cdd:PRK10263   540 IKSSLKAPSVAA------VP-PVEAAAAVSPlasgvkkATLATGAAATVAAPVFSLANsgGPRPQVKEGIGPqlPRPKRI 612
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14357 GV----------INIPS----------PPRPVYPVPQQPIYVPAPVLH-------------------------------- 14384
Cdd:PRK10263   613 RVptrrelasygIKLPSqraaeekareAQRNQYDSGDQYNDDEIDAMQqdelarqfaqtqqqrygeqyqhdvpvnaedad 692
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14385 IPAPRPVIHNIPSVPQPTYPHRNP--------------PIQDVtypapqpsppvpgIVNIPSLP------QPVSTPTSGV 14444
Cdd:PRK10263   693 AAAEAELARQFAQTQQQRYSGEQPaganpfslddfefsPMKAL-------------LDDGPHEPlftpivEPVQQPQQPV 759
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14445 INIPSQASPPISVPTPGIVNIPSIPQPTPQR---------PSPGIINV--PSVPQPIPTAPSPGIINIPSVPQPLPSPTP 14513
Cdd:PRK10263   760 APQQQYQQPQQPVAPQPQYQQPQQPVAPQPQyqqpqqpvaPQPQYQQPqqPVAPQPQYQQPQQPVAPQPQYQQPQQPVAP 839
                          570       580       590
                   ....*....|....*....|....*....|..
gi 442625924 14514 GviniPQQPTPPPLVQQPGiiNIPSVQQPSTP 14545
Cdd:PRK10263   840 Q----PQDTLLHPLLMRNG--DSRPLHKPTTP 865
PHA03379 PHA03379
EBNA-3A; Provisional
14154-14685 6.03e-13

EBNA-3A; Provisional


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 77.79  E-value: 6.03e-13
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14154 PEYPTSQVPVydvnysttPSPIPQKP---------GVVNIPSaPQPVHPAPNPPVHEFNYPTPPAVPQQPgvlnipsyPT 14224
Cdd:PHA03379   411 PTYGTPRPPV--------EKPRPEVPqsletatshGSAQVPE-PPPVHDLEPGPLHDQHSMAPCPVAQLP--------PG 473
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14225 PVAPTPqspiyiPSQEQPKPttrpsvinvPSVPQPAyPTPqapvydVNYPTSPSVIPHQPGVVNIPSVPlPAPPVKQRPV 14304
Cdd:PHA03379   474 PLQDLE------PGDQLPGV---------VQDGRPA-CAP------VPAPAGPIVRPWEASLSQVPGVA-FAPVMPQPMP 530
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14305 FVPSPVhPTPAPQPGVVNIPSVAQ---PVHPTYQPPVVERpaiydvYYPPPPSrpgviniPSPPRPVypvpqqpiyVPAP 14381
Cdd:PHA03379   531 VEPVPV-PTVALERPVCPAPPLIAmqgPGETSGIVRVRER------WRPAPWT-------PNPPRSP---------SQMS 587
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14382 VLHIPA---PRPVIHNIPSVPQPTYPHRNPPIQDVTYpapqpsppvpgivniPSLPQPVSTPTSGVINIPSQA-SPPISV 14457
Cdd:PHA03379   588 VRDRLArlrAEAQPYQASVEVQPPQLTQVSPQQPMEY---------------PLEPEQQMFPGSPFSQVADVMrAGGVPA 652
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14458 PTPGIVNIPsIPQPTPQRPSPGIINVPSVPQPIPTAPSPGIINIPsVPQPLPSPTPGVINIPQQPTPPPLV--------- 14528
Cdd:PHA03379   653 MQPQYFDLP-LQQPISQGAPLAPLRASMGPVPPVPATQPQYFDIP-LTEPINQGASAAHFLPQQPMEGPLVperwmfqga 730
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14529 -----QQPGIINIPSVQQPSTPTTQHPIQDVQYETQRPQPTPGV-----INIPSVSQPT--YPTQKPSYQDTSYPTVQPK 14596
Cdd:PHA03379   731 tlsqsVRPGVAQSQYFDLPLTQPINHGAPAAHFLHQPPMEGPWVpeqwmFQGAPPSQGTdvVQHQLDALGYVLHVLNHPG 810
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14597 PPVSGIINIPSVPQ-----PVPSLTPGVINLPSEPSYSAPIPKPGiinvpsipEPIPSIPQNPVQEvyhdtQKPQAIPGV 14671
Cdd:PHA03379   811 VPVSPAVNQYHVSQaafglPIDEDESGEGSDTSEPCEALDLSIHG--------RPCPQAPEWPVQG-----EGGQDATEV 877
                          570
                   ....*....|....
gi 442625924 14672 VNVPSAPQPTPGRP 14685
Cdd:PHA03379   878 LDLSIHGRPRPRTP 891
PHA03379 PHA03379
EBNA-3A; Provisional
13894-14303 6.51e-13

EBNA-3A; Provisional


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 77.79  E-value: 6.51e-13
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13894 TGDPFTRCYETPKPVRPQIYDTPSPPYPVAIPDLVYVQQQQP--GIVNIPSaPQPIYPTPQSPQYNVNYPSPQPanpqkp 13971
Cdd:PHA03379   394 AGKLTERAREALEKASEPTYGTPRPPVEKPRPEVPQSLETATshGSAQVPE-PPPVHDLEPGPLHDQHSMAPCP------ 466
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13972 gVVNIPsvpqpvyPSPQPPVydvnyptTPVSQHPGVvniPSAPRLVPPTSQRPV-FITSPGNLSPTPQPGVINIPSVSQP 14050
Cdd:PHA03379   467 -VAQLP-------PGPLQDL-------EPGDQLPGV---VQDGRPACAPVPAPAgPIVRPWEASLSQVPGVAFAPVMPQP 528
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14051 --GYPTPQsPIYDANYPTTQSPI------PQQP-GVVNIPSVPSPSYPAPNPPvnyptQPSPQIPVQPGV---------- 14111
Cdd:PHA03379   529 mpVEPVPV-PTVALERPVCPAPPliamqgPGETsGIVRVRERWRPAPWTPNPP-----RSPSQMSVRDRLarlraeaqpy 602
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14112 ---INIPSAPLPTTPPQHPPVFIPSPESPSPAPKPGVINIPSVTHPEYPTSQVPVYDvnYSTTpSPIPQKPGVVNIPSA- 14187
Cdd:PHA03379   603 qasVEVQPPQLTQVSPQQPMEYPLEPEQQMFPGSPFSQVADVMRAGGVPAMQPQYFD--LPLQ-QPISQGAPLAPLRASm 679
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14188 -PQPVHPAPNPPVHEFNYPTPPA--------VPQQP--GVLNIPSYPTPVAPTPQS--PIYIPSQEQPKPTTRPsvIN-- 14252
Cdd:PHA03379   680 gPVPPVPATQPQYFDIPLTEPINqgasaahfLPQQPmeGPLVPERWMFQGATLSQSvrPGVAQSQYFDLPLTQP--INhg 757
                          410       420       430       440       450
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 442625924 14253 ---VPSVPQPAYPTPQAPVYDVNYPTSPS----VIPHQPGVVNIPSVPLPAPPVKQRP 14303
Cdd:PHA03379   758 apaAHFLHQPPMEGPWVPEQWMFQGAPPSqgtdVVQHQLDALGYVLHVLNHPGVPVSP 815
PRK10263 PRK10263
DNA translocase FtsK; Provisional
14203-14706 7.25e-13

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 77.82  E-value: 7.25e-13
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14203 NYPTPPAVPQQPGVLNIPSYPTPVAPTPQSPIYIPSQEQPKPTTRPSvinvPSVPQPAyPTPQAPVydVNYPTSPSVIPH 14282
Cdd:PRK10263   297 NRATQPEYDEYDPLLNGAPITEPVAVAAAATTATQSWAAPVEPVTQT----PPVASVD-VPPAQPT--VAWQPVPGPQTG 369
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14283 QPGVVNIPSVPLPAPPVKQRPVFVPSPVHpTPAPQPGVVNIPSVAQPVHPTYQPPVVERPAIYDVYYPPPPSrpgviniP 14362
Cdd:PRK10263   370 EPVIAPAPEGYPQQSQYAQPAVQYNEPLQ-QPVQPQQPYYAPAAEQPAQQPYYAPAPEQPAQQPYYAPAPEQ-------P 441
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14363 SPPRPVYPVPQQPIYVPAPVLhipaprpvihnipsvpQPTYPHRNPPIQDVTYPAPQPsppvpgivnipsLPQPVSTPTS 14442
Cdd:PRK10263   442 VAGNAWQAEEQQSTFAPQSTY----------------QTEQTYQQPAAQEPLYQQPQP------------VEQQPVVEPE 493
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14443 GVINIPSQASPPI----------------------SVPTPgivnipsIPQPTPQRPSPGIINVPSVPqPIPTAPS----- 14495
Cdd:PRK10263   494 PVVEETKPARPPLyyfeeveekrarereqlaawyqPIPEP-------VKEPEPIKSSLKAPSVAAVP-PVEAAAAvspla 565
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14496 PGIINIPSVPQPLPSPTPGVINIPQQPTPPPLVQQ--------PGIINIPSVQQPSTPTTQHPIQDVQYE----TQRPQP 14563
Cdd:PRK10263   566 SGVKKATLATGAAATVAAPVFSLANSGGPRPQVKEgigpqlprPKRIRVPTRRELASYGIKLPSQRAAEEkareAQRNQY 645
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14564 TPGVI----NIPSVSQ-----------------------PTYPT------------QKPSYQDTSYPTVQPK-------- 14596
Cdd:PRK10263   646 DSGDQynddEIDAMQQdelarqfaqtqqqrygeqyqhdvPVNAEdadaaaeaelarQFAQTQQQRYSGEQPAganpfsld 725
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14597 ----PPVSGIIN-IPSVPQpvpsLTPGVINLPSEPSYSAPIPKPGIINVPSIPEPIPSIPQNPV--QEVYHDTQKPQAIP 14669
Cdd:PRK10263   726 dfefSPMKALLDdGPHEPL----FTPIVEPVQQPQQPVAPQQQYQQPQQPVAPQPQYQQPQQPVapQPQYQQPQQPVAPQ 801
                          570       580       590       600
                   ....*....|....*....|....*....|....*....|
gi 442625924 14670 GV---VNVPSAPQPTPGRPYYDVAKPdfefnPCYPSPCGP 14706
Cdd:PRK10263   802 PQyqqPQQPVAPQPQYQQPQQPVAPQ-----PQYQQPQQP 836
PHA03247 PHA03247
large tegument protein UL36; Provisional
13904-14209 2.76e-12

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 76.13  E-value: 2.76e-12
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13904 TPKPVRPQIYDTPSPPYPVAIPDLVYVQQQQPGIVNIPSAPQPIYPTPQSPQYNVNYPSPQPANPQKP---GVV------ 13974
Cdd:PHA03247  2784 TRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPlggSVApggdvr 2863
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13975 -NIPSVPQPVYP--SPQPPVYDVNYPTTPVSQHPGVVNIPSAPRLVPPTSQRPVFITSPGNLSPTPQPGVINIPSVSQPG 14051
Cdd:PHA03247  2864 rRPPSRSPAAKPaaPARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPL 2943
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14052 YPTPQSPiyDANYPTTQSPIPQQ----PGVVNIPSVPSPSyPAPNPPVNYPTQPSPQIPVQPGVINIPSAPLPTTPPQHP 14127
Cdd:PHA03247  2944 APTTDPA--GAGEPSGAVPQPWLgalvPGRVAVPRFRVPQ-PAPSREAPASSTPPLTGHSLSRVSSWASSLALHEETDPP 3020
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14128 PVfipspespspapkpgvinipSVTHPEYPTSqvpvyDVNYSTTPSPIPQKPGVVNIpSAPQPVHPAPNPPVHEFNYPTP 14207
Cdd:PHA03247  3021 PV--------------------SLKQTLWPPD-----DTEDSDADSLFDSDSERSDL-EALDPLPPEPHDPFAHEPDPAT 3074

                   ..
gi 442625924 14208 PA 14209
Cdd:PHA03247  3075 PE 3076
TALPID3 pfam15324
Hedgehog signalling target; TALPID3 is a family of eukaryotic proteins that are targets for ...
14108-14683 3.03e-12

Hedgehog signalling target; TALPID3 is a family of eukaryotic proteins that are targets for Hedgehog signalling. Mutations in this gene, first noticed in chickens, lead to multiple developmental abnormalities.


Pssm-ID: 434634 [Multi-domain]  Cd Length: 1288  Bit Score: 75.69  E-value: 3.03e-12
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14108 QPGVINIPSAPLPTTPPQHPP-VFIPSPESPSPAPKPGVINIPSVTHPEYPTSQV---PVYDVNYST----------TPS 14173
Cdd:pfam15324   527 TPNKSVIPRKHFQKQAEEHFRkPPVRSMPASSLQKKEGPLKSTTSLQDEDYLLQVygkAVYQGHRSTlkkgpylrfnSPS 606
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14174 PI--PQKPGVV------NIPSA--------PQPV-------HPAPNPPvHEFNYPTPPA--VPQQPGVLniPSYPTPVA- 14227
Cdd:pfam15324   607 PKskPQRPKVIesvkgtKVKSArtqtdlhaTKPVktdskmqHSVTAPH-QEQQYLFSPSreMPSQSGTL--EGHLIPMAi 683
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14228 ----PTPQSPIYIPSQ---EQPKPTTrpsVINvpSVPqPAYPTPQAPVYDVNY--------PTSPS--VIPHQPGVvNIP 14290
Cdd:pfam15324   684 plgqTQSDSDSPPPAGvivSKPHPVT---VTT--SIP-PSSRKPEPGVKKPNIallemkseKKDPPqlTVQVLPSV-DID 756
                           250       260       270       280       290       300       310       320
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14291 SVPLPAPPVKQRPvFVPSPVHPTPAPQPGVVNIPSVAQPVHPTYQPPVVERPAIYDVYYPP------PPSRPGVINI--- 14361
Cdd:pfam15324   757 SVSCSSRDSSPSP-VLPSPSEASPPLIQTWIQTPELMKEDEEEVKFPGTNFDEVIDVIQDEekedeiPEFSEPPLEFnrs 835
                           330       340       350       360       370       380       390       400
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14362 PSPPRPVYPVPQQPiyvPAPvlhiPAPRPVIHNIPSVPQPTYPHRNPPIQDVTypapqpsppvpgivnipslPQPVSTPT 14441
Cdd:pfam15324   836 VKPPSTKYNGPPFP---PVV----SQPQPTTDILDKVIEQRETLENRLVDWVE-------------------QEIMARII 889
                           410       420       430       440       450       460       470       480
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14442 SGVINIPSQASPPISVP--------TPGIVNIPS-----------IP-----------------------QPTPQRPSPG 14479
Cdd:pfam15324   890 SGMFPQQAQADPDASVSesepsepsTSDIVEAAGggglqlfvdagVPvdsemirhfvnealaetiaimlgDREAQREPPV 969
                           490       500       510       520       530       540       550       560
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14480 iinVPSVPQPIPTapspgiiNIPSVPQPLPSPTPGViniPQQPtPPPLvQQPGIINIPSVQQPSTPTTQHPIQDVQYET- 14558
Cdd:pfam15324   970 ---AASVPGDLPT-------KETLLPTPVPTPQPTP---PCSP-PSPL-KEPSPVKTPDSSPCVSEHDFFPVKEIPPEKg 1034
                           570       580       590       600       610       620       630       640
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14559 QRPQPTPGVINIPSVSqptyPTQKPSyqdtsyPTVQPKPPVSGI-INIPSVPQPVPSLTPGVINLPSEPSYSAPI----- 14632
Cdd:pfam15324  1035 ADTGPAVSLVITPTVT----PIATPP------PAATPTPPLSENsIDKLKSPSPELPKPWEDSDLPLEEENPNSEqeelh 1104
                           650       660       670       680       690
                    ....*....|....*....|....*....|....*....|....*....|....*
gi 442625924  14633 PKPGIINVPSIPEP----IPSIPQNPvqevyhdtqKPQAIPGVVNVPSAPQPTPG 14683
Cdd:pfam15324  1105 PRAVVMSVARDEEPesvvLPASPPEP---------KPLAPPPLGAAPPSPPQSPS 1150
Glutenin_hmw pfam03157
High molecular weight glutenin subunit; Members of this family include high molecular weight ...
13931-14601 8.30e-12

High molecular weight glutenin subunit; Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.


Pssm-ID: 367362 [Multi-domain]  Cd Length: 786  Bit Score: 73.83  E-value: 8.30e-12
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  13931 QQQQPGIV-NIPSAPQPIYPTPQSPQYNVNYPSPqpANPQKPGVVNIPSVPQPVYpspqppvydvnYPTTPvsQHPGVVN 14009
Cdd:pfam03157    92 QQLQQGIFwGIPALLQRYYPGVTSPQQVSYYPGQ--ASPQRPGQGQQPGQGQQWY-----------YPTSP--QQPGQWQ 156
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14010 IPSA--PRLVPPTSQRPVFITSPGNLSPTPQPGVINIPSVSQPGYPTPQSPIYDANYPTTQSPIPQQPGVVNIPSVPSP- 14086
Cdd:pfam03157   157 QPGQgqQGYYPTSPQQSGQRQQPGQGQQLRQGQQGQQSGQGQPGYYPTSSQQPGQLQQTGQGQQGQQPERGQQGQQPGQg 236
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14087 SYPAPNPPVNYPTQPSpqipvQPGVINIPSAPLpttPPQHPPVFIPSPESPSPAPKPGVINIPSVTHPEYPTSQvpvydv 14166
Cdd:pfam03157   237 QQPGQGQQGQQPGQPQ-----QLGQGQQGYYPI---SPQQPRQWQQSGQGQQGYYPTSLQQPGQGQSGYYPTSQ------ 302
                           250       260       270       280       290       300       310       320
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14167 nysTTPSPIPQKPGVVNIPSAPQPVHPAPNPPVHEFNYPTPPAVPQQPGVLNIPSYPT-PVAPTPQSPIYIP-SQEQPKP 14244
Cdd:pfam03157   303 ---QQAGQLQQEQQLGQEQQDQQPGQGRQGQQPGQGQQGQQPAQGQQPGQGQPGYYPTsPQQPGQGQPGYYPtSQQQPQQ 379
                           330       340       350       360       370       380       390       400
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14245 TTRPSVINVPSVP-------QPA---YPTPQAPVYdvnYPTSPSVIPH-QPGvvNIPSVPLPAPPVKQrpvfvPSPVHPT 14313
Cdd:pfam03157   380 GQQPEQGQQGQQQgqgqqgqQPGqgqQPGQGQPGY---YPTSPQQSGQgQPG--YYPTSPQQSGQGQQ-----PGQGQQP 449
                           410       420       430       440       450       460       470       480
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14314 PAPQPGVVNIPSVAQPVHPTYQPPVVERPAI-YDVYYPPPPSRPGviNIPSPPRPVYPVPQQPIYVPAPVLHiPAPRPVI 14392
Cdd:pfam03157   450 GQEQPGQGQQPGQGQQGQQPGQPEQGQQPGQgQPGYYPTSPQQSG--QGQQLGQWQQQGQGQPGYYPTSPLQ-PGQGQPG 526
                           490       500       510       520       530       540       550       560
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14393 HNIPSVPQPTYPHRNPPIQDVTYPAPQPSPPVPGIVNIPSLPQPVSTPTSgviniPSQASPPISVPTPGIVN---IPSIP 14469
Cdd:pfam03157   527 YYPTSPQQPGQGQQLGQLQQPTQGQQGQQSGQGQQGQQPGQGQQGQQPGQ-----GQQGQQPGQGQQPGQGQpgyYPTSP 601
                           570       580       590       600       610       620       630       640
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14470 QPTPQRPSPGIINVPSVPQPIPTAPSPGIIN------IPSVP-QPLPSPTPGVIN---------IPQQPTPPPLVQQPGi 14533
Cdd:pfam03157   602 QQSGQGQQPGQWQQPGQGQPGYYPTSSLQLGqgqqgyYPTSPqQPGQGQQPGQWQqsgqgqqgyYPTSPQQSGQAQQPG- 680
                           650       660       670       680       690       700       710
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14534 inipSVQQPStpTTQHPIQDVQ--YETQRPQPTPGvinipsvSQPTYPTQKPSYQDTSYPTVQPKPPVSG 14601
Cdd:pfam03157   681 ----QGQQPG--QWLQPGQGQQgyYPTSPQQPGQG-------QQLGQGQQSGQGQQGYYPTSPGQGQQSG 737
PHA03378 PHA03378
EBNA-3B; Provisional
14228-14683 1.22e-11

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 73.56  E-value: 1.22e-11
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14228 PTPQSPIYI----PSQEQPKPTTRPSViNVPSVPQPAYPTPQAPVYDVNYPTSPSVIPHQP---------GVVNIPSVPL 14294
Cdd:PHA03378   445 PHSQAPTVVlhrpPTQPLEGPTGPLSV-QAPLEPWQPLPHPQVTPVILHQPPAQGVQAHGSmldllekddEDMEQRVMAT 523
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14295 PAPPVKQRP--------VF------------VPSPVHPT--PAPQPGVVNIPSVAQPVHP---TYQPPVVERP--AIYDV 14347
Cdd:PHA03378   524 LLPPSPPQPragrrapcVYtedldiesdepaSTEPVHDQllPAPGLGPLQIQPLTSPTTSqlaSSAPSYAQTPwpVPHPS 603
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14348 YYPPPPSRPGVINIPSPPRPvYPVPQQPIyvpapVLHIPAPRPVIHNIPSVPQPTYPhrnPPIQDVTYPAPQPSppvpgI 14427
Cdd:PHA03378   604 QTPEPPTTQSHIPETSAPRQ-WPMPLRPI-----PMRPLRMQPITFNVLVFPTPHQP---PQVEITPYKPTWTQ-----I 669
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14428 VNIPSLPQPVSTPTSGVIN-IPSQASPPISVPTPgiVNIPSIPqPTPQRPSPGIINVPSVPQPIPTAPSPgiinipsvPQ 14506
Cdd:PHA03378   670 GHIPYQPSPTGANTMLPIQwAPGTMQPPPRAPTP--MRPPAAP-PGRAQRPAAATGRARPPAAAPGRARP--------PA 738
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14507 PLPSPTPGVINIPQQPTPPPLVQQPGIiniPSVQQPSTPTtqhpiqdvqyetqrPQPTPGVinipsvsqPTYPTQKPsyQ 14586
Cdd:PHA03378   739 AAPGRARPPAAAPGRARPPAAAPGRAR---PPAAAPGAPT--------------PQPPPQA--------PPAPQQRP--R 791
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14587 DTSYPTVQPK-PPVSGIINIPSVP-QPVPSLTPGVINLPSEPSYSAP---IPKPGIINVPSIPEPIPS------IPQNPV 14655
Cdd:PHA03378   792 GAPTPQPPPQaGPTSMQLMPRAAPgQQGPTKQILRQLLTGGVKRGRPslkKPAALERQAAAGPTPSPGsgtsdkIVQAPV 871
                          490       500       510
                   ....*....|....*....|....*....|....*..
gi 442625924 14656 qeVYHDTQKPQAIPGVV---------NVPSAPQPTPG 14683
Cdd:PHA03378   872 --FYPPVLQPIQVMRQLgsvraaaasTVTQAPTEYTG 906
PHA03377 PHA03377
EBNA-3C; Provisional
13947-14404 2.05e-11

EBNA-3C; Provisional


Pssm-ID: 177614 [Multi-domain]  Cd Length: 1000  Bit Score: 72.78  E-value: 2.05e-11
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13947 IYPTPQSPQYNVNYPSPQpANPQKPGVV----------NIPSVPQPVYPSPQPPVydvnyPTTPVsqHPGVVNIPSAPRL 14016
Cdd:PHA03377   408 VSRVPWRKPRTLPWPTPK-THPVKRTLVktsgrsdeaeQAQSTPERPGPSDQPSV-----PVEPA--HLTPVEHTTVILH 479
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14017 VPPTSQRPVFItspgnlSPTPQPG----------------VINI------PSVSQPGYP--TPQSPI-YDANYPTTQSPI 14071
Cdd:PHA03377   480 QPPQSPPTVAI------KPAPPPSrrrrgacvvydddiieVIDVetteeeESVTQPAKPhrKVQDGFqRSGRRQKRATPP 553
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14072 PQQPGVVNIPSV--PSPSYPAPNPPVNYPTQPSPQIPVQPGVINIPSAPLPTTPPQHPPVFIPSPESPSPapkpgvINIP 14149
Cdd:PHA03377   554 KVSPSDRGPPKAspPVMAPPSTGPRVMATPSTGPRDMAPPSTGPRQQAKCKDGPPASGPHEKQPPSSAPR------DMAP 627
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14150 SVTHP-------EYPTSQVP--VYDVNYSTTPSPIPQKPGVVNIPsAPQPVHPAPNPPVHEFNYPTPPAVPQQPGVLNIP 14220
Cdd:PHA03377   628 SVVRMflrerllEQSTGPKPksFWEMRAGRDGSGIQQEPSSRRQP-ATQSTPPRPSWLPSVFVLPSVDAGRAQPSEESHL 706
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14221 SYPTPVAPT----------PQSPIYI---PSQEQPKPTTRP----SVINVPSVPQPAY---PTPQAPVYDVNYPTSpsvi 14280
Cdd:PHA03377   707 SSMSPTQPIsheeqpryedPDDPLDLslhPDQAPPPSHQAPysghEEPQAQQAPYPGYwepRPPQAPYLGYQEPQA---- 782
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14281 pHQPGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVNIPSVAQPVHPTYQPPVVERPAIYDV-YYPPPPSRPGvi 14359
Cdd:PHA03377   783 -QGVQVSSYPGYAGPWGLRAQHPRYRHSWAYWSQYPGHGHPQGPWAPRPPHLPPQWDGSAGHGQDQVsQFPHLQSETG-- 859
                          490       500       510       520       530
                   ....*....|....*....|....*....|....*....|....*....|..
gi 442625924 14360 nipsPPR------PVYPVPQQPIYVPAPVLHIPAPRPVIHNIPS-VPQPTYP 14404
Cdd:PHA03377   860 ----PPRlqlsqvPQLPYSQTLVSSSAPSWSSPQPRAPIRPIPTrFPPPPMP 907
PBP1 COG5180
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ...
14020-14528 1.18e-10

PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];


Pssm-ID: 444064 [Multi-domain]  Cd Length: 548  Bit Score: 69.71  E-value: 1.18e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14020 TSQRPVFITSPGNLS-PTPQPGVINIPSVSQPGYPTPQSPIYDANYPTTQSPipqqPGVVNIPSvPSPSYPA-------- 14090
Cdd:COG5180      2 RKATILEIRLLATVPiPPNAARPVLSPELWAAANNDAVSQGDRSALASSPTR----PYARKIFE-PLDIKLAlgkpqlps 76
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14091 -PNPPVNYPTQP---SPQIPVQP--GVINIPSAPLPTTPPQHPPVFIPSPESPSpapkpgVINIPSVTHPEYPTSQVPVY 14164
Cdd:COG5180     77 vAEPEAYLDPAPpksSPDTPEEQlgAPAGDLLVLPAAKTPELAAGALPAPAAAA------ALPKAKVTREATSASAGVAL 150
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14165 DVNYSTTPSPIPQKPGVVNIPSAPQPVHPAPN-----PPVHEFNYPTP---PAVPQQPGVLNIPSYPTPVAPTPQsPIYI 14236
Cdd:COG5180    151 AAALLQRSDPILAKDPDGDSASTLPPPAEKLDkvltePRDALKDSPEKldrPKVEVKDEAQEEPPDLTGGADHPR-PEAA 229
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14237 PSQEQPKPTTRPSVINVPSVPQPAYPTPQAPVYDvnyptspsviPHQPGVVNIPSVPLPAPPV---KQRPVFV-PSPVHP 14312
Cdd:COG5180    230 SSPKVDPPSTSEARSRPATVDAQPEMRPPADAKE----------RRRAAIGDTPAAEPPGLPVleaGSEPQSDaPEAETA 299
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14313 TPAPQPGVVNIPSVAQPVHPT---------YQPPVVERPAIYDVYYPPPPSRPGVINIPSPPRPVYPVPQQpiyVPAPVl 14383
Cdd:COG5180    300 RPIDVKGVASAPPATRPVRPPggardpgtpRPGQPTERPAGVPEAASDAGQPPSAYPPAEEAVPGKPLEQG---APRPG- 375
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14384 HIPAPRPVIHNIPSVPQPTYPHRNPPIQDVTYPAPQPSPPVPGIVNIPSLPQPVSTPTS------GVINIPSQASPPISV 14457
Cdd:COG5180    376 SSGGDGAPFQPPNGAPQPGLGRRGAPGPPMGAGDLVQAALDGGGRETASLGGAAGGAGQgpkadfVPGDAESVSGPAGLA 455
                          490       500       510       520       530       540       550
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 442625924 14458 PTPGIVNIPSIPQPTPQRPSPGIINVPSVPQPIPTAPSPGiinIPSVPQPLPSPTPgVINIPQQPTPPPLV 14528
Cdd:COG5180    456 DQAGAAASTAMADFVAPVTDATPVDVADVLGVRPDAILGG---NVAPASGLDAETR-IIEAEGAPATEDFV 522
PHA03379 PHA03379
EBNA-3A; Provisional
13980-14413 3.34e-10

EBNA-3A; Provisional


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 68.93  E-value: 3.34e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13980 PQPVYPSPQPPVyDVNYPTTPVS--QHP--GVVNIPSAPrlvpptsqrPVFITSPGNLSPtpQPGVINIPSVSQPgyPTP 14055
Cdd:PHA03379   409 SEPTYGTPRPPV-EKPRPEVPQSleTATshGSAQVPEPP---------PVHDLEPGPLHD--QHSMAPCPVAQLP--PGP 474
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14056 QSPIydanypttqSPIPQQPGVVNIPSVPSPSYPAPNPPVNYPTQPSP-QIPVQpgvinipsAPLPTTPPQHPPVFIPSP 14134
Cdd:PHA03379   475 LQDL---------EPGDQLPGVVQDGRPACAPVPAPAGPIVRPWEASLsQVPGV--------AFAPVMPQPMPVEPVPVP 537
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14135 ESPSPAPKPGVINIPSVTHPEYPTSQVPVYD--VNYSTTPSPiPQKPGVVNIPSAPQPVHPAPNPPVHEFNYpTPPAVPQ 14212
Cdd:PHA03379   538 TVALERPVCPAPPLIAMQGPGETSGIVRVRErwRPAPWTPNP-PRSPSQMSVRDRLARLRAEAQPYQASVEV-QPPQLTQ 615
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14213 QPgvlnipsyptpvaptPQSPIYIPSQ-EQPKPTTRPSVINVPSVPQPAYPTPQAPVYDvnYPTSpsviphQPGVVNIPS 14291
Cdd:PHA03379   616 VS---------------PQQPMEYPLEpEQQMFPGSPFSQVADVMRAGGVPAMQPQYFD--LPLQ------QPISQGAPL 672
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14292 VPLPAPPVkqrpvfvpsPVHPTPAPQPGVVNIP---------SVAQ--PVHPTyQPPVVERPAIYDVYYPPPPSRPGVIN 14360
Cdd:PHA03379   673 APLRASMG---------PVPPVPATQPQYFDIPltepinqgaSAAHflPQQPM-EGPLVPERWMFQGATLSQSVRPGVAQ 742
                          410       420       430       440       450
                   ....*....|....*....|....*....|....*....|....*....|...
gi 442625924 14361 IPSPPRPVypvpQQPIYVPAPVLHIPaPRPVIHNiPSVPQPTYPHRNPPIQDV 14413
Cdd:PHA03379   743 SQYFDLPL----TQPINHGAPAAHFL-HQPPMEG-PWVPEQWMFQGAPPSQGT 789
PHA03377 PHA03377
EBNA-3C; Provisional
13892-14270 2.54e-09

EBNA-3C; Provisional


Pssm-ID: 177614 [Multi-domain]  Cd Length: 1000  Bit Score: 65.84  E-value: 2.54e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13892 SHTGDPFTRC-YETPKPVRPQIYDTPSPPYPVAIPDLVYVQQQQPGIVNIPSAPQPIYpTPQSPQYNVNYPS-------- 13962
Cdd:PHA03377   558 SDRGPPKASPpVMAPPSTGPRVMATPSTGPRDMAPPSTGPRQQAKCKDGPPASGPHEK-QPPSSAPRDMAPSvvrmflre 636
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13963 ---PQPANPqKPGVV-------NIPSVPQPVYPSPQPPVydvnYPTTPV-SQHPGVVNIPSaprlVPPTSQRPVFITSPG 14031
Cdd:PHA03377   637 rllEQSTGP-KPKSFwemragrDGSGIQQEPSSRRQPAT----QSTPPRpSWLPSVFVLPS----VDAGRAQPSEESHLS 707
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14032 NLSPTpQPgvinIPSVSQPGYPTPQSPIYDANYPTTQSPIPQQ---PGVVNIPS--VPSPSYPAPNPPvnyptqPSP--- 14103
Cdd:PHA03377   708 SMSPT-QP----ISHEEQPRYEDPDDPLDLSLHPDQAPPPSHQapySGHEEPQAqqAPYPGYWEPRPP------QAPylg 776
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14104 -QIPVQPG--VINIPSAPLPTTP-PQHppvfipspespspapkpgviniPSVTHPEYPTSQVPVYdvNYSTTP-SPIPQK 14178
Cdd:PHA03377   777 yQEPQAQGvqVSSYPGYAGPWGLrAQH----------------------PRYRHSWAYWSQYPGH--GHPQGPwAPRPPH 832
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14179 PGVVNIPSA-PQPVHPAPNPPVHefNYPTPPAvPQQPGVLNIPSYPTPV---APTPQSPiyipsqeQPKPTTRPsvinVP 14254
Cdd:PHA03377   833 LPPQWDGSAgHGQDQVSQFPHLQ--SETGPPR-LQLSQVPQLPYSQTLVsssAPSWSSP-------QPRAPIRP----IP 898
                          410
                   ....*....|....*.
gi 442625924 14255 SvpqpAYPTPQAPVYD 14270
Cdd:PHA03377   899 T----RFPPPPMPLQD 910
SP2_N cd22540
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins ...
14066-14586 4.04e-09

N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate or, in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide, R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.


Pssm-ID: 411776 [Multi-domain]  Cd Length: 511  Bit Score: 64.56  E-value: 4.04e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14066 TTQSPIPQQPGVVNIP-SVPSPsyPAPNPPVnypTQPSPQIPVQPGVINIPSAPLPTTPPQHPPVFIPSPESpspapkpg 14144
Cdd:cd22540     18 TTQDSQPSPLALLAATcSKIGP--PAVEAAV---TPPAPPQPTPRKLVPIKPAPLPLGPGKNSIGFLSAKGN-------- 84
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14145 VINI-PSVTHPEYPTSQVPVYDVN-------YSTTPSPIPQKPGVVNIPSAPQP-------VHPAPNPpvhefNYPTPPA 14209
Cdd:cd22540     85 IIQLqGSQLSSSAPGGQQVFAIQNptmiikgSQTRSSTNQQYQISPQIQAAGQInnsgqiqIIPGTNQ-----AIITPVQ 159
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14210 VPQQPgvlNIPSYPTPVAPTPQSPIYIPSQEQPKPTTrpsVINVPSVPQPAYPTPQAPVYDVNYPTSPSVIPHQPGVVN- 14288
Cdd:cd22540    160 VLQQP---QQAHKPVPIKPAPLQTSNTNSASLQVPGN---VIKLQSGGNVALTLPVNNLVGTQDGATQLQLAAAPSKPSk 233
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14289 -----IPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGvVNIPSVAQPvhPTYQPPVVERpaiydVYYPPPPSRPGVINIps 14363
Cdd:cd22540    234 kirkkSAQAAQPAVTVAEQVETVLIETTADNIIQAG-NNLLIVQSP--GTGQPAVLQQ-----VQVLQPKQEQQVVQI-- 303
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14364 pprpvypvPQQPIYVpapvlhipaPRPVIHNIPSVPQPtyPHRNPPIQdvtypapqpsppvpgivNIPSLPQPV--STPT 14441
Cdd:cd22540    304 --------PQQALRV---------VQAASATLPTVPQK--PLQNIQIQ-----------------NSEPTPTQVyiKTPS 347
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14442 SGVINIPSQASPPISVPTPgivniPSIPQPTPQRPSPGIINVPSVPQPIPTAPspgiinipsvPQPLPSPTPGVI--NIP 14519
Cdd:cd22540    348 GEVQTVLLQEAPAATATPS-----SSTSTVQQQVTANNGTGTSKPNYNVRKER----------TLPKIAPAGGIIslNAA 412
                          490       500       510       520       530       540
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 442625924 14520 QQPTPPPLVQQpgiINIPSVQQPSTPTTQhpiqdvqyeTQRP-QPTPGVINIPSVSQPTYPTQKPSYQ 14586
Cdd:cd22540    413 QLAAAAQAIQT---ININGVQVQGVPVTI---------TNAGgQQQLTVQTVSSNNLTISGLSPTQIQ 468
PspC_subgroup_2 NF033839
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ...
13903-14122 5.00e-09

pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.


Pssm-ID: 468202 [Multi-domain]  Cd Length: 557  Bit Score: 64.40  E-value: 5.00e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13903 ETPKP-VRPQiydtPSPPYPVAIPDLvyvQQQQPGIVNIPSAPQP-IYPTPQSPQYNVnypSPQPANPqKPGVVnipsvP 13980
Cdd:NF033839   326 EKPKPeVKPQ----PEKPKPEVKPQL---ETPKPEVKPQPEKPKPeVKPQPEKPKPEV---KPQPETP-KPEVK-----P 389
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13981 QPVYP----SPQPPVYDVNYPTTPVSQHPGVVNIPSAPRL-VPPTSQRPvfitspgNLSPTPQPGVINiPSV-SQPGYPT 14054
Cdd:NF033839   390 QPEKPkpevKPQPEKPKPEVKPQPEKPKPEVKPQPEKPKPeVKPQPEKP-------KPEVKPQPEKPK-PEVkPQPETPK 461
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 442625924 14055 PQ-SPIYDANYPTTQsPIPQQPGVVNipSVPSPSYPAPNPPVNYP--TQPSPQIPVQPGVINIPSAPLPTT 14122
Cdd:NF033839   462 PEvKPQPEKPKPEVK-PQPEKPKPDN--SKPQADDKKPSTPNNLSkdKQPSNQASTNEKATNKPKKSLPST 529
PHA03378 PHA03378
EBNA-3B; Provisional
14277-14692 5.88e-09

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 64.70  E-value: 5.88e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14277 PSVIPHQPGVVNIPSVPLPAPPVKQRPVFvpspvhptpapqPGVVNIPSVAQPV---HPTYQPPVVERPAIydvyyPPPP 14353
Cdd:PHA03378   385 PQTLPDPPTVYGRPKVFARKADLKSTKKC------------RAIVTDPSVIKAIeeeHRKKKAARTEQPRA-----TPHS 447
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14354 SRPGVInIPSPPRPVYPVPQQPIYVPAPVlhipaprpvihnipsVPQPTYPHrnPPIQDVtypapqpsppvpgIVNIPSL 14433
Cdd:PHA03378   448 QAPTVV-LHRPPTQPLEGPTGPLSVQAPL---------------EPWQPLPH--PQVTPV-------------ILHQPPA 496
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14434 pQPVSTPTSgVINIPSQASPPISVPTPGIVNIPSIPQPTPQRPSPGII--------NVPSVPQPIPT----APSPGIINI 14501
Cdd:PHA03378   497 -QGVQAHGS-MLDLLEKDDEDMEQRVMATLLPPSPPQPRAGRRAPCVYtedldiesDEPASTEPVHDqllpAPGLGPLQI 574
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14502 psvpQPLPSPTPGVInipqQPTPPPLVQQPGIINIPSvQQPSTPTTQHPIQDVQYETQRPQPTPGVINIPSVSQPT---- 14577
Cdd:PHA03378   575 ----QPLTSPTTSQL----ASSAPSYAQTPWPVPHPS-QTPEPPTTQSHIPETSAPRQWPMPLRPIPMRPLRMQPItfnv 645
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14578 ----YPTQKPSYQDTSYPTVQPKPPvsgiiNIPSVPQPV-------PSLTPGVINLPsePSYSAPIPKPGIINVPSIPEP 14646
Cdd:PHA03378   646 lvfpTPHQPPQVEITPYKPTWTQIG-----HIPYQPSPTgantmlpIQWAPGTMQPP--PRAPTPMRPPAAPPGRAQRPA 718
                          410       420       430       440       450
                   ....*....|....*....|....*....|....*....|....*....|...
gi 442625924 14647 IPSIPQNPVQEVYHDTQKPQAIPGVVNVPSA-------PQPTPGRPYYDVAKP 14692
Cdd:PHA03378   719 AATGRARPPAAAPGRARPPAAAPGRARPPAAapgrarpPAAAPGRARPPAAAP 771
DUF5585 pfam17823
Family of unknown function (DUF5585); This is a family of unknown function found in Chordata.
13986-14267 7.56e-09

Family of unknown function (DUF5585); This is a family of unknown function found in Chordata.


Pssm-ID: 465521 [Multi-domain]  Cd Length: 506  Bit Score: 63.83  E-value: 7.56e-09
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  13986 SPQPPVYDVNYPTTPVSQHPGVVNiPSAP--RLVPPTSQRPVFITSPGNL----SPTPQPGVINIPSVSQPGYPTPQSPI 14059
Cdd:pfam17823   134 IAALPSEAFSAPRAAACRANASAA-PRAAiaAASAPHAASPAPRTAASSTtaasSTTAASSAPTTAASSAPATLTPARGI 212
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14060 YDA----NYPTTQSPIPQQPGVVNIPSVPSPSYPAPNPP-VNYPTQPSPQIPVQPGVINIpSAPLPTT--PPQHPPVFIP 14132
Cdd:pfam17823   213 STAatatGHPAAGTALAAVGNSSPAAGTVTAAVGTVTPAaLATLAAAAGTVASAAGTINM-GDPHARRlsPAKHMPSDTM 291
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14133 SPESPSpapkpgviniPSVTHPEYPTSQVPVYDVNYSTTPSPIPQKPGVVNIPSAPQPVHPAPNPPVhefnyPTPPAVPQ 14212
Cdd:pfam17823   292 ARNPAA----------PMGAQAQGPIIQVSTDQPVHNTAGEPTPSPSNTTLEPNTPKSVASTNLAVV-----TTTKAQAK 356
                           250       260       270       280       290
                    ....*....|....*....|....*....|....*....|....*....|....*.
gi 442625924  14213 QPGvlnipSYPTPVAPTPQspiyIPSQEQPKPTTRPSVInvPSVPQPAYP-TPQAP 14267
Cdd:pfam17823   357 EPS-----ASPVPVLHTSM----IPEVEATSPTTQPSPL--LPTQGAAGPgILLAP 401
PHA03377 PHA03377
EBNA-3C; Provisional
14292-14708 7.86e-09

EBNA-3C; Provisional


Pssm-ID: 177614 [Multi-domain]  Cd Length: 1000  Bit Score: 64.30  E-value: 7.86e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14292 VPLPAP---PVKQRPVFVPSPV---HPTPAPQPGVVNIPSVAQPVHPTYQPPVVERPAiydvyypPPPSRPGviniPS-- 14363
Cdd:PHA03377   390 LPYIDPnmePVQQRPVMFVSRVpwrKPRTLPWPTPKTHPVKRTLVKTSGRSDEAEQAQ-------STPERPG----PSdq 458
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14364 PPRPVYPVPQQPIYVPAPVLHIPAPRPVIHNIPSVPQPTYPHRNPPI---QDVTYPAPQPSPPVPGIVNIPSLPQ--PVS 14438
Cdd:PHA03377   459 PSVPVEPAHLTPVEHTTVILHQPPQSPPTVAIKPAPPPSRRRRGACVvydDDIIEVIDVETTEEEESVTQPAKPHrkVQD 538
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14439 TPTSGVINIPSQASPPISvptPGIVNIPSIPQPTPQRPS--PGIINVPSVPQPIPTAPSPGIINIPSVPQPLPSPTPgvi 14516
Cdd:PHA03377   539 GFQRSGRRQKRATPPKVS---PSDRGPPKASPPVMAPPStgPRVMATPSTGPRDMAPPSTGPRQQAKCKDGPPASGP--- 612
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14517 NIPQQPTPPPLVQQPGII---------------------------NIPSVQQPSTPTTQHPIQDVqyeTQRPQPTPGVIN 14569
Cdd:PHA03377   613 HEKQPPSSAPRDMAPSVVrmflrerlleqstgpkpksfwemragrDGSGIQQEPSSRRQPATQST---PPRPSWLPSVFV 689
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14570 IPSV------------------SQPTYPTQKPSYQDTSYPT-VQPKPPVSgiinipsvPQPVP-SLTPGVINLPSEPSys 14629
Cdd:PHA03377   690 LPSVdagraqpseeshlssmspTQPISHEEQPRYEDPDDPLdLSLHPDQA--------PPPSHqAPYSGHEEPQAQQA-- 759
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14630 apiPKPGiinvpsIPEPIPsiPQNPvqevYHDTQKPQAIPG-VVNVPSAPQPTPGRPYYdvakpdfefnPCYPSPCGPYS 14708
Cdd:PHA03377   760 ---PYPG------YWEPRP--PQAP----YLGYQEPQAQGVqVSSYPGYAGPWGLRAQH----------PRYRHSWAYWS 814
PTZ00449 PTZ00449
104 kDa microneme/rhoptry antigen; Provisional
14197-14584 1.49e-08

104 kDa microneme/rhoptry antigen; Provisional


Pssm-ID: 185628 [Multi-domain]  Cd Length: 943  Bit Score: 63.56  E-value: 1.49e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14197 PPVHEFNYPTPPAVPQQPGVLNIPsyptPVAP---TPQSPIYIPSQE--QPKPTTRPSVINVPSVPQPAYPTPQapvydv 14271
Cdd:PTZ00449   497 APIEEEDSDKHDEPPEGPEASGLP----PKAPgdkEGEEGEHEDSKEsdEPKEGGKPGETKEGEVGKKPGPAKE------ 566
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14272 nyptspsvipHQPGVVnipsvplpaPPVKQRPVFVPSPVHPTPAPQPGVVNIPSVAQpvHPTyQPPVVERPAIYDVyyPP 14351
Cdd:PTZ00449   567 ----------HKPSKI---------PTLSKKPEFPKDPKHPKDPEEPKKPKRPRSAQ--RPT-RPKSPKLPELLDI--PK 622
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14352 PPSRPGVINIP-SPPRPVYPV-PQQPIYVPAPvlhiPAPRPvihniPSVPQPTYphrNPPIQDVTYPAPQPSPPvpgivn 14429
Cdd:PTZ00449   623 SPKRPESPKSPkRPPPPQRPSsPERPEGPKII----KSPKP-----PKSPKPPF---DPKFKEKFYDDYLDAAA------ 684
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14430 ipslpQPVSTPTSGVINIPSQASPPISVP-TPGIVNIPSIPQPtPQRPSpgiinVPSVP-QPI--PTAPSPGIInipsvp 14505
Cdd:PTZ00449   685 -----KSKETKTTVVLDESFESILKETLPeTPGTPFTTPRPLP-PKLPR-----DEEFPfEPIgdPDAEQPDDI------ 747
                          330       340       350       360       370       380       390
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 442625924 14506 QPLPSPTPGVINIPQQPTPPPLvqqPGIInipsvqqpstpTTQHPIQDVQYETQRPQPTPGVINIPSVSQPTYPTQKPS 14584
Cdd:PTZ00449   748 EFFTPPEEERTFFHETPADTPL---PDIL-----------AEEFKEEDIHAETGEPDEAMKRPDSPSEHEDKPPGDHPS 812
TALPID3 pfam15324
Hedgehog signalling target; TALPID3 is a family of eukaryotic proteins that are targets for ...
13952-14495 3.43e-08

Hedgehog signalling target; TALPID3 is a family of eukaryotic proteins that are targets for Hedgehog signalling. Mutations in this gene, first noticed in chickens, lead to multiple developmental abnormalities.


Pssm-ID: 434634 [Multi-domain]  Cd Length: 1288  Bit Score: 62.21  E-value: 3.43e-08
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  13952 QSPQYNVNYPSPQpANPQKPGVvnIPSVPQPVYPSPQppvydvnyptTPVSQHPG--VVNIPSAPRLVPPTSQRPVFITS 14029
Cdd:pfam15324   596 KGPYLRFNSPSPK-SKPQRPKV--IESVKGTKVKSAR----------TQTDLHATkpVKTDSKMQHSVTAPHQEQQYLFS 662
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14030 PGNLSPT---PQPGVInIPSVSQPGYPTPQSpiyDANYPTTQSPIPQQPGVVnIPSVPsPSYPAPNPPVNYPT------- 14099
Cdd:pfam15324   663 PSREMPSqsgTLEGHL-IPMAIPLGQTQSDS---DSPPPAGVIVSKPHPVTV-TTSIP-PSSRKPEPGVKKPNiallemk 736
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14100 -----QPSPQIPVQPGViNIPS--------APLPTTPP----QHPPVFIPspespspapkpgvINIPSVTHP-----EYP 14157
Cdd:pfam15324   737 sekkdPPQLTVQVLPSV-DIDSvscssrdsSPSPVLPSpseaSPPLIQTW-------------IQTPELMKEdeeevKFP 802
                           250       260       270       280       290       300       310       320
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14158 -TSQVPVYDVNysttpspipQKPGVVN-IPSAPQPVHpapnppvhEFN-YPTPPAVPqqpgvLNIPSYPtPVAPTPQspi 14234
Cdd:pfam15324   803 gTNFDEVIDVI---------QDEEKEDeIPEFSEPPL--------EFNrSVKPPSTK-----YNGPPFP-PVVSQPQ--- 856
                           330       340       350       360       370       380       390       400
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14235 yiPSQE------QPKPTTRPSVIN------VPSVPQPAYPTPQAPVYDVNYPTS-------------------------- 14276
Cdd:pfam15324   857 --PTTDildkviEQRETLENRLVDwveqeiMARIISGMFPQQAQADPDASVSESepsepstsdiveaagggglqlfvdag 934
                           410       420       430       440       450       460       470       480
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14277 --------------------------------PSVIPHQPGVVNIPSVPLPAPPVKQRPVFVPSPvhPTPAPQPGVVNIP 14324
Cdd:pfam15324   935 vpvdsemirhfvnealaetiaimlgdreaqrePPVAASVPGDLPTKETLLPTPVPTPQPTPPCSP--PSPLKEPSPVKTP 1012
                           490       500       510       520       530       540       550       560
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14325 SVAQPVHPTYQPPVVERPAIYDVYYPP---PPSRPGVINIPSPPRPVYPVPqqpiyvpapvlhiPAPRPVIHNIPSvPQP 14401
Cdd:pfam15324  1013 DSSPCVSEHDFFPVKEIPPEKGADTGPavsLVITPTVTPIATPPPAATPTP-------------PLSENSIDKLKS-PSP 1078
                           570       580       590       600       610       620       630       640
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14402 TYPH----RNPPIQDVTypapqpsppvpgivniPSLPQPVSTPTSGVINIPsQASPPISVPTPGivnipSIPQPTPQRPS 14477
Cdd:pfam15324  1079 ELPKpwedSDLPLEEEN----------------PNSEQEELHPRAVVMSVA-RDEEPESVVLPA-----SPPEPKPLAPP 1136
                           650
                    ....*....|....*...
gi 442625924  14478 PGIINVPSVPQPIPTAPS 14495
Cdd:pfam15324  1137 PLGAAPPSPPQSPSSSSS 1154
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
14178-14380 3.85e-08

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 61.82  E-value: 3.85e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14178 KPGVVNIPSAPQPVHPAPNPPVHEFNYPTPPAVPQQPGVLNIPSYPTPVAPTPQSPIYIPSQEQPKPTTRPSVINVPSVP 14257
Cdd:PRK12323   364 RPGQSGGGAGPATAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARG 443
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14258 QPAYPTPQAPVYDVNYPTSPSVIPHQPGvvniPSVPLPAPPVKQRPVFVPSPVHPTPAP---QPGVVNIPSVAQPvHPTY 14334
Cdd:PRK12323   444 PGGAPAPAPAPAAAPAAAARPAAAGPRP----VAAAAAAAPARAAPAAAPAPADDDPPPweeLPPEFASPAPAQP-DAAP 518
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*....
gi 442625924 14335 QPPVVE---RPAIYDVYYPPPPSRPGVINIPSPPRPVYPVPQQPIYVPA 14380
Cdd:PRK12323   519 AGWVAEsipDPATADPDDAFETLAPAPAAAPAPRAAAATEPVVAPRPPR 567
PHA03247 PHA03247
large tegument protein UL36; Provisional
14235-14566 4.20e-08

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 62.26  E-value: 4.20e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14235 YIPSQEQPKPTTR---PSVINVPSVPQPAYPTPQAPVYDVNYPTSPSVIPHQP-----GVVNIP-----SVPLPAPPVKQ 14301
Cdd:PHA03247   184 YLTYYTQDHPEARwagAMVFFVPSGPGPAAPADLTAAALHLYGASETYLQDEPfverrVVISHPlrgdiAAPAPPPVVGE 263
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14302 RPVFVPSPVHPTPAPQPGvvniPSVAQPVHPTYQPPVVERPAIYDV--YYPPPPSRPgvinipsPPRPVYPVPQQPIYVP 14379
Cdd:PHA03247   264 GADRAPETARGATGPPPP----PEAAAPNGAAAPPDGVWGAALAGAplALPAPPDPP-------PPAPAGDAEEEDDEDG 332
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14380 APVLHIPAPRPVIHnipsvpqptYPhrnppiqdvtypapqpsppvpgiVNIPSLPQPVSTPTSGVINIPSQASPPISVPT 14459
Cdd:PHA03247   333 AMEVVSPLPRPRQH---------YP-----------------------LGFPKRRRPTWTPPSSLEDLSAGRHHPKRASL 380
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14460 PGIVNIPSIPQPTPQRPSPGIINVPSVPQPIP-TAPSPGIINIP-SVP----QPLPSPTPGViniPQQPTPPPLVQQPGI 14533
Cdd:PHA03247   381 PTRKRRSARHAATPFARGPGGDDQTRPAAPVPaSVPTPAPTPVPaSAPpppaTPLPSAEPGS---DDGPAPPPERQPPAP 457
                          330       340       350
                   ....*....|....*....|....*....|...
gi 442625924 14534 INIPSVQQPSTPTTQhpIQDVQYETQRPQPtPG 14566
Cdd:PHA03247   458 ATEPAPDDPDDATRK--ALDALRERRPPEP-PG 487
DUF5585 pfam17823
Family of unknown function (DUF5585); This is a family of unknown function found in Chordata.
14313-14707 5.31e-08

Family of unknown function (DUF5585); This is a family of unknown function found in Chordata.


Pssm-ID: 465521 [Multi-domain]  Cd Length: 506  Bit Score: 61.13  E-value: 5.31e-08
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14313 TPAPQPGVVNIPSVAQPVHPTYQPPVVERPAIYDVYYPPPPSRPGVINIPSPPRPVYPVPQQPIYVP-APVLHIPAPRPV 14391
Cdd:pfam17823    99 EPATREGAADGAASRALAAAASSSPSSAAQSLPAAIAALPSEAFSAPRAAACRANASAAPRAAIAAAsAPHAASPAPRTA 178
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14392 IHNIPSVPQPTYPHRNPpiqdvtypapqpsppVPGIVNIPSLPQPVS-TPTSGVINI-PS----QASPPISVPTPGIVNi 14465
Cdd:pfam17823   179 ASSTTAASSTTAASSAP---------------TTAASSAPATLTPARgISTAATATGhPAagtaLAAVGNSSPAAGTVT- 242
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14466 PSIPQPTPQrpspGIINVPSVPQPIPTAPSPGIINIPSVPQPLPSPTPGVINIPQQPTPPPLVQQPGIINIPSVQQPSTP 14545
Cdd:pfam17823   243 AAVGTVTPA----ALATLAAAAGTVASAAGTINMGDPHARRLSPAKHMPSDTMARNPAAPMGAQAQGPIIQVSTDQPVHN 318
                           250       260       270       280       290       300       310       320
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14546 TTqhpiqdvqyetqrPQPTPGVINipSVSQPTYPTQKPSYQDTSYPT--VQPKPPVSGiinipSVPQPVPSLTPGVinLP 14623
Cdd:pfam17823   319 TA-------------GEPTPSPSN--TTLEPNTPKSVASTNLAVVTTtkAQAKEPSAS-----PVPVLHTSMIPEV--EA 376
                           330       340       350       360       370       380       390       400
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14624 SEPSySAPIPKPgiinvPSIPEPIPSIPQNPVQevyhdtQKPQAIPGVvnvpSAPQPTPgRPYYDVAKPdfEFNPCYPSP 14703
Cdd:pfam17823   377 TSPT-TQPSPLL-----PTQGAAGPGILLAPEQ------VATEATAGT----ASAGPTP-RSSGDPKTL--AMASCQLST 437

                    ....
gi 442625924  14704 CGPY 14707
Cdd:pfam17823   438 QGQY 441
PHA03378 PHA03378
EBNA-3B; Provisional
13904-14333 6.77e-08

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 61.24  E-value: 6.77e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13904 TPKPVRPQIYDTPSPPyPVAIPDLVYVQQQQ-----PGIVNIPS-APQP-IYPTPQSPQYNV-------NYPSPQPANPQ 13969
Cdd:PHA03378   555 STEPVHDQLLPAPGLG-PLQIQPLTSPTTSQlassaPSYAQTPWpVPHPsQTPEPPTTQSHIpetsaprQWPMPLRPIPM 633
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13970 KPGVVNIPSVPQPVYPSP-QPPVYDVNYPTTPVSQHPgvvNIPSAPRLVPPTSQRPVfITSPGNLSPTPQ-PGVINIPSV 14047
Cdd:PHA03378   634 RPLRMQPITFNVLVFPTPhQPPQVEITPYKPTWTQIG---HIPYQPSPTGANTMLPI-QWAPGTMQPPPRaPTPMRPPAA 709
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14048 SqpgyPTPQSPiyDANYPTTQSPIPQQPGVVNIPSVPSPSYPAPNPPVNYPTQPSPQIPVQPGVINIPSAPLPTTPPQHP 14127
Cdd:PHA03378   710 P----PGRAQR--PAAATGRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPPAAAPGAPTPQPPPQAP 783
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14128 PvfipspespspapkpgvinipsvthpeyptsqVPVYDVNYSTTPSPIPQKPGVVNIPSAPQPVhPAPNPPVHEFNYPTP 14207
Cdd:PHA03378   784 P--------------------------------APQQRPRGAPTPQPPPQAGPTSMQLMPRAAP-GQQGPTKQILRQLLT 830
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14208 PAVPQQPGVLNIPS---YPTPVAPTP-----------QSPIYIPSQEQPKPTTR----PSVINVPSVPQPayPTPQAPVY 14269
Cdd:PHA03378   831 GGVKRGRPSLKKPAaleRQAAAGPTPspgsgtsdkivQAPVFYPPVLQPIQVMRqlgsVRAAAASTVTQA--PTEYTGER 908
                          410       420       430       440       450       460
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 442625924 14270 DVNYPTSPSVIPhqpgvvnipsvplPAPPVKQRPVFVPSPVHPTPAPQPGVVnIPSVAQPVHPT 14333
Cdd:PHA03378   909 RGVGPMHPTDIP-------------PSKRAKTDAYVESQPPHGGQSHSFSVI-WENVSQGQQQT 958
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
14328-14625 7.94e-08

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 60.82  E-value: 7.94e-08
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14328 QPVhPTYQPPVVERPAIYDVYYPPPPSRPgviniPSPPRPV---YPVPQQPIYVP-----APVLHIPAPRPVIHniPSVP 14399
Cdd:pfam09770   106 QPA-ARAAQSSAQPPASSLPQYQYASQQS-----QQPSKPVrtgYEKYKEPEPIPdlqvdASLWGVAPKKAAAP--APAP 177
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14400 QPtyphrnppiqdvtypapqpsppvpgivniPSLPQPVSTPTSGVINIP-------SQASPPISVPTPGIVNIPSIPQPT 14472
Cdd:pfam09770   178 QP-----------------------------AAQPASLPAPSRKMMSLEeveaamrAQAKKPAQQPAPAPAQPPAAPPAQ 228
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14473 PQRPspgiinVPSVPQPIPTAPSPGIINIPSVPQPLPSPTPGVINIPQQPTPPPLVQQPGIINIPSVQQPStPTTQHPIQ 14552
Cdd:pfam09770   229 QAQQ------QQQFPPQIQQQQQPQQQPQQPQQHPGQGHPVTILQRPQSPQPDPAQPSIQPQAQQFHQQPP-PVPVQPTQ 301
                           250       260       270       280       290       300       310
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 442625924  14553 DVQyetqrpQPtpgviNIPSVSQPTYPTQKPsyqdtsyPTVQPKPPVSGIINIPSVPQPVPSLT--PGVINLPSE 14625
Cdd:pfam09770   302 ILQ------NP-----NRLSAARVGYPQNPQ-------PGVQPAPAHQAHRQQGSFGRQAPIIThpQQLAQLSEE 358
PRK10263 PRK10263
DNA translocase FtsK; Provisional
13907-14025 8.08e-08

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 61.25  E-value: 8.08e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13907 PVRPQIYDTPSPPY------PVAIPDLVYVQQQQPGIVNIPSAP-----QPIYPTPQSPQYNVNY----PSPQPANPQKP 13971
Cdd:PRK10263   731 PMKALLDDGPHEPLftpivePVQQPQQPVAPQQQYQQPQQPVAPqpqyqQPQQPVAPQPQYQQPQqpvaPQPQYQQPQQP 810
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13972 -------GVVNIPSVPQPVYPSPQPPVYD-------------------VNYPTTPvsqhpgvvnIPSAPRLVPPTSQ-RP 14024
Cdd:PRK10263   811 vapqpqyQQPQQPVAPQPQYQQPQQPVAPqpqdtllhpllmrngdsrpLHKPTTP---------LPSLDLLTPPPSEvEP 881

                   .
gi 442625924 14025 V 14025
Cdd:PRK10263   882 V 882
PTZ00449 PTZ00449
104 kDa microneme/rhoptry antigen; Provisional
13891-14119 1.15e-07

104 kDa microneme/rhoptry antigen; Provisional


Pssm-ID: 185628 [Multi-domain]  Cd Length: 943  Bit Score: 60.47  E-value: 1.15e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13891 SSHTGDPftrcyETPK-PVRPQIYDTP-SPPYPvaipdlvyvqqQQPGIVNIPSAPQ-PIYPT-PQSPqynvnyPSPQ-P 13965
Cdd:PTZ00449   585 PKHPKDP-----EEPKkPKRPRSAQRPtRPKSP-----------KLPELLDIPKSPKrPESPKsPKRP------PPPQrP 642
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13966 ANPQKPGVVNIPSVPQPVyPSPQPP---------------------------VYDVNYPTTPVSQHPGVVNIP-SAPRLV 14017
Cdd:PTZ00449   643 SSPERPEGPKIIKSPKPP-KSPKPPfdpkfkekfyddyldaaaksketkttvVLDESFESILKETLPETPGTPfTTPRPL 721
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14018 PPtsQRPvfiTSPGnlSPTPQPGVINIPSVSQPGYPTPqsPIYDANYPTTQSPIPQQPGV----VNIPSVPSPSyPAPNP 14093
Cdd:PTZ00449   722 PP--KLP---RDEE--FPFEPIGDPDAEQPDDIEFFTP--PEEERTFFHETPADTPLPDIlaeeFKEEDIHAET-GEPDE 791
                          250       260
                   ....*....|....*....|....*.
gi 442625924 14094 PVNYPTQPSPQIPVQPGviNIPSAPL 14119
Cdd:PTZ00449   792 AMKRPDSPSEHEDKPPG--DHPSLPK 815
PLN03209 PLN03209
translocon at the inner envelope of chloroplast subunit 62; Provisional
14446-14706 1.42e-07

translocon at the inner envelope of chloroplast subunit 62; Provisional


Pssm-ID: 178748 [Multi-domain]  Cd Length: 576  Bit Score: 59.94  E-value: 1.42e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14446 NIPSQ-ASPPISVPTPGIVNIPSIPQPtPQRPSPGIINVPsvPQPIPTAPSP-------GIINIPSVPQPLPsPTPGvin 14517
Cdd:PLN03209   322 KIPSQrVPPKESDAADGPKPVPTKPVT-PEAPSPPIEEEP--PQPKAVVPRPlspytayEDLKPPTSPIPTP-PSSS--- 394
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14518 iPQQPTPPPLVQQPGIINIPSVqqPSTPTTQHPIQDVQYETQRPQPTPGVINIPSVSQPTYPTqkPSYQDTSYPTVQPKP 14597
Cdd:PLN03209   395 -PASSKSVDAVAKPAEPDVVPS--PGSASNVPEVEPAQVEAKKTRPLSPYARYEDLKPPTSPS--PTAPTGVSPSVSSTS 469
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14598 PVSGIINIPsvpqPVPSLTPGVINLPSEPSYSAPIPKPGIINVPSIPEPIPSIPQNPvqevyhdtqkPQAIPGVVNVPSA 14677
Cdd:PLN03209   470 SVPAVPDTA----PATAATDAAAPPPANMRPLSPYAVYDDLKPPTSPSPAAPVGKVA----------PSSTNEVVKVGNS 535
                          250       260       270       280
                   ....*....|....*....|....*....|....*....|....*
gi 442625924 14678 --------------PQPTPGRPY--YDVAKPdfefnPCYPSPCGP 14706
Cdd:PLN03209   536 apptaladeqhhaqPKPRPLSPYtmYEDLKP-----PTSPTPSPV 575
PBP1 COG5180
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ...
14326-14687 1.89e-07

PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];


Pssm-ID: 444064 [Multi-domain]  Cd Length: 548  Bit Score: 59.31  E-value: 1.89e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14326 VAQPVHPTYQPPVVERPAIYDVYYPPPPSRPGVINIPSPPRPVYPVPQQPIYVPAPV---LHIPAPRPVIHNIP---SVP 14399
Cdd:COG5180     15 VPIPPNAARPVLSPELWAAANNDAVSQGDRSALASSPTRPYARKIFEPLDIKLALGKpqlPSVAEPEAYLDPAPpksSPD 94
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14400 QPTYPHRNPPiqDVTYPAPQPSPPVPGIVNIPSLPQPVSTPTSGVINipSQASPPISVPTPGIVNIPSIP---------- 14469
Cdd:COG5180     95 TPEEQLGAPA--GDLLVLPAAKTPELAAGALPAPAAAAALPKAKVTR--EATSASAGVALAAALLQRSDPilakdpdgds 170
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14470 QPTPQRPSPGIINVPSVPQPIpTAPSPGIINIPSVPQPLPSPTPgviniPQQPTPPPLVQQPGIINIPSVQQPSTPTTQ- 14548
Cdd:COG5180    171 ASTLPPPAEKLDKVLTEPRDA-LKDSPEKLDRPKVEVKDEAQEE-----PPDLTGGADHPRPEAASSPKVDPPSTSEARs 244
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14549 HPIQ-DVQYETQ------RPQPTPGViNIPSVSQPTYPT----QKPSYQDTSYPTVQPKpPVSGIINIPSVPQPVpSLTP 14617
Cdd:COG5180    245 RPATvDAQPEMRppadakERRRAAIG-DTPAAEPPGLPVleagSEPQSDAPEAETARPI-DVKGVASAPPATRPV-RPPG 321
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14618 GVINL----PSEPSYSA---PIPKPGIINVPSiPEPIPSiPQNPVQEVYHDTQKPQAiPGVVNVPSAPQ---PTPGRPYY 14687
Cdd:COG5180    322 GARDPgtprPGQPTERPagvPEAASDAGQPPS-AYPPAE-EAVPGKPLEQGAPRPGS-SGGDGAPFQPPngaPQPGLGRR 398
PRK10819 PRK10819
transport protein TonB; Provisional
14444-14579 2.23e-07

transport protein TonB; Provisional


Pssm-ID: 236768 [Multi-domain]  Cd Length: 246  Bit Score: 57.00  E-value: 2.23e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14444 VINIPsQASPPISVptpGIVNIPSIPQPTPQRPSPGIINVPSV-PQPIPTAPSPGIINIPS-------VPQPLPSPTPGV 14515
Cdd:PRK10819    38 VIELP-APAQPISV---TMVAPADLEPPQAVQPPPEPVVEPEPePEPIPEPPKEAPVVIPKpepkpkpKPKPKPKPVKKV 113
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 442625924 14516 INIPQQPTPPPLVQQPGIINIPSVQQPSTPTTqhpiqdvqyETQRPQPTPGVINIP---SVSQPTYP 14579
Cdd:PRK10819   114 EEQPKREVKPVEPRPASPFENTAPARPTSSTA---------TAAASKPVTSVSSGPralSRNQPQYP 171
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
13950-14267 2.72e-07

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 59.16  E-value: 2.72e-07
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  13950 TPQSPQYNVNYPSPQPANPQKPGVVNIPS-VPQPVYPSPQPPVYDVNYPTtPVSQHPGVVNI-PS-APRLVPPTSQRPVf 14026
Cdd:pfam05109   428 TTTSPTLNTTGFAAPNTTTGLPSSTHVPTnLTAPASTGPTVSTADVTSPT-PAGTTSGASPVtPSpSPRDNGTESKAPD- 505
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14027 ITSPGNLSPTPQPGVIN-IPSVSQPGyPTPQSPIYDANYPTT--QSPIPQqpgvvniPSVPSPSYPAPNPPVNYPT--QP 14101
Cdd:pfam05109   506 MTSPTSAVTTPTPNATSpTPAVTTPT-PNATSPTLGKTSPTSavTTPTPN-------ATSPTPAVTTPTPNATIPTlgKT 577
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14102 SPQIPVQPGVINIPSAPLPTTPPQhppvfipspESPSPAPKPGVINIPSVTH-PEYPTSQVPV--YDVNYSTT------P 14172
Cdd:pfam05109   578 SPTSAVTTPTPNATSPTVGETSPQ---------ANTTNHTLGGTSSTPVVTSpPKNATSAVTTgqHNITSSSTssmslrP 648
                           250       260       270       280       290       300       310       320
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14173 SPIPQ--KPGVVNIPSAPQPVHPAPNPPVHEFNYPTPPAVPQQPGVlnipSYPTPvAPTPQSPIYIPSQEQPKPTTRPSV 14250
Cdd:pfam05109   649 SSISEtlSPSTSDNSTSHMPLLTSAHPTGGENITQVTPASTSTHHV----STSSP-APRPGTTSQASGPGNSSTSTKPGE 723
                           330
                    ....*....|....*...
gi 442625924  14251 INVPSVPQPAYPT-PQAP 14267
Cdd:pfam05109   724 VNVTKGTPPKNATsPQAP 741
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
14284-14683 3.19e-07

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 59.16  E-value: 3.19e-07
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14284 PGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPAPqpgVVNIPSVAQPVHPTYQPPVVERPAIYDVyypPPPSRPGVinipS 14363
Cdd:pfam05109   400 PKTLIITRTATNATTTTHKVIFSKAPESTTTSP---TLNTTGFAAPNTTTGLPSSTHVPTNLTA---PASTGPTV----S 469
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14364 PPRPVYPVPQQPIYVPAPVLHIPAPRPvihnipSVPQPTYPHRNPPIQDVTYPAPqpsppvpgivNIPSLPQPVSTPTsg 14443
Cdd:pfam05109   470 TADVTSPTPAGTTSGASPVTPSPSPRD------NGTESKAPDMTSPTSAVTTPTP----------NATSPTPAVTTPT-- 531
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14444 viniPSQASPPISVPTPgivnIPSIPQPTPQRPSPgiinVPSVPQPIPTAPSPGIINIP---SVPQPLP---SPTPGVIN 14517
Cdd:pfam05109   532 ----PNATSPTLGKTSP----TSAVTTPTPNATSP----TPAVTTPTPNATIPTLGKTSptsAVTTPTPnatSPTVGETS 599
                           250       260       270       280       290       300       310       320
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14518 iPQQPTPPPLV----QQPGIINIPSVQQPSTPTTQHPIQDVQYETQRPQPTpgviNIPSVSQPTYPTQKPSYQ---DTSY 14590
Cdd:pfam05109   600 -PQANTTNHTLggtsSTPVVTSPPKNATSAVTTGQHNITSSSTSSMSLRPS----SISETLSPSTSDNSTSHMpllTSAH 674
                           330       340       350       360       370       380       390       400
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14591 PT----VQPKPPVSGIINIPSVPQPVPSltPGVINLPSEPSYSAPIPKPGIINV-PSIPEPIPSIPQNPvqevyhdTQKP 14665
Cdd:pfam05109   675 PTggenITQVTPASTSTHHVSTSSPAPR--PGTTSQASGPGNSSTSTKPGEVNVtKGTPPKNATSPQAP-------SGQK 745
                           410
                    ....*....|....*...
gi 442625924  14666 QAIPGVVNVPSAPQPTPG 14683
Cdd:pfam05109   746 TAVPTVTSTGGKANSTTG 763
PRK10263 PRK10263
DNA translocase FtsK; Provisional
13794-14264 4.76e-07

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 58.56  E-value: 4.76e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13794 AYCSPV-PIIQESPLTPCDPSPCGPNAQCHPS----LNEAVCSCLPEFYgtPPNcrpectlnSECAYDKACVHHKCVDPC 13868
Cdd:PRK10263   332 SWAAPVePVTQTPPVASVDVPPAQPTVAWQPVpgpqTGEPVIAPAPEGY--PQQ--------SQYAQPAVQYNEPLQQPV 401
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13869 PgicginadcrvhyhsPICYCISSHTGDPFTRCYETPKPVRPQIYDTPSPpypvaipdlvyvQQQQPGIVNIPSAPQPIY 13948
Cdd:PRK10263   402 Q---------------PQQPYYAPAAEQPAQQPYYAPAPEQPAQQPYYAP------------APEQPVAGNAWQAEEQQS 454
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13949 PTPQSPQYNVNYPSPQPAnPQKPGVVNIPSVPQPVYPSPQ----------PPVY-------------------------- 13992
Cdd:PRK10263   455 TFAPQSTYQTEQTYQQPA-AQEPLYQQPQPVEQQPVVEPEpvveetkparPPLYyfeeveekrarereqlaawyqpipep 533
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13993 ----DVNYPTTPVSQHPGVVNIPSAPRLVP---------------PTSQRPVFITSPGNlSPTPQ-----------PGVI 14042
Cdd:PRK10263   534 vkepEPIKSSLKAPSVAAVPPVEAAAAVSPlasgvkkatlatgaaATVAAPVFSLANSG-GPRPQvkegigpqlprPKRI 612
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14043 NIPS---VSQPGYPTPQSPI------------YDANYPTT----------------------------QSPIPQQPG--- 14076
Cdd:PRK10263   613 RVPTrreLASYGIKLPSQRAaeekareaqrnqYDSGDQYNddeidamqqdelarqfaqtqqqrygeqyQHDVPVNAEdad 692
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14077 -------VVNIPSVPSPSYPAPNPPVNYPTQPS--PQIPVQPGVINIPSAPL--PTTPPQHPPVfipspespspapkpgv 14145
Cdd:PRK10263   693 aaaeaelARQFAQTQQQRYSGEQPAGANPFSLDdfEFSPMKALLDDGPHEPLftPIVEPVQQPQ---------------- 756
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14146 inIPSVTHPEYPTSQVPVYDVNYSTTPS---PIPQKPGVVNIPSAPQPVHPAPNPPV---HEFNYPTPPAVPQQPgvlnI 14219
Cdd:PRK10263   757 --QPVAPQQQYQQPQQPVAPQPQYQQPQqpvAPQPQYQQPQQPVAPQPQYQQPQQPVapqPQYQQPQQPVAPQPQ----Y 830
                          570       580       590       600       610
                   ....*....|....*....|....*....|....*....|....*....|..
gi 442625924 14220 PSYPTPVAPTPQSPIYIP---SQEQPKPTTRPSViNVPSV----PQPAYPTP 14264
Cdd:PRK10263   831 QQPQQPVAPQPQDTLLHPllmRNGDSRPLHKPTT-PLPSLdlltPPPSEVEP 881
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
14068-14331 5.09e-07

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 58.12  E-value: 5.09e-07
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14068 QSPIPQ-QPGVVNIPSVPSPSYPAPNPPVNYPTQPS-------------PQIPVQPGVINIPSAPlPTTPPQHPPVfips 14133
Cdd:pfam09770   105 QQPAARaAQSSAQPPASSLPQYQYASQQSQQPSKPVrtgyekykepepiPDLQVDASLWGVAPKK-AAAPAPAPQP---- 179
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14134 pespspapkpgvinipsvthpeyptsqvpvydvnySTTPSPIPQkpgvvniPS----------------APQPVHPAPNP 14197
Cdd:pfam09770   180 -----------------------------------AAQPASLPA-------PSrkmmsleeveaamraqAKKPAQQPAPA 217
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14198 PVHEFNYPTPPAVPQQPGVLNIPSYPTPVAPTPQSPIYIPSQEQP-----KPTTRPSVINVPSVPQPAYPTPQAPvydVN 14272
Cdd:pfam09770   218 PAQPPAAPPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPvtilqRPQSPQPDPAQPSIQPQAQQFHQQP---PP 294
                           250       260       270       280       290
                    ....*....|....*....|....*....|....*....|....*....|....*....
gi 442625924  14273 YPTSPSVIPHQPGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVNIpsVAQPVH 14331
Cdd:pfam09770   295 VPVQPTQILQNPNRLSAARVGYPQNPQPGVQPAPAHQAHRQQGSFGRQAPI--ITHPQQ 351
PTZ00449 PTZ00449
104 kDa microneme/rhoptry antigen; Provisional
14379-14682 9.26e-07

104 kDa microneme/rhoptry antigen; Provisional


Pssm-ID: 185628 [Multi-domain]  Cd Length: 943  Bit Score: 57.39  E-value: 9.26e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14379 PAPVL-HIPAPRPVIHNIPSVPQ-PTYPHRnppiqdvtypapqpsppvpgivniPSLPQPVSTPTSgvinipsqASPPIS 14456
Cdd:PTZ00449   561 PGPAKeHKPSKIPTLSKKPEFPKdPKHPKD------------------------PEEPKKPKRPRS--------AQRPTR 608
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14457 VPTPGIVNIPSIPQpTPQRPSPGiiNVPSVPQPiPTAPS----PGIINIPSVPQPlpsptpgviniPQQPTPP--PLVQQ 14530
Cdd:PTZ00449   609 PKSPKLPELLDIPK-SPKRPESP--KSPKRPPP-PQRPSsperPEGPKIIKSPKP-----------PKSPKPPfdPKFKE 673
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14531 PGIINIPSVQQPSTPTTQHPIQDVQYETQRPQPTPGVINIPSVSQPTYPTQKPSyqDTSYPTVQPKPPVSgiinipsvPQ 14610
Cdd:PTZ00449   674 KFYDDYLDAAAKSKETKTTVVLDESFESILKETLPETPGTPFTTPRPLPPKLPR--DEEFPFEPIGDPDA--------EQ 743
                          250       260       270       280       290       300       310
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 442625924 14611 PVPSltpgvinlpsePSYSAPIPKPGIINVPSIPEPIPSIPQNPVQE--VYHDTQKPQAIPGVVNVPSAPQPTP 14682
Cdd:PTZ00449   744 PDDI-----------EFFTPPEEERTFFHETPADTPLPDILAEEFKEedIHAETGEPDEAMKRPDSPSEHEDKP 806
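Each hit record in this listing carries a one-line statistics summary (Pssm-ID, model length, bit score, E-value). The following is a minimal sketch of how such a line could be turned into numbers for filtering or sorting. The regular expression and the function name `parse_stat_line` are illustrative assumptions based only on the layout seen in this text dump; they are not part of any NCBI tool.

```python
import re

# Assumed layout, taken from the lines in this listing:
# "Pssm-ID: <int> [<kind>]  Cd Length: <int>  Bit Score: <float>  E-value: <float>"
STAT_LINE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<kind>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_stat_line(line):
    """Return the fields of a hit statistics line as a dict, or None if no match."""
    m = STAT_LINE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "kind": m.group("kind"),  # e.g. "Multi-domain"; None when the bracket is absent
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


# Statistics line of the PTZ00449 hit, copied from the record above.
example = (
    "Pssm-ID: 185628 [Multi-domain]  Cd Length: 943  "
    "Bit Score: 57.39  E-value: 9.26e-07"
)
hit = parse_stat_line(example)
print(hit)
```

Parsing the values into `int` and `float` lets hits be ranked by E-value or bit score without relying on string comparison.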
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
14029-14390 1.14e-06

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 57.49  E-value: 1.14e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14029 SPGNLSPTPQPGVINIPSVSQPGYPTPQSPiydaNYPTTQSPIPQQPGVVNIPSVPSPSY--PAPNPPVNYPTQPSPQIP 14106
Cdd:PHA03307    39 SQGQLVSDSAELAAVTVVAGAAACDRFEPP----TGPPPGPGTEAPANESRSTPTWSLSTlaPASPAREGSPTPPGPSSP 114
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14107 VQPGVINIPSAPLPTTPPQHPPVFIPSpespspapkpgviniPSVTHPEyPTSQVPVYDVNYSTTPSPIPQKPGVVNIPS 14186
Cdd:PHA03307   115 DPPPPTPPPASPPPSPAPDLSEMLRPV---------------GSPGPPP-AASPPAAGASPAAVASDAASSRQAALPLSS 178
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14187 APQPVHPAPNPPVHEFNYPTPPAVPQQPGVLNIPSYPTPVAPTPQSPiyiPSQEQPKPTTRPSVINVPSV-----PQPAY 14261
Cdd:PHA03307   179 PEETARAPSSPPAEPPPSTPPAAASPRPPRRSSPISASASSPAPAPG---RSAADDAGASSSDSSSSESSgcgwgPENEC 255
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14262 PTPQAPVYDVNYPTSPSVIPHQPGVVNIPSVPlPAPPVKQRPVFVPS----PVHPTPAPQPGVVNIPSVAQPVHPTYQPP 14337
Cdd:PHA03307   256 PLPRPAPITLPTRIWEASGWNGPSSRPGPASS-SSSPRERSPSPSPSspgsGPAPSSPRASSSSSSSRESSSSSTSSSSE 334
                          330       340       350       360       370
                   ....*....|....*....|....*....|....*....|....*....|....
gi 442625924 14338 VVERPAiydVYYPPPPSRPGVINIPSPPRPVYPVPQ-QPIYVPAPVLHIPAPRP 14390
Cdd:PHA03307   335 SSRGAA---VSPGPSPSRSPSPSRPPPPADPSSPRKrPRPSRAPSSPAASAGRP 385
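In the alignment blocks of this listing, each sequence row starts with a label (`gi 442625924` for the query, `Cdd:<accession>` for the model), followed by a start coordinate, the aligned segment, and an end coordinate. The sketch below shows one way to read such a row and check that the coordinates agree with the number of non-gap characters. The pattern and the helper names `parse_alignment_row` and `is_consistent` are assumptions made for illustration, derived from the text layout only. Every non-gap character is counted as a residue, whatever its case.

```python
import re

# Assumed row layout: "<label> <start> <aligned segment> <end>",
# where <label> is "gi <number>" or "Cdd:<accession>".
ROW = re.compile(
    r"^(?P<label>gi\s+\d+|Cdd:\S+)\s+(?P<start>\d+)\s+(?P<seq>\S+)\s+(?P<end>\d+)$"
)


def parse_alignment_row(line):
    """Parse one alignment row; return None for rulers and other lines."""
    m = ROW.match(line.strip())
    if m is None:
        return None
    seq = m.group("seq")
    return {
        "label": m.group("label"),
        "start": int(m.group("start")),
        "end": int(m.group("end")),
        "seq": seq,
        # '-' marks a gap; every other character counts as one residue.
        "residues": len(seq.replace("-", "")),
    }


def is_consistent(row):
    """True when end - start + 1 equals the number of non-gap residues."""
    return row["end"] - row["start"] + 1 == row["residues"]


# Final row pair of the PHA03307 alignment, copied from the record above.
query = parse_alignment_row(
    "gi 442625924 14338 VVERPAiydVYYPPPPSRPGVINIPSPPRPVYPVPQ-QPIYVPAPVLHIPAPRP 14390"
)
model = parse_alignment_row(
    "Cdd:PHA03307   335 SSRGAA---VSPGPSPSRSPSPSRPPPPADPSSPRKrPRPSRAPSSPAASAGRP 385"
)
print(is_consistent(query), is_consistent(model))
```

A check of this kind is a cheap way to detect rows that were truncated or wrapped when the page was converted to text.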
Not5 COG5665
CCR4-NOT transcriptional regulation complex, NOT5 subunit [Transcription];
14259-14683 1.19e-06

CCR4-NOT transcriptional regulation complex, NOT5 subunit [Transcription];


Pssm-ID: 444384 [Multi-domain]  Cd Length: 874  Bit Score: 56.98  E-value: 1.19e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14259 PAYP-TPQAPVYDVNYPtspsviphqpgvvnIPSVPLPAPPVKQRPV---FVPSPVHPTPAPQpgvvnipsvAQPVHPTy 14334
Cdd:COG5665    177 IAVPsAPAAPPNAVDYS--------------VLVPIAAQDPAASVSTpqaFNASATSGRSQHI---------VQAAKRV- 232
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14335 qppVVERPAIYDVYyPPPPSRPGVINIPSPPRPVYPVPQQPIYVPapvlhiPAPRPVIHNIpsVPQPTYPHRNPPiqdVT 14414
Cdd:COG5665    233 ---GVEWWGDPSLL-ATPPATPATEEKSSQQPKSQPTSPSGGTTP------PSTNQLTTSN--TPTSTAKAQPQP---PT 297
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14415 YPAPQpsppvpgivnipslpqpVSTPTSGVINIPSQASPPISVPTPGivnipSIPQPTPQRPSPGIINVPSVPQPIPtap 14494
Cdd:COG5665    298 KKQPA-----------------KEPPSDTASGNPSAPSVLINSDSPT-----SEDPATASVPTTEETTAFTTPSSVP--- 352
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14495 spgiinipsVPQPLPSPTPGVINIPQQPTPPPLvqqpgiinipSVQQPSTPTTQHPIQDVQYETQRPQ-PTPGVINIPSV 14573
Cdd:COG5665    353 ---------STPAEKDTPATDLATPVSPTPPET----------SVDKKVSPDSATSSTKSEKEGGTASsPMPPNIAIGAK 413
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14574 SQPTyPTqKPSYQDTSYptvQPKPPVSGiiniPSVPQPVPSLTpgvinlpSEPSYSAPIPKPGIINVPSIPEPIPSIPQN 14653
Cdd:COG5665    414 DDVD-AT-DPSQEAKEY---TKNAPMTP----EADSAPESSVR-------TEASPSAGSDLEPENTTLRDPAPNAIPPPE 477
                          410       420       430
                   ....*....|....*....|....*....|
gi 442625924 14654 PVQEVYHDTQKPQAipgvVNVPSAPQPTPG 14683
Cdd:COG5665    478 DPSTIGRLSSGDKL----ANETGPPVIRRD 503
PHA03247 PHA03247
large tegument protein UL36; Provisional
13993-14246 1.26e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 57.26  E-value: 1.26e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13993 DVNYPTTPVSQHPGVVNIPSAPRLVPPTSQRPVfiTSPGNLSPTPQPGVINI------PSVSQPGYPTPQSPIYDANYPT 14066
Cdd:PHA03247   251 DIAAPAPPPVVGEGADRAPETARGATGPPPPPE--AAAPNGAAAPPDGVWGAalagapLALPAPPDPPPPAPAGDAEEED 328
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14067 TQ-------SPIPQqpgvvnipsvPSPSYPAPNPPVNYPT--QPSPQIPVQPGVINIPSAPLPTTPPQHPPvfipspesp 14137
Cdd:PHA03247   329 DEdgamevvSPLPR----------PRQHYPLGFPKRRRPTwtPPSSLEDLSAGRHHPKRASLPTRKRRSAR--------- 389
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14138 spAPKPGVINIPSVTHPEYPTSQVPvydvnySTTPSPIPqKPGVVNIPSAPQPVHPAPNPPVHEfnYPTPPAVPQQPGVL 14217
Cdd:PHA03247   390 --HAATPFARGPGGDDQTRPAAPVP------ASVPTPAP-TPVPASAPPPPATPLPSAEPGSDD--GPAPPPERQPPAPA 458
                          250       260
                   ....*....|....*....|....*....
gi 442625924 14218 NIPSYPTPVAPTPQSPIYIPSQEQPKPTT 14246
Cdd:PHA03247   459 TEPAPDDPDDATRKALDALRERRPPEPPG 487
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
14171-14375 1.60e-06

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 56.53  E-value: 1.60e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14171 TPSPIPQKPGVVNIPSAPQPVhPAPNPPVHefnyPTPPAVPQQPGvlnipsyPTPVAPTPQSPiyiPSQEQPKPTTRPSV 14250
Cdd:PRK07764   598 EGPPAPASSGPPEEAARPAAP-AAPAAPAA----PAPAGAAAAPA-------EASAAPAPGVA---APEHHPKHVAVPDA 662
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14251 INVPSvPQPAYPTPQAPVYDVnyPTSPSVIPHQPGVVNiPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVNIPSVAQPV 14330
Cdd:PRK07764   663 SDGGD-GWPAKAGGAAPAAPP--PAPAPAAPAAPAGAA-PAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAADDP 738
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*
gi 442625924 14331 HPTyqPPVVERPAIYDVYYPPPPSRPGVINIPSPPRPVYPVPQQP 14375
Cdd:PRK07764   739 VPL--PPEPDDPPDPAGAPAQPPPPPAPAPAAAPAAAPPPSPPSE 781
SP2_N cd22540
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins ...
14336-14656 1.66e-06

N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate or, in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.


Pssm-ID: 411776 [Multi-domain]  Cd Length: 511  Bit Score: 56.09  E-value: 1.66e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14336 PPVVERPAIydvyyPPPPSRPgVIN--IPSPPRPVYPVPQQPIYVPAP----VLHIPAPRpVIHNIPSVPQPtYPHRNPP 14409
Cdd:cd22540     39 PPAVEAAVT-----PPAPPQP-TPRklVPIKPAPLPLGPGKNSIGFLSakgnIIQLQGSQ-LSSSAPGGQQV-FAIQNPT 110
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14410 IQDVTYPAPQPSPPvpGIVNIPSLPQPVSTPTSGVINI-----PSQASPPISVPTPGIVNIPSIPQPTPQRPSPGIINVP 14484
Cdd:cd22540    111 MIIKGSQTRSSTNQ--QYQISPQIQAAGQINNSGQIQIipgtnQAIITPVQVLQQPQQAHKPVPIKPAPLQTSNTNSASL 188
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14485 SVPQPIPTAPSPGII--NIPS------------VPQPLPSPTPGVI---NIPQQPTPPPLVQQ-----------PGII-- 14534
Cdd:cd22540    189 QVPGNVIKLQSGGNValTLPVnnlvgtqdgatqLQLAAAPSKPSKKirkKSAQAAQPAVTVAEqvetvliettaDNIIqa 268
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14535 --NIPSVQQPST--PTTQHPIQDVQYETQR------PQPTPGV-------INIPSVS------QPTYPTQKPSYQDTSYP 14591
Cdd:cd22540    269 gnNLLIVQSPGTgqPAVLQQVQVLQPKQEQqvvqipQQALRVVqaasatlPTVPQKPlqniqiQNSEPTPTQVYIKTPSG 348
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14592 TVQ-------PKPPVSGIINIPSVPQPVPSLTPGVINLPS-----EPSYSAPIPKPGIINV-----PSIPEPIPSIPQNP 14654
Cdd:cd22540    349 EVQtvllqeaPAATATPSSSTSTVQQQVTANNGTGTSKPNynvrkERTLPKIAPAGGIISLnaaqlAAAAQAIQTINING 428

                   ..
gi 442625924 14655 VQ 14656
Cdd:cd22540    429 VQ 430
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
14242-14550 1.70e-06

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 56.46  E-value: 1.70e-06
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14242 PKPTT-RPSVINVPS-VPQPAYPTPQAPVYDVNYPTSPSVIPHQPgvvniPSVPLPAPPVKQRPVFVPSPVHPTPA---P 14316
Cdd:pfam05109   442 PNTTTgLPSSTHVPTnLTAPASTGPTVSTADVTSPTPAGTTSGAS-----PVTPSPSPRDNGTESKAPDMTSPTSAvttP 516
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14317 QPGVVN-IPSVAQPVhPTYQPPVVERPAIYDVYYPPPPsrpgviNIPSP-PRPVYPVPQQPIyvpaPVLHIPAPrpvihn 14394
Cdd:pfam05109   517 TPNATSpTPAVTTPT-PNATSPTLGKTSPTSAVTTPTP------NATSPtPAVTTPTPNATI----PTLGKTSP------ 579
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14395 IPSVPQPTYPHRNPPIQDVTYPAPQPSPPVPGIVNIPSLPQPVSTPTSGVI----NIPSQASPPISV----------PTP 14460
Cdd:pfam05109   580 TSAVTTPTPNATSPTVGETSPQANTTNHTLGGTSSTPVVTSPPKNATSAVTtgqhNITSSSTSSMSLrpssisetlsPST 659
                           250       260       270       280       290       300       310       320
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14461 GIVNIPSIPQPTPQRPSPGiinvPSVPQPIPTAPSPGIINIPSvpqplPSPTPGVINIPQQPTPPPLVQQPGIINIPSVQ 14540
Cdd:pfam05109   660 SDNSTSHMPLLTSAHPTGG----ENITQVTPASTSTHHVSTSS-----PAPRPGTTSQASGPGNSSTSTKPGEVNVTKGT 730
                           330
                    ....*....|.
gi 442625924  14541 QPSTPTT-QHP 14550
Cdd:pfam05109   731 PPKNATSpQAP 741
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
14144-14598 1.73e-06

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 56.72  E-value: 1.73e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14144 GVINIPSVTHPEYPTSQVPVYDVNYSTTPSPIPQKPGVVNIPSAPQPVHPAPNPPvhefnyPTPPAVPQQPGVLNIPSYP 14223
Cdd:PHA03307    17 GGEFFPRPPATPGDAADDLLSGSQGQLVSDSAELAAVTVVAGAAACDRFEPPTGP------PPGPGTEAPANESRSTPTW 90
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14224 TPVAPTPQSPIYIPSQEQPKPTTRPSVinvPSVPQPAYPTPQAPvydvnyPTSPSVIPHQPGVVNIPSVPLPAPPVKQRP 14303
Cdd:PHA03307    91 SLSTLAPASPAREGSPTPPGPSSPDPP---PPTPPPASPPPSPA------PDLSEMLRPVGSPGPPPAASPPAAGASPAA 161
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14304 VfvpsPVHPTPAPQPGVVnIPSVAQPVHPTYQPPVVERPAIYDVYYPPPPSRPGVINIPSPPRPVyPVPQQPIYVPAPVL 14383
Cdd:PHA03307   162 V----ASDAASSRQAALP-LSSPEETARAPSSPPAEPPPSTPPAAASPRPPRRSSPISASASSPA-PAPGRSAADDAGAS 235
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14384 HIPAPRPVIHNIPSVPQPTYPHRNPPIQDVtypapqPSPPVPGIVNIPSLPQPVSTPTSGVINIPSQASPPISVPTPgiv 14463
Cdd:PHA03307   236 SSDSSSSESSGCGWGPENECPLPRPAPITL------PTRIWEASGWNGPSSRPGPASSSSSPRERSPSPSPSSPGSG--- 306
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14464 nipsiPQPTPQRPSPGIINVPSV--PQPIPTAPSPGIINIPSVPQPLPSPTPGVINIPQQPTPPPLVQQPGIINIPSVQQ 14541
Cdd:PHA03307   307 -----PAPSSPRASSSSSSSRESssSSTSSSSESSRGAAVSPGPSPSRSPSPSRPPPPADPSSPRKRPRPSRAPSSPAAS 381
                          410       420       430       440       450       460
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 442625924 14542 PSTPTT---------QHPIQDVQYETQRPQPTPGVINIPSVSQPTYPTQKPSYQDTS-YPTVQPKPP 14598
Cdd:PHA03307   382 AGRPTRrraraavagRARRRDATGRFPAGRPRPSPLDAGAASGAFYARYPLLTPSGEpWPGSPPPPP 448
PRK10819 PRK10819
transport protein TonB; Provisional
14237-14406 1.74e-06

transport protein TonB; Provisional


Pssm-ID: 236768 [Multi-domain]  Cd Length: 246  Bit Score: 54.30  E-value: 1.74e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14237 PSQEQPKPTTRPSVINVPSVPQPAYPTPQAPVYDVNYPTS-PSVIPhQPgvvnipsvPLPAPPVKQRPVFVPSPVhPTPA 14315
Cdd:PRK10819    37 QVIELPAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPePEPIP-EP--------PKEAPVVIPKPEPKPKPK-PKPK 106
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14316 PQPGVVNIPSVAQPVhptyqPPVVERPAIYDVyyPPPPSRPgvinIPSPPRPVYPVPQQPiyVPApvlhipAPRPVihni 14395
Cdd:PRK10819   107 PKPVKKVEEQPKREV-----KPVEPRPASPFE--NTAPARP----TSSTATAAASKPVTS--VSS------GPRAL---- 163
                          170
                   ....*....|.
gi 442625924 14396 pSVPQPTYPHR 14406
Cdd:PRK10819   164 -SRNQPQYPAR 173
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
14253-14521 2.25e-06

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 56.20  E-value: 2.25e-06
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14253 VPSVPQPAYPTPQAPVYDVNYPTSPSVIPHQP---GVvNIPSVPLPAP-----------PVKQRPvfvPSPVHPTPAPQP 14318
Cdd:pfam09770   108 AARAAQSSAQPPASSLPQYQYASQQSQQPSKPvrtGY-EKYKEPEPIPdlqvdaslwgvAPKKAA---APAPAPQPAAQP 183
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14319 GVVNIPS-------------VAQPVHPTYQPPVVerPAIYDVYYPPPPSRPGViNIPSPPRPVYPVPQQPIYVPAPVLHI 14385
Cdd:pfam09770   184 ASLPAPSrkmmsleeveaamRAQAKKPAQQPAPA--PAQPPAAPPAQQAQQQQ-QFPPQIQQQQQPQQQPQQPQQHPGQG 260
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14386 PAPRPVIHnipsvPQPtyphrnppiqdvtypapqpsppvpgivniPSLPQPVSTPTSGVINIPSQASPPISVPTPGIVNi 14465
Cdd:pfam09770   261 HPVTILQR-----PQS-----------------------------PQPDPAQPSIQPQAQQFHQQPPPVPVQPTQILQN- 305
                           250       260       270       280       290
                    ....*....|....*....|....*....|....*....|....*....|....*.
gi 442625924  14466 psipqptPQRPSPGIINVPSVPQPiPTAPSPGIINIPSvpQPLPSPTPGVINIPQQ 14521
Cdd:pfam09770   306 -------PNRLSAARVGYPQNPQP-GVQPAPAHQAHRQ--QGSFGRQAPIITHPQQ 351
KLF3_N cd21577
N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called ...
14225-14409 2.35e-06

N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called Krueppel-like factor 3 and originally called Basic Kruppel-like Factor/BKLF) was the third member of the KLF family of zinc finger transcription factors to be discovered. KLF3 has a wide range of biological effects, regulating apoptosis, differentiation, and proliferation in various tissues. It has been proposed as a tumor suppressor in colorectal cancer. It appears to function predominantly as a repressor of transcription, turning genes off by recruiting the C-terminal Binding Protein co-repressors CtBP1 and CtBP2. CtBP docks onto a short motif (residues 61-65) in the N-terminus of KLF3, the Proline-X-Aspartate-Leucine-Serine (PXDLS) motif. CtBP in turn recruits histone modifying enzymes to alter chromatin and repress gene expression. KLF3 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF3.


Pssm-ID: 410554 [Multi-domain]  Cd Length: 214  Bit Score: 53.50  E-value: 2.35e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14225 PVAPTPQSPIYIPSQEQPKP-----TTRPSVINVPSVPQPAYPTPQAPVYdvnyPTSPSVIPHQPGVVNIPSVPLPAPPV 14299
Cdd:cd21577      2 PVKTDMETSFYSPSHSQLEPvdlslSKRSSPPSSSSSSSSSSSSSSSPSS----RASPPSPYSKSSPPSPPQQRPLSPPL 77
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14300 KQRPVFVPSPVHPTPAPQPGVVNIPSVAQPVHPTYQPPVVERPAIYDVyyPPPPSRPGVINIPSPP------------RP 14367
Cdd:cd21577     78 SLPPPVAPPPLSPGSVPGGLPVISPVMVQPVPVLYPPHLHQPIMVSSS--PPPDDDHHHHKASSMKpselggdnhelhKP 155
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*....
gi 442625924 14368 V----YPVPQQPIY---VPAPVlhIPAPRPVIHNIPSVPQPTYPHRNPP 14409
Cdd:cd21577    156 IktepRPEHAQDPYseeMSSSV--ISSPPEYESNTPSVIVHPGKRPLPV 202
Amelogenin smart00818
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ...
14483-14626 2.38e-06

Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.


Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 52.48  E-value: 2.38e-06
                             10        20        30        40        50        60        70        80
                     ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924   14483 VPSVPQPIPTAPSPGIINIPSVPQPLPSptpgvinIPQQPtpppLVQQPGiinipsvQQPSTPTTQHPIQDVQYETQRPQ 14562
Cdd:smart00818    40 IPVSQQHPPTHTLQPHHHIPVLPAQQPV-------VPQQP----LMPVPG-------QHSMTPTQHHQPNLPQPAQQPFQ 101
                             90       100       110       120       130       140
                     ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 442625924   14563 PTPgviniPSVSQPTYPTQKPsyqdtsyPTVQPKPPVSGIINIPSVP--QPVPSLTPgviNLPSEP 14626
Cdd:smart00818   102 PQP-----LQPPQPQQPMQPQ-------PPVHPIPPLPPQPPLPPMFpmQPLPPLLP---DLPLEA 152
SP2_N cd22540
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins ...
13917-14323 2.55e-06

N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate or, in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.


Pssm-ID: 411776 [Multi-domain]  Cd Length: 511  Bit Score: 55.70  E-value: 2.55e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13917 SPPYPVAIPDLVYVQQQQPGIVNIPSAPQPIYPTP---------------QSPQYNVNYP-SPQPANPQKPGVVNIPSVP 13980
Cdd:cd22540     39 PPAVEAAVTPPAPPQPTPRKLVPIKPAPLPLGPGKnsigflsakgniiqlQGSQLSSSAPgGQQVFAIQNPTMIIKGSQT 118
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13981 QpvypspqpPVYDVNYPTTPVSQHPGVVNIPSAPRLVPPTSQRpvfITSPGNLSPTPQPGvinipSVSQPGYPTPQSPIy 14060
Cdd:cd22540    119 R--------SSTNQQYQISPQIQAAGQINNSGQIQIIPGTNQA---IITPVQVLQQPQQA-----HKPVPIKPAPLQTS- 181
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14061 danypTTQSPIPQQPGvvNIPSVPSPSYPAPNPPVNYptqpspQIPVQPGVINIPSAPLPTTPPQhppvfipspespspa 14140
Cdd:cd22540    182 -----NTNSASLQVPG--NVIKLQSGGNVALTLPVNN------LVGTQDGATQLQLAAAPSKPSK--------------- 233
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14141 pkpGVINIPSVTHPEYPTSQVPVYDVNYSTTPSP---------IPQKPGvVNIPSAPQPVHPApnppvhefnyptPPAvp 14211
Cdd:cd22540    234 ---KIRKKSAQAAQPAVTVAEQVETVLIETTADNiiqagnnllIVQSPG-TGQPAVLQQVQVL------------QPK-- 295
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14212 QQPGVLNIPSYPTPV--------APTPQSPIYIPSQEQPKPTTRPSVINVPSVPQPAYPTPQAPVYDVNYPTSPSVIPHQ 14283
Cdd:cd22540    296 QEQQVVQIPQQALRVvqaasatlPTVPQKPLQNIQIQNSEPTPTQVYIKTPSGEVQTVLLQEAPAATATPSSSTSTVQQQ 375
                          410       420       430       440
                   ....*....|....*....|....*....|....*....|
gi 442625924 14284 PGVVNIPSVPLPAPPVKQRPVFvpspvhPTPAPQPGVVNI 14323
Cdd:cd22540    376 VTANNGTGTSKPNYNVRKERTL------PKIAPAGGIISL 409
PRK14948 PRK14948
DNA polymerase III subunit gamma/tau;
14442-14653 3.90e-06

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237862 [Multi-domain]  Cd Length: 620  Bit Score: 55.35  E-value: 3.90e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14442 SGVINIPSQASPPISVPTPGIVNIPSIPQPTPqrpSPGIINVPSVPQPIPTAPSPGIINIPSVPQPLPSPTPGVINIPQQ 14521
Cdd:PRK14948   362 SAFISEIANASAPANPTPAPNPSPPPAPIQPS---APKTKQAATTPSPPPAKASPPIPVPAEPTEPSPTPPANAANAPPS 438
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14522 PTPPPLVQQpgIINipSVQQPST------------------------------------------PTTQHPIQ---DVQY 14556
Cdd:PRK14948   439 LNLEELWQQ--ILA--KLELPSTrmllsqqaelvsldsnraviavspnwlgmvqsrkplleqafaKVLGRSIKlnlESQS 514
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14557 ETQRPQPTPGVINIPSVSQPTYPTQKPSYQDTSYPTVQPKPPVSGIINIPSVPQPVPSLTPGVINLPSEPSysaPIPKPg 14636
Cdd:PRK14948   515 GSASNTAKTPPPPQKSPPPPAPTPPLPQPTATAPPPTPPPPPPTATQASSNAPAQIPADSSPPPPIPEEPT---PSPTK- 590
                          250
                   ....*....|....*..
gi 442625924 14637 iinvPSIPEPIPSIPQN 14653
Cdd:PRK14948   591 ----DSSPEEIDKAAKN 603
SP2_N cd22540
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins ...
14351-14676 4.30e-06

N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate or, in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains of the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.
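
As a rough illustration of the consensus motif quoted in the description, the sketch below expands the IUPAC codes in NNRCRCCYY into a regular expression and scans a DNA string for strict matches. The helper names (`iupac_to_regex`, `find_motif`) and the test sequence are invented for this example and are not part of CDD. Because the description calls the consensus loose, real binding sites need not match it exactly.

```python
import re

# IUPAC nucleotide codes used in the motif description (N, R, Y) plus plain bases.
IUPAC = {"A": "A", "C": "C", "G": "G", "T": "T",
         "R": "[AG]", "Y": "[CT]", "N": "[ACGT]"}

def iupac_to_regex(motif):
    """Translate an IUPAC motif string into a regular expression."""
    return "".join(IUPAC[base] for base in motif.upper())

def find_motif(sequence, motif="NNRCRCCYY"):
    """Return 0-based start positions of strict (possibly overlapping) matches."""
    # A lookahead is used so that overlapping matches are all reported.
    pattern = re.compile("(?=" + iupac_to_regex(motif) + ")")
    return [m.start() for m in pattern.finditer(sequence.upper())]

# Hypothetical test sequence, made up for illustration only.
hits = find_motif("TTAAGCGCCTTGG")
```

Only the forward strand is scanned here; a fuller treatment would also check the reverse complement and tolerate mismatches.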


Pssm-ID: 411776 [Multi-domain]  Cd Length: 511  Bit Score: 54.93  E-value: 4.30e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14351 PPPSRPGVinipSPPRPVYPVPQ-QPIYVPAPvlhIPAPRPViHNIPSVPQPTYPHRNPPIQDVTypapqpsppvpgivN 14429
Cdd:cd22540     39 PPAVEAAV----TPPAPPQPTPRkLVPIKPAP---LPLGPGK-NSIGFLSAKGNIIQLQGSQLSS--------------S 96
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14430 IPSLPQPVSTPTSGVINIPSQASPPISVPTpgivnipsipQPTPQRPSPGIINVPSVPQPIPTAPSPGIINIPSVPQPLP 14509
Cdd:cd22540     97 APGGQQVFAIQNPTMIIKGSQTRSSTNQQY----------QISPQIQAAGQINNSGQIQIIPGTNQAIITPVQVLQQPQQ 166
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14510 SPTPgvinIPQQPTPpplvQQPGIINIPSVQQPSTPTTQH---------PIQ--DVQYETQRPQPTPGviniPSVSQPTY 14578
Cdd:cd22540    167 AHKP----VPIKPAP----LQTSNTNSASLQVPGNVIKLQsggnvaltlPVNnlVGTQDGATQLQLAA----APSKPSKK 234
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14579 PTQKPSYQDTSYPTVQPKPPV------SGII---------------NIPSVPQPVPSLTP----GVINLPSEPsysapip 14633
Cdd:cd22540    235 IRKKSAQAAQPAVTVAEQVETvliettADNIiqagnnllivqspgtGQPAVLQQVQVLQPkqeqQVVQIPQQA------- 307
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|...
gi 442625924 14634 kpgIINVPSIPEPIPSIPQNPVQEVYHDTQKPQAIPGVVNVPS 14676
Cdd:cd22540    308 ---LRVVQAASATLPTVPQKPLQNIQIQNSEPTPTQVYIKTPS 347
Glutenin_hmw pfam03157
High molecular weight glutenin subunit; Members of this family include high molecular weight ...
14030-14687 5.10e-06

High molecular weight glutenin subunit; Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.
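
The three repeated motifs named in the description can be counted in a protein string with a few regular expressions. This is a minimal sketch, assuming the `[P/L]` notation means "P or L at that position"; the function name `count_motifs` and the toy sequence are invented for illustration and do not come from Pfam or CDD.

```python
import re

# The three repeated motifs named in the family description;
# [P/L] in the text is written here as the character class [PL].
MOTIFS = {
    "PGQGQQ": re.compile("PGQGQQ"),
    "GYYPTS[P/L]QQ": re.compile("GYYPTS[PL]QQ"),
    "GQQ": re.compile("GQQ"),
}

def count_motifs(protein):
    """Count non-overlapping occurrences of each motif in a protein string."""
    seq = protein.upper()
    return {name: len(rx.findall(seq)) for name, rx in MOTIFS.items()}

# Toy sequence built for illustration only (not a real glutenin subunit).
counts = count_motifs("PGQGQQGYYPTSPQQGQQ")
```

Note that the motifs are counted independently, so a GQQ that sits inside a PGQGQQ repeat is counted under both names.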


Pssm-ID: 367362 [Multi-domain]  Cd Length: 786  Bit Score: 54.95  E-value: 5.10e-06
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14030 PGNLSPTPQ--PGVI-NIPSVSQPGYPTPQSPIYDANYPTTQSPipQQPGVVNIPSVPSPSYpapnppvnYPTqpSPQip 14106
Cdd:pfam03157    85 PGETTPPQQlqQGIFwGIPALLQRYYPGVTSPQQVSYYPGQASP--QRPGQGQQPGQGQQWY--------YPT--SPQ-- 150
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14107 vQPGVINIP----SAPLPTTPPQHPPVFIPSPESPSPAPKPGviNIPSVTHPEY-PTSQVPVYDVNYsTTPSPIPQKPGv 14181
Cdd:pfam03157   151 -QPGQWQQPgqgqQGYYPTSPQQSGQRQQPGQGQQLRQGQQG--QQSGQGQPGYyPTSSQQPGQLQQ-TGQGQQGQQPE- 225
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14182 vnipSAPQPVHPAPNPPVHEFNYPTPPAVPQQPGVLNIPSYPTpvapTPQSPIYIPSQEQPKPTTRPSVINVPSVPQPAY 14261
Cdd:pfam03157   226 ----RGQQGQQPGQGQQPGQGQQGQQPGQPQQLGQGQQGYYPI----SPQQPRQWQQSGQGQQGYYPTSLQQPGQGQSGY 297
                           250       260       270       280       290       300       310       320
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14262 ptpqapvydvnYPTSPsvipHQPGvvnipsvPLPAPPVKQRPVFVPSPVHPTPAPQPGvvnipSVAQPVHP-TYQPPVVE 14340
Cdd:pfam03157   298 -----------YPTSQ----QQAG-------QLQQEQQLGQEQQDQQPGQGRQGQQPG-----QGQQGQQPaQGQQPGQG 350
                           330       340       350       360       370       380       390       400
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14341 RPAiydvYYPPPPSRPGvinipspprpvypvPQQPIYVPApvlhipaprpvihnipSVPQPTYPHRNPPIQDVTYPAPQP 14420
Cdd:pfam03157   351 QPG----YYPTSPQQPG--------------QGQPGYYPT----------------SQQQPQQGQQPEQGQQGQQQGQGQ 396
                           410       420       430       440       450       460       470       480
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14421 SPPVPGIVNIPSLPQPVSTPTSgviniPSQasppisvptpgivnipsipqptPQRPSPGiiNVPSVPQPIPTAPSPGIIN 14500
Cdd:pfam03157   397 QGQQPGQGQQPGQGQPGYYPTS-----PQQ----------------------SGQGQPG--YYPTSPQQSGQGQQPGQGQ 447
                           490       500       510       520       530       540       550       560
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14501 IPSVPQPLPSPTPGVINIPQQPTPPPLVQQPGiinipSVQQPSTPTT-------QHPIQDVQYETQRPQPTPGVINIPSV 14573
Cdd:pfam03157   448 QPGQEQPGQGQQPGQGQQGQQPGQPEQGQQPG-----QGQPGYYPTSpqqsgqgQQLGQWQQQGQGQPGYYPTSPLQPGQ 522
                           570       580       590       600       610       620       630       640
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14574 SQPTYPTQKPSYQDTSYPTVQPKPPVSGIINIPSvPQPVPSLTPGVINLPSEPSYSAPIPKPGIINVPSIPEP--IPSIP 14651
Cdd:pfam03157   523 GQPGYYPTSPQQPGQGQQLGQLQQPTQGQQGQQS-GQGQQGQQPGQGQQGQQPGQGQQGQQPGQGQQPGQGQPgyYPTSP 601
                           650       660       670
                    ....*....|....*....|....*....|....*...
gi 442625924  14652 QNPVQ--EVYHDTQKPQAIPGVVnVPSAPQPTPGRPYY 14687
Cdd:pfam03157   602 QQSGQgqQPGQWQQPGQGQPGYY-PTSSLQLGQGQQGY 638
PRK14950 PRK14950
DNA polymerase III subunits gamma and tau; Provisional
14249-14353 5.25e-06


Pssm-ID: 237864 [Multi-domain]  Cd Length: 585  Bit Score: 54.82  E-value: 5.25e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14249 SVINVPSVPQPAYPTPQAPVydvnyPTSPSVIPHQPGVVNIPSVPLPAPPVKQRPvfVPSPVHPTPAPQPgVVNIPSVAQ 14328
Cdd:PRK14950   358 ALLVPVPAPQPAKPTAAAPS-----PVRPTPAPSTRPKAAAAANIPPKEPVRETA--TPPPVPPRPVAPP-VPHTPESAP 429
                           90       100
                   ....*....|....*....|....*
gi 442625924 14329 PVhPTYQPPVVERPaiydVYYPPPP 14353
Cdd:PRK14950   430 KL-TRAAIPVDEKP----KYTPPAP 449
PLN03209 PLN03209
translocon at the inner envelope of chloroplast subunit 62; Provisional
14194-14496 5.64e-06


Pssm-ID: 178748 [Multi-domain]  Cd Length: 576  Bit Score: 54.55  E-value: 5.64e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14194 APNPPVHEF--NYPTPPAVPQQPGVLNIPSyPTPVAPTPQSPIYIPSQEQPkpttrPSVINVpsVPQPAypTPQAPVYDV 14271
Cdd:PLN03209   311 APLTPMEELlaKIPSQRVPPKESDAADGPK-PVPTKPVTPEAPSPPIEEEP-----PQPKAV--VPRPL--SPYTAYEDL 380
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14272 NYPTSPsvIPHQPGvvnipSVPLPAPPVKQrpvfVPSPVHPTPAPQPGVVniPSVAQpVHPTYQPPVVERPAIYDVYYP- 14350
Cdd:PLN03209   381 KPPTSP--IPTPPS-----SSPASSKSVDA----VAKPAEPDVVPSPGSA--SNVPE-VEPAQVEAKKTRPLSPYARYEd 446
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14351 -PPPSRPGviniPSPPRPVYP-------VPQQPIYVPAPVLHIPAPRPVIHNIPSVPQPTYPHRNPPIQDVtypapqpsp 14422
Cdd:PLN03209   447 lKPPTSPS----PTAPTGVSPsvsstssVPAVPDTAPATAATDAAAPPPANMRPLSPYAVYDDLKPPTSPS--------- 513
                          250       260       270       280       290       300       310
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 442625924 14423 pvpgivniPSLPQPVSTPTSGVINIPSQASPPISVPTPGIVNIPsiPQPTPQRPSPGIINVpsvpQPiPTAPSP 14496
Cdd:PLN03209   514 --------PAAPVGKVAPSSTNEVVKVGNSAPPTALADEQHHAQ--PKPRPLSPYTMYEDL----KP-PTSPTP 572
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
14350-14602 7.82e-06


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 54.11  E-value: 7.82e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14350 PPPPSRPGVINIP---SPPRPVYPVPQQPIYVPAPVLHIPAPRPVIHNIPSVPQPTYPHRNPPIQdvtypapqpsppvpg 14426
Cdd:PRK12323   374 PATAAAAPVAQPApaaAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQ--------------- 438
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14427 ivNIPSLPQPVSTPTSGVINIPSQASPPisvPTPGIVNIPSIPQPTPQRPSPgiinvPSVPQPIPTAPSPGiiniPSVPQ 14506
Cdd:PRK12323   439 --ASARGPGGAPAPAPAPAAAPAAAARP---AAAGPRPVAAAAAAAPARAAP-----AAAPAPADDDPPPW----EELPP 504
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14507 PLPSPTPgvinIPQQPTPPPLVQQPgiINIPSVQQPSTPttqhpiqdvqYETQRPQPTPGVINIPSVSQPTYPTQKPSYQ 14586
Cdd:PRK12323   505 EFASPAP----AQPDAAPAGWVAES--IPDPATADPDDA----------FETLAPAPAAAPAPRAAAATEPVVAPRPPRA 568
                          250       260
                   ....*....|....*....|....*
gi 442625924 14587 ---------DTSYPTVQPKPPVSGI 14602
Cdd:PRK12323   569 sasglpdmfDGDWPALAARLPVRGL 593
DUF4106 pfam13388
Protein of unknown function (DUF4106); This family of proteins is found in large numbers in ...
14457-14566 8.17e-06

Protein of unknown function (DUF4106); This family of proteins is found in large numbers in the Trichomonas vaginalis proteome. The function of these proteins is unknown.


Pssm-ID: 404296  Cd Length: 431  Bit Score: 53.75  E-value: 8.17e-06
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14457 VPTPGIVnIPsiPQPTPQRPSPGIinvpsvPQPIPTAPSPGIINIPSVPQPLPSPTPGVINIPQQPTPPPLVQQPGIINI 14536
Cdd:pfam13388   165 ILASGIY-IP--PNPPREAPAPGL------PKTFTSSHGHRHRHAPKPTVQNPAQQPTVQNPAQQPTQQPTVQNPAQQQN 235
                            90       100       110
                    ....*....|....*....|....*....|
gi 442625924  14537 PSVQQPSTPTTQHPIQDVQyeTQRPQPTPG 14566
Cdd:pfam13388   236 PAQQPPPQPAQQPTVQNPA--QQQPQTEQG 263
EGF_CA smart00179
Calcium-binding EGF-like domain;
255-286 8.29e-06


Pssm-ID: 214542 [Multi-domain]  Cd Length: 39  Bit Score: 46.86  E-value: 8.29e-06
                             10        20        30
                     ....*....|....*....|....*....|..
gi 442625924     255 DVDECSYPNVCGPGAICTNLEGSYRCDCPPGY 286
Cdd:smart00179     1 DIDECASGNPCQNGGTCVNTVGSYRCECPPGY 32
PHA03247 PHA03247
large tegument protein UL36; Provisional
14437-14692 8.68e-06


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 54.56  E-value: 8.68e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14437 VSTPTSGVINIPSqaspPISVPTPGIVNIPSIPQPTPQRPSPgiiNVPSVPQPiPTAPSPGIINIPSVPQPLPSPTPgvi 14516
Cdd:PHA03247   244 ISHPLRGDIAAPA----PPPVVGEGADRAPETARGATGPPPP---PEAAAPNG-AAAPPDGVWGAALAGAPLALPAP--- 312
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14517 nipqqPTPPPlvqqpgiinipsvQQPSTPTTQHPIQDVQYETQRPQPTPGV---INIPSVSQPTYpTQKPSYQDTSYPTV 14593
Cdd:PHA03247   313 -----PDPPP-------------PAPAGDAEEEDDEDGAMEVVSPLPRPRQhypLGFPKRRRPTW-TPPSSLEDLSAGRH 373
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14594 QPK---PPVSGIINIPSVPQPVPSLTPGVINLPSEPSYSAPIPKPGiinVPSIPEPIPSIPQNPVQEVYHDTQKPQAIPg 14670
Cdd:PHA03247   374 HPKrasLPTRKRRSARHAATPFARGPGGDDQTRPAAPVPASVPTPA---PTPVPASAPPPPATPLPSAEPGSDDGPAPP- 449
                          250       260
                   ....*....|....*....|..
gi 442625924 14671 vvnvpsaPQPTPGRPYYDVAKP 14692
Cdd:PHA03247   450 -------PERQPPAPATEPAPD 464
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
14179-14565 9.07e-06

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 54.24  E-value: 9.07e-06
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14179 PGVVNIPSAPQPVHPAPNPPV-HEFNYPTPPAVPQQPgvLNIPSYPTPVAPTPQSPIYIPSQEQPKPTTRPSVinvPSVP 14257
Cdd:pfam09606    90 AGQGTRPQMMGPMGPGPGGPMgQQMGGPGTASNLLAS--LGRPQMPMGGAGFPSQMSRVGRMQPGGQAGGMMQ---PSSG 164
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14258 QPAYPTPQAPVYDV--NYPTSPSVIPHQ--------PGVVNIPSVPLPAPPVKQRPVFVPSPVHPTP-APQPGVVNIPSV 14326
Cdd:pfam09606   165 QPGSGTPNQMGPNGgpGQGQAGGMNGGQqgpmggqmPPQMGVPGMPGPADAGAQMGQQAQANGGMNPqQMGGAPNQVAMQ 244
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14327 AQPVHPTYQPPVVERPAIYDVYYPPppSRPGVINIPSPPRPVYPVPQQPIYVPaPVLHIPAPRPVIHNIPSVPQPTYPHR 14406
Cdd:pfam09606   245 QQQPQQQGQQSQLGMGINQMQQMPQ--GVGGGAGQGGPGQPMGPPGQQPGAMP-NVMSIGDQNNYQQQQTRQQQQQQGGN 321
                           250       260       270       280       290       300       310       320
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14407 NPPIQDVTYPAPQPSPPVPGIVNIPSLPQPVSTPTSGVINI-PSQASPPISVPTPGIVNIPSIPQPTP--QRPSPGIINV 14483
Cdd:pfam09606   322 HPAAHQQQMNQSVGQGGQVVALGGLNHLETWNPGNFGGLGAnPMQRGQPGMMSSPSPVPGQQVRQVTPnqFMRQSPQPSV 401
                           330       340       350       360       370       380       390       400
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14484 PSVPQPI---PTAPSPGIIniPSvPQPLPSPTPGVINIPQQPTPPPLVQQPGIINIP---SVQQPSTPTTQHPIQDvQYE 14557
Cdd:pfam09606   402 PSPQGPGsqpPQSHPGGMI--PS-PALIPSPSPQMSQQPAQQRTIGQDSPGGSLNTPgqsAVNSPLNPQEEQLYRE-KYR 477

                    ....*...
gi 442625924  14558 TQRPQPTP 14565
Cdd:pfam09606   478 QLTKYIEP 485
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
14187-14390 9.62e-06


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 54.22  E-value: 9.62e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14187 APQPVHPAPNPPVHEFNYPTPPAVPQQPGVLNIPSYP----TPVAPTPQS----PIYIPSQEQPKPTTRPSVINVPSvPQ 14258
Cdd:PRK07764   591 APGAAGGEGPPAPASSGPPEEAARPAAPAAPAAPAAPapagAAAAPAEASaapaPGVAAPEHHPKHVAVPDASDGGD-GW 669
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14259 PAYPTPQAPVYDVnyPTSPSVIPHQPGVVNiPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVNIPSVA---QPVHPTYQ 14335
Cdd:PRK07764   670 PAKAGGAAPAAPP--PAPAPAAPAAPAGAA-PAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAAddpVPLPPEPD 746
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....*
gi 442625924 14336 PPVVERPAIYDVYYPPPPSRPGViniPSPPRPVYPVPQQPiyvPAPVLHIPAPRP 14390
Cdd:PRK07764   747 DPPDPAGAPAQPPPPPAPAPAAA---PAAAPPPSPPSEEE---EMAEDDAPSMDD 795
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
14190-14504 1.11e-05

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 53.77  E-value: 1.11e-05
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14190 PVHPAPNPPVHEFNYPTPPAVPQQPGVLNIPSYPT-PVAPTPQSPIYIPSQEQPKPTTRPSVINVPSvPQPAYPTPQAPV 14268
Cdd:pfam05109   425 PESTTTSPTLNTTGFAAPNTTTGLPSSTHVPTNLTaPASTGPTVSTADVTSPTPAGTTSGASPVTPS-PSPRDNGTESKA 503
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14269 YDVNYPTSPSVIPHQPGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVN-IPSVAQPVHPTYQPPVVERPAIYDV 14347
Cdd:pfam05109   504 PDMTSPTSAVTTPTPNATSPTPAVTTPTPNATSPTLGKTSPTSAVTTPTPNATSpTPAVTTPTPNATIPTLGKTSPTSAV 583
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14348 YYPPPPSRPGVINIPSPP-----RPVYPVPQQPIYVPAPVLHIPAPRPVIHNIPSVPQPTYPHR--------NPPIQD-- 14412
Cdd:pfam05109   584 TTPTPNATSPTVGETSPQanttnHTLGGTSSTPVVTSPPKNATSAVTTGQHNITSSSTSSMSLRpssisetlSPSTSDns 663
                           250       260       270       280       290       300       310       320
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14413 VTYPAPQPSPPVPGIVNIPSLpQPVSTPTSGVinipSQASPpisVPTPGIVNIPSIPQPTPQRPSPGIINVPSVPQP--- 14489
Cdd:pfam05109   664 TSHMPLLTSAHPTGGENITQV-TPASTSTHHV----STSSP---APRPGTTSQASGPGNSSTSTKPGEVNVTKGTPPkna 735
                           330
                    ....*....|....*.
gi 442625924  14490 -IPTAPSPGIINIPSV 14504
Cdd:pfam05109   736 tSPQAPSGQKTAVPTV 751
PTZ00449 PTZ00449
104 kDa microneme/rhoptry antigen; Provisional
14237-14597 1.22e-05


Pssm-ID: 185628 [Multi-domain]  Cd Length: 943  Bit Score: 53.93  E-value: 1.22e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14237 PSQEQpKPTTRPSVINVPSVPQ-PAYP-------TPQAPVyDVNYPTSPSViPHQPGVVNIPSVPlPAPPVKQRPVFVPS 14308
Cdd:PTZ00449   563 PAKEH-KPSKIPTLSKKPEFPKdPKHPkdpeepkKPKRPR-SAQRPTRPKS-PKLPELLDIPKSP-KRPESPKSPKRPPP 638
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14309 PVHPTPAPQPGVVNIPSVAQPVH---PTYQPPVVERpaIYDVYYPPPpSRPGVINIPSPPRPVYPVPQQPIYVPAPVLHI 14385
Cdd:PTZ00449   639 PQRPSSPERPEGPKIIKSPKPPKspkPPFDPKFKEK--FYDDYLDAA-AKSKETKTTVVLDESFESILKETLPETPGTPF 715
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14386 PAPRPVIHNIPSvpQPTYPHRnpPIQDvtypapqpsppvpgivniPSLPQPvstptsgvinipsqasPPISVPTPGIVNI 14465
Cdd:PTZ00449   716 TTPRPLPPKLPR--DEEFPFE--PIGD------------------PDAEQP----------------DDIEFFTPPEEER 757
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14466 PSIPQPTPQRPSPGIInVPSVPQPIPTAPSPGiiniPSVPQPLP-SPTpgviniPQQPTPPPlvqqpgiinipsvQQPST 14544
Cdd:PTZ00449   758 TFFHETPADTPLPDIL-AEEFKEEDIHAETGE----PDEAMKRPdSPS------EHEDKPPG-------------DHPSL 813
                          330       340       350       360       370
                   ....*....|....*....|....*....|....*....|....*....|...
gi 442625924 14545 PTTQHPIQDVQYETQRPQPTPGVINIPSVSQPTYPTQKPSYQDTSypTVQPKP 14597
Cdd:PTZ00449   814 PKKRHRLDGLALSTTDLESDAGRIAKDASGKIVKLKRSKSFDDLT--TVEEAE 864
PLN03209 PLN03209
translocon at the inner envelope of chloroplast subunit 62; Provisional
14350-14615 1.34e-05


Pssm-ID: 178748 [Multi-domain]  Cd Length: 576  Bit Score: 53.39  E-value: 1.34e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14350 PPPPSRPGVINIPSPPRPVYPVPQQPIYVPAPvlhiPAPRPVIHNiPSVPQPTYPHRNPPIQDvtypapqpsppvpgIVN 14429
Cdd:PLN03209   329 PPKESDAADGPKPVPTKPVTPEAPSPPIEEEP----PQPKAVVPR-PLSPYTAYEDLKPPTSP--------------IPT 389
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14430 IPSLPQPVSTPTSGVINIPSQASPPISVPTPGIVNIPSIPQPTP-QRP-SPGI----INVPSVPQPIP-TAPSPGIINIP 14502
Cdd:PLN03209   390 PPSSSPASSKSVDAVAKPAEPDVVPSPGSASNVPEVEPAQVEAKkTRPlSPYAryedLKPPTSPSPTApTGVSPSVSSTS 469
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14503 SVPQPLPSPTPGVINIPQQPTPPplvqqpgiinipsvqqPSTPTTQHPIQDVQYETQRPQPTPGVINIPSVSQPTYPTQK 14582
Cdd:PLN03209   470 SVPAVPDTAPATAATDAAAPPPA----------------NMRPLSPYAVYDDLKPPTSPSPAAPVGKVAPSSTNEVVKVG 533
                          250       260       270       280
                   ....*....|....*....|....*....|....*....|...
gi 442625924 14583 PSYQDTSYP----TVQPKP-PVSGI-----INIPSVPQPVPSL 14615
Cdd:PLN03209   534 NSAPPTALAdeqhHAQPKPrPLSPYtmyedLKPPTSPTPSPVL 576
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
255-289 1.44e-05

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.


Pssm-ID: 238011  Cd Length: 38  Bit Score: 46.48  E-value: 1.44e-05
                           10        20        30
                   ....*....|....*....|....*....|....*
gi 442625924   255 DVDECSYPNVCGPGAICTNLEGSYRCDCPPGYDGD 289
Cdd:cd00054      1 DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYTGR 35
PLN03209 PLN03209
translocon at the inner envelope of chloroplast subunit 62; Provisional
13939-14257 1.53e-05


Pssm-ID: 178748 [Multi-domain]  Cd Length: 576  Bit Score: 53.39  E-value: 1.53e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13939 NIPSAPQPiyptpqsPQYNVNYPSPQPAnPQKPGVVNIPSVPQPVYPsPQPpvydVNYPTTPVSQHPGVVNI--PSAPRL 14016
Cdd:PLN03209   322 KIPSQRVP-------PKESDAADGPKPV-PTKPVTPEAPSPPIEEEP-PQP----KAVVPRPLSPYTAYEDLkpPTSPIP 388
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14017 VPPTSQRPvfitSPGNLSPTPQPGVINIPSVSQPGYPTPQSPIYDANYPTTQspipqqpgvvniPSVPSPSYPAPNPPvn 14096
Cdd:PLN03209   389 TPPSSSPA----SSKSVDAVAKPAEPDVVPSPGSASNVPEVEPAQVEAKKTR------------PLSPYARYEDLKPP-- 450
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14097 ypTQPSPQIPVQPGVINIPSAPLPTTPPQHPPvfipspespspapkpgVINIPSVTHPE---YPTSQVPVYDVNYSTTpS 14173
Cdd:PLN03209   451 --TSPSPTAPTGVSPSVSSTSSVPAVPDTAPA----------------TAATDAAAPPPanmRPLSPYAVYDDLKPPT-S 511
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14174 PIPQKPGVVNIPSAPQPVHPAPNPPVhefnyPTPPAVPQQpgvlNIPSYPTPVAPTpqsPIYipsqEQPKPTTRPSvinv 14253
Cdd:PLN03209   512 PSPAAPVGKVAPSSTNEVVKVGNSAP-----PTALADEQH----HAQPKPRPLSPY---TMY----EDLKPPTSPT---- 571

                   ....
gi 442625924 14254 PSVP 14257
Cdd:PLN03209   572 PSPV 575
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
13904-14094 1.59e-05


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 53.34  E-value: 1.59e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13904 TPKPVRPQIYDTPSPPYPVAIPdlvyvqqQQPGIVNIPSAPQPIYPTPQSP---------QYNVNYPSPQPANPQKPGVV 13974
Cdd:PRK12323   385 PAPAAAAPAAAAPAPAAPPAAP-------AAAPAAAAAARAVAAAPARRSPapealaaarQASARGPGGAPAPAPAPAAA 457
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13975 NIPSVPQPVYPSPQPPVYDvnyPTTPVSQHPGVVNIPsAPRLVPPTSQRPVFITSPGNLSPTPQPGVINIPSVSQPGYPT 14054
Cdd:PRK12323   458 PAAAARPAAAGPRPVAAAA---AAAPARAAPAAAPAP-ADDDPPPWEELPPEFASPAPAQPDAAPAGWVAESIPDPATAD 533
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|
gi 442625924 14055 PQSPIYDANYPTTQSPIPQqpgvvniPSVPSPSYPAPNPP 14094
Cdd:PRK12323   534 PDDAFETLAPAPAAAPAPR-------AAAATEPVVAPRPP 566
PRK14950 PRK14950
DNA polymerase III subunits gamma and tau; Provisional
14223-14318 1.63e-05

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237864 [Multi-domain]  Cd Length: 585  Bit Score: 53.27  E-value: 1.63e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14223 PTPVAPTPQSPIYIPSQEQPKPTTRPSVInvpsvpqPAYPTPQAPVydVNYPTSPSVIPHQPGVVNIPSVPLPAPPVKQR 14302
Cdd:PRK14950   364 PAPQPAKPTAAAPSPVRPTPAPSTRPKAA-------AAANIPPKEP--VRETATPPPVPPRPVAPPVPHTPESAPKLTRA 434
                           90
                   ....*....|....*..
gi 442625924 14303 PVFVP-SPVHPTPAPQP 14318
Cdd:PRK14950   435 AIPVDeKPKYTPPAPPK 451
PRK12727 PRK12727
flagellar biosynthesis protein FlhF;
14178-14389 2.12e-05

flagellar biosynthesis protein FlhF;


Pssm-ID: 237182 [Multi-domain]  Cd Length: 559  Bit Score: 52.68  E-value: 2.12e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14178 KPGVVNIPSAPQPVHPAPNPPvhefnyPTPPAVPQQPGVLNIPSYPTPVAPTPQSPIYIPSQEQPKPTTRPSVINVPsVP 14257
Cdd:PRK12727    59 RSDTPATAAAPAPAPQAPTKP------AAPVHAPLKLSANANMSQRQRVASAAEDMIAAMALRQPVSVPRQAPAAAP-VR 131
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14258 QPAYPTP----QAPVYDVNYPTSP----SVIPHQPGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVnipSVAQp 14329
Cdd:PRK12727   132 AASIPSPaaqaLAHAAAVRTAPRQehalSAVPEQLFADFLTTAPVPRAPVQAPVVAAPAPVPAIAAALAAHA---AYAQ- 207
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14330 vHPTYQppvvERPAIYDVYYPPPPSRPgviniPSPPRPVYPVPQQPIYVPAPVLHIPAPR 14389
Cdd:PRK12727   208 -DDDEQ----LDDDGFDLDDALPQILP-----PAALPPIVVAPAAPAALAAVAAAAPAPQ 257
PRK07003 PRK07003
DNA polymerase III subunit gamma/tau;
14227-14531 2.35e-05

DNA polymerase III subunit gamma/tau;


Pssm-ID: 235906 [Multi-domain]  Cd Length: 830  Bit Score: 52.93  E-value: 2.35e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14227 APTPQSPIYIPSQeQPKPTTRPsvinvPSVPQPAYPTPQAPVydvnyPTSPSVIPHQPGVVNIPSVPLPAPPVKQRPV-- 14304
Cdd:PRK07003   367 APGGGVPARVAGA-VPAPGARA-----AAAVGASAVPAVTAV-----TGAAGAALAPKAAAAAAATRAEAPPAAPAPPat 435
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14305 ---FVPSPVHPTPAPQPGVVNIPSVAQPVHPTYQPPVVERPAIydvyYPPPPSRPGVINIPSPP----RPVYPVPQQPIY 14377
Cdd:PRK07003   436 adrGDDAADGDAPVPAKANARASADSRCDERDAQPPADSGSAS----APASDAPPDAAFEPAPRaaapSAATPAAVPDAR 511
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14378 VPAPVLHIPAPRPVIHNIPSVPQPTYPHRNPPIQ--------DVTYPA----PQPSPPVPGIVNIPSLPQPVSTPtsgvi 14445
Cdd:PRK07003   512 APAAASREDAPAAAAPPAPEARPPTPAAAAPAARaggaaaalDVLRNAgmrvSSDRGARAAAAAKPAAAPAAAPK----- 586
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14446 niPSQASPPISVPTPGIVNIPSIPQPTPQRPSPGIINVPSVPQP---IPT-------------APSPGII--------NI 14501
Cdd:PRK07003   587 --PAAPRVAVQVPTPRARAATGDAPPNGAARAEQAAESRGAPPPwedIPPddyvplsadegfgGPDDGFVpvfdsgpdDV 664
                          330       340       350
                   ....*....|....*....|....*....|
gi 442625924 14502 PSVPQPLPSPTPGViniPQQPTPPPLVQQP 14531
Cdd:PRK07003   665 RVAPKPADAPAPPV---DTRPLPPAIPLDA 691
PLN03209 PLN03209
translocon at the inner envelope of chloroplast subunit 62; Provisional
14429-14682 2.57e-05

translocon at the inner envelope of chloroplast subunit 62; Provisional


Pssm-ID: 178748 [Multi-domain]  Cd Length: 576  Bit Score: 52.62  E-value: 2.57e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14429 NIPS--LPQPVSTPTSGVINIPSQASPPiSVPTPGIVNIPSIPQPTPQRP-SPGIINVPSVPqpiPTAPSPgiiNIPSVP 14505
Cdd:PLN03209   322 KIPSqrVPPKESDAADGPKPVPTKPVTP-EAPSPPIEEEPPQPKAVVPRPlSPYTAYEDLKP---PTSPIP---TPPSSS 394
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14506 QPLPSPTPGViNIPQQPTPPPLVQQPgiINIPSVQQPSTPT-TQHPIQD-VQYETQRP----QPTPGVINIPSVSQPTYP 14579
Cdd:PLN03209   395 PASSKSVDAV-AKPAEPDVVPSPGSA--SNVPEVEPAQVEAkKTRPLSPyARYEDLKPptspSPTAPTGVSPSVSSTSSV 471
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14580 TQKP-------SYQDTSYPTVQPKP----PVSGIINIPSVPQPV-------PSLTPGVINLPSEPSYSAPIPKPGIINvp 14641
Cdd:PLN03209   472 PAVPdtapataATDAAAPPPANMRPlspyAVYDDLKPPTSPSPAapvgkvaPSSTNEVVKVGNSAPPTALADEQHHAQ-- 549
                          250       260       270       280
                   ....*....|....*....|....*....|....*....|.
gi 442625924 14642 siPEPIPSIPQNpvqeVYHDTqKPqaipgvvnvPSAPQPTP 14682
Cdd:PLN03209   550 --PKPRPLSPYT----MYEDL-KP---------PTSPTPSP 574
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
14283-14533 2.81e-05

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 52.57  E-value: 2.81e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14283 QPGVVNIPSVPlpaPPVKQRPVfvpspVHPTPAPQPGVVNIPSVAQPVHPTYQPPVVERPAiydvyyPPPPSRPGViniP 14362
Cdd:PRK12323   364 RPGQSGGGAGP---ATAAAAPV-----AQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAA------RAVAAAPAR---R 426
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14363 SPPRPVYPVPQQPIYVPAPVLHIPAPRPvihniPSVPQPTYPhrnPPIQDVtypapqpSPPVPGIVNIPSLPQPVSTPTS 14442
Cdd:PRK12323   427 SPAPEALAAARQASARGPGGAPAPAPAP-----AAAPAAAAR---PAAAGP-------RPVAAAAAAAPARAAPAAAPAP 491
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14443 GVINIPSQASPPISVPTPGivnipsipqPTPQRPSPGIINVPSVPQPIPTAPSPGIINIPSVPQPLPSPTPGVINIPQQP 14522
Cdd:PRK12323   492 ADDDPPPWEELPPEFASPA---------PAQPDAAPAGWVAESIPDPATADPDDAFETLAPAPAAAPAPRAAAATEPVVA 562
                          250
                   ....*....|.
gi 442625924 14523 TPPPLVQQPGI 14533
Cdd:PRK12323   563 PRPPRASASGL 573
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
14230-14381 3.06e-05

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 52.41  E-value: 3.06e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14230 PQSPIYIPSQEQPKPTTRPSVINVPSVPQPAYPTPQAPVydvnyptspsviPHQPGVVNIPSVPLPAPPvkQRPVFVPSP 14309
Cdd:PRK14951   366 PAAAAEAAAPAEKKTPARPEAAAPAAAPVAQAAAAPAPA------------AAPAAAASAPAAPPAAAP--PAPVAAPAA 431
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 442625924 14310 VHPTPAPQPGVVnipSVAQPVHPTYQPPvvERPAIYDVYYPPPPSrpgvinIPSPPRPVYPVPQQPIYVPAP 14381
Cdd:PRK14951   432 AAPAAAPAAAPA---AVALAPAPPAQAA--PETVAIPVRVAPEPA------VASAAPAPAAAPAAARLTPTE 492
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
338-373 3.22e-05

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.


Pssm-ID: 238011  Cd Length: 38  Bit Score: 45.32  E-value: 3.22e-05
                           10        20        30
                   ....*....|....*....|....*....|....*.
gi 442625924   338 DVDECATNNPCGLGAECVNLGGSFQCRCPSGFVLEH 373
Cdd:cd00054      1 DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYTGRN 36
dnaA PRK14086
chromosomal replication initiator protein DnaA;
14182-14408 3.39e-05

chromosomal replication initiator protein DnaA;


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 52.14  E-value: 3.39e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14182 VNIPSAPQPVHPAPNPPVHE-FNYPTPPAVPQQPgvlnIPSYPTPVA-PTPQSPiyipsqeqPKPTTRPSvinvpsvPQP 14259
Cdd:PRK14086    84 IAITVDPSAGEPAPPPPHARrTSEPELPRPGRRP----YEGYGGPRAdDRPPGL--------PRQDQLPT-------ARP 144
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14260 AYPTPQAPVYDVNYPTSPSVIPHQPGVVNIPSVPLPAPPVKQRPvfvpspvHPTPAPQPGVVNIPSVAQPVHPTYQP-PV 14338
Cdd:PRK14086   145 AYPAYQQRPEPGAWPRAADDYGWQQQRLGFPPRAPYASPASYAP-------EQERDREPYDAGRPEYDQRRRDYDHPrPD 217
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 442625924 14339 VERPAIYDVYYPPPPsrPGVINipsPPRPVyPVPQQPIYVPAPVLHIPAPRPVIHNIPSVPQPTYPHR--NP 14408
Cdd:PRK14086   218 WDRPRRDRTDRPEPP--PGAGH---VHRGG-PGPPERDDAPVVPIRPSAPGPLAAQPAPAPGPGEPTArlNP 283
PTZ00449 PTZ00449
104 kDa microneme/rhoptry antigen; Provisional
14057-14366 3.39e-05

104 kDa microneme/rhoptry antigen; Provisional


Pssm-ID: 185628 [Multi-domain]  Cd Length: 943  Bit Score: 52.38  E-value: 3.39e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14057 SPIYDANYPTTQSPIPQQ-------PGVVNIPSVP----SPSYP----APNPPVNyPTQP-SPQIPVQPGVINIPSAP-- 14118
Cdd:PTZ00449   548 KPGETKEGEVGKKPGPAKehkpskiPTLSKKPEFPkdpkHPKDPeepkKPKRPRS-AQRPtRPKSPKLPELLDIPKSPkr 626
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14119 --LPTTPPQHPPvfipspespspapkpgvinipsvthPEYPTSqvpvydvnysttpspiPQKPGVVNIPSAPQPvhpapn 14196
Cdd:PTZ00449   627 peSPKSPKRPPP-------------------------PQRPSS----------------PERPEGPKIIKSPKP------ 659
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14197 ppvhefnyPTPPAVPQQPGVLN--IPSYPTPVAPTPQSPIYIPSQEQPKPTTRPSVINVPSVPQPAyPTPQAPVydvnYP 14274
Cdd:PTZ00449   660 --------PKSPKPPFDPKFKEkfYDDYLDAAAKSKETKTTVVLDESFESILKETLPETPGTPFTT-PRPLPPK----LP 726
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14275 TSPSvIPHQPgvVNIPSVPLP------APPVKQRPVFvpspvHPTPA--PQPGVVNIPSVAQPVHPTYQPP--VVERPAI 14344
Cdd:PTZ00449   727 RDEE-FPFEP--IGDPDAEQPddieffTPPEEERTFF-----HETPAdtPLPDILAEEFKEEDIHAETGEPdeAMKRPDS 798
                          330       340
                   ....*....|....*....|..
gi 442625924 14345 YDVYYPPPPSrpgviNIPSPPR 14366
Cdd:PTZ00449   799 PSEHEDKPPG-----DHPSLPK 815
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
14185-14412 3.78e-05

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 52.30  E-value: 3.78e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14185 PSAPQPVHPAPNPPVHEFNYPTPPAVPQqpgvlnipsyptPVAPTPQSPiyipsqeqPKPTTRPSVINVPSVPQPAYPTP 14264
Cdd:PRK07764   593 GAAGGEGPPAPASSGPPEEAARPAAPAA------------PAAPAAPAP--------AGAAAAPAEASAAPAPGVAAPEH 652
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14265 QAPVYDVNYPTSPSVIPHQPGVVNIPSVPLPAP-PVKQRPVFVPSPVHPTPAPQPGVVNIPSVAQPVHPTYQPPVVERPA 14343
Cdd:PRK07764   653 HPKHVAVPDASDGGDGWPAKAGGAAPAAPPPAPaPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPS 732
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 442625924 14344 IYDVYYPPPPSRPGVINIPSPPRPVYPVPQQPIYVPAPVLHIPAPRPVIHNiPSVPQPTYPHRNPPIQD 14412
Cdd:PRK07764   733 PAADDPVPLPPEPDDPPDPAGAPAQPPPPPAPAPAAAPAAAPPPSPPSEEE-EMAEDDAPSMDDEDRRD 800
Gag_spuma pfam03276
Spumavirus gag protein;
14473-14671 4.11e-05

Spumavirus gag protein;


Pssm-ID: 460872 [Multi-domain]  Cd Length: 614  Bit Score: 51.67  E-value: 4.11e-05
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14473 PQRPSPGIINVPSVPQPIPTAPSPgiiNIP-SVPQPLPsPTPGVINIPQQ----PTPPPLVQQPGiinipsvqqpstptt 14547
Cdd:pfam03276   196 PSLPAIGGIHLPAIPGIHARAPPG---NIArSLGDDIM-PSLGDAGMPQPrfafHPGNPFAEAEG--------------- 256
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14548 qHPIQDVQYETQRPQPTPGVINIPSVSQPtyptqkpsyqdtsyPTVQPKPPvsgiinipSVPQPVPSLTPgvinlpsePS 14627
Cdd:pfam03276   257 -HPFAEAEGERPRDIPRAPRIDAPSAPAI--------------PAIQPIAP--------PMIPPIGAPIP--------IP 305
                           170       180       190       200
                    ....*....|....*....|....*....|....*....|....
gi 442625924  14628 YSAPIPKPGIINVPSIPepipsiPQNPVQEVYHDTQKPQAIPGV 14671
Cdd:pfam03276   306 HGASIPGEHIRNPREEP------IRLGREAPAIDGRFAPAIDDL 343
Tymo_45kd_70kd pfam03251
Tymovirus 45/70Kd protein; Tymoviruses are single stranded RNA viruses. This family includes a ...
13905-14318 4.60e-05

Tymovirus 45/70Kd protein; Tymoviruses are single stranded RNA viruses. This family includes a protein of unknown function that has been named based on its molecular weight. Tymoviruses such as the ononis yellow mosaic tymovirus encode only three proteins. Two of these overlap: this protein overlaps a larger ORF that is thought to be the polymerase.


Pssm-ID: 281269 [Multi-domain]  Cd Length: 468  Bit Score: 51.33  E-value: 4.60e-05
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  13905 PKPVRPQIYDTPSPPYPVAIP----------DLVYVQQQQPGIVNIpSAPQPIYPTPQ---SPQYNVNYPS--PQPANPQ 13969
Cdd:pfam03251    67 PPPRRPQDNRDFSPLHPLVFPghhsqlrhvhETQQVQQTCPGKLKL-SGAEELPPAPQrqhSLPLHITRPSrfPHHFHAR 145
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  13970 KPGVvnIPSVPQpvypspQPPVYDVNYPTTPVSQHPGVVNIPS-APRLVPPTSQrpvFITSPGNLSPTPQpgviniPSVS 14048
Cdd:pfam03251   146 RPDV--LPSVPD------HGPVLTETKPRTSVRQPRSATRGPSfRPILLPKVVH---VHDDPPHSSLRPR------GSRS 208
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14049 QPGYPTPQSPIYDANypttQSPIPQQPGvvniPSVPSPSYPAPNPPVNYPTQPSPQIPVQPGVINI----PSAPLPTTPP 14124
Cdd:pfam03251   209 RQLQPTVRRPLLAPN----QFHSPRQPP----PLSDDPGILGPRPLAPHSTRDPPPRPITPGPSNThdlrPLSVLPRTSP 280
                           250       260       270       280       290       300       310       320
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14125 QHPPvfipspespspapkpgvinIPSVTHPEYPTSQVPVYDVNYSTTPSPIPQKPgVVNIPSAPQPVHPAPNPPVHEFNY 14204
Cdd:pfam03251   281 RRGL-------------------LPNPRRHRTSTGHIPPTTTSRPTGPPSRLQRP-VHLYQSSPHTPNFRPSSIRKDALL 340
                           330       340       350       360       370       380       390       400
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14205 PTPPAVPQQPGvLNIPSYPTPVAPTPQSPIYIPSQEQPK--PTTRPSVINVPSV----PQPAYPTPQAPVYDVNYPTSPS 14278
Cdd:pfam03251   341 QTGPRLGHLER-LGQPANLRTSERSPPTKRRLPRSSEPNrlPKPLPEATLAPSYrhrrPYPLLPNPPAALPSIAYTSSRG 419
                           410       420       430       440
                    ....*....|....*....|....*....|....*....|
gi 442625924  14279 VIPHQPGVVNIPSVPLPAPPVKQrpvfvpspvhPTPAPQP 14318
Cdd:pfam03251   420 KIHHSLPKGALPKEGAPPPPRRL----------PSPAPRP 449
Amelogenin smart00818
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ...
14220-14379 5.76e-05

Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.


Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 48.25  E-value: 5.76e-05
                             10        20        30        40        50        60        70        80
                     ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924   14220 PSYP-TPVAPTPQSPIYIPSQEQPKPTTRPSVINVPSVPQPAYPTPQAPVYDVnyPTSPSVIPHQPGVVNIPsvplpaPP 14298
Cdd:smart00818    24 PSYGyEPMGGWLHHQIIPVSQQHPPTHTLQPHHHIPVLPAQQPVVPQQPLMPV--PGQHSMTPTQHHQPNLP------QP 95
                             90       100       110       120       130       140       150       160
                     ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924   14299 VKQrpvfvpsPVHPTPAPQPgvvnipsvaQPVHPTYQPPVVErpaiydvyyPPPPSRPgviniPSPPRPVYPVPQQPIYV 14378
Cdd:smart00818    96 AQQ-------PFQPQPLQPP---------QPQQPMQPQPPVH---------PIPPLPP-----QPPLPPMFPMQPLPPLL 145

                     .
gi 442625924   14379 P 14379
Cdd:smart00818   146 P 146
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
14274-14486 5.77e-05

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 51.42  E-value: 5.77e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14274 PTSPSVIPhQPGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVNIPSVAQPVhPTYQPPVVERPAIYDVYYPPPP 14353
Cdd:PRK12323   374 PATAAAAP-VAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPA-PEALAAARQASARGPGGAPAPA 451
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14354 SRPGVINIPSPPRPVYPVPqqpiyvPAPVLHIPAPRPVIHNIPSVPQPTYPhrnPPIQDVtypapQPSPPVPGIVNIPSL 14433
Cdd:PRK12323   452 PAPAAAPAAAARPAAAGPR------PVAAAAAAAPARAAPAAAPAPADDDP---PPWEEL-----PPEFASPAPAQPDAA 517
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14434 PQPV---STPTSGVIN----IPSQASPPISVPTPgIVNIPSIPQPTPQRPSPGIINVPSV 14486
Cdd:PRK12323   518 PAGWvaeSIPDPATADpddaFETLAPAPAAAPAP-RAAAATEPVVAPRPPRASASGLPDM 576
Gag_spuma pfam03276
Spumavirus gag protein;
14284-14412 6.22e-05

Spumavirus gag protein;


Pssm-ID: 460872 [Multi-domain]  Cd Length: 614  Bit Score: 51.29  E-value: 6.22e-05
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14284 PGVVNIPSVPLPAPPvkqrPVFVPSPVHPTPAPQPGvvNIP---SVAQPVHPTY----QPPVVE----RPAIYDVYYPPP 14352
Cdd:pfam03276   196 PSLPAIGGIHLPAIP----GIHARAPPGNIARSLGD--DIMpslGDAGMPQPRFafhpGNPFAEaeghPFAEAEGERPRD 269
                            90       100       110       120       130       140
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 442625924  14353 PSRPGVINIPSPPRPVYPVPQQPiyVPAPVLHIPAPRPVIHNIPSVP------QPTYPHRNPPIQD 14412
Cdd:pfam03276   270 IPRAPRIDAPSAPAIPAIQPIAP--PMIPPIGAPIPIPHGASIPGEHirnpreEPIRLGREAPAID 333
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
13903-14082 6.32e-05

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 51.58  E-value: 6.32e-05
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  13903 ETPKPVRPQIYDTPSPpyPVAIPDLVYVQQQQpgivnipsaPQPIYPTPQSPQYNVNYPSPQPANPQKPGVVNIPSVPQP 13982
Cdd:pfam09770   206 QAKKPAQQPAPAPAQP--PAAPPAQQAQQQQQ---------FPPQIQQQQQPQQQPQQPQQHPGQGHPVTILQRPQSPQP 274
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  13983 VYPSPQPPvydvnypttPVSQhpgvvnipSAPRLVPPTSQRPVFIT-SPGNLSPTPQPGVINIPSVSQPGYPTPQSPiyd 14061
Cdd:pfam09770   275 DPAQPSIQ---------PQAQ--------QFHQQPPPVPVQPTQILqNPNRLSAARVGYPQNPQPGVQPAPAHQAHR--- 334
                           170       180
                    ....*....|....*....|.
gi 442625924  14062 anyptTQSPIPQQPGVVNIPS 14082
Cdd:pfam09770   335 -----QQGSFGRQAPIITHPQ 350
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
13883-14127 6.39e-05

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 51.46  E-value: 6.39e-05
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  13883 HSPICYCISSHTGdPFTRCYETPKPVRPQIYDTPSPPYPVAIPDLVYVQQQQPGIVNIPSAPQPIYPTPQSPQYNVNYPS 13962
Cdd:pfam05109   453 HVPTNLTAPASTG-PTVSTADVTSPTPAGTTSGASPVTPSPSPRDNGTESKAPDMTSPTSAVTTPTPNATSPTPAVTTPT 531
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  13963 PQPANPQ--KPGVVNIPSVPQPVYPSPQP----PVYDVNYPTTPVSQHPGVVNIPSaPRLVPP----------------- 14019
Cdd:pfam05109   532 PNATSPTlgKTSPTSAVTTPTPNATSPTPavttPTPNATIPTLGKTSPTSAVTTPT-PNATSPtvgetspqanttnhtlg 610
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14020 -TSQRPVFITSPGNLSPTPQPGVINI------------PSVSQPGYP------TPQSPIYDANYPTTQSPIPQ-QPGVVN 14079
Cdd:pfam05109   611 gTSSTPVVTSPPKNATSAVTTGQHNItssstssmslrpSSISETLSPstsdnsTSHMPLLTSAHPTGGENITQvTPASTS 690
                           250       260       270       280       290
                    ....*....|....*....|....*....|....*....|....*....|..
gi 442625924  14080 IPSVpSPSYPAPNP-PVNYPTQP-SPQIPVQPGVINIP--SAPLPTTPPQHP 14127
Cdd:pfam05109   691 THHV-STSSPAPRPgTTSQASGPgNSSTSTKPGEVNVTkgTPPKNATSPQAP 741
Amelogenin smart00818
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ...
14274-14401 6.39e-05

Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.


Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 48.25  E-value: 6.39e-05
                             10        20        30        40        50        60        70        80
                     ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924   14274 PTSPSVIPHQ--PGVVNIPSVPLPAPPVKQRPVfVPSPVHPTPAPQPGvvNIPSVAQPVHPTYQPPVVErpaiydvyyPP 14351
Cdd:smart00818    41 PVSQQHPPTHtlQPHHHIPVLPAQQPVVPQQPL-MPVPGQHSMTPTQH--HQPNLPQPAQQPFQPQPLQ---------PP 108
                             90       100       110       120       130
                     ....*....|....*....|....*....|....*....|....*....|
gi 442625924   14352 PPSRPgvINIPSPPRPVYPVPQQPiyVPAPVLHIPAPRPVIHNIPSVPQP 14401
Cdd:smart00818   109 QPQQP--MQPQPPVHPIPPLPPQP--PLPPMFPMQPLPPLLPDLPLEAWP 154
PLN03209 PLN03209
translocon at the inner envelope of chloroplast subunit 62; Provisional
13997-14278 6.60e-05

translocon at the inner envelope of chloroplast subunit 62; Provisional


Pssm-ID: 178748 [Multi-domain]  Cd Length: 576  Bit Score: 51.08  E-value: 6.60e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13997 PTTPVSQHpgVVNIPSapRLVPPtsQRPVFITSPgnlSPTPQPGVINIPSVSQPGYPTPQspiydanyPTTQSPIPQQPG 14076
Cdd:PLN03209   312 PLTPMEEL--LAKIPS--QRVPP--KESDAADGP---KPVPTKPVTPEAPSPPIEEEPPQ--------PKAVVPRPLSPY 374
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14077 VVNIPSVPsPSYPAPNPPVNYPTQPSP----QIPVQPGVIniPSAPLPTTPPQHPPvfipspespspapkpgvinIPSVT 14152
Cdd:PLN03209   375 TAYEDLKP-PTSPIPTPPSSSPASSKSvdavAKPAEPDVV--PSPGSASNVPEVEP-------------------AQVEA 432
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14153 HPEYPTSQVPVY-DVNYSTTPSPIPQKPGVVNIPSAP----QPVHPAPNPPVHEFNYPTPPAVPQQPGV----LNIPSYP 14223
Cdd:PLN03209   433 KKTRPLSPYARYeDLKPPTSPSPTAPTGVSPSVSSTSsvpaVPDTAPATAATDAAAPPPANMRPLSPYAvyddLKPPTSP 512
                          250       260       270       280       290       300
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 442625924 14224 TPVAPTPQSPiyiPSQEQPKPTTRPSVINVPSV-------PQPAYPTPQAPVYDVNYPTSPS 14278
Cdd:PLN03209   513 SPAAPVGKVA---PSSTNEVVKVGNSAPPTALAdeqhhaqPKPRPLSPYTMYEDLKPPTSPT 571
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
14502-14706 6.61e-05

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 51.31  E-value: 6.61e-05
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14502 PSVPQPLPSPTPGVINIPQQ--PTPPPLVQQPGiiniPSVQQPSTPTTQHPiqdvqyETQRPQPTPGVINIPSVSQPTyp 14579
Cdd:pfam03154   146 PSIPSPQDNESDSDSSAQQQilQTQPPVLQAQS----GAASPPSPPPPGTT------QAATAGPTPSAPSVPPQGSPA-- 213
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14580 tqkpsyqdTSYPTVQPKPPVSGIINIPSVPQPVPSltpgviNLPSEPSYSAPIPKPgiinvpsiPEPIPSIPQNPVQEVY 14659
Cdd:pfam03154   214 --------TSQPPNQTQSTAAPHTLIQQTPTLHPQ------RLPSPHPPLQPMTQP--------PPPSQVSPQPLPQPSL 271
                           170       180       190       200
                    ....*....|....*....|....*....|....*....|....*..
gi 442625924  14660 HDTQKPQAIPGVVNVPSAPQPTPGRPYYDVAKPDFEFNPCYPSPCGP 14706
Cdd:pfam03154   272 HGQMPPMPHSLQTGPSHMQHPVPPQPFPLTPQSSQSQVPPGPSPAAP 318
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
14436-14630 6.61e-05

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 51.42  E-value: 6.61e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14436 PVSTPTsgviniPSQASPPISVPTPGIVNIPSIPQPTPQRPSPGIINVPSVPQPIPTAPSPGIINIPSVPQPLPSPTPGV 14515
Cdd:PRK12323   381 PVAQPA------PAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARGPGGAPAPAPAP 454
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14516 INIPQQPTPPPLVQQPGiiniPSVQQPSTPTTQHPIQDVQYETQRPQP---TPGVINIPSVSQP--------TYPTQKPS 14584
Cdd:PRK12323   455 AAAPAAAARPAAAGPRP----VAAAAAAAPARAAPAAAPAPADDDPPPweeLPPEFASPAPAQPdaapagwvAESIPDPA 530
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*...
gi 442625924 14585 YQDTS--YPTVQPKPPVSGIINIPSVPQPVPSLTPGVINLPSEPSYSA 14630
Cdd:PRK12323   531 TADPDdaFETLAPAPAAAPAPRAAAATEPVVAPRPPRASASGLPDMFD 578
rne PRK10811
ribonuclease E; Reviewed
14061-14305 7.32e-05

ribonuclease E; Reviewed


Pssm-ID: 236766 [Multi-domain]  Cd Length: 1068  Bit Score: 51.19  E-value: 7.32e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14061 DANYPTtQSPIPQQPGVVnipsvpSP---------SYPAPNP--PVNYPTQPSPQIPVQPGVINIPSAPLPTTPPQHPPV 14129
Cdd:PRK10811   816 DERYPT-QSPMPLTVACA------SPemasgkvwiRYPVVRPqdVQVEEQREAEEVQVQPVVAEVPVAAAVEPVVSAPVV 888
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14130 fipspespspAPKPGVINIPSVTHPEYPTSQVPVYDVNYSTTPSPIPQKPGVVNIPSAPQPVHPAPNP-PVHEFNYPTPP 14208
Cdd:PRK10811   889 ----------EAVAEVVEEPVVVAEPQPEEVVVVETTHPEVIAAPVTEQPQVITESDVAVAQEVAEHAePVVEPQDETAD 958
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14209 AVPQQPGVLNIPSYPTPVAPTPQSPIYIPSQEQPKPTTrpsvinVPSVPQPAyPTPQAPVYdVNYPTSPSVIPHQPGVVn 14288
Cdd:PRK10811   959 IEEAAETAEVVVAEPEVVAQPAAPVVAEVAAEVETVTA------VEPEVAPA-QVPEATVE-HNHATAPMTRAPAPEYV- 1029
                          250       260
                   ....*....|....*....|
gi 442625924 14289 ipsvplPAPPVK---QRPVF 14305
Cdd:PRK10811  1030 ------PEAPRHsdwQRPTF 1043
PRK14950 PRK14950
DNA polymerase III subunits gamma and tau; Provisional
14165-14304 7.46e-05

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237864 [Multi-domain]  Cd Length: 585  Bit Score: 50.96  E-value: 7.46e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14165 DVNYSTTPSP-IPQKPGVVNIPSAPQPVhPAPNPPvhefNYPTPPAVPQQPGVLNIPSYPTPVAPTPQSPIyipSQEQPK 14243
Cdd:PRK14950   338 DFQLRTTSYGqLPLELAVIEALLVPVPA-PQPAKP----TAAAPSPVRPTPAPSTRPKAAAAANIPPKEPV---RETATP 409
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 442625924 14244 PTTRPSVINVPSVPQPayptPQAPvydvnyPTSPSVIPhqpgVVNIPSVPLPAPPVKQRPV 14304
Cdd:PRK14950   410 PPVPPRPVAPPVPHTP----ESAP------KLTRAAIP----VDEKPKYTPPAPPKEEEKA 456
EGF_CA smart00179
Calcium-binding EGF-like domain;
338-369 7.53e-05

Calcium-binding EGF-like domain;


Pssm-ID: 214542 [Multi-domain]  Cd Length: 39  Bit Score: 44.16  E-value: 7.53e-05
                             10        20        30
                     ....*....|....*....|....*....|..
gi 442625924     338 DVDECATNNPCGLGAECVNLGGSFQCRCPSGF 369
Cdd:smart00179     1 DIDECASGNPCQNGGTCVNTVGSYRCECPPGY 32
PRK14950 PRK14950
DNA polymerase III subunits gamma and tau; Provisional
14454-14565 8.92e-05

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237864 [Multi-domain]  Cd Length: 585  Bit Score: 50.58  E-value: 8.92e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14454 PISVPTPGIvniPSIPQPTPQRPSPGIINVPSVPQPIPTAPSpgiiniPSVPQPLPSPTPgviniPQQPTPPPLVQQPgi 14533
Cdd:PRK14950   362 PVPAPQPAK---PTAAAPSPVRPTPAPSTRPKAAAAANIPPK------EPVRETATPPPV-----PPRPVAPPVPHTP-- 425
                           90       100       110
                   ....*....|....*....|....*....|..
gi 442625924 14534 iniPSVqqPSTPTTQHPIqDVQYETQRPQPTP 14565
Cdd:PRK14950   426 ---ESA--PKLTRAAIPV-DEKPKYTPPAPPK 451
PBP1 COG5180
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ...
13903-14128 9.23e-05

PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];


Pssm-ID: 444064 [Multi-domain]  Cd Length: 548  Bit Score: 50.45  E-value: 9.23e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13903 ETPKPVRPQIYDTPSPPYPVAIPDLVYvQQQQPGIVNIPSAPQPIYPTPQSPQYNVNYP---------SPQPANP--QKP 13971
Cdd:COG5180    274 AAEPPGLPVLEAGSEPQSDAPEAETAR-PIDVKGVASAPPATRPVRPPGGARDPGTPRPgqpterpagVPEAASDagQPP 352
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13972 GVVNIPSVPQPVYPSPQ--PPVYDVNYPTTPV----------SQHPGVVN-IPSAPRLVPPTSQRPVFIT-------SPG 14031
Cdd:COG5180    353 SAYPPAEEAVPGKPLEQgaPRPGSSGGDGAPFqppngapqpgLGRRGAPGpPMGAGDLVQAALDGGGRETaslggaaGGA 432
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14032 NLSPTPQPGVINIPSVSQPGYPTPQSPIydanyptTQSPIPQQPGVV--NIPSVPSPSYPAPNPPVNYPTQPSPQIPVQP 14109
Cdd:COG5180    433 GQGPKADFVPGDAESVSGPAGLADQAGA-------AASTAMADFVAPvtDATPVDVADVLGVRPDAILGGNVAPASGLDA 505
                          250
                   ....*....|....*....
gi 442625924 14110 GVINIPSAPLPTTPPQHPP 14128
Cdd:COG5180    506 ETRIIEAEGAPATEDFVAA 524
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
14187-14336 9.72e-05

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 50.48  E-value: 9.72e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14187 APQPVHPAPNPPVHEFNYPTPPAVPQQPGVlnipsyPTPVAPTPQSPIYIPSQEQPKPTTRPsvinVPSVPQPAYPTPQA 14266
Cdd:PRK14951   363 AFKPAAAAEAAAPAEKKTPARPEAAAPAAA------PVAQAAAAPAPAAAPAAAASAPAAPP----AAAPPAPVAAPAAA 432
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14267 PVydvnyptsPSVIPHQPGVVNIPsvplPAPPVKQRPVFVPSPVHPTPAPQPGVVNIPSVAQPVHPTYQP 14336
Cdd:PRK14951   433 AP--------AAAPAAAPAAVALA----PAPPAQAAPETVAIPVRVAPEPAVASAAPAPAAAPAAARLTP 490
EGF_3 pfam12947
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes ...
137-166 1.20e-04

EGF domain; This family includes a variety of EGF-like domain homologs. This family includes the C-terminal domain of the malaria parasite MSP1 protein.


Pssm-ID: 463759 [Multi-domain]  Cd Length: 36  Bit Score: 43.74  E-value: 1.20e-04
                            10        20        30
                    ....*....|....*....|....*....|
gi 442625924    137 PCDVFAHCTNTLGSFTCTCFPGYRGNGFHC 166
Cdd:pfam12947     7 GCHPNATCTNTGGSFTCTCNDGYTGDGVTC 36
PRK14950 PRK14950
DNA polymerase III subunits gamma and tau; Provisional
14275-14375 1.49e-04

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237864 [Multi-domain]  Cd Length: 585  Bit Score: 49.81  E-value: 1.49e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14275 TSPSVIPHQPGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPA----PQPGVVNIPSVAQPV-HPTYQPPVVERPAIYDVYY 14349
Cdd:PRK14950   344 TSYGQLPLELAVIEALLVPVPAPQPAKPTAAAPSPVRPTPApstrPKAAAAANIPPKEPVrETATPPPVPPRPVAPPVPH 423
                           90       100
                   ....*....|....*....|....*..
gi 442625924 14350 PPPPSRPGV-INIPSPPRPVYPVPQQP 14375
Cdd:PRK14950   424 TPESAPKLTrAAIPVDEKPKYTPPAPP 450
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
212-247 1.63e-04

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non-calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.


Pssm-ID: 238011  Cd Length: 38  Bit Score: 43.39  E-value: 1.63e-04
                           10        20        30
                   ....*....|....*....|....*....|....*.
gi 442625924   212 DVDECRNPENCGPNALCTNTPGNYTCSCPDGYVGNN 247
Cdd:cd00054      1 DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYTGRN 36
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
14225-14373 1.63e-04

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 49.71  E-value: 1.63e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14225 PVAPTPQSPIyiPSQEQPKPTTRPSVINVPSVPQPAYPTPQAPVYDVNYPTSPSVIPHQPGVVNIPSVPLPAPPVKQRPV 14304
Cdd:PRK14951   366 PAAAAEAAAP--AEKKTPARPEAAAPAAAPVAQAAAAPAPAAAPAAAASAPAAPPAAAPPAPVAAPAAAAPAAAPAAAPA 443
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 442625924 14305 FVPSPVHPTPAPQPGVVNIPSVAQPvhptyQPPVVERPAiydvyyPPPPSRPGVINIPSPPRPVY--PVPQ 14373
Cdd:PRK14951   444 AVALAPAPPAQAAPETVAIPVRVAP-----EPAVASAAP------APAAAPAAARLTPTEEGDVWhaTVQQ 503
Zona_pellucida pfam00100
Zona pellucida-like domain;
17722-17947 1.85e-04

Zona pellucida-like domain;


Pssm-ID: 459673 [Multi-domain]  Cd Length: 254  Bit Score: 48.37  E-value: 1.85e-04
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  17722 CLADGVQVEIHITEPGFNGVLY--VKGHSKDEECRRVVNLAGETVprtEIFRVHFGSCG--MQAVKDVA--SFVLVIQKH 17795
Cdd:pfam00100     1 CTPDTMTVSISKCLLVPSGLLSslSLLGGLDPSCKPVSNTNGSPA---VLFEFPLTGCGttVQVNGTHIiySNTLYSSTD 77
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  17796 PKLVTYK---AQAYNIKCVYQTGEkNVTLGFNVSMLTTAGTIANTGPPPIcQMRIITNE------GEEINSAEIGDNLKL 17866
Cdd:pfam00100    78 LRSGIIRrtiTRRLPFSCSYPRSS-LVSLLVVAPPSPVPITVSGSGVFLV-SMDLYYDSsytspySPYPVTVLLGDPLYV 155
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  17867 QVDVEPAT--IYGGFARSCIAkTMEDNVQNEYLVTD-ENGCATDTSIFGNWEYNPDTNSLLA--SFNAFKF--PSSDNIR 17939
Cdd:pfam00100   156 EVSLLSRTdpNLVLVLDNCWA-TPSPNPTSSPQYQLiVNGCPNDGDSTYPVSSLSNGPSHYVrfSFKAFRFvgSSISQVY 234

                    ....*...
gi 442625924  17940 FQCNIRVC 17947
Cdd:pfam00100   235 LHCSVSVC 242
PRK14971 PRK14971
DNA polymerase III subunit gamma/tau;
14149-14277 2.03e-04

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237874 [Multi-domain]  Cd Length: 614  Bit Score: 49.77  E-value: 2.03e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14149 PSVTHPeyptsqVPVYDVNYSTTPSPIPQKPGVVNIPSAPQPVHPApnppvhefnYPTPPAVPQQPGvlnIPS-YPTPVA 14227
Cdd:PRK14971   381 PVFTQP------AAAPQPSAAAAASPSPSQSSAAAQPSAPQSATQP---------AGTPPTVSVDPP---AAVpVNPPST 442
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|
gi 442625924 14228 PTPQSPIYIPSQEQPKPTTRPSVInVPSVPQPAYPTPQAPvyDVNYPTSP 14277
Cdd:PRK14971   443 APQAVRPAQFKEEKKIPVSKVSSL-GPSTLRPIQEKAEQA--TGNIKEAP 489
rne PRK10811
ribonuclease E; Reviewed
14190-14390 3.27e-04

ribonuclease E; Reviewed


Pssm-ID: 236766 [Multi-domain]  Cd Length: 1068  Bit Score: 49.27  E-value: 3.27e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14190 PVHPAPNPPVHEfnYPTPPAVPQQPGVLNIPSYPTPVAPTPQSPIYIPS--QEQPKPTTRPSVINVPSVPQPAYPTPQAP 14267
Cdd:PRK10811   846 PVVRPQDVQVEE--QREAEEVQVQPVVAEVPVAAAVEPVVSAPVVEAVAevVEEPVVVAEPQPEEVVVVETTHPEVIAAP 923
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14268 VYDVNYPTSPSVIPHQPGVVNIPsVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVNIPsVAQPVHPTyQPPVVERPAIYDV 14347
Cdd:PRK10811   924 VTEQPQVITESDVAVAQEVAEHA-EPVVEPQDETADIEEAAETAEVVVAEPEVVAQP-AAPVVAEV-AAEVETVTAVEPE 1000
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....
gi 442625924 14348 YYPPPPSRPGVIN-IPSPPRPVYPVPQqpiYVPAPVLHIPAPRP 14390
Cdd:PRK10811  1001 VAPAQVPEATVEHnHATAPMTRAPAPE---YVPEAPRHSDWQRP 1041
PRK14950 PRK14950
DNA polymerase III subunits gamma and tau; Provisional
13993-14106 3.33e-04

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237864 [Multi-domain]  Cd Length: 585  Bit Score: 49.04  E-value: 3.33e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13993 DVNYPTTPVSQHPGVVNIPSApRLVPPTSQRPVFITSPGNLSPTPQPGVINIPSVSQPGYPTPQSPIYDANYPTTQSPIP 14072
Cdd:PRK14950   338 DFQLRTTSYGQLPLELAVIEA-LLVPVPAPQPAKPTAAAPSPVRPTPAPSTRPKAAAAANIPPKEPVRETATPPPVPPRP 416
                           90       100       110
                   ....*....|....*....|....*....|....
gi 442625924 14073 QQPGVVNIPSVPSPSYPAPNPPVNYPTQPSPQIP 14106
Cdd:PRK14950   417 VAPPVPHTPESAPKLTRAAIPVDEKPKYTPPAPP 450
PRK11633 PRK11633
cell division protein DedD; Provisional
14490-14594 3.48e-04

cell division protein DedD; Provisional


Pssm-ID: 236940 [Multi-domain]  Cd Length: 226  Bit Score: 47.30  E-value: 3.48e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14490 IPTAPSPG----IINIPSVPQPLPS-PTPGVINIPQQPTPPPLVQQPGII---NIPSVQQPsTPTTQHPIQDVQyetqRP 14561
Cdd:PRK11633    41 IPLVPKPGdrdePDMMPAATQALPTqPPEGAAEAVRAGDAAAPSLDPATVappNTPVEPEP-APVEPPKPKPVE----KP 115
                           90       100       110
                   ....*....|....*....|....*....|...
gi 442625924 14562 QPTPGVINIPSVSQPTYPTQKPSYQDTSYPTVQ 14594
Cdd:PRK11633   116 KPKPKPQQKVEAPPAPKPEPKPVVEEKAAPTGK 148
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
14157-14336 3.72e-04

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 48.72  E-value: 3.72e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14157 PTSQVPVYDVNYSTTPSPIPQKPGVVNIPSAPQPVHPAPNPPVhefnyPTPPAVPQQPGVLNIPSYPTPVAPTPQSPiyI 14236
Cdd:PRK12323   387 PAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPA-----PEALAAARQASARGPGGAPAPAPAPAAAP--A 459
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14237 PSQEQPKPTTRPSVINVPSVPQPAYPTPQAPVYDVNYP---------TSPSVIPHQPGVVNIPSVPLPAPPVKQrpvfvP 14307
Cdd:PRK12323   460 AAARPAAAGPRPVAAAAAAAPARAAPAAAPAPADDDPPpweelppefASPAPAQPDAAPAGWVAESIPDPATAD-----P 534
                          170       180
                   ....*....|....*....|....*....
gi 442625924 14308 SPVHPTPAPQPGVVNIPSVAQPVHPTYQP 14336
Cdd:PRK12323   535 DDAFETLAPAPAAAPAPRAAAATEPVVAP 563
dnaA PRK14086
chromosomal replication initiator protein DnaA;
13941-14127 3.81e-04

chromosomal replication initiator protein DnaA;


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 48.67  E-value: 3.81e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13941 PSAPQPIYPTPQSPQYNVNYPSP------QPANPQKPGVVNIPSVP--QPVYPSPQPPVYDVNYPTTPVSQHpgvvniPS 14012
Cdd:PRK14086    95 PAPPPPHARRTSEPELPRPGRRPyegyggPRADDRPPGLPRQDQLPtaRPAYPAYQQRPEPGAWPRAADDYG------WQ 168
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14013 APRLVPPTSQRPvfiTSPGNLSPTP----QPGVINIPSVSQPgYPTPQSPIYDANYP---TTQSPIPqQPGVVNIPSVPS 14085
Cdd:PRK14086   169 QQRLGFPPRAPY---ASPASYAPEQerdrEPYDAGRPEYDQR-RRDYDHPRPDWDRPrrdRTDRPEP-PPGAGHVHRGGP 243
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|..
gi 442625924 14086 PSYPAPNPPVNYPTQPSPQIPvqpgviniPSAPLPTTPPQHP 14127
Cdd:PRK14086   244 GPPERDDAPVVPIRPSAPGPL--------AAQPAPAPGPGEP 277
PRK14950 PRK14950
DNA polymerase III subunits gamma and tau; Provisional
14434-14531 4.97e-04

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237864 [Multi-domain]  Cd Length: 585  Bit Score: 48.27  E-value: 4.97e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14434 PQPVSTPTSgviniPSQASPPISVPTPGIVNIPSIPQPTPQRPSPGiinVPSVPQPIPTAPSPGIINIPSVPQPLPSPTP 14513
Cdd:PRK14950   362 PVPAPQPAK-----PTAAAPSPVRPTPAPSTRPKAAAAANIPPKEP---VRETATPPPVPPRPVAPPVPHTPESAPKLTR 433
                           90
                   ....*....|....*...
gi 442625924 14514 GVINIPQQPTPPPLVQQP 14531
Cdd:PRK14950   434 AAIPVDEKPKYTPPAPPK 451
Not5 COG5665
CCR4-NOT transcriptional regulation complex, NOT5 subunit [Transcription];
14222-14603 5.06e-04

CCR4-NOT transcriptional regulation complex, NOT5 subunit [Transcription];


Pssm-ID: 444384 [Multi-domain]  Cd Length: 874  Bit Score: 48.51  E-value: 5.06e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14222 YPTPVAPTPQSPIYIPSQEQPKPTTRPSVINVP----------SVPQPAYPTP----QAPVYDVNY---------PTSPS 14278
Cdd:COG5665    165 ASNPVAVVVTTMIAVPSAPAAPPNAVDYSVLVPiaaqdpaasvSTPQAFNASAtsgrSQHIVQAAKrvgvewwgdPSLLA 244
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14279 VIPHQPGVVNIPSVP----LPAPPVKQRPvfvPSPVHPTPAPQPGVVnipSVAQPVHPTYQPPVVERPAIYDVYYPPPPS 14354
Cdd:COG5665    245 TPPATPATEEKSSQQpksqPTSPSGGTTP---PSTNQLTTSNTPTST---AKAQPQPPTKKQPAKEPPSDTASGNPSAPS 318
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14355 RPGVINIPSPPRPV---YPVPQ------QPIYVPAPvlhiPAPRPVIHNIPSVP-QPTyphrnPPIQDVTypapqpsppv 14424
Cdd:COG5665    319 VLINSDSPTSEDPAtasVPTTEettaftTPSSVPST----PAEKDTPATDLATPvSPT-----PPETSVD---------- 379
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14425 pgivnipslpqPVSTPTSGVINIPSQASP--PISVPTPGIVNIPSIPqPTPQRPSPGI----INVPSVPQPIP------- 14491
Cdd:COG5665    380 -----------KKVSPDSATSSTKSEKEGgtASSPMPPNIAIGAKDD-VDATDPSQEAkeytKNAPMTPEADSapessvr 447
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14492 TAPSPGIINIPSV---------PQPLPSPTPGVInIPQQPTPPPLVQQPGIINIPSVQQPSTPTTQHpiQDVQYETQRPQ 14562
Cdd:COG5665    448 TEASPSAGSDLEPenttlrdpaPNAIPPPEDPST-IGRLSSGDKLANETGPPVIRRDSTPSSTADQS--IVGVLAFGLDQ 524
                          410       420       430       440
                   ....*....|....*....|....*....|....*....|.
gi 442625924 14563 PTPGVINIPSVSqpTYPTQKPSYQDTSYPTVQPKPPVSGII 14603
Cdd:COG5665    525 RTQAEISVEAAS--RSNPLLNSQVKSFPLGKRSEGAKGKTQ 563
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
14170-14285 5.68e-04

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 48.17  E-value: 5.68e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14170 TTPSPIPQKPGVVniPSAPQPVHPAPNPPvhefnyPTPPAVPQQPGVLNIPSYPTPVAPTPQSPIYIPSQEQPKPTTRPS 14249
Cdd:PRK14951   387 AAPAAAPVAQAAA--APAPAAAPAAAASA------PAAPPAAAPPAPVAAPAAAAPAAAPAAAPAAVALAPAPPAQAAPE 458
                           90       100       110
                   ....*....|....*....|....*....|....*.
gi 442625924 14250 VINVPSVPQPAYPTPQAPVYDVNYPTSPSVIPHQPG 14285
Cdd:PRK14951   459 TVAIPVRVAPEPAVASAAPAPAAAPAAARLTPTEEG 494
GGN pfam15685
Gametogenetin; GGN is a family of proteins largely found in mammals. It reacts with POG in the ...
13963-14103 6.24e-04

Gametogenetin; GGN is a family of proteins largely found in mammals. It reacts with POG in the maturation of sperm and is expressed virtually only in the testis. It is found to be associated with the intracellular membrane, binds with GGNBP1 and may be involved in vesicular trafficking.


Pssm-ID: 434857 [Multi-domain]  Cd Length: 668  Bit Score: 48.22  E-value: 6.24e-04
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  13963 PQPANPQKPGVVNipSVPQPVYPSPQ---PPVYDVNYPTTPVSQHPGvvniPSAPRLVPPTSQRPVFITSPGNL-SPTPQ 14038
Cdd:pfam15685   389 PWGSPPPPPGKAH--PIPGPRRPAPAllaPPMFIFPAPTNGEPVRPG----PPAPQALLPRPPPPTPPATPPPVpPPIPQ 462
                            90       100       110       120       130       140       150
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 442625924  14039 -PGVINIP-SVSQPGYPTPQS-PIYDANYPTTQSPIP-----QQPGVVNIPSVPSPSyPAPNPPVNYPTQPSP 14103
Cdd:pfam15685   463 lPALQPMPlAAARPPTPRPCPgHGESALAPAPTAPLPpalaaDQAPAPALAAAPAPS-PAPAPATADPLPPAP 534
PRK11633 PRK11633
cell division protein DedD; Provisional
14174-14268 6.61e-04

cell division protein DedD; Provisional


Pssm-ID: 236940 [Multi-domain]  Cd Length: 226  Bit Score: 46.15  E-value: 6.61e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14174 PIPQKPGVVN----IPSAPQPVhPAPNPP--VHEFNYPTPPAVPQQPGVLNIPSYPTPVAPTPQSPIYIPSQEQPKPTTR 14247
Cdd:PRK11633    42 PLVPKPGDRDepdmMPAATQAL-PTQPPEgaAEAVRAGDAAAPSLDPATVAPPNTPVEPEPAPVEPPKPKPVEKPKPKPK 120
                           90       100
                   ....*....|....*....|....*.
gi 442625924 14248 PSVINVPSV-----PQPAYPTPQAPV 14268
Cdd:PRK11633   121 PQQKVEAPPapkpePKPVVEEKAAPT 146
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
13903-14319 7.01e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 48.06  E-value: 7.01e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13903 ETPKPVRPQIYDTPSPPyPVAIPdlvyVQQQQPgivniPSAPQPIYPTPQSPQYNVNYPSPQPANPQKPGVVniPSVPQP 13982
Cdd:PRK07764   410 PAPAAAAPAAAAAPAPA-AAPQP----APAPAP-----APAPPSPAGNAPAGGAPSPPPAAAPSAQPAPAPA--AAPEPT 477
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13983 VYPSPQPPVYDVNYPTTPVSQHPGVVNIPSA--------PRLVPPTSQRPVFITspGNLSPTPQPG-------VINIPS- 14046
Cdd:PRK07764   478 AAPAPAPPAAPAPAAAPAAPAAPAAPAGADDaatlrerwPEILAAVPKRSRKTW--AILLPEATVLgvrgdtlVLGFSTg 555
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14047 -----VSQPGYPTPqspIYDAnypttqspIPQQPGV---VNIPSVPSPSYPAPNPPVNYPTQPSPQIPVQPGViniPSAP 14118
Cdd:PRK07764   556 glarrFASPGNAEV---LVTA--------LAEELGGdwqVEAVVGPAPGAAGGEGPPAPASSGPPEEAARPAA---PAAP 621
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14119 LPTTPPQHPPvfipspeSPSPAPKPGVINIPSVTHPEYPTSQVPVYDVNYSTTPSPIP-QKPGVVNIPSAPQPVHPAPNP 14197
Cdd:PRK07764   622 AAPAAPAPAG-------AAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWPAKaGGAAPAAPPPAPAPAAPAAPA 694
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14198 PVhefnyPTPPAVPQQPgvlnipsyPTPVAPTPQSPIYIPSQeQPKPTTRPSVINVPSVPQPAYPTPQAPVYDVNYPtsp 14277
Cdd:PRK07764   695 GA-----APAQPAPAPA--------ATPPAGQADDPAAQPPQ-AAQGASAPSPAADDPVPLPPEPDDPPDPAGAPAQ--- 757
                          410       420       430       440
                   ....*....|....*....|....*....|....*....|..
gi 442625924 14278 sviPHQPGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPAPQPG 14319
Cdd:PRK07764   758 ---PPPPPAPAPAAAPAAAPPPSPPSEEEEMAEDDAPSMDDE 796
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
14169-14531 7.19e-04

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 48.06  E-value: 7.19e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14169 STTPSPIPQKPGVVNIPSAPQPVHPAPNPPVHefnyPTPPAVPQQPGVLNIPSYPTPVAPTPQspiyipsqeqPKPTTRP 14248
Cdd:PRK07764   401 AAAAAPAAAPAPAAAAPAAAAAPAPAAAPQPA----PAPAPAPAPPSPAGNAPAGGAPSPPPA----------AAPSAQP 466
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14249 SVINVPSVPQPAYPTPQAPVydvnyPTSPSVIPHQPGVVNIPSVPLPAPPVKQR----------------PVFVPSPV-- 14310
Cdd:PRK07764   467 APAPAAAPEPTAAPAPAPPA-----APAPAAAPAAPAAPAAPAGADDAATLRERwpeilaavpkrsrktwAILLPEATvl 541
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14311 ----------HPTPA-----PQPGVVNI--PSVAQPVHPTYQPPVV-----------ERPAIYDVYYPPPPSRPgviniP 14362
Cdd:PRK07764   542 gvrgdtlvlgFSTGGlarrfASPGNAEVlvTALAEELGGDWQVEAVvgpapgaaggeGPPAPASSGPPEEAARP-----A 616
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14363 SPPRPVYPVPQQPIYVPAPvlhiPAPRPViHNIPSVPQPTYPHRNPPIQDVTYPAPQPSPPVPGIVniPSLPQPVSTPTS 14442
Cdd:PRK07764   617 APAAPAAPAAPAPAGAAAA----PAEASA-APAPGVAAPEHHPKHVAVPDASDGGDGWPAKAGGAA--PAAPPPAPAPAA 689
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14443 GVINIPSQASPPISVPTPGIVNIPSiPQPTPQRPSPGIINVPSVP-----QPIPTAPSPGiiNIPSVPQPLPSPTPGVIN 14517
Cdd:PRK07764   690 PAAPAGAAPAQPAPAPAATPPAGQA-DDPAAQPPQAAQGASAPSPaaddpVPLPPEPDDP--PDPAGAPAQPPPPPAPAP 766
                          410
                   ....*....|....
gi 442625924 14518 IPQQPTPPPLVQQP 14531
Cdd:PRK07764   767 AAAPAAAPPPSPPS 780
EGF_CA smart00179
Calcium-binding EGF-like domain;
212-243 7.50e-04

Calcium-binding EGF-like domain;


Pssm-ID: 214542 [Multi-domain]  Cd Length: 39  Bit Score: 41.46  E-value: 7.50e-04
                             10        20        30
                     ....*....|....*....|....*....|..
gi 442625924     212 DVDECRNPENCGPNALCTNTPGNYTCSCPDGY 243
Cdd:smart00179     1 DIDECASGNPCQNGGTCVNTVGSYRCECPPGY 32
EGF_3 pfam12947
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes ...
218-246 8.66e-04

EGF domain; This family includes a variety of EGF-like domain homologs. This family includes the C-terminal domain of the malaria parasite MSP1 protein.


Pssm-ID: 463759 [Multi-domain]  Cd Length: 36  Bit Score: 41.43  E-value: 8.66e-04
                            10        20
                    ....*....|....*....|....*....
gi 442625924    218 NPENCGPNALCTNTPGNYTCSCPDGYVGN 246
Cdd:pfam12947     4 NNGGCHPNATCTNTGGSFTCTCNDGYTGD 32
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
14440-14565 9.06e-04

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 47.40  E-value: 9.06e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14440 PTSGVIN-IPSQASPPISVPTPGIVNIPSIPQPTPQRPSPGIINVPSVPQPIPTAPSPGIINIPSVPQPLPSPTPGVINI 14518
Cdd:PRK14951   366 PAAAAEAaAPAEKKTPARPEAAAPAAAPVAQAAAAPAPAAAPAAAASAPAAPPAAAPPAPVAAPAAAAPAAAPAAAPAAV 445
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*..
gi 442625924 14519 PQQPtPPPLVQQPGIINIPSVQQPSTPTTQHPiqdvQYETQRPQPTP 14565
Cdd:PRK14951   446 ALAP-APPAQAAPETVAIPVRVAPEPAVASAA----PAPAAAPAAAR 487
PRK12727 PRK12727
flagellar biosynthesis protein FlhF;
14240-14454 1.10e-03

flagellar biosynthesis protein FlhF;


Pssm-ID: 237182 [Multi-domain]  Cd Length: 559  Bit Score: 47.29  E-value: 1.10e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14240 EQPKPTTRPSVINVPSVPQPAYPTPqAPVYDVNYPTSPSVIPHQPGVVN-----IPSVPLPAPPVKQRPVFVPSPVHPTP 14314
Cdd:PRK12727    56 ETARSDTPATAAAPAPAPQAPTKPA-APVHAPLKLSANANMSQRQRVASaaedmIAAMALRQPVSVPRQAPAAAPVRAAS 134
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14315 APQPG-VVNIPSVAQPVHPTYQPPVVERPAiyDVYYPPPPSRPgvinIPSPPRPVYPVPqqpiyVPAPVLHIPAPRPVI- 14392
Cdd:PRK12727   135 IPSPAaQALAHAAAVRTAPRQEHALSAVPE--QLFADFLTTAP----VPRAPVQAPVVA-----APAPVPAIAAALAAHa 203
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 442625924 14393 ----HNIPSVPQPTYPHRNPPIQdvtypapqpsppvpgIVNIPSLPQPVSTPTSGVINIPSQASPP 14454
Cdd:PRK12727   204 ayaqDDDEQLDDDGFDLDDALPQ---------------ILPPAALPPIVVAPAAPAALAAVAAAAP 254
Amelogenin smart00818
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ...
14465-14584 1.15e-03

Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.


Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 44.78  E-value: 1.15e-03
                             10        20        30        40        50        60        70        80
                     ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924   14465 IPSIPQPTPQRPSPGIINVPSVPQPIPTAPSPGIINIPsvPQPLPSPTPGvinipQQPTPPPLVQQPgiiniPSVQQPST 14544
Cdd:smart00818    40 IPVSQQHPPTHTLQPHHHIPVLPAQQPVVPQQPLMPVP--GQHSMTPTQH-----HQPNLPQPAQQP-----FQPQPLQP 107
                             90       100       110       120
                     ....*....|....*....|....*....|....*....|
gi 442625924   14545 PTTQHPIQdvqyeTQRPQPTPGVINIPSVSQPTYPTQKPS 14584
Cdd:smart00818   108 PQPQQPMQ-----PQPPVHPIPPLPPQPPLPPMFPMQPLP 142
EGF_CA smart00179
Calcium-binding EGF-like domain;
1022-1056 1.18e-03

Calcium-binding EGF-like domain;


Pssm-ID: 214542 [Multi-domain]  Cd Length: 39  Bit Score: 41.08  E-value: 1.18e-03
                             10        20        30
                     ....*....|....*....|....*....|....*
gi 442625924    1022 DVDECEERGaqLCAFGAQCVNKPGSYSCHCPEGYQ 1056
Cdd:smart00179     1 DIDECASGN--PCQNGGTCVNTVGSYRCECPPGYT 33
f2_encap_cargo1 NF041166
family 2A encapsulin nanocompartment cargo protein cysteine desulfurase; Capsid-like ...
14451-14667 1.26e-03

family 2A encapsulin nanocompartment cargo protein cysteine desulfurase; Capsid-like encapsulin nanocompartments are commonly found in bacteria and archaea. Encapsulin nanocompartments, which are assembled from shell proteins, encapsulate various cargo proteins, typically peroxidases or ferritin-like proteins, to protect cells from oxidative stress caused by peroxide. Proteins of this family are cysteine desulfurases with an additional N-terminal encapsulation targeting sequence (~200 aa) that is necessary and sufficient for compartmentalization.


Pssm-ID: 469077 [Multi-domain]  Cd Length: 623  Bit Score: 47.16  E-value: 1.26e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14451 ASPPISVPTPGI---VNIPSIPQPTPQRPSPGIINV-PSVPQ-PIPTAPSPGIINIPSVPQPLPSPTPGVinipqqPTPP 14525
Cdd:NF041166    33 SALPGEAPAPGLpaaPPAAPAPPGSNPAPAAGPGGLgAGVPGaALPQGLVPGANLLPSAPSPVGALGASA------PALA 106
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14526 PLVQQPgIINIPSVQQPSTPTTQHPIQDVQY-------ETQRPQPTPGVINIPSVSQPTYPTQKPSYQDTSyPTVQPKPP 14598
Cdd:NF041166   107 PHAAAG-NVGLPDAVVAVAPAEPRAGGAALPvglpqapVPAAPSAAAAPPDLVAPQAFGLPGEDAALRALL-PAASPAPP 184
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14599 VSgiiniPSVPQPVPS---LTPGVINLPSEPSYSAPIPKPG---IINVPSIPE--PIpsipqnpVQE-------VYHD-- 14661
Cdd:NF041166   185 SA-----PSAAAAESSyyfLDERAAPSPAAAPPGSPPALASahpPFDVNAVRRdfPI-------LQErvngkplVWFDna 252

                   ....*...
gi 442625924 14662 --TQKPQA 14667
Cdd:NF041166   253 atTQKPQA 260
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
14169-14275 1.32e-03

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 47.02  E-value: 1.32e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14169 STTPSPIPQKPGVVNIPSAPQPvhPAPNPPVHEFNYPTPPAVPqQPGVLNIPSYPTPVAPTPQSPIYIPSQEQPKPTTRP 14248
Cdd:PRK14951   398 AAAPAPAAAPAAAASAPAAPPA--AAPPAPVAAPAAAAPAAAP-AAAPAAVALAPAPPAQAAPETVAIPVRVAPEPAVAS 474
                           90       100
                   ....*....|....*....|....*....
gi 442625924 14249 SVINVPSVPQPA--YPTPQAPVYDVNYPT 14275
Cdd:PRK14951   475 AAPAPAAAPAAArlTPTEEGDVWHATVQQ 503
PHA03379 PHA03379
EBNA-3A; Provisional
14399-14690 1.39e-03

EBNA-3A; Provisional


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 46.98  E-value: 1.39e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14399 PQPTYPHRNPPIQdvtypapqpsppvpgiVNIPSLPQPVSTPTS-GVINIPSqaSPPISVPTPGIVNIPSIPQPTPQRPS 14477
Cdd:PHA03379   409 SEPTYGTPRPPVE----------------KPRPEVPQSLETATShGSAQVPE--PPPVHDLEPGPLHDQHSMAPCPVAQL 470
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14478 PgiinvpsvPQPIPTApSPG--IINIPSVPQPLPSPTPGVINIPQQPTPPPLVQQPGiinipsvqqpstpttqhpiqdVQ 14555
Cdd:PHA03379   471 P--------PGPLQDL-EPGdqLPGVVQDGRPACAPVPAPAGPIVRPWEASLSQVPG---------------------VA 520
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14556 YETQRPQPTPGviniPSVSQPTYPTQKPSYQDTSYPTVQPKPPVSGIINI-------PSVPQPVPSLTPgvINLPSEPSY 14628
Cdd:PHA03379   521 FAPVMPQPMPV----EPVPVPTVALERPVCPAPPLIAMQGPGETSGIVRVrerwrpaPWTPNPPRSPSQ--MSVRDRLAR 594
                          250       260       270       280       290       300
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 442625924 14629 SAPIPKPGIINVPSIPepiPSIPQNPvqevyhdTQKPQAIPGVvnvpSAPQPTPGRPYYDVA 14690
Cdd:PHA03379   595 LRAEAQPYQASVEVQP---PQLTQVS-------PQQPMEYPLE----PEQQMFPGSPFSQVA 642
PRK14950 PRK14950
DNA polymerase III subunits gamma and tau; Provisional
14533-14635 1.45e-03

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237864 [Multi-domain]  Cd Length: 585  Bit Score: 46.73  E-value: 1.45e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14533 IINIPSVQQPSTPTTQHPiqdvqyetQRPQPTPGVINIPSVSQPTYPTQKPSYQDTSYPTVQPKPPVSGiiNIPSVPQPV 14612
Cdd:PRK14950   359 LLVPVPAPQPAKPTAAAP--------SPVRPTPAPSTRPKAAAAANIPPKEPVRETATPPPVPPRPVAP--PVPHTPESA 428
                           90       100
                   ....*....|....*....|...
gi 442625924 14613 PSLTPGVINLPSEPSYSAPIPKP 14635
Cdd:PRK14950   429 PKLTRAAIPVDEKPKYTPPAPPK 451
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
14169-14347 1.47e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 46.90  E-value: 1.47e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14169 STTPSPIPQKPGVVNIPSAPQPVHPAPNPPVHEFNY-----PTPPAVPQQPGVLNIPSYPTPVAPTPQSPIYIPSQEQPK 14243
Cdd:PRK07764   619 AAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPkhvavPDASDGGDGWPAKAGGAAPAAPPPAPAPAAPAAPAGAAP 698
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14244 PTTRPSVINVPSVPQPAYPTPQAPVydvnyPTSPSVIPHQPGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGvvni 14323
Cdd:PRK07764   699 AQPAPAPAATPPAGQADDPAAQPPQ-----AAQGASAPSPAADDPVPLPPEPDDPPDPAGAPAQPPPPPAPAPAAA---- 769
                          170       180
                   ....*....|....*....|....
gi 442625924 14324 PSVAQPVHPTYQPPVVERPAIYDV 14347
Cdd:PRK07764   770 PAAAPPPSPPSEEEEMAEDDAPSM 793
PRK14971 PRK14971
DNA polymerase III subunit gamma/tau;
14243-14375 1.62e-03

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237874 [Multi-domain]  Cd Length: 614  Bit Score: 46.69  E-value: 1.62e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14243 KPTTRPSVINVPSVPQPAyPTPQAPVydvnyPTSPSVIPHQPGVVNIPSVPLPAPPvkqrpvfvpspvhPTPAPQPGVVN 14322
Cdd:PRK14971   370 SGGRGPKQHIKPVFTQPA-AAPQPSA-----AAAASPSPSQSSAAAQPSAPQSATQ-------------PAGTPPTVSVD 430
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|...
gi 442625924 14323 IPSvAQPVHPTYQPPVVERPAIYDVYYPPPPSRPGVINIPSpPRPVYPVPQQP 14375
Cdd:PRK14971   431 PPA-AVPVNPPSTAPQAVRPAQFKEEKKIPVSKVSSLGPST-LRPIQEKAEQA 481
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
14366-14497 1.63e-03

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 46.63  E-value: 1.63e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14366 RPVYPVPQQPIYVPAPVLHIPAPRPVIHNIPSVPQPTYPHRNPPIQDVTyPAPQPSPPVPGIVNIPSLPQPVSTPTSGVI 14445
Cdd:PRK14951   365 KPAAAAEAAAPAEKKTPARPEAAAPAAAPVAQAAAAPAPAAAPAAAASA-PAAPPAAAPPAPVAAPAAAAPAAAPAAAPA 443
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|..
gi 442625924 14446 NIPSQASPPIsVPTPGIVNIPSIPQPTPQRPSPGiinVPSVPQPIPTAPSPG 14497
Cdd:PRK14951   444 AVALAPAPPA-QAAPETVAIPVRVAPEPAVASAA---PAPAAAPAAARLTPT 491
PRK14971 PRK14971
DNA polymerase III subunit gamma/tau;
14207-14343 1.72e-03

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237874 [Multi-domain]  Cd Length: 614  Bit Score: 46.69  E-value: 1.72e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14207 PPAVPQQPgvLNiPSYPTPVAPTPQSPIYIPSQEQPKPTTRPSVINVPSVPQPAyptpqapvydvnyPTSPSVIPHQPGV 14286
Cdd:PRK14971   371 GGRGPKQH--IK-PVFTQPAAAPQPSAAAAASPSPSQSSAAAQPSAPQSATQPA-------------GTPPTVSVDPPAA 434
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 442625924 14287 VniPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVniPSVAQPVHPTYQPPVVERPA 14343
Cdd:PRK14971   435 V--PVNPPSTAPQAVRPAQFKEEKKIPVSKVSSLG--PSTLRPIQEKAEQATGNIKE 487
PRK07003 PRK07003
DNA polymerase III subunit gamma/tau;
14005-14312 1.79e-03

DNA polymerase III subunit gamma/tau;


Pssm-ID: 235906 [Multi-domain]  Cd Length: 830  Bit Score: 46.77  E-value: 1.79e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14005 PGVVNIPSAPRLVPPTSQRP----VFITSPGNLSPTPQPGVINIPSVSQPGYPTPQSPIYDANYPTTQSPIPQQPGVVNI 14080
Cdd:PRK07003   368 PGGGVPARVAGAVPAPGARAaaavGASAVPAVTAVTGAAGAALAPKAAAAAAATRAEAPPAAPAPPATADRGDDAADGDA 447
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14081 PSVPSPSYPAPNPPVNYPTQPSPQIPVQPGVINIPSAPLPTTPPQHPPVFIPSPESPSPAPKPGVINIPS-VTHPEYPTS 14159
Cdd:PRK07003   448 PVPAKANARASADSRCDERDAQPPADSGSASAPASDAPPDAAFEPAPRAAAPSAATPAAVPDARAPAAASrEDAPAAAAP 527
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14160 QVPvydvnYSTTPSPIPQKP-----------------------------GVVNIPSAPQPVHPAPNPPVHEFNYPTPPAV 14210
Cdd:PRK07003   528 PAP-----EARPPTPAAAAPaaraggaaaaldvlrnagmrvssdrgaraAAAAKPAAAPAAAPKPAAPRVAVQVPTPRAR 602
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14211 PQQPGVLNIPSYPTP-VAPTPQSPiyiPSQEQPKPTTRpsvinVPSVPQPAYPTPQ---APVYDVNyPTSPSVIPhqpgv 14286
Cdd:PRK07003   603 AATGDAPPNGAARAEqAAESRGAP---PPWEDIPPDDY-----VPLSADEGFGGPDdgfVPVFDSG-PDDVRVAP----- 668
                          330       340
                   ....*....|....*....|....*.
gi 442625924 14287 vniPSVPLPAPPVKQRPVFVPSPVHP 14312
Cdd:PRK07003   669 ---KPADAPAPPVDTRPLPPAIPLDA 691
KLF3_N cd21577
N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called ...
14431-14614 2.00e-03

N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called Krueppel-like factor 3 and originally called Basic Kruppel-like Factor/BKLF), was the third member of the KLF family of zinc finger transcription factors to be discovered. KLF3 possesses a wide range of biological impacts on regulating apoptosis, differentiation, and proliferation in various tissues during the entire progression process. It has been proposed as a tumor suppressor in colorectal cancer. It appears to function predominantly as a repressor of transcription, turning genes off by recruiting the C-terminal Binding Protein co-repressors CtBP1 and CtBP2. CtBP docks onto a short motif (residues 61-65) in the N-terminus of KLF3, through the Proline-X-Aspartate-Leucine-Serine (PXDLS) motif. CtBP in turn recruits histone modifying enzymes to alter chromatin and repress gene expression. KLF3 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF3.


Pssm-ID: 410554 [Multi-domain]  Cd Length: 214  Bit Score: 44.64  E-value: 2.00e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14431 PSLPQPVSTPTSGvinIPSQASPPISVPTPgivnIPSIPQPTPQRPSPGIINVPSVPQPIPTAPSPGIinipSVPQPLPS 14510
Cdd:cd21577     56 PSPYSKSSPPSPP---QQRPLSPPLSLPPP----VAPPPLSPGSVPGGLPVISPVMVQPVPVLYPPHL----HQPIMVSS 124
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14511 PTPGVINIPQQPTPPPlvqqpgiinIPSVQQPSTPTTQHPI-----QDVQYETQRPQPTPGVINIPsvsqptyptqkPSY 14585
Cdd:cd21577    125 SPPPDDDHHHHKASSM---------KPSELGGDNHELHKPIkteprPEHAQDPYSEEMSSSVISSP-----------PEY 184
                          170       180
                   ....*....|....*....|....*....
gi 442625924 14586 QDTSyPTVqpkppvsgIINIPSVPQPVPS 14614
Cdd:cd21577    185 ESNT-PSV--------IVHPGKRPLPVES 204
KLF3_N cd21577
N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called ...
14081-14302 2.05e-03

N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called Krueppel-like factor 3 and originally called Basic Kruppel-like Factor/BKLF), was the third member of the KLF family of zinc finger transcription factors to be discovered. KLF3 possesses a wide range of biological impacts on regulating apoptosis, differentiation, and proliferation in various tissues during the entire progression process. It has been proposed as a tumor suppressor in colorectal cancer. It appears to function predominantly as a repressor of transcription, turning genes off by recruiting the C-terminal Binding Protein co-repressors CtBP1 and CtBP2. CtBP docks onto a short motif (residues 61-65) in the N-terminus of KLF3, through the Proline-X-Aspartate-Leucine-Serine (PXDLS) motif. CtBP in turn recruits histone modifying enzymes to alter chromatin and repress gene expression. KLF3 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF3.


Pssm-ID: 410554 [Multi-domain]  Cd Length: 214  Bit Score: 44.64  E-value: 2.05e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14081 PSVPSPSyPAPNPPVNYPTQPSPQI-PVQPgviniPSAPLPTTPPQHPPvfipspespspapkpgviniPSVTHPeYPTS 14159
Cdd:cd21577     30 SSPPSSS-SSSSSSSSSSSSPSSRAsPPSP-----YSKSSPPSPPQQRP--------------------LSPPLS-LPPP 82
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14160 QVPVYDVnYSTTPSPIPQKPGVVnipsaPQPVHPAPNPPVHEF-NYPTPPAVPQQPGVLNIPSYPTPV----APTPQSPI 14234
Cdd:cd21577     83 VAPPPLS-PGSVPGGLPVISPVM-----VQPVPVLYPPHLHQPiMVSSSPPPDDDHHHHKASSMKPSElggdNHELHKPI 156
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 442625924 14235 YI-PSQEQPKPTTR----PSVINVPsvpqpayptpqaPVYDVNyptSPSVIphqpgvVNIPSVPLPA---PPVKQR 14302
Cdd:cd21577    157 KTePRPEHAQDPYSeemsSSVISSP------------PEYESN---TPSVI------VHPGKRPLPVespDTLKKR 211
PRK14950 PRK14950
DNA polymerase III subunits gamma and tau; Provisional
14303-14393 2.07e-03

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237864 [Multi-domain]  Cd Length: 585  Bit Score: 46.34  E-value: 2.07e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14303 PVFVPSPVHPTPA------PQPGVVNIPSVAQPVHPTYQPPVVERPAiydvyYPPPPSRPGVINIPSPPRPVYPVPQQPI 14376
Cdd:PRK14950   362 PVPAPQPAKPTAAapspvrPTPAPSTRPKAAAAANIPPKEPVRETAT-----PPPVPPRPVAPPVPHTPESAPKLTRAAI 436
                           90
                   ....*....|....*...
gi 442625924 14377 YVP-APVLHIPAPRPVIH 14393
Cdd:PRK14950   437 PVDeKPKYTPPAPPKEEE 454
PHA03247 PHA03247
large tegument protein UL36; Provisional
14070-14278 2.19e-03

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 46.47  E-value: 2.19e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14070 PIPQQPGVVNIPSVPSPSYPAPNPPvnYPTQPSPQIPVQPGV--INIPSAPLPTTPPQHPPVFIPSPESPSPAPKPGVIN 14147
Cdd:PHA03247   258 PPVVGEGADRAPETARGATGPPPPP--EAAAPNGAAAPPDGVwgAALAGAPLALPAPPDPPPPAPAGDAEEEDDEDGAME 335
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14148 IPS-VTHPE------YPTSQVPVYdvnysTTPSPIPQ-KPGVVNIPSAPQPVHPAPNPPvhefNYPTPPAVPQQPGVLNI 14219
Cdd:PHA03247   336 VVSpLPRPRqhyplgFPKRRRPTW-----TPPSSLEDlSAGRHHPKRASLPTRKRRSAR----HAATPFARGPGGDDQTR 406
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|....*....
gi 442625924 14220 PSYPTPVAPTPQSPIYIPSQEQPKPTTRPSVINVPSVPQPAYPTPQAPVYDVNYPTSPS 14278
Cdd:PHA03247   407 PAAPVPASVPTPAPTPVPASAPPPPATPLPSAEPGSDDGPAPPPERQPPAPATEPAPDD 465
PHA03247 PHA03247
large tegument protein UL36; Provisional
14172-14373 2.36e-03

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 46.47  E-value: 2.36e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14172 PSPIP-QKPGVVNIPSAPQPVHPAPNPPvhEFNYPTPPAVPQqPGV---------LNIPSYPTPVAPTPQ---------- 14231
Cdd:PHA03247   255 PAPPPvVGEGADRAPETARGATGPPPPP--EAAAPNGAAAPP-DGVwgaalagapLALPAPPDPPPPAPAgdaeeedded 331
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14232 ------SPIYIPSQEQP-------KPT-TRPSVINVPSV---PQPAYPTPQAPVYDVNYPTSPSVIPHQPGVVNIPSVPL 14294
Cdd:PHA03247   332 gamevvSPLPRPRQHYPlgfpkrrRPTwTPPSSLEDLSAgrhHPKRASLPTRKRRSARHAATPFARGPGGDDQTRPAAPV 411
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 442625924 14295 PAPPvkQRPVFVPSPVHPTPAPqpgvvnipsvAQPVhPTYQPPVVERPAIydvyyPPPPSRPGVINIPSPPRPVYPVPQ 14373
Cdd:PHA03247   412 PASV--PTPAPTPVPASAPPPP----------ATPL-PSAEPGSDDGPAP-----PPERQPPAPATEPAPDDPDDATRK 472
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
13934-14128 2.58e-03

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 46.02  E-value: 2.58e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13934 QPGIVNIPSAPQPIYPTP--------QSPQYNVNYPSPQPANPQKPGVVNIPSVPQPVYPSPQPPVYDVNYPT--TPVSQ 14003
Cdd:PRK12323   364 RPGQSGGGAGPATAAAAPvaqpapaaAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAArqASARG 443
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14004 HPGVVNIPSAPRLVPPTSQRPVFITSPGNLSPTPQPgviniPSVSQP-GYPTPQS---PIYDANYPTTQSPIPQQ----P 14075
Cdd:PRK12323   444 PGGAPAPAPAPAAAPAAAARPAAAGPRPVAAAAAAA-----PARAAPaAAPAPADddpPPWEELPPEFASPAPAQpdaaP 518
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|...
gi 442625924 14076 GVVNIPSVPSPSYPAPNPPVNYPTQPSPQIPVQPgviniPSAPLPTTPPQHPP 14128
Cdd:PRK12323   519 AGWVAESIPDPATADPDDAFETLAPAPAAAPAPR-----AAAATEPVVAPRPP 566
KLF3_N cd21577
N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called ...
14350-14526 3.59e-03

N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called Krueppel-like factor 3 and originally called Basic Kruppel-like Factor/BKLF), was the third member of the KLF family of zinc finger transcription factors to be discovered. KLF3 possesses a wide range of biological impacts on regulating apoptosis, differentiation, and proliferation in various tissues during the entire progression process. It has been proposed as a tumor suppressor in colorectal cancer. It appears to function predominantly as a repressor of transcription, turning genes off by recruiting the C-terminal Binding Protein co-repressors CtBP1 and CtBP2. CtBP docks onto a short motif (residues 61-65) in the N-terminus of KLF3, through the Proline-X-Aspartate-Leucine-Serine (PXDLS) motif. CtBP in turn recruits histone modifying enzymes to alter chromatin and repress gene expression. KLF3 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF3.


Pssm-ID: 410554 [Multi-domain]  Cd Length: 214  Bit Score: 43.87  E-value: 3.59e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14350 PPPPSrpgviniPSPPRPVYPVPQQPIYVPAPVLHIPAPRPvihNIPSVPQPTYPHRNPPIqDVTYPAPQPSPPVPGIVN 14429
Cdd:cd21577     32 PPSSS-------SSSSSSSSSSSSPSSRASPPSPYSKSSPP---SPPQQRPLSPPLSLPPP-VAPPPLSPGSVPGGLPVI 100
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14430 IPSLPQPVSTPTsgviniPSQASPPISVPTPgiVNIPSIPQPTPQRPSPGIINVPSVP---QPIPTAP------------ 14494
Cdd:cd21577    101 SPVMVQPVPVLY------PPHLHQPIMVSSS--PPPDDDHHHHKASSMKPSELGGDNHelhKPIKTEPrpehaqdpysee 172
                          170       180       190
                   ....*....|....*....|....*....|....
gi 442625924 14495 --SPGIinipSVPQPLPSPTPGVINIPQQPTPPP 14526
Cdd:cd21577    173 msSSVI----SSPPEYESNTPSVIVHPGKRPLPV 202
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
13998-14224 3.78e-03

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 45.64  E-value: 3.78e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13998 TTPVSQHPGVVNIPsAPRLVPPTSQRPVFITSPGNLSPTPQPGViniPSVSQPGYPTPQSPIYDANYPTTQSPIPQQPGV 14077
Cdd:PRK12323   372 AGPATAAAAPVAQP-APAAAAPAAAAPAPAAPPAAPAAAPAAAA---AARAVAAAPARRSPAPEALAAARQASARGPGGA 447
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14078 VNIPSVPSPSYPAPNPPVNYPTQPSPQIPVQPGVINIPSAPLPTTPPQHPPVFIPSPESPSPApkpgviniPSVTHPEYP 14157
Cdd:PRK12323   448 PAPAPAPAAAPAAAARPAAAGPRPVAAAAAAAPARAAPAAAPAPADDDPPPWEELPPEFASPA--------PAQPDAAPA 519
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 442625924 14158 tsqvpvyDVNYSTTPSPIPQKPGVVNIPSAPQPVhPAPNPPVhefNYPTPPAVPQQPGVLNIPSYPT 14224
Cdd:PRK12323   520 -------GWVAESIPDPATADPDDAFETLAPAPA-AAPAPRA---AAATEPVVAPRPPRASASGLPD 575
Gag_spuma pfam03276
Spumavirus gag protein;
13989-14127 3.95e-03

Spumavirus gag protein;


Pssm-ID: 460872 [Multi-domain]  Cd Length: 614  Bit Score: 45.51  E-value: 3.95e-03
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  13989 PPVYDVNYPttPVSQHPGVVNIP--SAPRLVPPTSQRPVfiTSPGNLSPTPqpGVINIPSVSQPGYPTPQSPIYDANYPT 14066
Cdd:pfam03276   187 PPGASFSGL--PSLPAIGGIHLPaiPGIHARAPPGNIAR--SLGDDIMPSL--GDAGMPQPRFAFHPGNPFAEAEGHPFA 260
                            90       100       110       120       130       140
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 442625924  14067 TQS-----PIPQQPgVVNIPSVPSPSYPapnppvnyptQPSPQIPVQPGVINIPSAPLPTTPPQHP 14127
Cdd:pfam03276   261 EAEgerprDIPRAP-RIDAPSAPAIPAI----------QPIAPPMIPPIGAPIPIPHGASIPGEHI 315
MISS pfam15822
MAPK-interacting and spindle-stabilising protein-like; MISS is a family of eukaryotic ...
14149-14319 4.07e-03

MAPK-interacting and spindle-stabilising protein-like; MISS is a family of eukaryotic MAPK-interacting and spindle-stabilising protein-like proteins. MISS is rich in prolines and has four potential MAPK-phosphorylation sites, a MAPK-docking site, a PEST sequence (PEST motif) and a bipartite nuclear localization signal. The endogenous protein accumulates during mouse meiotic maturation and is found as discrete dots on the MII spindle. MISS is the first example of a physiological MAPK substrate that is stabilized in MII and that specifically regulates MII spindle integrity during the CSF arrest.


Pssm-ID: 318115 [Multi-domain]  Cd Length: 238  Bit Score: 44.21  E-value: 4.07e-03
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14149 PSVTHPEYPTSQVP--VYDVNYSTTPSPIPQKPGVVNIPSAPQPVHPAPNP----PVHEFNYPTP----PAVPQQPGVLN 14218
Cdd:pfam15822    51 PSTAPSTVPFGPAPtgMYPSIPLTGPSPGPPAPFPPSGPSCPPPGGPYPAPtvpgPGPIGPYPTPnmpfPELPRPYGAPT 130
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14219 IPSYPTPVAP--TPQSPIYIPSQEQPKPTtrPSVINVPSVPQPAYPTPQAP--VYDVNYPTSPSVIPHQPGVVNIPSVPL 14294
Cdd:pfam15822   131 DPAAAAPSGPwgSMSSGPWAPGMGGQYPA--PNMPYPSPGPYPAVPPPQSPgaAPPVPWGTVPPGPWGPPAPYPDPTGSY 208
                           170       180
                    ....*....|....*....|....*...
gi 442625924  14295 PAP---PVKQRPVFVPSPVHPTPaPQPG 14319
Cdd:pfam15822   209 PMPglyPTPNNPFQVPSGPSGAP-PMPG 235
Amelogenin smart00818
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ...
14295-14410 4.15e-03

Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; they seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.


Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 42.85  E-value: 4.15e-03
                             10        20        30        40        50        60        70        80
                     ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924   14295 PAPPVKQR--PVFVPSPVHPTP---APQPGVVNIPSVAQPVHPTYQPPVVERPAIydvyyPPPPSRPGVINIPSPPRPVY 14369
Cdd:smart00818    38 QIIPVSQQhpPTHTLQPHHHIPvlpAQQPVVPQQPLMPVPGQHSMTPTQHHQPNL-----PQPAQQPFQPQPLQPPQPQQ 112
                             90       100       110       120
                     ....*....|....*....|....*....|....*....|.
gi 442625924   14370 PVPQQPiyvpaPVLHIPAPRPvihniPSVPQPTYPHRNPPI 14410
Cdd:smart00818   113 PMQPQP-----PVHPIPPLPP-----QPPLPPMFPMQPLPP 143
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
1022-1058 4.33e-03

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.


Pssm-ID: 238011  Cd Length: 38  Bit Score: 39.16  E-value: 4.33e-03
                           10        20        30
                   ....*....|....*....|....*....|....*..
gi 442625924  1022 DVDECEERGaqLCAFGAQCVNKPGSYSCHCPEGYQGD 1058
Cdd:cd00054      1 DIDECASGN--PCQNGGTCVNTVGSYRCSCPPGYTGR 35
PRK10819 PRK10819
transport protein TonB; Provisional
14149-14269 4.62e-03

transport protein TonB; Provisional


Pssm-ID: 236768 [Multi-domain]  Cd Length: 246  Bit Score: 43.90  E-value: 4.62e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14149 PSVTHPEYPTSQVPVydvnysTTPSPIPQ--KPGVVNIP---SAPQPVhPAPNP-PVHEfnyptPPAVPQQPgvlnipsy 14222
Cdd:PRK10819    61 PQAVQPPPEPVVEPE------PEPEPIPEppKEAPVVIPkpePKPKPK-PKPKPkPVKK-----VEEQPKRE-------- 120
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|
gi 442625924 14223 PTPVAPTPQSPIYIPSQEQPKPTTRPSVINVPSVPQPAYPTPQA---PVY 14269
Cdd:PRK10819   121 VKPVEPRPASPFENTAPARPTSSTATAAASKPVTSVSSGPRALSrnqPQY 170
Amelogenin smart00818
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ...
14184-14298 4.69e-03

Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; they seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.


Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 42.85  E-value: 4.69e-03
                             10        20        30        40        50        60        70        80
                     ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924   14184 IPSAPQ--PVHPAPNPPVHEFNYPTPPAVPQQPgVLNIPSYPtPVAPTP--QSPIYIPSQEQPKPTtrpsvinVPSVPQP 14259
Cdd:smart00818    40 IPVSQQhpPTHTLQPHHHIPVLPAQQPVVPQQP-LMPVPGQH-SMTPTQhhQPNLPQPAQQPFQPQ-------PLQPPQP 110
                             90       100       110       120
                     ....*....|....*....|....*....|....*....|....
gi 442625924   14260 AYPTPQAPVYD-----VNYPTSPSVIPHQPGVVNIPSVPLPAPP 14298
Cdd:smart00818   111 QQPMQPQPPVHpipplPPQPPLPPMFPMQPLPPLLPDLPLEAWP 154
PRK14959 PRK14959
DNA polymerase III subunits gamma and tau; Provisional
14007-14126 4.85e-03

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 184923 [Multi-domain]  Cd Length: 624  Bit Score: 45.06  E-value: 4.85e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14007 VVNIPSAPRLVPPTSQRPVFI-TSPGNLSPTPQPGVINIPSVSQPGYPTPQSPIYDANY-PTTQSPIPQQPGVVNIPSVP 14084
Cdd:PRK14959   356 LLNLAMLPRLMPVESLRPSGGgASAPSGSAAEGPASGGAATIPTPGTQGPQGTAPAAGMtPSSAAPATPAPSAAPSPRVP 435
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*..
gi 442625924 14085 SPSYPAPNPPVNYPTQPSPQIpvqPGVINIPSAPLPT-----TPPQH 14126
Cdd:PRK14959   436 WDDAPPAPPRSGIPPRPAPRM---PEASPVPGAPDSVasasdAPPTL 479
EGF_3 pfam12947
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes ...
461-490 4.93e-03

EGF domain; This family includes a variety of EGF-like domain homologs. This family includes the C-terminal domain of the malaria parasite MSP1 protein.


Pssm-ID: 463759 [Multi-domain]  Cd Length: 36  Bit Score: 39.12  E-value: 4.93e-03
                            10        20        30
                    ....*....|....*....|....*....|..
gi 442625924    461 CQDNP--CGENAICTDTVGSFVCTCKPDYTGD 490
Cdd:pfam12947     1 CSDNNggCHPNATCTNTGGSFTCTCNDGYTGD 32
Amelogenin smart00818
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ...
13907-13991 4.96e-03

Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; they seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.


Pssm-ID: 197891 [Multi-domain]  Cd Length: 165  Bit Score: 42.85  E-value: 4.96e-03
                             10        20        30        40        50        60        70        80
                     ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924   13907 PVRPQIYDTPSPPYPVAIPdlvyVQQQQPGIVNIPSAP-QPIYPTPQSPQYNVNYPSPQ-PANPQKPgvvniPSVPQPVY 13984
Cdd:smart00818    66 PVVPQQPLMPVPGQHSMTP----TQHHQPNLPQPAQQPfQPQPLQPPQPQQPMQPQPPVhPIPPLPP-----QPPLPPMF 136

                     ....*...
gi 442625924   13985 P-SPQPPV 13991
Cdd:smart00818   137 PmQPLPPL 144
EGF cd00053
Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large ...
258-291 5.31e-03

Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.


Pssm-ID: 238010  Cd Length: 36  Bit Score: 39.00  E-value: 5.31e-03
                           10        20        30
                   ....*....|....*....|....*....|....
gi 442625924   258 ECSYPNVCGPGAICTNLEGSYRCDCPPGYDGDGR 291
Cdd:cd00053      1 ECAASNPCSNGGTCVNTPGSYRCVCPPGYTGDRS 34
EGF_3 pfam12947
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes ...
676-702 5.77e-03

EGF domain; This family includes a variety of EGF-like domain homologs. This family includes the C-terminal domain of the malaria parasite MSP1 protein.


Pssm-ID: 463759 [Multi-domain]  Cd Length: 36  Bit Score: 39.12  E-value: 5.77e-03
                            10        20
                    ....*....|....*....|....*..
gi 442625924    676 GSCGQNATCTNSAGGFTCACPPGFSGD 702
Cdd:pfam12947     6 GGCHPNATCTNTGGSFTCTCNDGYTGD 32
PRK07994 PRK07994
DNA polymerase III subunits gamma and tau; Validated
13944-14103 6.02e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236138 [Multi-domain]  Cd Length: 647  Bit Score: 44.86  E-value: 6.02e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13944 PQPIYPTPQSPQYNVNyPSPQPANPQKPGVVNIPSVPQPVYPSPQP-----PVYDVNYPTTPVSQHPgvVNIPSAPRLVP 14018
Cdd:PRK07994   361 PAAPLPEPEVPPQSAA-PAASAQATAAPTAAVAPPQAPAVPPPPASapqqaPAVPLPETTSQLLAAR--QQLQRAQGATK 437
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14019 PTSQRPVfitSPGNLSPTPqPGVINIPSVSQPGYPTPQSPIYDANYPTTqspiPQQPGVVNIPSVPSPSypAPNPPVNYP 14098
Cdd:PRK07994   438 AKKSEPA---AASRARPVN-SALERLASVRPAPSALEKAPAKKEAYRWK----ATNPVEVKKEPVATPK--ALKKALEHE 507

                   ....*
gi 442625924 14099 TQPSP 14103
Cdd:PRK07994   508 KTPEL 512
dnaA PRK14086
chromosomal replication initiator protein DnaA;
13977-14247 6.26e-03

chromosomal replication initiator protein DnaA;


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 44.82  E-value: 6.26e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13977 PSVPQPVYPSPQPPVYDvnyptTPVSQHPGVVNIPSAPRlvPPTSQRPVfITSPGNLSPTPQPGvinipsvsqpgYPTPQ 14056
Cdd:PRK14086    90 PSAGEPAPPPPHARRTS-----EPELPRPGRRPYEGYGG--PRADDRPP-GLPRQDQLPTARPA-----------YPAYQ 150
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14057 SPIYDANYPTTQSPIPQQpgvvnipsVPSPSYPAPNPPVNyptqpspqipvqpgviniPSAPLPTTPPQHPPvfipspes 14136
Cdd:PRK14086   151 QRPEPGAWPRAADDYGWQ--------QQRLGFPPRAPYAS------------------PASYAPEQERDREP-------- 196
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14137 pspapkpgviniPSVTHPEYPTSQVPvYDvnystTPSPIPQKPGVVNIPSaPQPvHPAPNPPVHEfnYPTPPAVPQQPGV 14216
Cdd:PRK14086   197 ------------YDAGRPEYDQRRRD-YD-----HPRPDWDRPRRDRTDR-PEP-PPGAGHVHRG--GPGPPERDDAPVV 254
                          250       260       270
                   ....*....|....*....|....*....|.
gi 442625924 14217 LNIPSYPTPVAPTPQspiyiPSQEQPKPTTR 14247
Cdd:PRK14086   255 PIRPSAPGPLAAQPA-----PAPGPGEPTAR 280
DUF5585 pfam17823
Family of unknown function (DUF5585); This is a family of unknown function found in chordata.
14149-14530 6.54e-03

Family of unknown function (DUF5585); This is a family of unknown function found in chordata.


Pssm-ID: 465521 [Multi-domain]  Cd Length: 506  Bit Score: 44.57  E-value: 6.54e-03
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14149 PSVTHPEYPTSQVPVYDVNYSTTPSPIPQKPGVVNiPSAPQPVHPAPNP--PVHEFNYPTPPAVPQQPGVLNIPSYPTPV 14226
Cdd:pfam17823   123 PSSAAQSLPAAIAALPSEAFSAPRAAACRANASAA-PRAAIAAASAPHAasPAPRTAASSTTAASSTTAASSAPTTAASS 201
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14227 AP---TPQSPIYIPSQEQPKPTTRPSVINVPSVpQPAYPTPQAPVYDVNYPTSPSVIPHQPGVVNIPSVPLPAPPVKQRP 14303
Cdd:pfam17823   202 APatlTPARGISTAATATGHPAAGTALAAVGNS-SPAAGTVTAAVGTVTPAALATLAAAAGTVASAAGTINMGDPHARRL 280
                           170       180       190       200       210       220       230       240
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14304 ---------VFVPSPVHPTPAPQPGVVNIPSVAQPVHPTYQPPVverpaiydvyypppPSRPGVINIPSPPRPVYPV--- 14371
Cdd:pfam17823   281 spakhmpsdTMARNPAAPMGAQAQGPIIQVSTDQPVHNTAGEPT--------------PSPSNTTLEPNTPKSVASTnla 346
                           250       260       270       280       290       300       310       320
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14372 --------PQQPIYVPAPVLHIPAprpvihnIPSVpqptyphrnppiqDVTYPAPQpsppvpgivniPSlPQPVSTPTSG 14443
Cdd:pfam17823   347 vvtttkaqAKEPSASPVPVLHTSM-------IPEV-------------EATSPTTQ-----------PS-PLLPTQGAAG 394
                           330       340       350       360       370       380       390       400
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14444 vinipsqasppisvptpgivniPSIPQPTPQrpsPGIINVPSVPQPIPTAPSPGIINIPSVPQPLPSPTPGVINIPQQPT 14523
Cdd:pfam17823   395 ----------------------PGILLAPEQ---VATEATAGTASAGPTPRSSGDPKTLAMASCQLSTQGQYLVVTTDPL 449

                    ....*..
gi 442625924  14524 PPPLVQQ 14530
Cdd:pfam17823   450 TPALVDK 456
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
457-490 6.55e-03

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.


Pssm-ID: 238011  Cd Length: 38  Bit Score: 38.77  E-value: 6.55e-03
                           10        20        30
                   ....*....|....*....|....*....|....*
gi 442625924   457 NINECQD-NPCGENAICTDTVGSFVCTCKPDYTGD 490
Cdd:cd00054      1 DIDECASgNPCQNGGTCVNTVGSYRCSCPPGYTGR 35
rne PRK10811
ribonuclease E; Reviewed
14221-14406 7.49e-03

ribonuclease E; Reviewed


Pssm-ID: 236766 [Multi-domain]  Cd Length: 1068  Bit Score: 44.65  E-value: 7.49e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14221 SYPtpVAPtPQSPIYIPSQEQPKPTTRPSVINVPSVPQPAYPTPQAPVYDVnyptspSVIPHQPGVVNIPSVPLPAPPVK 14300
Cdd:PRK10811   844 RYP--VVR-PQDVQVEEQREAEEVQVQPVVAEVPVAAAVEPVVSAPVVEAV------AEVVEEPVVVAEPQPEEVVVVET 914
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14301 QRPVFVPSPVhpTPAPQPGVVNIPSVAQPVhPTYQPPVVERPAIYDVYYPPPPSRPgvINIPSPPRPVYPVPQQPIYVPA 14380
Cdd:PRK10811   915 THPEVIAAPV--TEQPQVITESDVAVAQEV-AEHAEPVVEPQDETADIEEAAETAE--VVVAEPEVVAQPAAPVVAEVAA 989
                          170       180
                   ....*....|....*....|....*.
gi 442625924 14381 PVLHIPAPRPVIHNIPSVPQPTYPHR 14406
Cdd:PRK10811   990 EVETVTAVEPEVAPAQVPEATVEHNH 1015
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
298-331 8.46e-03

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.


Pssm-ID: 238011  Cd Length: 38  Bit Score: 38.39  E-value: 8.46e-03
                           10        20        30
                   ....*....|....*....|....*....|....*
gi 442625924   298 DQDECA-RTPCGRNADCLNTDGSFRCLCPDGYSGD 331
Cdd:cd00054      1 DIDECAsGNPCQNGGTCVNTVGSYRCSCPPGYTGR 35
Caprin-1_C pfam12287
Cytoplasmic activation/proliferation-associated protein-1 C term; This family of proteins is ...
14178-14369 9.42e-03

Cytoplasmic activation/proliferation-associated protein-1 C term; This family of proteins is found in eukaryotes. Proteins in this family are typically between 343 and 708 amino acids in length. This family is the C-terminal region of caprin-1. Caprin-1 is a protein involved in regulating cellular proliferation. In mutated phenotypes, the G1 phase of the cell cycle is greatly lengthened, impairing normal proliferation. The C-terminal region of caprin-1 contains RGG motifs which are characteristic of RNA binding domains. It is possible that caprin-1 functions through an RNA binding mechanism.


Pssm-ID: 463522 [Multi-domain]  Cd Length: 320  Bit Score: 43.63  E-value: 9.42e-03
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14178 KPGVVNIPSAPQPVHPAPNPPVHEFNYPTPPAVPQQPGVLNIPSyPTPVAP-TPQSPIYIPSQEQPKPTTRPSVINVPSV 14256
Cdd:pfam12287    24 KPSDSAIVSAQPPSQSPDLSQMVCPPASPEQRLSQQSDVLQQPE-QTQVSPvSPSSNACASSGSEYQFHTSEPPQPEAID 102
                            90       100       110       120       130       140       150       160
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924  14257 PQPAYPTPQAPVYDVNYPTSPS----VIPHQP---GVVNIPSVPL-----------PAPPVKQRPVFVPSPVHPT----- 14313
Cdd:pfam12287   103 PIQSSMSLPSELAPPSPPLSPAsqpqVFQSKPassSGINVNAAPFqsmqtvfnvnaPVPPRNEQELKESSQYSSGynqsf 182
                           170       180       190       200       210       220
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 442625924  14314 ------PAPQPgvvNIPS--VAQPVHPTYQP-PVVERPAIYDVYYPPPPSrpgviNIPSPPRPVY 14369
Cdd:pfam12287   183 ssqstqTVPQC---QLPSeqLEQTVVGAYHPdGTIQVSNGHLAFYPAQTN-----GFPRPPQPFY 239
EGF_CA cd00054
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ...
413-456 9.61e-03

Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.


Pssm-ID: 238011  Cd Length: 38  Bit Score: 38.39  E-value: 9.61e-03
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....
gi 442625924   413 DIDECNQPDGvakCGTNAKCINFPGSYRCLCPSGFQGQgylHCE 456
Cdd:cd00054      1 DIDECASGNP---CQNGGTCVNTVGSYRCSCPPGYTGR---NCE 38
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
14073-14309 9.74e-03

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 44.10  E-value: 9.74e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14073 QQPGVVNIPSVPSPSYPAPNPPVNYPTQPSPQIPVQPGVINIPSAPLPTTPPQhppvfipspespspapkpgvinipsvt 14152
Cdd:PRK12323   367 QSGGGAGPATAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAV--------------------------- 419
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14153 hPEYPTSQVPVYDVNYSTTPSPIPQKPGVVNIPSAPQPVhPAPNPPvhefnyptPPAVPQQPGVLNIPSYPTPVAPTPQS 14232
Cdd:PRK12323   420 -AAAPARRSPAPEALAAARQASARGPGGAPAPAPAPAAA-PAAAAR--------PAAAGPRPVAAAAAAAPARAAPAAAP 489
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 442625924 14233 PiyiPSQEQPKP-TTRPSVINVPSVPQPAYPTPQAPVYDVNYP-TSPSVIPHQPGVVNIPSVPLPAPPVKQRPVFVPSP 14309
Cdd:PRK12323   490 A---PADDDPPPwEELPPEFASPAPAQPDAAPAGWVAESIPDPaTADPDDAFETLAPAPAAAPAPRAAAATEPVVAPRP 565
PRK07994 PRK07994
DNA polymerase III subunits gamma and tau; Validated
14033-14211 9.92e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236138 [Multi-domain]  Cd Length: 647  Bit Score: 44.09  E-value: 9.92e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14033 LSPTPQPGV-------------INIPSVSQPGYPTPQSPIYDANYPTTQSPIPQQPGVVNIPSVPSPSyPAPNPPVNYPT 14099
Cdd:PRK07994   341 LAPDRRMGVemtllrmlafhpaAPLPEPEVPPQSAAPAASAQATAAPTAAVAPPQAPAVPPPPASAPQ-QAPAVPLPETT 419
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14100 QPSPQIPVQpgvinIPSAPLPTTPPQHPPVfipsPESPSPAPKPGVINIPSVTHPEYPTSQVPVYDVNYSTTPSPipqkP 14179
Cdd:PRK07994   420 SQLLAARQQ-----LQRAQGATKAKKSEPA----AASRARPVNSALERLASVRPAPSALEKAPAKKEAYRWKATN----P 486
                          170       180       190
                   ....*....|....*....|....*....|..
gi 442625924 14180 GVVNIPSAPQPVhPAPNPPVHEfnyPTPPAVP 14211
Cdd:PRK07994   487 VEVKKEPVATPK-ALKKALEHE---KTPELAA 514
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
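
Each hit in the list above carries a one-line summary of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...", and the search parameters state an E-value threshold of 0.01. As a minimal sketch, assuming the page has been saved as plain text, those summary lines can be parsed and filtered against a stricter cutoff. The names `Hit`, `parse_hit` and `filter_hits` are hypothetical helpers written for this illustration and are not part of any NCBI tool.

```python
import re
from typing import List, NamedTuple, Optional

# Matches summary lines such as
#   Pssm-ID: 410554 [Multi-domain]  Cd Length: 214  Bit Score: 43.87  E-value: 3.59e-03
# The bracketed tag is optional (some single-domain hits omit it).
HIT_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm>\d+)"
    r"(?:\s*\[(?P<kind>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<length>\d+)"
    r"\s+Bit Score:\s*(?P<bits>[\d.]+)"
    r"\s+E-value:\s*(?P<evalue>[\d.eE+-]+)"
)


class Hit(NamedTuple):
    pssm_id: int
    multi_domain: bool
    cd_length: int
    bit_score: float
    e_value: float


def parse_hit(line: str) -> Optional[Hit]:
    """Return a Hit for a summary line, or None if the line is something else."""
    m = HIT_RE.search(line)
    if m is None:
        return None
    return Hit(
        pssm_id=int(m["pssm"]),
        multi_domain=(m["kind"] == "Multi-domain"),
        cd_length=int(m["length"]),
        bit_score=float(m["bits"]),
        e_value=float(m["evalue"]),
    )


def filter_hits(lines: List[str], max_e_value: float = 0.01) -> List[Hit]:
    """Keep hits whose E-value is below the cutoff, best (lowest) first."""
    hits = [h for h in map(parse_hit, lines) if h is not None]
    return sorted((h for h in hits if h.e_value < max_e_value),
                  key=lambda h: h.e_value)


if __name__ == "__main__":
    sample = [
        "Pssm-ID: 410554 [Multi-domain]  Cd Length: 214  Bit Score: 43.87  E-value: 3.59e-03",
        "Pssm-ID: 238011  Cd Length: 38  Bit Score: 39.16  E-value: 4.33e-03",
        "Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 44.10  E-value: 9.74e-03",
    ]
    # A stricter cutoff than the page's 0.01 drops the weakest hit.
    for hit in filter_hits(sample, max_e_value=5e-3):
        print(hit)
```

Sorting by E-value mirrors the order of the hit list itself, which is ranked from the most to the least significant match.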
