| Name | Accession | Description | Interval | E-value |
| PHA03247 | PHA03247 | large tegument protein UL36; Provisional | 14019-14660 | 1.68e-35 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 153.17 E-value: 1.68e-35
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14019 PTSQRPVFITSPGNLSPTPQPGVINIPSVSQPGYPTPQSPIYDANYPTTQSPIPQQ----PGVVNIPSVPSPSYPAPNPP 14094
Cdd:PHA03247 2478 PVYRRPAEARFPFAAGAAPDPGGGGPPDPDAPPAPSRLAPAILPDEPVGEPVHPRMltwiRGLEELASDDAGDPPPPLPP 2557
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14095 VNYPTQPSPQIPvqpgviniPSAPLPTtpPQHPPVFIPSPEspspapkpgviniPSVThPEYPTSQVPVYDVNYSTTPSP 14174
Cdd:PHA03247 2558 AAPPAAPDRSVP--------PPRPAPR--PSEPAVTSRARR-------------PDAP-PQSARPRAPVDDRGDPRGPAP 2613
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14175 ipqkpgvvniPSAPQPVHPAPNPPVhefnyPTPPAVPQQPGvlNIPSYPTPVAPTPQSPIYIPSQEQPKPTTRPSVINVP 14254
Cdd:PHA03247 2614 ----------PSPLPPDTHAPDPPP-----PSPSPAANEPD--PHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQA 2676
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14255 SVP-----QPAYPTPQAPVYDVNYPTSPSVIPHQPGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVNIP-SVAQ 14328
Cdd:PHA03247 2677 SSPpqrprRRAARPTVGSLTSLADPPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPgGPAR 2756
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14329 PVHP--TYQPPVVERPAIYDVYYPPPPSRPGVINIpSPPRPVYPVPQQPIYVPAPVlhiPAPRPVIhNIPSVPQPTYPhr 14406
Cdd:PHA03247 2757 PARPptTAGPPAPAPPAAPAAGPPRRLTRPAVASL-SESRESLPSPWDPADPPAAV---LAPAAAL-PPAASPAGPLP-- 2829
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14407 nPPiqdvTYPAPQPSPPvpgivniPSLPQPVSTPTSG-------VINIPSQASPPISVPTPGIVNIPSIPQPTPQRPSPG 14479
Cdd:PHA03247 2830 -PP----TSAQPTAPPP-------PPGPPPPSLPLGGsvapggdVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTES 2897
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14480 IINVPSVPQPIPTAPSPgiinipsvPQPLPSPTPGViniPQQPTPPPlvQQPGIINIPSVQQPSTPTTQHPiQDVQYETQ 14559
Cdd:PHA03247 2898 FALPPDQPERPPQPQAP--------PPPQPQPQPPP---PPQPQPPP--PPPPRPQPPLAPTTDPAGAGEP-SGAVPQPW 2963
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14560 RPQPTPGVINIP----SVSQPTYPTQKPSyqdTSYPTVQPKPPVSG-----IINIPSVPQPV--------------PSLT 14616
Cdd:PHA03247 2964 LGALVPGRVAVPrfrvPQPAPSREAPASS---TPPLTGHSLSRVSSwasslALHEETDPPPVslkqtlwppddtedSDAD 3040
|
650 660 670 680
....*....|....*....|....*....|....*....|....
gi 442625924 14617 PGVINLPSEPSYSAPIPKPGIINVPSIPEPIPSIPQNPVQEVYH 14660
Cdd:PHA03247 3041 SLFDSDSERSDLEALDPLPPEPHDPFAHEPDPATPEAGARESPS 3084
|
|
| Atrophin-1 | pfam03154 | Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... | 14048-14503 | 3.64e-29 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 131.04 E-value: 3.64e-29
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14048 SQPGYPTPQSPIYDANYPTTQSPIPQQPGVVNipsVPSPSYPAPNPPVNYPTQPSPQIPvQPGVINIPSAPLPTTPPqhP 14127
Cdd:pfam03154 144 TSPSIPSPQDNESDSDSSAQQQILQTQPPVLQ---AQSGAASPPSPPPPGTTQAATAGP-TPSAPSVPPQGSPATSQ--P 217
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14128 PVFIPSPESPSPAPKPGviniPSVTHPEYPTSQVPVydvnystTPSPIPQKPGVVNIPSAPQPVHPAPNPPVHEfnyptp 14207
Cdd:pfam03154 218 PNQTQSTAAPHTLIQQT----PTLHPQRLPSPHPPL-------QPMTQPPPPSQVSPQPLPQPSLHGQMPPMPH------ 280
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14208 pavPQQPGVLNIPsYPTPVAPTPQSPIYIPSQEQPKPTtrpsvinvPSVPQPAYPTPQAPvydvnyPTSPSVIPHQPGVV 14287
Cdd:pfam03154 281 ---SLQTGPSHMQ-HPVPPQPFPLTPQSSQSQVPPGPS--------PAAPGQSQQRIHTP------PSQSQLQSQQPPRE 342
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14288 N-IPSVPLPAPPVKQRPVfvpSPVHPTPAPQ----PGVVNIPSVAQpVHPTYQPPVVERPAIYDVYYPPPPSRPgvinip 14362
Cdd:pfam03154 343 QpLPPAPLSMPHIKPPPT---TPIPQLPNPQshkhPPHLSGPSPFQ-MNSNLPPPPALKPLSSLSTHHPPSAHP------ 412
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14363 sPPRPVYPVPQQpiyVPAPvlhiPAPRPVIHNIPSVPQPTYPHRNP------PIQDvTYPAPQPSPPVPGIVNIPSLPQP 14436
Cdd:pfam03154 413 -PPLQLMPQSQQ---LPPP----PAQPPVLTQSQSLPPPAASHPPTsglhqvPSQS-PFPQHPFVPGGPPPITPPSGPPT 483
|
410 420 430 440 450 460
....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 442625924 14437 VSTPTSGVINIPSQASPPISVPTPGIVNIPSIPQPTPQRPsPGIINVPSVPQPIPTAPS--PGIINIPS 14503
Cdd:pfam03154 484 STSSAMPGIQPPSSASVSSSGPVPAAVSCPLPPVQIKEEA-LDEAEEPESPPPPPRSPSpePTVVNTPS 551
|
|
| ZP | smart00241 | Zona pellucida (ZP) domain; ZP proteins are responsible for sperm-adhesion of the zona ... | 17722-17957 | 9.63e-17 |
|
Zona pellucida (ZP) domain; ZP proteins are responsible for sperm-adhesion of the zona pellucida. ZP domains are also present in multidomain transmembrane proteins such as glycoprotein GP2, uromodulin and TGF-beta receptor type III (betaglycan).
Pssm-ID: 214579 Cd Length: 252 Bit Score: 85.13 E-value: 9.63e-17
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 17722 CLADGVQVEIHiTEPGFNGVLYVKGHS-KDEECRRVVNLAGETVPRTEifrVHFGSCGM--QAVKDVA--SFVLVIQKHP 17796
Cdd:smart00241 2 CGEDQMVVSVS-TDLLFPGGINVKGLTlGDPSCRPQFTDATSAFVSFE---VPLNGCGTrrQVNPDGIvySNTLVVSPFH 77
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 17797 KLVTYKAQ--AYNIKCVYQTGEKnVTLGFNVSMLTTAGTIANTGPPPICQMRIITNEGE----EINSAEIGDNLKLQVDV 17870
Cdd:smart00241 78 PGFITRDDraAYHFQCFYPENEK-VSLNLDVSTIPPTELSSVSEGPLTCSYRLYKDDSFgspyQSADYVLGDPVYHEWEC 156
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 17871 EPATI--YGGFARSCIAKTMEDNVQNEYLVTDENGCATDTSIFGNWEYNPDTNSLL-ASFNAFKFPSSDNIRFQCNIRVC 17947
Cdd:smart00241 157 DGADDppLGLLVDNCYATPGPDPSSGPKYFIIDNGCPVDGYLDSTIPYNSNPLHRArFSVKVFKFADRSLVYFHCQIRLC 236
|
250
....*....|....
gi 442625924 17948 ----FGRCQPVNCG 17957
Cdd:smart00241 237 dkddGSSCDGPACS 250
|
|
| Atrophin-1 | pfam03154 | Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... | 13798-14227 | 9.68e-17 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 90.21 E-value: 9.68e-17
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13798 PVPIIQESPLTPCDPSPCGPNAQCHPSLNEAVCSCLPEfyGTPPNCRPECTLNSECA-----YDKACVH-------HKCV 13865
Cdd:pfam03154 172 PVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQ--GSPATSQPPNQTQSTAAphtliQQTPTLHpqrlpspHPPL 249
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13866 DPCPGICGINADCRVHYHSPicyciSSHTGDPftrcyETPKPVR--PQIYDTPSPPYPVAIPdlvyvqqQQPGIVNIPSA 13943
Cdd:pfam03154 250 QPMTQPPPPSQVSPQPLPQP-----SLHGQMP-----PMPHSLQtgPSHMQHPVPPQPFPLT-------PQSSQSQVPPG 312
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13944 PQPIYPTPQSPQYNVNYPSPQPANPQKPGVVNIPSVPQPVyPSPQPPvydvnyPTTPVSQHPGvvniPSAPRLvPPTSQR 14023
Cdd:pfam03154 313 PSPAAPGQSQQRIHTPPSQSQLQSQQPPREQPLPPAPLSM-PHIKPP------PTTPIPQLPN----PQSHKH-PPHLSG 380
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14024 PVFITSPGNLSPTPQPGVINIPSVSQPGYPTPQSPIYDANYPTTQSPIPQQPGVVNIPSVPSPSYPAPNPPVNYPTQPSP 14103
Cdd:pfam03154 381 PSPFQMNSNLPPPPALKPLSSLSTHHPPSAHPPPLQLMPQSQQLPPPPAQPPVLTQSQSLPPPAASHPPTSGLHQVPSQS 460
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14104 QIPVQPgviNIPSAPLPTTPPQHPPvfipspespspapkpgviniPSVThPEYPTSQVPVYDVNYSTTPSPipqkpgvvN 14183
Cdd:pfam03154 461 PFPQHP---FVPGGPPPITPPSGPP--------------------TSTS-SAMPGIQPPSSASVSSSGPVP--------A 508
|
410 420 430 440
....*....|....*....|....*....|....*....|....*....
gi 442625924 14184 IPSAPQPVHPAPNPPVHEFNYPTPPAVPQ-----QPGVLNIPSYPTPVA 14227
Cdd:pfam03154 509 AVSCPLPPVQIKEEALDEAEEPESPPPPPrspspEPTVVNTPSHASQSA 557
|
|
| PspC_subgroup_2 | NF033839 | pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ... | 14033-14401 | 3.63e-16 |
|
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.
Pssm-ID: 468202 [Multi-domain] Cd Length: 557 Bit Score: 87.52 E-value: 3.63e-16
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14033 LSPTPQPGVINIPSVSQPGYPTPQSPIYDAnyPTTQsPIPQqpgvvniPSVPSPSYPAPNPPvnyPTQPSPQIPVQPGVI 14112
Cdd:NF033839 147 SSSSSSSGSSTKPETPQPENPEHQKPTTPA--PDTK-PSPQ-------PEGKKPSVPDINQE---KEKAKLAVATYMSKI 213
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14113 --NIPSAPLPTTPPQHPPVFIPSPESPSPAPKPGVINIPSVTHPEYPTSQV----PVYDVNYSTTPSPIPQKPGVVNIPS 14186
Cdd:NF033839 214 ldDIQKHHLQKEKHRQIVALIKELDELKKQALSEIDNVNTKVEIENTVHKIfadmDAVVTKFKKGLTQDTPKEPGNKKPS 293
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14187 APQP-VHPAPNPPVHEfnyPTPPAVPQQPGVLNIPSYPTP-VAPTPQS--PIYIPSQEQPKPTTRPSvinvPSVPQPAY- 14261
Cdd:NF033839 294 APKPgMQPSPQPEKKE---VKPEPETPKPEVKPQLEKPKPeVKPQPEKpkPEVKPQLETPKPEVKPQ----PEKPKPEVk 366
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14262 PTPQAPvydvnyptSPSVIPhQPGVvnipsvplPAPPVKQRPVfVPSP-VHPTP-APQPGVVNIPSVAQP-VHPTYQPPv 14338
Cdd:NF033839 367 PQPEKP--------KPEVKP-QPET--------PKPEVKPQPE-KPKPeVKPQPeKPKPEVKPQPEKPKPeVKPQPEKP- 427
|
330 340 350 360 370 380
....*....|....*....|....*....|....*....|....*....|....*....|....
gi 442625924 14339 veRPaiyDVYYPPPPSRPGVINIPSPPRP-VYPVPQQPiyVPAPVLHIPAPRPVIHNIPSVPQP 14401
Cdd:NF033839 428 --KP---EVKPQPEKPKPEVKPQPEKPKPeVKPQPETP--KPEVKPQPEKPKPEVKPQPEKPKP 484
|
|
| PspC_subgroup_2 | NF033839 | pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ... | 13941-14278 | 8.64e-16 |
|
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.
Pssm-ID: 468202 [Multi-domain] Cd Length: 557 Bit Score: 86.36 E-value: 8.64e-16
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13941 PSAPQPIYPTPQSPQYNV--NYPSPQP--ANPQKPGVVNIPSVPQP-VYPSPQPPVYDVNYPTTPVSQHPGVVNIPSA-- 14013
Cdd:NF033839 159 PETPQPENPEHQKPTTPApdTKPSPQPegKKPSVPDINQEKEKAKLaVATYMSKILDDIQKHHLQKEKHRQIVALIKEld 238
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14014 --------------PRLVPPTSQRPVFIT--------SPGNLSPTPQPGVINIPSVSQPGY-PTPQSPIydanypTTQSP 14070
Cdd:NF033839 239 elkkqalseidnvnTKVEIENTVHKIFADmdavvtkfKKGLTQDTPKEPGNKKPSAPKPGMqPSPQPEK------KEVKP 312
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14071 IPQQPGVVNIPSVPSPSyPAPNPPvnyPTQPSPQIPVQPGVINIPSAPLPTTP-PQHPPvfipspesPSPAPKPGVINIP 14149
Cdd:NF033839 313 EPETPKPEVKPQLEKPK-PEVKPQ---PEKPKPEVKPQLETPKPEVKPQPEKPkPEVKP--------QPEKPKPEVKPQP 380
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14150 SVTHPEY-PTSQVPVYDVNysttPSPIPQKPGVVNIPSAPQP-VHPAPNPPVHEFNyPTPPAvpQQPGVLNIPSYPTP-V 14226
Cdd:NF033839 381 ETPKPEVkPQPEKPKPEVK----PQPEKPKPEVKPQPEKPKPeVKPQPEKPKPEVK-PQPEK--PKPEVKPQPEKPKPeV 453
|
330 340 350 360 370
....*....|....*....|....*....|....*....|....*....|....
gi 442625924 14227 APTPQSPI--YIPSQEQPKPTTRPSvinvPSVPQPAYPTPQApvyDVNYPTSPS 14278
Cdd:NF033839 454 KPQPETPKpeVKPQPEKPKPEVKPQ----PEKPKPDNSKPQA---DDKKPSTPN 500
|
|
| PspC_subgroup_2 | NF033839 | pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ... | 14354-14692 | 5.51e-15 |
|
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.
Pssm-ID: 468202 [Multi-domain] Cd Length: 557 Bit Score: 83.66 E-value: 5.51e-15
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14354 SRPGVINIPSPPRPVYPVPQQPIyVPAPVLHiPAPRPVIHNiPSVPQPTYPHRNPPIQDVTYPAPQPSPPVPGIVNIPSL 14433
Cdd:NF033839 151 SSSGSSTKPETPQPENPEHQKPT-TPAPDTK-PSPQPEGKK-PSVPDINQEKEKAKLAVATYMSKILDDIQKHHLQKEKH 227
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14434 PQPVSTPTSgVINIPSQASPPISVPTPGIVnipsiPQPTPQRPSPGIINVPSVPQP--IPTAPSPGIINIPSVPQPL--P 14509
Cdd:NF033839 228 RQIVALIKE-LDELKKQALSEIDNVNTKVE-----IENTVHKIFADMDAVVTKFKKglTQDTPKEPGNKKPSAPKPGmqP 301
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14510 SPTPGVINIPQQPTPP-----PLVQQPGiiniPSVQ-QPSTPTtqhPIQDVQYETQRPQ-------PTPGVINIPSVSQP 14576
Cdd:NF033839 302 SPQPEKKEVKPEPETPkpevkPQLEKPK----PEVKpQPEKPK---PEVKPQLETPKPEvkpqpekPKPEVKPQPEKPKP 374
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14577 TYPTQ----KPSYQ---DTSYPTVQPKPPVSGIINIPSVPQPVPSLTPGVINLPSEPSYSAPIPKPGIINVPSIPEP-IP 14648
Cdd:NF033839 375 EVKPQpetpKPEVKpqpEKPKPEVKPQPEKPKPEVKPQPEKPKPEVKPQPEKPKPEVKPQPEKPKPEVKPQPEKPKPeVK 454
|
330 340 350 360
....*....|....*....|....*....|....*....|....
gi 442625924 14649 SIPQNPVQEVYHDTQKPQaiPGVVNVPSAPQPTPGRPYYDVAKP 14692
Cdd:NF033839 455 PQPETPKPEVKPQPEKPK--PEVKPQPEKPKPDNSKPQADDKKP 496
|
|
| Streccoc_I_II | NF033804 | antigen I/II family LPXTG-anchored adhesin; Members of the antigen I/II family are adhesins ... | 14157-14365 | 1.60e-13 |
|
antigen I/II family LPXTG-anchored adhesin; Members of the antigen I/II family are adhesins with a glucan-binding domain, two types of repetitive regions, an isopeptide bond-forming domain associated with shear resistance, and a C-terminal LPXTG motif for anchoring to the cell wall. They occur in oral Streptococci, and tend to be major cell surface adhesins. Members of this family include SspA and SspB from Streptococcus gordonii, antigen I/II from S. mutans, etc.
Pssm-ID: 468188 [Multi-domain] Cd Length: 1552 Bit Score: 79.98 E-value: 1.60e-13
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14157 PTSQVPVYDVNYSTTPspipQKPGV----------VNIPSAPQ-----PVHP-APNPPVHEFNYPTPPAvpqqPGVLNIP 14220
Cdd:NF033804 791 PSDEMPAVPGRDNTEG----KKPNIwyslngkiraVNVPKITKekptpPVAPtAPQAPTYEVEKPLEPA----PVAPTYE 862
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14221 SYPTPVAPTPQspiyipsQEQPKPTTRPSVinvpSVPQPAYPTPQAPVYDvNYPTSPSVIPHQPgvvnIPSVPLPAPPVK 14300
Cdd:NF033804 863 NEPTPPVKTPD-------QPEPSKPEEPTY----ETEKPLEPAPVAPTYE-NEPTPPVKTPDQP----EPSKPEEPTYET 926
|
170 180 190 200 210 220
....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 442625924 14301 QRPVfVPSPVHPT----PAPQPGVVNIPSVAQPVHPTYQPpvverpaiydvyYPPPPSRPGVINIPSPP 14365
Cdd:NF033804 927 EKPL-EPAPVAPSyenePTPPVKTPDQPEPSKPVEPTYDP------------LPTPPVAPTPKQLPTPP 982
|
|
| PBP1 | COG5180 | PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ... | 14020-14528 | 1.18e-10 |
|
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];
Pssm-ID: 444064 [Multi-domain] Cd Length: 548 Bit Score: 69.71 E-value: 1.18e-10
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14020 TSQRPVFITSPGNLS-PTPQPGVINIPSVSQPGYPTPQSPIYDANYPTTQSPipqqPGVVNIPSvPSPSYPA-------- 14090
Cdd:COG5180 2 RKATILEIRLLATVPiPPNAARPVLSPELWAAANNDAVSQGDRSALASSPTR----PYARKIFE-PLDIKLAlgkpqlps 76
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14091 -PNPPVNYPTQP---SPQIPVQP--GVINIPSAPLPTTPPQHPPVFIPSPESPSpapkpgVINIPSVTHPEYPTSQVPVY 14164
Cdd:COG5180 77 vAEPEAYLDPAPpksSPDTPEEQlgAPAGDLLVLPAAKTPELAAGALPAPAAAA------ALPKAKVTREATSASAGVAL 150
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14165 DVNYSTTPSPIPQKPGVVNIPSAPQPVHPAPN-----PPVHEFNYPTP---PAVPQQPGVLNIPSYPTPVAPTPQsPIYI 14236
Cdd:COG5180 151 AAALLQRSDPILAKDPDGDSASTLPPPAEKLDkvltePRDALKDSPEKldrPKVEVKDEAQEEPPDLTGGADHPR-PEAA 229
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14237 PSQEQPKPTTRPSVINVPSVPQPAYPTPQAPVYDvnyptspsviPHQPGVVNIPSVPLPAPPV---KQRPVFV-PSPVHP 14312
Cdd:COG5180 230 SSPKVDPPSTSEARSRPATVDAQPEMRPPADAKE----------RRRAAIGDTPAAEPPGLPVleaGSEPQSDaPEAETA 299
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14313 TPAPQPGVVNIPSVAQPVHPT---------YQPPVVERPAIYDVYYPPPPSRPGVINIPSPPRPVYPVPQQpiyVPAPVl 14383
Cdd:COG5180 300 RPIDVKGVASAPPATRPVRPPggardpgtpRPGQPTERPAGVPEAASDAGQPPSAYPPAEEAVPGKPLEQG---APRPG- 375
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14384 HIPAPRPVIHNIPSVPQPTYPHRNPPIQDVTYPAPQPSPPVPGIVNIPSLPQPVSTPTS------GVINIPSQASPPISV 14457
Cdd:COG5180 376 SSGGDGAPFQPPNGAPQPGLGRRGAPGPPMGAGDLVQAALDGGGRETASLGGAAGGAGQgpkadfVPGDAESVSGPAGLA 455
|
490 500 510 520 530 540 550
....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 442625924 14458 PTPGIVNIPSIPQPTPQRPSPGIINVPSVPQPIPTAPSPGiinIPSVPQPLPSPTPgVINIPQQPTPPPLV 14528
Cdd:COG5180 456 DQAGAAASTAMADFVAPVTDATPVDVADVLGVRPDAILGG---NVAPASGLDAETR-IIEAEGAPATEDFV 522
|
|
| SP2_N | cd22540 | N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins ... | 14066-14586 | 4.04e-09 |
|
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate, or in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.
Pssm-ID: 411776 [Multi-domain] Cd Length: 511 Bit Score: 64.56 E-value: 4.04e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14066 TTQSPIPQQPGVVNIP-SVPSPsyPAPNPPVnypTQPSPQIPVQPGVINIPSAPLPTTPPQHPPVFIPSPESpspapkpg 14144
Cdd:cd22540 18 TTQDSQPSPLALLAATcSKIGP--PAVEAAV---TPPAPPQPTPRKLVPIKPAPLPLGPGKNSIGFLSAKGN-------- 84
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14145 VINI-PSVTHPEYPTSQVPVYDVN-------YSTTPSPIPQKPGVVNIPSAPQP-------VHPAPNPpvhefNYPTPPA 14209
Cdd:cd22540 85 IIQLqGSQLSSSAPGGQQVFAIQNptmiikgSQTRSSTNQQYQISPQIQAAGQInnsgqiqIIPGTNQ-----AIITPVQ 159
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14210 VPQQPgvlNIPSYPTPVAPTPQSPIYIPSQEQPKPTTrpsVINVPSVPQPAYPTPQAPVYDVNYPTSPSVIPHQPGVVN- 14288
Cdd:cd22540 160 VLQQP---QQAHKPVPIKPAPLQTSNTNSASLQVPGN---VIKLQSGGNVALTLPVNNLVGTQDGATQLQLAAAPSKPSk 233
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14289 -----IPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGvVNIPSVAQPvhPTYQPPVVERpaiydVYYPPPPSRPGVINIps 14363
Cdd:cd22540 234 kirkkSAQAAQPAVTVAEQVETVLIETTADNIIQAG-NNLLIVQSP--GTGQPAVLQQ-----VQVLQPKQEQQVVQI-- 303
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14364 pprpvypvPQQPIYVpapvlhipaPRPVIHNIPSVPQPtyPHRNPPIQdvtypapqpsppvpgivNIPSLPQPV--STPT 14441
Cdd:cd22540 304 --------PQQALRV---------VQAASATLPTVPQK--PLQNIQIQ-----------------NSEPTPTQVyiKTPS 347
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14442 SGVINIPSQASPPISVPTPgivniPSIPQPTPQRPSPGIINVPSVPQPIPTAPspgiinipsvPQPLPSPTPGVI--NIP 14519
Cdd:cd22540 348 GEVQTVLLQEAPAATATPS-----SSTSTVQQQVTANNGTGTSKPNYNVRKER----------TLPKIAPAGGIIslNAA 412
|
490 500 510 520 530 540
....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 442625924 14520 QQPTPPPLVQQpgiINIPSVQQPSTPTTQhpiqdvqyeTQRP-QPTPGVINIPSVSQPTYPTQKPSYQ 14586
Cdd:cd22540 413 QLAAAAQAIQT---ININGVQVQGVPVTI---------TNAGgQQQLTVQTVSSNNLTISGLSPTQIQ 468
|
|
| PspC_subgroup_2 | NF033839 | pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ... | 13903-14122 | 5.00e-09 |
|
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.
Pssm-ID: 468202 [Multi-domain] Cd Length: 557 Bit Score: 64.40 E-value: 5.00e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13903 ETPKP-VRPQiydtPSPPYPVAIPDLvyvQQQQPGIVNIPSAPQP-IYPTPQSPQYNVnypSPQPANPqKPGVVnipsvP 13980
Cdd:NF033839 326 EKPKPeVKPQ----PEKPKPEVKPQL---ETPKPEVKPQPEKPKPeVKPQPEKPKPEV---KPQPETP-KPEVK-----P 389
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13981 QPVYP----SPQPPVYDVNYPTTPVSQHPGVVNIPSAPRL-VPPTSQRPvfitspgNLSPTPQPGVINiPSV-SQPGYPT 14054
Cdd:NF033839 390 QPEKPkpevKPQPEKPKPEVKPQPEKPKPEVKPQPEKPKPeVKPQPEKP-------KPEVKPQPEKPK-PEVkPQPETPK 461
|
170 180 190 200 210 220 230
....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 442625924 14055 PQ-SPIYDANYPTTQsPIPQQPGVVNipSVPSPSYPAPNPPVNYP--TQPSPQIPVQPGVINIPSAPLPTT 14122
Cdd:NF033839 462 PEvKPQPEKPKPEVK-PQPEKPKPDN--SKPQADDKKPSTPNNLSkdKQPSNQASTNEKATNKPKKSLPST 529
|
|
| PRK10263 | PRK10263 | DNA translocase FtsK; Provisional | 13907-14025 | 8.08e-08 |
|
DNA translocase FtsK; Provisional
Pssm-ID: 236669 [Multi-domain] Cd Length: 1355 Bit Score: 61.25 E-value: 8.08e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13907 PVRPQIYDTPSPPY------PVAIPDLVYVQQQQPGIVNIPSAP-----QPIYPTPQSPQYNVNY----PSPQPANPQKP 13971
Cdd:PRK10263 731 PMKALLDDGPHEPLftpivePVQQPQQPVAPQQQYQQPQQPVAPqpqyqQPQQPVAPQPQYQQPQqpvaPQPQYQQPQQP 810
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13972 -------GVVNIPSVPQPVYPSPQPPVYD-------------------VNYPTTPvsqhpgvvnIPSAPRLVPPTSQ-RP 14024
Cdd:PRK10263 811 vapqpqyQQPQQPVAPQPQYQQPQQPVAPqpqdtllhpllmrngdsrpLHKPTTP---------LPSLDLLTPPPSEvEP 881
|
.
gi 442625924 14025 V 14025
Cdd:PRK10263 882 V 882
|
|
| Amelogenin | smart00818 | Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ... | 14483-14626 | 2.38e-06 |
|
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.
Pssm-ID: 197891 [Multi-domain] Cd Length: 165 Bit Score: 52.48 E-value: 2.38e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14483 VPSVPQPIPTAPSPGIINIPSVPQPLPSptpgvinIPQQPtpppLVQQPGiinipsvQQPSTPTTQHPIQDVQYETQRPQ 14562
Cdd:smart00818 40 IPVSQQHPPTHTLQPHHHIPVLPAQQPV-------VPQQP----LMPVPG-------QHSMTPTQHHQPNLPQPAQQPFQ 101
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 442625924 14563 PTPgviniPSVSQPTYPTQKPsyqdtsyPTVQPKPPVSGIINIPSVP--QPVPSLTPgviNLPSEP 14626
Cdd:smart00818 102 PQP-----LQPPQPQQPMQPQ-------PPVHPIPPLPPQPPLPPMFpmQPLPPLLP---DLPLEA 152
|
|
| EGF_CA | smart00179 | Calcium-binding EGF-like domain; | 255-286 | 8.29e-06 |
|
Calcium-binding EGF-like domain;
Pssm-ID: 214542 [Multi-domain] Cd Length: 39 Bit Score: 46.86 E-value: 8.29e-06
10 20 30
....*....|....*....|....*....|..
gi 442625924 255 DVDECSYPNVCGPGAICTNLEGSYRCDCPPGY 286
Cdd:smart00179 1 DIDECASGNPCQNGGTCVNTVGSYRCECPPGY 32
|
|
| EGF_CA | cd00054 | Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... | 255-289 | 1.44e-05 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 46.48 E-value: 1.44e-05
10 20 30
....*....|....*....|....*....|....*
gi 442625924 255 DVDECSYPNVCGPGAICTNLEGSYRCDCPPGYDGD 289
Cdd:cd00054 1 DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYTGR 35
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
338-373 |
3.22e-05 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function, and calcium-binding sites have been found at the N-terminus of particular EGF-like domains; calcium binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non-calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 45.32 E-value: 3.22e-05
10 20 30
....*....|....*....|....*....|....*.
gi 442625924 338 DVDECATNNPCGLGAECVNLGGSFQCRCPSGFVLEH 373
Cdd:cd00054 1 DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYTGRN 36
|
|
| EGF_CA |
smart00179 |
Calcium-binding EGF-like domain; |
338-369 |
7.53e-05 |
|
Calcium-binding EGF-like domain;
Pssm-ID: 214542 [Multi-domain] Cd Length: 39 Bit Score: 44.16 E-value: 7.53e-05
10 20 30
....*....|....*....|....*....|..
gi 442625924 338 DVDECATNNPCGLGAECVNLGGSFQCRCPSGF 369
Cdd:smart00179 1 DIDECASGNPCQNGGTCVNTVGSYRCECPPGY 32
|
|
| PBP1 |
COG5180 |
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ... |
13903-14128 |
9.23e-05 |
|
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];
Pssm-ID: 444064 [Multi-domain] Cd Length: 548 Bit Score: 50.45 E-value: 9.23e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13903 ETPKPVRPQIYDTPSPPYPVAIPDLVYvQQQQPGIVNIPSAPQPIYPTPQSPQYNVNYP---------SPQPANP--QKP 13971
Cdd:COG5180 274 AAEPPGLPVLEAGSEPQSDAPEAETAR-PIDVKGVASAPPATRPVRPPGGARDPGTPRPgqpterpagVPEAASDagQPP 352
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13972 GVVNIPSVPQPVYPSPQ--PPVYDVNYPTTPV----------SQHPGVVN-IPSAPRLVPPTSQRPVFIT-------SPG 14031
Cdd:COG5180 353 SAYPPAEEAVPGKPLEQgaPRPGSSGGDGAPFqppngapqpgLGRRGAPGpPMGAGDLVQAALDGGGRETaslggaaGGA 432
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14032 NLSPTPQPGVINIPSVSQPGYPTPQSPIydanyptTQSPIPQQPGVV--NIPSVPSPSYPAPNPPVNYPTQPSPQIPVQP 14109
Cdd:COG5180 433 GQGPKADFVPGDAESVSGPAGLADQAGA-------AASTAMADFVAPvtDATPVDVADVLGVRPDAILGGNVAPASGLDA 505
|
250
....*....|....*....
gi 442625924 14110 GVINIPSAPLPTTPPQHPP 14128
Cdd:COG5180 506 ETRIIEAEGAPATEDFVAA 524
|
|
| EGF_3 |
pfam12947 |
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes ... |
137-166 |
1.20e-04 |
|
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes the C-terminal domain of the malaria parasite MSP1 protein.
Pssm-ID: 463759 [Multi-domain] Cd Length: 36 Bit Score: 43.74 E-value: 1.20e-04
10 20 30
....*....|....*....|....*....|
gi 442625924 137 PCDVFAHCTNTLGSFTCTCFPGYRGNGFHC 166
Cdd:pfam12947 7 GCHPNATCTNTGGSFTCTCNDGYTGDGVTC 36
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
212-247 |
1.63e-04 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function, and calcium-binding sites have been found at the N-terminus of particular EGF-like domains; calcium binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non-calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 43.39 E-value: 1.63e-04
10 20 30
....*....|....*....|....*....|....*.
gi 442625924 212 DVDECRNPENCGPNALCTNTPGNYTCSCPDGYVGNN 247
Cdd:cd00054 1 DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYTGRN 36
|
|
| Zona_pellucida |
pfam00100 |
Zona pellucida-like domain; |
17722-17947 |
1.85e-04 |
|
Zona pellucida-like domain;
Pssm-ID: 459673 [Multi-domain] Cd Length: 254 Bit Score: 48.37 E-value: 1.85e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 17722 CLADGVQVEIHITEPGFNGVLY--VKGHSKDEECRRVVNLAGETVprtEIFRVHFGSCG--MQAVKDVA--SFVLVIQKH 17795
Cdd:pfam00100 1 CTPDTMTVSISKCLLVPSGLLSslSLLGGLDPSCKPVSNTNGSPA---VLFEFPLTGCGttVQVNGTHIiySNTLYSSTD 77
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 17796 PKLVTYK---AQAYNIKCVYQTGEkNVTLGFNVSMLTTAGTIANTGPPPIcQMRIITNE------GEEINSAEIGDNLKL 17866
Cdd:pfam00100 78 LRSGIIRrtiTRRLPFSCSYPRSS-LVSLLVVAPPSPVPITVSGSGVFLV-SMDLYYDSsytspySPYPVTVLLGDPLYV 155
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 17867 QVDVEPAT--IYGGFARSCIAkTMEDNVQNEYLVTD-ENGCATDTSIFGNWEYNPDTNSLLA--SFNAFKF--PSSDNIR 17939
Cdd:pfam00100 156 EVSLLSRTdpNLVLVLDNCWA-TPSPNPTSSPQYQLiVNGCPNDGDSTYPVSSLSNGPSHYVrfSFKAFRFvgSSISQVY 234
|
....*...
gi 442625924 17940 FQCNIRVC 17947
Cdd:pfam00100 235 LHCSVSVC 242
|
|
| EGF_CA |
smart00179 |
Calcium-binding EGF-like domain; |
212-243 |
7.50e-04 |
|
Calcium-binding EGF-like domain;
Pssm-ID: 214542 [Multi-domain] Cd Length: 39 Bit Score: 41.46 E-value: 7.50e-04
10 20 30
....*....|....*....|....*....|..
gi 442625924 212 DVDECRNPENCGPNALCTNTPGNYTCSCPDGY 243
Cdd:smart00179 1 DIDECASGNPCQNGGTCVNTVGSYRCECPPGY 32
|
|
| EGF_3 |
pfam12947 |
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes ... |
218-246 |
8.66e-04 |
|
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes the C-terminal domain of the malaria parasite MSP1 protein.
Pssm-ID: 463759 [Multi-domain] Cd Length: 36 Bit Score: 41.43 E-value: 8.66e-04
10 20
....*....|....*....|....*....
gi 442625924 218 NPENCGPNALCTNTPGNYTCSCPDGYVGN 246
Cdd:pfam12947 4 NNGGCHPNATCTNTGGSFTCTCNDGYTGD 32
|
|
| EGF_CA |
smart00179 |
Calcium-binding EGF-like domain; |
1022-1056 |
1.18e-03 |
|
Calcium-binding EGF-like domain;
Pssm-ID: 214542 [Multi-domain] Cd Length: 39 Bit Score: 41.08 E-value: 1.18e-03
10 20 30
....*....|....*....|....*....|....*
gi 442625924 1022 DVDECEERGaqLCAFGAQCVNKPGSYSCHCPEGYQ 1056
Cdd:smart00179 1 DIDECASGN--PCQNGGTCVNTVGSYRCECPPGYT 33
|
|
| f2_encap_cargo1 |
NF041166 |
family 2A encapsulin nanocompartment cargo protein cysteine desulfurase; Capsid-like ... |
14451-14667 |
1.26e-03 |
|
family 2A encapsulin nanocompartment cargo protein cysteine desulfurase; Capsid-like encapsulin nanocompartments are commonly found in bacteria and archaea. Encapsulin nanocompartments, which are assembled from shell proteins, encapsulate various cargo proteins, typically peroxidases or ferritin-like proteins, to protect cells from oxidative stress caused by peroxide. Proteins of this family are cysteine desulfurases with an additional N-terminal encapsulation targeting sequence (~200 aa) that is necessary and sufficient for compartmentalization.
Pssm-ID: 469077 [Multi-domain] Cd Length: 623 Bit Score: 47.16 E-value: 1.26e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14451 ASPPISVPTPGI---VNIPSIPQPTPQRPSPGIINV-PSVPQ-PIPTAPSPGIINIPSVPQPLPSPTPGVinipqqPTPP 14525
Cdd:NF041166 33 SALPGEAPAPGLpaaPPAAPAPPGSNPAPAAGPGGLgAGVPGaALPQGLVPGANLLPSAPSPVGALGASA------PALA 106
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14526 PLVQQPgIINIPSVQQPSTPTTQHPIQDVQY-------ETQRPQPTPGVINIPSVSQPTYPTQKPSYQDTSyPTVQPKPP 14598
Cdd:NF041166 107 PHAAAG-NVGLPDAVVAVAPAEPRAGGAALPvglpqapVPAAPSAAAAPPDLVAPQAFGLPGEDAALRALL-PAASPAPP 184
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14599 VSgiiniPSVPQPVPS---LTPGVINLPSEPSYSAPIPKPG---IINVPSIPE--PIpsipqnpVQE-------VYHD-- 14661
Cdd:NF041166 185 SA-----PSAAAAESSyyfLDERAAPSPAAAPPGSPPALASahpPFDVNAVRRdfPI-------LQErvngkplVWFDna 252
|
....*...
gi 442625924 14662 --TQKPQA 14667
Cdd:NF041166 253 atTQKPQA 260
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
1022-1058 |
4.33e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function, and calcium-binding sites have been found at the N-terminus of particular EGF-like domains; calcium binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non-calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 39.16 E-value: 4.33e-03
10 20 30
....*....|....*....|....*....|....*..
gi 442625924 1022 DVDECEERGaqLCAFGAQCVNKPGSYSCHCPEGYQGD 1058
Cdd:cd00054 1 DIDECASGN--PCQNGGTCVNTVGSYRCSCPPGYTGR 35
|
|
| EGF_3 |
pfam12947 |
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes ... |
461-490 |
4.93e-03 |
|
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes the C-terminal domain of the malaria parasite MSP1 protein.
Pssm-ID: 463759 [Multi-domain] Cd Length: 36 Bit Score: 39.12 E-value: 4.93e-03
10 20 30
....*....|....*....|....*....|..
gi 442625924 461 CQDNP--CGENAICTDTVGSFVCTCKPDYTGD 490
Cdd:pfam12947 1 CSDNNggCHPNATCTNTGGSFTCTCNDGYTGD 32
|
|
| Amelogenin |
smart00818 |
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ... |
13907-13991 |
4.96e-03 |
|
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9 bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.
Pssm-ID: 197891 [Multi-domain] Cd Length: 165 Bit Score: 42.85 E-value: 4.96e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13907 PVRPQIYDTPSPPYPVAIPdlvyVQQQQPGIVNIPSAP-QPIYPTPQSPQYNVNYPSPQ-PANPQKPgvvniPSVPQPVY 13984
Cdd:smart00818 66 PVVPQQPLMPVPGQHSMTP----TQHHQPNLPQPAQQPfQPQPLQPPQPQQPMQPQPPVhPIPPLPP-----QPPLPPMF 136
|
....*...
gi 442625924 13985 P-SPQPPV 13991
Cdd:smart00818 137 PmQPLPPL 144
|
|
| EGF_3 |
pfam12947 |
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes ... |
676-702 |
5.77e-03 |
|
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes the C-terminal domain of the malaria parasite MSP1 protein.
Pssm-ID: 463759 [Multi-domain] Cd Length: 36 Bit Score: 39.12 E-value: 5.77e-03
10 20
....*....|....*....|....*..
gi 442625924 676 GSCGQNATCTNSAGGFTCACPPGFSGD 702
Cdd:pfam12947 6 GGCHPNATCTNTGGSFTCTCNDGYTGD 32
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
457-490 |
6.55e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function, and calcium-binding sites have been found at the N-terminus of particular EGF-like domains; calcium binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non-calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 38.77 E-value: 6.55e-03
10 20 30
....*....|....*....|....*....|....*
gi 442625924 457 NINECQD-NPCGENAICTDTVGSFVCTCKPDYTGD 490
Cdd:cd00054 1 DIDECASgNPCQNGGTCVNTVGSYRCSCPPGYTGR 35
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
298-331 |
8.46e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function, and calcium-binding sites have been found at the N-terminus of particular EGF-like domains; calcium binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non-calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 38.39 E-value: 8.46e-03
10 20 30
....*....|....*....|....*....|....*
gi 442625924 298 DQDECA-RTPCGRNADCLNTDGSFRCLCPDGYSGD 331
Cdd:cd00054 1 DIDECAsGNPCQNGGTCVNTVGSYRCSCPPGYTGR 35
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
413-456 |
9.61e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function, and calcium-binding sites have been found at the N-terminus of particular EGF-like domains; calcium binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non-calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 38.39 E-value: 9.61e-03
10 20 30 40
....*....|....*....|....*....|....*....|....
gi 442625924 413 DIDECNQPDGvakCGTNAKCINFPGSYRCLCPSGFQGQgylHCE 456
Cdd:cd00054 1 DIDECASGNP---CQNGGTCVNTVGSYRCSCPPGYTGR---NCE 38
|
|
|
|
Name |
Accession |
Description |
Interval |
E-value |
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
14019-14660 |
1.68e-35 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 153.17 E-value: 1.68e-35
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14019 PTSQRPVFITSPGNLSPTPQPGVINIPSVSQPGYPTPQSPIYDANYPTTQSPIPQQ----PGVVNIPSVPSPSYPAPNPP 14094
Cdd:PHA03247 2478 PVYRRPAEARFPFAAGAAPDPGGGGPPDPDAPPAPSRLAPAILPDEPVGEPVHPRMltwiRGLEELASDDAGDPPPPLPP 2557
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14095 VNYPTQPSPQIPvqpgviniPSAPLPTtpPQHPPVFIPSPEspspapkpgviniPSVThPEYPTSQVPVYDVNYSTTPSP 14174
Cdd:PHA03247 2558 AAPPAAPDRSVP--------PPRPAPR--PSEPAVTSRARR-------------PDAP-PQSARPRAPVDDRGDPRGPAP 2613
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14175 ipqkpgvvniPSAPQPVHPAPNPPVhefnyPTPPAVPQQPGvlNIPSYPTPVAPTPQSPIYIPSQEQPKPTTRPSVINVP 14254
Cdd:PHA03247 2614 ----------PSPLPPDTHAPDPPP-----PSPSPAANEPD--PHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQA 2676
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14255 SVP-----QPAYPTPQAPVYDVNYPTSPSVIPHQPGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVNIP-SVAQ 14328
Cdd:PHA03247 2677 SSPpqrprRRAARPTVGSLTSLADPPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPgGPAR 2756
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14329 PVHP--TYQPPVVERPAIYDVYYPPPPSRPGVINIpSPPRPVYPVPQQPIYVPAPVlhiPAPRPVIhNIPSVPQPTYPhr 14406
Cdd:PHA03247 2757 PARPptTAGPPAPAPPAAPAAGPPRRLTRPAVASL-SESRESLPSPWDPADPPAAV---LAPAAAL-PPAASPAGPLP-- 2829
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14407 nPPiqdvTYPAPQPSPPvpgivniPSLPQPVSTPTSG-------VINIPSQASPPISVPTPGIVNIPSIPQPTPQRPSPG 14479
Cdd:PHA03247 2830 -PP----TSAQPTAPPP-------PPGPPPPSLPLGGsvapggdVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTES 2897
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14480 IINVPSVPQPIPTAPSPgiinipsvPQPLPSPTPGViniPQQPTPPPlvQQPGIINIPSVQQPSTPTTQHPiQDVQYETQ 14559
Cdd:PHA03247 2898 FALPPDQPERPPQPQAP--------PPPQPQPQPPP---PPQPQPPP--PPPPRPQPPLAPTTDPAGAGEP-SGAVPQPW 2963
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14560 RPQPTPGVINIP----SVSQPTYPTQKPSyqdTSYPTVQPKPPVSG-----IINIPSVPQPV--------------PSLT 14616
Cdd:PHA03247 2964 LGALVPGRVAVPrfrvPQPAPSREAPASS---TPPLTGHSLSRVSSwasslALHEETDPPPVslkqtlwppddtedSDAD 3040
|
650 660 670 680
....*....|....*....|....*....|....*....|....
gi 442625924 14617 PGVINLPSEPSYSAPIPKPGIINVPSIPEPIPSIPQNPVQEVYH 14660
Cdd:PHA03247 3041 SLFDSDSERSDLEALDPLPPEPHDPFAHEPDPATPEAGARESPS 3084
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
13946-14600 |
3.78e-34 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 148.55 E-value: 3.78e-34
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13946 PIYPTP---QSPQYNVNYPSPQPANPQKPGVVNIPSVPQPVYPSPQPpvydVNYPTTP------------VSQHPGVVNI 14010
Cdd:PHA03247 2478 PVYRRPaeaRFPFAAGAAPDPGGGGPPDPDAPPAPSRLAPAILPDEP----VGEPVHPrmltwirgleelASDDAGDPPP 2553
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14011 PSAPRLVPPTSQRPVfitspgnlsPTPQPGVINI-PSVS----QPGYP----TPQSPIYDANYPTTQSPipqqpgvvniP 14081
Cdd:PHA03247 2554 PLPPAAPPAAPDRSV---------PPPRPAPRPSePAVTsrarRPDAPpqsaRPRAPVDDRGDPRGPAP----------P 2614
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14082 SVPSPSYPAPNPPVNYPTqPSPQIPVQPGVINIPSAPLPTTPPQHPPVFIPSPESPSPAPKPGVINIPSVTHPEYPTSQV 14161
Cdd:PHA03247 2615 SPLPPDTHAPDPPPPSPS-PAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAARPTVG 2693
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14162 PVYDVNYSTTPSPIPQKPGVVNIPSAPQPVHPA------PNPPVHefnyPTPPAVPQQP----GVLNIPSYPTPVAPTPQ 14231
Cdd:PHA03247 2694 SLTSLADPPPPPPTPEPAPHALVSATPLPPGPAaarqasPALPAA----PAPPAVPAGPatpgGPARPARPPTTAGPPAP 2769
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14232 SPIYIPSQEQPKPTTRPSVINVpSVPQPAYPTPQAPvydvnyptSPSVIPHQPGVVNIPSVPLPAPPVKQRPVFVPSPVH 14311
Cdd:PHA03247 2770 APPAAPAAGPPRRLTRPAVASL-SESRESLPSPWDP--------ADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPP 2840
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14312 PTPAPQPGVVNIPSVAQPVHPTYQPPVVERPAIydvyYPPPPSRPGVINIPSPprpvyPVPQQPIYVPAPVLHIPAPRPv 14391
Cdd:PHA03247 2841 PPPGPPPPSLPLGGSVAPGGDVRRRPPSRSPAA----KPAAPARPPVRRLARP-----AVSRSTESFALPPDQPERPPQ- 2910
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14392 ihniPSVPQPTYPHRNPPiqdvtypapqpsppvpgivnIPSLPQPvSTPTSGvinIPSQASPPISVPTPGIVNIPSIPQP 14471
Cdd:PHA03247 2911 ----PQAPPPPQPQPQPP--------------------PPPQPQP-PPPPPP---RPQPPLAPTTDPAGAGEPSGAVPQP 2962
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14472 TPQRPSPGIINVPS--VPQPIPTAPSPGiiniPSVPQPLPSPTPGV------INIPQQPTPPPlVQQPGIINIPSVQQPS 14543
Cdd:PHA03247 2963 WLGALVPGRVAVPRfrVPQPAPSREAPA----SSTPPLTGHSLSRVsswassLALHEETDPPP-VSLKQTLWPPDDTEDS 3037
|
650 660 670 680 690
....*....|....*....|....*....|....*....|....*....|....*..
gi 442625924 14544 TPTTQHPIQDVQYETQRPQPTPGVINIPSVSQPTYPTQKPSYQDTSYPTVQPkPPVS 14600
Cdd:PHA03247 3038 DADSLFDSDSERSDLEALDPLPPEPHDPFAHEPDPATPEAGARESPSSQFGP-PPLS 3093
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
14048-14503 |
3.64e-29 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 131.04 E-value: 3.64e-29
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14048 SQPGYPTPQSPIYDANYPTTQSPIPQQPGVVNipsVPSPSYPAPNPPVNYPTQPSPQIPvQPGVINIPSAPLPTTPPqhP 14127
Cdd:pfam03154 144 TSPSIPSPQDNESDSDSSAQQQILQTQPPVLQ---AQSGAASPPSPPPPGTTQAATAGP-TPSAPSVPPQGSPATSQ--P 217
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14128 PVFIPSPESPSPAPKPGviniPSVTHPEYPTSQVPVydvnystTPSPIPQKPGVVNIPSAPQPVHPAPNPPVHEfnyptp 14207
Cdd:pfam03154 218 PNQTQSTAAPHTLIQQT----PTLHPQRLPSPHPPL-------QPMTQPPPPSQVSPQPLPQPSLHGQMPPMPH------ 280
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14208 pavPQQPGVLNIPsYPTPVAPTPQSPIYIPSQEQPKPTtrpsvinvPSVPQPAYPTPQAPvydvnyPTSPSVIPHQPGVV 14287
Cdd:pfam03154 281 ---SLQTGPSHMQ-HPVPPQPFPLTPQSSQSQVPPGPS--------PAAPGQSQQRIHTP------PSQSQLQSQQPPRE 342
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14288 N-IPSVPLPAPPVKQRPVfvpSPVHPTPAPQ----PGVVNIPSVAQpVHPTYQPPVVERPAIYDVYYPPPPSRPgvinip 14362
Cdd:pfam03154 343 QpLPPAPLSMPHIKPPPT---TPIPQLPNPQshkhPPHLSGPSPFQ-MNSNLPPPPALKPLSSLSTHHPPSAHP------ 412
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14363 sPPRPVYPVPQQpiyVPAPvlhiPAPRPVIHNIPSVPQPTYPHRNP------PIQDvTYPAPQPSPPVPGIVNIPSLPQP 14436
Cdd:pfam03154 413 -PPLQLMPQSQQ---LPPP----PAQPPVLTQSQSLPPPAASHPPTsglhqvPSQS-PFPQHPFVPGGPPPITPPSGPPT 483
|
410 420 430 440 450 460
....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 442625924 14437 VSTPTSGVINIPSQASPPISVPTPGIVNIPSIPQPTPQRPsPGIINVPSVPQPIPTAPS--PGIINIPS 14503
Cdd:pfam03154 484 STSSAMPGIQPPSSASVSSSGPVPAAVSCPLPPVQIKEEA-LDEAEEPESPPPPPRSPSpePTVVNTPS 551
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
14153-14685 |
4.34e-29 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 131.60 E-value: 4.34e-29
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14153 HPEY---PTSQVPvydvnYSTTPSPIPQKPGVVNIPSAPQPVHPAP--------NPPVH----------------EFNYP 14205
Cdd:PHA03247 2477 APVYrrpAEARFP-----FAAGAAPDPGGGGPPDPDAPPAPSRLAPailpdepvGEPVHprmltwirgleelasdDAGDP 2551
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14206 TPPAVPQQPgvlniPSYPTPVAPTPQspiYIPSQEQPKPTTRPSVINVPsvPQPAypTPQAPVYDVNYPTSPSVIPHQPG 14285
Cdd:PHA03247 2552 PPPLPPAAP-----PAAPDRSVPPPR---PAPRPSEPAVTSRARRPDAP--PQSA--RPRAPVDDRGDPRGPAPPSPLPP 2619
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14286 VVNIPSVPLPAP------PVKQRPVFVPSPVHPTPAPQPGVVNIPS-VAQPVHPTYQPPVVERPAiydvyypPPPSRPGV 14358
Cdd:PHA03247 2620 DTHAPDPPPPSPspaanePDPHPPPTVPPPERPRDDPAPGRVSRPRrARRLGRAAQASSPPQRPR-------RRAARPTV 2692
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14359 INIPSPPRPvyPVPQQPiyvPAPvlhipAPRPVIHNIPSVPQPTYPHRN---PPIQDVTYPAPQPSPPVPGIVNIPSLPQ 14435
Cdd:PHA03247 2693 GSLTSLADP--PPPPPT---PEP-----APHALVSATPLPPGPAAARQAspaLPAAPAPPAVPAGPATPGGPARPARPPT 2762
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14436 PVSTPTSGVINIPSQASPPISVPTPGIVNIPSIPQ-PTPQRPSPGIINVPSVPQPIPTAPSPGiiniPSVPQPlPSPTPG 14514
Cdd:PHA03247 2763 TAGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESlPSPWDPADPPAAVLAPAAALPPAASPA----GPLPPP-TSAQPT 2837
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14515 VINIPQQPTPPPLVQQPGIInipsvqqPSTPTTQHPiqdvqyETQRPQPTPGVINIPSVSQPTYPTQKPSYQDTSYPTVQ 14594
Cdd:PHA03247 2838 APPPPPGPPPPSLPLGGSVA-------PGGDVRRRP------PSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQ 2904
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14595 PKPPvsgiinipsvPQPVPSLTPgvinLPSEPSYSAPIPKPgiinvpsiPEPIPSIPQNPVQEVYHDTQKPQA------- 14667
Cdd:PHA03247 2905 PERP----------PQPQAPPPP----QPQPQPPPPPQPQP--------PPPPPPRPQPPLAPTTDPAGAGEPsgavpqp 2962
|
570 580
....*....|....*....|....*
gi 442625924 14668 -----IPGVVNVPS--APQPTPGRP 14685
Cdd:PHA03247 2963 wlgalVPGRVAVPRfrVPQPAPSRE 2987
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
14167-14698 |
1.08e-24 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 116.40 E-value: 1.08e-24
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14167 NYSTTPS-PIPQKPGVVNIPSAPQPVHPApNPPVheFNYPTPPAVPQQPGVLNIPSYPTPvAPTPQSPiYIPSQEQPkPT 14245
Cdd:pfam03154 141 NRSTSPSiPSPQDNESDSDSSAQQQILQT-QPPV--LQAQSGAASPPSPPPPGTTQAATA-GPTPSAP-SVPPQGSP-AT 214
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14246 TRPSVINVPSVPqPAYPTPQAPvyDVNYPTSPSviPHQPgvvnIPSVPLPAPPVKQRPVFVPSPVHPTPAPqPGvvniPS 14325
Cdd:pfam03154 215 SQPPNQTQSTAA-PHTLIQQTP--TLHPQRLPS--PHPP----LQPMTQPPPPSQVSPQPLPQPSLHGQMP-PM----PH 280
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14326 VAQPVHPTYQPPVVERPAiydvyypPPPSRPGVINIPSPPRPVYPVPQQPiyvpapVLHIPAPRPVihnipsvPQPTYPH 14405
Cdd:pfam03154 281 SLQTGPSHMQHPVPPQPF-------PLTPQSSQSQVPPGPSPAAPGQSQQ------RIHTPPSQSQ-------LQSQQPP 340
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14406 RNPPIQdvtypapqpsppvpgivnipslPQPVSTPtsgvinipsQASPPISVPTPGIVNIPSIPQPtPQRPSPGIINVPS 14485
Cdd:pfam03154 341 REQPLP----------------------PAPLSMP---------HIKPPPTTPIPQLPNPQSHKHP-PHLSGPSPFQMNS 388
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14486 vPQPIPTAPSPgIINIPSVPQPLPSPTPGVINIPQQPTPPPLVQQPGIINIPSVqqpSTPTTQHPiqdvqyetqrpqPTP 14565
Cdd:pfam03154 389 -NLPPPPALKP-LSSLSTHHPPSAHPPPLQLMPQSQQLPPPPAQPPVLTQSQSL---PPPAASHP------------PTS 451
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14566 GVINIPSvsQPTYPTQkpSYQDTSYPTVQPkppvsgiiniPSVPQP-VPSLTPGvINLPSEPSYSAPIPKPgiiNVPSIP 14644
Cdd:pfam03154 452 GLHQVPS--QSPFPQH--PFVPGGPPPITP----------PSGPPTsTSSAMPG-IQPPSSASVSSSGPVP---AAVSCP 513
|
490 500 510 520 530 540
....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 442625924 14645 EPIPSIPQNPVQEVYH------DTQKPQAIPGVVNVPSAPQPTP------GRPYYDVAKPDFEFNP 14698
Cdd:pfam03154 514 LPPVQIKEEALDEAEEpespppPPRSPSPEPTVVNTPSHASQSArfykhlDRGYNSCARTDLYFMP 579
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
13897-14375 |
1.68e-23 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 113.11 E-value: 1.68e-23
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13897 PFTRCyeTPKPVRPQIYDTPSPPYPVAIPDLVYVQQQQPGIVNIPSAPQPIYPTPQSPQYNVNYPSPQPANPQKPGVVNI 13976
Cdd:PHA03247 2569 PPPRP--APRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPSPAANEPDPHPPPTV 2646
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13977 PSVPQPvYPSPQPPVYDVNYPTTPVSQHPGVVNIPSAPR--LVPPTSQRPVFITSPGNLSPTPQPGviniPSVSQPGYPT 14054
Cdd:PHA03247 2647 PPPERP-RDDPAPGRVSRPRRARRLGRAAQASSPPQRPRrrAARPTVGSLTSLADPPPPPPTPEPA----PHALVSATPL 2721
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14055 PQSPIY--DANYPTTQSPIPQQP--------GVVNIPSVPSPSYP-APNPPVNYPTQPSPQIPVQPGV---INIPSAPLP 14120
Cdd:PHA03247 2722 PPGPAAarQASPALPAAPAPPAVpagpatpgGPARPARPPTTAGPpAPAPPAAPAAGPPRRLTRPAVAslsESRESLPSP 2801
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14121 TTPPQHP-PVFIPSPESPSPAPKPGVINIPSVTHPEYPT--------------SQVPVYDVNYSTTPSPIPQKPGVVNIP 14185
Cdd:PHA03247 2802 WDPADPPaAVLAPAAALPPAASPAGPLPPPTSAQPTAPPpppgppppslplggSVAPGGDVRRRPPSRSPAAKPAAPARP 2881
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14186 SAPQPVHPAPNPPVHEFNYPTPPAVPQQPGVLNIPSYPTPVAPTPQSPiyipsQEQPKPTTRPsviNVPSVPQPAYPTPQ 14265
Cdd:PHA03247 2882 PVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQP-----QPPPPPPPRP---QPPLAPTTDPAGAG 2953
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14266 APVYDVNYPTSPSVIPHQPGVVNIpSVPLPAPPvkqRPVFVPSPVHPTPAPQPGV--------VNIPSVAQPVH--PTYQ 14335
Cdd:PHA03247 2954 EPSGAVPQPWLGALVPGRVAVPRF-RVPQPAPS---REAPASSTPPLTGHSLSRVsswasslaLHEETDPPPVSlkQTLW 3029
|
490 500 510 520
....*....|....*....|....*....|....*....|.
gi 442625924 14336 PPVVERPAIYDVYYPPPPSRPGVINI-PSPPRPVYPVPQQP 14375
Cdd:PHA03247 3030 PPDDTEDSDADSLFDSDSERSDLEALdPLPPEPHDPFAHEP 3070
|
|
| PHA03378 |
PHA03378 |
EBNA-3B; Provisional |
13898-14608 |
2.26e-19 |
|
EBNA-3B; Provisional
Pssm-ID: 223065 [Multi-domain] Cd Length: 991 Bit Score: 98.99 E-value: 2.26e-19
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13898 FTRCYETPKPVRPQIydtpsPPYPVAIP----------DLVYVQQQQPGIVNIPSAPQPIYPTPQSPQYNVNYPS-PQPA 13966
Cdd:PHA03378 300 FRQCTGRPRPTKPWL-----RAHPVAVPyddpltseeiDLAYARGLAMEIEAVRLPDDPIIVEDDDESEEIESECdPDED 374
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13967 NPQKPGVVNIP-SVPQPVYPSPQPPVYDVNYPTTPVSQHPGVVNIPSAPRLVPPTSQRPvfitspgNLSPTPQPgvinip 14045
Cdd:PHA03378 375 KSGAEALASIPqTLPDPPTVYGRPKVFARKADLKSTKKCRAIVTDPSVIKAIEEEHRKK-------KAARTEQP------ 441
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14046 svsQPGyPTPQSPIYDANYPTTQspipQQPGVVNIPSVPSPSypAPNPPVNYPT-------QPSPQIPVQPGVI------ 14112
Cdd:PHA03378 442 ---RAT-PHSQAPTVVLHRPPTQ----PLEGPTGPLSVQAPL--EPWQPLPHPQvtpvilhQPPAQGVQAHGSMldllek 511
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14113 -------NIPSAPLPTTPPQ------HPPVFipspespspapkPGVINIPSvthpEYPTSQVPVYD-VNYSTTPSPIPQK 14178
Cdd:PHA03378 512 ddedmeqRVMATLLPPSPPQpragrrAPCVY------------TEDLDIES----DEPASTEPVHDqLLPAPGLGPLQIQ 575
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14179 PGVVNIPSAPQPVHPA----PNPPVHEFNYPTPPAVPQQPGVLNIP-SYPTPVAPTPQSPIYIpsqeqpkpttRPSVINV 14253
Cdd:PHA03378 576 PLTSPTTSQLASSAPSyaqtPWPVPHPSQTPEPPTTQSHIPETSAPrQWPMPLRPIPMRPLRM----------QPITFNV 645
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14254 PSVPQPAYPTPQAPVYDVNYPTSPSVIPHQPGVVNI-PSVPLPAPPVKQRpvfvPSPVHPTPAPQPGVvnipsvaqpvhp 14332
Cdd:PHA03378 646 LVFPTPHQPPQVEITPYKPTWTQIGHIPYQPSPTGAnTMLPIQWAPGTMQ----PPPRAPTPMRPPAA------------ 709
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14333 tyqPPV-VERPAIYDVYYPPPPSRPGVINIPSPPRPVYPVPQQPIYVPAPVLHIPAPRPVIHNIPSVPQPTYPHRNPPIq 14411
Cdd:PHA03378 710 ---PPGrAQRPAAATGRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPPAAAPGAPTPQPPPQAPPA- 785
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14412 dvtypapqpsppvpgivnipslpqPVSTPTSGviniPSQASPPISVPTPGIVNIPSIP---QPTPQRPSPGIINVPSVPQ 14488
Cdd:PHA03378 786 ------------------------PQQRPRGA----PTPQPPPQAGPTSMQLMPRAAPgqqGPTKQILRQLLTGGVKRGR 837
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14489 PIPTAPSPGIINIPSVPQPLPSPTPGViNIPQQPT--PPPL--VQQPGIINIPSVQQPSTpTTQHPiqdvqyeTQRPQPT 14564
Cdd:PHA03378 838 PSLKKPAALERQAAAGPTPSPGSGTSD-KIVQAPVfyPPVLqpIQVMRQLGSVRAAAAST-VTQAP-------TEYTGER 908
|
730 740 750 760
....*....|....*....|....*....|....*....|....
gi 442625924 14565 PGVINIPSVSQPtyptqkPSYQDTSYPTVQPKPPVSGIINIPSV 14608
Cdd:PHA03378 909 RGVGPMHPTDIP------PSKRAKTDAYVESQPPHGGQSHSFSV 946
|
|
| PRK10263 |
PRK10263 |
DNA translocase FtsK; Provisional |
14145-14647 |
4.40e-17 |
|
DNA translocase FtsK; Provisional
Pssm-ID: 236669 [Multi-domain] Cd Length: 1355 Bit Score: 91.69 E-value: 4.40e-17
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14145 VINIPSVTHPEYPTSQvPVydVNYSTTPSPIPQKPGVVNIPS--APQPVHPAPNPPVHEfnyptppavP-QQPGVLNIPS 14221
Cdd:PRK10263 340 VTQTPPVASVDVPPAQ-PT--VAWQPVPGPQTGEPVIAPAPEgyPQQSQYAQPAVQYNE---------PlQQPVQPQQPY 407
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14222 YPTPVAPTPQSPIYIPSQEQPKPTTRPS-VINVPSVPQPAYPTPQAPVYDvnypTSPSVIPHQPGVVnipsvPLPAPPVK 14300
Cdd:PRK10263 408 YAPAAEQPAQQPYYAPAPEQPAQQPYYApAPEQPVAGNAWQAEEQQSTFA----PQSTYQTEQTYQQ-----PAAQEPLY 478
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14301 QRPVFVPSPvhPTPAPQPGVVNIPSVAQPVHptYQPPVVERPA----IYDVYYPPPPsRPgvINIPSPPRPVYPVPQQPI 14376
Cdd:PRK10263 479 QQPQPVEQQ--PVVEPEPVVEETKPARPPLY--YFEEVEEKRArereQLAAWYQPIP-EP--VKEPEPIKSSLKAPSVAA 551
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14377 yVPaPVLHIPAPRPVIHNIPS--VPQPTYPHRNPPIQDVTYPAPQPSPPVPGIVniPSLPQP--VSTPT-----SGVINI 14447
Cdd:PRK10263 552 -VP-PVEAAAAVSPLASGVKKatLATGAAATVAAPVFSLANSGGPRPQVKEGIG--PQLPRPkrIRVPTrrelaSYGIKL 627
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14448 PSQASPPISVPTPGIVNIPSIPQPTP------------------------------------------------------ 14473
Cdd:PRK10263 628 PSQRAAEEKAREAQRNQYDSGDQYNDdeidamqqdelarqfaqtqqqrygeqyqhdvpvnaedadaaaeaelarqfaqtq 707
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14474 -QRPS---PGIINVPSVP----QPI-------PTAP--SPGIINIpSVPQPLPSPTPGViNIPQQPTPPPLV-QQPgiin 14535
Cdd:PRK10263 708 qQRYSgeqPAGANPFSLDdfefSPMkallddgPHEPlfTPIVEPV-QQPQQPVAPQQQY-QQPQQPVAPQPQyQQP---- 781
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14536 ipsvQQPSTPTTQH--PIQDVQYETQRPQPTPGVINIPSVSQPTYPTQ-KPSYQdtsyptvQPKPPVSgiinipsvPQPV 14612
Cdd:PRK10263 782 ----QQPVAPQPQYqqPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVApQPQYQ-------QPQQPVA--------PQPQ 842
|
570 580 590 600
....*....|....*....|....*....|....*....|
gi 442625924 14613 PSLTPGVI--NLPSEPSYSAPIPKPG---IINVPSIPEPI 14647
Cdd:PRK10263 843 DTLLHPLLmrNGDSRPLHKPTTPLPSldlLTPPPSEVEPV 882
|
|
| ZP |
smart00241 |
Zona pellucida (ZP) domain; ZP proteins are responsible for sperm-adhesion of the zona ... |
17722-17957 |
9.63e-17 |
|
Zona pellucida (ZP) domain; ZP proteins are responsible for sperm-adhesion of the zona pellucida. ZP domains are also present in multidomain transmembrane proteins such as glycoprotein GP2, uromodulin and TGF-beta receptor type III (betaglycan).
Pssm-ID: 214579 Cd Length: 252 Bit Score: 85.13 E-value: 9.63e-17
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 17722 CLADGVQVEIHiTEPGFNGVLYVKGHS-KDEECRRVVNLAGETVPRTEifrVHFGSCGM--QAVKDVA--SFVLVIQKHP 17796
Cdd:smart00241 2 CGEDQMVVSVS-TDLLFPGGINVKGLTlGDPSCRPQFTDATSAFVSFE---VPLNGCGTrrQVNPDGIvySNTLVVSPFH 77
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 17797 KLVTYKAQ--AYNIKCVYQTGEKnVTLGFNVSMLTTAGTIANTGPPPICQMRIITNEGE----EINSAEIGDNLKLQVDV 17870
Cdd:smart00241 78 PGFITRDDraAYHFQCFYPENEK-VSLNLDVSTIPPTELSSVSEGPLTCSYRLYKDDSFgspyQSADYVLGDPVYHEWEC 156
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 17871 EPATI--YGGFARSCIAKTMEDNVQNEYLVTDENGCATDTSIFGNWEYNPDTNSLL-ASFNAFKFPSSDNIRFQCNIRVC 17947
Cdd:smart00241 157 DGADDppLGLLVDNCYATPGPDPSSGPKYFIIDNGCPVDGYLDSTIPYNSNPLHRArFSVKVFKFADRSLVYFHCQIRLC 236
|
250
....*....|....
gi 442625924 17948 ----FGRCQPVNCG 17957
Cdd:smart00241 237 dkddGSSCDGPACS 250
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
13798-14227 |
9.68e-17 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 90.21 E-value: 9.68e-17
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13798 PVPIIQESPLTPCDPSPCGPNAQCHPSLNEAVCSCLPEfyGTPPNCRPECTLNSECA-----YDKACVH-------HKCV 13865
Cdd:pfam03154 172 PVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQ--GSPATSQPPNQTQSTAAphtliQQTPTLHpqrlpspHPPL 249
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13866 DPCPGICGINADCRVHYHSPicyciSSHTGDPftrcyETPKPVR--PQIYDTPSPPYPVAIPdlvyvqqQQPGIVNIPSA 13943
Cdd:pfam03154 250 QPMTQPPPPSQVSPQPLPQP-----SLHGQMP-----PMPHSLQtgPSHMQHPVPPQPFPLT-------PQSSQSQVPPG 312
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13944 PQPIYPTPQSPQYNVNYPSPQPANPQKPGVVNIPSVPQPVyPSPQPPvydvnyPTTPVSQHPGvvniPSAPRLvPPTSQR 14023
Cdd:pfam03154 313 PSPAAPGQSQQRIHTPPSQSQLQSQQPPREQPLPPAPLSM-PHIKPP------PTTPIPQLPN----PQSHKH-PPHLSG 380
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14024 PVFITSPGNLSPTPQPGVINIPSVSQPGYPTPQSPIYDANYPTTQSPIPQQPGVVNIPSVPSPSYPAPNPPVNYPTQPSP 14103
Cdd:pfam03154 381 PSPFQMNSNLPPPPALKPLSSLSTHHPPSAHPPPLQLMPQSQQLPPPPAQPPVLTQSQSLPPPAASHPPTSGLHQVPSQS 460
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14104 QIPVQPgviNIPSAPLPTTPPQHPPvfipspespspapkpgviniPSVThPEYPTSQVPVYDVNYSTTPSPipqkpgvvN 14183
Cdd:pfam03154 461 PFPQHP---FVPGGPPPITPPSGPP--------------------TSTS-SAMPGIQPPSSASVSSSGPVP--------A 508
|
410 420 430 440
....*....|....*....|....*....|....*....|....*....
gi 442625924 14184 IPSAPQPVHPAPNPPVHEFNYPTPPAVPQ-----QPGVLNIPSYPTPVA 14227
Cdd:pfam03154 509 AVSCPLPPVQIKEEALDEAEEPESPPPPPrspspEPTVVNTPSHASQSA 557
|
|
| PHA03378 |
PHA03378 |
EBNA-3B; Provisional |
13944-14523 |
1.39e-16 |
|
EBNA-3B; Provisional
Pssm-ID: 223065 [Multi-domain] Cd Length: 991 Bit Score: 89.74 E-value: 1.39e-16
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13944 PQPIyPTPQSPQYNVNYPSPQP-ANPQKPGVVNIPSVPQPVYPSPQ-PPVYDVNYPTTPVSQHPGVVNIPSA-------- 14013
Cdd:PHA03378 441 PRAT-PHSQAPTVVLHRPPTQPlEGPTGPLSVQAPLEPWQPLPHPQvTPVILHQPPAQGVQAHGSMLDLLEKddedmeqr 519
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14014 --PRLVPPTSQRPVfitsPGNLSPTPQPGVINIPSvsqpgyptpqspiydaNYPTTQSPIPQQPgvvnipsvpspsYPAP 14091
Cdd:PHA03378 520 vmATLLPPSPPQPR----AGRRAPCVYTEDLDIES----------------DEPASTEPVHDQL------------LPAP 567
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14092 NPpvnyptqpsPQIPVQPgVINIPSAPLPTTPPQHppvfipspespspAPKPGVINIPSvTHPEYPTSQvpvydvnystT 14171
Cdd:PHA03378 568 GL---------GPLQIQP-LTSPTTSQLASSAPSY-------------AQTPWPVPHPS-QTPEPPTTQ----------S 613
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14172 PSPIPQKPGVVNIPSAPQPVHPAPNPPVhEFNYPTPPAVPQQPGVlnipsYPTPVAPTPQSPIYIPSqeQPKPTTRPSVI 14251
Cdd:PHA03378 614 HIPETSAPRQWPMPLRPIPMRPLRMQPI-TFNVLVFPTPHQPPQV-----EITPYKPTWTQIGHIPY--QPSPTGANTML 685
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14252 NVPSVPQPAYPTPQAPVydvnyPTSPsviphqpgvvnipsvPLPAPPVKQRPVFVPSPVHPtPAPQPGVVNIPSVAQPVH 14331
Cdd:PHA03378 686 PIQWAPGTMQPPPRAPT-----PMRP---------------PAAPPGRAQRPAAATGRARP-PAAAPGRARPPAAAPGRA 744
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14332 PTYQ--PPVVERPAIYDVYYPPPPSRPGVINiPSPPRPVYPVP-QQPIYVPAPVLHIPA-PRPVIHNIPSVPQPTYPHRN 14407
Cdd:PHA03378 745 RPPAaaPGRARPPAAAPGRARPPAAAPGAPT-PQPPPQAPPAPqQRPRGAPTPQPPPQAgPTSMQLMPRAAPGQQGPTKQ 823
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14408 PPIQDVTYPA----PQPSPPVPGIVNIPSLPQPvsTPTSGVINIPSQAS---PPISVPT--PGIVNIP------SIPQPT 14472
Cdd:PHA03378 824 ILRQLLTGGVkrgrPSLKKPAALERQAAAGPTP--SPGSGTSDKIVQAPvfyPPVLQPIqvMRQLGSVraaaasTVTQAP 901
|
570 580 590 600 610
....*....|....*....|....*....|....*....|....*....|....*..
gi 442625924 14473 PQRPSPGIINVPSVPQPIPTAPSPGIINI--PSVPQPLPSPTPGVI----NIPQQPT 14523
Cdd:PHA03378 902 TEYTGERRGVGPMHPTDIPPSKRAKTDAYveSQPPHGGQSHSFSVIwenvSQGQQQT 958
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
14295-14705 |
2.14e-16 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 89.61 E-value: 2.14e-16
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14295 PAPPVKQRPVFVPSPVHPTPAPQPGVVNIPSVAQPVHPTYQPPVV---------------------ERPAIYDVYYPPPP 14353
Cdd:PHA03247 2475 PGAPVYRRPAEARFPFAAGAAPDPGGGGPPDPDAPPAPSRLAPAIlpdepvgepvhprmltwirglEELASDDAGDPPPP 2554
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14354 SRPGVI------NIPSP---PRPVYP----------VPQQPIYVPAPVLHIPAPRPVIHNIPSVPQPTYPHRNPPiqdvT 14414
Cdd:PHA03247 2555 LPPAAPpaapdrSVPPPrpaPRPSEPavtsrarrpdAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPPP----S 2630
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14415 YPAPQPSPPVPGIVNIPSLPQPVSTPTsgviniPSQASPPISVPTPGIVNIPSIPQPTPQRPS--PGIINVPSVPQPipt 14492
Cdd:PHA03247 2631 PSPAANEPDPHPPPTVPPPERPRDDPA------PGRVSRPRRARRLGRAAQASSPPQRPRRRAarPTVGSLTSLADP--- 2701
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14493 aPSPGiinipsvPQPLPSPTPGVINIPQQPTPPplvqqpgiinipSVQQPSTPTTQHPIqdvqyetqrPQPTPgviNIPS 14572
Cdd:PHA03247 2702 -PPPP-------PTPEPAPHALVSATPLPPGPA------------AARQASPALPAAPA---------PPAVP---AGPA 2749
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14573 VsqPTYPTQKPSYQDTSYPT--VQPKPPVSGiiniPSVPQPVPSLTPGVINLPSEPSYSAPIPKPGIINVPSIPEPIPSi 14650
Cdd:PHA03247 2750 T--PGGPARPARPPTTAGPPapAPPAAPAAG----PPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAA- 2822
|
410 420 430 440 450
....*....|....*....|....*....|....*....|....*....|....*
gi 442625924 14651 pqnpvqevyhdtqkpqaipgvvnVPSAPQPTPGRPYYDVAKPDFEFNPCYPSPCG 14705
Cdd:PHA03247 2823 -----------------------SPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGG 2854
|
|
| PspC_subgroup_2 |
NF033839 |
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ... |
14033-14401 |
3.63e-16 |
|
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.
Pssm-ID: 468202 [Multi-domain] Cd Length: 557 Bit Score: 87.52 E-value: 3.63e-16
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14033 LSPTPQPGVINIPSVSQPGYPTPQSPIYDAnyPTTQsPIPQqpgvvniPSVPSPSYPAPNPPvnyPTQPSPQIPVQPGVI 14112
Cdd:NF033839 147 SSSSSSSGSSTKPETPQPENPEHQKPTTPA--PDTK-PSPQ-------PEGKKPSVPDINQE---KEKAKLAVATYMSKI 213
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14113 --NIPSAPLPTTPPQHPPVFIPSPESPSPAPKPGVINIPSVTHPEYPTSQV----PVYDVNYSTTPSPIPQKPGVVNIPS 14186
Cdd:NF033839 214 ldDIQKHHLQKEKHRQIVALIKELDELKKQALSEIDNVNTKVEIENTVHKIfadmDAVVTKFKKGLTQDTPKEPGNKKPS 293
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14187 APQP-VHPAPNPPVHEfnyPTPPAVPQQPGVLNIPSYPTP-VAPTPQS--PIYIPSQEQPKPTTRPSvinvPSVPQPAY- 14261
Cdd:NF033839 294 APKPgMQPSPQPEKKE---VKPEPETPKPEVKPQLEKPKPeVKPQPEKpkPEVKPQLETPKPEVKPQ----PEKPKPEVk 366
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14262 PTPQAPvydvnyptSPSVIPhQPGVvnipsvplPAPPVKQRPVfVPSP-VHPTP-APQPGVVNIPSVAQP-VHPTYQPPv 14338
Cdd:NF033839 367 PQPEKP--------KPEVKP-QPET--------PKPEVKPQPE-KPKPeVKPQPeKPKPEVKPQPEKPKPeVKPQPEKP- 427
|
330 340 350 360 370 380
....*....|....*....|....*....|....*....|....*....|....*....|....
gi 442625924 14339 veRPaiyDVYYPPPPSRPGVINIPSPPRP-VYPVPQQPiyVPAPVLHIPAPRPVIHNIPSVPQP 14401
Cdd:NF033839 428 --KP---EVKPQPEKPKPEVKPQPEKPKPeVKPQPETP--KPEVKPQPEKPKPEVKPQPEKPKP 484
|
|
| PspC_subgroup_2 |
NF033839 |
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ... |
13941-14278 |
8.64e-16 |
|
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.
Pssm-ID: 468202 [Multi-domain] Cd Length: 557 Bit Score: 86.36 E-value: 8.64e-16
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13941 PSAPQPIYPTPQSPQYNV--NYPSPQP--ANPQKPGVVNIPSVPQP-VYPSPQPPVYDVNYPTTPVSQHPGVVNIPSA-- 14013
Cdd:NF033839 159 PETPQPENPEHQKPTTPApdTKPSPQPegKKPSVPDINQEKEKAKLaVATYMSKILDDIQKHHLQKEKHRQIVALIKEld 238
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14014 --------------PRLVPPTSQRPVFIT--------SPGNLSPTPQPGVINIPSVSQPGY-PTPQSPIydanypTTQSP 14070
Cdd:NF033839 239 elkkqalseidnvnTKVEIENTVHKIFADmdavvtkfKKGLTQDTPKEPGNKKPSAPKPGMqPSPQPEK------KEVKP 312
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14071 IPQQPGVVNIPSVPSPSyPAPNPPvnyPTQPSPQIPVQPGVINIPSAPLPTTP-PQHPPvfipspesPSPAPKPGVINIP 14149
Cdd:NF033839 313 EPETPKPEVKPQLEKPK-PEVKPQ---PEKPKPEVKPQLETPKPEVKPQPEKPkPEVKP--------QPEKPKPEVKPQP 380
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14150 SVTHPEY-PTSQVPVYDVNysttPSPIPQKPGVVNIPSAPQP-VHPAPNPPVHEFNyPTPPAvpQQPGVLNIPSYPTP-V 14226
Cdd:NF033839 381 ETPKPEVkPQPEKPKPEVK----PQPEKPKPEVKPQPEKPKPeVKPQPEKPKPEVK-PQPEK--PKPEVKPQPEKPKPeV 453
|
330 340 350 360 370
....*....|....*....|....*....|....*....|....*....|....
gi 442625924 14227 APTPQSPI--YIPSQEQPKPTTRPSvinvPSVPQPAYPTPQApvyDVNYPTSPS 14278
Cdd:NF033839 454 KPQPETPKpeVKPQPEKPKPEVKPQ----PEKPKPDNSKPQA---DDKKPSTPN 500
|
|
| PRK10263 |
PRK10263 |
DNA translocase FtsK; Provisional |
13907-14381 |
1.72e-15 |
|
DNA translocase FtsK; Provisional
Pssm-ID: 236669 [Multi-domain] Cd Length: 1355 Bit Score: 86.29 E-value: 1.72e-15
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13907 PVRPQiydTPSPPYPVAIPDLvyvqqQQPGIvnipsAPQPIyPTPQSPQyNVNYPSPQPANPQkpgvvniPSVPQPVYPS 13986
Cdd:PRK10263 336 PVEPV---TQTPPVASVDVPP-----AQPTV-----AWQPV-PGPQTGE-PVIAPAPEGYPQQ-------SQYAQPAVQY 393
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13987 PQPpvYDVNYPTTPVSQHPGVVNIPSAPRLVPPTSQRPVFitspgnLSPTPQPgviNIPSVSQPGYPTPQSPIYDAN--- 14063
Cdd:PRK10263 394 NEP--LQQPVQPQQPYYAPAAEQPAQQPYYAPAPEQPAQQ------PYYAPAP---EQPVAGNAWQAEEQQSTFAPQsty 462
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14064 --YPTTQSPIPQQPGVVNIPSVPSPSYPAPNP------PVNYPT---------------------QPSPQiPVQPGVINI 14114
Cdd:PRK10263 463 qtEQTYQQPAAQEPLYQQPQPVEQQPVVEPEPvveetkPARPPLyyfeeveekrarereqlaawyQPIPE-PVKEPEPIK 541
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14115 PSAPlPTTPPQHPPVfipSPESPSPAPKPGVINIPSVTHPEyPTSQVPVYDVNYSTTPSPI------PQ--KPGVVNIPS 14186
Cdd:PRK10263 542 SSLK-APSVAAVPPV---EAAAAVSPLASGVKKATLATGAA-ATVAAPVFSLANSGGPRPQvkegigPQlpRPKRIRVPT 616
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14187 ------------------------------------------------APQ-----------------PVHPAPNPPVHE 14201
Cdd:PRK10263 617 rrelasygiklpsqraaeekareaqrnqydsgdqynddeidamqqdelARQfaqtqqqrygeqyqhdvPVNAEDADAAAE 696
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14202 FNYPTPPAVPQQ-------PGVLNIPSYP----TP----VAPTPQSPIYIPSQEqpkPTTRPSVinvPSVPQPAYPTPQA 14266
Cdd:PRK10263 697 AELARQFAQTQQqrysgeqPAGANPFSLDdfefSPmkalLDDGPHEPLFTPIVE---PVQQPQQ---PVAPQQQYQQPQQ 770
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14267 PV---YDVNYPTSPSVIPHQPGVVNIPSVPLPAPPVKQRPVfvpspvhpTPAPQPGVVNIPSVAQPVHPTYQPPVVERPA 14343
Cdd:PRK10263 771 PVapqPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPV--------APQPQYQQPQQPVAPQPQYQQPQQPVAPQPQ 842
|
570 580 590 600
....*....|....*....|....*....|....*....|.
gi 442625924 14344 ---IYDVYYPPPPSRPgvinipsPPRPVYPVPQQPIYVPAP 14381
Cdd:PRK10263 843 dtlLHPLLMRNGDSRP-------LHKPTTPLPSLDLLTPPP 876
|
|
| PspC_subgroup_2 |
NF033839 |
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ... |
14354-14692 |
5.51e-15 |
|
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.
Pssm-ID: 468202 [Multi-domain] Cd Length: 557 Bit Score: 83.66 E-value: 5.51e-15
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14354 SRPGVINIPSPPRPVYPVPQQPIyVPAPVLHiPAPRPVIHNiPSVPQPTYPHRNPPIQDVTYPAPQPSPPVPGIVNIPSL 14433
Cdd:NF033839 151 SSSGSSTKPETPQPENPEHQKPT-TPAPDTK-PSPQPEGKK-PSVPDINQEKEKAKLAVATYMSKILDDIQKHHLQKEKH 227
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14434 PQPVSTPTSgVINIPSQASPPISVPTPGIVnipsiPQPTPQRPSPGIINVPSVPQP--IPTAPSPGIINIPSVPQPL--P 14509
Cdd:NF033839 228 RQIVALIKE-LDELKKQALSEIDNVNTKVE-----IENTVHKIFADMDAVVTKFKKglTQDTPKEPGNKKPSAPKPGmqP 301
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14510 SPTPGVINIPQQPTPP-----PLVQQPGiiniPSVQ-QPSTPTtqhPIQDVQYETQRPQ-------PTPGVINIPSVSQP 14576
Cdd:NF033839 302 SPQPEKKEVKPEPETPkpevkPQLEKPK----PEVKpQPEKPK---PEVKPQLETPKPEvkpqpekPKPEVKPQPEKPKP 374
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14577 TYPTQ----KPSYQ---DTSYPTVQPKPPVSGIINIPSVPQPVPSLTPGVINLPSEPSYSAPIPKPGIINVPSIPEP-IP 14648
Cdd:NF033839 375 EVKPQpetpKPEVKpqpEKPKPEVKPQPEKPKPEVKPQPEKPKPEVKPQPEKPKPEVKPQPEKPKPEVKPQPEKPKPeVK 454
|
330 340 350 360
....*....|....*....|....*....|....*....|....
gi 442625924 14649 SIPQNPVQEVYHDTQKPQaiPGVVNVPSAPQPTPGRPYYDVAKP 14692
Cdd:NF033839 455 PQPETPKPEVKPQPEKPK--PEVKPQPEKPKPDNSKPQADDKKP 496
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
14373-14708 |
5.40e-14 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 81.35 E-value: 5.40e-14
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14373 QQPIYVPAPVLHIPAPrPVIHNIPSVPQPTYPHRNPPIQDVTYPAPQPSPPvpgIVNIPSLPQPVSTPTSGVINIPSQAS 14452
Cdd:pfam03154 164 QQILQTQPPVLQAQSG-AASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPA---TSQPPNQTQSTAAPHTLIQQTPTLHP 239
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14453 PPISVPTPGIVNIPSIPQPT---PQRPSPGIINVPSVPQPIPTAPSPGIINIPSVPQPLPSPTPGViniPQQPTPPPLVQ 14529
Cdd:pfam03154 240 QRLPSPHPPLQPMTQPPPPSqvsPQPLPQPSLHGQMPPMPHSLQTGPSHMQHPVPPQPFPLTPQSS---QSQVPPGPSPA 316
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14530 QPGiiniPSVQQPSTPTTQHPIQDVQYETQRPQPtPGVINIPSVS-QPTYP-TQKPSYQDTSYPTVQPKP-PVSGIINIP 14606
Cdd:pfam03154 317 APG----QSQQRIHTPPSQSQLQSQQPPREQPLP-PAPLSMPHIKpPPTTPiPQLPNPQSHKHPPHLSGPsPFQMNSNLP 391
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14607 svpqPVPSLTPgvinLPSEPSYSAPIPKPGIINVPSIPEPIPSIP-QNPVQevyhdTQKPQAIPGVVNVP--SAPQPTPG 14683
Cdd:pfam03154 392 ----PPPALKP----LSSLSTHHPPSAHPPPLQLMPQSQQLPPPPaQPPVL-----TQSQSLPPPAASHPptSGLHQVPS 458
|
330 340
....*....|....*....|....*
gi 442625924 14684 RPYYdvakPDFEFNPCYPSPCGPYS 14708
Cdd:pfam03154 459 QSPF----PQHPFVPGGPPPITPPS 479
|
|
| PHA03379 |
PHA03379 |
EBNA-3A; Provisional |
14117-14575 |
7.94e-14 |
|
EBNA-3A; Provisional
Pssm-ID: 223066 [Multi-domain] Cd Length: 935 Bit Score: 80.87 E-value: 7.94e-14
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14117 APLPTTPPQHPPVFIPSPESPSPAPkpgvinipsvTHPEYPTSQVPVYDVNYSTTPSPIPQKPGVvnipsAPQPVhpAPN 14196
Cdd:PHA03379 408 ASEPTYGTPRPPVEKPRPEVPQSLE----------TATSHGSAQVPEPPPVHDLEPGPLHDQHSM-----APCPV--AQL 470
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14197 PPVhefnyPTPPAVP--QQPGVLNIPS-YPTPVaPTPQSPIYIPSqeQPKPTTRPSVINVPSVPQPA----YPTPQAPVY 14269
Cdd:PHA03379 471 PPG-----PLQDLEPgdQLPGVVQDGRpACAPV-PAPAGPIVRPW--EASLSQVPGVAFAPVMPQPMpvepVPVPTVALE 542
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14270 DVNYPTSPSVIPHQPGVvnipsvplPAPPVKQRPVFVPSPVHPTPAPQPGVVNI---PSVAQPVHPTYQPPV-VERPAIY 14345
Cdd:PHA03379 543 RPVCPAPPLIAMQGPGE--------TSGIVRVRERWRPAPWTPNPPRSPSQMSVrdrLARLRAEAQPYQASVeVQPPQLT 614
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14346 DVYYPPPPSRPGVINipSPPRPVYPVPQQPIYVPAPvlHIPAPRPvihnipsvpqptyPHRNPPIQDVTYPAPQPSPPVP 14425
Cdd:PHA03379 615 QVSPQQPMEYPLEPE--QQMFPGSPFSQVADVMRAG--GVPAMQP-------------QYFDLPLQQPISQGAPLAPLRA 677
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14426 GIVNIPslPQPVSTPTSGVINIpsqaSPPISVPTPGIVNIPSIPQPTPQRPSPGIINVPSVPQPI-PTAPSPGIINIPsV 14504
Cdd:PHA03379 678 SMGPVP--PVPATQPQYFDIPL----TEPINQGASAAHFLPQQPMEGPLVPERWMFQGATLSQSVrPGVAQSQYFDLP-L 750
|
410 420 430 440 450 460 470
....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 442625924 14505 PQPLPSPTPGVINIPQQPTPPPLVQQPGIINIPSVqQPSTPTTQHPIQDVQYeTQRPQPTPGVINIPSVSQ 14575
Cdd:PHA03379 751 TQPINHGAPAAHFLHQPPMEGPWVPEQWMFQGAPP-SQGTDVVQHQLDALGY-VLHVLNHPGVPVSPAVNQ 819
|
|
| PRK10263 |
PRK10263 |
DNA translocase FtsK; Provisional |
13966-14410 |
1.33e-13 |
|
DNA translocase FtsK; Provisional
Pssm-ID: 236669 [Multi-domain] Cd Length: 1355 Bit Score: 80.13 E-value: 1.33e-13
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13966 ANPQKPgVVNIPSVPQPVYPSPQPpvyDVNYPTTPVSQHPGVVnIPSAPRLVPPTSQRPVfiTSPGNLSPTPQPGVINIP 14045
Cdd:PRK10263 334 AAPVEP-VTQTPPVASVDVPPAQP---TVAWQPVPGPQTGEPV-IAPAPEGYPQQSQYAQ--PAVQYNEPLQQPVQPQQP 406
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14046 SVSQPGYPTPQSPIYDANYPTTQ-----SPIPQQPGVVNIPSVPSPSYPAPNPPVNYPTQPSPQipvqpgvinipsaPLP 14120
Cdd:PRK10263 407 YYAPAAEQPAQQPYYAPAPEQPAqqpyyAPAPEQPVAGNAWQAEEQQSTFAPQSTYQTEQTYQQ-------------PAA 473
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14121 TTPPQHPPVFIPSpespspapkpgviniPSVTHPEYPTSQV-----PVY---DVNYSTT-----------PSPIPQKPGV 14181
Cdd:PRK10263 474 QEPLYQQPQPVEQ---------------QPVVEPEPVVEETkparpPLYyfeEVEEKRArereqlaawyqPIPEPVKEPE 538
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14182 VNIPSAPqPVHPAPNPPVHEFNYPTPPAVPQQPGVLNIPSYPTPVAPT------------------PQSP---------- 14233
Cdd:PRK10263 539 PIKSSLK-APSVAAVPPVEAAAAVSPLASGVKKATLATGAAATVAAPVfslansggprpqvkegigPQLPrpkrirvptr 617
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14234 -------IYIPSQEQPKPTTRPSVINVPSVPQPAY----------------PTPQAPVYDVNYPTSPSVIP--------- 14281
Cdd:PRK10263 618 relasygIKLPSQRAAEEKAREAQRNQYDSGDQYNddeidamqqdelarqfAQTQQQRYGEQYQHDVPVNAedadaaaea 697
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14282 ----------------HQPGVVNIPSVP-LPAPPVK-------QRPVFVPSpVHPTPAPQPGVVNIPSVAQPVHPTYQPP 14337
Cdd:PRK10263 698 elarqfaqtqqqrysgEQPAGANPFSLDdFEFSPMKallddgpHEPLFTPI-VEPVQQPQQPVAPQQQYQQPQQPVAPQP 776
|
490 500 510 520 530 540 550
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 442625924 14338 VVERPAiydvyYP-PPPSRPGVINIPSPPRPVYPVPQQPIyVPAPVLHIPAPrpvihniPSVPQPTYPHRNPPI 14410
Cdd:PRK10263 777 QYQQPQ-----QPvAPQPQYQQPQQPVAPQPQYQQPQQPV-APQPQYQQPQQ-------PVAPQPQYQQPQQPV 837
|
|
| Streccoc_I_II |
NF033804 |
antigen I/II family LPXTG-anchored adhesin; Members of the antigen I/II family are adhesins ... |
14157-14365 |
1.60e-13 |
|
antigen I/II family LPXTG-anchored adhesin; Members of the antigen I/II family are adhesins with a glucan-binding domain, two types of repetitive regions, an isopeptide bond-forming domain associated with shear resistance, and a C-terminal LPXTG motif for anchoring to the cell wall. They occur in oral streptococci and tend to be major cell surface adhesins. Members of this family include SspA and SspB from Streptococcus gordonii and antigen I/II from S. mutans, among others.
Pssm-ID: 468188 [Multi-domain] Cd Length: 1552 Bit Score: 79.98 E-value: 1.60e-13
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14157 PTSQVPVYDVNYSTTPspipQKPGV----------VNIPSAPQ-----PVHP-APNPPVHEFNYPTPPAvpqqPGVLNIP 14220
Cdd:NF033804 791 PSDEMPAVPGRDNTEG----KKPNIwyslngkiraVNVPKITKekptpPVAPtAPQAPTYEVEKPLEPA----PVAPTYE 862
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14221 SYPTPVAPTPQspiyipsQEQPKPTTRPSVinvpSVPQPAYPTPQAPVYDvNYPTSPSVIPHQPgvvnIPSVPLPAPPVK 14300
Cdd:NF033804 863 NEPTPPVKTPD-------QPEPSKPEEPTY----ETEKPLEPAPVAPTYE-NEPTPPVKTPDQP----EPSKPEEPTYET 926
|
170 180 190 200 210 220
....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 442625924 14301 QRPVfVPSPVHPT----PAPQPGVVNIPSVAQPVHPTYQPpvverpaiydvyYPPPPSRPGVINIPSPP 14365
Cdd:NF033804 927 EKPL-EPAPVAPSyenePTPPVKTPDQPEPSKPVEPTYDP------------LPTPPVAPTPKQLPTPP 982
|
|
| PHA03377 |
PHA03377 |
EBNA-3C; Provisional |
14151-14651 |
2.15e-13 |
|
EBNA-3C; Provisional
Pssm-ID: 177614 [Multi-domain] Cd Length: 1000 Bit Score: 79.33 E-value: 2.15e-13
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14151 VTHPEYPTSQVPVYDVNYSTTPSPIPQKPGvvnipsapqpvhPAPNPPVhefnyPTPPAVPQQPGvlnipsYPTPVA-PT 14229
Cdd:PHA03377 425 KTHPVKRTLVKTSGRSDEAEQAQSTPERPG------------PSDQPSV-----PVEPAHLTPVE------HTTVILhQP 481
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14230 PQSPIYIPSQEQPKPTTRPS------------VINV------PSVPQPAYPTpqapvydvnypTSPSVIPHQPGVVNIPS 14291
Cdd:PHA03377 482 PQSPPTVAIKPAPPPSRRRRgacvvydddiieVIDVetteeeESVTQPAKPH-----------RKVQDGFQRSGRRQKRA 550
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14292 VPLPAPPVKQRPVFVPSPVHPTPAPQPGVVNIPSVAqpvhPTYQPPVVERPAIYDVYYPPPPSrpgvinipSPPRPVYPV 14371
Cdd:PHA03377 551 TPPKVSPSDRGPPKASPPVMAPPSTGPRVMATPSTG----PRDMAPPSTGPRQQAKCKDGPPA--------SGPHEKQPP 618
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14372 PQQPIYVPAPVLHI------------PAPRP-----VIHNIPSVPQPTYPHRNPPIQDVTYPAPQpsppvpgivnIPSLP 14434
Cdd:PHA03377 619 SSAPRDMAPSVVRMflrerlleqstgPKPKSfwemrAGRDGSGIQQEPSSRRQPATQSTPPRPSW----------LPSVF 688
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14435 QPVSTPTSGVINIPSQASPPISVPTPgivnIPSIPQPT---PQRPSPGIINVPSVPQPIPTAPSPGiiniPSVPQPLPSP 14511
Cdd:PHA03377 689 VLPSVDAGRAQPSEESHLSSMSPTQP----ISHEEQPRyedPDDPLDLSLHPDQAPPPSHQAPYSG----HEEPQAQQAP 760
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14512 TPGVinipQQPTPPPL----VQQP-----GIINIPSVQQPSTPTTQHPIQDVQYETQRPQPTPGVINIPSVSQPTY--PT 14580
Cdd:PHA03377 761 YPGY----WEPRPPQApylgYQEPqaqgvQVSSYPGYAGPWGLRAQHPRYRHSWAYWSQYPGHGHPQGPWAPRPPHlpPQ 836
|
490 500 510 520 530 540 550
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 442625924 14581 QKPSY-----QDTSYPTVQPK--PPVSGIINIPSVPQPVPSLTpgvinlPSEPSYSAPIPKPGIINVPSiPEPIPSIP 14651
Cdd:PHA03377 837 WDGSAghgqdQVSQFPHLQSEtgPPRLQLSQVPQLPYSQTLVS------SSAPSWSSPQPRAPIRPIPT-RFPPPPMP 907
|
|
| PRK10263 |
PRK10263 |
DNA translocase FtsK; Provisional |
14057-14545 |
2.16e-13 |
|
DNA translocase FtsK; Provisional
Pssm-ID: 236669 [Multi-domain] Cd Length: 1355 Bit Score: 79.36 E-value: 2.16e-13
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14057 SPIYDANYPTTQSPI---PQQPgVVNIPSVPSPSYPAPNPPVNYPTQPSPQIPvQPGVinipsAPLPTTPPQHPPVFIPS 14133
Cdd:PRK10263 318 EPVAVAAAATTATQSwaaPVEP-VTQTPPVASVDVPPAQPTVAWQPVPGPQTG-EPVI-----APAPEGYPQQSQYAQPA 390
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14134 PESPSPAPKPGVINIPSVTHPEYPTSQVPVYDVNYST-----TPSPIPQKPGVVNIPSAPQPVHPAPNPPVHEFNYPTPP 14208
Cdd:PRK10263 391 VQYNEPLQQPVQPQQPYYAPAAEQPAQQPYYAPAPEQpaqqpYYAPAPEQPVAGNAWQAEEQQSTFAPQSTYQTEQTYQQ 470
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14209 AVPQQPGVLNIPSYPTPVAPTPQspiyiPSQEQPKPTtRPSVINVPSVPQP-AYPTPQAPVYdvnYPTSPSviPHQPGVV 14287
Cdd:PRK10263 471 PAAQEPLYQQPQPVEQQPVVEPE-----PVVEETKPA-RPPLYYFEEVEEKrAREREQLAAW---YQPIPE--PVKEPEP 539
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14288 NIPSVPLPAPPVkqrpvfVPsPVHPTPAPQP-------GVVNIPSVAQPVHPTYQPPV--VERPAIYDVYYP--PPPSRP 14356
Cdd:PRK10263 540 IKSSLKAPSVAA------VP-PVEAAAAVSPlasgvkkATLATGAAATVAAPVFSLANsgGPRPQVKEGIGPqlPRPKRI 612
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14357 GV----------INIPS----------PPRPVYPVPQQPIYVPAPVLH-------------------------------- 14384
Cdd:PRK10263 613 RVptrrelasygIKLPSqraaeekareAQRNQYDSGDQYNDDEIDAMQqdelarqfaqtqqqrygeqyqhdvpvnaedad 692
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14385 IPAPRPVIHNIPSVPQPTYPHRNP--------------PIQDVtypapqpsppvpgIVNIPSLP------QPVSTPTSGV 14444
Cdd:PRK10263 693 AAAEAELARQFAQTQQQRYSGEQPaganpfslddfefsPMKAL-------------LDDGPHEPlftpivEPVQQPQQPV 759
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14445 INIPSQASPPISVPTPGIVNIPSIPQPTPQR---------PSPGIINV--PSVPQPIPTAPSPGIINIPSVPQPLPSPTP 14513
Cdd:PRK10263 760 APQQQYQQPQQPVAPQPQYQQPQQPVAPQPQyqqpqqpvaPQPQYQQPqqPVAPQPQYQQPQQPVAPQPQYQQPQQPVAP 839
|
570 580 590
....*....|....*....|....*....|..
gi 442625924 14514 GviniPQQPTPPPLVQQPGiiNIPSVQQPSTP 14545
Cdd:PRK10263 840 Q----PQDTLLHPLLMRNG--DSRPLHKPTTP 865
|
|
| PHA03379 |
PHA03379 |
EBNA-3A; Provisional |
14154-14685 |
6.03e-13 |
|
EBNA-3A; Provisional
Pssm-ID: 223066 [Multi-domain] Cd Length: 935 Bit Score: 77.79 E-value: 6.03e-13
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14154 PEYPTSQVPVydvnysttPSPIPQKP---------GVVNIPSaPQPVHPAPNPPVHEFNYPTPPAVPQQPgvlnipsyPT 14224
Cdd:PHA03379 411 PTYGTPRPPV--------EKPRPEVPqsletatshGSAQVPE-PPPVHDLEPGPLHDQHSMAPCPVAQLP--------PG 473
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14225 PVAPTPqspiyiPSQEQPKPttrpsvinvPSVPQPAyPTPqapvydVNYPTSPSVIPHQPGVVNIPSVPlPAPPVKQRPV 14304
Cdd:PHA03379 474 PLQDLE------PGDQLPGV---------VQDGRPA-CAP------VPAPAGPIVRPWEASLSQVPGVA-FAPVMPQPMP 530
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14305 FVPSPVhPTPAPQPGVVNIPSVAQ---PVHPTYQPPVVERpaiydvYYPPPPSrpgviniPSPPRPVypvpqqpiyVPAP 14381
Cdd:PHA03379 531 VEPVPV-PTVALERPVCPAPPLIAmqgPGETSGIVRVRER------WRPAPWT-------PNPPRSP---------SQMS 587
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14382 VLHIPA---PRPVIHNIPSVPQPTYPHRNPPIQDVTYpapqpsppvpgivniPSLPQPVSTPTSGVINIPSQA-SPPISV 14457
Cdd:PHA03379 588 VRDRLArlrAEAQPYQASVEVQPPQLTQVSPQQPMEY---------------PLEPEQQMFPGSPFSQVADVMrAGGVPA 652
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14458 PTPGIVNIPsIPQPTPQRPSPGIINVPSVPQPIPTAPSPGIINIPsVPQPLPSPTPGVINIPQQPTPPPLV--------- 14528
Cdd:PHA03379 653 MQPQYFDLP-LQQPISQGAPLAPLRASMGPVPPVPATQPQYFDIP-LTEPINQGASAAHFLPQQPMEGPLVperwmfqga 730
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14529 -----QQPGIINIPSVQQPSTPTTQHPIQDVQYETQRPQPTPGV-----INIPSVSQPT--YPTQKPSYQDTSYPTVQPK 14596
Cdd:PHA03379 731 tlsqsVRPGVAQSQYFDLPLTQPINHGAPAAHFLHQPPMEGPWVpeqwmFQGAPPSQGTdvVQHQLDALGYVLHVLNHPG 810
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14597 PPVSGIINIPSVPQ-----PVPSLTPGVINLPSEPSYSAPIPKPGiinvpsipEPIPSIPQNPVQEvyhdtQKPQAIPGV 14671
Cdd:PHA03379 811 VPVSPAVNQYHVSQaafglPIDEDESGEGSDTSEPCEALDLSIHG--------RPCPQAPEWPVQG-----EGGQDATEV 877
|
570
....*....|....
gi 442625924 14672 VNVPSAPQPTPGRP 14685
Cdd:PHA03379 878 LDLSIHGRPRPRTP 891
|
|
| PHA03379 |
PHA03379 |
EBNA-3A; Provisional |
13894-14303 |
6.51e-13 |
|
EBNA-3A; Provisional
Pssm-ID: 223066 [Multi-domain] Cd Length: 935 Bit Score: 77.79 E-value: 6.51e-13
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13894 TGDPFTRCYETPKPVRPQIYDTPSPPYPVAIPDLVYVQQQQP--GIVNIPSaPQPIYPTPQSPQYNVNYPSPQPanpqkp 13971
Cdd:PHA03379 394 AGKLTERAREALEKASEPTYGTPRPPVEKPRPEVPQSLETATshGSAQVPE-PPPVHDLEPGPLHDQHSMAPCP------ 466
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13972 gVVNIPsvpqpvyPSPQPPVydvnyptTPVSQHPGVvniPSAPRLVPPTSQRPV-FITSPGNLSPTPQPGVINIPSVSQP 14050
Cdd:PHA03379 467 -VAQLP-------PGPLQDL-------EPGDQLPGV---VQDGRPACAPVPAPAgPIVRPWEASLSQVPGVAFAPVMPQP 528
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14051 --GYPTPQsPIYDANYPTTQSPI------PQQP-GVVNIPSVPSPSYPAPNPPvnyptQPSPQIPVQPGV---------- 14111
Cdd:PHA03379 529 mpVEPVPV-PTVALERPVCPAPPliamqgPGETsGIVRVRERWRPAPWTPNPP-----RSPSQMSVRDRLarlraeaqpy 602
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14112 ---INIPSAPLPTTPPQHPPVFIPSPESPSPAPKPGVINIPSVTHPEYPTSQVPVYDvnYSTTpSPIPQKPGVVNIPSA- 14187
Cdd:PHA03379 603 qasVEVQPPQLTQVSPQQPMEYPLEPEQQMFPGSPFSQVADVMRAGGVPAMQPQYFD--LPLQ-QPISQGAPLAPLRASm 679
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14188 -PQPVHPAPNPPVHEFNYPTPPA--------VPQQP--GVLNIPSYPTPVAPTPQS--PIYIPSQEQPKPTTRPsvIN-- 14252
Cdd:PHA03379 680 gPVPPVPATQPQYFDIPLTEPINqgasaahfLPQQPmeGPLVPERWMFQGATLSQSvrPGVAQSQYFDLPLTQP--INhg 757
|
410 420 430 440 450
....*....|....*....|....*....|....*....|....*....|....*...
gi 442625924 14253 ---VPSVPQPAYPTPQAPVYDVNYPTSPS----VIPHQPGVVNIPSVPLPAPPVKQRP 14303
Cdd:PHA03379 758 apaAHFLHQPPMEGPWVPEQWMFQGAPPSqgtdVVQHQLDALGYVLHVLNHPGVPVSP 815
|
|
| PRK10263 |
PRK10263 |
DNA translocase FtsK; Provisional |
14203-14706 |
7.25e-13 |
|
DNA translocase FtsK; Provisional
Pssm-ID: 236669 [Multi-domain] Cd Length: 1355 Bit Score: 77.82 E-value: 7.25e-13
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14203 NYPTPPAVPQQPGVLNIPSYPTPVAPTPQSPIYIPSQEQPKPTTRPSvinvPSVPQPAyPTPQAPVydVNYPTSPSVIPH 14282
Cdd:PRK10263 297 NRATQPEYDEYDPLLNGAPITEPVAVAAAATTATQSWAAPVEPVTQT----PPVASVD-VPPAQPT--VAWQPVPGPQTG 369
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14283 QPGVVNIPSVPLPAPPVKQRPVFVPSPVHpTPAPQPGVVNIPSVAQPVHPTYQPPVVERPAIYDVYYPPPPSrpgviniP 14362
Cdd:PRK10263 370 EPVIAPAPEGYPQQSQYAQPAVQYNEPLQ-QPVQPQQPYYAPAAEQPAQQPYYAPAPEQPAQQPYYAPAPEQ-------P 441
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14363 SPPRPVYPVPQQPIYVPAPVLhipaprpvihnipsvpQPTYPHRNPPIQDVTYPAPQPsppvpgivnipsLPQPVSTPTS 14442
Cdd:PRK10263 442 VAGNAWQAEEQQSTFAPQSTY----------------QTEQTYQQPAAQEPLYQQPQP------------VEQQPVVEPE 493
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14443 GVINIPSQASPPI----------------------SVPTPgivnipsIPQPTPQRPSPGIINVPSVPqPIPTAPS----- 14495
Cdd:PRK10263 494 PVVEETKPARPPLyyfeeveekrarereqlaawyqPIPEP-------VKEPEPIKSSLKAPSVAAVP-PVEAAAAvspla 565
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14496 PGIINIPSVPQPLPSPTPGVINIPQQPTPPPLVQQ--------PGIINIPSVQQPSTPTTQHPIQDVQYE----TQRPQP 14563
Cdd:PRK10263 566 SGVKKATLATGAAATVAAPVFSLANSGGPRPQVKEgigpqlprPKRIRVPTRRELASYGIKLPSQRAAEEkareAQRNQY 645
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14564 TPGVI----NIPSVSQ-----------------------PTYPT------------QKPSYQDTSYPTVQPK-------- 14596
Cdd:PRK10263 646 DSGDQynddEIDAMQQdelarqfaqtqqqrygeqyqhdvPVNAEdadaaaeaelarQFAQTQQQRYSGEQPAganpfsld 725
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14597 ----PPVSGIIN-IPSVPQpvpsLTPGVINLPSEPSYSAPIPKPGIINVPSIPEPIPSIPQNPV--QEVYHDTQKPQAIP 14669
Cdd:PRK10263 726 dfefSPMKALLDdGPHEPL----FTPIVEPVQQPQQPVAPQQQYQQPQQPVAPQPQYQQPQQPVapQPQYQQPQQPVAPQ 801
|
570 580 590 600
....*....|....*....|....*....|....*....|
gi 442625924 14670 GV---VNVPSAPQPTPGRPYYDVAKPdfefnPCYPSPCGP 14706
Cdd:PRK10263 802 PQyqqPQQPVAPQPQYQQPQQPVAPQ-----PQYQQPQQP 836
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
13904-14209 |
2.76e-12 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 76.13 E-value: 2.76e-12
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13904 TPKPVRPQIYDTPSPPYPVAIPDLVYVQQQQPGIVNIPSAPQPIYPTPQSPQYNVNYPSPQPANPQKP---GVV------ 13974
Cdd:PHA03247 2784 TRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPlggSVApggdvr 2863
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13975 -NIPSVPQPVYP--SPQPPVYDVNYPTTPVSQHPGVVNIPSAPRLVPPTSQRPVFITSPGNLSPTPQPGVINIPSVSQPG 14051
Cdd:PHA03247 2864 rRPPSRSPAAKPaaPARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPL 2943
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14052 YPTPQSPiyDANYPTTQSPIPQQ----PGVVNIPSVPSPSyPAPNPPVNYPTQPSPQIPVQPGVINIPSAPLPTTPPQHP 14127
Cdd:PHA03247 2944 APTTDPA--GAGEPSGAVPQPWLgalvPGRVAVPRFRVPQ-PAPSREAPASSTPPLTGHSLSRVSSWASSLALHEETDPP 3020
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14128 PVfipspespspapkpgvinipSVTHPEYPTSqvpvyDVNYSTTPSPIPQKPGVVNIpSAPQPVHPAPNPPVHEFNYPTP 14207
Cdd:PHA03247 3021 PV--------------------SLKQTLWPPD-----DTEDSDADSLFDSDSERSDL-EALDPLPPEPHDPFAHEPDPAT 3074
|
..
gi 442625924 14208 PA 14209
Cdd:PHA03247 3075 PE 3076
|
|
| TALPID3 |
pfam15324 |
Hedgehog signalling target; TALPID3 is a family of eukaryotic proteins that are targets for ... |
14108-14683 |
3.03e-12 |
|
Hedgehog signalling target; TALPID3 is a family of eukaryotic proteins that are targets for Hedgehog signalling. Mutations in this gene, first noticed in chickens, lead to multiple developmental abnormalities.
Pssm-ID: 434634 [Multi-domain] Cd Length: 1288 Bit Score: 75.69 E-value: 3.03e-12
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14108 QPGVINIPSAPLPTTPPQHPP-VFIPSPESPSPAPKPGVINIPSVTHPEYPTSQV---PVYDVNYST----------TPS 14173
Cdd:pfam15324 527 TPNKSVIPRKHFQKQAEEHFRkPPVRSMPASSLQKKEGPLKSTTSLQDEDYLLQVygkAVYQGHRSTlkkgpylrfnSPS 606
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14174 PI--PQKPGVV------NIPSA--------PQPV-------HPAPNPPvHEFNYPTPPA--VPQQPGVLniPSYPTPVA- 14227
Cdd:pfam15324 607 PKskPQRPKVIesvkgtKVKSArtqtdlhaTKPVktdskmqHSVTAPH-QEQQYLFSPSreMPSQSGTL--EGHLIPMAi 683
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14228 ----PTPQSPIYIPSQ---EQPKPTTrpsVINvpSVPqPAYPTPQAPVYDVNY--------PTSPS--VIPHQPGVvNIP 14290
Cdd:pfam15324 684 plgqTQSDSDSPPPAGvivSKPHPVT---VTT--SIP-PSSRKPEPGVKKPNIallemkseKKDPPqlTVQVLPSV-DID 756
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14291 SVPLPAPPVKQRPvFVPSPVHPTPAPQPGVVNIPSVAQPVHPTYQPPVVERPAIYDVYYPP------PPSRPGVINI--- 14361
Cdd:pfam15324 757 SVSCSSRDSSPSP-VLPSPSEASPPLIQTWIQTPELMKEDEEEVKFPGTNFDEVIDVIQDEekedeiPEFSEPPLEFnrs 835
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14362 PSPPRPVYPVPQQPiyvPAPvlhiPAPRPVIHNIPSVPQPTYPHRNPPIQDVTypapqpsppvpgivnipslPQPVSTPT 14441
Cdd:pfam15324 836 VKPPSTKYNGPPFP---PVV----SQPQPTTDILDKVIEQRETLENRLVDWVE-------------------QEIMARII 889
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14442 SGVINIPSQASPPISVP--------TPGIVNIPS-----------IP-----------------------QPTPQRPSPG 14479
Cdd:pfam15324 890 SGMFPQQAQADPDASVSesepsepsTSDIVEAAGggglqlfvdagVPvdsemirhfvnealaetiaimlgDREAQREPPV 969
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14480 iinVPSVPQPIPTapspgiiNIPSVPQPLPSPTPGViniPQQPtPPPLvQQPGIINIPSVQQPSTPTTQHPIQDVQYET- 14558
Cdd:pfam15324 970 ---AASVPGDLPT-------KETLLPTPVPTPQPTP---PCSP-PSPL-KEPSPVKTPDSSPCVSEHDFFPVKEIPPEKg 1034
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14559 QRPQPTPGVINIPSVSqptyPTQKPSyqdtsyPTVQPKPPVSGI-INIPSVPQPVPSLTPGVINLPSEPSYSAPI----- 14632
Cdd:pfam15324 1035 ADTGPAVSLVITPTVT----PIATPP------PAATPTPPLSENsIDKLKSPSPELPKPWEDSDLPLEEENPNSEqeelh 1104
|
650 660 670 680 690
....*....|....*....|....*....|....*....|....*....|....*
gi 442625924 14633 PKPGIINVPSIPEP----IPSIPQNPvqevyhdtqKPQAIPGVVNVPSAPQPTPG 14683
Cdd:pfam15324 1105 PRAVVMSVARDEEPesvvLPASPPEP---------KPLAPPPLGAAPPSPPQSPS 1150
|
|
| Glutenin_hmw |
pfam03157 |
High molecular weight glutenin subunit; Members of this family include high molecular weight ... |
13931-14601 |
8.30e-12 |
|
High molecular weight glutenin subunit; Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.
Pssm-ID: 367362 [Multi-domain] Cd Length: 786 Bit Score: 73.83 E-value: 8.30e-12
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13931 QQQQPGIV-NIPSAPQPIYPTPQSPQYNVNYPSPqpANPQKPGVVNIPSVPQPVYpspqppvydvnYPTTPvsQHPGVVN 14009
Cdd:pfam03157 92 QQLQQGIFwGIPALLQRYYPGVTSPQQVSYYPGQ--ASPQRPGQGQQPGQGQQWY-----------YPTSP--QQPGQWQ 156
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14010 IPSA--PRLVPPTSQRPVFITSPGNLSPTPQPGVINIPSVSQPGYPTPQSPIYDANYPTTQSPIPQQPGVVNIPSVPSP- 14086
Cdd:pfam03157 157 QPGQgqQGYYPTSPQQSGQRQQPGQGQQLRQGQQGQQSGQGQPGYYPTSSQQPGQLQQTGQGQQGQQPERGQQGQQPGQg 236
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14087 SYPAPNPPVNYPTQPSpqipvQPGVINIPSAPLpttPPQHPPVFIPSPESPSPAPKPGVINIPSVTHPEYPTSQvpvydv 14166
Cdd:pfam03157 237 QQPGQGQQGQQPGQPQ-----QLGQGQQGYYPI---SPQQPRQWQQSGQGQQGYYPTSLQQPGQGQSGYYPTSQ------ 302
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14167 nysTTPSPIPQKPGVVNIPSAPQPVHPAPNPPVHEFNYPTPPAVPQQPGVLNIPSYPT-PVAPTPQSPIYIP-SQEQPKP 14244
Cdd:pfam03157 303 ---QQAGQLQQEQQLGQEQQDQQPGQGRQGQQPGQGQQGQQPAQGQQPGQGQPGYYPTsPQQPGQGQPGYYPtSQQQPQQ 379
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14245 TTRPSVINVPSVP-------QPA---YPTPQAPVYdvnYPTSPSVIPH-QPGvvNIPSVPLPAPPVKQrpvfvPSPVHPT 14313
Cdd:pfam03157 380 GQQPEQGQQGQQQgqgqqgqQPGqgqQPGQGQPGY---YPTSPQQSGQgQPG--YYPTSPQQSGQGQQ-----PGQGQQP 449
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14314 PAPQPGVVNIPSVAQPVHPTYQPPVVERPAI-YDVYYPPPPSRPGviNIPSPPRPVYPVPQQPIYVPAPVLHiPAPRPVI 14392
Cdd:pfam03157 450 GQEQPGQGQQPGQGQQGQQPGQPEQGQQPGQgQPGYYPTSPQQSG--QGQQLGQWQQQGQGQPGYYPTSPLQ-PGQGQPG 526
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14393 HNIPSVPQPTYPHRNPPIQDVTYPAPQPSPPVPGIVNIPSLPQPVSTPTSgviniPSQASPPISVPTPGIVN---IPSIP 14469
Cdd:pfam03157 527 YYPTSPQQPGQGQQLGQLQQPTQGQQGQQSGQGQQGQQPGQGQQGQQPGQ-----GQQGQQPGQGQQPGQGQpgyYPTSP 601
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14470 QPTPQRPSPGIINVPSVPQPIPTAPSPGIIN------IPSVP-QPLPSPTPGVIN---------IPQQPTPPPLVQQPGi 14533
Cdd:pfam03157 602 QQSGQGQQPGQWQQPGQGQPGYYPTSSLQLGqgqqgyYPTSPqQPGQGQQPGQWQqsgqgqqgyYPTSPQQSGQAQQPG- 680
|
650 660 670 680 690 700 710
....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14534 inipSVQQPStpTTQHPIQDVQ--YETQRPQPTPGvinipsvSQPTYPTQKPSYQDTSYPTVQPKPPVSG 14601
Cdd:pfam03157 681 ----QGQQPG--QWLQPGQGQQgyYPTSPQQPGQG-------QQLGQGQQSGQGQQGYYPTSPGQGQQSG 737
|
|
| PHA03378 |
PHA03378 |
EBNA-3B; Provisional |
14228-14683 |
1.22e-11 |
|
EBNA-3B; Provisional
Pssm-ID: 223065 [Multi-domain] Cd Length: 991 Bit Score: 73.56 E-value: 1.22e-11
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14228 PTPQSPIYI----PSQEQPKPTTRPSViNVPSVPQPAYPTPQAPVYDVNYPTSPSVIPHQP---------GVVNIPSVPL 14294
Cdd:PHA03378 445 PHSQAPTVVlhrpPTQPLEGPTGPLSV-QAPLEPWQPLPHPQVTPVILHQPPAQGVQAHGSmldllekddEDMEQRVMAT 523
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14295 PAPPVKQRP--------VF------------VPSPVHPT--PAPQPGVVNIPSVAQPVHP---TYQPPVVERP--AIYDV 14347
Cdd:PHA03378 524 LLPPSPPQPragrrapcVYtedldiesdepaSTEPVHDQllPAPGLGPLQIQPLTSPTTSqlaSSAPSYAQTPwpVPHPS 603
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14348 YYPPPPSRPGVINIPSPPRPvYPVPQQPIyvpapVLHIPAPRPVIHNIPSVPQPTYPhrnPPIQDVTYPAPQPSppvpgI 14427
Cdd:PHA03378 604 QTPEPPTTQSHIPETSAPRQ-WPMPLRPI-----PMRPLRMQPITFNVLVFPTPHQP---PQVEITPYKPTWTQ-----I 669
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14428 VNIPSLPQPVSTPTSGVIN-IPSQASPPISVPTPgiVNIPSIPqPTPQRPSPGIINVPSVPQPIPTAPSPgiinipsvPQ 14506
Cdd:PHA03378 670 GHIPYQPSPTGANTMLPIQwAPGTMQPPPRAPTP--MRPPAAP-PGRAQRPAAATGRARPPAAAPGRARP--------PA 738
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14507 PLPSPTPGVINIPQQPTPPPLVQQPGIiniPSVQQPSTPTtqhpiqdvqyetqrPQPTPGVinipsvsqPTYPTQKPsyQ 14586
Cdd:PHA03378 739 AAPGRARPPAAAPGRARPPAAAPGRAR---PPAAAPGAPT--------------PQPPPQA--------PPAPQQRP--R 791
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14587 DTSYPTVQPK-PPVSGIINIPSVP-QPVPSLTPGVINLPSEPSYSAP---IPKPGIINVPSIPEPIPS------IPQNPV 14655
Cdd:PHA03378 792 GAPTPQPPPQaGPTSMQLMPRAAPgQQGPTKQILRQLLTGGVKRGRPslkKPAALERQAAAGPTPSPGsgtsdkIVQAPV 871
|
490 500 510
....*....|....*....|....*....|....*..
gi 442625924 14656 qeVYHDTQKPQAIPGVV---------NVPSAPQPTPG 14683
Cdd:PHA03378 872 --FYPPVLQPIQVMRQLgsvraaaasTVTQAPTEYTG 906
|
|
| PHA03377 |
PHA03377 |
EBNA-3C; Provisional |
13947-14404 |
2.05e-11 |
|
EBNA-3C; Provisional
Pssm-ID: 177614 [Multi-domain] Cd Length: 1000 Bit Score: 72.78 E-value: 2.05e-11
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13947 IYPTPQSPQYNVNYPSPQpANPQKPGVV----------NIPSVPQPVYPSPQPPVydvnyPTTPVsqHPGVVNIPSAPRL 14016
Cdd:PHA03377 408 VSRVPWRKPRTLPWPTPK-THPVKRTLVktsgrsdeaeQAQSTPERPGPSDQPSV-----PVEPA--HLTPVEHTTVILH 479
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14017 VPPTSQRPVFItspgnlSPTPQPG----------------VINI------PSVSQPGYP--TPQSPI-YDANYPTTQSPI 14071
Cdd:PHA03377 480 QPPQSPPTVAI------KPAPPPSrrrrgacvvydddiieVIDVetteeeESVTQPAKPhrKVQDGFqRSGRRQKRATPP 553
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14072 PQQPGVVNIPSV--PSPSYPAPNPPVNYPTQPSPQIPVQPGVINIPSAPLPTTPPQHPPVFIPSPESPSPapkpgvINIP 14149
Cdd:PHA03377 554 KVSPSDRGPPKAspPVMAPPSTGPRVMATPSTGPRDMAPPSTGPRQQAKCKDGPPASGPHEKQPPSSAPR------DMAP 627
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14150 SVTHP-------EYPTSQVP--VYDVNYSTTPSPIPQKPGVVNIPsAPQPVHPAPNPPVHEFNYPTPPAVPQQPGVLNIP 14220
Cdd:PHA03377 628 SVVRMflrerllEQSTGPKPksFWEMRAGRDGSGIQQEPSSRRQP-ATQSTPPRPSWLPSVFVLPSVDAGRAQPSEESHL 706
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14221 SYPTPVAPT----------PQSPIYI---PSQEQPKPTTRP----SVINVPSVPQPAY---PTPQAPVYDVNYPTSpsvi 14280
Cdd:PHA03377 707 SSMSPTQPIsheeqpryedPDDPLDLslhPDQAPPPSHQAPysghEEPQAQQAPYPGYwepRPPQAPYLGYQEPQA---- 782
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14281 pHQPGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVNIPSVAQPVHPTYQPPVVERPAIYDV-YYPPPPSRPGvi 14359
Cdd:PHA03377 783 -QGVQVSSYPGYAGPWGLRAQHPRYRHSWAYWSQYPGHGHPQGPWAPRPPHLPPQWDGSAGHGQDQVsQFPHLQSETG-- 859
|
490 500 510 520 530
....*....|....*....|....*....|....*....|....*....|..
gi 442625924 14360 nipsPPR------PVYPVPQQPIYVPAPVLHIPAPRPVIHNIPS-VPQPTYP 14404
Cdd:PHA03377 860 ----PPRlqlsqvPQLPYSQTLVSSSAPSWSSPQPRAPIRPIPTrFPPPPMP 907
|
|
| PBP1 |
COG5180 |
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ... |
14020-14528 |
1.18e-10 |
|
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];
Pssm-ID: 444064 [Multi-domain] Cd Length: 548 Bit Score: 69.71 E-value: 1.18e-10
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14020 TSQRPVFITSPGNLS-PTPQPGVINIPSVSQPGYPTPQSPIYDANYPTTQSPipqqPGVVNIPSvPSPSYPA-------- 14090
Cdd:COG5180 2 RKATILEIRLLATVPiPPNAARPVLSPELWAAANNDAVSQGDRSALASSPTR----PYARKIFE-PLDIKLAlgkpqlps 76
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14091 -PNPPVNYPTQP---SPQIPVQP--GVINIPSAPLPTTPPQHPPVFIPSPESPSpapkpgVINIPSVTHPEYPTSQVPVY 14164
Cdd:COG5180 77 vAEPEAYLDPAPpksSPDTPEEQlgAPAGDLLVLPAAKTPELAAGALPAPAAAA------ALPKAKVTREATSASAGVAL 150
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14165 DVNYSTTPSPIPQKPGVVNIPSAPQPVHPAPN-----PPVHEFNYPTP---PAVPQQPGVLNIPSYPTPVAPTPQsPIYI 14236
Cdd:COG5180 151 AAALLQRSDPILAKDPDGDSASTLPPPAEKLDkvltePRDALKDSPEKldrPKVEVKDEAQEEPPDLTGGADHPR-PEAA 229
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14237 PSQEQPKPTTRPSVINVPSVPQPAYPTPQAPVYDvnyptspsviPHQPGVVNIPSVPLPAPPV---KQRPVFV-PSPVHP 14312
Cdd:COG5180 230 SSPKVDPPSTSEARSRPATVDAQPEMRPPADAKE----------RRRAAIGDTPAAEPPGLPVleaGSEPQSDaPEAETA 299
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14313 TPAPQPGVVNIPSVAQPVHPT---------YQPPVVERPAIYDVYYPPPPSRPGVINIPSPPRPVYPVPQQpiyVPAPVl 14383
Cdd:COG5180 300 RPIDVKGVASAPPATRPVRPPggardpgtpRPGQPTERPAGVPEAASDAGQPPSAYPPAEEAVPGKPLEQG---APRPG- 375
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14384 HIPAPRPVIHNIPSVPQPTYPHRNPPIQDVTYPAPQPSPPVPGIVNIPSLPQPVSTPTS------GVINIPSQASPPISV 14457
Cdd:COG5180 376 SSGGDGAPFQPPNGAPQPGLGRRGAPGPPMGAGDLVQAALDGGGRETASLGGAAGGAGQgpkadfVPGDAESVSGPAGLA 455
|
490 500 510 520 530 540 550
....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 442625924 14458 PTPGIVNIPSIPQPTPQRPSPGIINVPSVPQPIPTAPSPGiinIPSVPQPLPSPTPgVINIPQQPTPPPLV 14528
Cdd:COG5180 456 DQAGAAASTAMADFVAPVTDATPVDVADVLGVRPDAILGG---NVAPASGLDAETR-IIEAEGAPATEDFV 522
|
|
| PHA03379 |
PHA03379 |
EBNA-3A; Provisional |
13980-14413 |
3.34e-10 |
|
EBNA-3A; Provisional
Pssm-ID: 223066 [Multi-domain] Cd Length: 935 Bit Score: 68.93 E-value: 3.34e-10
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13980 PQPVYPSPQPPVyDVNYPTTPVS--QHP--GVVNIPSAPrlvpptsqrPVFITSPGNLSPtpQPGVINIPSVSQPgyPTP 14055
Cdd:PHA03379 409 SEPTYGTPRPPV-EKPRPEVPQSleTATshGSAQVPEPP---------PVHDLEPGPLHD--QHSMAPCPVAQLP--PGP 474
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14056 QSPIydanypttqSPIPQQPGVVNIPSVPSPSYPAPNPPVNYPTQPSP-QIPVQpgvinipsAPLPTTPPQHPPVFIPSP 14134
Cdd:PHA03379 475 LQDL---------EPGDQLPGVVQDGRPACAPVPAPAGPIVRPWEASLsQVPGV--------AFAPVMPQPMPVEPVPVP 537
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14135 ESPSPAPKPGVINIPSVTHPEYPTSQVPVYD--VNYSTTPSPiPQKPGVVNIPSAPQPVHPAPNPPVHEFNYpTPPAVPQ 14212
Cdd:PHA03379 538 TVALERPVCPAPPLIAMQGPGETSGIVRVRErwRPAPWTPNP-PRSPSQMSVRDRLARLRAEAQPYQASVEV-QPPQLTQ 615
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14213 QPgvlnipsyptpvaptPQSPIYIPSQ-EQPKPTTRPSVINVPSVPQPAYPTPQAPVYDvnYPTSpsviphQPGVVNIPS 14291
Cdd:PHA03379 616 VS---------------PQQPMEYPLEpEQQMFPGSPFSQVADVMRAGGVPAMQPQYFD--LPLQ------QPISQGAPL 672
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14292 VPLPAPPVkqrpvfvpsPVHPTPAPQPGVVNIP---------SVAQ--PVHPTyQPPVVERPAIYDVYYPPPPSRPGVIN 14360
Cdd:PHA03379 673 APLRASMG---------PVPPVPATQPQYFDIPltepinqgaSAAHflPQQPM-EGPLVPERWMFQGATLSQSVRPGVAQ 742
|
410 420 430 440 450
....*....|....*....|....*....|....*....|....*....|...
gi 442625924 14361 IPSPPRPVypvpQQPIYVPAPVLHIPaPRPVIHNiPSVPQPTYPHRNPPIQDV 14413
Cdd:PHA03379 743 SQYFDLPL----TQPINHGAPAAHFL-HQPPMEG-PWVPEQWMFQGAPPSQGT 789
|
|
| PHA03377 |
PHA03377 |
EBNA-3C; Provisional |
13892-14270 |
2.54e-09 |
|
EBNA-3C; Provisional
Pssm-ID: 177614 [Multi-domain] Cd Length: 1000 Bit Score: 65.84 E-value: 2.54e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13892 SHTGDPFTRC-YETPKPVRPQIYDTPSPPYPVAIPDLVYVQQQQPGIVNIPSAPQPIYpTPQSPQYNVNYPS-------- 13962
Cdd:PHA03377 558 SDRGPPKASPpVMAPPSTGPRVMATPSTGPRDMAPPSTGPRQQAKCKDGPPASGPHEK-QPPSSAPRDMAPSvvrmflre 636
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13963 ---PQPANPqKPGVV-------NIPSVPQPVYPSPQPPVydvnYPTTPV-SQHPGVVNIPSaprlVPPTSQRPVFITSPG 14031
Cdd:PHA03377 637 rllEQSTGP-KPKSFwemragrDGSGIQQEPSSRRQPAT----QSTPPRpSWLPSVFVLPS----VDAGRAQPSEESHLS 707
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14032 NLSPTpQPgvinIPSVSQPGYPTPQSPIYDANYPTTQSPIPQQ---PGVVNIPS--VPSPSYPAPNPPvnyptqPSP--- 14103
Cdd:PHA03377 708 SMSPT-QP----ISHEEQPRYEDPDDPLDLSLHPDQAPPPSHQapySGHEEPQAqqAPYPGYWEPRPP------QAPylg 776
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14104 -QIPVQPG--VINIPSAPLPTTP-PQHppvfipspespspapkpgviniPSVTHPEYPTSQVPVYdvNYSTTP-SPIPQK 14178
Cdd:PHA03377 777 yQEPQAQGvqVSSYPGYAGPWGLrAQH----------------------PRYRHSWAYWSQYPGH--GHPQGPwAPRPPH 832
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14179 PGVVNIPSA-PQPVHPAPNPPVHefNYPTPPAvPQQPGVLNIPSYPTPV---APTPQSPiyipsqeQPKPTTRPsvinVP 14254
Cdd:PHA03377 833 LPPQWDGSAgHGQDQVSQFPHLQ--SETGPPR-LQLSQVPQLPYSQTLVsssAPSWSSP-------QPRAPIRP----IP 898
|
410
....*....|....*.
gi 442625924 14255 SvpqpAYPTPQAPVYD 14270
Cdd:PHA03377 899 T----RFPPPPMPLQD 910
|
|
| SP2_N |
cd22540 |
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins ... |
14066-14586 |
4.04e-09 |
|
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate or, in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9.
The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.
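The consensus notation in the description above (NNRCRCCYY for DNA, CXCPXC for the BTD box) can be turned into regular expressions for a quick scan. The following is a minimal sketch, not part of the CDD record: the helper names `iupac_to_regex`, `btd_box_regex`, and `find_sites` are hypothetical, only the IUPAC codes named in the description (N, R, Y) are handled, and neither reverse complements nor strand orientation are considered.

```python
import re

# Subset of IUPAC nucleotide codes used in the description (assumption:
# only A/C/G/T plus N, R and Y are needed for this motif).
IUPAC_DNA = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine
    "Y": "[CT]",    # pyrimidine
    "N": "[ACGT]",  # any nucleotide
}


def iupac_to_regex(motif: str) -> re.Pattern:
    """Compile a DNA consensus written in IUPAC codes into a regex."""
    return re.compile("".join(IUPAC_DNA[base] for base in motif.upper()))


def btd_box_regex() -> re.Pattern:
    """Regex for the protein-level BTD box CXCPXC (X = any residue)."""
    return re.compile(r"C[A-Z]CP[A-Z]C")


def find_sites(pattern: re.Pattern, sequence: str) -> list:
    """Return 0-based start offsets of all matches, including overlapping ones."""
    lookahead = re.compile(f"(?=({pattern.pattern}))")
    return [m.start() for m in lookahead.finditer(sequence.upper())]


# Consensus motif quoted in the description.
SP_KLF_CONSENSUS = iupac_to_regex("NNRCRCCYY")

if __name__ == "__main__":
    # Made-up sequences, used only to exercise the helpers.
    print(find_sites(SP_KLF_CONSENSUS, "TTAAGCACCTTGG"))
    print(find_sites(btd_box_regex(), "MKCRCPNCQ"))
```

A lookahead is used in `find_sites` so that overlapping occurrences are reported, which matters for short, degenerate motifs in repetitive sequence.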
Pssm-ID: 411776 [Multi-domain] Cd Length: 511 Bit Score: 64.56 E-value: 4.04e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14066 TTQSPIPQQPGVVNIP-SVPSPsyPAPNPPVnypTQPSPQIPVQPGVINIPSAPLPTTPPQHPPVFIPSPESpspapkpg 14144
Cdd:cd22540 18 TTQDSQPSPLALLAATcSKIGP--PAVEAAV---TPPAPPQPTPRKLVPIKPAPLPLGPGKNSIGFLSAKGN-------- 84
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14145 VINI-PSVTHPEYPTSQVPVYDVN-------YSTTPSPIPQKPGVVNIPSAPQP-------VHPAPNPpvhefNYPTPPA 14209
Cdd:cd22540 85 IIQLqGSQLSSSAPGGQQVFAIQNptmiikgSQTRSSTNQQYQISPQIQAAGQInnsgqiqIIPGTNQ-----AIITPVQ 159
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14210 VPQQPgvlNIPSYPTPVAPTPQSPIYIPSQEQPKPTTrpsVINVPSVPQPAYPTPQAPVYDVNYPTSPSVIPHQPGVVN- 14288
Cdd:cd22540 160 VLQQP---QQAHKPVPIKPAPLQTSNTNSASLQVPGN---VIKLQSGGNVALTLPVNNLVGTQDGATQLQLAAAPSKPSk 233
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14289 -----IPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGvVNIPSVAQPvhPTYQPPVVERpaiydVYYPPPPSRPGVINIps 14363
Cdd:cd22540 234 kirkkSAQAAQPAVTVAEQVETVLIETTADNIIQAG-NNLLIVQSP--GTGQPAVLQQ-----VQVLQPKQEQQVVQI-- 303
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14364 pprpvypvPQQPIYVpapvlhipaPRPVIHNIPSVPQPtyPHRNPPIQdvtypapqpsppvpgivNIPSLPQPV--STPT 14441
Cdd:cd22540 304 --------PQQALRV---------VQAASATLPTVPQK--PLQNIQIQ-----------------NSEPTPTQVyiKTPS 347
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14442 SGVINIPSQASPPISVPTPgivniPSIPQPTPQRPSPGIINVPSVPQPIPTAPspgiinipsvPQPLPSPTPGVI--NIP 14519
Cdd:cd22540 348 GEVQTVLLQEAPAATATPS-----SSTSTVQQQVTANNGTGTSKPNYNVRKER----------TLPKIAPAGGIIslNAA 412
|
490 500 510 520 530 540
....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 442625924 14520 QQPTPPPLVQQpgiINIPSVQQPSTPTTQhpiqdvqyeTQRP-QPTPGVINIPSVSQPTYPTQKPSYQ 14586
Cdd:cd22540 413 QLAAAAQAIQT---ININGVQVQGVPVTI---------TNAGgQQQLTVQTVSSNNLTISGLSPTQIQ 468
|
|
| PspC_subgroup_2 |
NF033839 |
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, ... |
13903-14122 |
5.00e-09 |
|
pneumococcal surface protein PspC, LPXTG-anchored form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site. The other form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A.
Pssm-ID: 468202 [Multi-domain] Cd Length: 557 Bit Score: 64.40 E-value: 5.00e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13903 ETPKP-VRPQiydtPSPPYPVAIPDLvyvQQQQPGIVNIPSAPQP-IYPTPQSPQYNVnypSPQPANPqKPGVVnipsvP 13980
Cdd:NF033839 326 EKPKPeVKPQ----PEKPKPEVKPQL---ETPKPEVKPQPEKPKPeVKPQPEKPKPEV---KPQPETP-KPEVK-----P 389
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13981 QPVYP----SPQPPVYDVNYPTTPVSQHPGVVNIPSAPRL-VPPTSQRPvfitspgNLSPTPQPGVINiPSV-SQPGYPT 14054
Cdd:NF033839 390 QPEKPkpevKPQPEKPKPEVKPQPEKPKPEVKPQPEKPKPeVKPQPEKP-------KPEVKPQPEKPK-PEVkPQPETPK 461
|
170 180 190 200 210 220 230
....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 442625924 14055 PQ-SPIYDANYPTTQsPIPQQPGVVNipSVPSPSYPAPNPPVNYP--TQPSPQIPVQPGVINIPSAPLPTT 14122
Cdd:NF033839 462 PEvKPQPEKPKPEVK-PQPEKPKPDN--SKPQADDKKPSTPNNLSkdKQPSNQASTNEKATNKPKKSLPST 529
|
|
| PHA03378 |
PHA03378 |
EBNA-3B; Provisional |
14277-14692 |
5.88e-09 |
|
EBNA-3B; Provisional
Pssm-ID: 223065 [Multi-domain] Cd Length: 991 Bit Score: 64.70 E-value: 5.88e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14277 PSVIPHQPGVVNIPSVPLPAPPVKQRPVFvpspvhptpapqPGVVNIPSVAQPV---HPTYQPPVVERPAIydvyyPPPP 14353
Cdd:PHA03378 385 PQTLPDPPTVYGRPKVFARKADLKSTKKC------------RAIVTDPSVIKAIeeeHRKKKAARTEQPRA-----TPHS 447
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14354 SRPGVInIPSPPRPVYPVPQQPIYVPAPVlhipaprpvihnipsVPQPTYPHrnPPIQDVtypapqpsppvpgIVNIPSL 14433
Cdd:PHA03378 448 QAPTVV-LHRPPTQPLEGPTGPLSVQAPL---------------EPWQPLPH--PQVTPV-------------ILHQPPA 496
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14434 pQPVSTPTSgVINIPSQASPPISVPTPGIVNIPSIPQPTPQRPSPGII--------NVPSVPQPIPT----APSPGIINI 14501
Cdd:PHA03378 497 -QGVQAHGS-MLDLLEKDDEDMEQRVMATLLPPSPPQPRAGRRAPCVYtedldiesDEPASTEPVHDqllpAPGLGPLQI 574
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14502 psvpQPLPSPTPGVInipqQPTPPPLVQQPGIINIPSvQQPSTPTTQHPIQDVQYETQRPQPTPGVINIPSVSQPT---- 14577
Cdd:PHA03378 575 ----QPLTSPTTSQL----ASSAPSYAQTPWPVPHPS-QTPEPPTTQSHIPETSAPRQWPMPLRPIPMRPLRMQPItfnv 645
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14578 ----YPTQKPSYQDTSYPTVQPKPPvsgiiNIPSVPQPV-------PSLTPGVINLPsePSYSAPIPKPGIINVPSIPEP 14646
Cdd:PHA03378 646 lvfpTPHQPPQVEITPYKPTWTQIG-----HIPYQPSPTgantmlpIQWAPGTMQPP--PRAPTPMRPPAAPPGRAQRPA 718
|
410 420 430 440 450
....*....|....*....|....*....|....*....|....*....|...
gi 442625924 14647 IPSIPQNPVQEVYHDTQKPQAIPGVVNVPSA-------PQPTPGRPYYDVAKP 14692
Cdd:PHA03378 719 AATGRARPPAAAPGRARPPAAAPGRARPPAAapgrarpPAAAPGRARPPAAAP 771
|
|
| DUF5585 |
pfam17823 |
Family of unknown function (DUF5585); This is a family of unknown function found in Chordata. |
13986-14267 |
7.56e-09 |
|
Family of unknown function (DUF5585); This is a family of unknown function found in Chordata.
Pssm-ID: 465521 [Multi-domain] Cd Length: 506 Bit Score: 63.83 E-value: 7.56e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13986 SPQPPVYDVNYPTTPVSQHPGVVNiPSAP--RLVPPTSQRPVFITSPGNL----SPTPQPGVINIPSVSQPGYPTPQSPI 14059
Cdd:pfam17823 134 IAALPSEAFSAPRAAACRANASAA-PRAAiaAASAPHAASPAPRTAASSTtaasSTTAASSAPTTAASSAPATLTPARGI 212
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14060 YDA----NYPTTQSPIPQQPGVVNIPSVPSPSYPAPNPP-VNYPTQPSPQIPVQPGVINIpSAPLPTT--PPQHPPVFIP 14132
Cdd:pfam17823 213 STAatatGHPAAGTALAAVGNSSPAAGTVTAAVGTVTPAaLATLAAAAGTVASAAGTINM-GDPHARRlsPAKHMPSDTM 291
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14133 SPESPSpapkpgviniPSVTHPEYPTSQVPVYDVNYSTTPSPIPQKPGVVNIPSAPQPVHPAPNPPVhefnyPTPPAVPQ 14212
Cdd:pfam17823 292 ARNPAA----------PMGAQAQGPIIQVSTDQPVHNTAGEPTPSPSNTTLEPNTPKSVASTNLAVV-----TTTKAQAK 356
|
250 260 270 280 290
....*....|....*....|....*....|....*....|....*....|....*.
gi 442625924 14213 QPGvlnipSYPTPVAPTPQspiyIPSQEQPKPTTRPSVInvPSVPQPAYP-TPQAP 14267
Cdd:pfam17823 357 EPS-----ASPVPVLHTSM----IPEVEATSPTTQPSPL--LPTQGAAGPgILLAP 401
|
|
| PHA03377 |
PHA03377 |
EBNA-3C; Provisional |
14292-14708 |
7.86e-09 |
|
EBNA-3C; Provisional
Pssm-ID: 177614 [Multi-domain] Cd Length: 1000 Bit Score: 64.30 E-value: 7.86e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14292 VPLPAP---PVKQRPVFVPSPV---HPTPAPQPGVVNIPSVAQPVHPTYQPPVVERPAiydvyypPPPSRPGviniPS-- 14363
Cdd:PHA03377 390 LPYIDPnmePVQQRPVMFVSRVpwrKPRTLPWPTPKTHPVKRTLVKTSGRSDEAEQAQ-------STPERPG----PSdq 458
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14364 PPRPVYPVPQQPIYVPAPVLHIPAPRPVIHNIPSVPQPTYPHRNPPI---QDVTYPAPQPSPPVPGIVNIPSLPQ--PVS 14438
Cdd:PHA03377 459 PSVPVEPAHLTPVEHTTVILHQPPQSPPTVAIKPAPPPSRRRRGACVvydDDIIEVIDVETTEEEESVTQPAKPHrkVQD 538
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14439 TPTSGVINIPSQASPPISvptPGIVNIPSIPQPTPQRPS--PGIINVPSVPQPIPTAPSPGIINIPSVPQPLPSPTPgvi 14516
Cdd:PHA03377 539 GFQRSGRRQKRATPPKVS---PSDRGPPKASPPVMAPPStgPRVMATPSTGPRDMAPPSTGPRQQAKCKDGPPASGP--- 612
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14517 NIPQQPTPPPLVQQPGII---------------------------NIPSVQQPSTPTTQHPIQDVqyeTQRPQPTPGVIN 14569
Cdd:PHA03377 613 HEKQPPSSAPRDMAPSVVrmflrerlleqstgpkpksfwemragrDGSGIQQEPSSRRQPATQST---PPRPSWLPSVFV 689
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14570 IPSV------------------SQPTYPTQKPSYQDTSYPT-VQPKPPVSgiinipsvPQPVP-SLTPGVINLPSEPSys 14629
Cdd:PHA03377 690 LPSVdagraqpseeshlssmspTQPISHEEQPRYEDPDDPLdLSLHPDQA--------PPPSHqAPYSGHEEPQAQQA-- 759
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14630 apiPKPGiinvpsIPEPIPsiPQNPvqevYHDTQKPQAIPG-VVNVPSAPQPTPGRPYYdvakpdfefnPCYPSPCGPYS 14708
Cdd:PHA03377 760 ---PYPG------YWEPRP--PQAP----YLGYQEPQAQGVqVSSYPGYAGPWGLRAQH----------PRYRHSWAYWS 814
|
|
| PTZ00449 |
PTZ00449 |
104 kDa microneme/rhoptry antigen; Provisional |
14197-14584 |
1.49e-08 |
|
104 kDa microneme/rhoptry antigen; Provisional
Pssm-ID: 185628 [Multi-domain] Cd Length: 943 Bit Score: 63.56 E-value: 1.49e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14197 PPVHEFNYPTPPAVPQQPGVLNIPsyptPVAP---TPQSPIYIPSQE--QPKPTTRPSVINVPSVPQPAYPTPQapvydv 14271
Cdd:PTZ00449 497 APIEEEDSDKHDEPPEGPEASGLP----PKAPgdkEGEEGEHEDSKEsdEPKEGGKPGETKEGEVGKKPGPAKE------ 566
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14272 nyptspsvipHQPGVVnipsvplpaPPVKQRPVFVPSPVHPTPAPQPGVVNIPSVAQpvHPTyQPPVVERPAIYDVyyPP 14351
Cdd:PTZ00449 567 ----------HKPSKI---------PTLSKKPEFPKDPKHPKDPEEPKKPKRPRSAQ--RPT-RPKSPKLPELLDI--PK 622
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14352 PPSRPGVINIP-SPPRPVYPV-PQQPIYVPAPvlhiPAPRPvihniPSVPQPTYphrNPPIQDVTYPAPQPSPPvpgivn 14429
Cdd:PTZ00449 623 SPKRPESPKSPkRPPPPQRPSsPERPEGPKII----KSPKP-----PKSPKPPF---DPKFKEKFYDDYLDAAA------ 684
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14430 ipslpQPVSTPTSGVINIPSQASPPISVP-TPGIVNIPSIPQPtPQRPSpgiinVPSVP-QPI--PTAPSPGIInipsvp 14505
Cdd:PTZ00449 685 -----KSKETKTTVVLDESFESILKETLPeTPGTPFTTPRPLP-PKLPR-----DEEFPfEPIgdPDAEQPDDI------ 747
|
330 340 350 360 370 380 390
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 442625924 14506 QPLPSPTPGVINIPQQPTPPPLvqqPGIInipsvqqpstpTTQHPIQDVQYETQRPQPTPGVINIPSVSQPTYPTQKPS 14584
Cdd:PTZ00449 748 EFFTPPEEERTFFHETPADTPL---PDIL-----------AEEFKEEDIHAETGEPDEAMKRPDSPSEHEDKPPGDHPS 812
|
|
| TALPID3 |
pfam15324 |
Hedgehog signalling target; TALPID3 is a family of eukaryotic proteins that are targets for ... |
13952-14495 |
3.43e-08 |
|
Hedgehog signalling target; TALPID3 is a family of eukaryotic proteins that are targets for Hedgehog signalling. Mutations in this gene, first noticed in chickens, lead to multiple developmental abnormalities.
Pssm-ID: 434634 [Multi-domain] Cd Length: 1288 Bit Score: 62.21 E-value: 3.43e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13952 QSPQYNVNYPSPQpANPQKPGVvnIPSVPQPVYPSPQppvydvnyptTPVSQHPG--VVNIPSAPRLVPPTSQRPVFITS 14029
Cdd:pfam15324 596 KGPYLRFNSPSPK-SKPQRPKV--IESVKGTKVKSAR----------TQTDLHATkpVKTDSKMQHSVTAPHQEQQYLFS 662
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14030 PGNLSPT---PQPGVInIPSVSQPGYPTPQSpiyDANYPTTQSPIPQQPGVVnIPSVPsPSYPAPNPPVNYPT------- 14099
Cdd:pfam15324 663 PSREMPSqsgTLEGHL-IPMAIPLGQTQSDS---DSPPPAGVIVSKPHPVTV-TTSIP-PSSRKPEPGVKKPNiallemk 736
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14100 -----QPSPQIPVQPGViNIPS--------APLPTTPP----QHPPVFIPspespspapkpgvINIPSVTHP-----EYP 14157
Cdd:pfam15324 737 sekkdPPQLTVQVLPSV-DIDSvscssrdsSPSPVLPSpseaSPPLIQTW-------------IQTPELMKEdeeevKFP 802
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14158 -TSQVPVYDVNysttpspipQKPGVVN-IPSAPQPVHpapnppvhEFN-YPTPPAVPqqpgvLNIPSYPtPVAPTPQspi 14234
Cdd:pfam15324 803 gTNFDEVIDVI---------QDEEKEDeIPEFSEPPL--------EFNrSVKPPSTK-----YNGPPFP-PVVSQPQ--- 856
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14235 yiPSQE------QPKPTTRPSVIN------VPSVPQPAYPTPQAPVYDVNYPTS-------------------------- 14276
Cdd:pfam15324 857 --PTTDildkviEQRETLENRLVDwveqeiMARIISGMFPQQAQADPDASVSESepsepstsdiveaagggglqlfvdag 934
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14277 --------------------------------PSVIPHQPGVVNIPSVPLPAPPVKQRPVFVPSPvhPTPAPQPGVVNIP 14324
Cdd:pfam15324 935 vpvdsemirhfvnealaetiaimlgdreaqrePPVAASVPGDLPTKETLLPTPVPTPQPTPPCSP--PSPLKEPSPVKTP 1012
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14325 SVAQPVHPTYQPPVVERPAIYDVYYPP---PPSRPGVINIPSPPRPVYPVPqqpiyvpapvlhiPAPRPVIHNIPSvPQP 14401
Cdd:pfam15324 1013 DSSPCVSEHDFFPVKEIPPEKGADTGPavsLVITPTVTPIATPPPAATPTP-------------PLSENSIDKLKS-PSP 1078
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14402 TYPH----RNPPIQDVTypapqpsppvpgivniPSLPQPVSTPTSGVINIPsQASPPISVPTPGivnipSIPQPTPQRPS 14477
Cdd:pfam15324 1079 ELPKpwedSDLPLEEEN----------------PNSEQEELHPRAVVMSVA-RDEEPESVVLPA-----SPPEPKPLAPP 1136
|
650
....*....|....*...
gi 442625924 14478 PGIINVPSVPQPIPTAPS 14495
Cdd:pfam15324 1137 PLGAAPPSPPQSPSSSSS 1154
|
|
| PRK12323 |
PRK12323 |
DNA polymerase III subunit gamma/tau; |
14178-14380 |
3.85e-08 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237057 [Multi-domain] Cd Length: 700 Bit Score: 61.82 E-value: 3.85e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14178 KPGVVNIPSAPQPVHPAPNPPVHEFNYPTPPAVPQQPGVLNIPSYPTPVAPTPQSPIYIPSQEQPKPTTRPSVINVPSVP 14257
Cdd:PRK12323 364 RPGQSGGGAGPATAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARG 443
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14258 QPAYPTPQAPVYDVNYPTSPSVIPHQPGvvniPSVPLPAPPVKQRPVFVPSPVHPTPAP---QPGVVNIPSVAQPvHPTY 14334
Cdd:PRK12323 444 PGGAPAPAPAPAAAPAAAARPAAAGPRP----VAAAAAAAPARAAPAAAPAPADDDPPPweeLPPEFASPAPAQP-DAAP 518
|
170 180 190 200
....*....|....*....|....*....|....*....|....*....
gi 442625924 14335 QPPVVE---RPAIYDVYYPPPPSRPGVINIPSPPRPVYPVPQQPIYVPA 14380
Cdd:PRK12323 519 AGWVAEsipDPATADPDDAFETLAPAPAAAPAPRAAAATEPVVAPRPPR 567
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
14235-14566 |
4.20e-08 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 62.26 E-value: 4.20e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14235 YIPSQEQPKPTTR---PSVINVPSVPQPAYPTPQAPVYDVNYPTSPSVIPHQP-----GVVNIP-----SVPLPAPPVKQ 14301
Cdd:PHA03247 184 YLTYYTQDHPEARwagAMVFFVPSGPGPAAPADLTAAALHLYGASETYLQDEPfverrVVISHPlrgdiAAPAPPPVVGE 263
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14302 RPVFVPSPVHPTPAPQPGvvniPSVAQPVHPTYQPPVVERPAIYDV--YYPPPPSRPgvinipsPPRPVYPVPQQPIYVP 14379
Cdd:PHA03247 264 GADRAPETARGATGPPPP----PEAAAPNGAAAPPDGVWGAALAGAplALPAPPDPP-------PPAPAGDAEEEDDEDG 332
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14380 APVLHIPAPRPVIHnipsvpqptYPhrnppiqdvtypapqpsppvpgiVNIPSLPQPVSTPTSGVINIPSQASPPISVPT 14459
Cdd:PHA03247 333 AMEVVSPLPRPRQH---------YP-----------------------LGFPKRRRPTWTPPSSLEDLSAGRHHPKRASL 380
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14460 PGIVNIPSIPQPTPQRPSPGIINVPSVPQPIP-TAPSPGIINIP-SVP----QPLPSPTPGViniPQQPTPPPLVQQPGI 14533
Cdd:PHA03247 381 PTRKRRSARHAATPFARGPGGDDQTRPAAPVPaSVPTPAPTPVPaSAPpppaTPLPSAEPGS---DDGPAPPPERQPPAP 457
|
330 340 350
....*....|....*....|....*....|...
gi 442625924 14534 INIPSVQQPSTPTTQhpIQDVQYETQRPQPtPG 14566
Cdd:PHA03247 458 ATEPAPDDPDDATRK--ALDALRERRPPEP-PG 487
|
|
| DUF5585 |
pfam17823 |
Family of unknown function (DUF5585); This is a family of unknown function found in Chordata. |
14313-14707 |
5.31e-08 |
|
Family of unknown function (DUF5585); This is a family of unknown function found in Chordata.
Pssm-ID: 465521 [Multi-domain] Cd Length: 506 Bit Score: 61.13 E-value: 5.31e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14313 TPAPQPGVVNIPSVAQPVHPTYQPPVVERPAIYDVYYPPPPSRPGVINIPSPPRPVYPVPQQPIYVP-APVLHIPAPRPV 14391
Cdd:pfam17823 99 EPATREGAADGAASRALAAAASSSPSSAAQSLPAAIAALPSEAFSAPRAAACRANASAAPRAAIAAAsAPHAASPAPRTA 178
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14392 IHNIPSVPQPTYPHRNPpiqdvtypapqpsppVPGIVNIPSLPQPVS-TPTSGVINI-PS----QASPPISVPTPGIVNi 14465
Cdd:pfam17823 179 ASSTTAASSTTAASSAP---------------TTAASSAPATLTPARgISTAATATGhPAagtaLAAVGNSSPAAGTVT- 242
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14466 PSIPQPTPQrpspGIINVPSVPQPIPTAPSPGIINIPSVPQPLPSPTPGVINIPQQPTPPPLVQQPGIINIPSVQQPSTP 14545
Cdd:pfam17823 243 AAVGTVTPA----ALATLAAAAGTVASAAGTINMGDPHARRLSPAKHMPSDTMARNPAAPMGAQAQGPIIQVSTDQPVHN 318
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14546 TTqhpiqdvqyetqrPQPTPGVINipSVSQPTYPTQKPSYQDTSYPT--VQPKPPVSGiinipSVPQPVPSLTPGVinLP 14623
Cdd:pfam17823 319 TA-------------GEPTPSPSN--TTLEPNTPKSVASTNLAVVTTtkAQAKEPSAS-----PVPVLHTSMIPEV--EA 376
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14624 SEPSySAPIPKPgiinvPSIPEPIPSIPQNPVQevyhdtQKPQAIPGVvnvpSAPQPTPgRPYYDVAKPdfEFNPCYPSP 14703
Cdd:pfam17823 377 TSPT-TQPSPLL-----PTQGAAGPGILLAPEQ------VATEATAGT----ASAGPTP-RSSGDPKTL--AMASCQLST 437
|
....
gi 442625924 14704 CGPY 14707
Cdd:pfam17823 438 QGQY 441
|
|
| PHA03378 |
PHA03378 |
EBNA-3B; Provisional |
13904-14333 |
6.77e-08 |
|
EBNA-3B; Provisional
Pssm-ID: 223065 [Multi-domain] Cd Length: 991 Bit Score: 61.24 E-value: 6.77e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13904 TPKPVRPQIYDTPSPPyPVAIPDLVYVQQQQ-----PGIVNIPS-APQP-IYPTPQSPQYNV-------NYPSPQPANPQ 13969
Cdd:PHA03378 555 STEPVHDQLLPAPGLG-PLQIQPLTSPTTSQlassaPSYAQTPWpVPHPsQTPEPPTTQSHIpetsaprQWPMPLRPIPM 633
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13970 KPGVVNIPSVPQPVYPSP-QPPVYDVNYPTTPVSQHPgvvNIPSAPRLVPPTSQRPVfITSPGNLSPTPQ-PGVINIPSV 14047
Cdd:PHA03378 634 RPLRMQPITFNVLVFPTPhQPPQVEITPYKPTWTQIG---HIPYQPSPTGANTMLPI-QWAPGTMQPPPRaPTPMRPPAA 709
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14048 SqpgyPTPQSPiyDANYPTTQSPIPQQPGVVNIPSVPSPSYPAPNPPVNYPTQPSPQIPVQPGVINIPSAPLPTTPPQHP 14127
Cdd:PHA03378 710 P----PGRAQR--PAAATGRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPPAAAPGAPTPQPPPQAP 783
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14128 PvfipspespspapkpgvinipsvthpeyptsqVPVYDVNYSTTPSPIPQKPGVVNIPSAPQPVhPAPNPPVHEFNYPTP 14207
Cdd:PHA03378 784 P--------------------------------APQQRPRGAPTPQPPPQAGPTSMQLMPRAAP-GQQGPTKQILRQLLT 830
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14208 PAVPQQPGVLNIPS---YPTPVAPTP-----------QSPIYIPSQEQPKPTTR----PSVINVPSVPQPayPTPQAPVY 14269
Cdd:PHA03378 831 GGVKRGRPSLKKPAaleRQAAAGPTPspgsgtsdkivQAPVFYPPVLQPIQVMRqlgsVRAAAASTVTQA--PTEYTGER 908
|
410 420 430 440 450 460
....*....|....*....|....*....|....*....|....*....|....*....|....
gi 442625924 14270 DVNYPTSPSVIPhqpgvvnipsvplPAPPVKQRPVFVPSPVHPTPAPQPGVVnIPSVAQPVHPT 14333
Cdd:PHA03378 909 RGVGPMHPTDIP-------------PSKRAKTDAYVESQPPHGGQSHSFSVI-WENVSQGQQQT 958
|
|
| PAT1 |
pfam09770 |
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... |
14328-14625 |
7.94e-08 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 60.82 E-value: 7.94e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14328 QPVhPTYQPPVVERPAIYDVYYPPPPSRPgviniPSPPRPV---YPVPQQPIYVP-----APVLHIPAPRPVIHniPSVP 14399
Cdd:pfam09770 106 QPA-ARAAQSSAQPPASSLPQYQYASQQS-----QQPSKPVrtgYEKYKEPEPIPdlqvdASLWGVAPKKAAAP--APAP 177
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14400 QPtyphrnppiqdvtypapqpsppvpgivniPSLPQPVSTPTSGVINIP-------SQASPPISVPTPGIVNIPSIPQPT 14472
Cdd:pfam09770 178 QP-----------------------------AAQPASLPAPSRKMMSLEeveaamrAQAKKPAQQPAPAPAQPPAAPPAQ 228
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14473 PQRPspgiinVPSVPQPIPTAPSPGIINIPSVPQPLPSPTPGVINIPQQPTPPPLVQQPGIINIPSVQQPStPTTQHPIQ 14552
Cdd:pfam09770 229 QAQQ------QQQFPPQIQQQQQPQQQPQQPQQHPGQGHPVTILQRPQSPQPDPAQPSIQPQAQQFHQQPP-PVPVQPTQ 301
|
250 260 270 280 290 300 310
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 442625924 14553 DVQyetqrpQPtpgviNIPSVSQPTYPTQKPsyqdtsyPTVQPKPPVSGIINIPSVPQPVPSLT--PGVINLPSE 14625
Cdd:pfam09770 302 ILQ------NP-----NRLSAARVGYPQNPQ-------PGVQPAPAHQAHRQQGSFGRQAPIIThpQQLAQLSEE 358
|
|
| PRK10263 |
PRK10263 |
DNA translocase FtsK; Provisional |
13907-14025 |
8.08e-08 |
|
DNA translocase FtsK; Provisional
Pssm-ID: 236669 [Multi-domain] Cd Length: 1355 Bit Score: 61.25 E-value: 8.08e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13907 PVRPQIYDTPSPPY------PVAIPDLVYVQQQQPGIVNIPSAP-----QPIYPTPQSPQYNVNY----PSPQPANPQKP 13971
Cdd:PRK10263 731 PMKALLDDGPHEPLftpivePVQQPQQPVAPQQQYQQPQQPVAPqpqyqQPQQPVAPQPQYQQPQqpvaPQPQYQQPQQP 810
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13972 -------GVVNIPSVPQPVYPSPQPPVYD-------------------VNYPTTPvsqhpgvvnIPSAPRLVPPTSQ-RP 14024
Cdd:PRK10263 811 vapqpqyQQPQQPVAPQPQYQQPQQPVAPqpqdtllhpllmrngdsrpLHKPTTP---------LPSLDLLTPPPSEvEP 881
|
.
gi 442625924 14025 V 14025
Cdd:PRK10263 882 V 882
|
|
| PTZ00449 |
PTZ00449 |
104 kDa microneme/rhoptry antigen; Provisional |
13891-14119 |
1.15e-07 |
|
104 kDa microneme/rhoptry antigen; Provisional
Pssm-ID: 185628 [Multi-domain] Cd Length: 943 Bit Score: 60.47 E-value: 1.15e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13891 SSHTGDPftrcyETPK-PVRPQIYDTP-SPPYPvaipdlvyvqqQQPGIVNIPSAPQ-PIYPT-PQSPqynvnyPSPQ-P 13965
Cdd:PTZ00449 585 PKHPKDP-----EEPKkPKRPRSAQRPtRPKSP-----------KLPELLDIPKSPKrPESPKsPKRP------PPPQrP 642
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13966 ANPQKPGVVNIPSVPQPVyPSPQPP---------------------------VYDVNYPTTPVSQHPGVVNIP-SAPRLV 14017
Cdd:PTZ00449 643 SSPERPEGPKIIKSPKPP-KSPKPPfdpkfkekfyddyldaaaksketkttvVLDESFESILKETLPETPGTPfTTPRPL 721
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14018 PPtsQRPvfiTSPGnlSPTPQPGVINIPSVSQPGYPTPqsPIYDANYPTTQSPIPQQPGV----VNIPSVPSPSyPAPNP 14093
Cdd:PTZ00449 722 PP--KLP---RDEE--FPFEPIGDPDAEQPDDIEFFTP--PEEERTFFHETPADTPLPDIlaeeFKEEDIHAET-GEPDE 791
|
250 260
....*....|....*....|....*.
gi 442625924 14094 PVNYPTQPSPQIPVQPGviNIPSAPL 14119
Cdd:PTZ00449 792 AMKRPDSPSEHEDKPPG--DHPSLPK 815
|
|
| PLN03209 |
PLN03209 |
translocon at the inner envelope of chloroplast subunit 62; Provisional |
14446-14706 |
1.42e-07 |
|
translocon at the inner envelope of chloroplast subunit 62; Provisional
Pssm-ID: 178748 [Multi-domain] Cd Length: 576 Bit Score: 59.94 E-value: 1.42e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14446 NIPSQ-ASPPISVPTPGIVNIPSIPQPtPQRPSPGIINVPsvPQPIPTAPSP-------GIINIPSVPQPLPsPTPGvin 14517
Cdd:PLN03209 322 KIPSQrVPPKESDAADGPKPVPTKPVT-PEAPSPPIEEEP--PQPKAVVPRPlspytayEDLKPPTSPIPTP-PSSS--- 394
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14518 iPQQPTPPPLVQQPGIINIPSVqqPSTPTTQHPIQDVQYETQRPQPTPGVINIPSVSQPTYPTqkPSYQDTSYPTVQPKP 14597
Cdd:PLN03209 395 -PASSKSVDAVAKPAEPDVVPS--PGSASNVPEVEPAQVEAKKTRPLSPYARYEDLKPPTSPS--PTAPTGVSPSVSSTS 469
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14598 PVSGIINIPsvpqPVPSLTPGVINLPSEPSYSAPIPKPGIINVPSIPEPIPSIPQNPvqevyhdtqkPQAIPGVVNVPSA 14677
Cdd:PLN03209 470 SVPAVPDTA----PATAATDAAAPPPANMRPLSPYAVYDDLKPPTSPSPAAPVGKVA----------PSSTNEVVKVGNS 535
|
250 260 270 280
....*....|....*....|....*....|....*....|....*
gi 442625924 14678 --------------PQPTPGRPY--YDVAKPdfefnPCYPSPCGP 14706
Cdd:PLN03209 536 apptaladeqhhaqPKPRPLSPYtmYEDLKP-----PTSPTPSPV 575
|
|
| PBP1 | COG5180 | PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ... | 14326-14687 | 1.89e-07 |
|
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];
Pssm-ID: 444064 [Multi-domain] Cd Length: 548 Bit Score: 59.31 E-value: 1.89e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14326 VAQPVHPTYQPPVVERPAIYDVYYPPPPSRPGVINIPSPPRPVYPVPQQPIYVPAPV---LHIPAPRPVIHNIP---SVP 14399
Cdd:COG5180 15 VPIPPNAARPVLSPELWAAANNDAVSQGDRSALASSPTRPYARKIFEPLDIKLALGKpqlPSVAEPEAYLDPAPpksSPD 94
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14400 QPTYPHRNPPiqDVTYPAPQPSPPVPGIVNIPSLPQPVSTPTSGVINipSQASPPISVPTPGIVNIPSIP---------- 14469
Cdd:COG5180 95 TPEEQLGAPA--GDLLVLPAAKTPELAAGALPAPAAAAALPKAKVTR--EATSASAGVALAAALLQRSDPilakdpdgds 170
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14470 QPTPQRPSPGIINVPSVPQPIpTAPSPGIINIPSVPQPLPSPTPgviniPQQPTPPPLVQQPGIINIPSVQQPSTPTTQ- 14548
Cdd:COG5180 171 ASTLPPPAEKLDKVLTEPRDA-LKDSPEKLDRPKVEVKDEAQEE-----PPDLTGGADHPRPEAASSPKVDPPSTSEARs 244
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14549 HPIQ-DVQYETQ------RPQPTPGViNIPSVSQPTYPT----QKPSYQDTSYPTVQPKpPVSGIINIPSVPQPVpSLTP 14617
Cdd:COG5180 245 RPATvDAQPEMRppadakERRRAAIG-DTPAAEPPGLPVleagSEPQSDAPEAETARPI-DVKGVASAPPATRPV-RPPG 321
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14618 GVINL----PSEPSYSA---PIPKPGIINVPSiPEPIPSiPQNPVQEVYHDTQKPQAiPGVVNVPSAPQ---PTPGRPYY 14687
Cdd:COG5180 322 GARDPgtprPGQPTERPagvPEAASDAGQPPS-AYPPAE-EAVPGKPLEQGAPRPGS-SGGDGAPFQPPngaPQPGLGRR 398
|
|
| PRK10819 | PRK10819 | transport protein TonB; Provisional | 14444-14579 | 2.23e-07 |
|
transport protein TonB; Provisional
Pssm-ID: 236768 [Multi-domain] Cd Length: 246 Bit Score: 57.00 E-value: 2.23e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14444 VINIPsQASPPISVptpGIVNIPSIPQPTPQRPSPGIINVPSV-PQPIPTAPSPGIINIPS-------VPQPLPSPTPGV 14515
Cdd:PRK10819 38 VIELP-APAQPISV---TMVAPADLEPPQAVQPPPEPVVEPEPePEPIPEPPKEAPVVIPKpepkpkpKPKPKPKPVKKV 113
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 442625924 14516 INIPQQPTPPPLVQQPGIINIPSVQQPSTPTTqhpiqdvqyETQRPQPTPGVINIP---SVSQPTYP 14579
Cdd:PRK10819 114 EEQPKREVKPVEPRPASPFENTAPARPTSSTA---------TAAASKPVTSVSSGPralSRNQPQYP 171
|
|
| Herpes_BLLF1 | pfam05109 | Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ... | 13950-14267 | 2.72e-07 |
|
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.
Pssm-ID: 282904 [Multi-domain] Cd Length: 886 Bit Score: 59.16 E-value: 2.72e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13950 TPQSPQYNVNYPSPQPANPQKPGVVNIPS-VPQPVYPSPQPPVYDVNYPTtPVSQHPGVVNI-PS-APRLVPPTSQRPVf 14026
Cdd:pfam05109 428 TTTSPTLNTTGFAAPNTTTGLPSSTHVPTnLTAPASTGPTVSTADVTSPT-PAGTTSGASPVtPSpSPRDNGTESKAPD- 505
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14027 ITSPGNLSPTPQPGVIN-IPSVSQPGyPTPQSPIYDANYPTT--QSPIPQqpgvvniPSVPSPSYPAPNPPVNYPT--QP 14101
Cdd:pfam05109 506 MTSPTSAVTTPTPNATSpTPAVTTPT-PNATSPTLGKTSPTSavTTPTPN-------ATSPTPAVTTPTPNATIPTlgKT 577
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14102 SPQIPVQPGVINIPSAPLPTTPPQhppvfipspESPSPAPKPGVINIPSVTH-PEYPTSQVPV--YDVNYSTT------P 14172
Cdd:pfam05109 578 SPTSAVTTPTPNATSPTVGETSPQ---------ANTTNHTLGGTSSTPVVTSpPKNATSAVTTgqHNITSSSTssmslrP 648
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14173 SPIPQ--KPGVVNIPSAPQPVHPAPNPPVHEFNYPTPPAVPQQPGVlnipSYPTPvAPTPQSPIYIPSQEQPKPTTRPSV 14250
Cdd:pfam05109 649 SSISEtlSPSTSDNSTSHMPLLTSAHPTGGENITQVTPASTSTHHV----STSSP-APRPGTTSQASGPGNSSTSTKPGE 723
|
330
....*....|....*...
gi 442625924 14251 INVPSVPQPAYPT-PQAP 14267
Cdd:pfam05109 724 VNVTKGTPPKNATsPQAP 741
|
|
| Herpes_BLLF1 | pfam05109 | Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ... | 14284-14683 | 3.19e-07 |
|
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.
Pssm-ID: 282904 [Multi-domain] Cd Length: 886 Bit Score: 59.16 E-value: 3.19e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14284 PGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPAPqpgVVNIPSVAQPVHPTYQPPVVERPAIYDVyypPPPSRPGVinipS 14363
Cdd:pfam05109 400 PKTLIITRTATNATTTTHKVIFSKAPESTTTSP---TLNTTGFAAPNTTTGLPSSTHVPTNLTA---PASTGPTV----S 469
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14364 PPRPVYPVPQQPIYVPAPVLHIPAPRPvihnipSVPQPTYPHRNPPIQDVTYPAPqpsppvpgivNIPSLPQPVSTPTsg 14443
Cdd:pfam05109 470 TADVTSPTPAGTTSGASPVTPSPSPRD------NGTESKAPDMTSPTSAVTTPTP----------NATSPTPAVTTPT-- 531
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14444 viniPSQASPPISVPTPgivnIPSIPQPTPQRPSPgiinVPSVPQPIPTAPSPGIINIP---SVPQPLP---SPTPGVIN 14517
Cdd:pfam05109 532 ----PNATSPTLGKTSP----TSAVTTPTPNATSP----TPAVTTPTPNATIPTLGKTSptsAVTTPTPnatSPTVGETS 599
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14518 iPQQPTPPPLV----QQPGIINIPSVQQPSTPTTQHPIQDVQYETQRPQPTpgviNIPSVSQPTYPTQKPSYQ---DTSY 14590
Cdd:pfam05109 600 -PQANTTNHTLggtsSTPVVTSPPKNATSAVTTGQHNITSSSTSSMSLRPS----SISETLSPSTSDNSTSHMpllTSAH 674
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14591 PT----VQPKPPVSGIINIPSVPQPVPSltPGVINLPSEPSYSAPIPKPGIINV-PSIPEPIPSIPQNPvqevyhdTQKP 14665
Cdd:pfam05109 675 PTggenITQVTPASTSTHHVSTSSPAPR--PGTTSQASGPGNSSTSTKPGEVNVtKGTPPKNATSPQAP-------SGQK 745
|
410
....*....|....*...
gi 442625924 14666 QAIPGVVNVPSAPQPTPG 14683
Cdd:pfam05109 746 TAVPTVTSTGGKANSTTG 763
|
|
| PRK10263 | PRK10263 | DNA translocase FtsK; Provisional | 13794-14264 | 4.76e-07 |
|
DNA translocase FtsK; Provisional
Pssm-ID: 236669 [Multi-domain] Cd Length: 1355 Bit Score: 58.56 E-value: 4.76e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13794 AYCSPV-PIIQESPLTPCDPSPCGPNAQCHPS----LNEAVCSCLPEFYgtPPNcrpectlnSECAYDKACVHHKCVDPC 13868
Cdd:PRK10263 332 SWAAPVePVTQTPPVASVDVPPAQPTVAWQPVpgpqTGEPVIAPAPEGY--PQQ--------SQYAQPAVQYNEPLQQPV 401
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13869 PgicginadcrvhyhsPICYCISSHTGDPFTRCYETPKPVRPQIYDTPSPpypvaipdlvyvQQQQPGIVNIPSAPQPIY 13948
Cdd:PRK10263 402 Q---------------PQQPYYAPAAEQPAQQPYYAPAPEQPAQQPYYAP------------APEQPVAGNAWQAEEQQS 454
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13949 PTPQSPQYNVNYPSPQPAnPQKPGVVNIPSVPQPVYPSPQ----------PPVY-------------------------- 13992
Cdd:PRK10263 455 TFAPQSTYQTEQTYQQPA-AQEPLYQQPQPVEQQPVVEPEpvveetkparPPLYyfeeveekrarereqlaawyqpipep 533
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13993 ----DVNYPTTPVSQHPGVVNIPSAPRLVP---------------PTSQRPVFITSPGNlSPTPQ-----------PGVI 14042
Cdd:PRK10263 534 vkepEPIKSSLKAPSVAAVPPVEAAAAVSPlasgvkkatlatgaaATVAAPVFSLANSG-GPRPQvkegigpqlprPKRI 612
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14043 NIPS---VSQPGYPTPQSPI------------YDANYPTT----------------------------QSPIPQQPG--- 14076
Cdd:PRK10263 613 RVPTrreLASYGIKLPSQRAaeekareaqrnqYDSGDQYNddeidamqqdelarqfaqtqqqrygeqyQHDVPVNAEdad 692
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14077 -------VVNIPSVPSPSYPAPNPPVNYPTQPS--PQIPVQPGVINIPSAPL--PTTPPQHPPVfipspespspapkpgv 14145
Cdd:PRK10263 693 aaaeaelARQFAQTQQQRYSGEQPAGANPFSLDdfEFSPMKALLDDGPHEPLftPIVEPVQQPQ---------------- 756
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14146 inIPSVTHPEYPTSQVPVYDVNYSTTPS---PIPQKPGVVNIPSAPQPVHPAPNPPV---HEFNYPTPPAVPQQPgvlnI 14219
Cdd:PRK10263 757 --QPVAPQQQYQQPQQPVAPQPQYQQPQqpvAPQPQYQQPQQPVAPQPQYQQPQQPVapqPQYQQPQQPVAPQPQ----Y 830
|
570 580 590 600 610
....*....|....*....|....*....|....*....|....*....|..
gi 442625924 14220 PSYPTPVAPTPQSPIYIP---SQEQPKPTTRPSViNVPSV----PQPAYPTP 14264
Cdd:PRK10263 831 QQPQQPVAPQPQDTLLHPllmRNGDSRPLHKPTT-PLPSLdlltPPPSEVEP 881
|
|
| PAT1 | pfam09770 | Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... | 14068-14331 | 5.09e-07 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 58.12 E-value: 5.09e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14068 QSPIPQ-QPGVVNIPSVPSPSYPAPNPPVNYPTQPS-------------PQIPVQPGVINIPSAPlPTTPPQHPPVfips 14133
Cdd:pfam09770 105 QQPAARaAQSSAQPPASSLPQYQYASQQSQQPSKPVrtgyekykepepiPDLQVDASLWGVAPKK-AAAPAPAPQP---- 179
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14134 pespspapkpgvinipsvthpeyptsqvpvydvnySTTPSPIPQkpgvvniPS----------------APQPVHPAPNP 14197
Cdd:pfam09770 180 -----------------------------------AAQPASLPA-------PSrkmmsleeveaamraqAKKPAQQPAPA 217
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14198 PVHEFNYPTPPAVPQQPGVLNIPSYPTPVAPTPQSPIYIPSQEQP-----KPTTRPSVINVPSVPQPAYPTPQAPvydVN 14272
Cdd:pfam09770 218 PAQPPAAPPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPvtilqRPQSPQPDPAQPSIQPQAQQFHQQP---PP 294
|
250 260 270 280 290
....*....|....*....|....*....|....*....|....*....|....*....
gi 442625924 14273 YPTSPSVIPHQPGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVNIpsVAQPVH 14331
Cdd:pfam09770 295 VPVQPTQILQNPNRLSAARVGYPQNPQPGVQPAPAHQAHRQQGSFGRQAPI--ITHPQQ 351
|
|
| PTZ00449 | PTZ00449 | 104 kDa microneme/rhoptry antigen; Provisional | 14379-14682 | 9.26e-07 |
|
104 kDa microneme/rhoptry antigen; Provisional
Pssm-ID: 185628 [Multi-domain] Cd Length: 943 Bit Score: 57.39 E-value: 9.26e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14379 PAPVL-HIPAPRPVIHNIPSVPQ-PTYPHRnppiqdvtypapqpsppvpgivniPSLPQPVSTPTSgvinipsqASPPIS 14456
Cdd:PTZ00449 561 PGPAKeHKPSKIPTLSKKPEFPKdPKHPKD------------------------PEEPKKPKRPRS--------AQRPTR 608
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14457 VPTPGIVNIPSIPQpTPQRPSPGiiNVPSVPQPiPTAPS----PGIINIPSVPQPlpsptpgviniPQQPTPP--PLVQQ 14530
Cdd:PTZ00449 609 PKSPKLPELLDIPK-SPKRPESP--KSPKRPPP-PQRPSsperPEGPKIIKSPKP-----------PKSPKPPfdPKFKE 673
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14531 PGIINIPSVQQPSTPTTQHPIQDVQYETQRPQPTPGVINIPSVSQPTYPTQKPSyqDTSYPTVQPKPPVSgiinipsvPQ 14610
Cdd:PTZ00449 674 KFYDDYLDAAAKSKETKTTVVLDESFESILKETLPETPGTPFTTPRPLPPKLPR--DEEFPFEPIGDPDA--------EQ 743
|
250 260 270 280 290 300 310
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 442625924 14611 PVPSltpgvinlpsePSYSAPIPKPGIINVPSIPEPIPSIPQNPVQE--VYHDTQKPQAIPGVVNVPSAPQPTP 14682
Cdd:PTZ00449 744 PDDI-----------EFFTPPEEERTFFHETPADTPLPDILAEEFKEedIHAETGEPDEAMKRPDSPSEHEDKP 806
|
|
| PHA03307 | PHA03307 | transcriptional regulator ICP4; Provisional | 14029-14390 | 1.14e-06 |
|
transcriptional regulator ICP4; Provisional
Pssm-ID: 223039 [Multi-domain] Cd Length: 1352 Bit Score: 57.49 E-value: 1.14e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14029 SPGNLSPTPQPGVINIPSVSQPGYPTPQSPiydaNYPTTQSPIPQQPGVVNIPSVPSPSY--PAPNPPVNYPTQPSPQIP 14106
Cdd:PHA03307 39 SQGQLVSDSAELAAVTVVAGAAACDRFEPP----TGPPPGPGTEAPANESRSTPTWSLSTlaPASPAREGSPTPPGPSSP 114
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14107 VQPGVINIPSAPLPTTPPQHPPVFIPSpespspapkpgviniPSVTHPEyPTSQVPVYDVNYSTTPSPIPQKPGVVNIPS 14186
Cdd:PHA03307 115 DPPPPTPPPASPPPSPAPDLSEMLRPV---------------GSPGPPP-AASPPAAGASPAAVASDAASSRQAALPLSS 178
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14187 APQPVHPAPNPPVHEFNYPTPPAVPQQPGVLNIPSYPTPVAPTPQSPiyiPSQEQPKPTTRPSVINVPSV-----PQPAY 14261
Cdd:PHA03307 179 PEETARAPSSPPAEPPPSTPPAAASPRPPRRSSPISASASSPAPAPG---RSAADDAGASSSDSSSSESSgcgwgPENEC 255
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14262 PTPQAPVYDVNYPTSPSVIPHQPGVVNIPSVPlPAPPVKQRPVFVPS----PVHPTPAPQPGVVNIPSVAQPVHPTYQPP 14337
Cdd:PHA03307 256 PLPRPAPITLPTRIWEASGWNGPSSRPGPASS-SSSPRERSPSPSPSspgsGPAPSSPRASSSSSSSRESSSSSTSSSSE 334
|
330 340 350 360 370
....*....|....*....|....*....|....*....|....*....|....
gi 442625924 14338 VVERPAiydVYYPPPPSRPGVINIPSPPRPVYPVPQ-QPIYVPAPVLHIPAPRP 14390
Cdd:PHA03307 335 SSRGAA---VSPGPSPSRSPSPSRPPPPADPSSPRKrPRPSRAPSSPAASAGRP 385
|
|
| Not5 | COG5665 | CCR4-NOT transcriptional regulation complex, NOT5 subunit [Transcription]; | 14259-14683 | 1.19e-06 |
|
CCR4-NOT transcriptional regulation complex, NOT5 subunit [Transcription];
Pssm-ID: 444384 [Multi-domain] Cd Length: 874 Bit Score: 56.98 E-value: 1.19e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14259 PAYP-TPQAPVYDVNYPtspsviphqpgvvnIPSVPLPAPPVKQRPV---FVPSPVHPTPAPQpgvvnipsvAQPVHPTy 14334
Cdd:COG5665 177 IAVPsAPAAPPNAVDYS--------------VLVPIAAQDPAASVSTpqaFNASATSGRSQHI---------VQAAKRV- 232
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14335 qppVVERPAIYDVYyPPPPSRPGVINIPSPPRPVYPVPQQPIYVPapvlhiPAPRPVIHNIpsVPQPTYPHRNPPiqdVT 14414
Cdd:COG5665 233 ---GVEWWGDPSLL-ATPPATPATEEKSSQQPKSQPTSPSGGTTP------PSTNQLTTSN--TPTSTAKAQPQP---PT 297
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14415 YPAPQpsppvpgivnipslpqpVSTPTSGVINIPSQASPPISVPTPGivnipSIPQPTPQRPSPGIINVPSVPQPIPtap 14494
Cdd:COG5665 298 KKQPA-----------------KEPPSDTASGNPSAPSVLINSDSPT-----SEDPATASVPTTEETTAFTTPSSVP--- 352
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14495 spgiinipsVPQPLPSPTPGVINIPQQPTPPPLvqqpgiinipSVQQPSTPTTQHPIQDVQYETQRPQ-PTPGVINIPSV 14573
Cdd:COG5665 353 ---------STPAEKDTPATDLATPVSPTPPET----------SVDKKVSPDSATSSTKSEKEGGTASsPMPPNIAIGAK 413
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14574 SQPTyPTqKPSYQDTSYptvQPKPPVSGiiniPSVPQPVPSLTpgvinlpSEPSYSAPIPKPGIINVPSIPEPIPSIPQN 14653
Cdd:COG5665 414 DDVD-AT-DPSQEAKEY---TKNAPMTP----EADSAPESSVR-------TEASPSAGSDLEPENTTLRDPAPNAIPPPE 477
|
410 420 430
....*....|....*....|....*....|
gi 442625924 14654 PVQEVYHDTQKPQAipgvVNVPSAPQPTPG 14683
Cdd:COG5665 478 DPSTIGRLSSGDKL----ANETGPPVIRRD 503
|
|
| PHA03247 | PHA03247 | large tegument protein UL36; Provisional | 13993-14246 | 1.26e-06 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 57.26 E-value: 1.26e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13993 DVNYPTTPVSQHPGVVNIPSAPRLVPPTSQRPVfiTSPGNLSPTPQPGVINI------PSVSQPGYPTPQSPIYDANYPT 14066
Cdd:PHA03247 251 DIAAPAPPPVVGEGADRAPETARGATGPPPPPE--AAAPNGAAAPPDGVWGAalagapLALPAPPDPPPPAPAGDAEEED 328
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14067 TQ-------SPIPQqpgvvnipsvPSPSYPAPNPPVNYPT--QPSPQIPVQPGVINIPSAPLPTTPPQHPPvfipspesp 14137
Cdd:PHA03247 329 DEdgamevvSPLPR----------PRQHYPLGFPKRRRPTwtPPSSLEDLSAGRHHPKRASLPTRKRRSAR--------- 389
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14138 spAPKPGVINIPSVTHPEYPTSQVPvydvnySTTPSPIPqKPGVVNIPSAPQPVHPAPNPPVHEfnYPTPPAVPQQPGVL 14217
Cdd:PHA03247 390 --HAATPFARGPGGDDQTRPAAPVP------ASVPTPAP-TPVPASAPPPPATPLPSAEPGSDD--GPAPPPERQPPAPA 458
|
250 260
....*....|....*....|....*....
gi 442625924 14218 NIPSYPTPVAPTPQSPIYIPSQEQPKPTT 14246
Cdd:PHA03247 459 TEPAPDDPDDATRKALDALRERRPPEPPG 487
|
|
| PRK07764 | PRK07764 | DNA polymerase III subunits gamma and tau; Validated | 14171-14375 | 1.60e-06 |
|
DNA polymerase III subunits gamma and tau; Validated
Pssm-ID: 236090 [Multi-domain] Cd Length: 824 Bit Score: 56.53 E-value: 1.60e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14171 TPSPIPQKPGVVNIPSAPQPVhPAPNPPVHefnyPTPPAVPQQPGvlnipsyPTPVAPTPQSPiyiPSQEQPKPTTRPSV 14250
Cdd:PRK07764 598 EGPPAPASSGPPEEAARPAAP-AAPAAPAA----PAPAGAAAAPA-------EASAAPAPGVA---APEHHPKHVAVPDA 662
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14251 INVPSvPQPAYPTPQAPVYDVnyPTSPSVIPHQPGVVNiPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVNIPSVAQPV 14330
Cdd:PRK07764 663 SDGGD-GWPAKAGGAAPAAPP--PAPAPAAPAAPAGAA-PAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAADDP 738
|
170 180 190 200
....*....|....*....|....*....|....*....|....*
gi 442625924 14331 HPTyqPPVVERPAIYDVYYPPPPSRPGVINIPSPPRPVYPVPQQP 14375
Cdd:PRK07764 739 VPL--PPEPDDPPDPAGAPAQPPPPPAPAPAAAPAAAPPPSPPSE 781
|
|
| SP2_N | cd22540 | N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins ... | 14336-14656 | 1.66e-06 |
|
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate, or in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.
Pssm-ID: 411776 [Multi-domain] Cd Length: 511 Bit Score: 56.09 E-value: 1.66e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14336 PPVVERPAIydvyyPPPPSRPgVIN--IPSPPRPVYPVPQQPIYVPAP----VLHIPAPRpVIHNIPSVPQPtYPHRNPP 14409
Cdd:cd22540 39 PPAVEAAVT-----PPAPPQP-TPRklVPIKPAPLPLGPGKNSIGFLSakgnIIQLQGSQ-LSSSAPGGQQV-FAIQNPT 110
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14410 IQDVTYPAPQPSPPvpGIVNIPSLPQPVSTPTSGVINI-----PSQASPPISVPTPGIVNIPSIPQPTPQRPSPGIINVP 14484
Cdd:cd22540 111 MIIKGSQTRSSTNQ--QYQISPQIQAAGQINNSGQIQIipgtnQAIITPVQVLQQPQQAHKPVPIKPAPLQTSNTNSASL 188
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14485 SVPQPIPTAPSPGII--NIPS------------VPQPLPSPTPGVI---NIPQQPTPPPLVQQ-----------PGII-- 14534
Cdd:cd22540 189 QVPGNVIKLQSGGNValTLPVnnlvgtqdgatqLQLAAAPSKPSKKirkKSAQAAQPAVTVAEqvetvliettaDNIIqa 268
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14535 --NIPSVQQPST--PTTQHPIQDVQYETQR------PQPTPGV-------INIPSVS------QPTYPTQKPSYQDTSYP 14591
Cdd:cd22540 269 gnNLLIVQSPGTgqPAVLQQVQVLQPKQEQqvvqipQQALRVVqaasatlPTVPQKPlqniqiQNSEPTPTQVYIKTPSG 348
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14592 TVQ-------PKPPVSGIINIPSVPQPVPSLTPGVINLPS-----EPSYSAPIPKPGIINV-----PSIPEPIPSIPQNP 14654
Cdd:cd22540 349 EVQtvllqeaPAATATPSSSTSTVQQQVTANNGTGTSKPNynvrkERTLPKIAPAGGIISLnaaqlAAAAQAIQTINING 428
|
..
gi 442625924 14655 VQ 14656
Cdd:cd22540 429 VQ 430
|
|
| Herpes_BLLF1 | pfam05109 | Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ... | 14242-14550 | 1.70e-06 |
|
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.
Pssm-ID: 282904 [Multi-domain] Cd Length: 886 Bit Score: 56.46 E-value: 1.70e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14242 PKPTT-RPSVINVPS-VPQPAYPTPQAPVYDVNYPTSPSVIPHQPgvvniPSVPLPAPPVKQRPVFVPSPVHPTPA---P 14316
Cdd:pfam05109 442 PNTTTgLPSSTHVPTnLTAPASTGPTVSTADVTSPTPAGTTSGAS-----PVTPSPSPRDNGTESKAPDMTSPTSAvttP 516
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14317 QPGVVN-IPSVAQPVhPTYQPPVVERPAIYDVYYPPPPsrpgviNIPSP-PRPVYPVPQQPIyvpaPVLHIPAPrpvihn 14394
Cdd:pfam05109 517 TPNATSpTPAVTTPT-PNATSPTLGKTSPTSAVTTPTP------NATSPtPAVTTPTPNATI----PTLGKTSP------ 579
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14395 IPSVPQPTYPHRNPPIQDVTYPAPQPSPPVPGIVNIPSLPQPVSTPTSGVI----NIPSQASPPISV----------PTP 14460
Cdd:pfam05109 580 TSAVTTPTPNATSPTVGETSPQANTTNHTLGGTSSTPVVTSPPKNATSAVTtgqhNITSSSTSSMSLrpssisetlsPST 659
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14461 GIVNIPSIPQPTPQRPSPGiinvPSVPQPIPTAPSPGIINIPSvpqplPSPTPGVINIPQQPTPPPLVQQPGIINIPSVQ 14540
Cdd:pfam05109 660 SDNSTSHMPLLTSAHPTGG----ENITQVTPASTSTHHVSTSS-----PAPRPGTTSQASGPGNSSTSTKPGEVNVTKGT 730
|
330
....*....|.
gi 442625924 14541 QPSTPTT-QHP 14550
Cdd:pfam05109 731 PPKNATSpQAP 741
|
|
| PHA03307 | PHA03307 | transcriptional regulator ICP4; Provisional | 14144-14598 | 1.73e-06 |
|
transcriptional regulator ICP4; Provisional
Pssm-ID: 223039 [Multi-domain] Cd Length: 1352 Bit Score: 56.72 E-value: 1.73e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14144 GVINIPSVTHPEYPTSQVPVYDVNYSTTPSPIPQKPGVVNIPSAPQPVHPAPNPPvhefnyPTPPAVPQQPGVLNIPSYP 14223
Cdd:PHA03307 17 GGEFFPRPPATPGDAADDLLSGSQGQLVSDSAELAAVTVVAGAAACDRFEPPTGP------PPGPGTEAPANESRSTPTW 90
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14224 TPVAPTPQSPIYIPSQEQPKPTTRPSVinvPSVPQPAYPTPQAPvydvnyPTSPSVIPHQPGVVNIPSVPLPAPPVKQRP 14303
Cdd:PHA03307 91 SLSTLAPASPAREGSPTPPGPSSPDPP---PPTPPPASPPPSPA------PDLSEMLRPVGSPGPPPAASPPAAGASPAA 161
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14304 VfvpsPVHPTPAPQPGVVnIPSVAQPVHPTYQPPVVERPAIYDVYYPPPPSRPGVINIPSPPRPVyPVPQQPIYVPAPVL 14383
Cdd:PHA03307 162 V----ASDAASSRQAALP-LSSPEETARAPSSPPAEPPPSTPPAAASPRPPRRSSPISASASSPA-PAPGRSAADDAGAS 235
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14384 HIPAPRPVIHNIPSVPQPTYPHRNPPIQDVtypapqPSPPVPGIVNIPSLPQPVSTPTSGVINIPSQASPPISVPTPgiv 14463
Cdd:PHA03307 236 SSDSSSSESSGCGWGPENECPLPRPAPITL------PTRIWEASGWNGPSSRPGPASSSSSPRERSPSPSPSSPGSG--- 306
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14464 nipsiPQPTPQRPSPGIINVPSV--PQPIPTAPSPGIINIPSVPQPLPSPTPGVINIPQQPTPPPLVQQPGIINIPSVQQ 14541
Cdd:PHA03307 307 -----PAPSSPRASSSSSSSRESssSSTSSSSESSRGAAVSPGPSPSRSPSPSRPPPPADPSSPRKRPRPSRAPSSPAAS 381
|
410 420 430 440 450 460
....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 442625924 14542 PSTPTT---------QHPIQDVQYETQRPQPTPGVINIPSVSQPTYPTQKPSYQDTS-YPTVQPKPP 14598
Cdd:PHA03307 382 AGRPTRrraraavagRARRRDATGRFPAGRPRPSPLDAGAASGAFYARYPLLTPSGEpWPGSPPPPP 448
|
|
| PRK10819 | PRK10819 | transport protein TonB; Provisional | 14237-14406 | 1.74e-06 |
|
transport protein TonB; Provisional
Pssm-ID: 236768 [Multi-domain] Cd Length: 246 Bit Score: 54.30 E-value: 1.74e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14237 PSQEQPKPTTRPSVINVPSVPQPAYPTPQAPVYDVNYPTS-PSVIPhQPgvvnipsvPLPAPPVKQRPVFVPSPVhPTPA 14315
Cdd:PRK10819 37 QVIELPAPAQPISVTMVAPADLEPPQAVQPPPEPVVEPEPePEPIP-EP--------PKEAPVVIPKPEPKPKPK-PKPK 106
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14316 PQPGVVNIPSVAQPVhptyqPPVVERPAIYDVyyPPPPSRPgvinIPSPPRPVYPVPQQPiyVPApvlhipAPRPVihni 14395
Cdd:PRK10819 107 PKPVKKVEEQPKREV-----KPVEPRPASPFE--NTAPARP----TSSTATAAASKPVTS--VSS------GPRAL---- 163
|
170
....*....|.
gi 442625924 14396 pSVPQPTYPHR 14406
Cdd:PRK10819 164 -SRNQPQYPAR 173
|
|
| PAT1 | pfam09770 | Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... | 14253-14521 | 2.25e-06 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 56.20 E-value: 2.25e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14253 VPSVPQPAYPTPQAPVYDVNYPTSPSVIPHQP---GVvNIPSVPLPAP-----------PVKQRPvfvPSPVHPTPAPQP 14318
Cdd:pfam09770 108 AARAAQSSAQPPASSLPQYQYASQQSQQPSKPvrtGY-EKYKEPEPIPdlqvdaslwgvAPKKAA---APAPAPQPAAQP 183
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14319 GVVNIPS-------------VAQPVHPTYQPPVVerPAIYDVYYPPPPSRPGViNIPSPPRPVYPVPQQPIYVPAPVLHI 14385
Cdd:pfam09770 184 ASLPAPSrkmmsleeveaamRAQAKKPAQQPAPA--PAQPPAAPPAQQAQQQQ-QFPPQIQQQQQPQQQPQQPQQHPGQG 260
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14386 PAPRPVIHnipsvPQPtyphrnppiqdvtypapqpsppvpgivniPSLPQPVSTPTSGVINIPSQASPPISVPTPGIVNi 14465
Cdd:pfam09770 261 HPVTILQR-----PQS-----------------------------PQPDPAQPSIQPQAQQFHQQPPPVPVQPTQILQN- 305
|
250 260 270 280 290
....*....|....*....|....*....|....*....|....*....|....*.
gi 442625924 14466 psipqptPQRPSPGIINVPSVPQPiPTAPSPGIINIPSvpQPLPSPTPGVINIPQQ 14521
Cdd:pfam09770 306 -------PNRLSAARVGYPQNPQP-GVQPAPAHQAHRQ--QGSFGRQAPIITHPQQ 351
|
|
| KLF3_N | cd21577 | N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called ... | 14225-14409 | 2.35e-06 |
|
N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called Krueppel-like factor 3 and originally called Basic Kruppel-like Factor/BKLF) was the third member of the KLF family of zinc finger transcription factors to be discovered. KLF3 possesses a wide range of biological impacts on regulating apoptosis, differentiation, and proliferation in various tissues during the entire progression process. It has been proposed as a tumor suppressor in colorectal cancer. It appears to function predominantly as a repressor of transcription, turning genes off by recruiting the C-terminal Binding Protein co-repressors CtBP1 and CtBP2. CtBP docks onto a short motif (residues 61-65) in the N-terminus of KLF3, through the Proline-X-Aspartate-Leucine-Serine (PXDLS) motif. CtBP in turn recruits histone modifying enzymes to alter chromatin and repress gene expression. KLF3 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF3.
Pssm-ID: 410554 [Multi-domain] Cd Length: 214 Bit Score: 53.50 E-value: 2.35e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14225 PVAPTPQSPIYIPSQEQPKP-----TTRPSVINVPSVPQPAYPTPQAPVYdvnyPTSPSVIPHQPGVVNIPSVPLPAPPV 14299
Cdd:cd21577 2 PVKTDMETSFYSPSHSQLEPvdlslSKRSSPPSSSSSSSSSSSSSSSPSS----RASPPSPYSKSSPPSPPQQRPLSPPL 77
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14300 KQRPVFVPSPVHPTPAPQPGVVNIPSVAQPVHPTYQPPVVERPAIYDVyyPPPPSRPGVINIPSPP------------RP 14367
Cdd:cd21577 78 SLPPPVAPPPLSPGSVPGGLPVISPVMVQPVPVLYPPHLHQPIMVSSS--PPPDDDHHHHKASSMKpselggdnhelhKP 155
|
170 180 190 200
....*....|....*....|....*....|....*....|....*....
gi 442625924 14368 V----YPVPQQPIY---VPAPVlhIPAPRPVIHNIPSVPQPTYPHRNPP 14409
Cdd:cd21577 156 IktepRPEHAQDPYseeMSSSV--ISSPPEYESNTPSVIVHPGKRPLPV 202
|
|
| Amelogenin |
smart00818 |
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ... |
14483-14626 |
2.38e-06 |
|
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.
Pssm-ID: 197891 [Multi-domain] Cd Length: 165 Bit Score: 52.48 E-value: 2.38e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14483 VPSVPQPIPTAPSPGIINIPSVPQPLPSptpgvinIPQQPtpppLVQQPGiinipsvQQPSTPTTQHPIQDVQYETQRPQ 14562
Cdd:smart00818 40 IPVSQQHPPTHTLQPHHHIPVLPAQQPV-------VPQQP----LMPVPG-------QHSMTPTQHHQPNLPQPAQQPFQ 101
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 442625924 14563 PTPgviniPSVSQPTYPTQKPsyqdtsyPTVQPKPPVSGIINIPSVP--QPVPSLTPgviNLPSEP 14626
Cdd:smart00818 102 PQP-----LQPPQPQQPMQPQ-------PPVHPIPPLPPQPPLPPMFpmQPLPPLLP---DLPLEA 152
|
|
| SP2_N |
cd22540 |
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins ... |
13917-14323 |
2.55e-06 |
|
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate or, in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. 
The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.
Pssm-ID: 411776 [Multi-domain] Cd Length: 511 Bit Score: 55.70 E-value: 2.55e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13917 SPPYPVAIPDLVYVQQQQPGIVNIPSAPQPIYPTP---------------QSPQYNVNYP-SPQPANPQKPGVVNIPSVP 13980
Cdd:cd22540 39 PPAVEAAVTPPAPPQPTPRKLVPIKPAPLPLGPGKnsigflsakgniiqlQGSQLSSSAPgGQQVFAIQNPTMIIKGSQT 118
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13981 QpvypspqpPVYDVNYPTTPVSQHPGVVNIPSAPRLVPPTSQRpvfITSPGNLSPTPQPGvinipSVSQPGYPTPQSPIy 14060
Cdd:cd22540 119 R--------SSTNQQYQISPQIQAAGQINNSGQIQIIPGTNQA---IITPVQVLQQPQQA-----HKPVPIKPAPLQTS- 181
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14061 danypTTQSPIPQQPGvvNIPSVPSPSYPAPNPPVNYptqpspQIPVQPGVINIPSAPLPTTPPQhppvfipspespspa 14140
Cdd:cd22540 182 -----NTNSASLQVPG--NVIKLQSGGNVALTLPVNN------LVGTQDGATQLQLAAAPSKPSK--------------- 233
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14141 pkpGVINIPSVTHPEYPTSQVPVYDVNYSTTPSP---------IPQKPGvVNIPSAPQPVHPApnppvhefnyptPPAvp 14211
Cdd:cd22540 234 ---KIRKKSAQAAQPAVTVAEQVETVLIETTADNiiqagnnllIVQSPG-TGQPAVLQQVQVL------------QPK-- 295
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14212 QQPGVLNIPSYPTPV--------APTPQSPIYIPSQEQPKPTTRPSVINVPSVPQPAYPTPQAPVYDVNYPTSPSVIPHQ 14283
Cdd:cd22540 296 QEQQVVQIPQQALRVvqaasatlPTVPQKPLQNIQIQNSEPTPTQVYIKTPSGEVQTVLLQEAPAATATPSSSTSTVQQQ 375
|
410 420 430 440
....*....|....*....|....*....|....*....|
gi 442625924 14284 PGVVNIPSVPLPAPPVKQRPVFvpspvhPTPAPQPGVVNI 14323
Cdd:cd22540 376 VTANNGTGTSKPNYNVRKERTL------PKIAPAGGIISL 409
|
|
| PRK14948 |
PRK14948 |
DNA polymerase III subunit gamma/tau; |
14442-14653 |
3.90e-06 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237862 [Multi-domain] Cd Length: 620 Bit Score: 55.35 E-value: 3.90e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14442 SGVINIPSQASPPISVPTPGIVNIPSIPQPTPqrpSPGIINVPSVPQPIPTAPSPGIINIPSVPQPLPSPTPGVINIPQQ 14521
Cdd:PRK14948 362 SAFISEIANASAPANPTPAPNPSPPPAPIQPS---APKTKQAATTPSPPPAKASPPIPVPAEPTEPSPTPPANAANAPPS 438
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14522 PTPPPLVQQpgIINipSVQQPST------------------------------------------PTTQHPIQ---DVQY 14556
Cdd:PRK14948 439 LNLEELWQQ--ILA--KLELPSTrmllsqqaelvsldsnraviavspnwlgmvqsrkplleqafaKVLGRSIKlnlESQS 514
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14557 ETQRPQPTPGVINIPSVSQPTYPTQKPSYQDTSYPTVQPKPPVSGIINIPSVPQPVPSLTPGVINLPSEPSysaPIPKPg 14636
Cdd:PRK14948 515 GSASNTAKTPPPPQKSPPPPAPTPPLPQPTATAPPPTPPPPPPTATQASSNAPAQIPADSSPPPPIPEEPT---PSPTK- 590
|
250
....*....|....*..
gi 442625924 14637 iinvPSIPEPIPSIPQN 14653
Cdd:PRK14948 591 ----DSSPEEIDKAAKN 603
|
|
| SP2_N |
cd22540 |
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins ... |
14351-14676 |
4.30e-06 |
|
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate or, in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. 
The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.
Pssm-ID: 411776 [Multi-domain] Cd Length: 511 Bit Score: 54.93 E-value: 4.30e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14351 PPPSRPGVinipSPPRPVYPVPQ-QPIYVPAPvlhIPAPRPViHNIPSVPQPTYPHRNPPIQDVTypapqpsppvpgivN 14429
Cdd:cd22540 39 PPAVEAAV----TPPAPPQPTPRkLVPIKPAP---LPLGPGK-NSIGFLSAKGNIIQLQGSQLSS--------------S 96
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14430 IPSLPQPVSTPTSGVINIPSQASPPISVPTpgivnipsipQPTPQRPSPGIINVPSVPQPIPTAPSPGIINIPSVPQPLP 14509
Cdd:cd22540 97 APGGQQVFAIQNPTMIIKGSQTRSSTNQQY----------QISPQIQAAGQINNSGQIQIIPGTNQAIITPVQVLQQPQQ 166
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14510 SPTPgvinIPQQPTPpplvQQPGIINIPSVQQPSTPTTQH---------PIQ--DVQYETQRPQPTPGviniPSVSQPTY 14578
Cdd:cd22540 167 AHKP----VPIKPAP----LQTSNTNSASLQVPGNVIKLQsggnvaltlPVNnlVGTQDGATQLQLAA----APSKPSKK 234
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14579 PTQKPSYQDTSYPTVQPKPPV------SGII---------------NIPSVPQPVPSLTP----GVINLPSEPsysapip 14633
Cdd:cd22540 235 IRKKSAQAAQPAVTVAEQVETvliettADNIiqagnnllivqspgtGQPAVLQQVQVLQPkqeqQVVQIPQQA------- 307
|
330 340 350 360
....*....|....*....|....*....|....*....|...
gi 442625924 14634 kpgIINVPSIPEPIPSIPQNPVQEVYHDTQKPQAIPGVVNVPS 14676
Cdd:cd22540 308 ---LRVVQAASATLPTVPQKPLQNIQIQNSEPTPTQVYIKTPS 347
|
|
| Glutenin_hmw |
pfam03157 |
High molecular weight glutenin subunit; Members of this family include high molecular weight ... |
14030-14687 |
5.10e-06 |
|
High molecular weight glutenin subunit; Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.
Pssm-ID: 367362 [Multi-domain] Cd Length: 786 Bit Score: 54.95 E-value: 5.10e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14030 PGNLSPTPQ--PGVI-NIPSVSQPGYPTPQSPIYDANYPTTQSPipQQPGVVNIPSVPSPSYpapnppvnYPTqpSPQip 14106
Cdd:pfam03157 85 PGETTPPQQlqQGIFwGIPALLQRYYPGVTSPQQVSYYPGQASP--QRPGQGQQPGQGQQWY--------YPT--SPQ-- 150
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14107 vQPGVINIP----SAPLPTTPPQHPPVFIPSPESPSPAPKPGviNIPSVTHPEY-PTSQVPVYDVNYsTTPSPIPQKPGv 14181
Cdd:pfam03157 151 -QPGQWQQPgqgqQGYYPTSPQQSGQRQQPGQGQQLRQGQQG--QQSGQGQPGYyPTSSQQPGQLQQ-TGQGQQGQQPE- 225
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14182 vnipSAPQPVHPAPNPPVHEFNYPTPPAVPQQPGVLNIPSYPTpvapTPQSPIYIPSQEQPKPTTRPSVINVPSVPQPAY 14261
Cdd:pfam03157 226 ----RGQQGQQPGQGQQPGQGQQGQQPGQPQQLGQGQQGYYPI----SPQQPRQWQQSGQGQQGYYPTSLQQPGQGQSGY 297
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14262 ptpqapvydvnYPTSPsvipHQPGvvnipsvPLPAPPVKQRPVFVPSPVHPTPAPQPGvvnipSVAQPVHP-TYQPPVVE 14340
Cdd:pfam03157 298 -----------YPTSQ----QQAG-------QLQQEQQLGQEQQDQQPGQGRQGQQPG-----QGQQGQQPaQGQQPGQG 350
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14341 RPAiydvYYPPPPSRPGvinipspprpvypvPQQPIYVPApvlhipaprpvihnipSVPQPTYPHRNPPIQDVTYPAPQP 14420
Cdd:pfam03157 351 QPG----YYPTSPQQPG--------------QGQPGYYPT----------------SQQQPQQGQQPEQGQQGQQQGQGQ 396
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14421 SPPVPGIVNIPSLPQPVSTPTSgviniPSQasppisvptpgivnipsipqptPQRPSPGiiNVPSVPQPIPTAPSPGIIN 14500
Cdd:pfam03157 397 QGQQPGQGQQPGQGQPGYYPTS-----PQQ----------------------SGQGQPG--YYPTSPQQSGQGQQPGQGQ 447
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14501 IPSVPQPLPSPTPGVINIPQQPTPPPLVQQPGiinipSVQQPSTPTT-------QHPIQDVQYETQRPQPTPGVINIPSV 14573
Cdd:pfam03157 448 QPGQEQPGQGQQPGQGQQGQQPGQPEQGQQPG-----QGQPGYYPTSpqqsgqgQQLGQWQQQGQGQPGYYPTSPLQPGQ 522
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14574 SQPTYPTQKPSYQDTSYPTVQPKPPVSGIINIPSvPQPVPSLTPGVINLPSEPSYSAPIPKPGIINVPSIPEP--IPSIP 14651
Cdd:pfam03157 523 GQPGYYPTSPQQPGQGQQLGQLQQPTQGQQGQQS-GQGQQGQQPGQGQQGQQPGQGQQGQQPGQGQQPGQGQPgyYPTSP 601
|
650 660 670
....*....|....*....|....*....|....*...
gi 442625924 14652 QNPVQ--EVYHDTQKPQAIPGVVnVPSAPQPTPGRPYY 14687
Cdd:pfam03157 602 QQSGQgqQPGQWQQPGQGQPGYY-PTSSLQLGQGQQGY 638
|
|
| PRK14950 |
PRK14950 |
DNA polymerase III subunits gamma and tau; Provisional |
14249-14353 |
5.25e-06 |
|
DNA polymerase III subunits gamma and tau; Provisional
Pssm-ID: 237864 [Multi-domain] Cd Length: 585 Bit Score: 54.82 E-value: 5.25e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14249 SVINVPSVPQPAYPTPQAPVydvnyPTSPSVIPHQPGVVNIPSVPLPAPPVKQRPvfVPSPVHPTPAPQPgVVNIPSVAQ 14328
Cdd:PRK14950 358 ALLVPVPAPQPAKPTAAAPS-----PVRPTPAPSTRPKAAAAANIPPKEPVRETA--TPPPVPPRPVAPP-VPHTPESAP 429
|
90 100
....*....|....*....|....*
gi 442625924 14329 PVhPTYQPPVVERPaiydVYYPPPP 14353
Cdd:PRK14950 430 KL-TRAAIPVDEKP----KYTPPAP 449
|
|
| PLN03209 |
PLN03209 |
translocon at the inner envelope of chloroplast subunit 62; Provisional |
14194-14496 |
5.64e-06 |
|
translocon at the inner envelope of chloroplast subunit 62; Provisional
Pssm-ID: 178748 [Multi-domain] Cd Length: 576 Bit Score: 54.55 E-value: 5.64e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14194 APNPPVHEF--NYPTPPAVPQQPGVLNIPSyPTPVAPTPQSPIYIPSQEQPkpttrPSVINVpsVPQPAypTPQAPVYDV 14271
Cdd:PLN03209 311 APLTPMEELlaKIPSQRVPPKESDAADGPK-PVPTKPVTPEAPSPPIEEEP-----PQPKAV--VPRPL--SPYTAYEDL 380
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14272 NYPTSPsvIPHQPGvvnipSVPLPAPPVKQrpvfVPSPVHPTPAPQPGVVniPSVAQpVHPTYQPPVVERPAIYDVYYP- 14350
Cdd:PLN03209 381 KPPTSP--IPTPPS-----SSPASSKSVDA----VAKPAEPDVVPSPGSA--SNVPE-VEPAQVEAKKTRPLSPYARYEd 446
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14351 -PPPSRPGviniPSPPRPVYP-------VPQQPIYVPAPVLHIPAPRPVIHNIPSVPQPTYPHRNPPIQDVtypapqpsp 14422
Cdd:PLN03209 447 lKPPTSPS----PTAPTGVSPsvsstssVPAVPDTAPATAATDAAAPPPANMRPLSPYAVYDDLKPPTSPS--------- 513
|
250 260 270 280 290 300 310
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 442625924 14423 pvpgivniPSLPQPVSTPTSGVINIPSQASPPISVPTPGIVNIPsiPQPTPQRPSPGIINVpsvpQPiPTAPSP 14496
Cdd:PLN03209 514 --------PAAPVGKVAPSSTNEVVKVGNSAPPTALADEQHHAQ--PKPRPLSPYTMYEDL----KP-PTSPTP 572
|
|
| PRK12323 |
PRK12323 |
DNA polymerase III subunit gamma/tau; |
14350-14602 |
7.82e-06 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237057 [Multi-domain] Cd Length: 700 Bit Score: 54.11 E-value: 7.82e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14350 PPPPSRPGVINIP---SPPRPVYPVPQQPIYVPAPVLHIPAPRPVIHNIPSVPQPTYPHRNPPIQdvtypapqpsppvpg 14426
Cdd:PRK12323 374 PATAAAAPVAQPApaaAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQ--------------- 438
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14427 ivNIPSLPQPVSTPTSGVINIPSQASPPisvPTPGIVNIPSIPQPTPQRPSPgiinvPSVPQPIPTAPSPGiiniPSVPQ 14506
Cdd:PRK12323 439 --ASARGPGGAPAPAPAPAAAPAAAARP---AAAGPRPVAAAAAAAPARAAP-----AAAPAPADDDPPPW----EELPP 504
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14507 PLPSPTPgvinIPQQPTPPPLVQQPgiINIPSVQQPSTPttqhpiqdvqYETQRPQPTPGVINIPSVSQPTYPTQKPSYQ 14586
Cdd:PRK12323 505 EFASPAP----AQPDAAPAGWVAES--IPDPATADPDDA----------FETLAPAPAAAPAPRAAAATEPVVAPRPPRA 568
|
250 260
....*....|....*....|....*
gi 442625924 14587 ---------DTSYPTVQPKPPVSGI 14602
Cdd:PRK12323 569 sasglpdmfDGDWPALAARLPVRGL 593
|
|
| DUF4106 |
pfam13388 |
Protein of unknown function (DUF4106); This family of proteins are found in large numbers in ... |
14457-14566 |
8.17e-06 |
|
Protein of unknown function (DUF4106); This family of proteins are found in large numbers in the Trichomonas vaginalis proteome. The function of this protein is unknown.
Pssm-ID: 404296 Cd Length: 431 Bit Score: 53.75 E-value: 8.17e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14457 VPTPGIVnIPsiPQPTPQRPSPGIinvpsvPQPIPTAPSPGIINIPSVPQPLPSPTPGVINIPQQPTPPPLVQQPGIINI 14536
Cdd:pfam13388 165 ILASGIY-IP--PNPPREAPAPGL------PKTFTSSHGHRHRHAPKPTVQNPAQQPTVQNPAQQPTQQPTVQNPAQQQN 235
|
90 100 110
....*....|....*....|....*....|
gi 442625924 14537 PSVQQPSTPTTQHPIQDVQyeTQRPQPTPG 14566
Cdd:pfam13388 236 PAQQPPPQPAQQPTVQNPA--QQQPQTEQG 263
|
|
| EGF_CA |
smart00179 |
Calcium-binding EGF-like domain; |
255-286 |
8.29e-06 |
|
Calcium-binding EGF-like domain;
Pssm-ID: 214542 [Multi-domain] Cd Length: 39 Bit Score: 46.86 E-value: 8.29e-06
10 20 30
....*....|....*....|....*....|..
gi 442625924 255 DVDECSYPNVCGPGAICTNLEGSYRCDCPPGY 286
Cdd:smart00179 1 DIDECASGNPCQNGGTCVNTVGSYRCECPPGY 32
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
14437-14692 |
8.68e-06 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 54.56 E-value: 8.68e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14437 VSTPTSGVINIPSqaspPISVPTPGIVNIPSIPQPTPQRPSPgiiNVPSVPQPiPTAPSPGIINIPSVPQPLPSPTPgvi 14516
Cdd:PHA03247 244 ISHPLRGDIAAPA----PPPVVGEGADRAPETARGATGPPPP---PEAAAPNG-AAAPPDGVWGAALAGAPLALPAP--- 312
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14517 nipqqPTPPPlvqqpgiinipsvQQPSTPTTQHPIQDVQYETQRPQPTPGV---INIPSVSQPTYpTQKPSYQDTSYPTV 14593
Cdd:PHA03247 313 -----PDPPP-------------PAPAGDAEEEDDEDGAMEVVSPLPRPRQhypLGFPKRRRPTW-TPPSSLEDLSAGRH 373
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14594 QPK---PPVSGIINIPSVPQPVPSLTPGVINLPSEPSYSAPIPKPGiinVPSIPEPIPSIPQNPVQEVYHDTQKPQAIPg 14670
Cdd:PHA03247 374 HPKrasLPTRKRRSARHAATPFARGPGGDDQTRPAAPVPASVPTPA---PTPVPASAPPPPATPLPSAEPGSDDGPAPP- 449
|
250 260
....*....|....*....|..
gi 442625924 14671 vvnvpsaPQPTPGRPYYDVAKP 14692
Cdd:PHA03247 450 -------PERQPPAPATEPAPD 464
|
|
| Med15 |
pfam09606 |
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ... |
14179-14565 |
9.07e-06 |
|
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.
Pssm-ID: 312941 [Multi-domain] Cd Length: 732 Bit Score: 54.24 E-value: 9.07e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14179 PGVVNIPSAPQPVHPAPNPPV-HEFNYPTPPAVPQQPgvLNIPSYPTPVAPTPQSPIYIPSQEQPKPTTRPSVinvPSVP 14257
Cdd:pfam09606 90 AGQGTRPQMMGPMGPGPGGPMgQQMGGPGTASNLLAS--LGRPQMPMGGAGFPSQMSRVGRMQPGGQAGGMMQ---PSSG 164
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14258 QPAYPTPQAPVYDV--NYPTSPSVIPHQ--------PGVVNIPSVPLPAPPVKQRPVFVPSPVHPTP-APQPGVVNIPSV 14326
Cdd:pfam09606 165 QPGSGTPNQMGPNGgpGQGQAGGMNGGQqgpmggqmPPQMGVPGMPGPADAGAQMGQQAQANGGMNPqQMGGAPNQVAMQ 244
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14327 AQPVHPTYQPPVVERPAIYDVYYPPppSRPGVINIPSPPRPVYPVPQQPIYVPaPVLHIPAPRPVIHNIPSVPQPTYPHR 14406
Cdd:pfam09606 245 QQQPQQQGQQSQLGMGINQMQQMPQ--GVGGGAGQGGPGQPMGPPGQQPGAMP-NVMSIGDQNNYQQQQTRQQQQQQGGN 321
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14407 NPPIQDVTYPAPQPSPPVPGIVNIPSLPQPVSTPTSGVINI-PSQASPPISVPTPGIVNIPSIPQPTP--QRPSPGIINV 14483
Cdd:pfam09606 322 HPAAHQQQMNQSVGQGGQVVALGGLNHLETWNPGNFGGLGAnPMQRGQPGMMSSPSPVPGQQVRQVTPnqFMRQSPQPSV 401
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14484 PSVPQPI---PTAPSPGIIniPSvPQPLPSPTPGVINIPQQPTPPPLVQQPGIINIP---SVQQPSTPTTQHPIQDvQYE 14557
Cdd:pfam09606 402 PSPQGPGsqpPQSHPGGMI--PS-PALIPSPSPQMSQQPAQQRTIGQDSPGGSLNTPgqsAVNSPLNPQEEQLYRE-KYR 477
|
....*...
gi 442625924 14558 TQRPQPTP 14565
Cdd:pfam09606 478 QLTKYIEP 485
|
|
| PRK07764 |
PRK07764 |
DNA polymerase III subunits gamma and tau; Validated |
14187-14390 |
9.62e-06 |
|
DNA polymerase III subunits gamma and tau; Validated
Pssm-ID: 236090 [Multi-domain] Cd Length: 824 Bit Score: 54.22 E-value: 9.62e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14187 APQPVHPAPNPPVHEFNYPTPPAVPQQPGVLNIPSYP----TPVAPTPQS----PIYIPSQEQPKPTTRPSVINVPSvPQ 14258
Cdd:PRK07764 591 APGAAGGEGPPAPASSGPPEEAARPAAPAAPAAPAAPapagAAAAPAEASaapaPGVAAPEHHPKHVAVPDASDGGD-GW 669
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14259 PAYPTPQAPVYDVnyPTSPSVIPHQPGVVNiPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVNIPSVA---QPVHPTYQ 14335
Cdd:PRK07764 670 PAKAGGAAPAAPP--PAPAPAAPAAPAGAA-PAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAAddpVPLPPEPD 746
|
170 180 190 200 210
....*....|....*....|....*....|....*....|....*....|....*
gi 442625924 14336 PPVVERPAIYDVYYPPPPSRPGViniPSPPRPVYPVPQQPiyvPAPVLHIPAPRP 14390
Cdd:PRK07764 747 DPPDPAGAPAQPPPPPAPAPAAA---PAAAPPPSPPSEEE---EMAEDDAPSMDD 795
|
|
| Herpes_BLLF1 |
pfam05109 |
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ... |
14190-14504 |
1.11e-05 |
|
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.
Pssm-ID: 282904 [Multi-domain] Cd Length: 886 Bit Score: 53.77 E-value: 1.11e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14190 PVHPAPNPPVHEFNYPTPPAVPQQPGVLNIPSYPT-PVAPTPQSPIYIPSQEQPKPTTRPSVINVPSvPQPAYPTPQAPV 14268
Cdd:pfam05109 425 PESTTTSPTLNTTGFAAPNTTTGLPSSTHVPTNLTaPASTGPTVSTADVTSPTPAGTTSGASPVTPS-PSPRDNGTESKA 503
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14269 YDVNYPTSPSVIPHQPGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVN-IPSVAQPVHPTYQPPVVERPAIYDV 14347
Cdd:pfam05109 504 PDMTSPTSAVTTPTPNATSPTPAVTTPTPNATSPTLGKTSPTSAVTTPTPNATSpTPAVTTPTPNATIPTLGKTSPTSAV 583
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14348 YYPPPPSRPGVINIPSPP-----RPVYPVPQQPIYVPAPVLHIPAPRPVIHNIPSVPQPTYPHR--------NPPIQD-- 14412
Cdd:pfam05109 584 TTPTPNATSPTVGETSPQanttnHTLGGTSSTPVVTSPPKNATSAVTTGQHNITSSSTSSMSLRpssisetlSPSTSDns 663
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14413 VTYPAPQPSPPVPGIVNIPSLpQPVSTPTSGVinipSQASPpisVPTPGIVNIPSIPQPTPQRPSPGIINVPSVPQP--- 14489
Cdd:pfam05109 664 TSHMPLLTSAHPTGGENITQV-TPASTSTHHV----STSSP---APRPGTTSQASGPGNSSTSTKPGEVNVTKGTPPkna 735
|
330
....*....|....*.
gi 442625924 14490 -IPTAPSPGIINIPSV 14504
Cdd:pfam05109 736 tSPQAPSGQKTAVPTV 751
|
|
| PTZ00449 |
PTZ00449 |
104 kDa microneme/rhoptry antigen; Provisional |
14237-14597 |
1.22e-05 |
|
104 kDa microneme/rhoptry antigen; Provisional
Pssm-ID: 185628 [Multi-domain] Cd Length: 943 Bit Score: 53.93 E-value: 1.22e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14237 PSQEQpKPTTRPSVINVPSVPQ-PAYP-------TPQAPVyDVNYPTSPSViPHQPGVVNIPSVPlPAPPVKQRPVFVPS 14308
Cdd:PTZ00449 563 PAKEH-KPSKIPTLSKKPEFPKdPKHPkdpeepkKPKRPR-SAQRPTRPKS-PKLPELLDIPKSP-KRPESPKSPKRPPP 638
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14309 PVHPTPAPQPGVVNIPSVAQPVH---PTYQPPVVERpaIYDVYYPPPpSRPGVINIPSPPRPVYPVPQQPIYVPAPVLHI 14385
Cdd:PTZ00449 639 PQRPSSPERPEGPKIIKSPKPPKspkPPFDPKFKEK--FYDDYLDAA-AKSKETKTTVVLDESFESILKETLPETPGTPF 715
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14386 PAPRPVIHNIPSvpQPTYPHRnpPIQDvtypapqpsppvpgivniPSLPQPvstptsgvinipsqasPPISVPTPGIVNI 14465
Cdd:PTZ00449 716 TTPRPLPPKLPR--DEEFPFE--PIGD------------------PDAEQP----------------DDIEFFTPPEEER 757
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14466 PSIPQPTPQRPSPGIInVPSVPQPIPTAPSPGiiniPSVPQPLP-SPTpgviniPQQPTPPPlvqqpgiinipsvQQPST 14544
Cdd:PTZ00449 758 TFFHETPADTPLPDIL-AEEFKEEDIHAETGE----PDEAMKRPdSPS------EHEDKPPG-------------DHPSL 813
|
330 340 350 360 370
....*....|....*....|....*....|....*....|....*....|...
gi 442625924 14545 PTTQHPIQDVQYETQRPQPTPGVINIPSVSQPTYPTQKPSYQDTSypTVQPKP 14597
Cdd:PTZ00449 814 PKKRHRLDGLALSTTDLESDAGRIAKDASGKIVKLKRSKSFDDLT--TVEEAE 864
|
|
| PLN03209 |
PLN03209 |
translocon at the inner envelope of chloroplast subunit 62; Provisional |
14350-14615 |
1.34e-05 |
|
translocon at the inner envelope of chloroplast subunit 62; Provisional
Pssm-ID: 178748 [Multi-domain] Cd Length: 576 Bit Score: 53.39 E-value: 1.34e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14350 PPPPSRPGVINIPSPPRPVYPVPQQPIYVPAPvlhiPAPRPVIHNiPSVPQPTYPHRNPPIQDvtypapqpsppvpgIVN 14429
Cdd:PLN03209 329 PPKESDAADGPKPVPTKPVTPEAPSPPIEEEP----PQPKAVVPR-PLSPYTAYEDLKPPTSP--------------IPT 389
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14430 IPSLPQPVSTPTSGVINIPSQASPPISVPTPGIVNIPSIPQPTP-QRP-SPGI----INVPSVPQPIP-TAPSPGIINIP 14502
Cdd:PLN03209 390 PPSSSPASSKSVDAVAKPAEPDVVPSPGSASNVPEVEPAQVEAKkTRPlSPYAryedLKPPTSPSPTApTGVSPSVSSTS 469
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14503 SVPQPLPSPTPGVINIPQQPTPPplvqqpgiinipsvqqPSTPTTQHPIQDVQYETQRPQPTPGVINIPSVSQPTYPTQK 14582
Cdd:PLN03209 470 SVPAVPDTAPATAATDAAAPPPA----------------NMRPLSPYAVYDDLKPPTSPSPAAPVGKVAPSSTNEVVKVG 533
|
250 260 270 280
....*....|....*....|....*....|....*....|...
gi 442625924 14583 PSYQDTSYP----TVQPKP-PVSGI-----INIPSVPQPVPSL 14615
Cdd:PLN03209 534 NSAPPTALAdeqhHAQPKPrPLSPYtmyedLKPPTSPTPSPVL 576
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
255-289 |
1.44e-05 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 46.48 E-value: 1.44e-05
10 20 30
....*....|....*....|....*....|....*
gi 442625924 255 DVDECSYPNVCGPGAICTNLEGSYRCDCPPGYDGD 289
Cdd:cd00054 1 DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYTGR 35
|
|
| PLN03209 |
PLN03209 |
translocon at the inner envelope of chloroplast subunit 62; Provisional |
13939-14257 |
1.53e-05 |
|
translocon at the inner envelope of chloroplast subunit 62; Provisional
Pssm-ID: 178748 [Multi-domain] Cd Length: 576 Bit Score: 53.39 E-value: 1.53e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13939 NIPSAPQPiyptpqsPQYNVNYPSPQPAnPQKPGVVNIPSVPQPVYPsPQPpvydVNYPTTPVSQHPGVVNI--PSAPRL 14016
Cdd:PLN03209 322 KIPSQRVP-------PKESDAADGPKPV-PTKPVTPEAPSPPIEEEP-PQP----KAVVPRPLSPYTAYEDLkpPTSPIP 388
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14017 VPPTSQRPvfitSPGNLSPTPQPGVINIPSVSQPGYPTPQSPIYDANYPTTQspipqqpgvvniPSVPSPSYPAPNPPvn 14096
Cdd:PLN03209 389 TPPSSSPA----SSKSVDAVAKPAEPDVVPSPGSASNVPEVEPAQVEAKKTR------------PLSPYARYEDLKPP-- 450
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14097 ypTQPSPQIPVQPGVINIPSAPLPTTPPQHPPvfipspespspapkpgVINIPSVTHPE---YPTSQVPVYDVNYSTTpS 14173
Cdd:PLN03209 451 --TSPSPTAPTGVSPSVSSTSSVPAVPDTAPA----------------TAATDAAAPPPanmRPLSPYAVYDDLKPPT-S 511
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14174 PIPQKPGVVNIPSAPQPVHPAPNPPVhefnyPTPPAVPQQpgvlNIPSYPTPVAPTpqsPIYipsqEQPKPTTRPSvinv 14253
Cdd:PLN03209 512 PSPAAPVGKVAPSSTNEVVKVGNSAP-----PTALADEQH----HAQPKPRPLSPY---TMY----EDLKPPTSPT---- 571
|
....
gi 442625924 14254 PSVP 14257
Cdd:PLN03209 572 PSPV 575
|
|
| PRK12323 |
PRK12323 |
DNA polymerase III subunit gamma/tau; |
13904-14094 |
1.59e-05 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237057 [Multi-domain] Cd Length: 700 Bit Score: 53.34 E-value: 1.59e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13904 TPKPVRPQIYDTPSPPYPVAIPdlvyvqqQQPGIVNIPSAPQPIYPTPQSP---------QYNVNYPSPQPANPQKPGVV 13974
Cdd:PRK12323 385 PAPAAAAPAAAAPAPAAPPAAP-------AAAPAAAAAARAVAAAPARRSPapealaaarQASARGPGGAPAPAPAPAAA 457
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13975 NIPSVPQPVYPSPQPPVYDvnyPTTPVSQHPGVVNIPsAPRLVPPTSQRPVFITSPGNLSPTPQPGVINIPSVSQPGYPT 14054
Cdd:PRK12323 458 PAAAARPAAAGPRPVAAAA---AAAPARAAPAAAPAP-ADDDPPPWEELPPEFASPAPAQPDAAPAGWVAESIPDPATAD 533
|
170 180 190 200
....*....|....*....|....*....|....*....|
gi 442625924 14055 PQSPIYDANYPTTQSPIPQqpgvvniPSVPSPSYPAPNPP 14094
Cdd:PRK12323 534 PDDAFETLAPAPAAAPAPR-------AAAATEPVVAPRPP 566
|
|
| PRK14950 |
PRK14950 |
DNA polymerase III subunits gamma and tau; Provisional |
14223-14318 |
1.63e-05 |
|
DNA polymerase III subunits gamma and tau; Provisional
Pssm-ID: 237864 [Multi-domain] Cd Length: 585 Bit Score: 53.27 E-value: 1.63e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14223 PTPVAPTPQSPIYIPSQEQPKPTTRPSVInvpsvpqPAYPTPQAPVydVNYPTSPSVIPHQPGVVNIPSVPLPAPPVKQR 14302
Cdd:PRK14950 364 PAPQPAKPTAAAPSPVRPTPAPSTRPKAA-------AAANIPPKEP--VRETATPPPVPPRPVAPPVPHTPESAPKLTRA 434
|
90
....*....|....*..
gi 442625924 14303 PVFVP-SPVHPTPAPQP 14318
Cdd:PRK14950 435 AIPVDeKPKYTPPAPPK 451
|
|
| PRK12727 |
PRK12727 |
flagellar biosynthesis protein FlhF; |
14178-14389 |
2.12e-05 |
|
flagellar biosynthesis protein FlhF;
Pssm-ID: 237182 [Multi-domain] Cd Length: 559 Bit Score: 52.68 E-value: 2.12e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14178 KPGVVNIPSAPQPVHPAPNPPvhefnyPTPPAVPQQPGVLNIPSYPTPVAPTPQSPIYIPSQEQPKPTTRPSVINVPsVP 14257
Cdd:PRK12727 59 RSDTPATAAAPAPAPQAPTKP------AAPVHAPLKLSANANMSQRQRVASAAEDMIAAMALRQPVSVPRQAPAAAP-VR 131
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14258 QPAYPTP----QAPVYDVNYPTSP----SVIPHQPGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVnipSVAQp 14329
Cdd:PRK12727 132 AASIPSPaaqaLAHAAAVRTAPRQehalSAVPEQLFADFLTTAPVPRAPVQAPVVAAPAPVPAIAAALAAHA---AYAQ- 207
|
170 180 190 200 210 220
....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14330 vHPTYQppvvERPAIYDVYYPPPPSRPgviniPSPPRPVYPVPQQPIYVPAPVLHIPAPR 14389
Cdd:PRK12727 208 -DDDEQ----LDDDGFDLDDALPQILP-----PAALPPIVVAPAAPAALAAVAAAAPAPQ 257
|
|
| PRK07003 |
PRK07003 |
DNA polymerase III subunit gamma/tau; |
14227-14531 |
2.35e-05 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 235906 [Multi-domain] Cd Length: 830 Bit Score: 52.93 E-value: 2.35e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14227 APTPQSPIYIPSQeQPKPTTRPsvinvPSVPQPAYPTPQAPVydvnyPTSPSVIPHQPGVVNIPSVPLPAPPVKQRPV-- 14304
Cdd:PRK07003 367 APGGGVPARVAGA-VPAPGARA-----AAAVGASAVPAVTAV-----TGAAGAALAPKAAAAAAATRAEAPPAAPAPPat 435
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14305 ---FVPSPVHPTPAPQPGVVNIPSVAQPVHPTYQPPVVERPAIydvyYPPPPSRPGVINIPSPP----RPVYPVPQQPIY 14377
Cdd:PRK07003 436 adrGDDAADGDAPVPAKANARASADSRCDERDAQPPADSGSAS----APASDAPPDAAFEPAPRaaapSAATPAAVPDAR 511
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14378 VPAPVLHIPAPRPVIHNIPSVPQPTYPHRNPPIQ--------DVTYPA----PQPSPPVPGIVNIPSLPQPVSTPtsgvi 14445
Cdd:PRK07003 512 APAAASREDAPAAAAPPAPEARPPTPAAAAPAARaggaaaalDVLRNAgmrvSSDRGARAAAAAKPAAAPAAAPK----- 586
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14446 niPSQASPPISVPTPGIVNIPSIPQPTPQRPSPGIINVPSVPQP---IPT-------------APSPGII--------NI 14501
Cdd:PRK07003 587 --PAAPRVAVQVPTPRARAATGDAPPNGAARAEQAAESRGAPPPwedIPPddyvplsadegfgGPDDGFVpvfdsgpdDV 664
|
330 340 350
....*....|....*....|....*....|
gi 442625924 14502 PSVPQPLPSPTPGViniPQQPTPPPLVQQP 14531
Cdd:PRK07003 665 RVAPKPADAPAPPV---DTRPLPPAIPLDA 691
|
|
| PLN03209 |
PLN03209 |
translocon at the inner envelope of chloroplast subunit 62; Provisional |
14429-14682 |
2.57e-05 |
|
translocon at the inner envelope of chloroplast subunit 62; Provisional
Pssm-ID: 178748 [Multi-domain] Cd Length: 576 Bit Score: 52.62 E-value: 2.57e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14429 NIPS--LPQPVSTPTSGVINIPSQASPPiSVPTPGIVNIPSIPQPTPQRP-SPGIINVPSVPqpiPTAPSPgiiNIPSVP 14505
Cdd:PLN03209 322 KIPSqrVPPKESDAADGPKPVPTKPVTP-EAPSPPIEEEPPQPKAVVPRPlSPYTAYEDLKP---PTSPIP---TPPSSS 394
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14506 QPLPSPTPGViNIPQQPTPPPLVQQPgiINIPSVQQPSTPT-TQHPIQD-VQYETQRP----QPTPGVINIPSVSQPTYP 14579
Cdd:PLN03209 395 PASSKSVDAV-AKPAEPDVVPSPGSA--SNVPEVEPAQVEAkKTRPLSPyARYEDLKPptspSPTAPTGVSPSVSSTSSV 471
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14580 TQKP-------SYQDTSYPTVQPKP----PVSGIINIPSVPQPV-------PSLTPGVINLPSEPSYSAPIPKPGIINvp 14641
Cdd:PLN03209 472 PAVPdtapataATDAAAPPPANMRPlspyAVYDDLKPPTSPSPAapvgkvaPSSTNEVVKVGNSAPPTALADEQHHAQ-- 549
|
250 260 270 280
....*....|....*....|....*....|....*....|.
gi 442625924 14642 siPEPIPSIPQNpvqeVYHDTqKPqaipgvvnvPSAPQPTP 14682
Cdd:PLN03209 550 --PKPRPLSPYT----MYEDL-KP---------PTSPTPSP 574
|
|
| PRK12323 |
PRK12323 |
DNA polymerase III subunit gamma/tau; |
14283-14533 |
2.81e-05 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237057 [Multi-domain] Cd Length: 700 Bit Score: 52.57 E-value: 2.81e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14283 QPGVVNIPSVPlpaPPVKQRPVfvpspVHPTPAPQPGVVNIPSVAQPVHPTYQPPVVERPAiydvyyPPPPSRPGViniP 14362
Cdd:PRK12323 364 RPGQSGGGAGP---ATAAAAPV-----AQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAA------RAVAAAPAR---R 426
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14363 SPPRPVYPVPQQPIYVPAPVLHIPAPRPvihniPSVPQPTYPhrnPPIQDVtypapqpSPPVPGIVNIPSLPQPVSTPTS 14442
Cdd:PRK12323 427 SPAPEALAAARQASARGPGGAPAPAPAP-----AAAPAAAAR---PAAAGP-------RPVAAAAAAAPARAAPAAAPAP 491
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14443 GVINIPSQASPPISVPTPGivnipsipqPTPQRPSPGIINVPSVPQPIPTAPSPGIINIPSVPQPLPSPTPGVINIPQQP 14522
Cdd:PRK12323 492 ADDDPPPWEELPPEFASPA---------PAQPDAAPAGWVAESIPDPATADPDDAFETLAPAPAAAPAPRAAAATEPVVA 562
|
250
....*....|.
gi 442625924 14523 TPPPLVQQPGI 14533
Cdd:PRK12323 563 PRPPRASASGL 573
|
|
| PRK14951 |
PRK14951 |
DNA polymerase III subunits gamma and tau; Provisional |
14230-14381 |
3.06e-05 |
|
DNA polymerase III subunits gamma and tau; Provisional
Pssm-ID: 237865 [Multi-domain] Cd Length: 618 Bit Score: 52.41 E-value: 3.06e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14230 PQSPIYIPSQEQPKPTTRPSVINVPSVPQPAYPTPQAPVydvnyptspsviPHQPGVVNIPSVPLPAPPvkQRPVFVPSP 14309
Cdd:PRK14951 366 PAAAAEAAAPAEKKTPARPEAAAPAAAPVAQAAAAPAPA------------AAPAAAASAPAAPPAAAP--PAPVAAPAA 431
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 442625924 14310 VHPTPAPQPGVVnipSVAQPVHPTYQPPvvERPAIYDVYYPPPPSrpgvinIPSPPRPVYPVPQQPIYVPAP 14381
Cdd:PRK14951 432 AAPAAAPAAAPA---AVALAPAPPAQAA--PETVAIPVRVAPEPA------VASAAPAPAAAPAAARLTPTE 492
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
338-373 |
3.22e-05 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 45.32 E-value: 3.22e-05
10 20 30
....*....|....*....|....*....|....*.
gi 442625924 338 DVDECATNNPCGLGAECVNLGGSFQCRCPSGFVLEH 373
Cdd:cd00054 1 DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYTGRN 36
|
|
| dnaA |
PRK14086 |
chromosomal replication initiator protein DnaA; |
14182-14408 |
3.39e-05 |
|
chromosomal replication initiator protein DnaA;
Pssm-ID: 237605 [Multi-domain] Cd Length: 617 Bit Score: 52.14 E-value: 3.39e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14182 VNIPSAPQPVHPAPNPPVHE-FNYPTPPAVPQQPgvlnIPSYPTPVA-PTPQSPiyipsqeqPKPTTRPSvinvpsvPQP 14259
Cdd:PRK14086 84 IAITVDPSAGEPAPPPPHARrTSEPELPRPGRRP----YEGYGGPRAdDRPPGL--------PRQDQLPT-------ARP 144
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14260 AYPTPQAPVYDVNYPTSPSVIPHQPGVVNIPSVPLPAPPVKQRPvfvpspvHPTPAPQPGVVNIPSVAQPVHPTYQP-PV 14338
Cdd:PRK14086 145 AYPAYQQRPEPGAWPRAADDYGWQQQRLGFPPRAPYASPASYAP-------EQERDREPYDAGRPEYDQRRRDYDHPrPD 217
|
170 180 190 200 210 220 230
....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 442625924 14339 VERPAIYDVYYPPPPsrPGVINipsPPRPVyPVPQQPIYVPAPVLHIPAPRPVIHNIPSVPQPTYPHR--NP 14408
Cdd:PRK14086 218 WDRPRRDRTDRPEPP--PGAGH---VHRGG-PGPPERDDAPVVPIRPSAPGPLAAQPAPAPGPGEPTArlNP 283
|
|
| PTZ00449 |
PTZ00449 |
104 kDa microneme/rhoptry antigen; Provisional |
14057-14366 |
3.39e-05 |
|
104 kDa microneme/rhoptry antigen; Provisional
Pssm-ID: 185628 [Multi-domain] Cd Length: 943 Bit Score: 52.38 E-value: 3.39e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14057 SPIYDANYPTTQSPIPQQ-------PGVVNIPSVP----SPSYP----APNPPVNyPTQP-SPQIPVQPGVINIPSAP-- 14118
Cdd:PTZ00449 548 KPGETKEGEVGKKPGPAKehkpskiPTLSKKPEFPkdpkHPKDPeepkKPKRPRS-AQRPtRPKSPKLPELLDIPKSPkr 626
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14119 --LPTTPPQHPPvfipspespspapkpgvinipsvthPEYPTSqvpvydvnysttpspiPQKPGVVNIPSAPQPvhpapn 14196
Cdd:PTZ00449 627 peSPKSPKRPPP-------------------------PQRPSS----------------PERPEGPKIIKSPKP------ 659
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14197 ppvhefnyPTPPAVPQQPGVLN--IPSYPTPVAPTPQSPIYIPSQEQPKPTTRPSVINVPSVPQPAyPTPQAPVydvnYP 14274
Cdd:PTZ00449 660 --------PKSPKPPFDPKFKEkfYDDYLDAAAKSKETKTTVVLDESFESILKETLPETPGTPFTT-PRPLPPK----LP 726
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14275 TSPSvIPHQPgvVNIPSVPLP------APPVKQRPVFvpspvHPTPA--PQPGVVNIPSVAQPVHPTYQPP--VVERPAI 14344
Cdd:PTZ00449 727 RDEE-FPFEP--IGDPDAEQPddieffTPPEEERTFF-----HETPAdtPLPDILAEEFKEEDIHAETGEPdeAMKRPDS 798
|
330 340
....*....|....*....|..
gi 442625924 14345 YDVYYPPPPSrpgviNIPSPPR 14366
Cdd:PTZ00449 799 PSEHEDKPPG-----DHPSLPK 815
|
|
| PRK07764 |
PRK07764 |
DNA polymerase III subunits gamma and tau; Validated |
14185-14412 |
3.78e-05 |
|
DNA polymerase III subunits gamma and tau; Validated
Pssm-ID: 236090 [Multi-domain] Cd Length: 824 Bit Score: 52.30 E-value: 3.78e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14185 PSAPQPVHPAPNPPVHEFNYPTPPAVPQqpgvlnipsyptPVAPTPQSPiyipsqeqPKPTTRPSVINVPSVPQPAYPTP 14264
Cdd:PRK07764 593 GAAGGEGPPAPASSGPPEEAARPAAPAA------------PAAPAAPAP--------AGAAAAPAEASAAPAPGVAAPEH 652
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14265 QAPVYDVNYPTSPSVIPHQPGVVNIPSVPLPAP-PVKQRPVFVPSPVHPTPAPQPGVVNIPSVAQPVHPTYQPPVVERPA 14343
Cdd:PRK07764 653 HPKHVAVPDASDGGDGWPAKAGGAAPAAPPPAPaPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPS 732
|
170 180 190 200 210 220
....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 442625924 14344 IYDVYYPPPPSRPGVINIPSPPRPVYPVPQQPIYVPAPVLHIPAPRPVIHNiPSVPQPTYPHRNPPIQD 14412
Cdd:PRK07764 733 PAADDPVPLPPEPDDPPDPAGAPAQPPPPPAPAPAAAPAAAPPPSPPSEEE-EMAEDDAPSMDDEDRRD 800
|
|
| Gag_spuma |
pfam03276 |
Spumavirus gag protein; |
14473-14671 |
4.11e-05 |
|
Spumavirus gag protein;
Pssm-ID: 460872 [Multi-domain] Cd Length: 614 Bit Score: 51.67 E-value: 4.11e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14473 PQRPSPGIINVPSVPQPIPTAPSPgiiNIP-SVPQPLPsPTPGVINIPQQ----PTPPPLVQQPGiinipsvqqpstptt 14547
Cdd:pfam03276 196 PSLPAIGGIHLPAIPGIHARAPPG---NIArSLGDDIM-PSLGDAGMPQPrfafHPGNPFAEAEG--------------- 256
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14548 qHPIQDVQYETQRPQPTPGVINIPSVSQPtyptqkpsyqdtsyPTVQPKPPvsgiinipSVPQPVPSLTPgvinlpsePS 14627
Cdd:pfam03276 257 -HPFAEAEGERPRDIPRAPRIDAPSAPAI--------------PAIQPIAP--------PMIPPIGAPIP--------IP 305
|
170 180 190 200
....*....|....*....|....*....|....*....|....
gi 442625924 14628 YSAPIPKPGIINVPSIPepipsiPQNPVQEVYHDTQKPQAIPGV 14671
Cdd:pfam03276 306 HGASIPGEHIRNPREEP------IRLGREAPAIDGRFAPAIDDL 343
|
|
| Tymo_45kd_70kd |
pfam03251 |
Tymovirus 45/70Kd protein; Tymoviruses are single stranded RNA viruses. This family includes a ... |
13905-14318 |
4.60e-05 |
|
Tymovirus 45/70Kd protein; Tymoviruses are single-stranded RNA viruses. This family includes a protein of unknown function that has been named based on its molecular weight. Tymoviruses such as the ononis yellow mosaic tymovirus encode only three proteins. Of these, two overlap: this protein overlaps a larger ORF that is thought to be the polymerase.
Pssm-ID: 281269 [Multi-domain] Cd Length: 468 Bit Score: 51.33 E-value: 4.60e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13905 PKPVRPQIYDTPSPPYPVAIP----------DLVYVQQQQPGIVNIpSAPQPIYPTPQ---SPQYNVNYPS--PQPANPQ 13969
Cdd:pfam03251 67 PPPRRPQDNRDFSPLHPLVFPghhsqlrhvhETQQVQQTCPGKLKL-SGAEELPPAPQrqhSLPLHITRPSrfPHHFHAR 145
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13970 KPGVvnIPSVPQpvypspQPPVYDVNYPTTPVSQHPGVVNIPS-APRLVPPTSQrpvFITSPGNLSPTPQpgviniPSVS 14048
Cdd:pfam03251 146 RPDV--LPSVPD------HGPVLTETKPRTSVRQPRSATRGPSfRPILLPKVVH---VHDDPPHSSLRPR------GSRS 208
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14049 QPGYPTPQSPIYDANypttQSPIPQQPGvvniPSVPSPSYPAPNPPVNYPTQPSPQIPVQPGVINI----PSAPLPTTPP 14124
Cdd:pfam03251 209 RQLQPTVRRPLLAPN----QFHSPRQPP----PLSDDPGILGPRPLAPHSTRDPPPRPITPGPSNThdlrPLSVLPRTSP 280
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14125 QHPPvfipspespspapkpgvinIPSVTHPEYPTSQVPVYDVNYSTTPSPIPQKPgVVNIPSAPQPVHPAPNPPVHEFNY 14204
Cdd:pfam03251 281 RRGL-------------------LPNPRRHRTSTGHIPPTTTSRPTGPPSRLQRP-VHLYQSSPHTPNFRPSSIRKDALL 340
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14205 PTPPAVPQQPGvLNIPSYPTPVAPTPQSPIYIPSQEQPK--PTTRPSVINVPSV----PQPAYPTPQAPVYDVNYPTSPS 14278
Cdd:pfam03251 341 QTGPRLGHLER-LGQPANLRTSERSPPTKRRLPRSSEPNrlPKPLPEATLAPSYrhrrPYPLLPNPPAALPSIAYTSSRG 419
|
410 420 430 440
....*....|....*....|....*....|....*....|
gi 442625924 14279 VIPHQPGVVNIPSVPLPAPPVKQrpvfvpspvhPTPAPQP 14318
Cdd:pfam03251 420 KIHHSLPKGALPKEGAPPPPRRL----------PSPAPRP 449
|
|
| Amelogenin |
smart00818 |
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ... |
14220-14379 |
5.76e-05 |
|
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.
Pssm-ID: 197891 [Multi-domain] Cd Length: 165 Bit Score: 48.25 E-value: 5.76e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14220 PSYP-TPVAPTPQSPIYIPSQEQPKPTTRPSVINVPSVPQPAYPTPQAPVYDVnyPTSPSVIPHQPGVVNIPsvplpaPP 14298
Cdd:smart00818 24 PSYGyEPMGGWLHHQIIPVSQQHPPTHTLQPHHHIPVLPAQQPVVPQQPLMPV--PGQHSMTPTQHHQPNLP------QP 95
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14299 VKQrpvfvpsPVHPTPAPQPgvvnipsvaQPVHPTYQPPVVErpaiydvyyPPPPSRPgviniPSPPRPVYPVPQQPIYV 14378
Cdd:smart00818 96 AQQ-------PFQPQPLQPP---------QPQQPMQPQPPVH---------PIPPLPP-----QPPLPPMFPMQPLPPLL 145
|
.
gi 442625924 14379 P 14379
Cdd:smart00818 146 P 146
|
|
| PRK12323 |
PRK12323 |
DNA polymerase III subunit gamma/tau; |
14274-14486 |
5.77e-05 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237057 [Multi-domain] Cd Length: 700 Bit Score: 51.42 E-value: 5.77e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14274 PTSPSVIPhQPGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVNIPSVAQPVhPTYQPPVVERPAIYDVYYPPPP 14353
Cdd:PRK12323 374 PATAAAAP-VAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPA-PEALAAARQASARGPGGAPAPA 451
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14354 SRPGVINIPSPPRPVYPVPqqpiyvPAPVLHIPAPRPVIHNIPSVPQPTYPhrnPPIQDVtypapQPSPPVPGIVNIPSL 14433
Cdd:PRK12323 452 PAPAAAPAAAARPAAAGPR------PVAAAAAAAPARAAPAAAPAPADDDP---PPWEEL-----PPEFASPAPAQPDAA 517
|
170 180 190 200 210 220
....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14434 PQPV---STPTSGVIN----IPSQASPPISVPTPgIVNIPSIPQPTPQRPSPGIINVPSV 14486
Cdd:PRK12323 518 PAGWvaeSIPDPATADpddaFETLAPAPAAAPAP-RAAAATEPVVAPRPPRASASGLPDM 576
|
|
| Gag_spuma |
pfam03276 |
Spumavirus gag protein; |
14284-14412 |
6.22e-05 |
|
Spumavirus gag protein;
Pssm-ID: 460872 [Multi-domain] Cd Length: 614 Bit Score: 51.29 E-value: 6.22e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14284 PGVVNIPSVPLPAPPvkqrPVFVPSPVHPTPAPQPGvvNIP---SVAQPVHPTY----QPPVVE----RPAIYDVYYPPP 14352
Cdd:pfam03276 196 PSLPAIGGIHLPAIP----GIHARAPPGNIARSLGD--DIMpslGDAGMPQPRFafhpGNPFAEaeghPFAEAEGERPRD 269
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 442625924 14353 PSRPGVINIPSPPRPVYPVPQQPiyVPAPVLHIPAPRPVIHNIPSVP------QPTYPHRNPPIQD 14412
Cdd:pfam03276 270 IPRAPRIDAPSAPAIPAIQPIAP--PMIPPIGAPIPIPHGASIPGEHirnpreEPIRLGREAPAID 333
|
|
| PAT1 |
pfam09770 |
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... |
13903-14082 |
6.32e-05 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 51.58 E-value: 6.32e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13903 ETPKPVRPQIYDTPSPpyPVAIPDLVYVQQQQpgivnipsaPQPIYPTPQSPQYNVNYPSPQPANPQKPGVVNIPSVPQP 13982
Cdd:pfam09770 206 QAKKPAQQPAPAPAQP--PAAPPAQQAQQQQQ---------FPPQIQQQQQPQQQPQQPQQHPGQGHPVTILQRPQSPQP 274
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13983 VYPSPQPPvydvnypttPVSQhpgvvnipSAPRLVPPTSQRPVFIT-SPGNLSPTPQPGVINIPSVSQPGYPTPQSPiyd 14061
Cdd:pfam09770 275 DPAQPSIQ---------PQAQ--------QFHQQPPPVPVQPTQILqNPNRLSAARVGYPQNPQPGVQPAPAHQAHR--- 334
|
170 180
....*....|....*....|.
gi 442625924 14062 anyptTQSPIPQQPGVVNIPS 14082
Cdd:pfam09770 335 -----QQGSFGRQAPIITHPQ 350
|
|
| Herpes_BLLF1 |
pfam05109 |
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ... |
13883-14127 |
6.39e-05 |
|
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.
Pssm-ID: 282904 [Multi-domain] Cd Length: 886 Bit Score: 51.46 E-value: 6.39e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13883 HSPICYCISSHTGdPFTRCYETPKPVRPQIYDTPSPPYPVAIPDLVYVQQQQPGIVNIPSAPQPIYPTPQSPQYNVNYPS 13962
Cdd:pfam05109 453 HVPTNLTAPASTG-PTVSTADVTSPTPAGTTSGASPVTPSPSPRDNGTESKAPDMTSPTSAVTTPTPNATSPTPAVTTPT 531
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13963 PQPANPQ--KPGVVNIPSVPQPVYPSPQP----PVYDVNYPTTPVSQHPGVVNIPSaPRLVPP----------------- 14019
Cdd:pfam05109 532 PNATSPTlgKTSPTSAVTTPTPNATSPTPavttPTPNATIPTLGKTSPTSAVTTPT-PNATSPtvgetspqanttnhtlg 610
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14020 -TSQRPVFITSPGNLSPTPQPGVINI------------PSVSQPGYP------TPQSPIYDANYPTTQSPIPQ-QPGVVN 14079
Cdd:pfam05109 611 gTSSTPVVTSPPKNATSAVTTGQHNItssstssmslrpSSISETLSPstsdnsTSHMPLLTSAHPTGGENITQvTPASTS 690
|
250 260 270 280 290
....*....|....*....|....*....|....*....|....*....|..
gi 442625924 14080 IPSVpSPSYPAPNP-PVNYPTQP-SPQIPVQPGVINIP--SAPLPTTPPQHP 14127
Cdd:pfam05109 691 THHV-STSSPAPRPgTTSQASGPgNSSTSTKPGEVNVTkgTPPKNATSPQAP 741
|
|
| Amelogenin |
smart00818 |
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ... |
14274-14401 |
6.39e-05 |
|
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.
Pssm-ID: 197891 [Multi-domain] Cd Length: 165 Bit Score: 48.25 E-value: 6.39e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14274 PTSPSVIPHQ--PGVVNIPSVPLPAPPVKQRPVfVPSPVHPTPAPQPGvvNIPSVAQPVHPTYQPPVVErpaiydvyyPP 14351
Cdd:smart00818 41 PVSQQHPPTHtlQPHHHIPVLPAQQPVVPQQPL-MPVPGQHSMTPTQH--HQPNLPQPAQQPFQPQPLQ---------PP 108
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|
gi 442625924 14352 PPSRPgvINIPSPPRPVYPVPQQPiyVPAPVLHIPAPRPVIHNIPSVPQP 14401
Cdd:smart00818 109 QPQQP--MQPQPPVHPIPPLPPQP--PLPPMFPMQPLPPLLPDLPLEAWP 154
|
|
| PLN03209 |
PLN03209 |
translocon at the inner envelope of chloroplast subunit 62; Provisional |
13997-14278 |
6.60e-05 |
|
translocon at the inner envelope of chloroplast subunit 62; Provisional
Pssm-ID: 178748 [Multi-domain] Cd Length: 576 Bit Score: 51.08 E-value: 6.60e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13997 PTTPVSQHpgVVNIPSapRLVPPtsQRPVFITSPgnlSPTPQPGVINIPSVSQPGYPTPQspiydanyPTTQSPIPQQPG 14076
Cdd:PLN03209 312 PLTPMEEL--LAKIPS--QRVPP--KESDAADGP---KPVPTKPVTPEAPSPPIEEEPPQ--------PKAVVPRPLSPY 374
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14077 VVNIPSVPsPSYPAPNPPVNYPTQPSP----QIPVQPGVIniPSAPLPTTPPQHPPvfipspespspapkpgvinIPSVT 14152
Cdd:PLN03209 375 TAYEDLKP-PTSPIPTPPSSSPASSKSvdavAKPAEPDVV--PSPGSASNVPEVEP-------------------AQVEA 432
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14153 HPEYPTSQVPVY-DVNYSTTPSPIPQKPGVVNIPSAP----QPVHPAPNPPVHEFNYPTPPAVPQQPGV----LNIPSYP 14223
Cdd:PLN03209 433 KKTRPLSPYARYeDLKPPTSPSPTAPTGVSPSVSSTSsvpaVPDTAPATAATDAAAPPPANMRPLSPYAvyddLKPPTSP 512
|
250 260 270 280 290 300
....*....|....*....|....*....|....*....|....*....|....*....|..
gi 442625924 14224 TPVAPTPQSPiyiPSQEQPKPTTRPSVINVPSV-------PQPAYPTPQAPVYDVNYPTSPS 14278
Cdd:PLN03209 513 SPAAPVGKVA---PSSTNEVVKVGNSAPPTALAdeqhhaqPKPRPLSPYTMYEDLKPPTSPT 571
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
14502-14706 |
6.61e-05 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 51.31 E-value: 6.61e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14502 PSVPQPLPSPTPGVINIPQQ--PTPPPLVQQPGiiniPSVQQPSTPTTQHPiqdvqyETQRPQPTPGVINIPSVSQPTyp 14579
Cdd:pfam03154 146 PSIPSPQDNESDSDSSAQQQilQTQPPVLQAQS----GAASPPSPPPPGTT------QAATAGPTPSAPSVPPQGSPA-- 213
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14580 tqkpsyqdTSYPTVQPKPPVSGIINIPSVPQPVPSltpgviNLPSEPSYSAPIPKPgiinvpsiPEPIPSIPQNPVQEVY 14659
Cdd:pfam03154 214 --------TSQPPNQTQSTAAPHTLIQQTPTLHPQ------RLPSPHPPLQPMTQP--------PPPSQVSPQPLPQPSL 271
|
170 180 190 200
....*....|....*....|....*....|....*....|....*..
gi 442625924 14660 HDTQKPQAIPGVVNVPSAPQPTPGRPYYDVAKPDFEFNPCYPSPCGP 14706
Cdd:pfam03154 272 HGQMPPMPHSLQTGPSHMQHPVPPQPFPLTPQSSQSQVPPGPSPAAP 318
|
|
| PRK12323 |
PRK12323 |
DNA polymerase III subunit gamma/tau; |
14436-14630 |
6.61e-05 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237057 [Multi-domain] Cd Length: 700 Bit Score: 51.42 E-value: 6.61e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14436 PVSTPTsgviniPSQASPPISVPTPGIVNIPSIPQPTPQRPSPGIINVPSVPQPIPTAPSPGIINIPSVPQPLPSPTPGV 14515
Cdd:PRK12323 381 PVAQPA------PAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARGPGGAPAPAPAP 454
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14516 INIPQQPTPPPLVQQPGiiniPSVQQPSTPTTQHPIQDVQYETQRPQP---TPGVINIPSVSQP--------TYPTQKPS 14584
Cdd:PRK12323 455 AAAPAAAARPAAAGPRP----VAAAAAAAPARAAPAAAPAPADDDPPPweeLPPEFASPAPAQPdaapagwvAESIPDPA 530
|
170 180 190 200
....*....|....*....|....*....|....*....|....*...
gi 442625924 14585 YQDTS--YPTVQPKPPVSGIINIPSVPQPVPSLTPGVINLPSEPSYSA 14630
Cdd:PRK12323 531 TADPDdaFETLAPAPAAAPAPRAAAATEPVVAPRPPRASASGLPDMFD 578
|
|
| rne |
PRK10811 |
ribonuclease E; Reviewed |
14061-14305 |
7.32e-05 |
|
ribonuclease E; Reviewed
Pssm-ID: 236766 [Multi-domain] Cd Length: 1068 Bit Score: 51.19 E-value: 7.32e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14061 DANYPTtQSPIPQQPGVVnipsvpSP---------SYPAPNP--PVNYPTQPSPQIPVQPGVINIPSAPLPTTPPQHPPV 14129
Cdd:PRK10811 816 DERYPT-QSPMPLTVACA------SPemasgkvwiRYPVVRPqdVQVEEQREAEEVQVQPVVAEVPVAAAVEPVVSAPVV 888
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14130 fipspespspAPKPGVINIPSVTHPEYPTSQVPVYDVNYSTTPSPIPQKPGVVNIPSAPQPVHPAPNP-PVHEFNYPTPP 14208
Cdd:PRK10811 889 ----------EAVAEVVEEPVVVAEPQPEEVVVVETTHPEVIAAPVTEQPQVITESDVAVAQEVAEHAePVVEPQDETAD 958
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14209 AVPQQPGVLNIPSYPTPVAPTPQSPIYIPSQEQPKPTTrpsvinVPSVPQPAyPTPQAPVYdVNYPTSPSVIPHQPGVVn 14288
Cdd:PRK10811 959 IEEAAETAEVVVAEPEVVAQPAAPVVAEVAAEVETVTA------VEPEVAPA-QVPEATVE-HNHATAPMTRAPAPEYV- 1029
|
250 260
....*....|....*....|
gi 442625924 14289 ipsvplPAPPVK---QRPVF 14305
Cdd:PRK10811 1030 ------PEAPRHsdwQRPTF 1043
|
|
| PRK14950 |
PRK14950 |
DNA polymerase III subunits gamma and tau; Provisional |
14165-14304 |
7.46e-05 |
|
DNA polymerase III subunits gamma and tau; Provisional
Pssm-ID: 237864 [Multi-domain] Cd Length: 585 Bit Score: 50.96 E-value: 7.46e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14165 DVNYSTTPSP-IPQKPGVVNIPSAPQPVhPAPNPPvhefNYPTPPAVPQQPGVLNIPSYPTPVAPTPQSPIyipSQEQPK 14243
Cdd:PRK14950 338 DFQLRTTSYGqLPLELAVIEALLVPVPA-PQPAKP----TAAAPSPVRPTPAPSTRPKAAAAANIPPKEPV---RETATP 409
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|.
gi 442625924 14244 PTTRPSVINVPSVPQPayptPQAPvydvnyPTSPSVIPhqpgVVNIPSVPLPAPPVKQRPV 14304
Cdd:PRK14950 410 PPVPPRPVAPPVPHTP----ESAP------KLTRAAIP----VDEKPKYTPPAPPKEEEKA 456
|
|
| EGF_CA |
smart00179 |
Calcium-binding EGF-like domain; |
338-369 |
7.53e-05 |
|
Calcium-binding EGF-like domain;
Pssm-ID: 214542 [Multi-domain] Cd Length: 39 Bit Score: 44.16 E-value: 7.53e-05
10 20 30
....*....|....*....|....*....|..
gi 442625924 338 DVDECATNNPCGLGAECVNLGGSFQCRCPSGF 369
Cdd:smart00179 1 DIDECASGNPCQNGGTCVNTVGSYRCECPPGY 32
|
|
| PRK14950 |
PRK14950 |
DNA polymerase III subunits gamma and tau; Provisional |
14454-14565 |
8.92e-05 |
|
DNA polymerase III subunits gamma and tau; Provisional
Pssm-ID: 237864 [Multi-domain] Cd Length: 585 Bit Score: 50.58 E-value: 8.92e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14454 PISVPTPGIvniPSIPQPTPQRPSPGIINVPSVPQPIPTAPSpgiiniPSVPQPLPSPTPgviniPQQPTPPPLVQQPgi 14533
Cdd:PRK14950 362 PVPAPQPAK---PTAAAPSPVRPTPAPSTRPKAAAAANIPPK------EPVRETATPPPV-----PPRPVAPPVPHTP-- 425
|
90 100 110
....*....|....*....|....*....|..
gi 442625924 14534 iniPSVqqPSTPTTQHPIqDVQYETQRPQPTP 14565
Cdd:PRK14950 426 ---ESA--PKLTRAAIPV-DEKPKYTPPAPPK 451
|
|
| PBP1 |
COG5180 |
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ... |
13903-14128 |
9.23e-05 |
|
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];
Pssm-ID: 444064 [Multi-domain] Cd Length: 548 Bit Score: 50.45 E-value: 9.23e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13903 ETPKPVRPQIYDTPSPPYPVAIPDLVYvQQQQPGIVNIPSAPQPIYPTPQSPQYNVNYP---------SPQPANP--QKP 13971
Cdd:COG5180 274 AAEPPGLPVLEAGSEPQSDAPEAETAR-PIDVKGVASAPPATRPVRPPGGARDPGTPRPgqpterpagVPEAASDagQPP 352
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13972 GVVNIPSVPQPVYPSPQ--PPVYDVNYPTTPV----------SQHPGVVN-IPSAPRLVPPTSQRPVFIT-------SPG 14031
Cdd:COG5180 353 SAYPPAEEAVPGKPLEQgaPRPGSSGGDGAPFqppngapqpgLGRRGAPGpPMGAGDLVQAALDGGGRETaslggaaGGA 432
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14032 NLSPTPQPGVINIPSVSQPGYPTPQSPIydanyptTQSPIPQQPGVV--NIPSVPSPSYPAPNPPVNYPTQPSPQIPVQP 14109
Cdd:COG5180 433 GQGPKADFVPGDAESVSGPAGLADQAGA-------AASTAMADFVAPvtDATPVDVADVLGVRPDAILGGNVAPASGLDA 505
|
250
....*....|....*....
gi 442625924 14110 GVINIPSAPLPTTPPQHPP 14128
Cdd:COG5180 506 ETRIIEAEGAPATEDFVAA 524
|
|
| PRK14951 |
PRK14951 |
DNA polymerase III subunits gamma and tau; Provisional |
14187-14336 |
9.72e-05 |
|
DNA polymerase III subunits gamma and tau; Provisional
Pssm-ID: 237865 [Multi-domain] Cd Length: 618 Bit Score: 50.48 E-value: 9.72e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14187 APQPVHPAPNPPVHEFNYPTPPAVPQQPGVlnipsyPTPVAPTPQSPIYIPSQEQPKPTTRPsvinVPSVPQPAYPTPQA 14266
Cdd:PRK14951 363 AFKPAAAAEAAAPAEKKTPARPEAAAPAAA------PVAQAAAAPAPAAAPAAAASAPAAPP----AAAPPAPVAAPAAA 432
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14267 PVydvnyptsPSVIPHQPGVVNIPsvplPAPPVKQRPVFVPSPVHPTPAPQPGVVNIPSVAQPVHPTYQP 14336
Cdd:PRK14951 433 AP--------AAAPAAAPAAVALA----PAPPAQAAPETVAIPVRVAPEPAVASAAPAPAAAPAAARLTP 490
|
|
| EGF_3 |
pfam12947 |
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes ... |
137-166 |
1.20e-04 |
|
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes the C-terminal domain of the malaria parasite MSP1 protein.
Pssm-ID: 463759 [Multi-domain] Cd Length: 36 Bit Score: 43.74 E-value: 1.20e-04
10 20 30
....*....|....*....|....*....|
gi 442625924 137 PCDVFAHCTNTLGSFTCTCFPGYRGNGFHC 166
Cdd:pfam12947 7 GCHPNATCTNTGGSFTCTCNDGYTGDGVTC 36
|
|
| PRK14950 |
PRK14950 |
DNA polymerase III subunits gamma and tau; Provisional |
14275-14375 |
1.49e-04 |
|
DNA polymerase III subunits gamma and tau; Provisional
Pssm-ID: 237864 [Multi-domain] Cd Length: 585 Bit Score: 49.81 E-value: 1.49e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14275 TSPSVIPHQPGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPA----PQPGVVNIPSVAQPV-HPTYQPPVVERPAIYDVYY 14349
Cdd:PRK14950 344 TSYGQLPLELAVIEALLVPVPAPQPAKPTAAAPSPVRPTPApstrPKAAAAANIPPKEPVrETATPPPVPPRPVAPPVPH 423
|
90 100
....*....|....*....|....*..
gi 442625924 14350 PPPPSRPGV-INIPSPPRPVYPVPQQP 14375
Cdd:PRK14950 424 TPESAPKLTrAAIPVDEKPKYTPPAPP 450
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
212-247 |
1.63e-04 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 43.39 E-value: 1.63e-04
10 20 30
....*....|....*....|....*....|....*.
gi 442625924 212 DVDECRNPENCGPNALCTNTPGNYTCSCPDGYVGNN 247
Cdd:cd00054 1 DIDECASGNPCQNGGTCVNTVGSYRCSCPPGYTGRN 36
|
|
| PRK14951 |
PRK14951 |
DNA polymerase III subunits gamma and tau; Provisional |
14225-14373 |
1.63e-04 |
|
DNA polymerase III subunits gamma and tau; Provisional
Pssm-ID: 237865 [Multi-domain] Cd Length: 618 Bit Score: 49.71 E-value: 1.63e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14225 PVAPTPQSPIyiPSQEQPKPTTRPSVINVPSVPQPAYPTPQAPVYDVNYPTSPSVIPHQPGVVNIPSVPLPAPPVKQRPV 14304
Cdd:PRK14951 366 PAAAAEAAAP--AEKKTPARPEAAAPAAAPVAQAAAAPAPAAAPAAAASAPAAPPAAAPPAPVAAPAAAAPAAAPAAAPA 443
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 442625924 14305 FVPSPVHPTPAPQPGVVNIPSVAQPvhptyQPPVVERPAiydvyyPPPPSRPGVINIPSPPRPVY--PVPQ 14373
Cdd:PRK14951 444 AVALAPAPPAQAAPETVAIPVRVAP-----EPAVASAAP------APAAAPAAARLTPTEEGDVWhaTVQQ 503
|
|
| Zona_pellucida |
pfam00100 |
Zona pellucida-like domain; |
17722-17947 |
1.85e-04 |
|
Zona pellucida-like domain;
Pssm-ID: 459673 [Multi-domain] Cd Length: 254 Bit Score: 48.37 E-value: 1.85e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 17722 CLADGVQVEIHITEPGFNGVLY--VKGHSKDEECRRVVNLAGETVprtEIFRVHFGSCG--MQAVKDVA--SFVLVIQKH 17795
Cdd:pfam00100 1 CTPDTMTVSISKCLLVPSGLLSslSLLGGLDPSCKPVSNTNGSPA---VLFEFPLTGCGttVQVNGTHIiySNTLYSSTD 77
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 17796 PKLVTYK---AQAYNIKCVYQTGEkNVTLGFNVSMLTTAGTIANTGPPPIcQMRIITNE------GEEINSAEIGDNLKL 17866
Cdd:pfam00100 78 LRSGIIRrtiTRRLPFSCSYPRSS-LVSLLVVAPPSPVPITVSGSGVFLV-SMDLYYDSsytspySPYPVTVLLGDPLYV 155
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 17867 QVDVEPAT--IYGGFARSCIAkTMEDNVQNEYLVTD-ENGCATDTSIFGNWEYNPDTNSLLA--SFNAFKF--PSSDNIR 17939
Cdd:pfam00100 156 EVSLLSRTdpNLVLVLDNCWA-TPSPNPTSSPQYQLiVNGCPNDGDSTYPVSSLSNGPSHYVrfSFKAFRFvgSSISQVY 234
|
....*...
gi 442625924 17940 FQCNIRVC 17947
Cdd:pfam00100 235 LHCSVSVC 242
|
|
| PRK14971 |
PRK14971 |
DNA polymerase III subunit gamma/tau; |
14149-14277 |
2.03e-04 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237874 [Multi-domain] Cd Length: 614 Bit Score: 49.77 E-value: 2.03e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14149 PSVTHPeyptsqVPVYDVNYSTTPSPIPQKPGVVNIPSAPQPVHPApnppvhefnYPTPPAVPQQPGvlnIPS-YPTPVA 14227
Cdd:PRK14971 381 PVFTQP------AAAPQPSAAAAASPSPSQSSAAAQPSAPQSATQP---------AGTPPTVSVDPP---AAVpVNPPST 442
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|
gi 442625924 14228 PTPQSPIYIPSQEQPKPTTRPSVInVPSVPQPAYPTPQAPvyDVNYPTSP 14277
Cdd:PRK14971 443 APQAVRPAQFKEEKKIPVSKVSSL-GPSTLRPIQEKAEQA--TGNIKEAP 489
|
|
| rne |
PRK10811 |
ribonuclease E; Reviewed |
14190-14390 |
3.27e-04 |
|
ribonuclease E; Reviewed
Pssm-ID: 236766 [Multi-domain] Cd Length: 1068 Bit Score: 49.27 E-value: 3.27e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14190 PVHPAPNPPVHEfnYPTPPAVPQQPGVLNIPSYPTPVAPTPQSPIYIPS--QEQPKPTTRPSVINVPSVPQPAYPTPQAP 14267
Cdd:PRK10811 846 PVVRPQDVQVEE--QREAEEVQVQPVVAEVPVAAAVEPVVSAPVVEAVAevVEEPVVVAEPQPEEVVVVETTHPEVIAAP 923
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14268 VYDVNYPTSPSVIPHQPGVVNIPsVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVNIPsVAQPVHPTyQPPVVERPAIYDV 14347
Cdd:PRK10811 924 VTEQPQVITESDVAVAQEVAEHA-EPVVEPQDETADIEEAAETAEVVVAEPEVVAQP-AAPVVAEV-AAEVETVTAVEPE 1000
|
170 180 190 200
....*....|....*....|....*....|....*....|....
gi 442625924 14348 YYPPPPSRPGVIN-IPSPPRPVYPVPQqpiYVPAPVLHIPAPRP 14390
Cdd:PRK10811 1001 VAPAQVPEATVEHnHATAPMTRAPAPE---YVPEAPRHSDWQRP 1041
|
|
| PRK14950 |
PRK14950 |
DNA polymerase III subunits gamma and tau; Provisional |
13993-14106 |
3.33e-04 |
|
DNA polymerase III subunits gamma and tau; Provisional
Pssm-ID: 237864 [Multi-domain] Cd Length: 585 Bit Score: 49.04 E-value: 3.33e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13993 DVNYPTTPVSQHPGVVNIPSApRLVPPTSQRPVFITSPGNLSPTPQPGVINIPSVSQPGYPTPQSPIYDANYPTTQSPIP 14072
Cdd:PRK14950 338 DFQLRTTSYGQLPLELAVIEA-LLVPVPAPQPAKPTAAAPSPVRPTPAPSTRPKAAAAANIPPKEPVRETATPPPVPPRP 416
|
90 100 110
....*....|....*....|....*....|....
gi 442625924 14073 QQPGVVNIPSVPSPSYPAPNPPVNYPTQPSPQIP 14106
Cdd:PRK14950 417 VAPPVPHTPESAPKLTRAAIPVDEKPKYTPPAPP 450
|
|
| PRK11633 |
PRK11633 |
cell division protein DedD; Provisional |
14490-14594 |
3.48e-04 |
|
cell division protein DedD; Provisional
Pssm-ID: 236940 [Multi-domain] Cd Length: 226 Bit Score: 47.30 E-value: 3.48e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14490 IPTAPSPG----IINIPSVPQPLPS-PTPGVINIPQQPTPPPLVQQPGII---NIPSVQQPsTPTTQHPIQDVQyetqRP 14561
Cdd:PRK11633 41 IPLVPKPGdrdePDMMPAATQALPTqPPEGAAEAVRAGDAAAPSLDPATVappNTPVEPEP-APVEPPKPKPVE----KP 115
|
90 100 110
....*....|....*....|....*....|...
gi 442625924 14562 QPTPGVINIPSVSQPTYPTQKPSYQDTSYPTVQ 14594
Cdd:PRK11633 116 KPKPKPQQKVEAPPAPKPEPKPVVEEKAAPTGK 148
|
|
| PRK12323 |
PRK12323 |
DNA polymerase III subunit gamma/tau; |
14157-14336 |
3.72e-04 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237057 [Multi-domain] Cd Length: 700 Bit Score: 48.72 E-value: 3.72e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14157 PTSQVPVYDVNYSTTPSPIPQKPGVVNIPSAPQPVHPAPNPPVhefnyPTPPAVPQQPGVLNIPSYPTPVAPTPQSPiyI 14236
Cdd:PRK12323 387 PAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPA-----PEALAAARQASARGPGGAPAPAPAPAAAP--A 459
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14237 PSQEQPKPTTRPSVINVPSVPQPAYPTPQAPVYDVNYP---------TSPSVIPHQPGVVNIPSVPLPAPPVKQrpvfvP 14307
Cdd:PRK12323 460 AAARPAAAGPRPVAAAAAAAPARAAPAAAPAPADDDPPpweelppefASPAPAQPDAAPAGWVAESIPDPATAD-----P 534
|
170 180
....*....|....*....|....*....
gi 442625924 14308 SPVHPTPAPQPGVVNIPSVAQPVHPTYQP 14336
Cdd:PRK12323 535 DDAFETLAPAPAAAPAPRAAAATEPVVAP 563
|
|
| dnaA |
PRK14086 |
chromosomal replication initiator protein DnaA; |
13941-14127 |
3.81e-04 |
|
chromosomal replication initiator protein DnaA;
Pssm-ID: 237605 [Multi-domain] Cd Length: 617 Bit Score: 48.67 E-value: 3.81e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13941 PSAPQPIYPTPQSPQYNVNYPSP------QPANPQKPGVVNIPSVP--QPVYPSPQPPVYDVNYPTTPVSQHpgvvniPS 14012
Cdd:PRK14086 95 PAPPPPHARRTSEPELPRPGRRPyegyggPRADDRPPGLPRQDQLPtaRPAYPAYQQRPEPGAWPRAADDYG------WQ 168
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14013 APRLVPPTSQRPvfiTSPGNLSPTP----QPGVINIPSVSQPgYPTPQSPIYDANYP---TTQSPIPqQPGVVNIPSVPS 14085
Cdd:PRK14086 169 QQRLGFPPRAPY---ASPASYAPEQerdrEPYDAGRPEYDQR-RRDYDHPRPDWDRPrrdRTDRPEP-PPGAGHVHRGGP 243
|
170 180 190 200
....*....|....*....|....*....|....*....|..
gi 442625924 14086 PSYPAPNPPVNYPTQPSPQIPvqpgviniPSAPLPTTPPQHP 14127
Cdd:PRK14086 244 GPPERDDAPVVPIRPSAPGPL--------AAQPAPAPGPGEP 277
|
|
| PRK14950 |
PRK14950 |
DNA polymerase III subunits gamma and tau; Provisional |
14434-14531 |
4.97e-04 |
|
DNA polymerase III subunits gamma and tau; Provisional
Pssm-ID: 237864 [Multi-domain] Cd Length: 585 Bit Score: 48.27 E-value: 4.97e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14434 PQPVSTPTSgviniPSQASPPISVPTPGIVNIPSIPQPTPQRPSPGiinVPSVPQPIPTAPSPGIINIPSVPQPLPSPTP 14513
Cdd:PRK14950 362 PVPAPQPAK-----PTAAAPSPVRPTPAPSTRPKAAAAANIPPKEP---VRETATPPPVPPRPVAPPVPHTPESAPKLTR 433
|
90
....*....|....*...
gi 442625924 14514 GVINIPQQPTPPPLVQQP 14531
Cdd:PRK14950 434 AAIPVDEKPKYTPPAPPK 451
|
|
| Not5 |
COG5665 |
CCR4-NOT transcriptional regulation complex, NOT5 subunit [Transcription]; |
14222-14603 |
5.06e-04 |
|
CCR4-NOT transcriptional regulation complex, NOT5 subunit [Transcription];
Pssm-ID: 444384 [Multi-domain] Cd Length: 874 Bit Score: 48.51 E-value: 5.06e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14222 YPTPVAPTPQSPIYIPSQEQPKPTTRPSVINVP----------SVPQPAYPTP----QAPVYDVNY---------PTSPS 14278
Cdd:COG5665 165 ASNPVAVVVTTMIAVPSAPAAPPNAVDYSVLVPiaaqdpaasvSTPQAFNASAtsgrSQHIVQAAKrvgvewwgdPSLLA 244
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14279 VIPHQPGVVNIPSVP----LPAPPVKQRPvfvPSPVHPTPAPQPGVVnipSVAQPVHPTYQPPVVERPAIYDVYYPPPPS 14354
Cdd:COG5665 245 TPPATPATEEKSSQQpksqPTSPSGGTTP---PSTNQLTTSNTPTST---AKAQPQPPTKKQPAKEPPSDTASGNPSAPS 318
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14355 RPGVINIPSPPRPV---YPVPQ------QPIYVPAPvlhiPAPRPVIHNIPSVP-QPTyphrnPPIQDVTypapqpsppv 14424
Cdd:COG5665 319 VLINSDSPTSEDPAtasVPTTEettaftTPSSVPST----PAEKDTPATDLATPvSPT-----PPETSVD---------- 379
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14425 pgivnipslpqPVSTPTSGVINIPSQASP--PISVPTPGIVNIPSIPqPTPQRPSPGI----INVPSVPQPIP------- 14491
Cdd:COG5665 380 -----------KKVSPDSATSSTKSEKEGgtASSPMPPNIAIGAKDD-VDATDPSQEAkeytKNAPMTPEADSapessvr 447
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14492 TAPSPGIINIPSV---------PQPLPSPTPGVInIPQQPTPPPLVQQPGIINIPSVQQPSTPTTQHpiQDVQYETQRPQ 14562
Cdd:COG5665 448 TEASPSAGSDLEPenttlrdpaPNAIPPPEDPST-IGRLSSGDKLANETGPPVIRRDSTPSSTADQS--IVGVLAFGLDQ 524
|
410 420 430 440
....*....|....*....|....*....|....*....|.
gi 442625924 14563 PTPGVINIPSVSqpTYPTQKPSYQDTSYPTVQPKPPVSGII 14603
Cdd:COG5665 525 RTQAEISVEAAS--RSNPLLNSQVKSFPLGKRSEGAKGKTQ 563
|
|
| PRK14951 |
PRK14951 |
DNA polymerase III subunits gamma and tau; Provisional |
14170-14285 |
5.68e-04 |
|
DNA polymerase III subunits gamma and tau; Provisional
Pssm-ID: 237865 [Multi-domain] Cd Length: 618 Bit Score: 48.17 E-value: 5.68e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14170 TTPSPIPQKPGVVniPSAPQPVHPAPNPPvhefnyPTPPAVPQQPGVLNIPSYPTPVAPTPQSPIYIPSQEQPKPTTRPS 14249
Cdd:PRK14951 387 AAPAAAPVAQAAA--APAPAAAPAAAASA------PAAPPAAAPPAPVAAPAAAAPAAAPAAAPAAVALAPAPPAQAAPE 458
|
90 100 110
....*....|....*....|....*....|....*.
gi 442625924 14250 VINVPSVPQPAYPTPQAPVYDVNYPTSPSVIPHQPG 14285
Cdd:PRK14951 459 TVAIPVRVAPEPAVASAAPAPAAAPAAARLTPTEEG 494
|
|
| GGN |
pfam15685 |
Gametogenetin; GGN is a family of proteins largely found in mammals. It reacts with POG in the ... |
13963-14103 |
6.24e-04 |
|
Gametogenetin; GGN is a family of proteins largely found in mammals. It reacts with POG in the maturation of sperm and is expressed virtually only in the testis. It is found to be associated with the intracellular membrane, binds with GGNBP1 and may be involved in vesicular trafficking.
Pssm-ID: 434857 [Multi-domain] Cd Length: 668 Bit Score: 48.22 E-value: 6.24e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13963 PQPANPQKPGVVNipSVPQPVYPSPQ---PPVYDVNYPTTPVSQHPGvvniPSAPRLVPPTSQRPVFITSPGNL-SPTPQ 14038
Cdd:pfam15685 389 PWGSPPPPPGKAH--PIPGPRRPAPAllaPPMFIFPAPTNGEPVRPG----PPAPQALLPRPPPPTPPATPPPVpPPIPQ 462
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 442625924 14039 -PGVINIP-SVSQPGYPTPQS-PIYDANYPTTQSPIP-----QQPGVVNIPSVPSPSyPAPNPPVNYPTQPSP 14103
Cdd:pfam15685 463 lPALQPMPlAAARPPTPRPCPgHGESALAPAPTAPLPpalaaDQAPAPALAAAPAPS-PAPAPATADPLPPAP 534
|
|
| PRK11633 |
PRK11633 |
cell division protein DedD; Provisional |
14174-14268 |
6.61e-04 |
|
cell division protein DedD; Provisional
Pssm-ID: 236940 [Multi-domain] Cd Length: 226 Bit Score: 46.15 E-value: 6.61e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14174 PIPQKPGVVN----IPSAPQPVhPAPNPP--VHEFNYPTPPAVPQQPGVLNIPSYPTPVAPTPQSPIYIPSQEQPKPTTR 14247
Cdd:PRK11633 42 PLVPKPGDRDepdmMPAATQAL-PTQPPEgaAEAVRAGDAAAPSLDPATVAPPNTPVEPEPAPVEPPKPKPVEKPKPKPK 120
|
90 100
....*....|....*....|....*.
gi 442625924 14248 PSVINVPSV-----PQPAYPTPQAPV 14268
Cdd:PRK11633 121 PQQKVEAPPapkpePKPVVEEKAAPT 146
|
|
| PRK07764 |
PRK07764 |
DNA polymerase III subunits gamma and tau; Validated |
13903-14319 |
7.01e-04 |
|
DNA polymerase III subunits gamma and tau; Validated
Pssm-ID: 236090 [Multi-domain] Cd Length: 824 Bit Score: 48.06 E-value: 7.01e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13903 ETPKPVRPQIYDTPSPPyPVAIPdlvyVQQQQPgivniPSAPQPIYPTPQSPQYNVNYPSPQPANPQKPGVVniPSVPQP 13982
Cdd:PRK07764 410 PAPAAAAPAAAAAPAPA-AAPQP----APAPAP-----APAPPSPAGNAPAGGAPSPPPAAAPSAQPAPAPA--AAPEPT 477
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13983 VYPSPQPPVYDVNYPTTPVSQHPGVVNIPSA--------PRLVPPTSQRPVFITspGNLSPTPQPG-------VINIPS- 14046
Cdd:PRK07764 478 AAPAPAPPAAPAPAAAPAAPAAPAAPAGADDaatlrerwPEILAAVPKRSRKTW--AILLPEATVLgvrgdtlVLGFSTg 555
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14047 -----VSQPGYPTPqspIYDAnypttqspIPQQPGV---VNIPSVPSPSYPAPNPPVNYPTQPSPQIPVQPGViniPSAP 14118
Cdd:PRK07764 556 glarrFASPGNAEV---LVTA--------LAEELGGdwqVEAVVGPAPGAAGGEGPPAPASSGPPEEAARPAA---PAAP 621
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14119 LPTTPPQHPPvfipspeSPSPAPKPGVINIPSVTHPEYPTSQVPVYDVNYSTTPSPIP-QKPGVVNIPSAPQPVHPAPNP 14197
Cdd:PRK07764 622 AAPAAPAPAG-------AAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWPAKaGGAAPAAPPPAPAPAAPAAPA 694
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14198 PVhefnyPTPPAVPQQPgvlnipsyPTPVAPTPQSPIYIPSQeQPKPTTRPSVINVPSVPQPAYPTPQAPVYDVNYPtsp 14277
Cdd:PRK07764 695 GA-----APAQPAPAPA--------ATPPAGQADDPAAQPPQ-AAQGASAPSPAADDPVPLPPEPDDPPDPAGAPAQ--- 757
|
410 420 430 440
....*....|....*....|....*....|....*....|..
gi 442625924 14278 sviPHQPGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPAPQPG 14319
Cdd:PRK07764 758 ---PPPPPAPAPAAAPAAAPPPSPPSEEEEMAEDDAPSMDDE 796
|
|
| PRK07764 |
PRK07764 |
DNA polymerase III subunits gamma and tau; Validated |
14169-14531 |
7.19e-04 |
|
DNA polymerase III subunits gamma and tau; Validated
Pssm-ID: 236090 [Multi-domain] Cd Length: 824 Bit Score: 48.06 E-value: 7.19e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14169 STTPSPIPQKPGVVNIPSAPQPVHPAPNPPVHefnyPTPPAVPQQPGVLNIPSYPTPVAPTPQspiyipsqeqPKPTTRP 14248
Cdd:PRK07764 401 AAAAAPAAAPAPAAAAPAAAAAPAPAAAPQPA----PAPAPAPAPPSPAGNAPAGGAPSPPPA----------AAPSAQP 466
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14249 SVINVPSVPQPAYPTPQAPVydvnyPTSPSVIPHQPGVVNIPSVPLPAPPVKQR----------------PVFVPSPV-- 14310
Cdd:PRK07764 467 APAPAAAPEPTAAPAPAPPA-----APAPAAAPAAPAAPAAPAGADDAATLRERwpeilaavpkrsrktwAILLPEATvl 541
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14311 ----------HPTPA-----PQPGVVNI--PSVAQPVHPTYQPPVV-----------ERPAIYDVYYPPPPSRPgviniP 14362
Cdd:PRK07764 542 gvrgdtlvlgFSTGGlarrfASPGNAEVlvTALAEELGGDWQVEAVvgpapgaaggeGPPAPASSGPPEEAARP-----A 616
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14363 SPPRPVYPVPQQPIYVPAPvlhiPAPRPViHNIPSVPQPTYPHRNPPIQDVTYPAPQPSPPVPGIVniPSLPQPVSTPTS 14442
Cdd:PRK07764 617 APAAPAAPAAPAPAGAAAA----PAEASA-APAPGVAAPEHHPKHVAVPDASDGGDGWPAKAGGAA--PAAPPPAPAPAA 689
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14443 GVINIPSQASPPISVPTPGIVNIPSiPQPTPQRPSPGIINVPSVP-----QPIPTAPSPGiiNIPSVPQPLPSPTPGVIN 14517
Cdd:PRK07764 690 PAAPAGAAPAQPAPAPAATPPAGQA-DDPAAQPPQAAQGASAPSPaaddpVPLPPEPDDP--PDPAGAPAQPPPPPAPAP 766
|
410
....*....|....
gi 442625924 14518 IPQQPTPPPLVQQP 14531
Cdd:PRK07764 767 AAAPAAAPPPSPPS 780
|
|
| EGF_CA |
smart00179 |
Calcium-binding EGF-like domain; |
212-243 |
7.50e-04 |
|
Calcium-binding EGF-like domain;
Pssm-ID: 214542 [Multi-domain] Cd Length: 39 Bit Score: 41.46 E-value: 7.50e-04
10 20 30
....*....|....*....|....*....|..
gi 442625924 212 DVDECRNPENCGPNALCTNTPGNYTCSCPDGY 243
Cdd:smart00179 1 DIDECASGNPCQNGGTCVNTVGSYRCECPPGY 32
|
|
| EGF_3 |
pfam12947 |
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes ... |
218-246 |
8.66e-04 |
|
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes the C-terminal domain of the malaria parasite MSP1 protein.
Pssm-ID: 463759 [Multi-domain] Cd Length: 36 Bit Score: 41.43 E-value: 8.66e-04
10 20
....*....|....*....|....*....
gi 442625924 218 NPENCGPNALCTNTPGNYTCSCPDGYVGN 246
Cdd:pfam12947 4 NNGGCHPNATCTNTGGSFTCTCNDGYTGD 32
|
|
| PRK14951 |
PRK14951 |
DNA polymerase III subunits gamma and tau; Provisional |
14440-14565 |
9.06e-04 |
|
DNA polymerase III subunits gamma and tau; Provisional
Pssm-ID: 237865 [Multi-domain] Cd Length: 618 Bit Score: 47.40 E-value: 9.06e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14440 PTSGVIN-IPSQASPPISVPTPGIVNIPSIPQPTPQRPSPGIINVPSVPQPIPTAPSPGIINIPSVPQPLPSPTPGVINI 14518
Cdd:PRK14951 366 PAAAAEAaAPAEKKTPARPEAAAPAAAPVAQAAAAPAPAAAPAAAASAPAAPPAAAPPAPVAAPAAAAPAAAPAAAPAAV 445
|
90 100 110 120
....*....|....*....|....*....|....*....|....*..
gi 442625924 14519 PQQPtPPPLVQQPGIINIPSVQQPSTPTTQHPiqdvQYETQRPQPTP 14565
Cdd:PRK14951 446 ALAP-APPAQAAPETVAIPVRVAPEPAVASAA----PAPAAAPAAAR 487
|
|
| PRK12727 |
PRK12727 |
flagellar biosynthesis protein FlhF; |
14240-14454 |
1.10e-03 |
|
flagellar biosynthesis protein FlhF;
Pssm-ID: 237182 [Multi-domain] Cd Length: 559 Bit Score: 47.29 E-value: 1.10e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14240 EQPKPTTRPSVINVPSVPQPAYPTPqAPVYDVNYPTSPSVIPHQPGVVN-----IPSVPLPAPPVKQRPVFVPSPVHPTP 14314
Cdd:PRK12727 56 ETARSDTPATAAAPAPAPQAPTKPA-APVHAPLKLSANANMSQRQRVASaaedmIAAMALRQPVSVPRQAPAAAPVRAAS 134
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14315 APQPG-VVNIPSVAQPVHPTYQPPVVERPAiyDVYYPPPPSRPgvinIPSPPRPVYPVPqqpiyVPAPVLHIPAPRPVI- 14392
Cdd:PRK12727 135 IPSPAaQALAHAAAVRTAPRQEHALSAVPE--QLFADFLTTAP----VPRAPVQAPVVA-----APAPVPAIAAALAAHa 203
|
170 180 190 200 210 220
....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 442625924 14393 ----HNIPSVPQPTYPHRNPPIQdvtypapqpsppvpgIVNIPSLPQPVSTPTSGVINIPSQASPP 14454
Cdd:PRK12727 204 ayaqDDDEQLDDDGFDLDDALPQ---------------ILPPAALPPIVVAPAAPAALAAVAAAAP 254
|
|
| Amelogenin |
smart00818 |
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ... |
14465-14584 |
1.15e-03 |
|
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.
Pssm-ID: 197891 [Multi-domain] Cd Length: 165 Bit Score: 44.78 E-value: 1.15e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14465 IPSIPQPTPQRPSPGIINVPSVPQPIPTAPSPGIINIPsvPQPLPSPTPGvinipQQPTPPPLVQQPgiiniPSVQQPST 14544
Cdd:smart00818 40 IPVSQQHPPTHTLQPHHHIPVLPAQQPVVPQQPLMPVP--GQHSMTPTQH-----HQPNLPQPAQQP-----FQPQPLQP 107
|
90 100 110 120
....*....|....*....|....*....|....*....|
gi 442625924 14545 PTTQHPIQdvqyeTQRPQPTPGVINIPSVSQPTYPTQKPS 14584
Cdd:smart00818 108 PQPQQPMQ-----PQPPVHPIPPLPPQPPLPPMFPMQPLP 142
|
|
| EGF_CA |
smart00179 |
Calcium-binding EGF-like domain; |
1022-1056 |
1.18e-03 |
|
Calcium-binding EGF-like domain;
Pssm-ID: 214542 [Multi-domain] Cd Length: 39 Bit Score: 41.08 E-value: 1.18e-03
10 20 30
....*....|....*....|....*....|....*
gi 442625924 1022 DVDECEERGaqLCAFGAQCVNKPGSYSCHCPEGYQ 1056
Cdd:smart00179 1 DIDECASGN--PCQNGGTCVNTVGSYRCECPPGYT 33
|
|
| f2_encap_cargo1 |
NF041166 |
family 2A encapsulin nanocompartment cargo protein cysteine desulfurase; Capsid-like ... |
14451-14667 |
1.26e-03 |
|
family 2A encapsulin nanocompartment cargo protein cysteine desulfurase; Capsid-like encapsulin nanocompartments are commonly found in bacteria and archaea. Encapsulin nanocompartments, which are assembled from shell proteins, encapsulate various cargo proteins, typically peroxidases or ferritin-like proteins, to protect cells from oxidative stress caused by peroxide. Proteins of this family are cysteine desulfurases with an additional N-terminal encapsulation targeting sequence (~200 aa) that is necessary and sufficient for compartmentalization.
Pssm-ID: 469077 [Multi-domain] Cd Length: 623 Bit Score: 47.16 E-value: 1.26e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14451 ASPPISVPTPGI---VNIPSIPQPTPQRPSPGIINV-PSVPQ-PIPTAPSPGIINIPSVPQPLPSPTPGVinipqqPTPP 14525
Cdd:NF041166 33 SALPGEAPAPGLpaaPPAAPAPPGSNPAPAAGPGGLgAGVPGaALPQGLVPGANLLPSAPSPVGALGASA------PALA 106
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14526 PLVQQPgIINIPSVQQPSTPTTQHPIQDVQY-------ETQRPQPTPGVINIPSVSQPTYPTQKPSYQDTSyPTVQPKPP 14598
Cdd:NF041166 107 PHAAAG-NVGLPDAVVAVAPAEPRAGGAALPvglpqapVPAAPSAAAAPPDLVAPQAFGLPGEDAALRALL-PAASPAPP 184
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14599 VSgiiniPSVPQPVPS---LTPGVINLPSEPSYSAPIPKPG---IINVPSIPE--PIpsipqnpVQE-------VYHD-- 14661
Cdd:NF041166 185 SA-----PSAAAAESSyyfLDERAAPSPAAAPPGSPPALASahpPFDVNAVRRdfPI-------LQErvngkplVWFDna 252
|
....*...
gi 442625924 14662 --TQKPQA 14667
Cdd:NF041166 253 atTQKPQA 260
|
|
| PRK14951 |
PRK14951 |
DNA polymerase III subunits gamma and tau; Provisional |
14169-14275 |
1.32e-03 |
|
DNA polymerase III subunits gamma and tau; Provisional
Pssm-ID: 237865 [Multi-domain] Cd Length: 618 Bit Score: 47.02 E-value: 1.32e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14169 STTPSPIPQKPGVVNIPSAPQPvhPAPNPPVHEFNYPTPPAVPqQPGVLNIPSYPTPVAPTPQSPIYIPSQEQPKPTTRP 14248
Cdd:PRK14951 398 AAAPAPAAAPAAAASAPAAPPA--AAPPAPVAAPAAAAPAAAP-AAAPAAVALAPAPPAQAAPETVAIPVRVAPEPAVAS 474
|
90 100
....*....|....*....|....*....
gi 442625924 14249 SVINVPSVPQPA--YPTPQAPVYDVNYPT 14275
Cdd:PRK14951 475 AAPAPAAAPAAArlTPTEEGDVWHATVQQ 503
|
|
| PHA03379 |
PHA03379 |
EBNA-3A; Provisional |
14399-14690 |
1.39e-03 |
|
EBNA-3A; Provisional
Pssm-ID: 223066 [Multi-domain] Cd Length: 935 Bit Score: 46.98 E-value: 1.39e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14399 PQPTYPHRNPPIQdvtypapqpsppvpgiVNIPSLPQPVSTPTS-GVINIPSqaSPPISVPTPGIVNIPSIPQPTPQRPS 14477
Cdd:PHA03379 409 SEPTYGTPRPPVE----------------KPRPEVPQSLETATShGSAQVPE--PPPVHDLEPGPLHDQHSMAPCPVAQL 470
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14478 PgiinvpsvPQPIPTApSPG--IINIPSVPQPLPSPTPGVINIPQQPTPPPLVQQPGiinipsvqqpstpttqhpiqdVQ 14555
Cdd:PHA03379 471 P--------PGPLQDL-EPGdqLPGVVQDGRPACAPVPAPAGPIVRPWEASLSQVPG---------------------VA 520
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14556 YETQRPQPTPGviniPSVSQPTYPTQKPSYQDTSYPTVQPKPPVSGIINI-------PSVPQPVPSLTPgvINLPSEPSY 14628
Cdd:PHA03379 521 FAPVMPQPMPV----EPVPVPTVALERPVCPAPPLIAMQGPGETSGIVRVrerwrpaPWTPNPPRSPSQ--MSVRDRLAR 594
|
250 260 270 280 290 300
....*....|....*....|....*....|....*....|....*....|....*....|..
gi 442625924 14629 SAPIPKPGIINVPSIPepiPSIPQNPvqevyhdTQKPQAIPGVvnvpSAPQPTPGRPYYDVA 14690
Cdd:PHA03379 595 LRAEAQPYQASVEVQP---PQLTQVS-------PQQPMEYPLE----PEQQMFPGSPFSQVA 642
|
|
| PRK14950 |
PRK14950 |
DNA polymerase III subunits gamma and tau; Provisional |
14533-14635 |
1.45e-03 |
|
DNA polymerase III subunits gamma and tau; Provisional
Pssm-ID: 237864 [Multi-domain] Cd Length: 585 Bit Score: 46.73 E-value: 1.45e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14533 IINIPSVQQPSTPTTQHPiqdvqyetQRPQPTPGVINIPSVSQPTYPTQKPSYQDTSYPTVQPKPPVSGiiNIPSVPQPV 14612
Cdd:PRK14950 359 LLVPVPAPQPAKPTAAAP--------SPVRPTPAPSTRPKAAAAANIPPKEPVRETATPPPVPPRPVAP--PVPHTPESA 428
|
90 100
....*....|....*....|...
gi 442625924 14613 PSLTPGVINLPSEPSYSAPIPKP 14635
Cdd:PRK14950 429 PKLTRAAIPVDEKPKYTPPAPPK 451
|
|
| PRK07764 |
PRK07764 |
DNA polymerase III subunits gamma and tau; Validated |
14169-14347 |
1.47e-03 |
|
DNA polymerase III subunits gamma and tau; Validated
Pssm-ID: 236090 [Multi-domain] Cd Length: 824 Bit Score: 46.90 E-value: 1.47e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14169 STTPSPIPQKPGVVNIPSAPQPVHPAPNPPVHEFNY-----PTPPAVPQQPGVLNIPSYPTPVAPTPQSPIYIPSQEQPK 14243
Cdd:PRK07764 619 AAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPkhvavPDASDGGDGWPAKAGGAAPAAPPPAPAPAAPAAPAGAAP 698
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14244 PTTRPSVINVPSVPQPAYPTPQAPVydvnyPTSPSVIPHQPGVVNIPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGvvni 14323
Cdd:PRK07764 699 AQPAPAPAATPPAGQADDPAAQPPQ-----AAQGASAPSPAADDPVPLPPEPDDPPDPAGAPAQPPPPPAPAPAAA---- 769
|
170 180
....*....|....*....|....
gi 442625924 14324 PSVAQPVHPTYQPPVVERPAIYDV 14347
Cdd:PRK07764 770 PAAAPPPSPPSEEEEMAEDDAPSM 793
|
|
| PRK14971 |
PRK14971 |
DNA polymerase III subunit gamma/tau; |
14243-14375 |
1.62e-03 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237874 [Multi-domain] Cd Length: 614 Bit Score: 46.69 E-value: 1.62e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14243 KPTTRPSVINVPSVPQPAyPTPQAPVydvnyPTSPSVIPHQPGVVNIPSVPLPAPPvkqrpvfvpspvhPTPAPQPGVVN 14322
Cdd:PRK14971 370 SGGRGPKQHIKPVFTQPA-AAPQPSA-----AAAASPSPSQSSAAAQPSAPQSATQ-------------PAGTPPTVSVD 430
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|...
gi 442625924 14323 IPSvAQPVHPTYQPPVVERPAIYDVYYPPPPSRPGVINIPSpPRPVYPVPQQP 14375
Cdd:PRK14971 431 PPA-AVPVNPPSTAPQAVRPAQFKEEKKIPVSKVSSLGPST-LRPIQEKAEQA 481
|
|
| PRK14951 |
PRK14951 |
DNA polymerase III subunits gamma and tau; Provisional |
14366-14497 |
1.63e-03 |
|
DNA polymerase III subunits gamma and tau; Provisional
Pssm-ID: 237865 [Multi-domain] Cd Length: 618 Bit Score: 46.63 E-value: 1.63e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14366 RPVYPVPQQPIYVPAPVLHIPAPRPVIHNIPSVPQPTYPHRNPPIQDVTyPAPQPSPPVPGIVNIPSLPQPVSTPTSGVI 14445
Cdd:PRK14951 365 KPAAAAEAAAPAEKKTPARPEAAAPAAAPVAQAAAAPAPAAAPAAAASA-PAAPPAAAPPAPVAAPAAAAPAAAPAAAPA 443
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|..
gi 442625924 14446 NIPSQASPPIsVPTPGIVNIPSIPQPTPQRPSPGiinVPSVPQPIPTAPSPG 14497
Cdd:PRK14951 444 AVALAPAPPA-QAAPETVAIPVRVAPEPAVASAA---PAPAAAPAAARLTPT 491
|
|
| PRK14971 |
PRK14971 |
DNA polymerase III subunit gamma/tau; |
14207-14343 |
1.72e-03 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237874 [Multi-domain] Cd Length: 614 Bit Score: 46.69 E-value: 1.72e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14207 PPAVPQQPgvLNiPSYPTPVAPTPQSPIYIPSQEQPKPTTRPSVINVPSVPQPAyptpqapvydvnyPTSPSVIPHQPGV 14286
Cdd:PRK14971 371 GGRGPKQH--IK-PVFTQPAAAPQPSAAAAASPSPSQSSAAAQPSAPQSATQPA-------------GTPPTVSVDPPAA 434
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|....*..
gi 442625924 14287 VniPSVPLPAPPVKQRPVFVPSPVHPTPAPQPGVVniPSVAQPVHPTYQPPVVERPA 14343
Cdd:PRK14971 435 V--PVNPPSTAPQAVRPAQFKEEKKIPVSKVSSLG--PSTLRPIQEKAEQATGNIKE 487
|
|
| PRK07003 |
PRK07003 |
DNA polymerase III subunit gamma/tau; |
14005-14312 |
1.79e-03 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 235906 [Multi-domain] Cd Length: 830 Bit Score: 46.77 E-value: 1.79e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14005 PGVVNIPSAPRLVPPTSQRP----VFITSPGNLSPTPQPGVINIPSVSQPGYPTPQSPIYDANYPTTQSPIPQQPGVVNI 14080
Cdd:PRK07003 368 PGGGVPARVAGAVPAPGARAaaavGASAVPAVTAVTGAAGAALAPKAAAAAAATRAEAPPAAPAPPATADRGDDAADGDA 447
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14081 PSVPSPSYPAPNPPVNYPTQPSPQIPVQPGVINIPSAPLPTTPPQHPPVFIPSPESPSPAPKPGVINIPS-VTHPEYPTS 14159
Cdd:PRK07003 448 PVPAKANARASADSRCDERDAQPPADSGSASAPASDAPPDAAFEPAPRAAAPSAATPAAVPDARAPAAASrEDAPAAAAP 527
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14160 QVPvydvnYSTTPSPIPQKP-----------------------------GVVNIPSAPQPVHPAPNPPVHEFNYPTPPAV 14210
Cdd:PRK07003 528 PAP-----EARPPTPAAAAPaaraggaaaaldvlrnagmrvssdrgaraAAAAKPAAAPAAAPKPAAPRVAVQVPTPRAR 602
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14211 PQQPGVLNIPSYPTP-VAPTPQSPiyiPSQEQPKPTTRpsvinVPSVPQPAYPTPQ---APVYDVNyPTSPSVIPhqpgv 14286
Cdd:PRK07003 603 AATGDAPPNGAARAEqAAESRGAP---PPWEDIPPDDY-----VPLSADEGFGGPDdgfVPVFDSG-PDDVRVAP----- 668
|
330 340
....*....|....*....|....*.
gi 442625924 14287 vniPSVPLPAPPVKQRPVFVPSPVHP 14312
Cdd:PRK07003 669 ---KPADAPAPPVDTRPLPPAIPLDA 691
|
|
| KLF3_N |
cd21577 |
N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called ... |
14431-14614 |
2.00e-03 |
|
N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called Krueppel-like factor 3 and originally called Basic Kruppel-like Factor/BKLF) was the third member of the KLF family of zinc finger transcription factors to be discovered. KLF3 possesses a wide range of biological impacts on regulating apoptosis, differentiation, and proliferation in various tissues during the entire progression process. It has been proposed as a tumor suppressor in colorectal cancer. It appears to function predominantly as a repressor of transcription, turning genes off by recruiting the C-terminal Binding Protein co-repressors CtBP1 and CtBP2. CtBP docks onto a short motif (residues 61-65) in the N-terminus of KLF3, through the Proline-X-Aspartate-Leucine-Serine (PXDLS) motif. CtBP in turn recruits histone modifying enzymes to alter chromatin and repress gene expression. KLF3 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF3.
Pssm-ID: 410554 [Multi-domain] Cd Length: 214 Bit Score: 44.64 E-value: 2.00e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14431 PSLPQPVSTPTSGvinIPSQASPPISVPTPgivnIPSIPQPTPQRPSPGIINVPSVPQPIPTAPSPGIinipSVPQPLPS 14510
Cdd:cd21577 56 PSPYSKSSPPSPP---QQRPLSPPLSLPPP----VAPPPLSPGSVPGGLPVISPVMVQPVPVLYPPHL----HQPIMVSS 124
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14511 PTPGVINIPQQPTPPPlvqqpgiinIPSVQQPSTPTTQHPI-----QDVQYETQRPQPTPGVINIPsvsqptyptqkPSY 14585
Cdd:cd21577 125 SPPPDDDHHHHKASSM---------KPSELGGDNHELHKPIkteprPEHAQDPYSEEMSSSVISSP-----------PEY 184
|
170 180
....*....|....*....|....*....
gi 442625924 14586 QDTSyPTVqpkppvsgIINIPSVPQPVPS 14614
Cdd:cd21577 185 ESNT-PSV--------IVHPGKRPLPVES 204
|
|
| KLF3_N |
cd21577 |
N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called ... |
14081-14302 |
2.05e-03 |
|
N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called Krueppel-like factor 3 and originally called Basic Kruppel-like Factor/BKLF) was the third member of the KLF family of zinc finger transcription factors to be discovered. KLF3 possesses a wide range of biological impacts on regulating apoptosis, differentiation, and proliferation in various tissues during the entire progression process. It has been proposed as a tumor suppressor in colorectal cancer. It appears to function predominantly as a repressor of transcription, turning genes off by recruiting the C-terminal Binding Protein co-repressors CtBP1 and CtBP2. CtBP docks onto a short motif (residues 61-65) in the N-terminus of KLF3, through the Proline-X-Aspartate-Leucine-Serine (PXDLS) motif. CtBP in turn recruits histone modifying enzymes to alter chromatin and repress gene expression. KLF3 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF3.
Pssm-ID: 410554 [Multi-domain] Cd Length: 214 Bit Score: 44.64 E-value: 2.05e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14081 PSVPSPSyPAPNPPVNYPTQPSPQI-PVQPgviniPSAPLPTTPPQHPPvfipspespspapkpgviniPSVTHPeYPTS 14159
Cdd:cd21577 30 SSPPSSS-SSSSSSSSSSSSPSSRAsPPSP-----YSKSSPPSPPQQRP--------------------LSPPLS-LPPP 82
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14160 QVPVYDVnYSTTPSPIPQKPGVVnipsaPQPVHPAPNPPVHEF-NYPTPPAVPQQPGVLNIPSYPTPV----APTPQSPI 14234
Cdd:cd21577 83 VAPPPLS-PGSVPGGLPVISPVM-----VQPVPVLYPPHLHQPiMVSSSPPPDDDHHHHKASSMKPSElggdNHELHKPI 156
|
170 180 190 200 210 220 230
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 442625924 14235 YI-PSQEQPKPTTR----PSVINVPsvpqpayptpqaPVYDVNyptSPSVIphqpgvVNIPSVPLPA---PPVKQR 14302
Cdd:cd21577 157 KTePRPEHAQDPYSeemsSSVISSP------------PEYESN---TPSVI------VHPGKRPLPVespDTLKKR 211
|
|
| PRK14950 |
PRK14950 |
DNA polymerase III subunits gamma and tau; Provisional |
14303-14393 |
2.07e-03 |
|
DNA polymerase III subunits gamma and tau; Provisional
Pssm-ID: 237864 [Multi-domain] Cd Length: 585 Bit Score: 46.34 E-value: 2.07e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14303 PVFVPSPVHPTPA------PQPGVVNIPSVAQPVHPTYQPPVVERPAiydvyYPPPPSRPGVINIPSPPRPVYPVPQQPI 14376
Cdd:PRK14950 362 PVPAPQPAKPTAAapspvrPTPAPSTRPKAAAAANIPPKEPVRETAT-----PPPVPPRPVAPPVPHTPESAPKLTRAAI 436
|
90
....*....|....*...
gi 442625924 14377 YVP-APVLHIPAPRPVIH 14393
Cdd:PRK14950 437 PVDeKPKYTPPAPPKEEE 454
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
14070-14278 |
2.19e-03 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 46.47 E-value: 2.19e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14070 PIPQQPGVVNIPSVPSPSYPAPNPPvnYPTQPSPQIPVQPGV--INIPSAPLPTTPPQHPPVFIPSPESPSPAPKPGVIN 14147
Cdd:PHA03247 258 PPVVGEGADRAPETARGATGPPPPP--EAAAPNGAAAPPDGVwgAALAGAPLALPAPPDPPPPAPAGDAEEEDDEDGAME 335
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14148 IPS-VTHPE------YPTSQVPVYdvnysTTPSPIPQ-KPGVVNIPSAPQPVHPAPNPPvhefNYPTPPAVPQQPGVLNI 14219
Cdd:PHA03247 336 VVSpLPRPRqhyplgFPKRRRPTW-----TPPSSLEDlSAGRHHPKRASLPTRKRRSAR----HAATPFARGPGGDDQTR 406
|
170 180 190 200 210
....*....|....*....|....*....|....*....|....*....|....*....
gi 442625924 14220 PSYPTPVAPTPQSPIYIPSQEQPKPTTRPSVINVPSVPQPAYPTPQAPVYDVNYPTSPS 14278
Cdd:PHA03247 407 PAAPVPASVPTPAPTPVPASAPPPPATPLPSAEPGSDDGPAPPPERQPPAPATEPAPDD 465
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
14172-14373 |
2.36e-03 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 46.47 E-value: 2.36e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14172 PSPIP-QKPGVVNIPSAPQPVHPAPNPPvhEFNYPTPPAVPQqPGV---------LNIPSYPTPVAPTPQ---------- 14231
Cdd:PHA03247 255 PAPPPvVGEGADRAPETARGATGPPPPP--EAAAPNGAAAPP-DGVwgaalagapLALPAPPDPPPPAPAgdaeeedded 331
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14232 ------SPIYIPSQEQP-------KPT-TRPSVINVPSV---PQPAYPTPQAPVYDVNYPTSPSVIPHQPGVVNIPSVPL 14294
Cdd:PHA03247 332 gamevvSPLPRPRQHYPlgfpkrrRPTwTPPSSLEDLSAgrhHPKRASLPTRKRRSARHAATPFARGPGGDDQTRPAAPV 411
|
170 180 190 200 210 220 230
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 442625924 14295 PAPPvkQRPVFVPSPVHPTPAPqpgvvnipsvAQPVhPTYQPPVVERPAIydvyyPPPPSRPGVINIPSPPRPVYPVPQ 14373
Cdd:PHA03247 412 PASV--PTPAPTPVPASAPPPP----------ATPL-PSAEPGSDDGPAP-----PPERQPPAPATEPAPDDPDDATRK 472
|
|
| PRK12323 |
PRK12323 |
DNA polymerase III subunit gamma/tau; |
13934-14128 |
2.58e-03 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237057 [Multi-domain] Cd Length: 700 Bit Score: 46.02 E-value: 2.58e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13934 QPGIVNIPSAPQPIYPTP--------QSPQYNVNYPSPQPANPQKPGVVNIPSVPQPVYPSPQPPVYDVNYPT--TPVSQ 14003
Cdd:PRK12323 364 RPGQSGGGAGPATAAAAPvaqpapaaAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAArqASARG 443
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14004 HPGVVNIPSAPRLVPPTSQRPVFITSPGNLSPTPQPgviniPSVSQP-GYPTPQS---PIYDANYPTTQSPIPQQ----P 14075
Cdd:PRK12323 444 PGGAPAPAPAPAAAPAAAARPAAAGPRPVAAAAAAA-----PARAAPaAAPAPADddpPPWEELPPEFASPAPAQpdaaP 518
|
170 180 190 200 210
....*....|....*....|....*....|....*....|....*....|...
gi 442625924 14076 GVVNIPSVPSPSYPAPNPPVNYPTQPSPQIPVQPgviniPSAPLPTTPPQHPP 14128
Cdd:PRK12323 519 AGWVAESIPDPATADPDDAFETLAPAPAAAPAPR-----AAAATEPVVAPRPP 566
|
|
| KLF3_N |
cd21577 |
N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called ... |
14350-14526 |
3.59e-03 |
|
N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called Krueppel-like factor 3 and originally called Basic Kruppel-like Factor/BKLF) was the third member of the KLF family of zinc finger transcription factors to be discovered. KLF3 possesses a wide range of biological impacts on regulating apoptosis, differentiation, and proliferation in various tissues during the entire progression process. It has been proposed as a tumor suppressor in colorectal cancer. It appears to function predominantly as a repressor of transcription, turning genes off by recruiting the C-terminal Binding Protein co-repressors CtBP1 and CtBP2. CtBP docks onto a short motif (residues 61-65) in the N-terminus of KLF3, through the Proline-X-Aspartate-Leucine-Serine (PXDLS) motif. CtBP in turn recruits histone modifying enzymes to alter chromatin and repress gene expression. KLF3 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF3.
Pssm-ID: 410554 [Multi-domain] Cd Length: 214 Bit Score: 43.87 E-value: 3.59e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14350 PPPPSrpgviniPSPPRPVYPVPQQPIYVPAPVLHIPAPRPvihNIPSVPQPTYPHRNPPIqDVTYPAPQPSPPVPGIVN 14429
Cdd:cd21577 32 PPSSS-------SSSSSSSSSSSSPSSRASPPSPYSKSSPP---SPPQQRPLSPPLSLPPP-VAPPPLSPGSVPGGLPVI 100
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14430 IPSLPQPVSTPTsgviniPSQASPPISVPTPgiVNIPSIPQPTPQRPSPGIINVPSVP---QPIPTAP------------ 14494
Cdd:cd21577 101 SPVMVQPVPVLY------PPHLHQPIMVSSS--PPPDDDHHHHKASSMKPSELGGDNHelhKPIKTEPrpehaqdpysee 172
|
170 180 190
....*....|....*....|....*....|....
gi 442625924 14495 --SPGIinipSVPQPLPSPTPGVINIPQQPTPPP 14526
Cdd:cd21577 173 msSSVI----SSPPEYESNTPSVIVHPGKRPLPV 202
|
|
| PRK12323 |
PRK12323 |
DNA polymerase III subunit gamma/tau; |
13998-14224 |
3.78e-03 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237057 [Multi-domain] Cd Length: 700 Bit Score: 45.64 E-value: 3.78e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13998 TTPVSQHPGVVNIPsAPRLVPPTSQRPVFITSPGNLSPTPQPGViniPSVSQPGYPTPQSPIYDANYPTTQSPIPQQPGV 14077
Cdd:PRK12323 372 AGPATAAAAPVAQP-APAAAAPAAAAPAPAAPPAAPAAAPAAAA---AARAVAAAPARRSPAPEALAAARQASARGPGGA 447
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14078 VNIPSVPSPSYPAPNPPVNYPTQPSPQIPVQPGVINIPSAPLPTTPPQHPPVFIPSPESPSPApkpgviniPSVTHPEYP 14157
Cdd:PRK12323 448 PAPAPAPAAAPAAAARPAAAGPRPVAAAAAAAPARAAPAAAPAPADDDPPPWEELPPEFASPA--------PAQPDAAPA 519
|
170 180 190 200 210 220
....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 442625924 14158 tsqvpvyDVNYSTTPSPIPQKPGVVNIPSAPQPVhPAPNPPVhefNYPTPPAVPQQPGVLNIPSYPT 14224
Cdd:PRK12323 520 -------GWVAESIPDPATADPDDAFETLAPAPA-AAPAPRA---AAATEPVVAPRPPRASASGLPD 575
|
|
| Gag_spuma |
pfam03276 |
Spumavirus gag protein; |
13989-14127 |
3.95e-03 |
|
Spumavirus gag protein;
Pssm-ID: 460872 [Multi-domain] Cd Length: 614 Bit Score: 45.51 E-value: 3.95e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13989 PPVYDVNYPttPVSQHPGVVNIP--SAPRLVPPTSQRPVfiTSPGNLSPTPqpGVINIPSVSQPGYPTPQSPIYDANYPT 14066
Cdd:pfam03276 187 PPGASFSGL--PSLPAIGGIHLPaiPGIHARAPPGNIAR--SLGDDIMPSL--GDAGMPQPRFAFHPGNPFAEAEGHPFA 260
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 442625924 14067 TQS-----PIPQQPgVVNIPSVPSPSYPapnppvnyptQPSPQIPVQPGVINIPSAPLPTTPPQHP 14127
Cdd:pfam03276 261 EAEgerprDIPRAP-RIDAPSAPAIPAI----------QPIAPPMIPPIGAPIPIPHGASIPGEHI 315
|
|
| MISS |
pfam15822 |
MAPK-interacting and spindle-stabilising protein-like; MISS is a family of eukaryotic ... |
14149-14319 |
4.07e-03 |
|
MAPK-interacting and spindle-stabilising protein-like; MISS is a family of eukaryotic MAPK-interacting and spindle-stabilising protein-like proteins. MISS is rich in prolines and has four potential MAPK-phosphorylation sites, a MAPK-docking site, a PEST sequence (PEST motif) and a bipartite nuclear localization signal. The endogenous protein accumulates during mouse meiotic maturation and is found as discrete dots on the MII spindle. MISS is the first example of a physiological MAPK-substrate that is stabilized in MII that specifically regulates MII spindle integrity during the CSF arrest.
Pssm-ID: 318115 [Multi-domain] Cd Length: 238 Bit Score: 44.21 E-value: 4.07e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14149 PSVTHPEYPTSQVP--VYDVNYSTTPSPIPQKPGVVNIPSAPQPVHPAPNP----PVHEFNYPTP----PAVPQQPGVLN 14218
Cdd:pfam15822 51 PSTAPSTVPFGPAPtgMYPSIPLTGPSPGPPAPFPPSGPSCPPPGGPYPAPtvpgPGPIGPYPTPnmpfPELPRPYGAPT 130
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14219 IPSYPTPVAP--TPQSPIYIPSQEQPKPTtrPSVINVPSVPQPAYPTPQAP--VYDVNYPTSPSVIPHQPGVVNIPSVPL 14294
Cdd:pfam15822 131 DPAAAAPSGPwgSMSSGPWAPGMGGQYPA--PNMPYPSPGPYPAVPPPQSPgaAPPVPWGTVPPGPWGPPAPYPDPTGSY 208
|
170 180
....*....|....*....|....*...
gi 442625924 14295 PAP---PVKQRPVFVPSPVHPTPaPQPG 14319
Cdd:pfam15822 209 PMPglyPTPNNPFQVPSGPSGAP-PMPG 235
|
|
| Amelogenin |
smart00818 |
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ... |
14295-14410 |
4.15e-03 |
|
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.
Pssm-ID: 197891 [Multi-domain] Cd Length: 165 Bit Score: 42.85 E-value: 4.15e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14295 PAPPVKQR--PVFVPSPVHPTP---APQPGVVNIPSVAQPVHPTYQPPVVERPAIydvyyPPPPSRPGVINIPSPPRPVY 14369
Cdd:smart00818 38 QIIPVSQQhpPTHTLQPHHHIPvlpAQQPVVPQQPLMPVPGQHSMTPTQHHQPNL-----PQPAQQPFQPQPLQPPQPQQ 112
|
90 100 110 120
....*....|....*....|....*....|....*....|.
gi 442625924 14370 PVPQQPiyvpaPVLHIPAPRPvihniPSVPQPTYPHRNPPI 14410
Cdd:smart00818 113 PMQPQP-----PVHPIPPLPP-----QPPLPPMFPMQPLPP 143
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
1022-1058 |
4.33e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 39.16 E-value: 4.33e-03
10 20 30
....*....|....*....|....*....|....*..
gi 442625924 1022 DVDECEERGaqLCAFGAQCVNKPGSYSCHCPEGYQGD 1058
Cdd:cd00054 1 DIDECASGN--PCQNGGTCVNTVGSYRCSCPPGYTGR 35
|
|
| PRK10819 |
PRK10819 |
transport protein TonB; Provisional |
14149-14269 |
4.62e-03 |
|
transport protein TonB; Provisional
Pssm-ID: 236768 [Multi-domain] Cd Length: 246 Bit Score: 43.90 E-value: 4.62e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14149 PSVTHPEYPTSQVPVydvnysTTPSPIPQ--KPGVVNIP---SAPQPVhPAPNP-PVHEfnyptPPAVPQQPgvlnipsy 14222
Cdd:PRK10819 61 PQAVQPPPEPVVEPE------PEPEPIPEppKEAPVVIPkpePKPKPK-PKPKPkPVKK-----VEEQPKRE-------- 120
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|
gi 442625924 14223 PTPVAPTPQSPIYIPSQEQPKPTTRPSVINVPSVPQPAYPTPQA---PVY 14269
Cdd:PRK10819 121 VKPVEPRPASPFENTAPARPTSSTATAAASKPVTSVSSGPRALSrnqPQY 170
|
|
| Amelogenin |
smart00818 |
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ... |
14184-14298 |
4.69e-03 |
|
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.
Pssm-ID: 197891 [Multi-domain] Cd Length: 165 Bit Score: 42.85 E-value: 4.69e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14184 IPSAPQ--PVHPAPNPPVHEFNYPTPPAVPQQPgVLNIPSYPtPVAPTP--QSPIYIPSQEQPKPTtrpsvinVPSVPQP 14259
Cdd:smart00818 40 IPVSQQhpPTHTLQPHHHIPVLPAQQPVVPQQP-LMPVPGQH-SMTPTQhhQPNLPQPAQQPFQPQ-------PLQPPQP 110
|
90 100 110 120
....*....|....*....|....*....|....*....|....
gi 442625924 14260 AYPTPQAPVYD-----VNYPTSPSVIPHQPGVVNIPSVPLPAPP 14298
Cdd:smart00818 111 QQPMQPQPPVHpipplPPQPPLPPMFPMQPLPPLLPDLPLEAWP 154
|
|
| PRK14959 |
PRK14959 |
DNA polymerase III subunits gamma and tau; Provisional |
14007-14126 |
4.85e-03 |
|
DNA polymerase III subunits gamma and tau; Provisional
Pssm-ID: 184923 [Multi-domain] Cd Length: 624 Bit Score: 45.06 E-value: 4.85e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14007 VVNIPSAPRLVPPTSQRPVFI-TSPGNLSPTPQPGVINIPSVSQPGYPTPQSPIYDANY-PTTQSPIPQQPGVVNIPSVP 14084
Cdd:PRK14959 356 LLNLAMLPRLMPVESLRPSGGgASAPSGSAAEGPASGGAATIPTPGTQGPQGTAPAAGMtPSSAAPATPAPSAAPSPRVP 435
|
90 100 110 120
....*....|....*....|....*....|....*....|....*..
gi 442625924 14085 SPSYPAPNPPVNYPTQPSPQIpvqPGVINIPSAPLPT-----TPPQH 14126
Cdd:PRK14959 436 WDDAPPAPPRSGIPPRPAPRM---PEASPVPGAPDSVasasdAPPTL 479
|
|
| EGF_3 |
pfam12947 |
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes ... |
461-490 |
4.93e-03 |
|
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes the C-terminal domain of the malaria parasite MSP1 protein.
Pssm-ID: 463759 [Multi-domain] Cd Length: 36 Bit Score: 39.12 E-value: 4.93e-03
10 20 30
....*....|....*....|....*....|..
gi 442625924 461 CQDNP--CGENAICTDTVGSFVCTCKPDYTGD 490
Cdd:pfam12947 1 CSDNNggCHPNATCTNTGGSFTCTCNDGYTGD 32
|
|
| Amelogenin |
smart00818 |
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem ... |
13907-13991 |
4.96e-03 |
|
Amelogenins, cell adhesion proteins, play a role in the biomineralisation of teeth; They seem to regulate formation of crystallites during the secretory stage of tooth enamel development and are thought to play a major role in the structural organisation and mineralisation of developing enamel. The extracellular matrix of the developing enamel comprises two major classes of protein: the hydrophobic amelogenins and the acidic enamelins. Circular dichroism studies of porcine amelogenin have shown that the protein consists of 3 discrete folding units: the N-terminal region appears to contain beta-strand structures, while the C-terminal region displays characteristics of a random coil conformation. Subsequent studies on the bovine protein have indicated the amelogenin structure to contain a repetitive beta-turn segment and a "beta-spiral" between Gln112 and Leu138, which sequester a (Pro, Leu, Gln) rich region. The beta-spiral offers a probable site for interactions with Ca2+ ions. Mutations in the human amelogenin gene (AMGX) cause X-linked hypoplastic amelogenesis imperfecta, a disease characterised by defective enamel. A 9bp deletion in exon 2 of AMGX results in the loss of codons for Ile5, Leu6, Phe7 and Ala8, and replacement by a new threonine codon, disrupting the 16-residue (Met1-Ala16) amelogenin signal peptide.
Pssm-ID: 197891 [Multi-domain] Cd Length: 165 Bit Score: 42.85 E-value: 4.96e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13907 PVRPQIYDTPSPPYPVAIPdlvyVQQQQPGIVNIPSAP-QPIYPTPQSPQYNVNYPSPQ-PANPQKPgvvniPSVPQPVY 13984
Cdd:smart00818 66 PVVPQQPLMPVPGQHSMTP----TQHHQPNLPQPAQQPfQPQPLQPPQPQQPMQPQPPVhPIPPLPP-----QPPLPPMF 136
|
....*...
gi 442625924 13985 P-SPQPPV 13991
Cdd:smart00818 137 PmQPLPPL 144
|
|
| EGF |
cd00053 |
Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large ... |
258-291 |
5.31e-03 |
|
Epidermal growth factor domain, found in epidermal growth factor (EGF) and present in a large number of proteins, mostly animal; the list of proteins currently known to contain one or more copies of an EGF-like pattern is large and varied; the functional significance of EGF-like domains in what appear to be unrelated proteins is not yet clear; a common feature is that these repeats are found in the extracellular domain of membrane-bound proteins or in proteins known to be secreted (exception: prostaglandin G/H synthase); the domain includes six cysteine residues which have been shown to be involved in disulfide bonds; the main structure is a two-stranded beta-sheet followed by a loop to a C-terminal short two-stranded sheet; subdomains between the conserved cysteines vary in length; the region between the 5th and 6th cysteine contains two conserved glycines of which at least one is present in most EGF-like domains; a subset of these bind calcium.
Pssm-ID: 238010 Cd Length: 36 Bit Score: 39.00 E-value: 5.31e-03
10 20 30
....*....|....*....|....*....|....
gi 442625924 258 ECSYPNVCGPGAICTNLEGSYRCDCPPGYDGDGR 291
Cdd:cd00053 1 ECAASNPCSNGGTCVNTPGSYRCVCPPGYTGDRS 34
|
|
| EGF_3 |
pfam12947 |
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes ... |
676-702 |
5.77e-03 |
|
EGF domain; This family includes a variety of EGF-like domain homologs. This family includes the C-terminal domain of the malaria parasite MSP1 protein.
Pssm-ID: 463759 [Multi-domain] Cd Length: 36 Bit Score: 39.12 E-value: 5.77e-03
10 20
....*....|....*....|....*..
gi 442625924 676 GSCGQNATCTNSAGGFTCACPPGFSGD 702
Cdd:pfam12947 6 GGCHPNATCTNTGGSFTCTCNDGYTGD 32
|
|
| PRK07994 |
PRK07994 |
DNA polymerase III subunits gamma and tau; Validated |
13944-14103 |
6.02e-03 |
|
DNA polymerase III subunits gamma and tau; Validated
Pssm-ID: 236138 [Multi-domain] Cd Length: 647 Bit Score: 44.86 E-value: 6.02e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13944 PQPIYPTPQSPQYNVNyPSPQPANPQKPGVVNIPSVPQPVYPSPQP-----PVYDVNYPTTPVSQHPgvVNIPSAPRLVP 14018
Cdd:PRK07994 361 PAAPLPEPEVPPQSAA-PAASAQATAAPTAAVAPPQAPAVPPPPASapqqaPAVPLPETTSQLLAAR--QQLQRAQGATK 437
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14019 PTSQRPVfitSPGNLSPTPqPGVINIPSVSQPGYPTPQSPIYDANYPTTqspiPQQPGVVNIPSVPSPSypAPNPPVNYP 14098
Cdd:PRK07994 438 AKKSEPA---AASRARPVN-SALERLASVRPAPSALEKAPAKKEAYRWK----ATNPVEVKKEPVATPK--ALKKALEHE 507
|
....*
gi 442625924 14099 TQPSP 14103
Cdd:PRK07994 508 KTPEL 512
|
|
| dnaA |
PRK14086 |
chromosomal replication initiator protein DnaA; |
13977-14247 |
6.26e-03 |
|
chromosomal replication initiator protein DnaA;
Pssm-ID: 237605 [Multi-domain] Cd Length: 617 Bit Score: 44.82 E-value: 6.26e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 13977 PSVPQPVYPSPQPPVYDvnyptTPVSQHPGVVNIPSAPRlvPPTSQRPVfITSPGNLSPTPQPGvinipsvsqpgYPTPQ 14056
Cdd:PRK14086 90 PSAGEPAPPPPHARRTS-----EPELPRPGRRPYEGYGG--PRADDRPP-GLPRQDQLPTARPA-----------YPAYQ 150
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14057 SPIYDANYPTTQSPIPQQpgvvnipsVPSPSYPAPNPPVNyptqpspqipvqpgviniPSAPLPTTPPQHPPvfipspes 14136
Cdd:PRK14086 151 QRPEPGAWPRAADDYGWQ--------QQRLGFPPRAPYAS------------------PASYAPEQERDREP-------- 196
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14137 pspapkpgviniPSVTHPEYPTSQVPvYDvnystTPSPIPQKPGVVNIPSaPQPvHPAPNPPVHEfnYPTPPAVPQQPGV 14216
Cdd:PRK14086 197 ------------YDAGRPEYDQRRRD-YD-----HPRPDWDRPRRDRTDR-PEP-PPGAGHVHRG--GPGPPERDDAPVV 254
|
250 260 270
....*....|....*....|....*....|.
gi 442625924 14217 LNIPSYPTPVAPTPQspiyiPSQEQPKPTTR 14247
Cdd:PRK14086 255 PIRPSAPGPLAAQPA-----PAPGPGEPTAR 280
|
|
| DUF5585 |
pfam17823 |
Family of unknown function (DUF5585); This is a family of unknown function found in chordata. |
14149-14530 |
6.54e-03 |
|
Family of unknown function (DUF5585); This is a family of unknown function found in chordata.
Pssm-ID: 465521 [Multi-domain] Cd Length: 506 Bit Score: 44.57 E-value: 6.54e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14149 PSVTHPEYPTSQVPVYDVNYSTTPSPIPQKPGVVNiPSAPQPVHPAPNP--PVHEFNYPTPPAVPQQPGVLNIPSYPTPV 14226
Cdd:pfam17823 123 PSSAAQSLPAAIAALPSEAFSAPRAAACRANASAA-PRAAIAAASAPHAasPAPRTAASSTTAASSTTAASSAPTTAASS 201
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14227 AP---TPQSPIYIPSQEQPKPTTRPSVINVPSVpQPAYPTPQAPVYDVNYPTSPSVIPHQPGVVNIPSVPLPAPPVKQRP 14303
Cdd:pfam17823 202 APatlTPARGISTAATATGHPAAGTALAAVGNS-SPAAGTVTAAVGTVTPAALATLAAAAGTVASAAGTINMGDPHARRL 280
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14304 ---------VFVPSPVHPTPAPQPGVVNIPSVAQPVHPTYQPPVverpaiydvyypppPSRPGVINIPSPPRPVYPV--- 14371
Cdd:pfam17823 281 spakhmpsdTMARNPAAPMGAQAQGPIIQVSTDQPVHNTAGEPT--------------PSPSNTTLEPNTPKSVASTnla 346
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14372 --------PQQPIYVPAPVLHIPAprpvihnIPSVpqptyphrnppiqDVTYPAPQpsppvpgivniPSlPQPVSTPTSG 14443
Cdd:pfam17823 347 vvtttkaqAKEPSASPVPVLHTSM-------IPEV-------------EATSPTTQ-----------PS-PLLPTQGAAG 394
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14444 vinipsqasppisvptpgivniPSIPQPTPQrpsPGIINVPSVPQPIPTAPSPGIINIPSVPQPLPSPTPGVINIPQQPT 14523
Cdd:pfam17823 395 ----------------------PGILLAPEQ---VATEATAGTASAGPTPRSSGDPKTLAMASCQLSTQGQYLVVTTDPL 449
|
....*..
gi 442625924 14524 PPPLVQQ 14530
Cdd:pfam17823 450 TPALVDK 456
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
457-490 |
6.55e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 38.77 E-value: 6.55e-03
10 20 30
....*....|....*....|....*....|....*
gi 442625924 457 NINECQD-NPCGENAICTDTVGSFVCTCKPDYTGD 490
Cdd:cd00054 1 DIDECASgNPCQNGGTCVNTVGSYRCSCPPGYTGR 35
|
|
| rne |
PRK10811 |
ribonuclease E; Reviewed |
14221-14406 |
7.49e-03 |
|
ribonuclease E; Reviewed
Pssm-ID: 236766 [Multi-domain] Cd Length: 1068 Bit Score: 44.65 E-value: 7.49e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14221 SYPtpVAPtPQSPIYIPSQEQPKPTTRPSVINVPSVPQPAYPTPQAPVYDVnyptspSVIPHQPGVVNIPSVPLPAPPVK 14300
Cdd:PRK10811 844 RYP--VVR-PQDVQVEEQREAEEVQVQPVVAEVPVAAAVEPVVSAPVVEAV------AEVVEEPVVVAEPQPEEVVVVET 914
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14301 QRPVFVPSPVhpTPAPQPGVVNIPSVAQPVhPTYQPPVVERPAIYDVYYPPPPSRPgvINIPSPPRPVYPVPQQPIYVPA 14380
Cdd:PRK10811 915 THPEVIAAPV--TEQPQVITESDVAVAQEV-AEHAEPVVEPQDETADIEEAAETAE--VVVAEPEVVAQPAAPVVAEVAA 989
|
170 180
....*....|....*....|....*.
gi 442625924 14381 PVLHIPAPRPVIHNIPSVPQPTYPHR 14406
Cdd:PRK10811 990 EVETVTAVEPEVAPAQVPEATVEHNH 1015
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
298-331 |
8.46e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 38.39 E-value: 8.46e-03
10 20 30
....*....|....*....|....*....|....*
gi 442625924 298 DQDECA-RTPCGRNADCLNTDGSFRCLCPDGYSGD 331
Cdd:cd00054 1 DIDECAsGNPCQNGGTCVNTVGSYRCSCPPGYTGR 35
|
|
| Caprin-1_C |
pfam12287 |
Cytoplasmic activation/proliferation-associated protein-1 C term; This family of proteins is ... |
14178-14369 |
9.42e-03 |
|
Cytoplasmic activation/proliferation-associated protein-1 C term; This family of proteins is found in eukaryotes. Proteins in this family are typically between 343 and 708 amino acids in length. This family is the C-terminal region of caprin-1. Caprin-1 is a protein involved in regulating cellular proliferation. In mutated phenotypes, the G1 phase of the cell cycle is greatly lengthened, impairing normal proliferation. The C-terminal region of caprin-1 contains RGG motifs which are characteristic of RNA binding domains. It is possible that caprin-1 functions through an RNA binding mechanism.
Pssm-ID: 463522 [Multi-domain] Cd Length: 320 Bit Score: 43.63 E-value: 9.42e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14178 KPGVVNIPSAPQPVHPAPNPPVHEFNYPTPPAVPQQPGVLNIPSyPTPVAP-TPQSPIYIPSQEQPKPTTRPSVINVPSV 14256
Cdd:pfam12287 24 KPSDSAIVSAQPPSQSPDLSQMVCPPASPEQRLSQQSDVLQQPE-QTQVSPvSPSSNACASSGSEYQFHTSEPPQPEAID 102
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14257 PQPAYPTPQAPVYDVNYPTSPS----VIPHQP---GVVNIPSVPL-----------PAPPVKQRPVFVPSPVHPT----- 14313
Cdd:pfam12287 103 PIQSSMSLPSELAPPSPPLSPAsqpqVFQSKPassSGINVNAAPFqsmqtvfnvnaPVPPRNEQELKESSQYSSGynqsf 182
|
170 180 190 200 210 220
....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 442625924 14314 ------PAPQPgvvNIPS--VAQPVHPTYQP-PVVERPAIYDVYYPPPPSrpgviNIPSPPRPVY 14369
Cdd:pfam12287 183 ssqstqTVPQC---QLPSeqLEQTVVGAYHPdGTIQVSNGHLAFYPAQTN-----GFPRPPQPFY 239
|
|
| EGF_CA |
cd00054 |
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular ... |
413-456 |
9.61e-03 |
|
Calcium-binding EGF-like domain, present in a large number of membrane-bound and extracellular (mostly animal) proteins. Many of these proteins require calcium for their biological function and calcium-binding sites have been found to be located at the N-terminus of particular EGF-like domains; calcium-binding may be crucial for numerous protein-protein interactions. Six conserved core cysteines form three disulfide bridges as in non calcium-binding EGF domains, whose structures are very similar. EGF_CA can be found in tandem repeat arrangements.
Pssm-ID: 238011 Cd Length: 38 Bit Score: 38.39 E-value: 9.61e-03
10 20 30 40
....*....|....*....|....*....|....*....|....
gi 442625924 413 DIDECNQPDGvakCGTNAKCINFPGSYRCLCPSGFQGQgylHCE 456
Cdd:cd00054 1 DIDECASGNP---CQNGGTCVNTVGSYRCSCPPGYTGR---NCE 38
|
|
| PRK12323 |
PRK12323 |
DNA polymerase III subunit gamma/tau; |
14073-14309 |
9.74e-03 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237057 [Multi-domain] Cd Length: 700 Bit Score: 44.10 E-value: 9.74e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14073 QQPGVVNIPSVPSPSYPAPNPPVNYPTQPSPQIPVQPGVINIPSAPLPTTPPQhppvfipspespspapkpgvinipsvt 14152
Cdd:PRK12323 367 QSGGGAGPATAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAV--------------------------- 419
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14153 hPEYPTSQVPVYDVNYSTTPSPIPQKPGVVNIPSAPQPVhPAPNPPvhefnyptPPAVPQQPGVLNIPSYPTPVAPTPQS 14232
Cdd:PRK12323 420 -AAAPARRSPAPEALAAARQASARGPGGAPAPAPAPAAA-PAAAAR--------PAAAGPRPVAAAAAAAPARAAPAAAP 489
|
170 180 190 200 210 220 230
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 442625924 14233 PiyiPSQEQPKP-TTRPSVINVPSVPQPAYPTPQAPVYDVNYP-TSPSVIPHQPGVVNIPSVPLPAPPVKQRPVFVPSP 14309
Cdd:PRK12323 490 A---PADDDPPPwEELPPEFASPAPAQPDAAPAGWVAESIPDPaTADPDDAFETLAPAPAAAPAPRAAAATEPVVAPRP 565
|
|
| PRK07994 |
PRK07994 |
DNA polymerase III subunits gamma and tau; Validated |
14033-14211 |
9.92e-03 |
|
DNA polymerase III subunits gamma and tau; Validated
Pssm-ID: 236138 [Multi-domain] Cd Length: 647 Bit Score: 44.09 E-value: 9.92e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14033 LSPTPQPGV-------------INIPSVSQPGYPTPQSPIYDANYPTTQSPIPQQPGVVNIPSVPSPSyPAPNPPVNYPT 14099
Cdd:PRK07994 341 LAPDRRMGVemtllrmlafhpaAPLPEPEVPPQSAAPAASAQATAAPTAAVAPPQAPAVPPPPASAPQ-QAPAVPLPETT 419
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 442625924 14100 QPSPQIPVQpgvinIPSAPLPTTPPQHPPVfipsPESPSPAPKPGVINIPSVTHPEYPTSQVPVYDVNYSTTPSPipqkP 14179
Cdd:PRK07994 420 SQLLAARQQ-----LQRAQGATKAKKSEPA----AASRARPVNSALERLASVRPAPSALEKAPAKKEAYRWKATN----P 486
|
170 180 190
....*....|....*....|....*....|..
gi 442625924 14180 GVVNIPSAPQPVhPAPNPPVHEfnyPTPPAVP 14211
Cdd:PRK07994 487 VEVKKEPVATPK-ALKKALEHE---KTPELAA 514
|
|
|