|
Name |
Accession |
Description |
Interval |
E-value |
| WD40 |
cd00200 |
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ... |
121-402 |
1.45e-66 |
|
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surfaces of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.
Pssm-ID: 238121 [Multi-domain] Cd Length: 289 Bit Score: 226.83 E-value: 1.45e-66
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQSNMNN-VKMFQ 199
Cdd:cd00200 11 GVTCVAFSPDGKLLATGSGDGTIKVWDLETGELLRTLKGHTGPVRDVAASADGTYLASGSSDKTIRLWDLETGEcVRTLT 90
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 200 AHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQ 279
Cdd:cd00200 91 GHTSYVSSVAFSPDGRILSSSSRDKTIKVWDVETGKCLTTLRGHTDWVNSVAFSPDGTFVASSSQDGT--IKLWDLRTGK 168
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 280 SLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLLFW 359
Cdd:cd00200 169 CVATLTGHTGEVNSVAFSPDGEKLLSSSSDGTIKLWDLST-GKCLGTLRGHENGVNSVAFSP-DGYLLASGSEDGTIRVW 246
|
250 260 270 280
....*....|....*....|....*....|....*....|...
gi 569004618 360 HVGVEKEVGGMEmAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:cd00200 247 DLRTGECVQTLS-GHTNSVTSLAWSPDGKRLASGSADGTIRIW 288
|
|
| WD40 |
COG2319 |
WD40 repeat [General function prediction only]; |
121-402 |
1.46e-57 |
|
WD40 repeat [General function prediction only];
Pssm-ID: 441893 [Multi-domain] Cd Length: 403 Bit Score: 204.76 E-value: 1.46e-57
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQSNMNN-VKMFQ 199
Cdd:COG2319 122 AVRSVAFSPDGKTLASGSADGTVRLWDLATGKLLRTLTGHSGAVTSVAFSPDGKLLASGSDDGTVRLWDLATGKlLRTLT 201
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 200 AHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQ 279
Cdd:COG2319 202 GHTGAVRSVAFSPDGKLLASGSADGTVRLWDLATGKLLRTLTGHSGSVRSVAFSPDGRLLASGSADGT--VRLWDLATGE 279
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 280 SLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLLFW 359
Cdd:COG2319 280 LLRTLTGHSGGVNSVAFSPDGKLLASGSDDGTVRLWDLAT-GKLLRTLTGHTGAVRSVAFSP-DGKTLASGSDDGTVRLW 357
|
250 260 270 280
....*....|....*....|....*....|....*....|...
gi 569004618 360 HVGVEKEVGGMEmAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:COG2319 358 DLATGELLRTLT-GHTGAVTSVAFSPDGRTLASGSADGTVRLW 399
|
|
| gly_rich_SclB |
NF038329 |
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like ... |
608-857 |
3.50e-14 |
|
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like protein 2) is an LPXTG-anchored surface adhesin with a variable-length region of triple helix-forming collagen-like Gly-Xaa-Xaa repeats.
Pssm-ID: 468478 [Multi-domain] Cd Length: 440 Bit Score: 76.48 E-value: 3.50e-14
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 608 QQMPLVPQMGPPGPQGQfraPGPQGQMGPQGPPMHQGGGGPQGFMGPQGpqgppqglprPQDMHGPQGMQrhpgphgplg 687
Cdd:NF038329 111 QQLKGDGEKGEPGPAGP---AGPAGEQGPRGDRGETGPAGPAGPPGPQG----------ERGEKGPAGPQ---------- 167
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 688 pqgppgpqgssgpqghmgpqGPPGPQGHIGPQGPPASQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPASQGIQ 767
Cdd:NF038329 168 --------------------GEAGPQGPAGKDGEAGAKGPAGEKGPQGPRGETGPAGEQGPAGPAGPDGEAGPAGEDGPA 227
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 768 GPlMGLNPRGMQGPPGPRENQGPA-PQGLmighppqemRGPHPPSGLLGH-GPQEMRGPQEMRGMQGPPPQGSMLGpPQE 845
Cdd:NF038329 228 GP-AGDGQQGPDGDPGPTGEDGPQgPDGP---------AGKDGPRGDRGEaGPDGPDGKDGERGPVGPAGKDGQNG-KDG 296
|
250
....*....|..
gi 569004618 846 LRGPSGSQGQQG 857
Cdd:NF038329 297 LPGKDGKDGQNG 308
|
|
| Med15 |
pfam09606 |
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ... |
590-926 |
3.87e-14 |
|
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.
Pssm-ID: 312941 [Multi-domain] Cd Length: 732 Bit Score: 77.36 E-value: 3.87e-14
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 590 GQGPISQIPQGFQQPHPSQQMPLVPQMGPPGPQGQFRAPGPQG------------QMGPQGPPMHQGGGGPQGFMGPQGP 657
Cdd:pfam09606 148 RMQPGGQAGGMMQPSSGQPGSGTPNQMGPNGGPGQGQAGGMNGgqqgpmggqmppQMGVPGMPGPADAGAQMGQQAQANG 227
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 658 QgppqglPRPQDMHGP------QGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPGPQGHIGPQGPPASQGHMGPQ 731
Cdd:pfam09606 228 G------MNPQQMGGApnqvamQQQQPQQQGQQSQLGMGINQMQQMPQGVGGGAGQGGPGQPMGPPGQQPGAMPNVMSIG 301
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 732 GPPGTQGMQGPPgPRGMQGPPHPHGIQGGPASQGIQGPLMGLNPRGMQGPPGPRENQGPAPQGLMIGHPPQEMRGPHPps 811
Cdd:pfam09606 302 DQNNYQQQQTRQ-QQQQQGGNHPAAHQQQMNQSVGQGGQVVALGGLNHLETWNPGNFGGLGANPMQRGQPGMMSSPSP-- 378
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 812 gLLGHGPQEMRGPQEMRgmqgPPPQGSMLGPpqelrGPSGSQGQQGPPQGSLgpppqggmqgppgpqgqQNPARGPHPSQ 891
Cdd:pfam09606 379 -VPGQQVRQVTPNQFMR----QSPQPSVPSP-----QGPGSQPPQSHPGGMI-----------------PSPALIPSPSP 431
|
330 340 350
....*....|....*....|....*....|....*
gi 569004618 892 GPIPFQQQKAPLLGDGPRAPFNQEGQSTGPPPLIP 926
Cdd:pfam09606 432 QMSQQPAQQRTIGQDSPGGSLNTPGQSAVNSPLNP 466
|
|
| gly_rich_SclB |
NF038329 |
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like ... |
690-859 |
1.83e-13 |
|
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like protein 2) is an LPXTG-anchored surface adhesin with a variable-length region of triple helix-forming collagen-like Gly-Xaa-Xaa repeats.
Pssm-ID: 468478 [Multi-domain] Cd Length: 440 Bit Score: 74.17 E-value: 1.83e-13
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 690 GPPgpqgssgpqghmgpqgppgpqghiGPQGPPASQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPASQGIQGP 769
Cdd:NF038329 132 GEQ------------------------GPRGDRGETGPAGPAGPPGPQGERGEKGPAGPQGEAGPQGPAGKDGEAGAKGP 187
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 770 LMGLNPRGMQGPPGPRENQGPA-PQGLMIGHPPQEMRGPHPPSGLLGHGPQEMRGPQEMRGMQGPPPQGSMLGP-----P 843
Cdd:NF038329 188 AGEKGPQGPRGETGPAGEQGPAgPAGPDGEAGPAGEDGPAGPAGDGQQGPDGDPGPTGEDGPQGPDGPAGKDGPrgdrgE 267
|
170
....*....|....*.
gi 569004618 844 QELRGPSGSQGQQGPP 859
Cdd:NF038329 268 AGPDGPDGKDGERGPV 283
|
|
| WD40 |
smart00320 |
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ... |
191-230 |
7.02e-09 |
|
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.
Pssm-ID: 197651 [Multi-domain] Cd Length: 40 Bit Score: 52.70 E-value: 7.02e-09
10 20 30 40
....*....|....*....|....*....|....*....|
gi 569004618 191 NMNNVKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD 230
Cdd:smart00320 1 SGELLKTLKGHTGPVTSVAFSPDGKYLASGSDDGTIKLWD 40
|
|
| PTZ00421 |
PTZ00421 |
coronin; Provisional |
205-319 |
1.63e-08 |
|
coronin; Provisional
Pssm-ID: 173611 [Multi-domain] Cd Length: 493 Bit Score: 58.75 E-value: 1.63e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 205 IREASFSPTDN-KFATCSDDGTVRIWDFlrcHEERI----------LRGHGADVKCVDWHPT-KGLVVSGSKDSQqpIKF 272
Cdd:PTZ00421 78 IIDVAFNPFDPqKLFTASEDGTIMGWGI---PEEGLtqnisdpivhLQGHTKKVGIVSFHPSaMNVLASAGADMV--VNV 152
|
90 100 110 120
....*....|....*....|....*....|....*....|....*..
gi 569004618 273 WDPKTGQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRN 319
Cdd:PTZ00421 153 WDVERGKAVEVIKCHSDQITSLEWNLDGSLLCTTSKDKKLNIIDPRD 199
|
|
| SPT5 |
COG5164 |
Transcription elongation factor SPT5 [Transcription]; |
665-981 |
2.48e-08 |
|
Transcription elongation factor SPT5 [Transcription];
Pssm-ID: 444063 [Multi-domain] Cd Length: 495 Bit Score: 58.12 E-value: 2.48e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 665 PRPQDMHGPQGMQRHPGPHGPLGPQGPPGPQgssgpqghmgpqgppgpqghiGPQGPPASQGHMGPQG------PPGTQG 738
Cdd:COG5164 12 SDPGGVTTPAGSQGSTKPAQNQGSTRPAGNT---------------------GGTRPAQNQGSTTPAGntggtrPAGNQG 70
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 739 MQGPPGPRGMQGPPHPHGIQGGPASQGIQGPLMGlnpRGMQGPPGPRENQGPAPQGLMIGHPPQEMRGPHPPSGLLGHGP 818
Cdd:COG5164 71 ATGPAQNQGGTTPAQNQGGTRPAGNTGGTTPAGD---GGATGPPDDGGATGPPDDGGSTTPPSGGSTTPPGDGGSTPPGP 147
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 819 qemrGPQEMRGMQGPPPQGSMLGPPQE--LRGPSGSQGQQGPPQGSlgpppqggmqgppgpqgqqnpARGPHPSQGPIPF 896
Cdd:COG5164 148 ----GSTGPGGSTTPPGDGGSTTPPGPggSTTPPDDGGSTTPPNKG---------------------ETGTDIPTGGTPR 202
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 897 QQQKAPLLGDGPRAPFNQEGQSTGPPPLIPGLGQQGAQGRIPPLNPGQGPGPNKGT-KGRRERHASGLPSPPGLVATTTT 975
Cdd:COG5164 203 QGPDGPVKKDDKNGKGNPPDDRGGKTGPKDQRPKTNPIERRGPERPEAAALPAELTaLEAENRAANPEPATKTIPETTTV 282
|
....*.
gi 569004618 976 SPFVVV 981
Cdd:COG5164 283 KDLATV 288
|
|
| WD40 |
pfam00400 |
WD domain, G-beta repeat; |
195-230 |
7.39e-08 |
|
WD domain, G-beta repeat;
Pssm-ID: 459801 [Multi-domain] Cd Length: 39 Bit Score: 49.65 E-value: 7.39e-08
10 20 30
....*....|....*....|....*....|....*.
gi 569004618 195 VKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD 230
Cdd:pfam00400 4 LKTLEGHTGSVTSLAFSPDGKLLASGSDDGTVKVWD 39
|
|
| PHA03378 |
PHA03378 |
EBNA-3B; Provisional |
734-946 |
6.00e-06 |
|
EBNA-3B; Provisional
Pssm-ID: 223065 [Multi-domain] Cd Length: 991 Bit Score: 50.84 E-value: 6.00e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 734 PGTQGMQGPPGPRGMQGPPHphgiQGGPASQGIQGPLMGLNPRGMQ----GPPGPRENQGPaPQGLMIGHPPQEMRGPHP 809
Cdd:PHA03378 598 PVPHPSQTPEPPTTQSHIPE----TSAPRQWPMPLRPIPMRPLRMQpitfNVLVFPTPHQP-PQVEITPYKPTWTQIGHI 672
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 810 PSGLLGHGPQEMRGPQEMRGMQGPPPQGsmlgpPQELRGPSGSQGQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARGPHP 889
Cdd:PHA03378 673 PYQPSPTGANTMLPIQWAPGTMQPPPRA-----PTPMRPPAAPPGRAQRPAAATGRARPPAAAPGRARPPAAAPGRARPP 747
|
170 180 190 200 210
....*....|....*....|....*....|....*....|....*....|....*....
gi 569004618 890 SQGPIPFQQqkaPLLGDGP-RAPFNQEGQST-GPPPLIPGLGQQGAQGRIPPLNPGQGP 946
Cdd:PHA03378 748 AAAPGRARP---PAAAPGRaRPPAAAPGAPTpQPPPQAPPAPQQRPRGAPTPQPPPQAG 803
|
|
| PABP-1234 |
TIGR01628 |
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ... |
556-675 |
5.99e-05 |
|
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. The four paralogs in Homo sapiens are, variously, testis-expressed, platelet-expressed, broadly expressed, and of unknown tissue range.
Pssm-ID: 130689 [Multi-domain] Cd Length: 562 Bit Score: 47.49 E-value: 5.99e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 556 LEQLKIERLA--QKQADQIQPP---PSSGTPLLG---PQPFSGQGPISQIPQ-----GFQQPHPSQQMPlvpqMGPPGPQ 622
Cdd:TIGR01628 360 LAQRKEQRRAhlQDQFMQLQPRmrqLPMGSPMGGamgQPPYYGQGPQQQFNGqplgwPRMSMMPTPMGP----GGPLRPN 435
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|....
gi 569004618 623 GqFRAPGPQGQMGP-QGPPMHQGGGGPQGFMGPQGPQGPPQGLPRPQDMHGPQG 675
Cdd:TIGR01628 436 G-LAPMNAVRAPSRnAQNAAQKPPMQPVMYPPNYQSLPLSQDLPQPQSTASQGG 488
|
|
| COG3416 |
COG3416 |
Uncharacterized conserved protein, DUF2076 domain [Function unknown]; |
537-642 |
6.60e-03 |
|
Uncharacterized conserved protein, DUF2076 domain [Function unknown];
Pssm-ID: 442642 [Multi-domain] Cd Length: 237 Bit Score: 39.62 E-value: 6.60e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 537 TQAEIEQEMAtlqytnpqlLEQL--KIERL-AQKQADQIQPPPSSGTPLLGpqpFSGQGPISQIPQGFQQPHPSQQmplv 613
Cdd:COG3416 47 AQTILVQEAA---------LKQAqqRIQELeAQLAQLQQQQPQSSGGFLSG---LFGGGQRPPPAPQPSQPGPQQQ---- 110
|
90 100
....*....|....*....|....*....
gi 569004618 614 PQMGPPGPQGQFRAPGPQGQMGPQGPPMH 642
Cdd:COG3416 111 PAPPSGPWGQAAPQQPGYGQPQYGQPAAG 139
|
|
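Each hit in this listing carries a summary line of the form `Pssm-ID: … [Multi-domain] Cd Length: … Bit Score: … E-value: …` together with a query interval such as `121-402`. As a minimal sketch, assuming that layout holds for every hit, the fields could be pulled out as follows. The names `parse_summary` and `interval_length` are illustrative only and are not part of any NCBI tool.

```python
import re

# Pattern for the per-hit summary line as it appears in this listing
# (assumed layout; other export formats may differ).
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*"
    r"\[(?P<model_type>[^\]]+)\]\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the summary fields as a dict, or None if the line does not match."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    return {
        "pssm_id": int(match.group("pssm_id")),
        "model_type": match.group("model_type"),
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


def interval_length(interval):
    """Number of query residues covered by an inclusive 'start-end' interval."""
    start, end = (int(part) for part in interval.split("-"))
    return end - start + 1


# Values copied from the first hit above (cd00200, interval 121-402).
hit = parse_summary(
    "Pssm-ID: 238121 [Multi-domain] Cd Length: 289 Bit Score: 226.83 E-value: 1.45e-66"
)
covered = interval_length("121-402")
```

Comparing `covered` with `hit["cd_length"]` gives a rough sense of how much of the domain model the query interval spans; gaps in the alignment mean the two numbers need not match exactly.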
|
|
Name |
Accession |
Description |
Interval |
E-value |
| WD40 |
COG2319 |
WD40 repeat [General function prediction only]; |
121-402 |
2.32e-52 |
|
WD40 repeat [General function prediction only];
Pssm-ID: 441893 [Multi-domain] Cd Length: 403 Bit Score: 189.74 E-value: 2.32e-52
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQ-SNMNNVKMFQ 199
Cdd:COG2319 38 AVASLAASPDGARLAAGAGDLTLLLLDAAAGALLATLLGHTAAVLSVAFSPDGRLLASASADGTVRLWDlATGLLLRTLT 117
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 200 AHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQ 279
Cdd:COG2319 118 GHTGAVRSVAFSPDGKTLASGSADGTVRLWDLATGKLLRTLTGHSGAVTSVAFSPDGKLLASGSDDGT--VRLWDLATGK 195
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 280 SLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvhEG-LFASGGSDGSLLF 358
Cdd:COG2319 196 LLRTLTGHTGAVRSVAFSPDGKLLASGSADGTVRLWDLAT-GKLLRTLTGHSGSVRSVAFSP--DGrLLASGSADGTVRL 272
|
250 260 270 280
....*....|....*....|....*....|....*....|....
gi 569004618 359 WHVGvEKEVGGMEMAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:COG2319 273 WDLA-TGELLRTLTGHSGGVNSVAFSPDGKLLASGSDDGTVRLW 315
|
|
| WD40 |
COG2319 |
WD40 repeat [General function prediction only]; |
121-361 |
1.89e-47 |
|
WD40 repeat [General function prediction only];
Pssm-ID: 441893 [Multi-domain] Cd Length: 403 Bit Score: 175.48 E-value: 1.89e-47
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWqsNMNN---VKM 197
Cdd:COG2319 164 AVTSVAFSPDGKLLASGSDDGTVRLWDLATGKLLRTLTGHTGAVRSVAFSPDGKLLASGSADGTVRLW--DLATgklLRT 241
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 198 FQAHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKT 277
Cdd:COG2319 242 LTGHSGSVRSVAFSPDGRLLASGSADGTVRLWDLATGELLRTLTGHSGGVNSVAFSPDGKLLASGSDDGT--VRLWDLAT 319
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 278 GQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLL 357
Cdd:COG2319 320 GKLLRTLTGHTGAVRSVAFSPDGKTLASGSDDGTVRLWDLAT-GELLRTLTGHTGAVTSVAFSP-DGRTLASGSADGTVR 397
|
....
gi 569004618 358 FWHV 361
Cdd:COG2319 398 LWDL 401
|
|
| WD40 |
cd00200 |
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ... |
194-402 |
4.01e-43 |
|
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surfaces of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.
Pssm-ID: 238121 [Multi-domain] Cd Length: 289 Bit Score: 159.42 E-value: 4.01e-43
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 194 NVKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFW 273
Cdd:cd00200 1 LRRTLKGHTGGVTCVAFSPDGKLLATGSGDGTIKVWDLETGELLRTLKGHTGPVRDVAASADGTYLASGSSDKT--IRLW 78
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 274 DPKTGQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvhEGLFASGGS- 352
Cdd:cd00200 79 DLETGECVRTLTGHTSYVSSVAFSPDGRILSSSSRDKTIKVWDVET-GKCLTTLRGHTDWVNSVAFSP--DGTFVASSSq 155
|
170 180 190 200 210
....*....|....*....|....*....|....*....|....*....|
gi 569004618 353 DGSLLFWHVGVEKEVGGMEmAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:cd00200 156 DGTIKLWDLRTGKCVATLT-GHTGEVNSVAFSPDGEKLLSSSSDGTIKLW 204
|
|
| WD40 |
COG2319 |
WD40 repeat [General function prediction only]; |
126-402 |
3.09e-42 |
|
WD40 repeat [General function prediction only];
Pssm-ID: 441893 [Multi-domain] Cd Length: 403 Bit Score: 160.08 E-value: 3.09e-42
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 126 RWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHND-MWMLTADHGGYVKYWQSNMNNVKMFQAHKEA 204
Cdd:COG2319 1 ALSADGAALAAASADLALALLAAALGALLLLLLGLAAAVASLAASPDGaRLAAGAGDLTLLLLDAAAGALLATLLGHTAA 80
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 205 IREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQSLATL 284
Cdd:COG2319 81 VLSVAFSPDGRLLASASADGTVRLWDLATGLLLRTLTGHTGAVRSVAFSPDGKTLASGSADGT--VRLWDLATGKLLRTL 158
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 285 HAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNLKeELQVFRGHKKEATAVAWHPvhEG-LFASGGSDGSLLFWHVGV 363
Cdd:COG2319 159 TGHSGAVTSVAFSPDGKLLASGSDDGTVRLWDLATGK-LLRTLTGHTGAVRSVAFSP--DGkLLASGSADGTVRLWDLAT 235
|
250 260 270
....*....|....*....|....*....|....*....
gi 569004618 364 EKEVGGMEmAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:COG2319 236 GKLLRTLT-GHSGSVRSVAFSPDGRLLASGSADGTVRLW 273
|
|
| WD40 |
cd00200 |
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ... |
119-359 |
4.12e-39 |
|
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surfaces of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.
Pssm-ID: 238121 [Multi-domain] Cd Length: 289 Bit Score: 147.48 E-value: 4.12e-39
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 119 KCPVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAmtwshndmwmltadhggyvkywqsnmnnvkmf 198
Cdd:cd00200 93 TSYVSSVAFSPDGRILSSSSRDKTIKVWDVETGKCLTTLRGHTDWVNS-------------------------------- 140
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 199 qahkeaireASFSPtDNKF-ATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKT 277
Cdd:cd00200 141 ---------VAFSP-DGTFvASSSQDGTIKLWDLRTGKCVATLTGHTGEVNSVAFSPDGEKLLSSSSDGT--IKLWDLST 208
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 278 GQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLL 357
Cdd:cd00200 209 GKCLGTLRGHENGVNSVAFSPDGYLLASGSEDGTIRVWDLRT-GECVQTLSGHTNSVTSLAWSP-DGKRLASGSADGTIR 286
|
..
gi 569004618 358 FW 359
Cdd:cd00200 287 IW 288
|
|
| WD40 |
COG2319 |
WD40 repeat [General function prediction only]; |
121-319 |
7.19e-39 |
|
WD40 repeat [General function prediction only];
Pssm-ID: 441893 [Multi-domain] Cd Length: 403 Bit Score: 150.45 E-value: 7.19e-39
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQ-SNMNNVKMFQ 199
Cdd:COG2319 206 AVRSVAFSPDGKLLASGSADGTVRLWDLATGKLLRTLTGHSGSVRSVAFSPDGRLLASGSADGTVRLWDlATGELLRTLT 285
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 200 AHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQ 279
Cdd:COG2319 286 GHSGGVNSVAFSPDGKLLASGSDDGTVRLWDLATGKLLRTLTGHTGAVRSVAFSPDGKTLASGSDDGT--VRLWDLATGE 363
|
170 180 190 200
....*....|....*....|....*....|....*....|
gi 569004618 280 SLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRN 319
Cdd:COG2319 364 LLRTLTGHTGAVTSVAFSPDGRTLASGSADGTVRLWDLAT 403
|
|
| WD40 |
cd00200 |
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ... |
119-274 |
1.49e-28 |
|
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surfaces of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.
Pssm-ID: 238121 [Multi-domain] Cd Length: 289 Bit Score: 117.05 E-value: 1.49e-28
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 119 KCPVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQSNM-NNVKM 197
Cdd:cd00200 135 TDWVNSVAFSPDGTFVASSSQDGTIKLWDLRTGKCVATLTGHTGEVNSVAFSPDGEKLLSSSSDGTIKLWDLSTgKCLGT 214
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 569004618 198 FQAHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWD 274
Cdd:cd00200 215 LRGHENGVNSVAFSPDGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLASGSADGT--IRIWD 289
|
|
| gly_rich_SclB |
NF038329 |
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like ... |
608-857 |
3.50e-14 |
|
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like protein 2) is an LPXTG-anchored surface-anchored adhesin with a variable-length region of triple helix-forming collagen-like Gly-Xaa-Xaa repeats.
Pssm-ID: 468478 [Multi-domain] Cd Length: 440 Bit Score: 76.48 E-value: 3.50e-14
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 608 QQMPLVPQMGPPGPQGQfraPGPQGQMGPQGPPMHQGGGGPQGFMGPQGpqgppqglprPQDMHGPQGMQrhpgphgplg 687
Cdd:NF038329 111 QQLKGDGEKGEPGPAGP---AGPAGEQGPRGDRGETGPAGPAGPPGPQG----------ERGEKGPAGPQ---------- 167
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 688 pqgppgpqgssgpqghmgpqGPPGPQGHIGPQGPPASQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPASQGIQ 767
Cdd:NF038329 168 --------------------GEAGPQGPAGKDGEAGAKGPAGEKGPQGPRGETGPAGEQGPAGPAGPDGEAGPAGEDGPA 227
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 768 GPlMGLNPRGMQGPPGPRENQGPA-PQGLmighppqemRGPHPPSGLLGH-GPQEMRGPQEMRGMQGPPPQGSMLGpPQE 845
Cdd:NF038329 228 GP-AGDGQQGPDGDPGPTGEDGPQgPDGP---------AGKDGPRGDRGEaGPDGPDGKDGERGPVGPAGKDGQNG-KDG 296
|
250
....*....|..
gi 569004618 846 LRGPSGSQGQQG 857
Cdd:NF038329 297 LPGKDGKDGQNG 308
|
|
| Med15 |
pfam09606 |
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ... |
590-926 |
3.87e-14 |
|
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.
Pssm-ID: 312941 [Multi-domain] Cd Length: 732 Bit Score: 77.36 E-value: 3.87e-14
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 590 GQGPISQIPQGFQQPHPSQQMPLVPQMGPPGPQGQFRAPGPQG------------QMGPQGPPMHQGGGGPQGFMGPQGP 657
Cdd:pfam09606 148 RMQPGGQAGGMMQPSSGQPGSGTPNQMGPNGGPGQGQAGGMNGgqqgpmggqmppQMGVPGMPGPADAGAQMGQQAQANG 227
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 658 QgppqglPRPQDMHGP------QGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPGPQGHIGPQGPPASQGHMGPQ 731
Cdd:pfam09606 228 G------MNPQQMGGApnqvamQQQQPQQQGQQSQLGMGINQMQQMPQGVGGGAGQGGPGQPMGPPGQQPGAMPNVMSIG 301
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 732 GPPGTQGMQGPPgPRGMQGPPHPHGIQGGPASQGIQGPLMGLNPRGMQGPPGPRENQGPAPQGLMIGHPPQEMRGPHPps 811
Cdd:pfam09606 302 DQNNYQQQQTRQ-QQQQQGGNHPAAHQQQMNQSVGQGGQVVALGGLNHLETWNPGNFGGLGANPMQRGQPGMMSSPSP-- 378
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 812 gLLGHGPQEMRGPQEMRgmqgPPPQGSMLGPpqelrGPSGSQGQQGPPQGSLgpppqggmqgppgpqgqQNPARGPHPSQ 891
Cdd:pfam09606 379 -VPGQQVRQVTPNQFMR----QSPQPSVPSP-----QGPGSQPPQSHPGGMI-----------------PSPALIPSPSP 431
|
330 340 350
....*....|....*....|....*....|....*
gi 569004618 892 GPIPFQQQKAPLLGDGPRAPFNQEGQSTGPPPLIP 926
Cdd:pfam09606 432 QMSQQPAQQRTIGQDSPGGSLNTPGQSAVNSPLNP 466
|
|
| gly_rich_SclB |
NF038329 |
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like ... |
690-859 |
1.83e-13 |
|
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like protein 2) is an LPXTG-anchored surface adhesin with a variable-length region of triple helix-forming collagen-like Gly-Xaa-Xaa repeats.
Pssm-ID: 468478 [Multi-domain] Cd Length: 440 Bit Score: 74.17 E-value: 1.83e-13
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 690 GPPgpqgssgpqghmgpqgppgpqghiGPQGPPASQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPASQGIQGP 769
Cdd:NF038329 132 GEQ------------------------GPRGDRGETGPAGPAGPPGPQGERGEKGPAGPQGEAGPQGPAGKDGEAGAKGP 187
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 770 LMGLNPRGMQGPPGPRENQGPA-PQGLMIGHPPQEMRGPHPPSGLLGHGPQEMRGPQEMRGMQGPPPQGSMLGP-----P 843
Cdd:NF038329 188 AGEKGPQGPRGETGPAGEQGPAgPAGPDGEAGPAGEDGPAGPAGDGQQGPDGDPGPTGEDGPQGPDGPAGKDGPrgdrgE 267
|
170
....*....|....*.
gi 569004618 844 QELRGPSGSQGQQGPP 859
Cdd:NF038329 268 AGPDGPDGKDGERGPV 283
|
|
| Med15 |
pfam09606 |
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ... |
584-970 |
4.03e-09 |
|
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.
Pssm-ID: 312941 [Multi-domain] Cd Length: 732 Bit Score: 61.18 E-value: 4.03e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 584 GPQPFSGQGPISQIPQGFQQPHPSQQMPLVPQMGPPGPQGQFRAPGPQGQMGPQgppMHQGGGGPQGFMGPQGPQGPPQG 663
Cdd:pfam09606 60 QQQPQGGQGNGGMGGGQQGMPDPINALQNLAGQGTRPQMMGPMGPGPGGPMGQQ---MGGPGTASNLLASLGRPQMPMGG 136
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 664 LPRPQDMHGPQGMQRHPGPHGPLGPqgppgpqgssgpqghMGPQGPPGPQGHIGPQGPPASQGHMGPQGPPGTQGMQGPP 743
Cdd:pfam09606 137 AGFPSQMSRVGRMQPGGQAGGMMQP---------------SSGQPGSGTPNQMGPNGGPGQGQAGGMNGGQQGPMGGQMP 201
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 744 GPRGMQGPPHPhgIQGGPASQGIQGPLMGLNPRGMQGPPG--PRENQGPAPQGL-----MIGHPPQEMRGPHPpsGLLGH 816
Cdd:pfam09606 202 PQMGVPGMPGP--ADAGAQMGQQAQANGGMNPQQMGGAPNqvAMQQQQPQQQGQqsqlgMGINQMQQMPQGVG--GGAGQ 277
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 817 GPQEMRGPQEMRGMQGPPPQGSMLGPP----QELRGPSGSQGQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARGPHP-SQ 891
Cdd:pfam09606 278 GGPGQPMGPPGQQPGAMPNVMSIGDQNnyqqQQTRQQQQQQGGNHPAAHQQQMNQSVGQGGQVVALGGLNHLETWNPgNF 357
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 892 GP--IPFQQQKAPLLGDGPRAPFNQEGQSTGPPPlipGLGQQGAQGRIPPLNPGQGPGPNkgtkgrrerHASGLPSPPGL 969
Cdd:pfam09606 358 GGlgANPMQRGQPGMMSSPSPVPGQQVRQVTPNQ---FMRQSPQPSVPSPQGPGSQPPQS---------HPGGMIPSPAL 425
|
.
gi 569004618 970 V 970
Cdd:pfam09606 426 I 426
|
|
| WD40 |
smart00320 |
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ... |
191-230 |
7.02e-09 |
|
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.
Pssm-ID: 197651 [Multi-domain] Cd Length: 40 Bit Score: 52.70 E-value: 7.02e-09
10 20 30 40
....*....|....*....|....*....|....*....|
gi 569004618 191 NMNNVKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD 230
Cdd:smart00320 1 SGELLKTLKGHTGPVTSVAFSPDGKYLASGSDDGTIKLWD 40
|
|
| PTZ00421 |
PTZ00421 |
coronin; Provisional |
205-319 |
1.63e-08 |
|
coronin; Provisional
Pssm-ID: 173611 [Multi-domain] Cd Length: 493 Bit Score: 58.75 E-value: 1.63e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 205 IREASFSPTDN-KFATCSDDGTVRIWDFlrcHEERI----------LRGHGADVKCVDWHPT-KGLVVSGSKDSQqpIKF 272
Cdd:PTZ00421 78 IIDVAFNPFDPqKLFTASEDGTIMGWGI---PEEGLtqnisdpivhLQGHTKKVGIVSFHPSaMNVLASAGADMV--VNV 152
|
90 100 110 120
....*....|....*....|....*....|....*....|....*..
gi 569004618 273 WDPKTGQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRN 319
Cdd:PTZ00421 153 WDVERGKAVEVIKCHSDQITSLEWNLDGSLLCTTSKDKKLNIIDPRD 199
|
|
| SPT5 |
COG5164 |
Transcription elongation factor SPT5 [Transcription]; |
665-981 |
2.48e-08 |
|
Transcription elongation factor SPT5 [Transcription];
Pssm-ID: 444063 [Multi-domain] Cd Length: 495 Bit Score: 58.12 E-value: 2.48e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 665 PRPQDMHGPQGMQRHPGPHGPLGPQGPPGPQgssgpqghmgpqgppgpqghiGPQGPPASQGHMGPQG------PPGTQG 738
Cdd:COG5164 12 SDPGGVTTPAGSQGSTKPAQNQGSTRPAGNT---------------------GGTRPAQNQGSTTPAGntggtrPAGNQG 70
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 739 MQGPPGPRGMQGPPHPHGIQGGPASQGIQGPLMGlnpRGMQGPPGPRENQGPAPQGLMIGHPPQEMRGPHPPSGLLGHGP 818
Cdd:COG5164 71 ATGPAQNQGGTTPAQNQGGTRPAGNTGGTTPAGD---GGATGPPDDGGATGPPDDGGSTTPPSGGSTTPPGDGGSTPPGP 147
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 819 qemrGPQEMRGMQGPPPQGSMLGPPQE--LRGPSGSQGQQGPPQGSlgpppqggmqgppgpqgqqnpARGPHPSQGPIPF 896
Cdd:COG5164 148 ----GSTGPGGSTTPPGDGGSTTPPGPggSTTPPDDGGSTTPPNKG---------------------ETGTDIPTGGTPR 202
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 897 QQQKAPLLGDGPRAPFNQEGQSTGPPPLIPGLGQQGAQGRIPPLNPGQGPGPNKGT-KGRRERHASGLPSPPGLVATTTT 975
Cdd:COG5164 203 QGPDGPVKKDDKNGKGNPPDDRGGKTGPKDQRPKTNPIERRGPERPEAAALPAELTaLEAENRAANPEPATKTIPETTTV 282
|
....*.
gi 569004618 976 SPFVVV 981
Cdd:COG5164 283 KDLATV 288
|
|
| WD40 |
pfam00400 |
WD domain, G-beta repeat; |
195-230 |
7.39e-08 |
|
WD domain, G-beta repeat;
Pssm-ID: 459801 [Multi-domain] Cd Length: 39 Bit Score: 49.65 E-value: 7.39e-08
10 20 30
....*....|....*....|....*....|....*.
gi 569004618 195 VKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD 230
Cdd:pfam00400 4 LKTLEGHTGSVTSLAFSPDGKLLASGSDDGTVKVWD 39
|
|
| PLN00181 |
PLN00181 |
protein SPA1-RELATED; Provisional |
215-359 |
1.82e-07 |
|
protein SPA1-RELATED; Provisional
Pssm-ID: 177776 [Multi-domain] Cd Length: 793 Bit Score: 55.86 E-value: 1.82e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 215 NKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWH---PTkgLVVSGSKDSQqpIKFWDPKTGQSLATLHAHKNTV 291
Cdd:PLN00181 546 SQVASSNFEGVVQVWDVARSQLVTEMKEHEKRVWSIDYSsadPT--LLASGSDDGS--VKLWSINQGVSIGTIKTKANIC 621
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 569004618 292 MEVKLNLNGNWLLTASRDHLCKLFDIRNLKEELQVFRGHKKEATAVAWhpVHEGLFASGGSDGSLLFW 359
Cdd:PLN00181 622 CVQFPSESGRSLAFGSADHKVYYYDLRNPKLPLCTMIGHSKTVSYVRF--VDSSTLVSSSTDNTLKLW 687
|
|
| Collagen |
pfam01391 |
Collagen triple helix repeat (20 copies); Members of this family belong to the collagen ... |
717-785 |
3.42e-07 |
|
Collagen triple helix repeat (20 copies); Members of this family belong to the collagen superfamily. Collagens are generally extracellular structural proteins involved in formation of connective tissue structure. The alignment contains 20 copies of the G-X-Y repeat that forms a triple helix. The first position of the repeat is glycine; the second and third positions can be any residue but are frequently proline and hydroxyproline. Collagens are post-translationally modified by proline hydroxylase to form the hydroxyproline residues. Defective hydroxylation is the cause of scurvy. Some members of the collagen superfamily are not involved in connective tissue structure but share the same triple helical structure. The family includes bacterial collagen-like triple-helix repeat proteins.
Pssm-ID: 460189 [Multi-domain] Cd Length: 57 Bit Score: 48.26 E-value: 3.42e-07
10 20 30 40 50 60
....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 569004618 717 GPQGPPasqghmGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPASQGIQGplmglnPRGMQGPPGPR 785
Cdd:pfam01391 1 GPPGPP------GPPGPPGPPGPPGPPGPPGPPGPPGEPGPPGPPGPPGPPG------PPGAPGAPGPP 57
|
|
| WD40 |
smart00320 |
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ... |
238-274 |
5.71e-07 |
|
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.
Pssm-ID: 197651 [Multi-domain] Cd Length: 40 Bit Score: 47.31 E-value: 5.71e-07
10 20 30
....*....|....*....|....*....|....*..
gi 569004618 238 RILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWD 274
Cdd:smart00320 6 KTLKGHTGPVTSVAFSPDGKYLASGSDDGT--IKLWD 40
|
|
| WD40 |
pfam00400 |
WD domain, G-beta repeat; |
238-274 |
3.28e-06 |
|
WD domain, G-beta repeat;
Pssm-ID: 459801 [Multi-domain] Cd Length: 39 Bit Score: 45.03 E-value: 3.28e-06
10 20 30
....*....|....*....|....*....|....*..
gi 569004618 238 RILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWD 274
Cdd:pfam00400 5 KTLEGHTGSVTSLAFSPDGKLLASGSDDGT--VKVWD 39
|
|
| WD40 |
smart00320 |
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ... |
321-360 |
4.18e-06 |
|
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.
Pssm-ID: 197651 [Multi-domain] Cd Length: 40 Bit Score: 44.61 E-value: 4.18e-06
10 20 30 40
....*....|....*....|....*....|....*....|
gi 569004618 321 KEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLLFWH 360
Cdd:smart00320 2 GELLKTLKGHTGPVTSVAFSP-DGKYLASGSDDGTIKLWD 40
|
|
| PHA03378 |
PHA03378 |
EBNA-3B; Provisional |
734-946 |
6.00e-06 |
|
EBNA-3B; Provisional
Pssm-ID: 223065 [Multi-domain] Cd Length: 991 Bit Score: 50.84 E-value: 6.00e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 734 PGTQGMQGPPGPRGMQGPPHphgiQGGPASQGIQGPLMGLNPRGMQ----GPPGPRENQGPaPQGLMIGHPPQEMRGPHP 809
Cdd:PHA03378 598 PVPHPSQTPEPPTTQSHIPE----TSAPRQWPMPLRPIPMRPLRMQpitfNVLVFPTPHQP-PQVEITPYKPTWTQIGHI 672
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 810 PSGLLGHGPQEMRGPQEMRGMQGPPPQGsmlgpPQELRGPSGSQGQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARGPHP 889
Cdd:PHA03378 673 PYQPSPTGANTMLPIQWAPGTMQPPPRA-----PTPMRPPAAPPGRAQRPAAATGRARPPAAAPGRARPPAAAPGRARPP 747
|
170 180 190 200 210
....*....|....*....|....*....|....*....|....*....|....*....
gi 569004618 890 SQGPIPFQQqkaPLLGDGP-RAPFNQEGQST-GPPPLIPGLGQQGAQGRIPPLNPGQGP 946
Cdd:PHA03378 748 AAAPGRARP---PAAAPGRaRPPAAAPGAPTpQPPPQAPPAPQQRPRGAPTPQPPPQAG 803
|
|
| WD40 |
smart00320 |
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ... |
373-402 |
7.03e-06 |
|
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.
Pssm-ID: 197651 [Multi-domain] Cd Length: 40 Bit Score: 44.23 E-value: 7.03e-06
10 20 30
....*....|....*....|....*....|
gi 569004618 373 AHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:smart00320 10 GHTGPVTSVAFSPDGKYLASGSDDGTIKLW 39
|
|
| WD40 |
smart00320 |
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ... |
277-316 |
1.13e-05 |
|
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.
Pssm-ID: 197651 [Multi-domain] Cd Length: 40 Bit Score: 43.46 E-value: 1.13e-05
10 20 30 40
....*....|....*....|....*....|....*....|
gi 569004618 277 TGQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFD 316
Cdd:smart00320 1 SGELLKTLKGHTGPVTSVAFSPDGKYLASGSDDGTIKLWD 40
|
|
| PAT1 |
pfam09770 |
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... |
533-774 |
1.33e-05 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 49.65 E-value: 1.33e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 533 EKKKTQAEIEQEMATLQYTNPQL------LEQLKIERLAQKQADQIQPPPSSGtpllgPQPFSGQGPISQIPQGFQQPHP 606
Cdd:pfam09770 167 PKKAAAPAPAPQPAAQPASLPAPsrkmmsLEEVEAAMRAQAKKPAQQPAPAPA-----QPPAAPPAQQAQQQQQFPPQIQ 241
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 607 SQQMPlvPQMGPPGPQGQFRAPGPQGQMGPQGPPMhqggggPQGFMGPQGPQGPPQGLPRPQDMHGPQGMQRhpgphgpl 686
Cdd:pfam09770 242 QQQQP--QQQPQQPQQHPGQGHPVTILQRPQSPQP------DPAQPSIQPQAQQFHQQPPPVPVQPTQILQN-------- 305
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 687 gpqgppgpqgssgpqghmgpqgppgpqghigPQGPPASQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPAsqgi 766
Cdd:pfam09770 306 -------------------------------PNRLSAARVGYPQNPQPGVQPAPAHQAHRQQGSFGRQAPIITHPQ---- 350
|
....*...
gi 569004618 767 qgPLMGLN 774
Cdd:pfam09770 351 --QLAQLS 356
|
|
| WD40 |
pfam00400 |
WD domain, G-beta repeat; |
278-316 |
1.38e-05 |
|
WD domain, G-beta repeat;
Pssm-ID: 459801 [Multi-domain] Cd Length: 39 Bit Score: 43.10 E-value: 1.38e-05
10 20 30
....*....|....*....|....*....|....*....
gi 569004618 278 GQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFD 316
Cdd:pfam00400 1 GKLLKTLEGHTGSVTSLAFSPDGKLLASGSDDGTVKVWD 39
|
|
| PTZ00421 |
PTZ00421 |
coronin; Provisional |
288-398 |
4.03e-05 |
|
coronin; Provisional
Pssm-ID: 173611 [Multi-domain] Cd Length: 493 Bit Score: 47.97 E-value: 4.03e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 288 KNTVMEVKLN-LNGNWLLTASRDHLCKLFDI------RNLKEELQVFRGHKKEATAVAWHPVHEGLFASGGSDGSLLFWH 360
Cdd:PTZ00421 75 EGPIIDVAFNpFDPQKLFTASEDGTIMGWGIpeegltQNISDPIVHLQGHTKKVGIVSFHPSAMNVLASAGADMVVNVWD 154
|
90 100 110 120
....*....|....*....|....*....|....*....|
gi 569004618 361 V--GVEKEVggmEMAHEGMIWSLAWHPLGHILCSGSNDHT 398
Cdd:PTZ00421 155 VerGKAVEV---IKCHSDQITSLEWNLDGSLLCTTSKDKK 191
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
718-976 |
5.02e-05 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 48.01 E-value: 5.02e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 718 PQGPPASQGHMGPQGPPGTQGMQGPPGPRGMQGPP-----HPHGIQGGPASQGIQGPLMGLNPRGMQGPPGPRENQGPAP 792
Cdd:PHA03247 2745 PAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPprrltRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASP 2824
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 793 QGLM------IGHPPQEMRGPHPPSGLLGhGPQEMRGPQEMRGMQGPP---PQGSMLGPPQELRGPSGSQGQQGPPQgsl 863
Cdd:PHA03247 2825 AGPLppptsaQPTAPPPPPGPPPPSLPLG-GSVAPGGDVRRRPPSRSPaakPAAPARPPVRRLARPAVSRSTESFAL--- 2900
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 864 gppPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQQQKAPLLGDGPRAPFNQEGQSTGPPPLIPglgqqgaQGRIPPLNPG 943
Cdd:PHA03247 2901 ---PPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVP-------QPWLGALVPG 2970
|
250 260 270
....*....|....*....|....*....|....*.
gi 569004618 944 QGPGPNKGT---KGRRERHASGLPSPPGLVATTTTS 976
Cdd:PHA03247 2971 RVAVPRFRVpqpAPSREAPASSTPPLTGHSLSRVSS 3006
|
|
| WD40 |
pfam00400 |
WD domain, G-beta repeat; |
373-402 |
5.39e-05 |
|
WD domain, G-beta repeat;
Pssm-ID: 459801 [Multi-domain] Cd Length: 39 Bit Score: 41.56 E-value: 5.39e-05
10 20 30
....*....|....*....|....*....|
gi 569004618 373 AHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:pfam00400 9 GHTGSVTSLAFSPDGKLLASGSDDGTVKVW 38
|
|
| PABP-1234 |
TIGR01628 |
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ... |
556-675 |
5.99e-05 |
|
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens, which are respectively expressed in testis, expressed in platelets, broadly expressed, and of unknown tissue range.
Pssm-ID: 130689 [Multi-domain] Cd Length: 562 Bit Score: 47.49 E-value: 5.99e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 556 LEQLKIERLA--QKQADQIQPP---PSSGTPLLG---PQPFSGQGPISQIPQ-----GFQQPHPSQQMPlvpqMGPPGPQ 622
Cdd:TIGR01628 360 LAQRKEQRRAhlQDQFMQLQPRmrqLPMGSPMGGamgQPPYYGQGPQQQFNGqplgwPRMSMMPTPMGP----GGPLRPN 435
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|....
gi 569004618 623 GqFRAPGPQGQMGP-QGPPMHQGGGGPQGFMGPQGPQGPPQGLPRPQDMHGPQG 675
Cdd:TIGR01628 436 G-LAPMNAVRAPSRnAQNAAQKPPMQPVMYPPNYQSLPLSQDLPQPQSTASQGG 488
|
|
| Glutenin_hmw |
pfam03157 |
High molecular weight glutenin subunit; Members of this family include high molecular weight ... |
568-954 |
8.23e-05 |
|
High molecular weight glutenin subunit; Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.
Pssm-ID: 367362 [Multi-domain] Cd Length: 786 Bit Score: 47.25 E-value: 8.23e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 568 QADQIQPPPSSGTPLLGPQpfSGQGPISQIPQGFQQPHPSQQ--MPLVPQMGPPGPQGQFRAPGPQGQMGPQGPPMHQGG 645
Cdd:pfam03157 256 QGQQGYYPISPQQPRQWQQ--SGQGQQGYYPTSLQQPGQGQSgyYPTSQQQAGQLQQEQQLGQEQQDQQPGQGRQGQQPG 333
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 646 GgpqgfmgpqgpqgppqglpRPQDMHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPGPQGHIGPQGPPASQ 725
Cdd:pfam03157 334 Q-------------------GQQGQQPAQGQQPGQGQPGYYPTSPQQPGQGQPGYYPTSQQQPQQGQQPEQGQQGQQQGQ 394
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 726 GHMGPQGPPGTQGMQGPPGPRGMQgPPHPHGIQGGPASQGIQGPLMGLNPRGMQGPPGPRENQGPAP-QGLMIGHPPQEM 804
Cdd:pfam03157 395 GQQGQQPGQGQQPGQGQPGYYPTS-PQQSGQGQPGYYPTSPQQSGQGQQPGQGQQPGQEQPGQGQQPgQGQQGQQPGQPE 473
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 805 RGPHPPSGLLGHGPQEMR--GPQEMRGMQGPPPQGSMLGPPQELRGPsgSQGQQGPPQGSLgpppqggmqgppgpqgqQN 882
Cdd:pfam03157 474 QGQQPGQGQPGYYPTSPQqsGQGQQLGQWQQQGQGQPGYYPTSPLQP--GQGQPGYYPTSP-----------------QQ 534
|
330 340 350 360 370 380 390
....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 569004618 883 PARGPHPSQGPIPFQQQKAPLLGDGPRAPFNQEGQSTGPppliPGLGQQGAQgripplnPGQGPGPNKGTKG 954
Cdd:pfam03157 535 PGQGQQLGQLQQPTQGQQGQQSGQGQQGQQPGQGQQGQQ----PGQGQQGQQ-------PGQGQQPGQGQPG 595
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
565-860 |
9.33e-05 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 47.07 E-value: 9.33e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 565 AQKQADQIQPP---------PSSGTPLLGPQPFSGQGPISQIPQGFQQPHPSQQMPLVPQMGPPGPQGQF---------R 626
Cdd:pfam03154 162 AQQQILQTQPPvlqaqsgaaSPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTLIqqtptlhpqR 241
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 627 APGPQGQMGPQGPPmhqGGGGPQGFMGPQGPQGPPQGLPRPQDMH-GPQGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMG 705
Cdd:pfam03154 242 LPSPHPPLQPMTQP---PPPSQVSPQPLPQPSLHGQMPPMPHSLQtGPSHMQHPVPPQPFPLTPQSSQSQVPPGPSPAAP 318
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 706 PQGPPGPQGhigpqgpPASQGHMGPQGPPGTQGMqgPPGPRGMQ--GPPHPHGIQGGPASQGIQGPLMGLNPRGMQGPpg 783
Cdd:pfam03154 319 GQSQQRIHT-------PPSQSQLQSQQPPREQPL--PPAPLSMPhiKPPPTTPIPQLPNPQSHKHPPHLSGPSPFQMN-- 387
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 784 preNQGPAPQGLmigHPPQEMRGPHPPSgllGHGPQEMRGPQEMRgMQGPPPQGSMLG-----PPQELRGPSGSQGQQGP 858
Cdd:pfam03154 388 ---SNLPPPPAL---KPLSSLSTHHPPS---AHPPPLQLMPQSQQ-LPPPPAQPPVLTqsqslPPPAASHPPTSGLHQVP 457
|
..
gi 569004618 859 PQ 860
Cdd:pfam03154 458 SQ 459
|
|
| WD40 |
pfam00400 |
WD domain, G-beta repeat; |
321-359 |
1.57e-04 |
|
WD domain, G-beta repeat;
Pssm-ID: 459801 [Multi-domain] Cd Length: 39 Bit Score: 40.41 E-value: 1.57e-04
10 20 30
....*....|....*....|....*....|....*....
gi 569004618 321 KEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLLFW 359
Cdd:pfam00400 1 GKLLKTLEGHTGSVTSLAFSP-DGKLLASGSDDGTVKVW 38
|
|
| WD40 |
smart00320 |
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ... |
150-188 |
2.21e-04 |
|
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.
Pssm-ID: 197651 [Multi-domain] Cd Length: 40 Bit Score: 39.99 E-value: 2.21e-04
10 20 30
....*....|....*....|....*....|....*....
gi 569004618 150 TFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYW 188
Cdd:smart00320 1 SGELLKTLKGHTGPVTSVAFSPDGKYLASGSDDGTIKLW 39
|
|
| Glutenin_hmw |
pfam03157 |
High molecular weight glutenin subunit; Members of this family include high molecular weight ... |
565-936 |
3.02e-04 |
|
High molecular weight glutenin subunit; Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.
Pssm-ID: 367362 [Multi-domain] Cd Length: 786 Bit Score: 45.32 E-value: 3.02e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 565 AQKQADQIQP---PPSSGTPLLGPQPFS-------GQGPISQIPQGFQQPHPSQQ--MPLVPQMGPPGPQ---------- 622
Cdd:pfam03157 358 SPQQPGQGQPgyyPTSQQQPQQGQQPEQgqqgqqqGQGQQGQQPGQGQQPGQGQPgyYPTSPQQSGQGQPgyyptspqqs 437
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 623 GQFRAPGpQGQMGPQGPPMHQGGGGPQGFMGPQGPQGPPQGLPRPQDMHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQG 702
Cdd:pfam03157 438 GQGQQPG-QGQQPGQEQPGQGQQPGQGQQGQQPGQPEQGQQPGQGQPGYYPTSPQQSGQGQQLGQWQQQGQGQPGYYPTS 516
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 703 HMGPQGPPGPQGHIGPQGPpaSQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPASQGIQGPLMGLNPRGMQGPP 782
Cdd:pfam03157 517 PLQPGQGQPGYYPTSPQQP--GQGQQLGQLQQPTQGQQGQQSGQGQQGQQPGQGQQGQQPGQGQQGQQPGQGQQPGQGQP 594
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 783 G----PRENQGPAPQGLMIGHPPQEMRGPHPPSGL-LGHGPQEMRGPQEMRGMQGPPPQgsmlgppqelRGPSGSQGQQG 857
Cdd:pfam03157 595 GyyptSPQQSGQGQQPGQWQQPGQGQPGYYPTSSLqLGQGQQGYYPTSPQQPGQGQQPG----------QWQQSGQGQQG 664
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 858 ----PPQGSLGPPPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQQQKaplLGDGPRAPFNQEGQStgppPLIPGLGQQGA 933
Cdd:pfam03157 665 yyptSPQQSGQAQQPGQGQQPGQWLQPGQGQQGYYPTSPQQPGQGQQ---LGQGQQSGQGQQGYY----PTSPGQGQQSG 737
|
...
gi 569004618 934 QGR 936
Cdd:pfam03157 738 QGQ 740
|
|
| PAT1 |
pfam09770 |
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... |
740-904 |
5.70e-04 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 44.26 E-value: 5.70e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 740 QGPPGPRGMQGPPHPHGIQGGPASQGIQGPLMGlnprgmQGPPGPRENQGPAPQGLMIGHPPQEMRGPHPPSGllgHGPQ 819
Cdd:pfam09770 208 KKPAQQPAPAPAQPPAAPPAQQAQQQQQFPPQI------QQQQQPQQQPQQPQQHPGQGHPVTILQRPQSPQP---DPAQ 278
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 820 EMRGPQEMRGMQGPPPQgsMLGPPQELRGP-----SGSQGQQGPPQGSLGPPPQGGMQGPPGPQGQQNPArgPHPSQGPI 894
Cdd:pfam09770 279 PSIQPQAQQFHQQPPPV--PVQPTQILQNPnrlsaARVGYPQNPQPGVQPAPAHQAHRQQGSFGRQAPII--THPQQLAQ 354
|
170
....*....|
gi 569004618 895 PFQQQKAPLL 904
Cdd:pfam09770 355 LSEEEKAAYL 364
|
|
| PTZ00420 |
PTZ00420 |
coronin; Provisional |
284-397 |
6.28e-04 |
|
coronin; Provisional
Pssm-ID: 240412 [Multi-domain] Cd Length: 568 Bit Score: 44.17 E-value: 6.28e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 284 LHAHKNTVMEVKLN-LNGNWLLTASRDHLCKLFDIRN-------LKEELQVFRGHKKEATAVAWHPVHEGLFASGGSDGS 355
Cdd:PTZ00420 70 LKGHTSSILDLQFNpCFSEILASGSEDLTIRVWEIPHndesvkeIKDPQCILKGHKKKISIIDWNPMNYYIMCSSGFDSF 149
|
90 100 110 120
....*....|....*....|....*....|....*....|....*
gi 569004618 356 LLFWHVGVEKEVGGMEMAHEgmIWSLAWHPLGHIL---CSGSNDH 397
Cdd:PTZ00420 150 VNIWDIENEKRAFQINMPKK--LSSLKWNIKGNLLsgtCVGKHMH 192
|
|
| PTZ00420 |
PTZ00420 |
coronin; Provisional |
198-274 |
7.14e-04 |
|
coronin; Provisional
Pssm-ID: 240412 [Multi-domain] Cd Length: 568 Bit Score: 43.79 E-value: 7.14e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 198 FQAHKEAIREASFSPTDNK-FATCSDDGTVRIWDfLRCHEER---------ILRGHGADVKCVDWHPTKGLVVSGSK-DS 266
Cdd:PTZ00420 70 LKGHTSSILDLQFNPCFSEiLASGSEDLTIRVWE-IPHNDESvkeikdpqcILKGHKKKISIIDWNPMNYYIMCSSGfDS 148
|
....*...
gi 569004618 267 QqpIKFWD 274
Cdd:PTZ00420 149 F--VNIWD 154
|
|
| PHA03307 |
PHA03307 |
transcriptional regulator ICP4; Provisional |
715-977 |
7.38e-04 |
|
transcriptional regulator ICP4; Provisional
Pssm-ID: 223039 [Multi-domain] Cd Length: 1352 Bit Score: 44.01 E-value: 7.38e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 715 HIGPQGPPASQGHMGpQGPPGTQGMQGPPGPRGMQGPPhPHGIQGGPASQGIQGPLMGLNPRGMQGPPGPRENQGPAPQG 794
Cdd:PHA03307 36 LSGSQGQLVSDSAEL-AAVTVVAGAAACDRFEPPTGPP-PGPGTEAPANESRSTPTWSLSTLAPASPAREGSPTPPGPSS 113
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 795 lmighPPQEMRGPHPPSGLLGHGPqEMRGPQEMRGMQGPPPQGSMLGPPQ---ELRGPSGSQGQQGPPQGSLGPPPQGGM 871
Cdd:PHA03307 114 -----PDPPPPTPPPASPPPSPAP-DLSEMLRPVGSPGPPPAASPPAAGAspaAVASDAASSRQAALPLSSPEETARAPS 187
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 872 QGPPGPQGQQNPARGPHPSQGPIPFQQQKA----PLLGDGPRAPFNQE-GQSTGPPPLIPGLGQQGAQGR---------- 936
Cdd:PHA03307 188 SPPAEPPPSTPPAAASPRPPRRSSPISASAsspaPAPGRSAADDAGASsSDSSSSESSGCGWGPENECPLprpapitlpt 267
|
250 260 270 280
....*....|....*....|....*....|....*....|....*.
gi 569004618 937 -----IPPLNPGQGPGPNKGTKGRRERHASGLPSPPGLVATTTTSP 977
Cdd:PHA03307 268 riweaSGWNGPSSRPGPASSSSSPRERSPSPSPSSPGSGPAPSSPR 313
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
721-966 |
1.51e-03 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 43.39 E-value: 1.51e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 721 PPASQGHMGPQGPPGTQGMQGPPGPRGMQGPPH---PHGIQGGPASQGIQGPLMGLNPRGMQG----------PPGPREN 787
Cdd:PHA03247 2628 PPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRvsrPRRARRLGRAAQASSPPQRPRRRAARPtvgsltsladPPPPPPT 2707
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 788 QGPAPQGLMIGHP----PQEMRGPHPPSGL------------LGHGPQEMRGPQEMRGMQGP-PPQGSMLGPPQELRGPS 850
Cdd:PHA03247 2708 PEPAPHALVSATPlppgPAAARQASPALPAapappavpagpaTPGGPARPARPPTTAGPPAPaPPAAPAAGPPRRLTRPA 2787
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 851 GSQGQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQQQKAPLLGDGPRA-------------PFNQEGQ 917
Cdd:PHA03247 2788 VASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPpslplggsvapggDVRRRPP 2867
|
250 260 270 280 290 300
....*....|....*....|....*....|....*....|....*....|....*....|..
gi 569004618 918 STGPPPLI-------------PGLGQQGAQGRIPPLNPGQGPGPNKGTKGRRERHASGLPSP 966
Cdd:PHA03247 2868 SRSPAAKPaaparppvrrlarPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQP 2929
|
|
| PAT1 |
pfam09770 |
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... |
719-940 |
1.66e-03 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 42.72 E-value: 1.66e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 719 QGPPASQGHMGPQGPPGtqgmQGPPGPRGMQGPPHPHGiqggPASQGIQ-----GPLMGLNPR----GMQGPPGPRENQG 789
Cdd:pfam09770 105 QQPAARAAQSSAQPPAS----SLPQYQYASQQSQQPSK----PVRTGYEkykepEPIPDLQVDaslwGVAPKKAAAPAPA 176
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 790 PAPQGLMIGHPPQE------------MRG-----PHPPSGLLGHGPQEMRGPQEMRGMQGPPPQGSMLGPPQELRGPSGS 852
Cdd:pfam09770 177 PQPAAQPASLPAPSrkmmsleeveaaMRAqakkpAQQPAPAPAQPPAAPPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQH 256
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 853 QGQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQQQKAPLLGDGPRAPFNQEGQ-STGPPPLIPGLGQQ 931
Cdd:pfam09770 257 PGQGHPVTILQRPQSPQPDPAQPSIQPQAQQFHQQPPPVPVQPTQILQNPNRLSAARVGYPQNPQpGVQPAPAHQAHRQQ 336
|
....*....
gi 569004618 932 GAQGRIPPL 940
Cdd:pfam09770 337 GSFGRQAPI 345
|
|
| Pro-rich |
pfam15240 |
Proline-rich protein; This family includes several eukaryotic proline-rich proteins. |
729-844 |
1.81e-03 |
|
Proline-rich protein; This family includes several eukaryotic proline-rich proteins.
Pssm-ID: 464580 [Multi-domain] Cd Length: 167 Bit Score: 40.79 E-value: 1.81e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 729 GPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPASQGIQGPlmglnprgmQGPPGPRENQGPAPQGlmighppqemrGPH 808
Cdd:pfam15240 72 GPQQPPPQGGKQKPQGPPPQGGPRPPPGKPQGPPPQGGNQQ---------QGPPPPGKPQGPPPQG-----------GGP 131
|
90 100 110
....*....|....*....|....*....|....*.
gi 569004618 809 PPSGLLGHGPQEMRGPQEMRGMQGPPPQGSMLGPPQ 844
Cdd:pfam15240 132 PPQGGNQQGPPPPPPGNPQGPPQRPPQPGNPQGPPQ 167
|
|
| WD40 |
pfam00400 |
WD domain, G-beta repeat; |
151-188 |
2.47e-03 |
|
WD domain, G-beta repeat;
Pssm-ID: 459801 [Multi-domain] Cd Length: 39 Bit Score: 36.94 E-value: 2.47e-03
10 20 30
....*....|....*....|....*....|....*...
gi 569004618 151 FNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYW 188
Cdd:pfam00400 1 GKLLKTLEGHTGSVTSLAFSPDGKLLASGSDDGTVKVW 38
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
573-862 |
3.29e-03 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 42.06 E-value: 3.29e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 573 QPPPSSGTPLLGPQPFSGQGPISQIPQGFQ-------QPHPSQQMPLVPQMG----PPGPQGQfrAPGPQGQMgPQGPPM 641
Cdd:pfam03154 254 QPPPPSQVSPQPLPQPSLHGQMPPMPHSLQtgpshmqHPVPPQPFPLTPQSSqsqvPPGPSPA--APGQSQQR-IHTPPS 330
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 642 HQGGGGPQGfmgpqgpqgppqglPRPQDMhGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPGPQGHIGPqgP 721
Cdd:pfam03154 331 QSQLQSQQP--------------PREQPL-PPAPLSMPHIKPPPTTPIPQLPNPQSHKHPPHLSGPSPFQMNSNLPP--P 393
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 722 PASQghmgpqgPPGTQGMQGPPGprgmqgpPHPHGIQGGPASQGIQGPLMglnprgmqGPPGPRENQGPAPQGlmighpp 801
Cdd:pfam03154 394 PALK-------PLSSLSTHHPPS-------AHPPPLQLMPQSQQLPPPPA--------QPPVLTQSQSLPPPA------- 444
|
250 260 270 280 290 300
....*....|....*....|....*....|....*....|....*....|....*....|.
gi 569004618 802 qemrGPHPPSGLLGHGPQEMRGPQEMRGMQGPPPQGSMLGPPQELrgPSGSQGQQGPPQGS 862
Cdd:pfam03154 445 ----ASHPPTSGLHQVPSQSPFPQHPFVPGGPPPITPPSGPPTST--SSAMPGIQPPSSAS 499
|
|
| Collagen |
pfam01391 |
Collagen triple helix repeat (20 copies); Members of this family belong to the collagen ... |
710-761 |
4.07e-03 |
|
Collagen triple helix repeat (20 copies); Members of this family belong to the collagen superfamily. Collagens are generally extracellular structural proteins involved in formation of connective tissue structure. The alignment contains 20 copies of the G-X-Y repeat that forms a triple helix. The first position of the repeat is glycine; the second and third positions can be any residue but are frequently proline and hydroxyproline. Collagens are post-translationally modified by proline hydroxylase to form the hydroxyproline residues. Defective hydroxylation is the cause of scurvy. Some members of the collagen superfamily are not involved in connective tissue structure but share the same triple helical structure. The family includes bacterial collagen-like triple-helix repeat proteins.
Pssm-ID: 460189 [Multi-domain] Cd Length: 57 Bit Score: 36.70 E-value: 4.07e-03
10 20 30 40 50
....*....|....*....|....*....|....*....|....*....|..
gi 569004618 710 PgpqghiGPQGPPASQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGP 761
Cdd:pfam01391 9 P------GPPGPPGPPGPPGPPGPPGPPGEPGPPGPPGPPGPPGPPGAPGAP 54
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
718-993 |
5.32e-03 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 41.46 E-value: 5.32e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 718 PQGPPASQGHMGPQGPPGTQGMQGPPGPrgmqGPPHPHGIQGGPASqgiqgplmglnprgmqgpPGPRENQGPAPQGLMI 797
Cdd:PHA03247 2589 PDAPPQSARPRAPVDDRGDPRGPAPPSP----LPPDTHAPDPPPPS------------------PSPAANEPDPHPPPTV 2646
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 798 GHPPQEMRGPHPPsgllghgpqEMRGPQEMRGmQGPPPQGSmlGPPQELRGPSGSqgqqgPPQGSLGP-----PPQGGMQ 872
Cdd:PHA03247 2647 PPPERPRDDPAPG---------RVSRPRRARR-LGRAAQAS--SPPQRPRRRAAR-----PTVGSLTSladppPPPPTPE 2709
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 873 GPPGPQGQQNP---------ARGPHPSQGPIPFQQQKAPLLGDGPRAPFNQEGQSTGPPPLIPglgqqgaqgRIPPLNPG 943
Cdd:PHA03247 2710 PAPHALVSATPlppgpaaarQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPP---------AAPAAGPP 2780
|
250 260 270 280 290
....*....|....*....|....*....|....*....|....*....|
gi 569004618 944 QGPGPNKGTKGRRERHASGLPSPPGLVATTTTSPFVVVTLEALPTITWAP 993
Cdd:PHA03247 2781 RRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPP 2830
|
|
| PRK07764 |
PRK07764 |
DNA polymerase III subunits gamma and tau; Validated |
717-863 |
6.41e-03 |
|
DNA polymerase III subunits gamma and tau; Validated
Pssm-ID: 236090 [Multi-domain] Cd Length: 824 Bit Score: 40.74 E-value: 6.41e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 717 GPQGPPASQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPASQGIQGPLMGLNPRGMQGPPGPRENQGPAPQGLM 796
Cdd:PRK07764 623 APAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWPAKAGGAAPAAPPPAPAPAAPAAPAGAAPAQPA 702
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 569004618 797 IGHPPQEMRGPHP-PSGLLGHGPQEMRGPQEMRGMQGPPPqgsmlGPPQELRGPSGSQGQQGPPQGSL 863
Cdd:PRK07764 703 PAPAATPPAGQADdPAAQPPQAAQGASAPSPAADDPVPLP-----PEPDDPPDPAGAPAQPPPPPAPA 765
|
|
| COG3416 |
COG3416 |
Uncharacterized conserved protein, DUF2076 domain [Function unknown]; |
537-642 |
6.60e-03 |
|
Uncharacterized conserved protein, DUF2076 domain [Function unknown];
Pssm-ID: 442642 [Multi-domain] Cd Length: 237 Bit Score: 39.62 E-value: 6.60e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 537 TQAEIEQEMAtlqytnpqlLEQL--KIERL-AQKQADQIQPPPSSGTPLLGpqpFSGQGPISQIPQGFQQPHPSQQmplv 613
Cdd:COG3416 47 AQTILVQEAA---------LKQAqqRIQELeAQLAQLQQQQPQSSGGFLSG---LFGGGQRPPPAPQPSQPGPQQQ---- 110
|
90 100
....*....|....*....|....*....
gi 569004618 614 PQMGPPGPQGQFRAPGPQGQMGPQGPPMH 642
Cdd:COG3416 111 PAPPSGPWGQAAPQQPGYGQPQYGQPAAG 139
|
|
| PRK07764 |
PRK07764 |
DNA polymerase III subunits gamma and tau; Validated |
708-911 |
7.16e-03 |
|
DNA polymerase III subunits gamma and tau; Validated
Pssm-ID: 236090 [Multi-domain] Cd Length: 824 Bit Score: 40.74 E-value: 7.16e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 708 GPPGPQGhIGPQGPPASQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPASQGIQGPLMGLNPRGmQGPPGPREN 787
Cdd:PRK07764 596 GGEGPPA-PASSGPPEEAARPAAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASD-GGDGWPAKA 673
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618 788 QGPAPQGLMIGHPPQEMRGPHPPSGllghGPQEMRGPQEMRGMQGPPPQGSMLGPPQELRGPSGSQGQQGPPQGSLGPPP 867
Cdd:PRK07764 674 GGAAPAAPPPAPAPAAPAAPAGAAP----AQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAADDPVPLPPEPDDPP 749
|
170 180 190 200
....*....|....*....|....*....|....*....|....
gi 569004618 868 QGGMQGPPGPQGQQNPARGPHPSQGPIPFQQQKAPLLGDGPRAP 911
Cdd:PRK07764 750 DPAGAPAQPPPPPAPAPAAAPAAAPPPSPPSEEEEMAEDDAPSM 793
|
|
|