| Name | Accession | Description | Interval | E-value |
| WD40 | cd00200 | WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ... | 121-402 | 1.49e-66 |
|
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.
Pssm-ID: 238121 [Multi-domain] Cd Length: 289 Bit Score: 226.83 E-value: 1.49e-66
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQSNMNN-VKMFQ 199
Cdd:cd00200 11 GVTCVAFSPDGKLLATGSGDGTIKVWDLETGELLRTLKGHTGPVRDVAASADGTYLASGSSDKTIRLWDLETGEcVRTLT 90
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 200 AHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQ 279
Cdd:cd00200 91 GHTSYVSSVAFSPDGRILSSSSRDKTIKVWDVETGKCLTTLRGHTDWVNSVAFSPDGTFVASSSQDGT--IKLWDLRTGK 168
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 280 SLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLLFW 359
Cdd:cd00200 169 CVATLTGHTGEVNSVAFSPDGEKLLSSSSDGTIKLWDLST-GKCLGTLRGHENGVNSVAFSP-DGYLLASGSEDGTIRVW 246
|
250 260 270 280
....*....|....*....|....*....|....*....|...
gi 302699201 360 HVGVEKEVGGMEmAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:cd00200 247 DLRTGECVQTLS-GHTNSVTSLAWSPDGKRLASGSADGTIRIW 288
|
|
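Each hit's detail block carries a one-line summary (Pssm-ID, hit type, Cd Length, Bit Score, E-value). A minimal Python sketch for pulling those fields out of such a line follows. The function name `parse_summary` and the dictionary keys are illustrative assumptions, not part of any NCBI tool; the field labels in the pattern are copied from the listing above.

```python
import re

# Pattern for the per-hit summary line; the labels match the listing above.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*"
    r"\[(?P<hit_type>[^\]]+)\]\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the summary fields as a dict, or None if the line does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m["pssm_id"]),
        "hit_type": m["hit_type"],
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "e_value": float(m["e_value"]),
    }


# Example input: the summary line of the first cd00200 hit.
hit = parse_summary(
    "Pssm-ID: 238121 [Multi-domain] Cd Length: 289 Bit Score: 226.83 E-value: 1.49e-66"
)
```

Converting the E-value to a float makes it straightforward to sort or filter hits numerically instead of comparing strings.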
| WD40 | COG2319 | WD40 repeat [General function prediction only]; | 121-402 | 1.50e-57 |
|
WD40 repeat [General function prediction only];
Pssm-ID: 441893 [Multi-domain] Cd Length: 403 Bit Score: 204.76 E-value: 1.50e-57
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQSNMNN-VKMFQ 199
Cdd:COG2319 122 AVRSVAFSPDGKTLASGSADGTVRLWDLATGKLLRTLTGHSGAVTSVAFSPDGKLLASGSDDGTVRLWDLATGKlLRTLT 201
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 200 AHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQ 279
Cdd:COG2319 202 GHTGAVRSVAFSPDGKLLASGSADGTVRLWDLATGKLLRTLTGHSGSVRSVAFSPDGRLLASGSADGT--VRLWDLATGE 279
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 280 SLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLLFW 359
Cdd:COG2319 280 LLRTLTGHSGGVNSVAFSPDGKLLASGSDDGTVRLWDLAT-GKLLRTLTGHTGAVRSVAFSP-DGKTLASGSDDGTVRLW 357
|
250 260 270 280
....*....|....*....|....*....|....*....|...
gi 302699201 360 HVGVEKEVGGMEmAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:COG2319 358 DLATGELLRTLT-GHTGAVTSVAFSPDGRTLASGSADGTVRLW 399
|
|
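The two hits above cover the same query interval (121-402) but span different fractions of their domain models: the cd00200 alignment runs from model position 11 to 288 of 289, while the COG2319 alignment runs from 122 to 399 of 403. The sketch below computes that spanned fraction. `model_coverage` is a hypothetical helper name, and the calculation uses only the first and last model coordinates, so gaps inside the alignment are not subtracted.

```python
def model_coverage(cd_start, cd_end, cd_length):
    """Fraction of the domain model spanned by the alignment (gaps ignored)."""
    if not (1 <= cd_start <= cd_end <= cd_length):
        raise ValueError("coordinates must satisfy 1 <= start <= end <= length")
    return (cd_end - cd_start + 1) / cd_length


# Model coordinates and lengths taken from the two alignments above.
cd00200_cov = model_coverage(11, 288, 289)    # 278 of 289 model positions
cog2319_cov = model_coverage(122, 399, 403)   # 278 of 403 model positions

print(f"cd00200: {cd00200_cov:.3f}")
print(f"COG2319: {cog2319_cov:.3f}")
```

A hit that spans nearly the whole model, as the cd00200 one does here, is easier to read as a complete domain than one that spans only part of a longer model.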
| Med15 | pfam09606 | ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ... | 576-926 | 1.09e-13 |
|
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.
Pssm-ID: 312941 [Multi-domain] Cd Length: 732 Bit Score: 75.82 E-value: 1.09e-13
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 576 PSSGTPLLGPQPFSGQGPMSQIPQGFQQPHPSQQMPLV---AQMGPPGPQGQFRAP---GPQGQMGPQGPPLHQGGGGPQ 649
Cdd:pfam09606 140 PSQMSRVGRMQPGGQAGGMMQPSSGQPGSGTPNQMGPNggpGQGQAGGMNGGQQGPmggQMPPQMGVPGMPGPADAGAQM 219
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 650 GFMGPQGPQgppqglPRPQDMHGP------QGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPGPQGHIGPQGPPA 723
Cdd:pfam09606 220 GQQAQANGG------MNPQQMGGApnqvamQQQQPQQQGQQSQLGMGINQMQQMPQGVGGGAGQGGPGQPMGPPGQQPGA 293
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 724 PQGHMGPQGPPGTQGMQGPPgPRGMQGPPHPHGIQGGPTSQGIQGPLMGLNPRGMQGPPGPRENQGPAPQGIMIGHPPQE 803
Cdd:pfam09606 294 MPNVMSIGDQNNYQQQQTRQ-QQQQQGGNHPAAHQQQMNQSVGQGGQVVALGGLNHLETWNPGNFGGLGANPMQRGQPGM 372
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 804 MRGPHPpsgLLGHGPQEMRGPQEMRgmqgPPPQGSMLGPpqelrGPSGSQGQQGPPQGSLGPppqggmqgppgpqgqqnP 883
Cdd:pfam09606 373 MSSPSP---VPGQQVRQVTPNQFMR----QSPQPSVPSP-----QGPGSQPPQSHPGGMIPS-----------------P 423
|
330 340 350 360
....*....|....*....|....*....|....*....|...
gi 302699201 884 ARGPHPSQGPIPFQQQKAPLLGDGPRAPFNQEGQSTGPPPLIP 926
Cdd:pfam09606 424 ALIPSPSPQMSQQPAQQRTIGQDSPGGSLNTPGQSAVNSPLNP 466
|
|
| gly_rich_SclB | NF038329 | LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like ... | 690-858 | 1.88e-12 |
|
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like protein 2) is an LPXTG-anchored surface-anchored adhesin with a variable-length region of triple helix-forming collagen-like Gly-Xaa-Xaa repeats.
Pssm-ID: 468478 [Multi-domain] Cd Length: 440 Bit Score: 71.09 E-value: 1.88e-12
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 690 GPPGPQGSSGPQGHMGPQGPPGPQGHIGPQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQGIQGP 769
Cdd:NF038329 132 GEQGPRGDRGETGPAGPAGPPGPQGERGEKGPAGPQGEAGPQGPAGKDGEAGAKGPAGEKGPQGPRGETGPAGEQGPAGP 211
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 770 lmgLNPRGMQGPPGPRENQGPAPQGIMIGHPPQEMRGPHPPSGLLGH-GPQEMRGPQEMRGMQGPPPQGSMLGPPqelrG 848
Cdd:NF038329 212 ---AGPDGEAGPAGEDGPAGPAGDGQQGPDGDPGPTGEDGPQGPDGPaGKDGPRGDRGEAGPDGPDGKDGERGPV----G 284
|
170
....*....|
gi 302699201 849 PSGSQGQQGP 858
Cdd:NF038329 285 PAGKDGQNGK 294
|
|
| gly_rich_SclB | NF038329 | LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like ... | 608-857 | 6.84e-12 |
|
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like protein 2) is an LPXTG-anchored surface-anchored adhesin with a variable-length region of triple helix-forming collagen-like Gly-Xaa-Xaa repeats.
Pssm-ID: 468478 [Multi-domain] Cd Length: 440 Bit Score: 69.16 E-value: 6.84e-12
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 608 QQMPLVAQMGPPGPQGQfraPGPQGQMGPQGPPlhqgGGGPQGFMGPQGPQGPPQGLPRPQdmhGPQGmqrhpgphgplg 687
Cdd:NF038329 111 QQLKGDGEKGEPGPAGP---AGPAGEQGPRGDR----GETGPAGPAGPPGPQGERGEKGPA---GPQG------------ 168
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 688 pqgppgpqgssgpqghmgpqgPPGPQGHIGPQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQGIQ 767
Cdd:NF038329 169 ---------------------EAGPQGPAGKDGEAGAKGPAGEKGPQGPRGETGPAGEQGPAGPAGPDGEAGPAGEDGPA 227
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 768 GPlMGLNPRGMQGPPGPRENQGPApqgimighPPQEMRGPHPPSGLLGH-GPQEMRGPQEMRGMQGPPPQGSMLGpPQEL 846
Cdd:NF038329 228 GP-AGDGQQGPDGDPGPTGEDGPQ--------GPDGPAGKDGPRGDRGEaGPDGPDGKDGERGPVGPAGKDGQNG-KDGL 297
|
250
....*....|.
gi 302699201 847 RGPSGSQGQQG 857
Cdd:NF038329 298 PGKDGKDGQNG 308
|
|
| WD40 | smart00320 | WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ... | 191-230 | 7.13e-09 |
|
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.
Pssm-ID: 197651 [Multi-domain] Cd Length: 40 Bit Score: 52.70 E-value: 7.13e-09
10 20 30 40
....*....|....*....|....*....|....*....|
gi 302699201 191 NMNNVKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD 230
Cdd:smart00320 1 SGELLKTLKGHTGPVTSVAFSPDGKYLASGSDDGTIKLWD 40
|
|
| PTZ00421 | PTZ00421 | coronin; Provisional | 205-319 | 2.81e-08 |
|
coronin; Provisional
Pssm-ID: 173611 [Multi-domain] Cd Length: 493 Bit Score: 57.98 E-value: 2.81e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 205 IREASFSPTDN-KFATCSDDGTVRIWDFlrcHEERI----------LRGHGADVKCVDWHPT-KGLVVSGSKDSQqpIKF 272
Cdd:PTZ00421 78 IIDVAFNPFDPqKLFTASEDGTIMGWGI---PEEGLtqnisdpivhLQGHTKKVGIVSFHPSaMNVLASAGADMV--VNV 152
|
90 100 110 120
....*....|....*....|....*....|....*....|....*..
gi 302699201 273 WDPKTGQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRN 319
Cdd:PTZ00421 153 WDVERGKAVEVIKCHSDQITSLEWNLDGSLLCTTSKDKKLNIIDPRD 199
|
|
| WD40 | pfam00400 | WD domain, G-beta repeat; | 195-230 | 7.51e-08 |
|
WD domain, G-beta repeat;
Pssm-ID: 459801 [Multi-domain] Cd Length: 39 Bit Score: 49.65 E-value: 7.51e-08
10 20 30
....*....|....*....|....*....|....*.
gi 302699201 195 VKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD 230
Cdd:pfam00400 4 LKTLEGHTGSVTSLAFSPDGKLLASGSDDGTVKVWD 39
|
|
| PHA03247 | PHA03247 | large tegument protein UL36; Provisional | 561-970 | 5.79e-06 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 51.09 E-value: 5.79e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 561 IERLAQKQADQIQPPPSSG---TPLLGPQPFSGQGPMSQIPQGFQQPHPSQQMPLVAQMGPPGPQGQFRAPGPQGQMGPQ 637
Cdd:PHA03247 2578 SEPAVTSRARRPDAPPQSArprAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPSPAANEPDPHPPPTVPPPERPRDDPA 2657
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 638 GPPLHQGGGGPQGFMGPQGPQGPPQglPRPQDMHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPGPQGHIG 717
Cdd:PHA03247 2658 PGRVSRPRRARRLGRAAQASSPPQR--PRRRAARPTVGSLTSLADPPPPPPTPEPAPHALVSATPLPPGPAAARQASPAL 2735
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 718 PQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQGIQGPLMGLNPrGMQGPPGPREnqgPAPQGIMI 797
Cdd:PHA03247 2736 PAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSE-SRESLPSPWD---PADPPAAV 2811
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 798 GHPPQEMRGPHPPSGLLghgpqemrgPQEMRGMQGPPPQGSmlGPPQELRGPSGSQGQQG-----PPQGSLGPPPQGGMQ 872
Cdd:PHA03247 2812 LAPAAALPPAASPAGPL---------PPPTSAQPTAPPPPP--GPPPPSLPLGGSVAPGGdvrrrPPSRSPAAKPAAPAR 2880
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 873 GPPGPQGQQNPARGPHP-SQGPIPFQQQKAPLLGDGPRAPFNQEGQ-STGPPPLIPGLGQqgaqgriPPLNPGQGPGPNK 950
Cdd:PHA03247 2881 PPVRRLARPAVSRSTESfALPPDQPERPPQPQAPPPPQPQPQPPPPpQPQPPPPPPPRPQ-------PPLAPTTDPAGAG 2953
|
410 420
....*....|....*....|
gi 302699201 951 GDSRGPPNHHLGPMSERRHE 970
Cdd:PHA03247 2954 EPSGAVPQPWLGALVPGRVA 2973
|
|
| SPT5 | COG5164 | Transcription elongation factor SPT5 [Transcription]; | 665-949 | 9.89e-06 |
|
Transcription elongation factor SPT5 [Transcription];
Pssm-ID: 444063 [Multi-domain] Cd Length: 495 Bit Score: 49.64 E-value: 9.89e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 665 PRPQDMHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPgpqghiGPQGPPAPQGHMGPQGPPGTQGMQGPPG 744
Cdd:COG5164 12 SDPGGVTTPAGSQGSTKPAQNQGSTRPAGNTGGTRPAQNQGSTTPA------GNTGGTRPAGNQGATGPAQNQGGTTPAQ 85
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 745 PRGMQGPPHPHGIQGGPTSQGIQGPlmgLNPRGMQGPPGPRENQGPAPQGIMIGHPPqemrGPHPPSGLLGHGPQEMRGP 824
Cdd:COG5164 86 NQGGTRPAGNTGGTTPAGDGGATGP---PDDGGATGPPDDGGSTTPPSGGSTTPPGD----GGSTPPGPGSTGPGGSTTP 158
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 825 QEMRGMQGPPPQGSMLGPPqelrgpsGSQGQQGPPQGSlgpppqggmqgppgpqgqqnpARGPHPSQGPIPFQQQKAPLL 904
Cdd:COG5164 159 PGDGGSTTPPGPGGSTTPP-------DDGGSTTPPNKG---------------------ETGTDIPTGGTPRQGPDGPVK 210
|
250 260 270 280
....*....|....*....|....*....|....*....|....*
gi 302699201 905 GDGPRAPFNQEGQSTGPPPLIPGLGQQGAQGRIPPLNPGQGPGPN 949
Cdd:COG5164 211 KDDKNGKGNPPDDRGGKTGPKDQRPKTNPIERRGPERPEAAALPA 255
|
|
| PABP-1234 | TIGR01628 | polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ... | 556-675 | 1.56e-04 |
|
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens, which are variously expressed in testis, expressed in platelets, broadly expressed, and of unknown tissue range.
Pssm-ID: 130689 [Multi-domain] Cd Length: 562 Bit Score: 45.95 E-value: 1.56e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 556 LEQLKIERLA--QKQADQIQPP---PSSGTPLLG---PQPFSGQGPMSQIPQ-----GFQQPHPSQQMPlvaqMGPPGPQ 622
Cdd:TIGR01628 360 LAQRKEQRRAhlQDQFMQLQPRmrqLPMGSPMGGamgQPPYYGQGPQQQFNGqplgwPRMSMMPTPMGP----GGPLRPN 435
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|....
gi 302699201 623 GqFRAPGPQGQMGPQGPPL-HQGGGGPQGFMGPQGPQGPPQGLPRPQDMHGPQG 675
Cdd:TIGR01628 436 G-LAPMNAVRAPSRNAQNAaQKPPMQPVMYPPNYQSLPLSQDLPQPQSTASQGG 488
|
|
|
| WD40 | COG2319 | WD40 repeat [General function prediction only]; | 121-402 | 2.37e-52 |
|
WD40 repeat [General function prediction only];
Pssm-ID: 441893 [Multi-domain] Cd Length: 403 Bit Score: 189.74 E-value: 2.37e-52
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQ-SNMNNVKMFQ 199
Cdd:COG2319 38 AVASLAASPDGARLAAGAGDLTLLLLDAAAGALLATLLGHTAAVLSVAFSPDGRLLASASADGTVRLWDlATGLLLRTLT 117
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 200 AHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQ 279
Cdd:COG2319 118 GHTGAVRSVAFSPDGKTLASGSADGTVRLWDLATGKLLRTLTGHSGAVTSVAFSPDGKLLASGSDDGT--VRLWDLATGK 195
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 280 SLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvhEG-LFASGGSDGSLLF 358
Cdd:COG2319 196 LLRTLTGHTGAVRSVAFSPDGKLLASGSADGTVRLWDLAT-GKLLRTLTGHSGSVRSVAFSP--DGrLLASGSADGTVRL 272
|
250 260 270 280
....*....|....*....|....*....|....*....|....
gi 302699201 359 WHVGvEKEVGGMEMAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:COG2319 273 WDLA-TGELLRTLTGHSGGVNSVAFSPDGKLLASGSDDGTVRLW 315
|
|
| WD40 | COG2319 | WD40 repeat [General function prediction only]; | 121-361 | 1.93e-47 |
|
WD40 repeat [General function prediction only];
Pssm-ID: 441893 [Multi-domain] Cd Length: 403 Bit Score: 175.48 E-value: 1.93e-47
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWqsNMNN---VKM 197
Cdd:COG2319 164 AVTSVAFSPDGKLLASGSDDGTVRLWDLATGKLLRTLTGHTGAVRSVAFSPDGKLLASGSADGTVRLW--DLATgklLRT 241
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 198 FQAHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKT 277
Cdd:COG2319 242 LTGHSGSVRSVAFSPDGRLLASGSADGTVRLWDLATGELLRTLTGHSGGVNSVAFSPDGKLLASGSDDGT--VRLWDLAT 319
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 278 GQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLL 357
Cdd:COG2319 320 GKLLRTLTGHTGAVRSVAFSPDGKTLASGSDDGTVRLWDLAT-GELLRTLTGHTGAVTSVAFSP-DGRTLASGSADGTVR 397
|
....
gi 302699201 358 FWHV 361
Cdd:COG2319 398 LWDL 401
|
|
| WD40 | cd00200 | WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ... | 194-402 | 4.09e-43 |
|
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.
Pssm-ID: 238121 [Multi-domain] Cd Length: 289 Bit Score: 159.42 E-value: 4.09e-43
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 194 NVKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFW 273
Cdd:cd00200 1 LRRTLKGHTGGVTCVAFSPDGKLLATGSGDGTIKVWDLETGELLRTLKGHTGPVRDVAASADGTYLASGSSDKT--IRLW 78
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 274 DPKTGQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvhEGLFASGGS- 352
Cdd:cd00200 79 DLETGECVRTLTGHTSYVSSVAFSPDGRILSSSSRDKTIKVWDVET-GKCLTTLRGHTDWVNSVAFSP--DGTFVASSSq 155
|
170 180 190 200 210
....*....|....*....|....*....|....*....|....*....|
gi 302699201 353 DGSLLFWHVGVEKEVGGMEmAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:cd00200 156 DGTIKLWDLRTGKCVATLT-GHTGEVNSVAFSPDGEKLLSSSSDGTIKLW 204
|
|
| WD40 | COG2319 | WD40 repeat [General function prediction only]; | 126-402 | 3.15e-42 |
|
WD40 repeat [General function prediction only];
Pssm-ID: 441893 [Multi-domain] Cd Length: 403 Bit Score: 160.08 E-value: 3.15e-42
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 126 RWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHND-MWMLTADHGGYVKYWQSNMNNVKMFQAHKEA 204
Cdd:COG2319 1 ALSADGAALAAASADLALALLAAALGALLLLLLGLAAAVASLAASPDGaRLAAGAGDLTLLLLDAAAGALLATLLGHTAA 80
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 205 IREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQSLATL 284
Cdd:COG2319 81 VLSVAFSPDGRLLASASADGTVRLWDLATGLLLRTLTGHTGAVRSVAFSPDGKTLASGSADGT--VRLWDLATGKLLRTL 158
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 285 HAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNLKeELQVFRGHKKEATAVAWHPvhEG-LFASGGSDGSLLFWHVGV 363
Cdd:COG2319 159 TGHSGAVTSVAFSPDGKLLASGSDDGTVRLWDLATGK-LLRTLTGHTGAVRSVAFSP--DGkLLASGSADGTVRLWDLAT 235
|
250 260 270
....*....|....*....|....*....|....*....
gi 302699201 364 EKEVGGMEmAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:COG2319 236 GKLLRTLT-GHSGSVRSVAFSPDGRLLASGSADGTVRLW 273
|
|
| WD40 | cd00200 | WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ... | 119-359 | 4.20e-39 |
|
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.
Pssm-ID: 238121 [Multi-domain] Cd Length: 289 Bit Score: 147.48 E-value: 4.20e-39
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 119 KCPVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAmtwshndmwmltadhggyvkywqsnmnnvkmf 198
Cdd:cd00200 93 TSYVSSVAFSPDGRILSSSSRDKTIKVWDVETGKCLTTLRGHTDWVNS-------------------------------- 140
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 199 qahkeaireASFSPtDNKF-ATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKT 277
Cdd:cd00200 141 ---------VAFSP-DGTFvASSSQDGTIKLWDLRTGKCVATLTGHTGEVNSVAFSPDGEKLLSSSSDGT--IKLWDLST 208
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 278 GQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLL 357
Cdd:cd00200 209 GKCLGTLRGHENGVNSVAFSPDGYLLASGSEDGTIRVWDLRT-GECVQTLSGHTNSVTSLAWSP-DGKRLASGSADGTIR 286
|
..
gi 302699201 358 FW 359
Cdd:cd00200 287 IW 288
|
|
| WD40 | COG2319 | WD40 repeat [General function prediction only]; | 121-319 | 7.34e-39 |
|
WD40 repeat [General function prediction only];
Pssm-ID: 441893 [Multi-domain] Cd Length: 403 Bit Score: 150.45 E-value: 7.34e-39
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQ-SNMNNVKMFQ 199
Cdd:COG2319 206 AVRSVAFSPDGKLLASGSADGTVRLWDLATGKLLRTLTGHSGSVRSVAFSPDGRLLASGSADGTVRLWDlATGELLRTLT 285
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 200 AHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQ 279
Cdd:COG2319 286 GHSGGVNSVAFSPDGKLLASGSDDGTVRLWDLATGKLLRTLTGHTGAVRSVAFSPDGKTLASGSDDGT--VRLWDLATGE 363
|
170 180 190 200
....*....|....*....|....*....|....*....|
gi 302699201 280 SLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRN 319
Cdd:COG2319 364 LLRTLTGHTGAVTSVAFSPDGRTLASGSADGTVRLWDLAT 403
|
|
| WD40 | cd00200 | WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ... | 119-274 | 1.51e-28 |
|
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.
Pssm-ID: 238121 [Multi-domain] Cd Length: 289 Bit Score: 117.05 E-value: 1.51e-28
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 119 KCPVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQSNM-NNVKM 197
Cdd:cd00200 135 TDWVNSVAFSPDGTFVASSSQDGTIKLWDLRTGKCVATLTGHTGEVNSVAFSPDGEKLLSSSSDGTIKLWDLSTgKCLGT 214
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 302699201 198 FQAHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWD 274
Cdd:cd00200 215 LRGHENGVNSVAFSPDGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLASGSADGT--IRIWD 289
|
|
| Med15 |
pfam09606 |
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ... |
576-926 |
1.09e-13 |
|
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.
Pssm-ID: 312941 [Multi-domain] Cd Length: 732 Bit Score: 75.82 E-value: 1.09e-13
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 576 PSSGTPLLGPQPFSGQGPMSQIPQGFQQPHPSQQMPLV---AQMGPPGPQGQFRAP---GPQGQMGPQGPPLHQGGGGPQ 649
Cdd:pfam09606 140 PSQMSRVGRMQPGGQAGGMMQPSSGQPGSGTPNQMGPNggpGQGQAGGMNGGQQGPmggQMPPQMGVPGMPGPADAGAQM 219
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 650 GFMGPQGPQgppqglPRPQDMHGP------QGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPGPQGHIGPQGPPA 723
Cdd:pfam09606 220 GQQAQANGG------MNPQQMGGApnqvamQQQQPQQQGQQSQLGMGINQMQQMPQGVGGGAGQGGPGQPMGPPGQQPGA 293
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 724 PQGHMGPQGPPGTQGMQGPPgPRGMQGPPHPHGIQGGPTSQGIQGPLMGLNPRGMQGPPGPRENQGPAPQGIMIGHPPQE 803
Cdd:pfam09606 294 MPNVMSIGDQNNYQQQQTRQ-QQQQQGGNHPAAHQQQMNQSVGQGGQVVALGGLNHLETWNPGNFGGLGANPMQRGQPGM 372
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 804 MRGPHPpsgLLGHGPQEMRGPQEMRgmqgPPPQGSMLGPpqelrGPSGSQGQQGPPQGSLGPppqggmqgppgpqgqqnP 883
Cdd:pfam09606 373 MSSPSP---VPGQQVRQVTPNQFMR----QSPQPSVPSP-----QGPGSQPPQSHPGGMIPS-----------------P 423
|
330 340 350 360
....*....|....*....|....*....|....*....|...
gi 302699201 884 ARGPHPSQGPIPFQQQKAPLLGDGPRAPFNQEGQSTGPPPLIP 926
Cdd:pfam09606 424 ALIPSPSPQMSQQPAQQRTIGQDSPGGSLNTPGQSAVNSPLNP 466
|
|
| gly_rich_SclB |
NF038329 |
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like ... |
690-858 |
1.88e-12 |
|
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like protein 2) is an LPXTG-anchored surface adhesin with a variable-length region of triple helix-forming collagen-like Gly-Xaa-Xaa repeats.
Pssm-ID: 468478 [Multi-domain] Cd Length: 440 Bit Score: 71.09 E-value: 1.88e-12
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 690 GPPGPQGSSGPQGHMGPQGPPGPQGHIGPQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQGIQGP 769
Cdd:NF038329 132 GEQGPRGDRGETGPAGPAGPPGPQGERGEKGPAGPQGEAGPQGPAGKDGEAGAKGPAGEKGPQGPRGETGPAGEQGPAGP 211
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 770 lmgLNPRGMQGPPGPRENQGPAPQGIMIGHPPQEMRGPHPPSGLLGH-GPQEMRGPQEMRGMQGPPPQGSMLGPPqelrG 848
Cdd:NF038329 212 ---AGPDGEAGPAGEDGPAGPAGDGQQGPDGDPGPTGEDGPQGPDGPaGKDGPRGDRGEAGPDGPDGKDGERGPV----G 284
|
170
....*....|
gi 302699201 849 PSGSQGQQGP 858
Cdd:NF038329 285 PAGKDGQNGK 294
|
|
| gly_rich_SclB |
NF038329 |
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like ... |
608-857 |
6.84e-12 |
|
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like protein 2) is an LPXTG-anchored surface adhesin with a variable-length region of triple helix-forming collagen-like Gly-Xaa-Xaa repeats.
Pssm-ID: 468478 [Multi-domain] Cd Length: 440 Bit Score: 69.16 E-value: 6.84e-12
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 608 QQMPLVAQMGPPGPQGQfraPGPQGQMGPQGPPlhqgGGGPQGFMGPQGPQGPPQGLPRPQdmhGPQGmqrhpgphgplg 687
Cdd:NF038329 111 QQLKGDGEKGEPGPAGP---AGPAGEQGPRGDR----GETGPAGPAGPPGPQGERGEKGPA---GPQG------------ 168
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 688 pqgppgpqgssgpqghmgpqgPPGPQGHIGPQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQGIQ 767
Cdd:NF038329 169 ---------------------EAGPQGPAGKDGEAGAKGPAGEKGPQGPRGETGPAGEQGPAGPAGPDGEAGPAGEDGPA 227
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 768 GPlMGLNPRGMQGPPGPRENQGPApqgimighPPQEMRGPHPPSGLLGH-GPQEMRGPQEMRGMQGPPPQGSMLGpPQEL 846
Cdd:NF038329 228 GP-AGDGQQGPDGDPGPTGEDGPQ--------GPDGPAGKDGPRGDRGEaGPDGPDGKDGERGPVGPAGKDGQNG-KDGL 297
|
250
....*....|.
gi 302699201 847 RGPSGSQGQQG 857
Cdd:NF038329 298 PGKDGKDGQNG 308
|
|
| WD40 |
smart00320 |
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ... |
191-230 |
7.13e-09 |
|
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.
Pssm-ID: 197651 [Multi-domain] Cd Length: 40 Bit Score: 52.70 E-value: 7.13e-09
10 20 30 40
....*....|....*....|....*....|....*....|
gi 302699201 191 NMNNVKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD 230
Cdd:smart00320 1 SGELLKTLKGHTGPVTSVAFSPDGKYLASGSDDGTIKLWD 40
|
|
| PTZ00421 |
PTZ00421 |
coronin; Provisional |
205-319 |
2.81e-08 |
|
coronin; Provisional
Pssm-ID: 173611 [Multi-domain] Cd Length: 493 Bit Score: 57.98 E-value: 2.81e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 205 IREASFSPTDN-KFATCSDDGTVRIWDFlrcHEERI----------LRGHGADVKCVDWHPT-KGLVVSGSKDSQqpIKF 272
Cdd:PTZ00421 78 IIDVAFNPFDPqKLFTASEDGTIMGWGI---PEEGLtqnisdpivhLQGHTKKVGIVSFHPSaMNVLASAGADMV--VNV 152
|
90 100 110 120
....*....|....*....|....*....|....*....|....*..
gi 302699201 273 WDPKTGQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRN 319
Cdd:PTZ00421 153 WDVERGKAVEVIKCHSDQITSLEWNLDGSLLCTTSKDKKLNIIDPRD 199
|
|
| WD40 |
pfam00400 |
WD domain, G-beta repeat; |
195-230 |
7.51e-08 |
|
WD domain, G-beta repeat;
Pssm-ID: 459801 [Multi-domain] Cd Length: 39 Bit Score: 49.65 E-value: 7.51e-08
10 20 30
....*....|....*....|....*....|....*.
gi 302699201 195 VKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD 230
Cdd:pfam00400 4 LKTLEGHTGSVTSLAFSPDGKLLASGSDDGTVKVWD 39
|
|
| PLN00181 |
PLN00181 |
protein SPA1-RELATED; Provisional |
215-359 |
1.86e-07 |
|
protein SPA1-RELATED; Provisional
Pssm-ID: 177776 [Multi-domain] Cd Length: 793 Bit Score: 55.86 E-value: 1.86e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 215 NKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWH---PTkgLVVSGSKDSQqpIKFWDPKTGQSLATLHAHKNTV 291
Cdd:PLN00181 546 SQVASSNFEGVVQVWDVARSQLVTEMKEHEKRVWSIDYSsadPT--LLASGSDDGS--VKLWSINQGVSIGTIKTKANIC 621
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 302699201 292 MEVKLNLNGNWLLTASRDHLCKLFDIRNLKEELQVFRGHKKEATAVAWhpVHEGLFASGGSDGSLLFW 359
Cdd:PLN00181 622 CVQFPSESGRSLAFGSADHKVYYYDLRNPKLPLCTMIGHSKTVSYVRF--VDSSTLVSSSTDNTLKLW 687
|
|
| WD40 |
smart00320 |
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ... |
238-274 |
5.80e-07 |
|
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.
Pssm-ID: 197651 [Multi-domain] Cd Length: 40 Bit Score: 47.31 E-value: 5.80e-07
10 20 30
....*....|....*....|....*....|....*..
gi 302699201 238 RILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWD 274
Cdd:smart00320 6 KTLKGHTGPVTSVAFSPDGKYLASGSDDGT--IKLWD 40
|
|
| WD40 |
pfam00400 |
WD domain, G-beta repeat; |
238-274 |
3.34e-06 |
|
WD domain, G-beta repeat;
Pssm-ID: 459801 [Multi-domain] Cd Length: 39 Bit Score: 45.03 E-value: 3.34e-06
10 20 30
....*....|....*....|....*....|....*..
gi 302699201 238 RILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWD 274
Cdd:pfam00400 5 KTLEGHTGSVTSLAFSPDGKLLASGSDDGT--VKVWD 39
|
|
| WD40 |
smart00320 |
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ... |
321-360 |
4.25e-06 |
|
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.
Pssm-ID: 197651 [Multi-domain] Cd Length: 40 Bit Score: 44.61 E-value: 4.25e-06
10 20 30 40
....*....|....*....|....*....|....*....|
gi 302699201 321 KEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLLFWH 360
Cdd:smart00320 2 GELLKTLKGHTGPVTSVAFSP-DGKYLASGSDDGTIKLWD 40
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
561-970 |
5.79e-06 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 51.09 E-value: 5.79e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 561 IERLAQKQADQIQPPPSSG---TPLLGPQPFSGQGPMSQIPQGFQQPHPSQQMPLVAQMGPPGPQGQFRAPGPQGQMGPQ 637
Cdd:PHA03247 2578 SEPAVTSRARRPDAPPQSArprAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPSPAANEPDPHPPPTVPPPERPRDDPA 2657
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 638 GPPLHQGGGGPQGFMGPQGPQGPPQglPRPQDMHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPGPQGHIG 717
Cdd:PHA03247 2658 PGRVSRPRRARRLGRAAQASSPPQR--PRRRAARPTVGSLTSLADPPPPPPTPEPAPHALVSATPLPPGPAAARQASPAL 2735
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 718 PQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQGIQGPLMGLNPrGMQGPPGPREnqgPAPQGIMI 797
Cdd:PHA03247 2736 PAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSE-SRESLPSPWD---PADPPAAV 2811
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 798 GHPPQEMRGPHPPSGLLghgpqemrgPQEMRGMQGPPPQGSmlGPPQELRGPSGSQGQQG-----PPQGSLGPPPQGGMQ 872
Cdd:PHA03247 2812 LAPAAALPPAASPAGPL---------PPPTSAQPTAPPPPP--GPPPPSLPLGGSVAPGGdvrrrPPSRSPAAKPAAPAR 2880
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 873 GPPGPQGQQNPARGPHP-SQGPIPFQQQKAPLLGDGPRAPFNQEGQ-STGPPPLIPGLGQqgaqgriPPLNPGQGPGPNK 950
Cdd:PHA03247 2881 PPVRRLARPAVSRSTESfALPPDQPERPPQPQAPPPPQPQPQPPPPpQPQPPPPPPPRPQ-------PPLAPTTDPAGAG 2953
|
410 420
....*....|....*....|
gi 302699201 951 GDSRGPPNHHLGPMSERRHE 970
Cdd:PHA03247 2954 EPSGAVPQPWLGALVPGRVA 2973
|
|
| Collagen |
pfam01391 |
Collagen triple helix repeat (20 copies); Members of this family belong to the collagen ... |
717-769 |
6.79e-06 |
|
Collagen triple helix repeat (20 copies); Members of this family belong to the collagen superfamily. Collagens are generally extracellular structural proteins involved in formation of connective tissue structure. The alignment contains 20 copies of the G-X-Y repeat that forms a triple helix. The first position of the repeat is glycine; the second and third positions can be any residue but are frequently proline and hydroxyproline. Collagens are post-translationally modified by proline hydroxylase to form the hydroxyproline residues. Defective hydroxylation is the cause of scurvy. Some members of the collagen superfamily are not involved in connective tissue structure but share the same triple helical structure. The family includes bacterial collagen-like triple-helix repeat proteins.
Pssm-ID: 460189 [Multi-domain] Cd Length: 57 Bit Score: 44.79 E-value: 6.79e-06
10 20 30 40 50
....*....|....*....|....*....|....*....|....*....|...
gi 302699201 717 GPQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQGIQGP 769
Cdd:pfam01391 1 GPPGPPGPPGPPGPPGPPGPPGPPGPPGPPGEPGPPGPPGPPGPPGPPGAPGA 53
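The G-X-Y periodicity described for this family can be checked directly on the two aligned segments above. The following is a minimal sketch, not part of the CDD output: the function names are hypothetical, and it assumes the repeat is read in frame from the first residue of each segment, with any trailing partial triplet ignored.

```python
def gxy_triplets(segment: str) -> list[str]:
    """Split a segment into consecutive, non-overlapping triplets.

    A trailing partial triplet (one or two residues) is dropped.
    """
    return [segment[i:i + 3] for i in range(0, len(segment) - 2, 3)]


def glycine_first_fraction(segment: str) -> tuple[int, int]:
    """Return (triplets starting with glycine, total triplets)."""
    triplets = gxy_triplets(segment)
    hits = sum(1 for t in triplets if t[0] == "G")
    return hits, len(triplets)


# Query residues 717-769 and the pfam01391 model segment, copied from the alignment above.
query = "GPQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQGIQGP"
model = "GPPGPPGPPGPPGPPGPPGPPGPPGPPGPPGEPGPPGPPGPPGPPGPPGAPGA"

for name, seg in (("query", query), ("model", model)):
    hits, total = glycine_first_fraction(seg)
    print(f"{name}: {hits}/{total} triplets start with G")
```

Read this way, the query segment has glycine in the first position of 14 of its 17 complete triplets; the exceptions are APQ, HPH and TSQ.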
|
|
| WD40 |
smart00320 |
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ... |
373-402 |
7.14e-06 |
|
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.
Pssm-ID: 197651 [Multi-domain] Cd Length: 40 Bit Score: 44.23 E-value: 7.14e-06
10 20 30
....*....|....*....|....*....|
gi 302699201 373 AHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:smart00320 10 GHTGPVTSVAFSPDGKYLASGSDDGTIKLW 39
|
|
| SPT5 |
COG5164 |
Transcription elongation factor SPT5 [Transcription]; |
665-949 |
9.89e-06 |
|
Transcription elongation factor SPT5 [Transcription];
Pssm-ID: 444063 [Multi-domain] Cd Length: 495 Bit Score: 49.64 E-value: 9.89e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 665 PRPQDMHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPgpqghiGPQGPPAPQGHMGPQGPPGTQGMQGPPG 744
Cdd:COG5164 12 SDPGGVTTPAGSQGSTKPAQNQGSTRPAGNTGGTRPAQNQGSTTPA------GNTGGTRPAGNQGATGPAQNQGGTTPAQ 85
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 745 PRGMQGPPHPHGIQGGPTSQGIQGPlmgLNPRGMQGPPGPRENQGPAPQGIMIGHPPqemrGPHPPSGLLGHGPQEMRGP 824
Cdd:COG5164 86 NQGGTRPAGNTGGTTPAGDGGATGP---PDDGGATGPPDDGGSTTPPSGGSTTPPGD----GGSTPPGPGSTGPGGSTTP 158
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 825 QEMRGMQGPPPQGSMLGPPqelrgpsGSQGQQGPPQGSlgpppqggmqgppgpqgqqnpARGPHPSQGPIPFQQQKAPLL 904
Cdd:COG5164 159 PGDGGSTTPPGPGGSTTPP-------DDGGSTTPPNKG---------------------ETGTDIPTGGTPRQGPDGPVK 210
|
250 260 270 280
....*....|....*....|....*....|....*....|....*
gi 302699201 905 GDGPRAPFNQEGQSTGPPPLIPGLGQQGAQGRIPPLNPGQGPGPN 949
Cdd:COG5164 211 KDDKNGKGNPPDDRGGKTGPKDQRPKTNPIERRGPERPEAAALPA 255
|
|
| Med15 |
pfam09606 |
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ... |
614-983 |
1.11e-05 |
|
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.
Pssm-ID: 312941 [Multi-domain] Cd Length: 732 Bit Score: 50.01 E-value: 1.11e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 614 AQMGPPGPQGQFRAPGPQGQMGPqgPPLHQGGGGPQGFMGPQGPQgppqglPRPQDMHGPQGMQRHPGPHGPLGPQGPPG 693
Cdd:pfam09606 58 AQQQQPQGGQGNGGMGGGQQGMP--DPINALQNLAGQGTRPQMMG------PMGPGPGGPMGQQMGGPGTASNLLASLGR 129
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 694 PQGSsgpqghmgpqgppgpqghIGPQGPPAPQGHMGPQGPPG-TQGMQGPPGPRGMQGPPHPHGIQGGP-------TSQG 765
Cdd:pfam09606 130 PQMP------------------MGGAGFPSQMSRVGRMQPGGqAGGMMQPSSGQPGSGTPNQMGPNGGPgqgqaggMNGG 191
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 766 IQGPLMGLNPRGM--QGPPGPRE--NQGPAPQGIMIGHPPQEMRGPhPPSGLLGHGPQEMRGPQEMRGMQGPPPQGSMLG 841
Cdd:pfam09606 192 QQGPMGGQMPPQMgvPGMPGPADagAQMGQQAQANGGMNPQQMGGA-PNQVAMQQQQPQQQGQQSQLGMGINQMQQMPQG 270
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 842 PPQElrGPSGSQGQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARGPHPSQ-GPIPFQQQKAPLLGDGPRAPFNQEGQSTG 920
Cdd:pfam09606 271 VGGG--AGQGGPGQPMGPPGQQPGAMPNVMSIGDQNNYQQQQTRQQQQQQgGNHPAAHQQQMNQSVGQGGQVVALGGLNH 348
|
330 340 350 360 370 380
....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 302699201 921 PPPLIP----GLGQQGAQGRIPPLNPGQGPGPNKGDSRGPPNHHLGPMSERRHEQSGGP-EHGPDRGP 983
Cdd:pfam09606 349 LETWNPgnfgGLGANPMQRGQPGMMSSPSPVPGQQVRQVTPNQFMRQSPQPSVPSPQGPgSQPPQSHP 416
|
|
| WD40 |
smart00320 |
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ... |
277-316 |
1.14e-05 |
|
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.
Pssm-ID: 197651 [Multi-domain] Cd Length: 40 Bit Score: 43.46 E-value: 1.14e-05
10 20 30 40
....*....|....*....|....*....|....*....|
gi 302699201 277 TGQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFD 316
Cdd:smart00320 1 SGELLKTLKGHTGPVTSVAFSPDGKYLASGSDDGTIKLWD 40
|
|
| PHA03378 |
PHA03378 |
EBNA-3B; Provisional |
734-978 |
1.18e-05 |
|
EBNA-3B; Provisional
Pssm-ID: 223065 [Multi-domain] Cd Length: 991 Bit Score: 50.07 E-value: 1.18e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 734 PGTQGMQGPPGPRGMQGPPHphgiQGGPTSQGIQGPLMGLNPRGMQ----GPPGPRENQGPaPQGIMIGHPPQEMRGPHP 809
Cdd:PHA03378 598 PVPHPSQTPEPPTTQSHIPE----TSAPRQWPMPLRPIPMRPLRMQpitfNVLVFPTPHQP-PQVEITPYKPTWTQIGHI 672
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 810 PSGLLGHGPQEMRGPQEMRGMQGPPPQGsmlgpPQELRGPSGSQGQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARGPHP 889
Cdd:PHA03378 673 PYQPSPTGANTMLPIQWAPGTMQPPPRA-----PTPMRPPAAPPGRAQRPAAATGRARPPAAAPGRARPPAAAPGRARPP 747
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 890 SQGPIPFQQQKAPLLGDGPraPFNQEGQST-GPPPLIPGLGQQGAQGRIPPLNPGQ-GPGPNKGDSRGPPNHHLGPMSER 967
Cdd:PHA03378 748 AAAPGRARPPAAAPGRARP--PAAAPGAPTpQPPPQAPPAPQQRPRGAPTPQPPPQaGPTSMQLMPRAAPGQQGPTKQIL 825
|
250
....*....|.
gi 302699201 968 RHEQSGGPEHG 978
Cdd:PHA03378 826 RQLLTGGVKRG 836
|
|
| WD40 |
pfam00400 |
WD domain, G-beta repeat; |
278-316 |
1.40e-05 |
|
WD domain, G-beta repeat;
Pssm-ID: 459801 [Multi-domain] Cd Length: 39 Bit Score: 43.10 E-value: 1.40e-05
10 20 30
....*....|....*....|....*....|....*....
gi 302699201 278 GQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFD 316
Cdd:pfam00400 1 GKLLKTLEGHTGSVTSLAFSPDGKLLASGSDDGTVKVWD 39
|
|
| PAT1 |
pfam09770 |
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... |
533-761 |
4.88e-05 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 47.72 E-value: 4.88e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 533 EKKKTQAEIEQEMATLQYTNPQL------LEQLKIERLAQKQADQIQPPPSSGtpllgPQPFSGQGPMSQIPQGFQQPHP 606
Cdd:pfam09770 167 PKKAAAPAPAPQPAAQPASLPAPsrkmmsLEEVEAAMRAQAKKPAQQPAPAPA-----QPPAAPPAQQAQQQQQFPPQIQ 241
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 607 SQQMPlvAQMGPPGPQGQFRAPGPQGQMGPQGPPLhqggggPQGFMGPQGPQGPPQGLPRPQDMHGPQGMQRhpgphgpl 686
Cdd:pfam09770 242 QQQQP--QQQPQQPQQHPGQGHPVTILQRPQSPQP------DPAQPSIQPQAQQFHQQPPPVPVQPTQILQN-------- 305
|
170 180 190 200 210 220 230
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 302699201 687 gpqgppgpqgssgpqghmgpqgppgpqghigPQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGP 761
Cdd:pfam09770 306 -------------------------------PNRLSAARVGYPQNPQPGVQPAPAHQAHRQQGSFGRQAPIITHP 349
|
|
| WD40 |
pfam00400 |
WD domain, G-beta repeat; |
373-402 |
5.47e-05 |
|
WD domain, G-beta repeat;
Pssm-ID: 459801 [Multi-domain] Cd Length: 39 Bit Score: 41.56 E-value: 5.47e-05
10 20 30
....*....|....*....|....*....|
gi 302699201 373 AHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:pfam00400 9 GHTGSVTSLAFSPDGKLLASGSDDGTVKVW 38
|
|
| Collagen |
pfam01391 |
Collagen triple helix repeat (20 copies); Members of this family belong to the collagen ... |
717-765 |
5.66e-05 |
|
Collagen triple helix repeat (20 copies); Members of this family belong to the collagen superfamily. Collagens are generally extracellular structural proteins involved in formation of connective tissue structure. The alignment contains 20 copies of the G-X-Y repeat that forms a triple helix. The first position of the repeat is glycine; the second and third positions can be any residue but are frequently proline and hydroxyproline. Collagens are post-translationally modified by proline hydroxylase to form the hydroxyproline residues. Defective hydroxylation is the cause of scurvy. Some members of the collagen superfamily are not involved in connective tissue structure but share the same triple helical structure. The family includes bacterial collagen-like triple-helix repeat proteins.
Pssm-ID: 460189 [Multi-domain] Cd Length: 57 Bit Score: 42.10 E-value: 5.66e-05
10 20 30 40
....*....|....*....|....*....|....*....|....*....
gi 302699201 717 GPQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQG 765
Cdd:pfam01391 7 GPPGPPGPPGPPGPPGPPGPPGPPGEPGPPGPPGPPGPPGPPGAPGAPG 55
|
|
| PTZ00421 |
PTZ00421 |
coronin; Provisional |
288-398 |
6.47e-05 |
|
coronin; Provisional
Pssm-ID: 173611 [Multi-domain] Cd Length: 493 Bit Score: 47.20 E-value: 6.47e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 288 KNTVMEVKLN-LNGNWLLTASRDHLCKLFDI------RNLKEELQVFRGHKKEATAVAWHPVHEGLFASGGSDGSLLFWH 360
Cdd:PTZ00421 75 EGPIIDVAFNpFDPQKLFTASEDGTIMGWGIpeegltQNISDPIVHLQGHTKKVGIVSFHPSAMNVLASAGADMVVNVWD 154
|
90 100 110 120
....*....|....*....|....*....|....*....|
gi 302699201 361 V--GVEKEVggmEMAHEGMIWSLAWHPLGHILCSGSNDHT 398
Cdd:PTZ00421 155 VerGKAVEV---IKCHSDQITSLEWNLDGSLLCTTSKDKK 191
|
|
| PABP-1234 |
TIGR01628 |
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ... |
556-675 |
1.56e-04 |
|
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens: one expressed in testis, one expressed in platelets, one broadly expressed, and one of unknown tissue range.
Pssm-ID: 130689 [Multi-domain] Cd Length: 562 Bit Score: 45.95 E-value: 1.56e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 556 LEQLKIERLA--QKQADQIQPP---PSSGTPLLG---PQPFSGQGPMSQIPQ-----GFQQPHPSQQMPlvaqMGPPGPQ 622
Cdd:TIGR01628 360 LAQRKEQRRAhlQDQFMQLQPRmrqLPMGSPMGGamgQPPYYGQGPQQQFNGqplgwPRMSMMPTPMGP----GGPLRPN 435
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|....
gi 302699201 623 GqFRAPGPQGQMGPQGPPL-HQGGGGPQGFMGPQGPQGPPQGLPRPQDMHGPQG 675
Cdd:TIGR01628 436 G-LAPMNAVRAPSRNAQNAaQKPPMQPVMYPPNYQSLPLSQDLPQPQSTASQGG 488
|
|
| WD40 |
pfam00400 |
WD domain, G-beta repeat; |
321-359 |
1.60e-04 |
|
WD domain, G-beta repeat;
Pssm-ID: 459801 [Multi-domain] Cd Length: 39 Bit Score: 40.41 E-value: 1.60e-04
10 20 30
....*....|....*....|....*....|....*....
gi 302699201 321 KEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLLFW 359
Cdd:pfam00400 1 GKLLKTLEGHTGSVTSLAFSP-DGKLLASGSDDGTVKVW 38
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
721-1021 |
1.61e-04 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 46.47 E-value: 1.61e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 721 PPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPH---PHGIQGGPTSQGIQGPLMGLNPRGMQG----------PPGPREN 787
Cdd:PHA03247 2628 PPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRvsrPRRARRLGRAAQASSPPQRPRRRAARPtvgsltsladPPPPPPT 2707
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 788 QGPAPQGIMIGHP----PQEMRGPHPPSGL------------LGHGPQEMRGPQEMRGMQGP-PPQGSMLGPPQELRGPS 850
Cdd:PHA03247 2708 PEPAPHALVSATPlppgPAAARQASPALPAapappavpagpaTPGGPARPARPPTTAGPPAPaPPAAPAAGPPRRLTRPA 2787
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 851 GSQGQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQQQKAPLLGDGPRAPFNQEGQSTGP-------PP 923
Cdd:PHA03247 2788 VASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPggdvrrrPP 2867
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 924 LIPGLGQQGAQGRI-------PPLNPGQGPGPNKGDSRGPPNHHLGPMSERRHEQSGGPEHGPDRGPFRGGQDCRGPPDR 996
Cdd:PHA03247 2868 SRSPAAKPAAPARPpvrrlarPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTT 2947
|
330 340
....*....|....*....|....*
gi 302699201 997 RgshpdfPDDFSRPDDFHPDKRFGH 1021
Cdd:PHA03247 2948 D------PAGAGEPSGAVPQPWLGA 2966
|
|
| WD40 |
smart00320 |
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ... |
150-188 |
2.25e-04 |
|
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.
Pssm-ID: 197651 [Multi-domain] Cd Length: 40 Bit Score: 39.99 E-value: 2.25e-04
10 20 30
....*....|....*....|....*....|....*....
gi 302699201 150 TFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYW 188
Cdd:smart00320 1 SGELLKTLKGHTGPVTSVAFSPDGKYLASGSDDGTIKLW 39
|
|
| PTZ00420 |
PTZ00420 |
coronin; Provisional |
284-397 |
1.40e-03 |
|
coronin; Provisional
Pssm-ID: 240412 [Multi-domain] Cd Length: 568 Bit Score: 43.01 E-value: 1.40e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 284 LHAHKNTVMEVKLN-LNGNWLLTASRDHLCKLFDIRN-------LKEELQVFRGHKKEATAVAWHPVHEGLFASGGSDGS 355
Cdd:PTZ00420 70 LKGHTSSILDLQFNpCFSEILASGSEDLTIRVWEIPHndesvkeIKDPQCILKGHKKKISIIDWNPMNYYIMCSSGFDSF 149
|
90 100 110 120
....*....|....*....|....*....|....*....|....*
gi 302699201 356 LLFWHVGVEKEVGGMEMAHEgmIWSLAWHPLGHIL---CSGSNDH 397
Cdd:PTZ00420 150 VNIWDIENEKRAFQINMPKK--LSSLKWNIKGNLLsgtCVGKHMH 192
|
|
| PAT1 |
pfam09770 |
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... |
718-904 |
1.55e-03 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 43.10 E-value: 1.55e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 718 PQGPPAPQGHMGPQGPPGTQGM-------------QGPPGPRGMQGPPHPHGIQGGPTSQGIQGPLMGlnprgmQGPPGP 784
Cdd:pfam09770 173 PAPAPQPAAQPASLPAPSRKMMsleeveaamraqaKKPAQQPAPAPAQPPAAPPAQQAQQQQQFPPQI------QQQQQP 246
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 785 RENQGPAPQGIMIGHPPQEMRGPHPPSGllgHGPQEMRGPQEMRGMQGPPPQgsMLGPPQELRGP-----SGSQGQQGPP 859
Cdd:pfam09770 247 QQQPQQPQQHPGQGHPVTILQRPQSPQP---DPAQPSIQPQAQQFHQQPPPV--PVQPTQILQNPnrlsaARVGYPQNPQ 321
|
170 180 190 200
....*....|....*....|....*....|....*....|....*
gi 302699201 860 QGSLGPPPQGGMQGPPGPQGQQNPArgPHPSQGPIPFQQQKAPLL 904
Cdd:pfam09770 322 PGVQPAPAHQAHRQQGSFGRQAPII--THPQQLAQLSEEEKAAYL 364
|
|
| PTZ00420 |
PTZ00420 |
coronin; Provisional |
198-274 |
1.75e-03 |
|
coronin; Provisional
Pssm-ID: 240412 [Multi-domain] Cd Length: 568 Bit Score: 42.63 E-value: 1.75e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 198 FQAHKEAIREASFSPTDNK-FATCSDDGTVRIWDfLRCHEER---------ILRGHGADVKCVDWHPTKGLVVSGSK-DS 266
Cdd:PTZ00420 70 LKGHTSSILDLQFNPCFSEiLASGSEDLTIRVWE-IPHNDESvkeikdpqcILKGHKKKISIIDWNPMNYYIMCSSGfDS 148
|
....*...
gi 302699201 267 QqpIKFWD 274
Cdd:PTZ00420 149 F--VNIWD 154
|
|
| Glutenin_hmw |
pfam03157 |
High molecular weight glutenin subunit; Members of this family include high molecular weight ... |
565-980 |
1.90e-03 |
|
High molecular weight glutenin subunit; Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.
Pssm-ID: 367362 [Multi-domain] Cd Length: 786 Bit Score: 42.63 E-value: 1.90e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 565 AQKQADQIQPPPSSGTPLLGPQPF-------SGQGPMSQIPQGFQQPHPSQQMPLVAQMGPPGPQGQFRAPGPQGQMGPQ 637
Cdd:pfam03157 301 SQQQAGQLQQEQQLGQEQQDQQPGqgrqgqqPGQGQQGQQPAQGQQPGQGQPGYYPTSPQQPGQGQPGYYPTSQQQPQQG 380
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 638 GPPLHQGGGGPQGFMGPQGPQGPPQGLPRPQDMHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPGPQGHIG 717
Cdd:pfam03157 381 QQPEQGQQGQQQGQGQQGQQPGQGQQPGQGQPGYYPTSPQQSGQGQPGYYPTSPQQSGQGQQPGQGQQPGQEQPGQGQQP 460
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 718 PQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQGIQGPLMGLNPRGMQGPPG----PRENQGPAPQ 793
Cdd:pfam03157 461 GQGQQGQQPGQPEQGQQPGQGQPGYYPTSPQQSGQGQQLGQWQQQGQGQPGYYPTSPLQPGQGQPGyyptSPQQPGQGQQ 540
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 794 GIMIGHPPQEMRGPHPPSGLLGHGP-QEMRGPQEMRGMQGPPP-QGSMLGPPQELRGPSGSQ--------GQQGPPQGSL 863
Cdd:pfam03157 541 LGQLQQPTQGQQGQQSGQGQQGQQPgQGQQGQQPGQGQQGQQPgQGQQPGQGQPGYYPTSPQqsgqgqqpGQWQQPGQGQ 620
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 864 GPPPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQQQKApllGDGPRAPFNQEGQSTGPPPLiPGLGQ---------QGAQ 934
Cdd:pfam03157 621 PGYYPTSSLQLGQGQQGYYPTSPQQPGQGQQPGQWQQS---GQGQQGYYPTSPQQSGQAQQ-PGQGQqpgqwlqpgQGQQ 696
|
410 420 430 440
....*....|....*....|....*....|....*....|....*...
gi 302699201 935 GRIP--PLNPGQGPGPNKGDSRGPPNHHLGPMSERRHEQSGGPEHGPD 980
Cdd:pfam03157 697 GYYPtsPQQPGQGQQLGQGQQSGQGQQGYYPTSPGQGQQSGQGQQGYD 744
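The three repeated motifs named in the family description (PGQGQQ, GYYPTS[P/L]QQ, GQQ) can be counted in any sequence with regular expressions. The sketch below is illustrative only: the dictionary keys and the function name are hypothetical, matching is case-sensitive (so lowercase, unaligned residues in the display above would need upper-casing first), and overlapping matches are counted, which means a GQQ embedded inside PGQGQQ also counts toward the tripeptide total.

```python
import re

# Motifs as listed in the pfam03157 description; [PL] encodes the "[P/L]" alternative.
MOTIFS = {
    "hexapeptide": r"PGQGQQ",
    "nonapeptide": r"GYYPTS[PL]QQ",
    "tripeptide": r"GQQ",
}


def count_motifs(seq: str) -> dict[str, int]:
    """Count occurrences of each motif, including overlapping ones.

    A zero-width lookahead lets the regex engine report a match at
    every start position instead of skipping past consumed residues.
    """
    return {
        name: len(re.findall(f"(?=({pattern}))", seq))
        for name, pattern in MOTIFS.items()
    }


# Small synthetic example (not taken from the alignment): one copy of each motif in a row.
example = "PGQGQQ" + "GYYPTSPQQ" + "GQQ"
print(count_motifs(example))
```

On the synthetic example the tripeptide count is 2, because the hexapeptide itself ends in GQQ; subtract the hexapeptide count if only free-standing tripeptides are of interest.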
|
|
| PHA03307 |
PHA03307 |
transcriptional regulator ICP4; Provisional |
618-1001 |
2.05e-03 |
|
transcriptional regulator ICP4; Provisional
Pssm-ID: 223039 [Multi-domain] Cd Length: 1352 Bit Score: 42.85 E-value: 2.05e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 618 PPGPQGQFRAPGPQGQMGPqgpPLHQGGGGPQGFMGPQGPQGPPQGLPRPQDMHGPQGMQRHPGPHGPLGPQGppgpqgs 697
Cdd:PHA03307 27 TPGDAADDLLSGSQGQLVS---DSAELAAVTVVAGAAACDRFEPPTGPPPGPGTEAPANESRSTPTWSLSTLA------- 96
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 698 sgpqghmgpqgppgpqghigpqgpPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHgiqgGPTSQGIQGPLMGLNPRG 777
Cdd:PHA03307 97 ------------------------PASPAREGSPTPPGPSSPDPPPPTPPPASPPPSP----APDLSEMLRPVGSPGPPP 148
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 778 MQGPPGPRENQGPAPQG--------IMIGHPPQEMRGPHPPSGLLGHGPQEMRGPqemrgmQGPPPQGSMLGPPQELRGP 849
Cdd:PHA03307 149 AASPPAAGASPAAVASDaassrqaaLPLSSPEETARAPSSPPAEPPPSTPPAAAS------PRPPRRSSPISASASSPAP 222
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 850 SGSQGQQGPPQGSLGPPPQGGMQGPPGPQGQQNParGPHPSQGPIPFQqqkapllgdgPRAPFNQEGQSTGPPPLIPGLG 929
Cdd:PHA03307 223 APGRSAADDAGASSSDSSSSESSGCGWGPENECP--LPRPAPITLPTR----------IWEASGWNGPSSRPGPASSSSS 290
|
330 340 350 360 370 380 390
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 302699201 930 QQGAQGRIPPLNPGQG---PGPNKGDSRGPPNHHLGPMSERRHEQSGGPEHGPDRGPFRGGQDCRGPPDRRGSHP 1001
Cdd:PHA03307 291 PRERSPSPSPSSPGSGpapSSPRASSSSSSSRESSSSSTSSSSESSRGAAVSPGPSPSRSPSPSRPPPPADPSSP 365
|
|
| WD40 |
pfam00400 |
WD domain, G-beta repeat; |
151-188 |
2.51e-03 |
|
WD domain, G-beta repeat;
Pssm-ID: 459801 [Multi-domain] Cd Length: 39 Bit Score: 36.94 E-value: 2.51e-03
10 20 30
....*....|....*....|....*....|....*...
gi 302699201 151 FNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYW 188
Cdd:pfam00400 1 GKLLKTLEGHTGSVTSLAFSPDGKLLASGSDDGTVKVW 38
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
565-860 |
3.24e-03 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 42.06 E-value: 3.24e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 565 AQKQADQIQPP---------PSSGTPLLGPQPFSGQGPMSQIPQGFQQPHPSQQMPLVAQMGPPGPQGQF---------R 626
Cdd:pfam03154 162 AQQQILQTQPPvlqaqsgaaSPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTLIqqtptlhpqR 241
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 627 APGPQGQMGPQGPPlhqGGGGPQGFMGPQGPQGPPQGLPRPQDMH-GPQGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMG 705
Cdd:pfam03154 242 LPSPHPPLQPMTQP---PPPSQVSPQPLPQPSLHGQMPPMPHSLQtGPSHMQHPVPPQPFPLTPQSSQSQVPPGPSPAAP 318
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 706 PQGPPGpqghigpQGPPAPQGHMGPQGPPGTQGMqgPPGPRGMqgpPHphgIQGGPTSQGIQGPlmglNPRGMQGPPgpr 785
Cdd:pfam03154 319 GQSQQR-------IHTPPSQSQLQSQQPPREQPL--PPAPLSM---PH---IKPPPTTPIPQLP----NPQSHKHPP--- 376
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 786 ENQGPAPQGIMIGHPPQEMRGP-------HPPSgllGHGPQEMRGPQEMRgMQGPPPQGSMLG-----PPQELRGPSGSQ 853
Cdd:pfam03154 377 HLSGPSPFQMNSNLPPPPALKPlsslsthHPPS---AHPPPLQLMPQSQQ-LPPPPAQPPVLTqsqslPPPAASHPPTSG 452
|
....*..
gi 302699201 854 GQQGPPQ 860
Cdd:pfam03154 453 LHQVPSQ 459
|
|
| PAT1 |
pfam09770 |
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... |
807-940 |
3.91e-03 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 41.56 E-value: 3.91e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 807 PHPPSGLLGHGPQEMRGPQEMRGMQGPPPQGSMLGPPQELRGPSGSQGQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARG 886
Cdd:pfam09770 211 AQQPAPAPAQPPAAPPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPVTILQRPQSPQPDPAQPSIQPQAQQFHQ 290
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|....*
gi 302699201 887 PHPSQGPIPFQQQKAPLLGDGPRAPFNQEGQ-STGPPPLIPGLGQQGAQGRIPPL 940
Cdd:pfam09770 291 QPPPVPVQPTQILQNPNRLSAARVGYPQNPQpGVQPAPAHQAHRQQGSFGRQAPI 345
|
|
| PHA03307 |
PHA03307 |
transcriptional regulator ICP4; Provisional |
718-1128 |
4.65e-03 |
|
transcriptional regulator ICP4; Provisional
Pssm-ID: 223039 [Multi-domain] Cd Length: 1352 Bit Score: 41.70 E-value: 4.65e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 718 PQGPPAPQGHMGPQG-----------PPGTQGMQGPPGPRGMQGPPhPHGIQGGPTSQGIQGPLMGLNPRGMQGPPGPRE 786
Cdd:PHA03307 27 TPGDAADDLLSGSQGqlvsdsaelaaVTVVAGAAACDRFEPPTGPP-PGPGTEAPANESRSTPTWSLSTLAPASPAREGS 105
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 787 NQGPAPQGimighPPQEMRGPHPPSGLLGHGPqEMRGPQEMRGMQGPPPQGSMLGPPqeLRGPSGSQGQQGPPQGSLGPP 866
Cdd:PHA03307 106 PTPPGPSS-----PDPPPPTPPPASPPPSPAP-DLSEMLRPVGSPGPPPAASPPAAG--ASPAAVASDAASSRQAALPLS 177
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 867 PQGGmqgppgpqgqqnPARGPHPSQGPIPFQQQkAPLLGDGPRAPFNQEGQSTGPPPLIPGLGQ-----QGAQGRIPPLN 941
Cdd:PHA03307 178 SPEE------------TARAPSSPPAEPPPSTP-PAAASPRPPRRSSPISASASSPAPAPGRSAaddagASSSDSSSSES 244
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 942 PGQGPGPNKGDSRGPPNHHLGPMSERRHEQSGGPEHGPDRGPFRGGQDCRGPPDRRgSHPDFPDDFSRPddfhPDKRFGH 1021
Cdd:PHA03307 245 SGCGWGPENECPLPRPAPITLPTRIWEASGWNGPSSRPGPASSSSSPRERSPSPSP-SSPGSGPAPSSP----RASSSSS 319
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 1022 RLREfEGRGGPLPQEEKwRRGGPGPPFPPDHREFNEGDGRGAARGPPGAWEGRRPGDERFPR--DPDDPRFRGRR---EE 1096
Cdd:PHA03307 320 SSRE-SSSSSTSSSSES-SRGAAVSPGPSPSRSPSPSRPPPPADPSSPRKRPRPSRAPSSPAasAGRPTRRRARAavaGR 397
|
410 420 430
....*....|....*....|....*....|...
gi 302699201 1097 SFRRGAPPRHE-GRAPPRGRDNFPGPDDFGPEE 1128
Cdd:PHA03307 398 ARRRDATGRFPaGRPRPSPLDAGAASGAFYARY 430
|
|
| dnaA |
PRK14086 |
chromosomal replication initiator protein DnaA; |
718-913 |
5.52e-03 |
|
chromosomal replication initiator protein DnaA;
Pssm-ID: 237605 [Multi-domain] Cd Length: 617 Bit Score: 40.96 E-value: 5.52e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 718 PQGPPAPQGHMGPQGP-PGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQGiqgplmGLNPRGMQGPPG--PRENQGPAPQG 794
Cdd:PRK14086 96 APPPPHARRTSEPELPrPGRRPYEGYGGPRADDRPPGLPRQDQLPTARP------AYPAYQQRPEPGawPRAADDYGWQQ 169
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 795 IMIGHPPqemRGPHPPSGLLGHGPQEMRGPQEMRGMQGPPPQGSMLGPPQELRGPSGSQGQQ-GPPQGSLGPPPQGGMQG 873
Cdd:PRK14086 170 QRLGFPP---RAPYASPASYAPEQERDREPYDAGRPEYDQRRRDYDHPRPDWDRPRRDRTDRpEPPPGAGHVHRGGPGPP 246
|
170 180 190 200
....*....|....*....|....*....|....*....|.
gi 302699201 874 PPGPQGQQNPARGphpsqGPIPFQQQKAPLLGDG-PRAPFN 913
Cdd:PRK14086 247 ERDDAPVVPIRPS-----APGPLAAQPAPAPGPGePTARLN 282
|
|
| Glutenin_hmw |
pfam03157 |
High molecular weight glutenin subunit; Members of this family include high molecular weight ... |
568-978 |
5.55e-03 |
|
High molecular weight glutenin subunit; Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.
Pssm-ID: 367362 [Multi-domain] Cd Length: 786 Bit Score: 41.09 E-value: 5.55e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 568 QADQIQPPPSSGTPLLGPQpfSGQGPMSQIPQGFQQPHPSQQ--MPLVAQMGPPGPQGQFRAPGPQGQMGPQGPPLHQGG 645
Cdd:pfam03157 256 QGQQGYYPISPQQPRQWQQ--SGQGQQGYYPTSLQQPGQGQSgyYPTSQQQAGQLQQEQQLGQEQQDQQPGQGRQGQQPG 333
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 646 GGPQGFMGPQGPqgppqglprpQDMHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPGPQGHIGPQGPPAPQ 725
Cdd:pfam03157 334 QGQQGQQPAQGQ----------QPGQGQPGYYPTSPQQPGQGQPGYYPTSQQQPQQGQQPEQGQQGQQQGQGQQGQQPGQ 403
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 726 GHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQGIQGPLMGLNPRGMQGPPGPRENQGPAPQGIMIGHPPQEMR 805
Cdd:pfam03157 404 GQQPGQGQPGYYPTSPQQSGQGQPGYYPTSPQQSGQGQQPGQGQQPGQEQPGQGQQPGQGQQGQQPGQPEQGQQPGQGQP 483
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 806 GPHPPS-GLLGHGPQEMRGPQEMRGMQG-------PPPQGSMLGPPQELRGPSGSQ--------GQQGPPQGSLGPPPQG 869
Cdd:pfam03157 484 GYYPTSpQQSGQGQQLGQWQQQGQGQPGyyptsplQPGQGQPGYYPTSPQQPGQGQqlgqlqqpTQGQQGQQSGQGQQGQ 563
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 870 GMQGPPGPQGQQNPARGPHPSQGPIPFQQQ--------KAPLLGDGPRAPFNQEGQSTGPPPLIPGLGQQGAQGRIP--P 939
Cdd:pfam03157 564 QPGQGQQGQQPGQGQQGQQPGQGQQPGQGQpgyyptspQQSGQGQQPGQWQQPGQGQPGYYPTSSLQLGQGQQGYYPtsP 643
|
410 420 430
....*....|....*....|....*....|....*....
gi 302699201 940 LNPGQGPGPNKGDSRGPPNHHLGPMSERRHEQSGGPEHG 978
Cdd:pfam03157 644 QQPGQGQQPGQWQQSGQGQQGYYPTSPQQSGQAQQPGQG 682
|
|
| PRK10263 |
PRK10263 |
DNA translocase FtsK; Provisional |
541-642 |
9.58e-03 |
|
DNA translocase FtsK; Provisional
Pssm-ID: 236669 [Multi-domain] Cd Length: 1355 Bit Score: 40.45 E-value: 9.58e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 541 IEQEMATLQYTNPQLLEQLKIERLA-QKQADQIQPPPSSGTPLLGPQPFSGQGPMSQIPQGFQQPHPSQQMPlvaqMGPP 619
Cdd:PRK10263 749 VEPVQQPQQPVAPQQQYQQPQQPVApQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQP----QQPV 824
|
90 100
....*....|....*....|...
gi 302699201 620 GPQGQFRAPGPQGQMGPQGPPLH 642
Cdd:PRK10263 825 APQPQYQQPQQPVAPQPQDTLLH 847
|
|
|