Conserved domains on  [gi|302699201|ref|NP_001100868|]
pre-mRNA 3' end processing protein WDR33 [Rattus norvegicus]

Protein Classification

WD40 and Med15 domain-containing protein (domain architecture ID 11526309)

List of domain hits

Name Accession Description Interval E-value
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
121-402 1.49e-66

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.



Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 226.83  E-value: 1.49e-66
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQSNMNN-VKMFQ 199
Cdd:cd00200    11 GVTCVAFSPDGKLLATGSGDGTIKVWDLETGELLRTLKGHTGPVRDVAASADGTYLASGSSDKTIRLWDLETGEcVRTLT 90
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  200 AHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQ 279
Cdd:cd00200    91 GHTSYVSSVAFSPDGRILSSSSRDKTIKVWDVETGKCLTTLRGHTDWVNSVAFSPDGTFVASSSQDGT--IKLWDLRTGK 168
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  280 SLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLLFW 359
Cdd:cd00200   169 CVATLTGHTGEVNSVAFSPDGEKLLSSSSDGTIKLWDLST-GKCLGTLRGHENGVNSVAFSP-DGYLLASGSEDGTIRVW 246
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|...
gi 302699201  360 HVGVEKEVGGMEmAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:cd00200   247 DLRTGECVQTLS-GHTNSVTSLAWSPDGKRLASGSADGTIRIW 288
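The paired rows above can be compared column by column. The following is a minimal sketch, not part of the CD-Search output, that counts identical residues between the query row and the Cdd row of one alignment block. It assumes that `-` marks a gap and that lowercase letters mark residues outside the aligned core, so both are skipped; `parse_row` and `identity` are hypothetical helper names.

```python
def parse_row(line):
    """Split one alignment row into (start, residues, end)."""
    tokens = line.split()
    # The last three tokens are always: start coordinate, residues, end coordinate.
    return int(tokens[-3]), tokens[-2], int(tokens[-1])


def identity(query, subject):
    """Return (identical, compared) over columns where both rows are uppercase."""
    if len(query) != len(subject):
        raise ValueError("rows of one block must have equal length")
    identical = compared = 0
    for q, s in zip(query, subject):
        # Skip gap columns and lowercase (assumed unaligned) residues.
        if not (q.isupper() and s.isupper()):
            continue
        compared += 1
        identical += q == s
    return identical, compared


# Last block of the cd00200 alignment shown above.
q_start, q_seq, q_end = parse_row(
    "gi 302699201  360 HVGVEKEVGGMEmAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402"
)
s_start, s_seq, s_end = parse_row(
    "Cdd:cd00200   247 DLRTGECVQTLS-GHTNSVTSLAWSPDGKRLASGSADGTIRIW 288"
)
same, total = identity(q_seq, s_seq)
print(f"query {q_start}-{q_end}: {same}/{total} identical columns")
```

Applying the same two helpers to every block of a hit and summing the counts gives a rough identity figure for the whole alignment; it is not the statistic CD-Search uses to score hits.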
Med15 super family cl26621
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
576-926 1.09e-13

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


The actual alignment was detected with superfamily member pfam09606:

Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 75.82  E-value: 1.09e-13
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   576 PSSGTPLLGPQPFSGQGPMSQIPQGFQQPHPSQQMPLV---AQMGPPGPQGQFRAP---GPQGQMGPQGPPLHQGGGGPQ 649
Cdd:pfam09606  140 PSQMSRVGRMQPGGQAGGMMQPSSGQPGSGTPNQMGPNggpGQGQAGGMNGGQQGPmggQMPPQMGVPGMPGPADAGAQM 219
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   650 GFMGPQGPQgppqglPRPQDMHGP------QGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPGPQGHIGPQGPPA 723
Cdd:pfam09606  220 GQQAQANGG------MNPQQMGGApnqvamQQQQPQQQGQQSQLGMGINQMQQMPQGVGGGAGQGGPGQPMGPPGQQPGA 293
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   724 PQGHMGPQGPPGTQGMQGPPgPRGMQGPPHPHGIQGGPTSQGIQGPLMGLNPRGMQGPPGPRENQGPAPQGIMIGHPPQE 803
Cdd:pfam09606  294 MPNVMSIGDQNNYQQQQTRQ-QQQQQGGNHPAAHQQQMNQSVGQGGQVVALGGLNHLETWNPGNFGGLGANPMQRGQPGM 372
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   804 MRGPHPpsgLLGHGPQEMRGPQEMRgmqgPPPQGSMLGPpqelrGPSGSQGQQGPPQGSLGPppqggmqgppgpqgqqnP 883
Cdd:pfam09606  373 MSSPSP---VPGQQVRQVTPNQFMR----QSPQPSVPSP-----QGPGSQPPQSHPGGMIPS-----------------P 423
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|...
gi 302699201   884 ARGPHPSQGPIPFQQQKAPLLGDGPRAPFNQEGQSTGPPPLIP 926
Cdd:pfam09606  424 ALIPSPSPQMSQQPAQQRTIGQDSPGGSLNTPGQSAVNSPLNP 466
 
WD40 COG2319
WD40 repeat [General function prediction only];
121-402 1.50e-57

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 204.76  E-value: 1.50e-57
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQSNMNN-VKMFQ 199
Cdd:COG2319   122 AVRSVAFSPDGKTLASGSADGTVRLWDLATGKLLRTLTGHSGAVTSVAFSPDGKLLASGSDDGTVRLWDLATGKlLRTLT 201
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  200 AHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQ 279
Cdd:COG2319   202 GHTGAVRSVAFSPDGKLLASGSADGTVRLWDLATGKLLRTLTGHSGSVRSVAFSPDGRLLASGSADGT--VRLWDLATGE 279
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  280 SLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLLFW 359
Cdd:COG2319   280 LLRTLTGHSGGVNSVAFSPDGKLLASGSDDGTVRLWDLAT-GKLLRTLTGHTGAVRSVAFSP-DGKTLASGSDDGTVRLW 357
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|...
gi 302699201  360 HVGVEKEVGGMEmAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:COG2319   358 DLATGELLRTLT-GHTGAVTSVAFSPDGRTLASGSADGTVRLW 399
gly_rich_SclB NF038329
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like ...
690-858 1.88e-12

LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like protein 2) is an LPXTG-anchored surface adhesin with a variable-length region of triple helix-forming collagen-like Gly-Xaa-Xaa repeats.


Pssm-ID: 468478 [Multi-domain]  Cd Length: 440  Bit Score: 71.09  E-value: 1.88e-12
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  690 GPPGPQGSSGPQGHMGPQGPPGPQGHIGPQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQGIQGP 769
Cdd:NF038329  132 GEQGPRGDRGETGPAGPAGPPGPQGERGEKGPAGPQGEAGPQGPAGKDGEAGAKGPAGEKGPQGPRGETGPAGEQGPAGP 211
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  770 lmgLNPRGMQGPPGPRENQGPAPQGIMIGHPPQEMRGPHPPSGLLGH-GPQEMRGPQEMRGMQGPPPQGSMLGPPqelrG 848
Cdd:NF038329  212 ---AGPDGEAGPAGEDGPAGPAGDGQQGPDGDPGPTGEDGPQGPDGPaGKDGPRGDRGEAGPDGPDGKDGERGPV----G 284
                         170
                  ....*....|
gi 302699201  849 PSGSQGQQGP 858
Cdd:NF038329  285 PAGKDGQNGK 294
gly_rich_SclB NF038329
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like ...
608-857 6.84e-12

LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like protein 2) is an LPXTG-anchored surface adhesin with a variable-length region of triple helix-forming collagen-like Gly-Xaa-Xaa repeats.


Pssm-ID: 468478 [Multi-domain]  Cd Length: 440  Bit Score: 69.16  E-value: 6.84e-12
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  608 QQMPLVAQMGPPGPQGQfraPGPQGQMGPQGPPlhqgGGGPQGFMGPQGPQGPPQGLPRPQdmhGPQGmqrhpgphgplg 687
Cdd:NF038329  111 QQLKGDGEKGEPGPAGP---AGPAGEQGPRGDR----GETGPAGPAGPPGPQGERGEKGPA---GPQG------------ 168
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  688 pqgppgpqgssgpqghmgpqgPPGPQGHIGPQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQGIQ 767
Cdd:NF038329  169 ---------------------EAGPQGPAGKDGEAGAKGPAGEKGPQGPRGETGPAGEQGPAGPAGPDGEAGPAGEDGPA 227
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  768 GPlMGLNPRGMQGPPGPRENQGPApqgimighPPQEMRGPHPPSGLLGH-GPQEMRGPQEMRGMQGPPPQGSMLGpPQEL 846
Cdd:NF038329  228 GP-AGDGQQGPDGDPGPTGEDGPQ--------GPDGPAGKDGPRGDRGEaGPDGPDGKDGERGPVGPAGKDGQNG-KDGL 297
                         250
                  ....*....|.
gi 302699201  847 RGPSGSQGQQG 857
Cdd:NF038329  298 PGKDGKDGQNG 308
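Each hit's summary line pairs a query interval with an E-value, and the two NF038329 hits above cover overlapping stretches of the query. A small sketch, assuming summary lines of the form `start-end e-value` as they appear on this page (the helper names are illustrative only), parses such lines and measures how many residues two intervals share:

```python
import re

# One summary line: inclusive query interval, whitespace, E-value in e-notation.
SUMMARY = re.compile(r"^(\d+)-(\d+)\s+(\d+(?:\.\d+)?e[-+]\d+)$")


def parse_summary(line):
    """Parse 'start-end e-value' into (start, end, evalue)."""
    m = SUMMARY.match(line.strip())
    if m is None:
        raise ValueError(f"not a hit summary line: {line!r}")
    return int(m.group(1)), int(m.group(2)), float(m.group(3))


def span(start, end):
    """Number of residues in an inclusive interval."""
    return end - start + 1


def overlap(a, b):
    """Residues shared by two inclusive (start, end) intervals."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


first = parse_summary("690-858 1.88e-12")
second = parse_summary("608-857 6.84e-12")
print(span(*first[:2]), span(*second[:2]), overlap(first[:2], second[:2]))
```

Intervals are treated as inclusive on both ends, which matches the coordinates printed at the start and end of each alignment row.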
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
191-230 7.13e-09

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 52.70  E-value: 7.13e-09
                            10        20        30        40
                    ....*....|....*....|....*....|....*....|
gi 302699201    191 NMNNVKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD 230
Cdd:smart00320    1 SGELLKTLKGHTGPVTSVAFSPDGKYLASGSDDGTIKLWD 40
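The smart00320 consensus above is a single 40-residue repeat, which makes it a convenient check on the GH and WD dipeptides mentioned in the cd00200 description. The sketch below searches a sequence for a GH dipeptide followed, after a bounded gap, by a WD dipeptide. The gap bounds (20 to 30 residues) are illustrative assumptions chosen for this example, not values taken from CDD, and `find_gh_wd` is a hypothetical helper.

```python
import re


def find_gh_wd(seq, min_gap=20, max_gap=30):
    """Return 0-based (start, end) spans of GH...WD stretches in seq.

    min_gap and max_gap bound the residues between GH and WD; both
    defaults are illustrative assumptions, not CDD parameters.
    """
    pattern = re.compile(rf"GH.{{{min_gap},{max_gap}}}?WD")
    return [m.span() for m in pattern.finditer(seq.upper())]


consensus = "SGELLKTLKGHTGPVTSVAFSPDGKYLASGSDDGTIKLWD"  # Cdd:smart00320, 1-40
query = "NMNNVKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD"      # gi 302699201, 191-230

print(find_gh_wd(consensus))
print(find_gh_wd(query))
```

On these two strings the consensus contains one such stretch, while the rat query carries AH where the consensus has GH and so is not matched by this literal pattern. That is consistent with the description's wording that the dipeptides are typical rather than required, and it is why profile-based scoring, not a fixed pattern, is used to detect the repeat.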
PTZ00421 PTZ00421
coronin; Provisional
205-319 2.81e-08

coronin; Provisional


Pssm-ID: 173611 [Multi-domain]  Cd Length: 493  Bit Score: 57.98  E-value: 2.81e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  205 IREASFSPTDN-KFATCSDDGTVRIWDFlrcHEERI----------LRGHGADVKCVDWHPT-KGLVVSGSKDSQqpIKF 272
Cdd:PTZ00421   78 IIDVAFNPFDPqKLFTASEDGTIMGWGI---PEEGLtqnisdpivhLQGHTKKVGIVSFHPSaMNVLASAGADMV--VNV 152
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*..
gi 302699201  273 WDPKTGQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRN 319
Cdd:PTZ00421  153 WDVERGKAVEVIKCHSDQITSLEWNLDGSLLCTTSKDKKLNIIDPRD 199
WD40 pfam00400
WD domain, G-beta repeat;
195-230 7.51e-08

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 49.65  E-value: 7.51e-08
                           10        20        30
                   ....*....|....*....|....*....|....*.
gi 302699201   195 VKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD 230
Cdd:pfam00400    4 LKTLEGHTGSVTSLAFSPDGKLLASGSDDGTVKVWD 39
PHA03247 PHA03247
large tegument protein UL36; Provisional
561-970 5.79e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 51.09  E-value: 5.79e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  561 IERLAQKQADQIQPPPSSG---TPLLGPQPFSGQGPMSQIPQGFQQPHPSQQMPLVAQMGPPGPQGQFRAPGPQGQMGPQ 637
Cdd:PHA03247 2578 SEPAVTSRARRPDAPPQSArprAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPSPAANEPDPHPPPTVPPPERPRDDPA 2657
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  638 GPPLHQGGGGPQGFMGPQGPQGPPQglPRPQDMHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPGPQGHIG 717
Cdd:PHA03247 2658 PGRVSRPRRARRLGRAAQASSPPQR--PRRRAARPTVGSLTSLADPPPPPPTPEPAPHALVSATPLPPGPAAARQASPAL 2735
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  718 PQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQGIQGPLMGLNPrGMQGPPGPREnqgPAPQGIMI 797
Cdd:PHA03247 2736 PAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSE-SRESLPSPWD---PADPPAAV 2811
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  798 GHPPQEMRGPHPPSGLLghgpqemrgPQEMRGMQGPPPQGSmlGPPQELRGPSGSQGQQG-----PPQGSLGPPPQGGMQ 872
Cdd:PHA03247 2812 LAPAAALPPAASPAGPL---------PPPTSAQPTAPPPPP--GPPPPSLPLGGSVAPGGdvrrrPPSRSPAAKPAAPAR 2880
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  873 GPPGPQGQQNPARGPHP-SQGPIPFQQQKAPLLGDGPRAPFNQEGQ-STGPPPLIPGLGQqgaqgriPPLNPGQGPGPNK 950
Cdd:PHA03247 2881 PPVRRLARPAVSRSTESfALPPDQPERPPQPQAPPPPQPQPQPPPPpQPQPPPPPPPRPQ-------PPLAPTTDPAGAG 2953
                         410       420
                  ....*....|....*....|
gi 302699201  951 GDSRGPPNHHLGPMSERRHE 970
Cdd:PHA03247 2954 EPSGAVPQPWLGALVPGRVA 2973
SPT5 COG5164
Transcription elongation factor SPT5 [Transcription];
665-949 9.89e-06

Transcription elongation factor SPT5 [Transcription];


Pssm-ID: 444063 [Multi-domain]  Cd Length: 495  Bit Score: 49.64  E-value: 9.89e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  665 PRPQDMHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPgpqghiGPQGPPAPQGHMGPQGPPGTQGMQGPPG 744
Cdd:COG5164    12 SDPGGVTTPAGSQGSTKPAQNQGSTRPAGNTGGTRPAQNQGSTTPA------GNTGGTRPAGNQGATGPAQNQGGTTPAQ 85
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  745 PRGMQGPPHPHGIQGGPTSQGIQGPlmgLNPRGMQGPPGPRENQGPAPQGIMIGHPPqemrGPHPPSGLLGHGPQEMRGP 824
Cdd:COG5164    86 NQGGTRPAGNTGGTTPAGDGGATGP---PDDGGATGPPDDGGSTTPPSGGSTTPPGD----GGSTPPGPGSTGPGGSTTP 158
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  825 QEMRGMQGPPPQGSMLGPPqelrgpsGSQGQQGPPQGSlgpppqggmqgppgpqgqqnpARGPHPSQGPIPFQQQKAPLL 904
Cdd:COG5164   159 PGDGGSTTPPGPGGSTTPP-------DDGGSTTPPNKG---------------------ETGTDIPTGGTPRQGPDGPVK 210
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|....*
gi 302699201  905 GDGPRAPFNQEGQSTGPPPLIPGLGQQGAQGRIPPLNPGQGPGPN 949
Cdd:COG5164   211 KDDKNGKGNPPDDRGGKTGPKDQRPKTNPIERRGPERPEAAALPA 255
PABP-1234 TIGR01628
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ...
556-675 1.56e-04

polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens, which are, respectively, expressed in testis, expressed in platelets, broadly expressed, and of unknown tissue range.


Pssm-ID: 130689 [Multi-domain]  Cd Length: 562  Bit Score: 45.95  E-value: 1.56e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   556 LEQLKIERLA--QKQADQIQPP---PSSGTPLLG---PQPFSGQGPMSQIPQ-----GFQQPHPSQQMPlvaqMGPPGPQ 622
Cdd:TIGR01628  360 LAQRKEQRRAhlQDQFMQLQPRmrqLPMGSPMGGamgQPPYYGQGPQQQFNGqplgwPRMSMMPTPMGP----GGPLRPN 435
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....
gi 302699201   623 GqFRAPGPQGQMGPQGPPL-HQGGGGPQGFMGPQGPQGPPQGLPRPQDMHGPQG 675
Cdd:TIGR01628  436 G-LAPMNAVRAPSRNAQNAaQKPPMQPVMYPPNYQSLPLSQDLPQPQSTASQGG 488
 
WD40 COG2319
WD40 repeat [General function prediction only];
121-402 2.37e-52

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 189.74  E-value: 2.37e-52
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQ-SNMNNVKMFQ 199
Cdd:COG2319    38 AVASLAASPDGARLAAGAGDLTLLLLDAAAGALLATLLGHTAAVLSVAFSPDGRLLASASADGTVRLWDlATGLLLRTLT 117
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  200 AHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQ 279
Cdd:COG2319   118 GHTGAVRSVAFSPDGKTLASGSADGTVRLWDLATGKLLRTLTGHSGAVTSVAFSPDGKLLASGSDDGT--VRLWDLATGK 195
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  280 SLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvhEG-LFASGGSDGSLLF 358
Cdd:COG2319   196 LLRTLTGHTGAVRSVAFSPDGKLLASGSADGTVRLWDLAT-GKLLRTLTGHSGSVRSVAFSP--DGrLLASGSADGTVRL 272
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|....
gi 302699201  359 WHVGvEKEVGGMEMAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:COG2319   273 WDLA-TGELLRTLTGHSGGVNSVAFSPDGKLLASGSDDGTVRLW 315
WD40 COG2319
WD40 repeat [General function prediction only];
121-361 1.93e-47

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 175.48  E-value: 1.93e-47
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWqsNMNN---VKM 197
Cdd:COG2319   164 AVTSVAFSPDGKLLASGSDDGTVRLWDLATGKLLRTLTGHTGAVRSVAFSPDGKLLASGSADGTVRLW--DLATgklLRT 241
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  198 FQAHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKT 277
Cdd:COG2319   242 LTGHSGSVRSVAFSPDGRLLASGSADGTVRLWDLATGELLRTLTGHSGGVNSVAFSPDGKLLASGSDDGT--VRLWDLAT 319
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  278 GQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLL 357
Cdd:COG2319   320 GKLLRTLTGHTGAVRSVAFSPDGKTLASGSDDGTVRLWDLAT-GELLRTLTGHTGAVTSVAFSP-DGRTLASGSADGTVR 397

                  ....
gi 302699201  358 FWHV 361
Cdd:COG2319   398 LWDL 401
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
194-402 4.09e-43

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 159.42  E-value: 4.09e-43
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  194 NVKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFW 273
Cdd:cd00200     1 LRRTLKGHTGGVTCVAFSPDGKLLATGSGDGTIKVWDLETGELLRTLKGHTGPVRDVAASADGTYLASGSSDKT--IRLW 78
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  274 DPKTGQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvhEGLFASGGS- 352
Cdd:cd00200    79 DLETGECVRTLTGHTSYVSSVAFSPDGRILSSSSRDKTIKVWDVET-GKCLTTLRGHTDWVNSVAFSP--DGTFVASSSq 155
                         170       180       190       200       210
                  ....*....|....*....|....*....|....*....|....*....|
gi 302699201  353 DGSLLFWHVGVEKEVGGMEmAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:cd00200   156 DGTIKLWDLRTGKCVATLT-GHTGEVNSVAFSPDGEKLLSSSSDGTIKLW 204
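
The WD40 description above characterizes a repeat by a GH dipeptide near its start and a WD dipeptide at its end. The Python sketch below is only an illustration of that rule of thumb and is not how CDD scores domains. The function name `find_gh_wd_spans` and the gap bounds `min_gap` and `max_gap` are assumptions made for this example. The two test strings are copied from the first repeat of the alignment above (Cdd:cd00200 positions 1-37 and query residues 194-230).

```python
import re


def find_gh_wd_spans(seq, min_gap=20, max_gap=35):
    """Return (start, end) spans, 0-based and end-exclusive, of GH...WD stretches.

    min_gap and max_gap bound the number of residues between the GH and WD
    dipeptides; both values are illustrative assumptions, not CDD parameters.
    """
    pattern = re.compile(r"GH.{%d,%d}?WD" % (min_gap, max_gap))
    return [m.span() for m in pattern.finditer(seq.upper())]


# First repeat of the cd00200 consensus row shown in the alignment above.
cdd_repeat = "LRRTLKGHTGGVTCVAFSPDGKLLATGSGDGTIKVWD"
# Matching query residues 194-230; the query carries AH where the consensus has GH.
query_repeat = "NVKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD"

print(find_gh_wd_spans(cdd_repeat))
print(find_gh_wd_spans(query_repeat))
```

The consensus repeat matches while the query repeat does not, because the query has AH in place of GH. This is why profile-based scoring, as used for the hits listed here, is preferred over a literal motif search.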
WD40 COG2319
WD40 repeat [General function prediction only];
126-402 3.15e-42

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 160.08  E-value: 3.15e-42
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  126 RWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHND-MWMLTADHGGYVKYWQSNMNNVKMFQAHKEA 204
Cdd:COG2319     1 ALSADGAALAAASADLALALLAAALGALLLLLLGLAAAVASLAASPDGaRLAAGAGDLTLLLLDAAAGALLATLLGHTAA 80
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  205 IREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQSLATL 284
Cdd:COG2319    81 VLSVAFSPDGRLLASASADGTVRLWDLATGLLLRTLTGHTGAVRSVAFSPDGKTLASGSADGT--VRLWDLATGKLLRTL 158
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  285 HAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNLKeELQVFRGHKKEATAVAWHPvhEG-LFASGGSDGSLLFWHVGV 363
Cdd:COG2319   159 TGHSGAVTSVAFSPDGKLLASGSDDGTVRLWDLATGK-LLRTLTGHTGAVRSVAFSP--DGkLLASGSADGTVRLWDLAT 235
                         250       260       270
                  ....*....|....*....|....*....|....*....
gi 302699201  364 EKEVGGMEmAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:COG2319   236 GKLLRTLT-GHSGSVRSVAFSPDGRLLASGSADGTVRLW 273
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
119-359 4.20e-39

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 147.48  E-value: 4.20e-39
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  119 KCPVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAmtwshndmwmltadhggyvkywqsnmnnvkmf 198
Cdd:cd00200    93 TSYVSSVAFSPDGRILSSSSRDKTIKVWDVETGKCLTTLRGHTDWVNS-------------------------------- 140
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  199 qahkeaireASFSPtDNKF-ATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKT 277
Cdd:cd00200   141 ---------VAFSP-DGTFvASSSQDGTIKLWDLRTGKCVATLTGHTGEVNSVAFSPDGEKLLSSSSDGT--IKLWDLST 208
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  278 GQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLL 357
Cdd:cd00200   209 GKCLGTLRGHENGVNSVAFSPDGYLLASGSEDGTIRVWDLRT-GECVQTLSGHTNSVTSLAWSP-DGKRLASGSADGTIR 286

                  ..
gi 302699201  358 FW 359
Cdd:cd00200   287 IW 288
WD40 COG2319
WD40 repeat [General function prediction only];
121-319 7.34e-39

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 150.45  E-value: 7.34e-39
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQ-SNMNNVKMFQ 199
Cdd:COG2319   206 AVRSVAFSPDGKLLASGSADGTVRLWDLATGKLLRTLTGHSGSVRSVAFSPDGRLLASGSADGTVRLWDlATGELLRTLT 285
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  200 AHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQ 279
Cdd:COG2319   286 GHSGGVNSVAFSPDGKLLASGSDDGTVRLWDLATGKLLRTLTGHTGAVRSVAFSPDGKTLASGSDDGT--VRLWDLATGE 363
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|
gi 302699201  280 SLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRN 319
Cdd:COG2319   364 LLRTLTGHTGAVTSVAFSPDGRTLASGSADGTVRLWDLAT 403
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
119-274 1.51e-28

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 117.05  E-value: 1.51e-28
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  119 KCPVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQSNM-NNVKM 197
Cdd:cd00200   135 TDWVNSVAFSPDGTFVASSSQDGTIKLWDLRTGKCVATLTGHTGEVNSVAFSPDGEKLLSSSSDGTIKLWDLSTgKCLGT 214
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 302699201  198 FQAHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWD 274
Cdd:cd00200   215 LRGHENGVNSVAFSPDGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLASGSADGT--IRIWD 289
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
576-926 1.09e-13

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 75.82  E-value: 1.09e-13
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   576 PSSGTPLLGPQPFSGQGPMSQIPQGFQQPHPSQQMPLV---AQMGPPGPQGQFRAP---GPQGQMGPQGPPLHQGGGGPQ 649
Cdd:pfam09606  140 PSQMSRVGRMQPGGQAGGMMQPSSGQPGSGTPNQMGPNggpGQGQAGGMNGGQQGPmggQMPPQMGVPGMPGPADAGAQM 219
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   650 GFMGPQGPQgppqglPRPQDMHGP------QGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPGPQGHIGPQGPPA 723
Cdd:pfam09606  220 GQQAQANGG------MNPQQMGGApnqvamQQQQPQQQGQQSQLGMGINQMQQMPQGVGGGAGQGGPGQPMGPPGQQPGA 293
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   724 PQGHMGPQGPPGTQGMQGPPgPRGMQGPPHPHGIQGGPTSQGIQGPLMGLNPRGMQGPPGPRENQGPAPQGIMIGHPPQE 803
Cdd:pfam09606  294 MPNVMSIGDQNNYQQQQTRQ-QQQQQGGNHPAAHQQQMNQSVGQGGQVVALGGLNHLETWNPGNFGGLGANPMQRGQPGM 372
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   804 MRGPHPpsgLLGHGPQEMRGPQEMRgmqgPPPQGSMLGPpqelrGPSGSQGQQGPPQGSLGPppqggmqgppgpqgqqnP 883
Cdd:pfam09606  373 MSSPSP---VPGQQVRQVTPNQFMR----QSPQPSVPSP-----QGPGSQPPQSHPGGMIPS-----------------P 423
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|...
gi 302699201   884 ARGPHPSQGPIPFQQQKAPLLGDGPRAPFNQEGQSTGPPPLIP 926
Cdd:pfam09606  424 ALIPSPSPQMSQQPAQQRTIGQDSPGGSLNTPGQSAVNSPLNP 466
gly_rich_SclB NF038329
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like ...
690-858 1.88e-12

LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like protein 2) is an LPXTG-anchored surface adhesin with a variable-length region of triple helix-forming collagen-like Gly-Xaa-Xaa repeats.


Pssm-ID: 468478 [Multi-domain]  Cd Length: 440  Bit Score: 71.09  E-value: 1.88e-12
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  690 GPPGPQGSSGPQGHMGPQGPPGPQGHIGPQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQGIQGP 769
Cdd:NF038329  132 GEQGPRGDRGETGPAGPAGPPGPQGERGEKGPAGPQGEAGPQGPAGKDGEAGAKGPAGEKGPQGPRGETGPAGEQGPAGP 211
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  770 lmgLNPRGMQGPPGPRENQGPAPQGIMIGHPPQEMRGPHPPSGLLGH-GPQEMRGPQEMRGMQGPPPQGSMLGPPqelrG 848
Cdd:NF038329  212 ---AGPDGEAGPAGEDGPAGPAGDGQQGPDGDPGPTGEDGPQGPDGPaGKDGPRGDRGEAGPDGPDGKDGERGPV----G 284
                         170
                  ....*....|
gi 302699201  849 PSGSQGQQGP 858
Cdd:NF038329  285 PAGKDGQNGK 294
gly_rich_SclB NF038329
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like ...
608-857 6.84e-12

LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like protein 2) is an LPXTG-anchored surface adhesin with a variable-length region of triple helix-forming collagen-like Gly-Xaa-Xaa repeats.


Pssm-ID: 468478 [Multi-domain]  Cd Length: 440  Bit Score: 69.16  E-value: 6.84e-12
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  608 QQMPLVAQMGPPGPQGQfraPGPQGQMGPQGPPlhqgGGGPQGFMGPQGPQGPPQGLPRPQdmhGPQGmqrhpgphgplg 687
Cdd:NF038329  111 QQLKGDGEKGEPGPAGP---AGPAGEQGPRGDR----GETGPAGPAGPPGPQGERGEKGPA---GPQG------------ 168
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  688 pqgppgpqgssgpqghmgpqgPPGPQGHIGPQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQGIQ 767
Cdd:NF038329  169 ---------------------EAGPQGPAGKDGEAGAKGPAGEKGPQGPRGETGPAGEQGPAGPAGPDGEAGPAGEDGPA 227
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  768 GPlMGLNPRGMQGPPGPRENQGPApqgimighPPQEMRGPHPPSGLLGH-GPQEMRGPQEMRGMQGPPPQGSMLGpPQEL 846
Cdd:NF038329  228 GP-AGDGQQGPDGDPGPTGEDGPQ--------GPDGPAGKDGPRGDRGEaGPDGPDGKDGERGPVGPAGKDGQNG-KDGL 297
                         250
                  ....*....|.
gi 302699201  847 RGPSGSQGQQG 857
Cdd:NF038329  298 PGKDGKDGQNG 308
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
191-230 7.13e-09

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 52.70  E-value: 7.13e-09
                            10        20        30        40
                    ....*....|....*....|....*....|....*....|
gi 302699201    191 NMNNVKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD 230
Cdd:smart00320    1 SGELLKTLKGHTGPVTSVAFSPDGKYLASGSDDGTIKLWD 40
PTZ00421 PTZ00421
coronin; Provisional
205-319 2.81e-08

coronin; Provisional


Pssm-ID: 173611 [Multi-domain]  Cd Length: 493  Bit Score: 57.98  E-value: 2.81e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  205 IREASFSPTDN-KFATCSDDGTVRIWDFlrcHEERI----------LRGHGADVKCVDWHPT-KGLVVSGSKDSQqpIKF 272
Cdd:PTZ00421   78 IIDVAFNPFDPqKLFTASEDGTIMGWGI---PEEGLtqnisdpivhLQGHTKKVGIVSFHPSaMNVLASAGADMV--VNV 152
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*..
gi 302699201  273 WDPKTGQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRN 319
Cdd:PTZ00421  153 WDVERGKAVEVIKCHSDQITSLEWNLDGSLLCTTSKDKKLNIIDPRD 199
WD40 pfam00400
WD domain, G-beta repeat;
195-230 7.51e-08

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 49.65  E-value: 7.51e-08
                           10        20        30
                   ....*....|....*....|....*....|....*.
gi 302699201   195 VKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD 230
Cdd:pfam00400    4 LKTLEGHTGSVTSLAFSPDGKLLASGSDDGTVKVWD 39
PLN00181 PLN00181
protein SPA1-RELATED; Provisional
215-359 1.86e-07

protein SPA1-RELATED; Provisional


Pssm-ID: 177776 [Multi-domain]  Cd Length: 793  Bit Score: 55.86  E-value: 1.86e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  215 NKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWH---PTkgLVVSGSKDSQqpIKFWDPKTGQSLATLHAHKNTV 291
Cdd:PLN00181  546 SQVASSNFEGVVQVWDVARSQLVTEMKEHEKRVWSIDYSsadPT--LLASGSDDGS--VKLWSINQGVSIGTIKTKANIC 621
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 302699201  292 MEVKLNLNGNWLLTASRDHLCKLFDIRNLKEELQVFRGHKKEATAVAWhpVHEGLFASGGSDGSLLFW 359
Cdd:PLN00181  622 CVQFPSESGRSLAFGSADHKVYYYDLRNPKLPLCTMIGHSKTVSYVRF--VDSSTLVSSSTDNTLKLW 687
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
238-274 5.80e-07

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 47.31  E-value: 5.80e-07
                            10        20        30
                    ....*....|....*....|....*....|....*..
gi 302699201    238 RILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWD 274
Cdd:smart00320    6 KTLKGHTGPVTSVAFSPDGKYLASGSDDGT--IKLWD 40
WD40 pfam00400
WD domain, G-beta repeat;
238-274 3.34e-06

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 45.03  E-value: 3.34e-06
                           10        20        30
                   ....*....|....*....|....*....|....*..
gi 302699201   238 RILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWD 274
Cdd:pfam00400    5 KTLEGHTGSVTSLAFSPDGKLLASGSDDGT--VKVWD 39
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
321-360 4.25e-06

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 44.61  E-value: 4.25e-06
                            10        20        30        40
                    ....*....|....*....|....*....|....*....|
gi 302699201    321 KEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLLFWH 360
Cdd:smart00320    2 GELLKTLKGHTGPVTSVAFSP-DGKYLASGSDDGTIKLWD 40
PHA03247 PHA03247
large tegument protein UL36; Provisional
561-970 5.79e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 51.09  E-value: 5.79e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  561 IERLAQKQADQIQPPPSSG---TPLLGPQPFSGQGPMSQIPQGFQQPHPSQQMPLVAQMGPPGPQGQFRAPGPQGQMGPQ 637
Cdd:PHA03247 2578 SEPAVTSRARRPDAPPQSArprAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPSPAANEPDPHPPPTVPPPERPRDDPA 2657
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  638 GPPLHQGGGGPQGFMGPQGPQGPPQglPRPQDMHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPGPQGHIG 717
Cdd:PHA03247 2658 PGRVSRPRRARRLGRAAQASSPPQR--PRRRAARPTVGSLTSLADPPPPPPTPEPAPHALVSATPLPPGPAAARQASPAL 2735
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  718 PQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQGIQGPLMGLNPrGMQGPPGPREnqgPAPQGIMI 797
Cdd:PHA03247 2736 PAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLSE-SRESLPSPWD---PADPPAAV 2811
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  798 GHPPQEMRGPHPPSGLLghgpqemrgPQEMRGMQGPPPQGSmlGPPQELRGPSGSQGQQG-----PPQGSLGPPPQGGMQ 872
Cdd:PHA03247 2812 LAPAAALPPAASPAGPL---------PPPTSAQPTAPPPPP--GPPPPSLPLGGSVAPGGdvrrrPPSRSPAAKPAAPAR 2880
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  873 GPPGPQGQQNPARGPHP-SQGPIPFQQQKAPLLGDGPRAPFNQEGQ-STGPPPLIPGLGQqgaqgriPPLNPGQGPGPNK 950
Cdd:PHA03247 2881 PPVRRLARPAVSRSTESfALPPDQPERPPQPQAPPPPQPQPQPPPPpQPQPPPPPPPRPQ-------PPLAPTTDPAGAG 2953
                         410       420
                  ....*....|....*....|
gi 302699201  951 GDSRGPPNHHLGPMSERRHE 970
Cdd:PHA03247 2954 EPSGAVPQPWLGALVPGRVA 2973
Collagen pfam01391
Collagen triple helix repeat (20 copies); Members of this family belong to the collagen ...
717-769 6.79e-06

Collagen triple helix repeat (20 copies); Members of this family belong to the collagen superfamily. Collagens are generally extracellular structural proteins involved in formation of connective tissue structure. The alignment contains 20 copies of the G-X-Y repeat that forms a triple helix. The first position of the repeat is glycine; the second and third positions can be any residue but are frequently proline and hydroxyproline. Collagens are post-translationally modified by proline hydroxylase to form the hydroxyproline residues. Defective hydroxylation is the cause of scurvy. Some members of the collagen superfamily are not involved in connective tissue structure but share the same triple helical structure. The family includes bacterial collagen-like triple-helix repeat proteins.


Pssm-ID: 460189 [Multi-domain]  Cd Length: 57  Bit Score: 44.79  E-value: 6.79e-06
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|...
gi 302699201   717 GPQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQGIQGP 769
Cdd:pfam01391    1 GPPGPPGPPGPPGPPGPPGPPGPPGPPGPPGEPGPPGPPGPPGPPGPPGAPGA 53
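
The collagen description above defines the G-X-Y repeat as a glycine followed by any two residues. The Python sketch below counts the longest uninterrupted run of such triplets in a sequence, trying all three reading frames. The function name `longest_gxy_run` is hypothetical and the approach is only an illustration, not the scoring method used by CDD. The input is the query interval 717-769 copied from the alignment above.

```python
def longest_gxy_run(seq):
    """Return the longest run of consecutive G-X-Y triplets in any of the 3 frames."""
    seq = seq.upper()
    best = 0
    for frame in range(3):
        run = 0
        # Step through complete triplets only; a trailing partial triplet is ignored.
        for i in range(frame, len(seq) - 2, 3):
            if seq[i] == "G":
                run += 1
                best = max(best, run)
            else:
                run = 0  # a non-glycine first position breaks the repeat
    return best


# Query residues 717-769 from the pfam01391 alignment above.
query_segment = "GPQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQGIQGP"
print(longest_gxy_run(query_segment))
```

In this segment, triplets beginning with A, H, or T interrupt the run, so the query resembles collagen only over short stretches, in line with the modest bit score reported for this hit.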
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
373-402 7.14e-06

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 44.23  E-value: 7.14e-06
                            10        20        30
                    ....*....|....*....|....*....|
gi 302699201    373 AHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:smart00320   10 GHTGPVTSVAFSPDGKYLASGSDDGTIKLW 39
SPT5 COG5164
Transcription elongation factor SPT5 [Transcription];
665-949 9.89e-06

Transcription elongation factor SPT5 [Transcription];


Pssm-ID: 444063 [Multi-domain]  Cd Length: 495  Bit Score: 49.64  E-value: 9.89e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  665 PRPQDMHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPgpqghiGPQGPPAPQGHMGPQGPPGTQGMQGPPG 744
Cdd:COG5164    12 SDPGGVTTPAGSQGSTKPAQNQGSTRPAGNTGGTRPAQNQGSTTPA------GNTGGTRPAGNQGATGPAQNQGGTTPAQ 85
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  745 PRGMQGPPHPHGIQGGPTSQGIQGPlmgLNPRGMQGPPGPRENQGPAPQGIMIGHPPqemrGPHPPSGLLGHGPQEMRGP 824
Cdd:COG5164    86 NQGGTRPAGNTGGTTPAGDGGATGP---PDDGGATGPPDDGGSTTPPSGGSTTPPGD----GGSTPPGPGSTGPGGSTTP 158
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  825 QEMRGMQGPPPQGSMLGPPqelrgpsGSQGQQGPPQGSlgpppqggmqgppgpqgqqnpARGPHPSQGPIPFQQQKAPLL 904
Cdd:COG5164   159 PGDGGSTTPPGPGGSTTPP-------DDGGSTTPPNKG---------------------ETGTDIPTGGTPRQGPDGPVK 210
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|....*
gi 302699201  905 GDGPRAPFNQEGQSTGPPPLIPGLGQQGAQGRIPPLNPGQGPGPN 949
Cdd:COG5164   211 KDDKNGKGNPPDDRGGKTGPKDQRPKTNPIERRGPERPEAAALPA 255
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
614-983 1.11e-05

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 50.01  E-value: 1.11e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   614 AQMGPPGPQGQFRAPGPQGQMGPqgPPLHQGGGGPQGFMGPQGPQgppqglPRPQDMHGPQGMQRHPGPHGPLGPQGPPG 693
Cdd:pfam09606   58 AQQQQPQGGQGNGGMGGGQQGMP--DPINALQNLAGQGTRPQMMG------PMGPGPGGPMGQQMGGPGTASNLLASLGR 129
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   694 PQGSsgpqghmgpqgppgpqghIGPQGPPAPQGHMGPQGPPG-TQGMQGPPGPRGMQGPPHPHGIQGGP-------TSQG 765
Cdd:pfam09606  130 PQMP------------------MGGAGFPSQMSRVGRMQPGGqAGGMMQPSSGQPGSGTPNQMGPNGGPgqgqaggMNGG 191
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   766 IQGPLMGLNPRGM--QGPPGPRE--NQGPAPQGIMIGHPPQEMRGPhPPSGLLGHGPQEMRGPQEMRGMQGPPPQGSMLG 841
Cdd:pfam09606  192 QQGPMGGQMPPQMgvPGMPGPADagAQMGQQAQANGGMNPQQMGGA-PNQVAMQQQQPQQQGQQSQLGMGINQMQQMPQG 270
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   842 PPQElrGPSGSQGQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARGPHPSQ-GPIPFQQQKAPLLGDGPRAPFNQEGQSTG 920
Cdd:pfam09606  271 VGGG--AGQGGPGQPMGPPGQQPGAMPNVMSIGDQNNYQQQQTRQQQQQQgGNHPAAHQQQMNQSVGQGGQVVALGGLNH 348
                          330       340       350       360       370       380
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 302699201   921 PPPLIP----GLGQQGAQGRIPPLNPGQGPGPNKGDSRGPPNHHLGPMSERRHEQSGGP-EHGPDRGP 983
Cdd:pfam09606  349 LETWNPgnfgGLGANPMQRGQPGMMSSPSPVPGQQVRQVTPNQFMRQSPQPSVPSPQGPgSQPPQSHP 416
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
277-316 1.14e-05

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 43.46  E-value: 1.14e-05
                            10        20        30        40
                    ....*....|....*....|....*....|....*....|
gi 302699201    277 TGQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFD 316
Cdd:smart00320    1 SGELLKTLKGHTGPVTSVAFSPDGKYLASGSDDGTIKLWD 40
PHA03378 PHA03378
EBNA-3B; Provisional
734-978 1.18e-05

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 50.07  E-value: 1.18e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  734 PGTQGMQGPPGPRGMQGPPHphgiQGGPTSQGIQGPLMGLNPRGMQ----GPPGPRENQGPaPQGIMIGHPPQEMRGPHP 809
Cdd:PHA03378  598 PVPHPSQTPEPPTTQSHIPE----TSAPRQWPMPLRPIPMRPLRMQpitfNVLVFPTPHQP-PQVEITPYKPTWTQIGHI 672
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  810 PSGLLGHGPQEMRGPQEMRGMQGPPPQGsmlgpPQELRGPSGSQGQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARGPHP 889
Cdd:PHA03378  673 PYQPSPTGANTMLPIQWAPGTMQPPPRA-----PTPMRPPAAPPGRAQRPAAATGRARPPAAAPGRARPPAAAPGRARPP 747
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  890 SQGPIPFQQQKAPLLGDGPraPFNQEGQST-GPPPLIPGLGQQGAQGRIPPLNPGQ-GPGPNKGDSRGPPNHHLGPMSER 967
Cdd:PHA03378  748 AAAPGRARPPAAAPGRARP--PAAAPGAPTpQPPPQAPPAPQQRPRGAPTPQPPPQaGPTSMQLMPRAAPGQQGPTKQIL 825
                         250
                  ....*....|.
gi 302699201  968 RHEQSGGPEHG 978
Cdd:PHA03378  826 RQLLTGGVKRG 836
WD40 pfam00400
WD domain, G-beta repeat;
278-316 1.40e-05

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 43.10  E-value: 1.40e-05
                           10        20        30
                   ....*....|....*....|....*....|....*....
gi 302699201   278 GQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFD 316
Cdd:pfam00400    1 GKLLKTLEGHTGSVTSLAFSPDGKLLASGSDDGTVKVWD 39
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
533-761 4.88e-05

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 47.72  E-value: 4.88e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   533 EKKKTQAEIEQEMATLQYTNPQL------LEQLKIERLAQKQADQIQPPPSSGtpllgPQPFSGQGPMSQIPQGFQQPHP 606
Cdd:pfam09770  167 PKKAAAPAPAPQPAAQPASLPAPsrkmmsLEEVEAAMRAQAKKPAQQPAPAPA-----QPPAAPPAQQAQQQQQFPPQIQ 241
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   607 SQQMPlvAQMGPPGPQGQFRAPGPQGQMGPQGPPLhqggggPQGFMGPQGPQGPPQGLPRPQDMHGPQGMQRhpgphgpl 686
Cdd:pfam09770  242 QQQQP--QQQPQQPQQHPGQGHPVTILQRPQSPQP------DPAQPSIQPQAQQFHQQPPPVPVQPTQILQN-------- 305
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 302699201   687 gpqgppgpqgssgpqghmgpqgppgpqghigPQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGP 761
Cdd:pfam09770  306 -------------------------------PNRLSAARVGYPQNPQPGVQPAPAHQAHRQQGSFGRQAPIITHP 349
WD40 pfam00400
WD domain, G-beta repeat;
373-402 5.47e-05

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 41.56  E-value: 5.47e-05
                           10        20        30
                   ....*....|....*....|....*....|
gi 302699201   373 AHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:pfam00400    9 GHTGSVTSLAFSPDGKLLASGSDDGTVKVW 38
Collagen pfam01391
Collagen triple helix repeat (20 copies); Members of this family belong to the collagen ...
717-765 5.66e-05

Collagen triple helix repeat (20 copies); Members of this family belong to the collagen superfamily. Collagens are generally extracellular structural proteins involved in formation of connective tissue structure. The alignment contains 20 copies of the G-X-Y repeat that forms a triple helix. The first position of the repeat is glycine; the second and third positions can be any residue but are frequently proline and hydroxy-proline. Collagens are post-translationally modified by proline hydroxylase to form the hydroxy-proline residues. Defective hydroxylation is the cause of scurvy. Some members of the collagen superfamily are not involved in connective tissue structure but share the same triple helical structure. The family includes bacterial collagen-like triple-helix repeat proteins.


Pssm-ID: 460189 [Multi-domain]  Cd Length: 57  Bit Score: 42.10  E-value: 5.66e-05
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....*....
gi 302699201   717 GPQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQG 765
Cdd:pfam01391    7 GPPGPPGPPGPPGPPGPPGPPGPPGEPGPPGPPGPPGPPGPPGAPGAPG 55
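
As a rough illustration of the G-X-Y periodicity described above, the following Python sketch splits the query segment 717-765 (copied from the alignment) into consecutive triplets and counts how many begin with glycine. The helper names `gxy_triplets` and `glycine_first_count`, and the choice to start the reading frame at the first residue of the segment, are assumptions made for this example; they are not part of the CDD output.

```python
# Query residues 717-765 of gi 302699201, copied from the alignment above.
QUERY_717_765 = "GPQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQG"


def gxy_triplets(seq, frame=0):
    """Return the complete triplets of seq, starting at offset `frame`.

    Hypothetical helper: any trailing residues that do not fill a
    whole triplet are ignored.
    """
    usable = len(seq) - frame
    end = frame + (usable // 3) * 3
    return [seq[i:i + 3] for i in range(frame, end, 3)]


def glycine_first_count(seq, frame=0):
    """Return (triplets starting with G, total complete triplets)."""
    triplets = gxy_triplets(seq, frame)
    with_gly = sum(1 for t in triplets if t[0] == "G")
    return with_gly, len(triplets)


if __name__ == "__main__":
    hits, total = glycine_first_count(QUERY_717_765)
    print(f"{hits} of {total} triplets start with glycine")
```

In this frame, 13 of the 16 complete triplets start with glycine, which is consistent with a collagen-like G-X-Y pattern but says nothing by itself about whether the region forms a triple helix.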
PTZ00421 PTZ00421
coronin; Provisional
288-398 6.47e-05

coronin; Provisional


Pssm-ID: 173611 [Multi-domain]  Cd Length: 493  Bit Score: 47.20  E-value: 6.47e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  288 KNTVMEVKLN-LNGNWLLTASRDHLCKLFDI------RNLKEELQVFRGHKKEATAVAWHPVHEGLFASGGSDGSLLFWH 360
Cdd:PTZ00421   75 EGPIIDVAFNpFDPQKLFTASEDGTIMGWGIpeegltQNISDPIVHLQGHTKKVGIVSFHPSAMNVLASAGADMVVNVWD 154
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|
gi 302699201  361 V--GVEKEVggmEMAHEGMIWSLAWHPLGHILCSGSNDHT 398
Cdd:PTZ00421  155 VerGKAVEV---IKCHSDQITSLEWNLDGSLLCTTSKDKK 191
PABP-1234 TIGR01628
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ...
556-675 1.56e-04

polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens, which are, respectively, expressed in testis, expressed in platelets, broadly expressed, and of unknown tissue range.


Pssm-ID: 130689 [Multi-domain]  Cd Length: 562  Bit Score: 45.95  E-value: 1.56e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   556 LEQLKIERLA--QKQADQIQPP---PSSGTPLLG---PQPFSGQGPMSQIPQ-----GFQQPHPSQQMPlvaqMGPPGPQ 622
Cdd:TIGR01628  360 LAQRKEQRRAhlQDQFMQLQPRmrqLPMGSPMGGamgQPPYYGQGPQQQFNGqplgwPRMSMMPTPMGP----GGPLRPN 435
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....
gi 302699201   623 GqFRAPGPQGQMGPQGPPL-HQGGGGPQGFMGPQGPQGPPQGLPRPQDMHGPQG 675
Cdd:TIGR01628  436 G-LAPMNAVRAPSRNAQNAaQKPPMQPVMYPPNYQSLPLSQDLPQPQSTASQGG 488
WD40 pfam00400
WD domain, G-beta repeat;
321-359 1.60e-04

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 40.41  E-value: 1.60e-04
                           10        20        30
                   ....*....|....*....|....*....|....*....
gi 302699201   321 KEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLLFW 359
Cdd:pfam00400    1 GKLLKTLEGHTGSVTSLAFSP-DGKLLASGSDDGTVKVW 38
PHA03247 PHA03247
large tegument protein UL36; Provisional
721-1021 1.61e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 46.47  E-value: 1.61e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  721 PPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPH---PHGIQGGPTSQGIQGPLMGLNPRGMQG----------PPGPREN 787
Cdd:PHA03247 2628 PPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRvsrPRRARRLGRAAQASSPPQRPRRRAARPtvgsltsladPPPPPPT 2707
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  788 QGPAPQGIMIGHP----PQEMRGPHPPSGL------------LGHGPQEMRGPQEMRGMQGP-PPQGSMLGPPQELRGPS 850
Cdd:PHA03247 2708 PEPAPHALVSATPlppgPAAARQASPALPAapappavpagpaTPGGPARPARPPTTAGPPAPaPPAAPAAGPPRRLTRPA 2787
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  851 GSQGQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQQQKAPLLGDGPRAPFNQEGQSTGP-------PP 923
Cdd:PHA03247 2788 VASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSVAPggdvrrrPP 2867
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  924 LIPGLGQQGAQGRI-------PPLNPGQGPGPNKGDSRGPPNHHLGPMSERRHEQSGGPEHGPDRGPFRGGQDCRGPPDR 996
Cdd:PHA03247 2868 SRSPAAKPAAPARPpvrrlarPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTT 2947
                         330       340
                  ....*....|....*....|....*
gi 302699201  997 RgshpdfPDDFSRPDDFHPDKRFGH 1021
Cdd:PHA03247 2948 D------PAGAGEPSGAVPQPWLGA 2966
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
150-188 2.25e-04

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 39.99  E-value: 2.25e-04
                            10        20        30
                    ....*....|....*....|....*....|....*....
gi 302699201    150 TFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYW 188
Cdd:smart00320    1 SGELLKTLKGHTGPVTSVAFSPDGKYLASGSDDGTIKLW 39
PTZ00420 PTZ00420
coronin; Provisional
284-397 1.40e-03

coronin; Provisional


Pssm-ID: 240412 [Multi-domain]  Cd Length: 568  Bit Score: 43.01  E-value: 1.40e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  284 LHAHKNTVMEVKLN-LNGNWLLTASRDHLCKLFDIRN-------LKEELQVFRGHKKEATAVAWHPVHEGLFASGGSDGS 355
Cdd:PTZ00420   70 LKGHTSSILDLQFNpCFSEILASGSEDLTIRVWEIPHndesvkeIKDPQCILKGHKKKISIIDWNPMNYYIMCSSGFDSF 149
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*
gi 302699201  356 LLFWHVGVEKEVGGMEMAHEgmIWSLAWHPLGHIL---CSGSNDH 397
Cdd:PTZ00420  150 VNIWDIENEKRAFQINMPKK--LSSLKWNIKGNLLsgtCVGKHMH 192
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
718-904 1.55e-03

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 43.10  E-value: 1.55e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   718 PQGPPAPQGHMGPQGPPGTQGM-------------QGPPGPRGMQGPPHPHGIQGGPTSQGIQGPLMGlnprgmQGPPGP 784
Cdd:pfam09770  173 PAPAPQPAAQPASLPAPSRKMMsleeveaamraqaKKPAQQPAPAPAQPPAAPPAQQAQQQQQFPPQI------QQQQQP 246
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   785 RENQGPAPQGIMIGHPPQEMRGPHPPSGllgHGPQEMRGPQEMRGMQGPPPQgsMLGPPQELRGP-----SGSQGQQGPP 859
Cdd:pfam09770  247 QQQPQQPQQHPGQGHPVTILQRPQSPQP---DPAQPSIQPQAQQFHQQPPPV--PVQPTQILQNPnrlsaARVGYPQNPQ 321
                          170       180       190       200
                   ....*....|....*....|....*....|....*....|....*
gi 302699201   860 QGSLGPPPQGGMQGPPGPQGQQNPArgPHPSQGPIPFQQQKAPLL 904
Cdd:pfam09770  322 PGVQPAPAHQAHRQQGSFGRQAPII--THPQQLAQLSEEEKAAYL 364
PTZ00420 PTZ00420
coronin; Provisional
198-274 1.75e-03

coronin; Provisional


Pssm-ID: 240412 [Multi-domain]  Cd Length: 568  Bit Score: 42.63  E-value: 1.75e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  198 FQAHKEAIREASFSPTDNK-FATCSDDGTVRIWDfLRCHEER---------ILRGHGADVKCVDWHPTKGLVVSGSK-DS 266
Cdd:PTZ00420   70 LKGHTSSILDLQFNPCFSEiLASGSEDLTIRVWE-IPHNDESvkeikdpqcILKGHKKKISIIDWNPMNYYIMCSSGfDS 148

                  ....*...
gi 302699201  267 QqpIKFWD 274
Cdd:PTZ00420  149 F--VNIWD 154
Glutenin_hmw pfam03157
High molecular weight glutenin subunit; Members of this family include high molecular weight ...
565-980 1.90e-03

High molecular weight glutenin subunit; Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.


Pssm-ID: 367362 [Multi-domain]  Cd Length: 786  Bit Score: 42.63  E-value: 1.90e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   565 AQKQADQIQPPPSSGTPLLGPQPF-------SGQGPMSQIPQGFQQPHPSQQMPLVAQMGPPGPQGQFRAPGPQGQMGPQ 637
Cdd:pfam03157  301 SQQQAGQLQQEQQLGQEQQDQQPGqgrqgqqPGQGQQGQQPAQGQQPGQGQPGYYPTSPQQPGQGQPGYYPTSQQQPQQG 380
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   638 GPPLHQGGGGPQGFMGPQGPQGPPQGLPRPQDMHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPGPQGHIG 717
Cdd:pfam03157  381 QQPEQGQQGQQQGQGQQGQQPGQGQQPGQGQPGYYPTSPQQSGQGQPGYYPTSPQQSGQGQQPGQGQQPGQEQPGQGQQP 460
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   718 PQGPPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQGIQGPLMGLNPRGMQGPPG----PRENQGPAPQ 793
Cdd:pfam03157  461 GQGQQGQQPGQPEQGQQPGQGQPGYYPTSPQQSGQGQQLGQWQQQGQGQPGYYPTSPLQPGQGQPGyyptSPQQPGQGQQ 540
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   794 GIMIGHPPQEMRGPHPPSGLLGHGP-QEMRGPQEMRGMQGPPP-QGSMLGPPQELRGPSGSQ--------GQQGPPQGSL 863
Cdd:pfam03157  541 LGQLQQPTQGQQGQQSGQGQQGQQPgQGQQGQQPGQGQQGQQPgQGQQPGQGQPGYYPTSPQqsgqgqqpGQWQQPGQGQ 620
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   864 GPPPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQQQKApllGDGPRAPFNQEGQSTGPPPLiPGLGQ---------QGAQ 934
Cdd:pfam03157  621 PGYYPTSSLQLGQGQQGYYPTSPQQPGQGQQPGQWQQS---GQGQQGYYPTSPQQSGQAQQ-PGQGQqpgqwlqpgQGQQ 696
                          410       420       430       440
                   ....*....|....*....|....*....|....*....|....*...
gi 302699201   935 GRIP--PLNPGQGPGPNKGDSRGPPNHHLGPMSERRHEQSGGPEHGPD 980
Cdd:pfam03157  697 GYYPtsPQQPGQGQQLGQGQQSGQGQQGYYPTSPGQGQQSGQGQQGYD 744
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
618-1001 2.05e-03

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 42.85  E-value: 2.05e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  618 PPGPQGQFRAPGPQGQMGPqgpPLHQGGGGPQGFMGPQGPQGPPQGLPRPQDMHGPQGMQRHPGPHGPLGPQGppgpqgs 697
Cdd:PHA03307   27 TPGDAADDLLSGSQGQLVS---DSAELAAVTVVAGAAACDRFEPPTGPPPGPGTEAPANESRSTPTWSLSTLA------- 96
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  698 sgpqghmgpqgppgpqghigpqgpPAPQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHgiqgGPTSQGIQGPLMGLNPRG 777
Cdd:PHA03307   97 ------------------------PASPAREGSPTPPGPSSPDPPPPTPPPASPPPSP----APDLSEMLRPVGSPGPPP 148
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  778 MQGPPGPRENQGPAPQG--------IMIGHPPQEMRGPHPPSGLLGHGPQEMRGPqemrgmQGPPPQGSMLGPPQELRGP 849
Cdd:PHA03307  149 AASPPAAGASPAAVASDaassrqaaLPLSSPEETARAPSSPPAEPPPSTPPAAAS------PRPPRRSSPISASASSPAP 222
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  850 SGSQGQQGPPQGSLGPPPQGGMQGPPGPQGQQNParGPHPSQGPIPFQqqkapllgdgPRAPFNQEGQSTGPPPLIPGLG 929
Cdd:PHA03307  223 APGRSAADDAGASSSDSSSSESSGCGWGPENECP--LPRPAPITLPTR----------IWEASGWNGPSSRPGPASSSSS 290
                         330       340       350       360       370       380       390
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 302699201  930 QQGAQGRIPPLNPGQG---PGPNKGDSRGPPNHHLGPMSERRHEQSGGPEHGPDRGPFRGGQDCRGPPDRRGSHP 1001
Cdd:PHA03307  291 PRERSPSPSPSSPGSGpapSSPRASSSSSSSRESSSSSTSSSSESSRGAAVSPGPSPSRSPSPSRPPPPADPSSP 365
WD40 pfam00400
WD domain, G-beta repeat;
151-188 2.51e-03

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 36.94  E-value: 2.51e-03
                           10        20        30
                   ....*....|....*....|....*....|....*...
gi 302699201   151 FNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYW 188
Cdd:pfam00400    1 GKLLKTLEGHTGSVTSLAFSPDGKLLASGSDDGTVKVW 38
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
565-860 3.24e-03

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly by altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 42.06  E-value: 3.24e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   565 AQKQADQIQPP---------PSSGTPLLGPQPFSGQGPMSQIPQGFQQPHPSQQMPLVAQMGPPGPQGQF---------R 626
Cdd:pfam03154  162 AQQQILQTQPPvlqaqsgaaSPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTLIqqtptlhpqR 241
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   627 APGPQGQMGPQGPPlhqGGGGPQGFMGPQGPQGPPQGLPRPQDMH-GPQGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMG 705
Cdd:pfam03154  242 LPSPHPPLQPMTQP---PPPSQVSPQPLPQPSLHGQMPPMPHSLQtGPSHMQHPVPPQPFPLTPQSSQSQVPPGPSPAAP 318
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   706 PQGPPGpqghigpQGPPAPQGHMGPQGPPGTQGMqgPPGPRGMqgpPHphgIQGGPTSQGIQGPlmglNPRGMQGPPgpr 785
Cdd:pfam03154  319 GQSQQR-------IHTPPSQSQLQSQQPPREQPL--PPAPLSM---PH---IKPPPTTPIPQLP----NPQSHKHPP--- 376
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   786 ENQGPAPQGIMIGHPPQEMRGP-------HPPSgllGHGPQEMRGPQEMRgMQGPPPQGSMLG-----PPQELRGPSGSQ 853
Cdd:pfam03154  377 HLSGPSPFQMNSNLPPPPALKPlsslsthHPPS---AHPPPLQLMPQSQQ-LPPPPAQPPVLTqsqslPPPAASHPPTSG 452

                   ....*..
gi 302699201   854 GQQGPPQ 860
Cdd:pfam03154  453 LHQVPSQ 459
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
807-940 3.91e-03

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 41.56  E-value: 3.91e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   807 PHPPSGLLGHGPQEMRGPQEMRGMQGPPPQGSMLGPPQELRGPSGSQGQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARG 886
Cdd:pfam09770  211 AQQPAPAPAQPPAAPPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPVTILQRPQSPQPDPAQPSIQPQAQQFHQ 290
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*
gi 302699201   887 PHPSQGPIPFQQQKAPLLGDGPRAPFNQEGQ-STGPPPLIPGLGQQGAQGRIPPL 940
Cdd:pfam09770  291 QPPPVPVQPTQILQNPNRLSAARVGYPQNPQpGVQPAPAHQAHRQQGSFGRQAPI 345
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
718-1128 4.65e-03

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 41.70  E-value: 4.65e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  718 PQGPPAPQGHMGPQG-----------PPGTQGMQGPPGPRGMQGPPhPHGIQGGPTSQGIQGPLMGLNPRGMQGPPGPRE 786
Cdd:PHA03307   27 TPGDAADDLLSGSQGqlvsdsaelaaVTVVAGAAACDRFEPPTGPP-PGPGTEAPANESRSTPTWSLSTLAPASPAREGS 105
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  787 NQGPAPQGimighPPQEMRGPHPPSGLLGHGPqEMRGPQEMRGMQGPPPQGSMLGPPqeLRGPSGSQGQQGPPQGSLGPP 866
Cdd:PHA03307  106 PTPPGPSS-----PDPPPPTPPPASPPPSPAP-DLSEMLRPVGSPGPPPAASPPAAG--ASPAAVASDAASSRQAALPLS 177
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  867 PQGGmqgppgpqgqqnPARGPHPSQGPIPFQQQkAPLLGDGPRAPFNQEGQSTGPPPLIPGLGQ-----QGAQGRIPPLN 941
Cdd:PHA03307  178 SPEE------------TARAPSSPPAEPPPSTP-PAAASPRPPRRSSPISASASSPAPAPGRSAaddagASSSDSSSSES 244
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  942 PGQGPGPNKGDSRGPPNHHLGPMSERRHEQSGGPEHGPDRGPFRGGQDCRGPPDRRgSHPDFPDDFSRPddfhPDKRFGH 1021
Cdd:PHA03307  245 SGCGWGPENECPLPRPAPITLPTRIWEASGWNGPSSRPGPASSSSSPRERSPSPSP-SSPGSGPAPSSP----RASSSSS 319
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201 1022 RLREfEGRGGPLPQEEKwRRGGPGPPFPPDHREFNEGDGRGAARGPPGAWEGRRPGDERFPR--DPDDPRFRGRR---EE 1096
Cdd:PHA03307  320 SSRE-SSSSSTSSSSES-SRGAAVSPGPSPSRSPSPSRPPPPADPSSPRKRPRPSRAPSSPAasAGRPTRRRARAavaGR 397
                         410       420       430
                  ....*....|....*....|....*....|...
gi 302699201 1097 SFRRGAPPRHE-GRAPPRGRDNFPGPDDFGPEE 1128
Cdd:PHA03307  398 ARRRDATGRFPaGRPRPSPLDAGAASGAFYARY 430
dnaA PRK14086
chromosomal replication initiator protein DnaA;
718-913 5.52e-03

chromosomal replication initiator protein DnaA;


Pssm-ID: 237605 [Multi-domain]  Cd Length: 617  Bit Score: 40.96  E-value: 5.52e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  718 PQGPPAPQGHMGPQGP-PGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQGiqgplmGLNPRGMQGPPG--PRENQGPAPQG 794
Cdd:PRK14086   96 APPPPHARRTSEPELPrPGRRPYEGYGGPRADDRPPGLPRQDQLPTARP------AYPAYQQRPEPGawPRAADDYGWQQ 169
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  795 IMIGHPPqemRGPHPPSGLLGHGPQEMRGPQEMRGMQGPPPQGSMLGPPQELRGPSGSQGQQ-GPPQGSLGPPPQGGMQG 873
Cdd:PRK14086  170 QRLGFPP---RAPYASPASYAPEQERDREPYDAGRPEYDQRRRDYDHPRPDWDRPRRDRTDRpEPPPGAGHVHRGGPGPP 246
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|.
gi 302699201  874 PPGPQGQQNPARGphpsqGPIPFQQQKAPLLGDG-PRAPFN 913
Cdd:PRK14086  247 ERDDAPVVPIRPS-----APGPLAAQPAPAPGPGePTARLN 282
Glutenin_hmw pfam03157
High molecular weight glutenin subunit; Members of this family include high molecular weight ...
568-978 5.55e-03

High molecular weight glutenin subunit; Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.


Pssm-ID: 367362 [Multi-domain]  Cd Length: 786  Bit Score: 41.09  E-value: 5.55e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   568 QADQIQPPPSSGTPLLGPQpfSGQGPMSQIPQGFQQPHPSQQ--MPLVAQMGPPGPQGQFRAPGPQGQMGPQGPPLHQGG 645
Cdd:pfam03157  256 QGQQGYYPISPQQPRQWQQ--SGQGQQGYYPTSLQQPGQGQSgyYPTSQQQAGQLQQEQQLGQEQQDQQPGQGRQGQQPG 333
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   646 GGPQGFMGPQGPqgppqglprpQDMHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPGPQGHIGPQGPPAPQ 725
Cdd:pfam03157  334 QGQQGQQPAQGQ----------QPGQGQPGYYPTSPQQPGQGQPGYYPTSQQQPQQGQQPEQGQQGQQQGQGQQGQQPGQ 403
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   726 GHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPTSQGIQGPLMGLNPRGMQGPPGPRENQGPAPQGIMIGHPPQEMR 805
Cdd:pfam03157  404 GQQPGQGQPGYYPTSPQQSGQGQPGYYPTSPQQSGQGQQPGQGQQPGQEQPGQGQQPGQGQQGQQPGQPEQGQQPGQGQP 483
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   806 GPHPPS-GLLGHGPQEMRGPQEMRGMQG-------PPPQGSMLGPPQELRGPSGSQ--------GQQGPPQGSLGPPPQG 869
Cdd:pfam03157  484 GYYPTSpQQSGQGQQLGQWQQQGQGQPGyyptsplQPGQGQPGYYPTSPQQPGQGQqlgqlqqpTQGQQGQQSGQGQQGQ 563
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201   870 GMQGPPGPQGQQNPARGPHPSQGPIPFQQQ--------KAPLLGDGPRAPFNQEGQSTGPPPLIPGLGQQGAQGRIP--P 939
Cdd:pfam03157  564 QPGQGQQGQQPGQGQQGQQPGQGQQPGQGQpgyyptspQQSGQGQQPGQWQQPGQGQPGYYPTSSLQLGQGQQGYYPtsP 643
                          410       420       430
                   ....*....|....*....|....*....|....*....
gi 302699201   940 LNPGQGPGPNKGDSRGPPNHHLGPMSERRHEQSGGPEHG 978
Cdd:pfam03157  644 QQPGQGQQPGQWQQSGQGQQGYYPTSPQQSGQAQQPGQG 682
PRK10263 PRK10263
DNA translocase FtsK; Provisional
541-642 9.58e-03

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 40.45  E-value: 9.58e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 302699201  541 IEQEMATLQYTNPQLLEQLKIERLA-QKQADQIQPPPSSGTPLLGPQPFSGQGPMSQIPQGFQQPHPSQQMPlvaqMGPP 619
Cdd:PRK10263  749 VEPVQQPQQPVAPQQQYQQPQQPVApQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQP----QQPV 824
                          90       100
                  ....*....|....*....|...
gi 302699201  620 GPQGQFRAPGPQGQMGPQGPPLH 642
Cdd:PRK10263  825 APQPQYQQPQQPVAPQPQDTLLH 847
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
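
Each hit in the listing carries a one-line summary (Pssm-ID, Cd Length, Bit Score, E-value). A minimal sketch, assuming that line layout stays exactly as shown on this page, parses such a summary and compares its E-value with the 0.01 threshold listed above. The names `parse_summary` and `passes_threshold` are hypothetical and not part of any NCBI tool, and treating the threshold as inclusive is also an assumption.

```python
import re

# Pattern for summary lines such as:
# "Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 40.45  E-value: 9.58e-03"
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r".*?Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Parse one hit summary line into a dict (hypothetical helper)."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a hit summary line: {line!r}")
    return {
        "pssm_id": int(match.group("pssm_id")),
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


def passes_threshold(hit, threshold=0.01):
    """True if the hit's E-value is at or below the threshold (assumed inclusive)."""
    return hit["e_value"] <= threshold


if __name__ == "__main__":
    line = "Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 40.45  E-value: 9.58e-03"
    hit = parse_summary(line)
    print(hit, passes_threshold(hit))
```

Every hit shown on this page reports an E-value below 0.01, so each would pass this check.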
