Conserved domains on  [gi|569004618|ref|XP_006526363|]

pre-mRNA 3' end processing protein WDR33 isoform X2 [Mus musculus]

Protein Classification

WD40 and Med15 domain-containing protein (domain architecture ID 11526309)

List of domain hits

Name Accession Description Interval E-value
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
121-402 1.45e-66

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed-ring propeller structure; residues on the top and bottom surfaces of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.



Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 226.83  E-value: 1.45e-66
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQSNMNN-VKMFQ 199
Cdd:cd00200    11 GVTCVAFSPDGKLLATGSGDGTIKVWDLETGELLRTLKGHTGPVRDVAASADGTYLASGSSDKTIRLWDLETGEcVRTLT 90
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  200 AHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQ 279
Cdd:cd00200    91 GHTSYVSSVAFSPDGRILSSSSRDKTIKVWDVETGKCLTTLRGHTDWVNSVAFSPDGTFVASSSQDGT--IKLWDLRTGK 168
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  280 SLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLLFW 359
Cdd:cd00200   169 CVATLTGHTGEVNSVAFSPDGEKLLSSSSDGTIKLWDLST-GKCLGTLRGHENGVNSVAFSP-DGYLLASGSEDGTIRVW 246
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|...
gi 569004618  360 HVGVEKEVGGMEmAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:cd00200   247 DLRTGECVQTLS-GHTNSVTSLAWSPDGKRLASGSADGTIRIW 288
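Each hit in this listing carries a summary line (Pssm-ID, Cd Length, Bit Score, E-value) followed by an alignment whose `Cdd:` rows give the positions matched in the domain model. As a minimal sketch, assuming the summary line keeps the layout shown above, the fields can be pulled out with a regular expression and the fraction of the domain model covered by the alignment can be computed. The names `HitSummary`, `parse_summary` and `model_coverage` are hypothetical helpers written for this illustration; they are not part of any NCBI tool.

```python
import re
from dataclasses import dataclass

# Assumed layout of a hit summary line, copied from this listing.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r".*?Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


@dataclass
class HitSummary:
    pssm_id: int
    cd_length: int
    bit_score: float
    e_value: float


def parse_summary(line: str) -> HitSummary:
    """Parse one 'Pssm-ID: ... E-value: ...' line into typed fields."""
    m = SUMMARY_RE.search(line)
    if m is None:
        raise ValueError(f"not a hit summary line: {line!r}")
    return HitSummary(
        pssm_id=int(m.group("pssm_id")),
        cd_length=int(m.group("cd_length")),
        bit_score=float(m.group("bit_score")),
        e_value=float(m.group("e_value")),
    )


def model_coverage(cd_from: int, cd_to: int, cd_length: int) -> float:
    """Fraction of the domain model spanned by the aligned Cdd positions."""
    return (cd_to - cd_from + 1) / cd_length


hit = parse_summary(
    "Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  "
    "Bit Score: 226.83  E-value: 1.45e-66"
)
# The cd00200 alignment above runs from model position 11 to 288.
coverage = model_coverage(11, 288, hit.cd_length)
```

For the cd00200 hit this gives 278 of 289 model positions, which is one way to see that the query aligns to nearly the full WD40 domain model and not to a fragment of it.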
gly_rich_SclB super family cl45768
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like ...
608-857 3.50e-14

LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2, streptococcal collagen-like protein 2) is an LPXTG-anchored surface adhesin with a variable-length region of triple helix-forming collagen-like Gly-Xaa-Xaa repeats.


The actual alignment was detected with superfamily member NF038329:

Pssm-ID: 468478 [Multi-domain]  Cd Length: 440  Bit Score: 76.48  E-value: 3.50e-14
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  608 QQMPLVPQMGPPGPQGQfraPGPQGQMGPQGPPMHQGGGGPQGFMGPQGpqgppqglprPQDMHGPQGMQrhpgphgplg 687
Cdd:NF038329  111 QQLKGDGEKGEPGPAGP---AGPAGEQGPRGDRGETGPAGPAGPPGPQG----------ERGEKGPAGPQ---------- 167
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  688 pqgppgpqgssgpqghmgpqGPPGPQGHIGPQGPPASQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPASQGIQ 767
Cdd:NF038329  168 --------------------GEAGPQGPAGKDGEAGAKGPAGEKGPQGPRGETGPAGEQGPAGPAGPDGEAGPAGEDGPA 227
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  768 GPlMGLNPRGMQGPPGPRENQGPA-PQGLmighppqemRGPHPPSGLLGH-GPQEMRGPQEMRGMQGPPPQGSMLGpPQE 845
Cdd:NF038329  228 GP-AGDGQQGPDGDPGPTGEDGPQgPDGP---------AGKDGPRGDRGEaGPDGPDGKDGERGPVGPAGKDGQNG-KDG 296
                         250
                  ....*....|..
gi 569004618  846 LRGPSGSQGQQG 857
Cdd:NF038329  297 LPGKDGKDGQNG 308
COG3416 super family cl47320
Uncharacterized conserved protein, DUF2076 domain [Function unknown];
537-642 6.60e-03

Uncharacterized conserved protein, DUF2076 domain [Function unknown];


The actual alignment was detected with superfamily member COG3416:

Pssm-ID: 442642 [Multi-domain]  Cd Length: 237  Bit Score: 39.62  E-value: 6.60e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  537 TQAEIEQEMAtlqytnpqlLEQL--KIERL-AQKQADQIQPPPSSGTPLLGpqpFSGQGPISQIPQGFQQPHPSQQmplv 613
Cdd:COG3416    47 AQTILVQEAA---------LKQAqqRIQELeAQLAQLQQQQPQSSGGFLSG---LFGGGQRPPPAPQPSQPGPQQQ---- 110
                          90       100
                  ....*....|....*....|....*....
gi 569004618  614 PQMGPPGPQGQFRAPGPQGQMGPQGPPMH 642
Cdd:COG3416   111 PAPPSGPWGQAAPQQPGYGQPQYGQPAAG 139
 
WD40 COG2319
WD40 repeat [General function prediction only];
121-402 1.46e-57

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 204.76  E-value: 1.46e-57
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQSNMNN-VKMFQ 199
Cdd:COG2319   122 AVRSVAFSPDGKTLASGSADGTVRLWDLATGKLLRTLTGHSGAVTSVAFSPDGKLLASGSDDGTVRLWDLATGKlLRTLT 201
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  200 AHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQ 279
Cdd:COG2319   202 GHTGAVRSVAFSPDGKLLASGSADGTVRLWDLATGKLLRTLTGHSGSVRSVAFSPDGRLLASGSADGT--VRLWDLATGE 279
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  280 SLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLLFW 359
Cdd:COG2319   280 LLRTLTGHSGGVNSVAFSPDGKLLASGSDDGTVRLWDLAT-GKLLRTLTGHTGAVRSVAFSP-DGKTLASGSDDGTVRLW 357
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|...
gi 569004618  360 HVGVEKEVGGMEmAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:COG2319   358 DLATGELLRTLT-GHTGAVTSVAFSPDGRTLASGSADGTVRLW 399
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
590-926 3.87e-14

ARC105 or Med15 subunit of Mediator complex non-fungal; The approximately 70-residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 77.36  E-value: 3.87e-14
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   590 GQGPISQIPQGFQQPHPSQQMPLVPQMGPPGPQGQFRAPGPQG------------QMGPQGPPMHQGGGGPQGFMGPQGP 657
Cdd:pfam09606  148 RMQPGGQAGGMMQPSSGQPGSGTPNQMGPNGGPGQGQAGGMNGgqqgpmggqmppQMGVPGMPGPADAGAQMGQQAQANG 227
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   658 QgppqglPRPQDMHGP------QGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPGPQGHIGPQGPPASQGHMGPQ 731
Cdd:pfam09606  228 G------MNPQQMGGApnqvamQQQQPQQQGQQSQLGMGINQMQQMPQGVGGGAGQGGPGQPMGPPGQQPGAMPNVMSIG 301
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   732 GPPGTQGMQGPPgPRGMQGPPHPHGIQGGPASQGIQGPLMGLNPRGMQGPPGPRENQGPAPQGLMIGHPPQEMRGPHPps 811
Cdd:pfam09606  302 DQNNYQQQQTRQ-QQQQQGGNHPAAHQQQMNQSVGQGGQVVALGGLNHLETWNPGNFGGLGANPMQRGQPGMMSSPSP-- 378
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   812 gLLGHGPQEMRGPQEMRgmqgPPPQGSMLGPpqelrGPSGSQGQQGPPQGSLgpppqggmqgppgpqgqQNPARGPHPSQ 891
Cdd:pfam09606  379 -VPGQQVRQVTPNQFMR----QSPQPSVPSP-----QGPGSQPPQSHPGGMI-----------------PSPALIPSPSP 431
                          330       340       350
                   ....*....|....*....|....*....|....*
gi 569004618   892 GPIPFQQQKAPLLGDGPRAPFNQEGQSTGPPPLIP 926
Cdd:pfam09606  432 QMSQQPAQQRTIGQDSPGGSLNTPGQSAVNSPLNP 466
gly_rich_SclB NF038329
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like ...
690-859 1.83e-13

LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2, streptococcal collagen-like protein 2) is an LPXTG-anchored surface adhesin with a variable-length region of triple helix-forming collagen-like Gly-Xaa-Xaa repeats.


Pssm-ID: 468478 [Multi-domain]  Cd Length: 440  Bit Score: 74.17  E-value: 1.83e-13
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  690 GPPgpqgssgpqghmgpqgppgpqghiGPQGPPASQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPASQGIQGP 769
Cdd:NF038329  132 GEQ------------------------GPRGDRGETGPAGPAGPPGPQGERGEKGPAGPQGEAGPQGPAGKDGEAGAKGP 187
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  770 LMGLNPRGMQGPPGPRENQGPA-PQGLMIGHPPQEMRGPHPPSGLLGHGPQEMRGPQEMRGMQGPPPQGSMLGP-----P 843
Cdd:NF038329  188 AGEKGPQGPRGETGPAGEQGPAgPAGPDGEAGPAGEDGPAGPAGDGQQGPDGDPGPTGEDGPQGPDGPAGKDGPrgdrgE 267
                         170
                  ....*....|....*.
gi 569004618  844 QELRGPSGSQGQQGPP 859
Cdd:NF038329  268 AGPDGPDGKDGERGPV 283
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
191-230 7.02e-09

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 52.70  E-value: 7.02e-09
                            10        20        30        40
                    ....*....|....*....|....*....|....*....|
gi 569004618    191 NMNNVKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD 230
Cdd:smart00320    1 SGELLKTLKGHTGPVTSVAFSPDGKYLASGSDDGTIKLWD 40
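The WD40 description earlier in this listing characterizes a repeat as roughly 40 residues, with a GH dipeptide 11-24 residues from the N-terminus and a WD dipeptide at the C-terminus. As a rough illustration only, the sketch below scans a sequence for "GH", then 20-35 residues, then "WD". The spacing bounds and the function name `find_wd40_like` are assumptions made for this example; CD-Search itself scores the query against position-specific scoring matrices and does not use a pattern like this.

```python
import re
from typing import List, Tuple

# Crude heuristic (assumed bounds): GH, 20-35 residues, then WD.
WD40_LIKE = re.compile(r"GH[A-Z]{20,35}?WD")


def find_wd40_like(seq: str) -> List[Tuple[int, int]]:
    """Return 1-based inclusive (start, end) spans running from GH to WD."""
    return [(m.start() + 1, m.end()) for m in WD40_LIKE.finditer(seq.upper())]


# Rows copied from the smart00320 alignment above.
model_row = "SGELLKTLKGHTGPVTSVAFSPDGKYLASGSDDGTIKLWD"  # Cdd:smart00320, 1-40
query_row = "NMNNVKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD"  # gi 569004618, 191-230

model_spans = find_wd40_like(model_row)
query_spans = find_wd40_like(query_row)
```

The model row contains the canonical GH and WD dipeptides, while the query row has AH in place of GH and so is missed by the pattern. This is why profile-based scoring, which tolerates such substitutions, still reports the region as a WD40 repeat.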
PTZ00421 PTZ00421
coronin; Provisional
205-319 1.63e-08

coronin; Provisional


Pssm-ID: 173611 [Multi-domain]  Cd Length: 493  Bit Score: 58.75  E-value: 1.63e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  205 IREASFSPTDN-KFATCSDDGTVRIWDFlrcHEERI----------LRGHGADVKCVDWHPT-KGLVVSGSKDSQqpIKF 272
Cdd:PTZ00421   78 IIDVAFNPFDPqKLFTASEDGTIMGWGI---PEEGLtqnisdpivhLQGHTKKVGIVSFHPSaMNVLASAGADMV--VNV 152
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*..
gi 569004618  273 WDPKTGQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRN 319
Cdd:PTZ00421  153 WDVERGKAVEVIKCHSDQITSLEWNLDGSLLCTTSKDKKLNIIDPRD 199
SPT5 COG5164
Transcription elongation factor SPT5 [Transcription];
665-981 2.48e-08

Transcription elongation factor SPT5 [Transcription];


Pssm-ID: 444063 [Multi-domain]  Cd Length: 495  Bit Score: 58.12  E-value: 2.48e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  665 PRPQDMHGPQGMQRHPGPHGPLGPQGPPGPQgssgpqghmgpqgppgpqghiGPQGPPASQGHMGPQG------PPGTQG 738
Cdd:COG5164    12 SDPGGVTTPAGSQGSTKPAQNQGSTRPAGNT---------------------GGTRPAQNQGSTTPAGntggtrPAGNQG 70
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  739 MQGPPGPRGMQGPPHPHGIQGGPASQGIQGPLMGlnpRGMQGPPGPRENQGPAPQGLMIGHPPQEMRGPHPPSGLLGHGP 818
Cdd:COG5164    71 ATGPAQNQGGTTPAQNQGGTRPAGNTGGTTPAGD---GGATGPPDDGGATGPPDDGGSTTPPSGGSTTPPGDGGSTPPGP 147
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  819 qemrGPQEMRGMQGPPPQGSMLGPPQE--LRGPSGSQGQQGPPQGSlgpppqggmqgppgpqgqqnpARGPHPSQGPIPF 896
Cdd:COG5164   148 ----GSTGPGGSTTPPGDGGSTTPPGPggSTTPPDDGGSTTPPNKG---------------------ETGTDIPTGGTPR 202
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  897 QQQKAPLLGDGPRAPFNQEGQSTGPPPLIPGLGQQGAQGRIPPLNPGQGPGPNKGT-KGRRERHASGLPSPPGLVATTTT 975
Cdd:COG5164   203 QGPDGPVKKDDKNGKGNPPDDRGGKTGPKDQRPKTNPIERRGPERPEAAALPAELTaLEAENRAANPEPATKTIPETTTV 282

                  ....*.
gi 569004618  976 SPFVVV 981
Cdd:COG5164   283 KDLATV 288
WD40 pfam00400
WD domain, G-beta repeat;
195-230 7.39e-08

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 49.65  E-value: 7.39e-08
                           10        20        30
                   ....*....|....*....|....*....|....*.
gi 569004618   195 VKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD 230
Cdd:pfam00400    4 LKTLEGHTGSVTSLAFSPDGKLLASGSDDGTVKVWD 39
PHA03378 PHA03378
EBNA-3B; Provisional
734-946 6.00e-06

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 50.84  E-value: 6.00e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  734 PGTQGMQGPPGPRGMQGPPHphgiQGGPASQGIQGPLMGLNPRGMQ----GPPGPRENQGPaPQGLMIGHPPQEMRGPHP 809
Cdd:PHA03378  598 PVPHPSQTPEPPTTQSHIPE----TSAPRQWPMPLRPIPMRPLRMQpitfNVLVFPTPHQP-PQVEITPYKPTWTQIGHI 672
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  810 PSGLLGHGPQEMRGPQEMRGMQGPPPQGsmlgpPQELRGPSGSQGQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARGPHP 889
Cdd:PHA03378  673 PYQPSPTGANTMLPIQWAPGTMQPPPRA-----PTPMRPPAAPPGRAQRPAAATGRARPPAAAPGRARPPAAAPGRARPP 747
                         170       180       190       200       210
                  ....*....|....*....|....*....|....*....|....*....|....*....
gi 569004618  890 SQGPIPFQQqkaPLLGDGP-RAPFNQEGQST-GPPPLIPGLGQQGAQGRIPPLNPGQGP 946
Cdd:PHA03378  748 AAAPGRARP---PAAAPGRaRPPAAAPGAPTpQPPPQAPPAPQQRPRGAPTPQPPPQAG 803
PABP-1234 TIGR01628
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ...
556-675 5.99e-05

polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens: one expressed in testis, one in platelets, one broadly expressed, and one of unknown tissue range.


Pssm-ID: 130689 [Multi-domain]  Cd Length: 562  Bit Score: 47.49  E-value: 5.99e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   556 LEQLKIERLA--QKQADQIQPP---PSSGTPLLG---PQPFSGQGPISQIPQ-----GFQQPHPSQQMPlvpqMGPPGPQ 622
Cdd:TIGR01628  360 LAQRKEQRRAhlQDQFMQLQPRmrqLPMGSPMGGamgQPPYYGQGPQQQFNGqplgwPRMSMMPTPMGP----GGPLRPN 435
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....
gi 569004618   623 GqFRAPGPQGQMGP-QGPPMHQGGGGPQGFMGPQGPQGPPQGLPRPQDMHGPQG 675
Cdd:TIGR01628  436 G-LAPMNAVRAPSRnAQNAAQKPPMQPVMYPPNYQSLPLSQDLPQPQSTASQGG 488
 
WD40 COG2319
WD40 repeat [General function prediction only];
121-402 2.32e-52

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 189.74  E-value: 2.32e-52
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQ-SNMNNVKMFQ 199
Cdd:COG2319    38 AVASLAASPDGARLAAGAGDLTLLLLDAAAGALLATLLGHTAAVLSVAFSPDGRLLASASADGTVRLWDlATGLLLRTLT 117
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  200 AHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQ 279
Cdd:COG2319   118 GHTGAVRSVAFSPDGKTLASGSADGTVRLWDLATGKLLRTLTGHSGAVTSVAFSPDGKLLASGSDDGT--VRLWDLATGK 195
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  280 SLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvhEG-LFASGGSDGSLLF 358
Cdd:COG2319   196 LLRTLTGHTGAVRSVAFSPDGKLLASGSADGTVRLWDLAT-GKLLRTLTGHSGSVRSVAFSP--DGrLLASGSADGTVRL 272
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|....
gi 569004618  359 WHVGvEKEVGGMEMAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:COG2319   273 WDLA-TGELLRTLTGHSGGVNSVAFSPDGKLLASGSDDGTVRLW 315
WD40 COG2319
WD40 repeat [General function prediction only];
121-361 1.89e-47

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 175.48  E-value: 1.89e-47
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWqsNMNN---VKM 197
Cdd:COG2319   164 AVTSVAFSPDGKLLASGSDDGTVRLWDLATGKLLRTLTGHTGAVRSVAFSPDGKLLASGSADGTVRLW--DLATgklLRT 241
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  198 FQAHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKT 277
Cdd:COG2319   242 LTGHSGSVRSVAFSPDGRLLASGSADGTVRLWDLATGELLRTLTGHSGGVNSVAFSPDGKLLASGSDDGT--VRLWDLAT 319
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  278 GQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLL 357
Cdd:COG2319   320 GKLLRTLTGHTGAVRSVAFSPDGKTLASGSDDGTVRLWDLAT-GELLRTLTGHTGAVTSVAFSP-DGRTLASGSADGTVR 397

                  ....
gi 569004618  358 FWHV 361
Cdd:COG2319   398 LWDL 401
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
194-402 4.01e-43

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed-ring propeller structure; residues on the top and bottom surfaces of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 159.42  E-value: 4.01e-43
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  194 NVKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFW 273
Cdd:cd00200     1 LRRTLKGHTGGVTCVAFSPDGKLLATGSGDGTIKVWDLETGELLRTLKGHTGPVRDVAASADGTYLASGSSDKT--IRLW 78
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  274 DPKTGQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvhEGLFASGGS- 352
Cdd:cd00200    79 DLETGECVRTLTGHTSYVSSVAFSPDGRILSSSSRDKTIKVWDVET-GKCLTTLRGHTDWVNSVAFSP--DGTFVASSSq 155
                         170       180       190       200       210
                  ....*....|....*....|....*....|....*....|....*....|
gi 569004618  353 DGSLLFWHVGVEKEVGGMEmAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:cd00200   156 DGTIKLWDLRTGKCVATLT-GHTGEVNSVAFSPDGEKLLSSSSDGTIKLW 204
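
The alignment rows above share a fixed layout: a label, a start coordinate, the aligned sequence, and an end coordinate. The following is a minimal sketch, not an NCBI tool, of how such rows could be parsed and checked for internal consistency. The names `parse_row` and `coordinates_consistent` are hypothetical, and the sketch assumes that `-` marks a gap and that lowercase letters are unaligned residues that still count toward the coordinates.

```python
import re

# Two rows copied from the cd00200 alignment block directly above.
QUERY_ROW = "gi 569004618  353 DGSLLFWHVGVEKEVGGMEmAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402"
SUBJECT_ROW = "Cdd:cd00200   156 DGTIKLWDLRTGKCVATLT-GHTGEVNSVAFSPDGEKLLSSSSDGTIKLW 204"

# Assumed row layout: label, start, sequence (letters and '-'), end.
ROW = re.compile(
    r"^(?P<label>gi \d+|Cdd:\S+)\s+(?P<start>\d+)\s+(?P<seq>\S+)\s+(?P<end>\d+)$"
)


def parse_row(line):
    """Return a dict describing one alignment row, or None for other lines."""
    match = ROW.match(line.strip())
    if match is None:
        return None  # ruler lines, blank lines, and headers do not match
    seq = match.group("seq")
    return {
        "label": match.group("label"),
        "start": int(match.group("start")),
        "end": int(match.group("end")),
        "seq": seq,
        # Letters of either case are residues; '-' is a gap and is skipped.
        "residues": sum(1 for ch in seq if ch.isalpha()),
    }


def coordinates_consistent(row):
    """True when the residue count matches the start-end span of the row."""
    return row["end"] - row["start"] + 1 == row["residues"]


if __name__ == "__main__":
    for text in (QUERY_ROW, SUBJECT_ROW):
        row = parse_row(text)
        print(row["label"], row["start"], row["end"], coordinates_consistent(row))
```

A check like this is mainly useful for catching rows damaged during copy and paste, since a truncated sequence no longer spans its stated coordinates.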
WD40 COG2319
WD40 repeat [General function prediction only];
126-402 3.09e-42



Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 160.08  E-value: 3.09e-42
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  126 RWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHND-MWMLTADHGGYVKYWQSNMNNVKMFQAHKEA 204
Cdd:COG2319     1 ALSADGAALAAASADLALALLAAALGALLLLLLGLAAAVASLAASPDGaRLAAGAGDLTLLLLDAAAGALLATLLGHTAA 80
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  205 IREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQSLATL 284
Cdd:COG2319    81 VLSVAFSPDGRLLASASADGTVRLWDLATGLLLRTLTGHTGAVRSVAFSPDGKTLASGSADGT--VRLWDLATGKLLRTL 158
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  285 HAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNLKeELQVFRGHKKEATAVAWHPvhEG-LFASGGSDGSLLFWHVGV 363
Cdd:COG2319   159 TGHSGAVTSVAFSPDGKLLASGSDDGTVRLWDLATGK-LLRTLTGHTGAVRSVAFSP--DGkLLASGSADGTVRLWDLAT 235
                         250       260       270
                  ....*....|....*....|....*....|....*....
gi 569004618  364 EKEVGGMEmAHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:COG2319   236 GKLLRTLT-GHSGSVRSVAFSPDGRLLASGSADGTVRLW 273
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
119-359 4.12e-39



Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 147.48  E-value: 4.12e-39
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  119 KCPVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAmtwshndmwmltadhggyvkywqsnmnnvkmf 198
Cdd:cd00200    93 TSYVSSVAFSPDGRILSSSSRDKTIKVWDVETGKCLTTLRGHTDWVNS-------------------------------- 140
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  199 qahkeaireASFSPtDNKF-ATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKT 277
Cdd:cd00200   141 ---------VAFSP-DGTFvASSSQDGTIKLWDLRTGKCVATLTGHTGEVNSVAFSPDGEKLLSSSSDGT--IKLWDLST 208
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  278 GQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRNlKEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLL 357
Cdd:cd00200   209 GKCLGTLRGHENGVNSVAFSPDGYLLASGSEDGTIRVWDLRT-GECVQTLSGHTNSVTSLAWSP-DGKRLASGSADGTIR 286

                  ..
gi 569004618  358 FW 359
Cdd:cd00200   287 IW 288
WD40 COG2319
WD40 repeat [General function prediction only];
121-319 7.19e-39



Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 150.45  E-value: 7.19e-39
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  121 PVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQ-SNMNNVKMFQ 199
Cdd:COG2319   206 AVRSVAFSPDGKLLASGSADGTVRLWDLATGKLLRTLTGHSGSVRSVAFSPDGRLLASGSADGTVRLWDlATGELLRTLT 285
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  200 AHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWDPKTGQ 279
Cdd:COG2319   286 GHSGGVNSVAFSPDGKLLASGSDDGTVRLWDLATGKLLRTLTGHTGAVRSVAFSPDGKTLASGSDDGT--VRLWDLATGE 363
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|
gi 569004618  280 SLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRN 319
Cdd:COG2319   364 LLRTLTGHTGAVTSVAFSPDGRTLASGSADGTVRLWDLAT 403
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
119-274 1.49e-28



Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 117.05  E-value: 1.49e-28
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  119 KCPVFVVRWTPEGRRLVTGASSGEFTLWNGLTFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYWQSNM-NNVKM 197
Cdd:cd00200   135 TDWVNSVAFSPDGTFVASSSQDGTIKLWDLRTGKCVATLTGHTGEVNSVAFSPDGEKLLSSSSDGTIKLWDLSTgKCLGT 214
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 569004618  198 FQAHKEAIREASFSPTDNKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWD 274
Cdd:cd00200   215 LRGHENGVNSVAFSPDGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLASGSADGT--IRIWD 289
gly_rich_SclB NF038329
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like ...
608-857 3.50e-14

LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like protein 2) is an LPXTG-anchored surface-anchored adhesin with a variable-length region of triple helix-forming collagen-like Gly-Xaa-Xaa repeats.


Pssm-ID: 468478 [Multi-domain]  Cd Length: 440  Bit Score: 76.48  E-value: 3.50e-14
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  608 QQMPLVPQMGPPGPQGQfraPGPQGQMGPQGPPMHQGGGGPQGFMGPQGpqgppqglprPQDMHGPQGMQrhpgphgplg 687
Cdd:NF038329  111 QQLKGDGEKGEPGPAGP---AGPAGEQGPRGDRGETGPAGPAGPPGPQG----------ERGEKGPAGPQ---------- 167
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  688 pqgppgpqgssgpqghmgpqGPPGPQGHIGPQGPPASQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPASQGIQ 767
Cdd:NF038329  168 --------------------GEAGPQGPAGKDGEAGAKGPAGEKGPQGPRGETGPAGEQGPAGPAGPDGEAGPAGEDGPA 227
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  768 GPlMGLNPRGMQGPPGPRENQGPA-PQGLmighppqemRGPHPPSGLLGH-GPQEMRGPQEMRGMQGPPPQGSMLGpPQE 845
Cdd:NF038329  228 GP-AGDGQQGPDGDPGPTGEDGPQgPDGP---------AGKDGPRGDRGEaGPDGPDGKDGERGPVGPAGKDGQNG-KDG 296
                         250
                  ....*....|..
gi 569004618  846 LRGPSGSQGQQG 857
Cdd:NF038329  297 LPGKDGKDGQNG 308
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
590-926 3.87e-14

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 77.36  E-value: 3.87e-14
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   590 GQGPISQIPQGFQQPHPSQQMPLVPQMGPPGPQGQFRAPGPQG------------QMGPQGPPMHQGGGGPQGFMGPQGP 657
Cdd:pfam09606  148 RMQPGGQAGGMMQPSSGQPGSGTPNQMGPNGGPGQGQAGGMNGgqqgpmggqmppQMGVPGMPGPADAGAQMGQQAQANG 227
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   658 QgppqglPRPQDMHGP------QGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPGPQGHIGPQGPPASQGHMGPQ 731
Cdd:pfam09606  228 G------MNPQQMGGApnqvamQQQQPQQQGQQSQLGMGINQMQQMPQGVGGGAGQGGPGQPMGPPGQQPGAMPNVMSIG 301
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   732 GPPGTQGMQGPPgPRGMQGPPHPHGIQGGPASQGIQGPLMGLNPRGMQGPPGPRENQGPAPQGLMIGHPPQEMRGPHPps 811
Cdd:pfam09606  302 DQNNYQQQQTRQ-QQQQQGGNHPAAHQQQMNQSVGQGGQVVALGGLNHLETWNPGNFGGLGANPMQRGQPGMMSSPSP-- 378
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   812 gLLGHGPQEMRGPQEMRgmqgPPPQGSMLGPpqelrGPSGSQGQQGPPQGSLgpppqggmqgppgpqgqQNPARGPHPSQ 891
Cdd:pfam09606  379 -VPGQQVRQVTPNQFMR----QSPQPSVPSP-----QGPGSQPPQSHPGGMI-----------------PSPALIPSPSP 431
                          330       340       350
                   ....*....|....*....|....*....|....*
gi 569004618   892 GPIPFQQQKAPLLGDGPRAPFNQEGQSTGPPPLIP 926
Cdd:pfam09606  432 QMSQQPAQQRTIGQDSPGGSLNTPGQSAVNSPLNP 466
gly_rich_SclB NF038329
LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like ...
690-859 1.83e-13

LPXTG-anchored collagen-like adhesin Scl2/SclB; SclB (or Scl2 - streptococcal collagen-like protein 2) is an LPXTG-anchored surface-anchored adhesin with a variable-length region of triple helix-forming collagen-like Gly-Xaa-Xaa repeats.


Pssm-ID: 468478 [Multi-domain]  Cd Length: 440  Bit Score: 74.17  E-value: 1.83e-13
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  690 GPPgpqgssgpqghmgpqgppgpqghiGPQGPPASQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPASQGIQGP 769
Cdd:NF038329  132 GEQ------------------------GPRGDRGETGPAGPAGPPGPQGERGEKGPAGPQGEAGPQGPAGKDGEAGAKGP 187
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  770 LMGLNPRGMQGPPGPRENQGPA-PQGLMIGHPPQEMRGPHPPSGLLGHGPQEMRGPQEMRGMQGPPPQGSMLGP-----P 843
Cdd:NF038329  188 AGEKGPQGPRGETGPAGEQGPAgPAGPDGEAGPAGEDGPAGPAGDGQQGPDGDPGPTGEDGPQGPDGPAGKDGPrgdrgE 267
                         170
                  ....*....|....*.
gi 569004618  844 QELRGPSGSQGQQGPP 859
Cdd:NF038329  268 AGPDGPDGKDGERGPV 283
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
584-970 4.03e-09

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 61.18  E-value: 4.03e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   584 GPQPFSGQGPISQIPQGFQQPHPSQQMPLVPQMGPPGPQGQFRAPGPQGQMGPQgppMHQGGGGPQGFMGPQGPQGPPQG 663
Cdd:pfam09606   60 QQQPQGGQGNGGMGGGQQGMPDPINALQNLAGQGTRPQMMGPMGPGPGGPMGQQ---MGGPGTASNLLASLGRPQMPMGG 136
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   664 LPRPQDMHGPQGMQRHPGPHGPLGPqgppgpqgssgpqghMGPQGPPGPQGHIGPQGPPASQGHMGPQGPPGTQGMQGPP 743
Cdd:pfam09606  137 AGFPSQMSRVGRMQPGGQAGGMMQP---------------SSGQPGSGTPNQMGPNGGPGQGQAGGMNGGQQGPMGGQMP 201
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   744 GPRGMQGPPHPhgIQGGPASQGIQGPLMGLNPRGMQGPPG--PRENQGPAPQGL-----MIGHPPQEMRGPHPpsGLLGH 816
Cdd:pfam09606  202 PQMGVPGMPGP--ADAGAQMGQQAQANGGMNPQQMGGAPNqvAMQQQQPQQQGQqsqlgMGINQMQQMPQGVG--GGAGQ 277
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   817 GPQEMRGPQEMRGMQGPPPQGSMLGPP----QELRGPSGSQGQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARGPHP-SQ 891
Cdd:pfam09606  278 GGPGQPMGPPGQQPGAMPNVMSIGDQNnyqqQQTRQQQQQQGGNHPAAHQQQMNQSVGQGGQVVALGGLNHLETWNPgNF 357
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   892 GP--IPFQQQKAPLLGDGPRAPFNQEGQSTGPPPlipGLGQQGAQGRIPPLNPGQGPGPNkgtkgrrerHASGLPSPPGL 969
Cdd:pfam09606  358 GGlgANPMQRGQPGMMSSPSPVPGQQVRQVTPNQ---FMRQSPQPSVPSPQGPGSQPPQS---------HPGGMIPSPAL 425

                   .
gi 569004618   970 V 970
Cdd:pfam09606  426 I 426
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
191-230 7.02e-09

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 52.70  E-value: 7.02e-09
                            10        20        30        40
                    ....*....|....*....|....*....|....*....|
gi 569004618    191 NMNNVKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD 230
Cdd:smart00320    1 SGELLKTLKGHTGPVTSVAFSPDGKYLASGSDDGTIKLWD 40
PTZ00421 PTZ00421
coronin; Provisional
205-319 1.63e-08



Pssm-ID: 173611 [Multi-domain]  Cd Length: 493  Bit Score: 58.75  E-value: 1.63e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  205 IREASFSPTDN-KFATCSDDGTVRIWDFlrcHEERI----------LRGHGADVKCVDWHPT-KGLVVSGSKDSQqpIKF 272
Cdd:PTZ00421   78 IIDVAFNPFDPqKLFTASEDGTIMGWGI---PEEGLtqnisdpivhLQGHTKKVGIVSFHPSaMNVLASAGADMV--VNV 152
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*..
gi 569004618  273 WDPKTGQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFDIRN 319
Cdd:PTZ00421  153 WDVERGKAVEVIKCHSDQITSLEWNLDGSLLCTTSKDKKLNIIDPRD 199
SPT5 COG5164
Transcription elongation factor SPT5 [Transcription];
665-981 2.48e-08



Pssm-ID: 444063 [Multi-domain]  Cd Length: 495  Bit Score: 58.12  E-value: 2.48e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  665 PRPQDMHGPQGMQRHPGPHGPLGPQGPPGPQgssgpqghmgpqgppgpqghiGPQGPPASQGHMGPQG------PPGTQG 738
Cdd:COG5164    12 SDPGGVTTPAGSQGSTKPAQNQGSTRPAGNT---------------------GGTRPAQNQGSTTPAGntggtrPAGNQG 70
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  739 MQGPPGPRGMQGPPHPHGIQGGPASQGIQGPLMGlnpRGMQGPPGPRENQGPAPQGLMIGHPPQEMRGPHPPSGLLGHGP 818
Cdd:COG5164    71 ATGPAQNQGGTTPAQNQGGTRPAGNTGGTTPAGD---GGATGPPDDGGATGPPDDGGSTTPPSGGSTTPPGDGGSTPPGP 147
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  819 qemrGPQEMRGMQGPPPQGSMLGPPQE--LRGPSGSQGQQGPPQGSlgpppqggmqgppgpqgqqnpARGPHPSQGPIPF 896
Cdd:COG5164   148 ----GSTGPGGSTTPPGDGGSTTPPGPggSTTPPDDGGSTTPPNKG---------------------ETGTDIPTGGTPR 202
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  897 QQQKAPLLGDGPRAPFNQEGQSTGPPPLIPGLGQQGAQGRIPPLNPGQGPGPNKGT-KGRRERHASGLPSPPGLVATTTT 975
Cdd:COG5164   203 QGPDGPVKKDDKNGKGNPPDDRGGKTGPKDQRPKTNPIERRGPERPEAAALPAELTaLEAENRAANPEPATKTIPETTTV 282

                  ....*.
gi 569004618  976 SPFVVV 981
Cdd:COG5164   283 KDLATV 288
WD40 pfam00400
WD domain, G-beta repeat;
195-230 7.39e-08



Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 49.65  E-value: 7.39e-08
                           10        20        30
                   ....*....|....*....|....*....|....*.
gi 569004618   195 VKMFQAHKEAIREASFSPTDNKFATCSDDGTVRIWD 230
Cdd:pfam00400    4 LKTLEGHTGSVTSLAFSPDGKLLASGSDDGTVKVWD 39
PLN00181 PLN00181
protein SPA1-RELATED; Provisional
215-359 1.82e-07



Pssm-ID: 177776 [Multi-domain]  Cd Length: 793  Bit Score: 55.86  E-value: 1.82e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  215 NKFATCSDDGTVRIWDFLRCHEERILRGHGADVKCVDWH---PTkgLVVSGSKDSQqpIKFWDPKTGQSLATLHAHKNTV 291
Cdd:PLN00181  546 SQVASSNFEGVVQVWDVARSQLVTEMKEHEKRVWSIDYSsadPT--LLASGSDDGS--VKLWSINQGVSIGTIKTKANIC 621
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 569004618  292 MEVKLNLNGNWLLTASRDHLCKLFDIRNLKEELQVFRGHKKEATAVAWhpVHEGLFASGGSDGSLLFW 359
Cdd:PLN00181  622 CVQFPSESGRSLAFGSADHKVYYYDLRNPKLPLCTMIGHSKTVSYVRF--VDSSTLVSSSTDNTLKLW 687
Collagen pfam01391
Collagen triple helix repeat (20 copies); Members of this family belong to the collagen ...
717-785 3.42e-07

Collagen triple helix repeat (20 copies); Members of this family belong to the collagen superfamily. Collagens are generally extracellular structural proteins involved in formation of connective tissue structure. The alignment contains 20 copies of the G-X-Y repeat that forms a triple helix. The first position of the repeat is glycine; the second and third positions can be any residue but are frequently proline and hydroxy-proline. Collagens are post-translationally modified by proline hydroxylase to form the hydroxy-proline residues. Defective hydroxylation is the cause of scurvy. Some members of the collagen superfamily are not involved in connective tissue structure but share the same triple helical structure. The family includes bacterial collagen-like triple-helix repeat proteins.


Pssm-ID: 460189 [Multi-domain]  Cd Length: 57  Bit Score: 48.26  E-value: 3.42e-07
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 569004618   717 GPQGPPasqghmGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPASQGIQGplmglnPRGMQGPPGPR 785
Cdd:pfam01391    1 GPPGPP------GPPGPPGPPGPPGPPGPPGPPGPPGEPGPPGPPGPPGPPG------PPGAPGAPGPP 57
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
238-274 5.71e-07

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 47.31  E-value: 5.71e-07
                            10        20        30
                    ....*....|....*....|....*....|....*..
gi 569004618    238 RILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWD 274
Cdd:smart00320    6 KTLKGHTGPVTSVAFSPDGKYLASGSDDGT--IKLWD 40
WD40 pfam00400
WD domain, G-beta repeat;
238-274 3.28e-06



Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 45.03  E-value: 3.28e-06
                           10        20        30
                   ....*....|....*....|....*....|....*..
gi 569004618   238 RILRGHGADVKCVDWHPTKGLVVSGSKDSQqpIKFWD 274
Cdd:pfam00400    5 KTLEGHTGSVTSLAFSPDGKLLASGSDDGT--VKVWD 39
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
321-360 4.18e-06

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 44.61  E-value: 4.18e-06
                            10        20        30        40
                    ....*....|....*....|....*....|....*....|
gi 569004618    321 KEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLLFWH 360
Cdd:smart00320    2 GELLKTLKGHTGPVTSVAFSP-DGKYLASGSDDGTIKLWD 40
PHA03378 PHA03378
EBNA-3B; Provisional
734-946 6.00e-06



Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 50.84  E-value: 6.00e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  734 PGTQGMQGPPGPRGMQGPPHphgiQGGPASQGIQGPLMGLNPRGMQ----GPPGPRENQGPaPQGLMIGHPPQEMRGPHP 809
Cdd:PHA03378  598 PVPHPSQTPEPPTTQSHIPE----TSAPRQWPMPLRPIPMRPLRMQpitfNVLVFPTPHQP-PQVEITPYKPTWTQIGHI 672
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  810 PSGLLGHGPQEMRGPQEMRGMQGPPPQGsmlgpPQELRGPSGSQGQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARGPHP 889
Cdd:PHA03378  673 PYQPSPTGANTMLPIQWAPGTMQPPPRA-----PTPMRPPAAPPGRAQRPAAATGRARPPAAAPGRARPPAAAPGRARPP 747
                         170       180       190       200       210
                  ....*....|....*....|....*....|....*....|....*....|....*....
gi 569004618  890 SQGPIPFQQqkaPLLGDGP-RAPFNQEGQST-GPPPLIPGLGQQGAQGRIPPLNPGQGP 946
Cdd:PHA03378  748 AAAPGRARP---PAAAPGRaRPPAAAPGAPTpQPPPQAPPAPQQRPRGAPTPQPPPQAG 803
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
373-402 7.03e-06

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 44.23  E-value: 7.03e-06
                            10        20        30
                    ....*....|....*....|....*....|
gi 569004618    373 AHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:smart00320   10 GHTGPVTSVAFSPDGKYLASGSDDGTIKLW 39
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
277-316 1.13e-05

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 43.46  E-value: 1.13e-05
                            10        20        30        40
                    ....*....|....*....|....*....|....*....|
gi 569004618    277 TGQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFD 316
Cdd:smart00320    1 SGELLKTLKGHTGPVTSVAFSPDGKYLASGSDDGTIKLWD 40
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
533-774 1.33e-05

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 49.65  E-value: 1.33e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   533 EKKKTQAEIEQEMATLQYTNPQL------LEQLKIERLAQKQADQIQPPPSSGtpllgPQPFSGQGPISQIPQGFQQPHP 606
Cdd:pfam09770  167 PKKAAAPAPAPQPAAQPASLPAPsrkmmsLEEVEAAMRAQAKKPAQQPAPAPA-----QPPAAPPAQQAQQQQQFPPQIQ 241
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   607 SQQMPlvPQMGPPGPQGQFRAPGPQGQMGPQGPPMhqggggPQGFMGPQGPQGPPQGLPRPQDMHGPQGMQRhpgphgpl 686
Cdd:pfam09770  242 QQQQP--QQQPQQPQQHPGQGHPVTILQRPQSPQP------DPAQPSIQPQAQQFHQQPPPVPVQPTQILQN-------- 305
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   687 gpqgppgpqgssgpqghmgpqgppgpqghigPQGPPASQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPAsqgi 766
Cdd:pfam09770  306 -------------------------------PNRLSAARVGYPQNPQPGVQPAPAHQAHRQQGSFGRQAPIITHPQ---- 350

                   ....*...
gi 569004618   767 qgPLMGLN 774
Cdd:pfam09770  351 --QLAQLS 356
WD40 pfam00400
WD domain, G-beta repeat;
278-316 1.38e-05



Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 43.10  E-value: 1.38e-05
                           10        20        30
                   ....*....|....*....|....*....|....*....
gi 569004618   278 GQSLATLHAHKNTVMEVKLNLNGNWLLTASRDHLCKLFD 316
Cdd:pfam00400    1 GKLLKTLEGHTGSVTSLAFSPDGKLLASGSDDGTVKVWD 39
PTZ00421 PTZ00421
coronin; Provisional
288-398 4.03e-05



Pssm-ID: 173611 [Multi-domain]  Cd Length: 493  Bit Score: 47.97  E-value: 4.03e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  288 KNTVMEVKLN-LNGNWLLTASRDHLCKLFDI------RNLKEELQVFRGHKKEATAVAWHPVHEGLFASGGSDGSLLFWH 360
Cdd:PTZ00421   75 EGPIIDVAFNpFDPQKLFTASEDGTIMGWGIpeegltQNISDPIVHLQGHTKKVGIVSFHPSAMNVLASAGADMVVNVWD 154
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|
gi 569004618  361 V--GVEKEVggmEMAHEGMIWSLAWHPLGHILCSGSNDHT 398
Cdd:PTZ00421  155 VerGKAVEV---IKCHSDQITSLEWNLDGSLLCTTSKDKK 191
PHA03247 PHA03247
large tegument protein UL36; Provisional
718-976 5.02e-05



Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 48.01  E-value: 5.02e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  718 PQGPPASQGHMGPQGPPGTQGMQGPPGPRGMQGPP-----HPHGIQGGPASQGIQGPLMGLNPRGMQGPPGPRENQGPAP 792
Cdd:PHA03247 2745 PAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPprrltRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASP 2824
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  793 QGLM------IGHPPQEMRGPHPPSGLLGhGPQEMRGPQEMRGMQGPP---PQGSMLGPPQELRGPSGSQGQQGPPQgsl 863
Cdd:PHA03247 2825 AGPLppptsaQPTAPPPPPGPPPPSLPLG-GSVAPGGDVRRRPPSRSPaakPAAPARPPVRRLARPAVSRSTESFAL--- 2900
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  864 gppPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQQQKAPLLGDGPRAPFNQEGQSTGPPPLIPglgqqgaQGRIPPLNPG 943
Cdd:PHA03247 2901 ---PPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVP-------QPWLGALVPG 2970
                         250       260       270
                  ....*....|....*....|....*....|....*.
gi 569004618  944 QGPGPNKGT---KGRRERHASGLPSPPGLVATTTTS 976
Cdd:PHA03247 2971 RVAVPRFRVpqpAPSREAPASSTPPLTGHSLSRVSS 3006
WD40 pfam00400
WD domain, G-beta repeat;
373-402 5.39e-05

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 41.56  E-value: 5.39e-05
                           10        20        30
                   ....*....|....*....|....*....|
gi 569004618   373 AHEGMIWSLAWHPLGHILCSGSNDHTSKFW 402
Cdd:pfam00400    9 GHTGSVTSLAFSPDGKLLASGSDDGTVKVW 38
PABP-1234 TIGR01628
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ...
556-675 5.99e-05

polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens: one expressed in testis, one in platelets, one broadly expressed, and one of unknown tissue range.


Pssm-ID: 130689 [Multi-domain]  Cd Length: 562  Bit Score: 47.49  E-value: 5.99e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   556 LEQLKIERLA--QKQADQIQPP---PSSGTPLLG---PQPFSGQGPISQIPQ-----GFQQPHPSQQMPlvpqMGPPGPQ 622
Cdd:TIGR01628  360 LAQRKEQRRAhlQDQFMQLQPRmrqLPMGSPMGGamgQPPYYGQGPQQQFNGqplgwPRMSMMPTPMGP----GGPLRPN 435
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....
gi 569004618   623 GqFRAPGPQGQMGP-QGPPMHQGGGGPQGFMGPQGPQGPPQGLPRPQDMHGPQG 675
Cdd:TIGR01628  436 G-LAPMNAVRAPSRnAQNAAQKPPMQPVMYPPNYQSLPLSQDLPQPQSTASQGG 488
Glutenin_hmw pfam03157
High molecular weight glutenin subunit; Members of this family include high molecular weight ...
568-954 8.23e-05

High molecular weight glutenin subunit; Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.


Pssm-ID: 367362 [Multi-domain]  Cd Length: 786  Bit Score: 47.25  E-value: 8.23e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   568 QADQIQPPPSSGTPLLGPQpfSGQGPISQIPQGFQQPHPSQQ--MPLVPQMGPPGPQGQFRAPGPQGQMGPQGPPMHQGG 645
Cdd:pfam03157  256 QGQQGYYPISPQQPRQWQQ--SGQGQQGYYPTSLQQPGQGQSgyYPTSQQQAGQLQQEQQLGQEQQDQQPGQGRQGQQPG 333
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   646 GgpqgfmgpqgpqgppqglpRPQDMHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPGPQGHIGPQGPPASQ 725
Cdd:pfam03157  334 Q-------------------GQQGQQPAQGQQPGQGQPGYYPTSPQQPGQGQPGYYPTSQQQPQQGQQPEQGQQGQQQGQ 394
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   726 GHMGPQGPPGTQGMQGPPGPRGMQgPPHPHGIQGGPASQGIQGPLMGLNPRGMQGPPGPRENQGPAP-QGLMIGHPPQEM 804
Cdd:pfam03157  395 GQQGQQPGQGQQPGQGQPGYYPTS-PQQSGQGQPGYYPTSPQQSGQGQQPGQGQQPGQEQPGQGQQPgQGQQGQQPGQPE 473
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   805 RGPHPPSGLLGHGPQEMR--GPQEMRGMQGPPPQGSMLGPPQELRGPsgSQGQQGPPQGSLgpppqggmqgppgpqgqQN 882
Cdd:pfam03157  474 QGQQPGQGQPGYYPTSPQqsGQGQQLGQWQQQGQGQPGYYPTSPLQP--GQGQPGYYPTSP-----------------QQ 534
                          330       340       350       360       370       380       390
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 569004618   883 PARGPHPSQGPIPFQQQKAPLLGDGPRAPFNQEGQSTGPppliPGLGQQGAQgripplnPGQGPGPNKGTKG 954
Cdd:pfam03157  535 PGQGQQLGQLQQPTQGQQGQQSGQGQQGQQPGQGQQGQQ----PGQGQQGQQ-------PGQGQQPGQGQPG 595
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
565-860 9.33e-05

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 47.07  E-value: 9.33e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   565 AQKQADQIQPP---------PSSGTPLLGPQPFSGQGPISQIPQGFQQPHPSQQMPLVPQMGPPGPQGQF---------R 626
Cdd:pfam03154  162 AQQQILQTQPPvlqaqsgaaSPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTLIqqtptlhpqR 241
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   627 APGPQGQMGPQGPPmhqGGGGPQGFMGPQGPQGPPQGLPRPQDMH-GPQGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMG 705
Cdd:pfam03154  242 LPSPHPPLQPMTQP---PPPSQVSPQPLPQPSLHGQMPPMPHSLQtGPSHMQHPVPPQPFPLTPQSSQSQVPPGPSPAAP 318
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   706 PQGPPGPQGhigpqgpPASQGHMGPQGPPGTQGMqgPPGPRGMQ--GPPHPHGIQGGPASQGIQGPLMGLNPRGMQGPpg 783
Cdd:pfam03154  319 GQSQQRIHT-------PPSQSQLQSQQPPREQPL--PPAPLSMPhiKPPPTTPIPQLPNPQSHKHPPHLSGPSPFQMN-- 387
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   784 preNQGPAPQGLmigHPPQEMRGPHPPSgllGHGPQEMRGPQEMRgMQGPPPQGSMLG-----PPQELRGPSGSQGQQGP 858
Cdd:pfam03154  388 ---SNLPPPPAL---KPLSSLSTHHPPS---AHPPPLQLMPQSQQ-LPPPPAQPPVLTqsqslPPPAASHPPTSGLHQVP 457

                   ..
gi 569004618   859 PQ 860
Cdd:pfam03154  458 SQ 459
WD40 pfam00400
WD domain, G-beta repeat;
321-359 1.57e-04

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 40.41  E-value: 1.57e-04
                           10        20        30
                   ....*....|....*....|....*....|....*....
gi 569004618   321 KEELQVFRGHKKEATAVAWHPvHEGLFASGGSDGSLLFW 359
Cdd:pfam00400    1 GKLLKTLEGHTGSVTSLAFSP-DGKLLASGSDDGTVKVW 38
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
150-188 2.21e-04

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 39.99  E-value: 2.21e-04
                            10        20        30
                    ....*....|....*....|....*....|....*....
gi 569004618    150 TFNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYW 188
Cdd:smart00320    1 SGELLKTLKGHTGPVTSVAFSPDGKYLASGSDDGTIKLW 39
Glutenin_hmw pfam03157
High molecular weight glutenin subunit; Members of this family include high molecular weight ...
565-936 3.02e-04

High molecular weight glutenin subunit; Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.


Pssm-ID: 367362 [Multi-domain]  Cd Length: 786  Bit Score: 45.32  E-value: 3.02e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   565 AQKQADQIQP---PPSSGTPLLGPQPFS-------GQGPISQIPQGFQQPHPSQQ--MPLVPQMGPPGPQ---------- 622
Cdd:pfam03157  358 SPQQPGQGQPgyyPTSQQQPQQGQQPEQgqqgqqqGQGQQGQQPGQGQQPGQGQPgyYPTSPQQSGQGQPgyyptspqqs 437
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   623 GQFRAPGpQGQMGPQGPPMHQGGGGPQGFMGPQGPQGPPQGLPRPQDMHGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQG 702
Cdd:pfam03157  438 GQGQQPG-QGQQPGQEQPGQGQQPGQGQQGQQPGQPEQGQQPGQGQPGYYPTSPQQSGQGQQLGQWQQQGQGQPGYYPTS 516
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   703 HMGPQGPPGPQGHIGPQGPpaSQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPASQGIQGPLMGLNPRGMQGPP 782
Cdd:pfam03157  517 PLQPGQGQPGYYPTSPQQP--GQGQQLGQLQQPTQGQQGQQSGQGQQGQQPGQGQQGQQPGQGQQGQQPGQGQQPGQGQP 594
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   783 G----PRENQGPAPQGLMIGHPPQEMRGPHPPSGL-LGHGPQEMRGPQEMRGMQGPPPQgsmlgppqelRGPSGSQGQQG 857
Cdd:pfam03157  595 GyyptSPQQSGQGQQPGQWQQPGQGQPGYYPTSSLqLGQGQQGYYPTSPQQPGQGQQPG----------QWQQSGQGQQG 664
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   858 ----PPQGSLGPPPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQQQKaplLGDGPRAPFNQEGQStgppPLIPGLGQQGA 933
Cdd:pfam03157  665 yyptSPQQSGQAQQPGQGQQPGQWLQPGQGQQGYYPTSPQQPGQGQQ---LGQGQQSGQGQQGYY----PTSPGQGQQSG 737

                   ...
gi 569004618   934 QGR 936
Cdd:pfam03157  738 QGQ 740
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
740-904 5.70e-04

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 44.26  E-value: 5.70e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   740 QGPPGPRGMQGPPHPHGIQGGPASQGIQGPLMGlnprgmQGPPGPRENQGPAPQGLMIGHPPQEMRGPHPPSGllgHGPQ 819
Cdd:pfam09770  208 KKPAQQPAPAPAQPPAAPPAQQAQQQQQFPPQI------QQQQQPQQQPQQPQQHPGQGHPVTILQRPQSPQP---DPAQ 278
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   820 EMRGPQEMRGMQGPPPQgsMLGPPQELRGP-----SGSQGQQGPPQGSLGPPPQGGMQGPPGPQGQQNPArgPHPSQGPI 894
Cdd:pfam09770  279 PSIQPQAQQFHQQPPPV--PVQPTQILQNPnrlsaARVGYPQNPQPGVQPAPAHQAHRQQGSFGRQAPII--THPQQLAQ 354
                          170
                   ....*....|
gi 569004618   895 PFQQQKAPLL 904
Cdd:pfam09770  355 LSEEEKAAYL 364
PTZ00420 PTZ00420
coronin; Provisional
284-397 6.28e-04

coronin; Provisional


Pssm-ID: 240412 [Multi-domain]  Cd Length: 568  Bit Score: 44.17  E-value: 6.28e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  284 LHAHKNTVMEVKLN-LNGNWLLTASRDHLCKLFDIRN-------LKEELQVFRGHKKEATAVAWHPVHEGLFASGGSDGS 355
Cdd:PTZ00420   70 LKGHTSSILDLQFNpCFSEILASGSEDLTIRVWEIPHndesvkeIKDPQCILKGHKKKISIIDWNPMNYYIMCSSGFDSF 149
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|....*
gi 569004618  356 LLFWHVGVEKEVGGMEMAHEgmIWSLAWHPLGHIL---CSGSNDH 397
Cdd:PTZ00420  150 VNIWDIENEKRAFQINMPKK--LSSLKWNIKGNLLsgtCVGKHMH 192
PTZ00420 PTZ00420
coronin; Provisional
198-274 7.14e-04

coronin; Provisional


Pssm-ID: 240412 [Multi-domain]  Cd Length: 568  Bit Score: 43.79  E-value: 7.14e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  198 FQAHKEAIREASFSPTDNK-FATCSDDGTVRIWDfLRCHEER---------ILRGHGADVKCVDWHPTKGLVVSGSK-DS 266
Cdd:PTZ00420   70 LKGHTSSILDLQFNPCFSEiLASGSEDLTIRVWE-IPHNDESvkeikdpqcILKGHKKKISIIDWNPMNYYIMCSSGfDS 148

                  ....*...
gi 569004618  267 QqpIKFWD 274
Cdd:PTZ00420  149 F--VNIWD 154
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
715-977 7.38e-04

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 44.01  E-value: 7.38e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  715 HIGPQGPPASQGHMGpQGPPGTQGMQGPPGPRGMQGPPhPHGIQGGPASQGIQGPLMGLNPRGMQGPPGPRENQGPAPQG 794
Cdd:PHA03307   36 LSGSQGQLVSDSAEL-AAVTVVAGAAACDRFEPPTGPP-PGPGTEAPANESRSTPTWSLSTLAPASPAREGSPTPPGPSS 113
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  795 lmighPPQEMRGPHPPSGLLGHGPqEMRGPQEMRGMQGPPPQGSMLGPPQ---ELRGPSGSQGQQGPPQGSLGPPPQGGM 871
Cdd:PHA03307  114 -----PDPPPPTPPPASPPPSPAP-DLSEMLRPVGSPGPPPAASPPAAGAspaAVASDAASSRQAALPLSSPEETARAPS 187
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  872 QGPPGPQGQQNPARGPHPSQGPIPFQQQKA----PLLGDGPRAPFNQE-GQSTGPPPLIPGLGQQGAQGR---------- 936
Cdd:PHA03307  188 SPPAEPPPSTPPAAASPRPPRRSSPISASAsspaPAPGRSAADDAGASsSDSSSSESSGCGWGPENECPLprpapitlpt 267
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|....*.
gi 569004618  937 -----IPPLNPGQGPGPNKGTKGRRERHASGLPSPPGLVATTTTSP 977
Cdd:PHA03307  268 riweaSGWNGPSSRPGPASSSSSPRERSPSPSPSSPGSGPAPSSPR 313
PHA03247 PHA03247
large tegument protein UL36; Provisional
721-966 1.51e-03

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 43.39  E-value: 1.51e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  721 PPASQGHMGPQGPPGTQGMQGPPGPRGMQGPPH---PHGIQGGPASQGIQGPLMGLNPRGMQG----------PPGPREN 787
Cdd:PHA03247 2628 PPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRvsrPRRARRLGRAAQASSPPQRPRRRAARPtvgsltsladPPPPPPT 2707
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  788 QGPAPQGLMIGHP----PQEMRGPHPPSGL------------LGHGPQEMRGPQEMRGMQGP-PPQGSMLGPPQELRGPS 850
Cdd:PHA03247 2708 PEPAPHALVSATPlppgPAAARQASPALPAapappavpagpaTPGGPARPARPPTTAGPPAPaPPAAPAAGPPRRLTRPA 2787
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  851 GSQGQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQQQKAPLLGDGPRA-------------PFNQEGQ 917
Cdd:PHA03247 2788 VASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPPpslplggsvapggDVRRRPP 2867
                         250       260       270       280       290       300
                  ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 569004618  918 STGPPPLI-------------PGLGQQGAQGRIPPLNPGQGPGPNKGTKGRRERHASGLPSP 966
Cdd:PHA03247 2868 SRSPAAKPaaparppvrrlarPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQP 2929
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
719-940 1.66e-03

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 42.72  E-value: 1.66e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   719 QGPPASQGHMGPQGPPGtqgmQGPPGPRGMQGPPHPHGiqggPASQGIQ-----GPLMGLNPR----GMQGPPGPRENQG 789
Cdd:pfam09770  105 QQPAARAAQSSAQPPAS----SLPQYQYASQQSQQPSK----PVRTGYEkykepEPIPDLQVDaslwGVAPKKAAAPAPA 176
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   790 PAPQGLMIGHPPQE------------MRG-----PHPPSGLLGHGPQEMRGPQEMRGMQGPPPQGSMLGPPQELRGPSGS 852
Cdd:pfam09770  177 PQPAAQPASLPAPSrkmmsleeveaaMRAqakkpAQQPAPAPAQPPAAPPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQH 256
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   853 QGQQGPPQGSLGPPPQGGMQGPPGPQGQQNPARGPHPSQGPIPFQQQKAPLLGDGPRAPFNQEGQ-STGPPPLIPGLGQQ 931
Cdd:pfam09770  257 PGQGHPVTILQRPQSPQPDPAQPSIQPQAQQFHQQPPPVPVQPTQILQNPNRLSAARVGYPQNPQpGVQPAPAHQAHRQQ 336

                   ....*....
gi 569004618   932 GAQGRIPPL 940
Cdd:pfam09770  337 GSFGRQAPI 345
Pro-rich pfam15240
Proline-rich protein; This family includes several eukaryotic proline-rich proteins.
729-844 1.81e-03

Proline-rich protein; This family includes several eukaryotic proline-rich proteins.


Pssm-ID: 464580 [Multi-domain]  Cd Length: 167  Bit Score: 40.79  E-value: 1.81e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   729 GPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPASQGIQGPlmglnprgmQGPPGPRENQGPAPQGlmighppqemrGPH 808
Cdd:pfam15240   72 GPQQPPPQGGKQKPQGPPPQGGPRPPPGKPQGPPPQGGNQQ---------QGPPPPGKPQGPPPQG-----------GGP 131
                           90       100       110
                   ....*....|....*....|....*....|....*.
gi 569004618   809 PPSGLLGHGPQEMRGPQEMRGMQGPPPQGSMLGPPQ 844
Cdd:pfam15240  132 PPQGGNQQGPPPPPPGNPQGPPQRPPQPGNPQGPPQ 167
WD40 pfam00400
WD domain, G-beta repeat;
151-188 2.47e-03

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 36.94  E-value: 2.47e-03
                           10        20        30
                   ....*....|....*....|....*....|....*...
gi 569004618   151 FNFETILQAHDSPVRAMTWSHNDMWMLTADHGGYVKYW 188
Cdd:pfam00400    1 GKLLKTLEGHTGSVTSLAFSPDGKLLASGSDDGTVKVW 38
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
573-862 3.29e-03

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 42.06  E-value: 3.29e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   573 QPPPSSGTPLLGPQPFSGQGPISQIPQGFQ-------QPHPSQQMPLVPQMG----PPGPQGQfrAPGPQGQMgPQGPPM 641
Cdd:pfam03154  254 QPPPPSQVSPQPLPQPSLHGQMPPMPHSLQtgpshmqHPVPPQPFPLTPQSSqsqvPPGPSPA--APGQSQQR-IHTPPS 330
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   642 HQGGGGPQGfmgpqgpqgppqglPRPQDMhGPQGMQRHPGPHGPLGPQGPPGPQGSSGPQGHMGPQGPPGPQGHIGPqgP 721
Cdd:pfam03154  331 QSQLQSQQP--------------PREQPL-PPAPLSMPHIKPPPTTPIPQLPNPQSHKHPPHLSGPSPFQMNSNLPP--P 393
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618   722 PASQghmgpqgPPGTQGMQGPPGprgmqgpPHPHGIQGGPASQGIQGPLMglnprgmqGPPGPRENQGPAPQGlmighpp 801
Cdd:pfam03154  394 PALK-------PLSSLSTHHPPS-------AHPPPLQLMPQSQQLPPPPA--------QPPVLTQSQSLPPPA------- 444
                          250       260       270       280       290       300
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 569004618   802 qemrGPHPPSGLLGHGPQEMRGPQEMRGMQGPPPQGSMLGPPQELrgPSGSQGQQGPPQGS 862
Cdd:pfam03154  445 ----ASHPPTSGLHQVPSQSPFPQHPFVPGGPPPITPPSGPPTST--SSAMPGIQPPSSAS 499
Collagen pfam01391
Collagen triple helix repeat (20 copies); Members of this family belong to the collagen ...
710-761 4.07e-03

Collagen triple helix repeat (20 copies); Members of this family belong to the collagen superfamily. Collagens are generally extracellular structural proteins involved in formation of connective tissue structure. The alignment contains 20 copies of the G-X-Y repeat that forms a triple helix. The first position of the repeat is glycine, the second and third positions can be any residue but are frequently proline and hydroxy-proline. Collagens are post translationally modified by proline hydroxylase to form the hydroxy-proline residues. Defective hydroxylation is the cause of scurvy. Some members of the collagen superfamily are not involved in connective tissue structure but share the same triple helical structure. The family includes bacterial collagen-like triple-helix repeat proteins.


Pssm-ID: 460189 [Multi-domain]  Cd Length: 57  Bit Score: 36.70  E-value: 4.07e-03
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|..
gi 569004618   710 PgpqghiGPQGPPASQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGP 761
Cdd:pfam01391    9 P------GPPGPPGPPGPPGPPGPPGPPGEPGPPGPPGPPGPPGPPGAPGAP 54
PHA03247 PHA03247
large tegument protein UL36; Provisional
718-993 5.32e-03

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 41.46  E-value: 5.32e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  718 PQGPPASQGHMGPQGPPGTQGMQGPPGPrgmqGPPHPHGIQGGPASqgiqgplmglnprgmqgpPGPRENQGPAPQGLMI 797
Cdd:PHA03247 2589 PDAPPQSARPRAPVDDRGDPRGPAPPSP----LPPDTHAPDPPPPS------------------PSPAANEPDPHPPPTV 2646
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  798 GHPPQEMRGPHPPsgllghgpqEMRGPQEMRGmQGPPPQGSmlGPPQELRGPSGSqgqqgPPQGSLGP-----PPQGGMQ 872
Cdd:PHA03247 2647 PPPERPRDDPAPG---------RVSRPRRARR-LGRAAQAS--SPPQRPRRRAAR-----PTVGSLTSladppPPPPTPE 2709
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  873 GPPGPQGQQNP---------ARGPHPSQGPIPFQQQKAPLLGDGPRAPFNQEGQSTGPPPLIPglgqqgaqgRIPPLNPG 943
Cdd:PHA03247 2710 PAPHALVSATPlppgpaaarQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPP---------AAPAAGPP 2780
                         250       260       270       280       290
                  ....*....|....*....|....*....|....*....|....*....|
gi 569004618  944 QGPGPNKGTKGRRERHASGLPSPPGLVATTTTSPFVVVTLEALPTITWAP 993
Cdd:PHA03247 2781 RRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPP 2830
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
717-863 6.41e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 40.74  E-value: 6.41e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  717 GPQGPPASQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPASQGIQGPLMGLNPRGMQGPPGPRENQGPAPQGLM 796
Cdd:PRK07764  623 APAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWPAKAGGAAPAAPPPAPAPAAPAAPAGAAPAQPA 702
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 569004618  797 IGHPPQEMRGPHP-PSGLLGHGPQEMRGPQEMRGMQGPPPqgsmlGPPQELRGPSGSQGQQGPPQGSL 863
Cdd:PRK07764  703 PAPAATPPAGQADdPAAQPPQAAQGASAPSPAADDPVPLP-----PEPDDPPDPAGAPAQPPPPPAPA 765
COG3416 COG3416
Uncharacterized conserved protein, DUF2076 domain [Function unknown];
537-642 6.60e-03

Uncharacterized conserved protein, DUF2076 domain [Function unknown];


Pssm-ID: 442642 [Multi-domain]  Cd Length: 237  Bit Score: 39.62  E-value: 6.60e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  537 TQAEIEQEMAtlqytnpqlLEQL--KIERL-AQKQADQIQPPPSSGTPLLGpqpFSGQGPISQIPQGFQQPHPSQQmplv 613
Cdd:COG3416    47 AQTILVQEAA---------LKQAqqRIQELeAQLAQLQQQQPQSSGGFLSG---LFGGGQRPPPAPQPSQPGPQQQ---- 110
                          90       100
                  ....*....|....*....|....*....
gi 569004618  614 PQMGPPGPQGQFRAPGPQGQMGPQGPPMH 642
Cdd:COG3416   111 PAPPSGPWGQAAPQQPGYGQPQYGQPAAG 139
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
708-911 7.16e-03

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 40.74  E-value: 7.16e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  708 GPPGPQGhIGPQGPPASQGHMGPQGPPGTQGMQGPPGPRGMQGPPHPHGIQGGPASQGIQGPLMGLNPRGmQGPPGPREN 787
Cdd:PRK07764  596 GGEGPPA-PASSGPPEEAARPAAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASD-GGDGWPAKA 673
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 569004618  788 QGPAPQGLMIGHPPQEMRGPHPPSGllghGPQEMRGPQEMRGMQGPPPQGSMLGPPQELRGPSGSQGQQGPPQGSLGPPP 867
Cdd:PRK07764  674 GGAAPAAPPPAPAPAAPAAPAGAAP----AQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAADDPVPLPPEPDDPP 749
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....
gi 569004618  868 QGGMQGPPGPQGQQNPARGPHPSQGPIPFQQQKAPLLGDGPRAP 911
Cdd:PRK07764  750 DPAGAPAQPPPPPAPAPAAAPAAAPPPSPPSEEEEMAEDDAPSM 793
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
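
Every hit listed above carries a one-line summary (Pssm-ID, Cd Length, Bit Score, E-value), and the preset options give an E-value threshold of 0.01. The following is a minimal sketch, not part of the NCBI tooling, of how such a summary line could be parsed and checked against that threshold. The names `HitSummary`, `parse_summary` and `passes_threshold` are hypothetical, and treating the cutoff as inclusive (E-value <= threshold) is an assumption.

```python
import re
from typing import NamedTuple


class HitSummary(NamedTuple):
    """Fields of one 'Pssm-ID: ...' summary line (hypothetical container)."""
    pssm_id: int
    cd_length: int
    bit_score: float
    e_value: float


# Matches lines such as:
# "Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 40.74  E-value: 7.16e-03"
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(\d+).*?"
    r"Cd Length:\s*(\d+)\s+"
    r"Bit Score:\s*([\d.]+)\s+"
    r"E-value:\s*([\d.eE+-]+)"
)

# Threshold taken from the "Preset Options" line of this report.
E_VALUE_THRESHOLD = 0.01


def parse_summary(line: str) -> HitSummary:
    """Parse one summary line; raise ValueError if it does not match."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a hit summary line: {line!r}")
    return HitSummary(
        pssm_id=int(match.group(1)),
        cd_length=int(match.group(2)),
        bit_score=float(match.group(3)),
        e_value=float(match.group(4)),
    )


def passes_threshold(hit: HitSummary, threshold: float = E_VALUE_THRESHOLD) -> bool:
    """Assumption: a hit is reported when its E-value is at or below the threshold."""
    return hit.e_value <= threshold


if __name__ == "__main__":
    line = "Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 40.74  E-value: 7.16e-03"
    hit = parse_summary(line)
    print(hit, passes_threshold(hit))
```

Under this reading, every hit in the list above passes, since the largest E-value shown (7.16e-03) is below 0.01.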

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D)384-8.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D)265-8.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D)200-3.