Conserved domains on  [gi|1622966953|ref|XP_015003419|]

protein transport protein Sec31B isoform X4 [Macaca mulatta]



List of domain hits

Name Accession Description Interval E-value
WD40 super family cl29593
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
13-335 2.64e-30

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed-ring propeller structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


The actual alignment was detected with superfamily member cd00200:

Pssm-ID: 475233 [Multi-domain]  Cd Length: 289  Bit Score: 122.06  E-value: 2.64e-30
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953   13 AWSPASQYplyLATGtsaqqldssfSTNGTLEIFEVDFRDPSLDLK-HKGVLSassrfhKLIWGSFGSgllessgVIAGG 91
Cdd:cd00200     16 AFSPDGKL---LATG----------SGDGTIKVWDLETGELLRTLKgHTGPVR------DVAASADGT-------YLASG 69
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953   92 GDNGMLILYNVthilsSGKEPVIAQKQkHTGAVRALDFNPfqvlQGNLLASGASDSEVFIWDLNNLNVPMTLGSKSQqpp 171
Cdd:cd00200     70 SSDKTIRLWDL-----ETGECVRTLTG-HTSYVSSVAFSP----DGRILSSSSRDKTIKVWDVETGKCLTTLRGHTD--- 136
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  172 eDVKALSWNrQAQHILSSAHPSGKAVVWDLRKNEPIIKVSDHSNRMHCsgLAWHPDiATQLVLCSEDDrlpVIQLWDLRf 251
Cdd:cd00200    137 -WVNSVAFS-PDGTFVASSSQDGTIKLWDLRTGKCVATLTGHTGEVNS--VAFSPD-GEKLLSSSSDG---TIKLWDLS- 207
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  252 ASSPLKVLESHSRGILSVSWSQaDAELLLTSAKDSQILCLNLESSEVVYKLPTQSSWCFEVQWCPRDPSVFSaASFDGWI 331
Cdd:cd00200    208 TGKCLGTLRGHENGVNSVAFSP-DGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLAS-GSADGTI 285

                   ....
gi 1622966953  332 SLYS 335
Cdd:cd00200    286 RIWD 289
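Each alignment row above carries a start coordinate, the aligned string, and an end coordinate. The following is a minimal Python sketch for sanity-checking such rows, assuming the layout "label, start, sequence, end" and assuming that `-` marks a gap while both uppercase and lowercase letters count as residues. The names `parse_row`, `residues` and `coordinates_consistent` are illustrative only and are not part of any NCBI tool.

```python
import re

# Assumed row layout: label, start coordinate, aligned string, end coordinate.
ROW = re.compile(
    r"^(?P<label>.+?)\s+(?P<start>\d+)\s+(?P<seq>[A-Za-z-]+)\s+(?P<end>\d+)\s*$"
)


def parse_row(line):
    """Split one alignment row into (label, start, sequence, end)."""
    m = ROW.match(line)
    if m is None:
        raise ValueError(f"not an alignment row: {line!r}")
    return m["label"].strip(), int(m["start"]), m["seq"], int(m["end"])


def residues(seq):
    """Count residues in an aligned string; '-' is treated as a gap."""
    return sum(1 for ch in seq if ch != "-")


def coordinates_consistent(line):
    """True when end - start + 1 equals the number of residues in the row."""
    _, start, seq, end = parse_row(line)
    return end - start + 1 == residues(seq)


if __name__ == "__main__":
    for row in ("gi 1622966953  332 SLYS 335", "Cdd:cd00200    286 RIWD 289"):
        print(parse_row(row), coordinates_consistent(row))
```

A check like this is mainly useful for catching rows damaged by copy and paste, since a dropped or added character breaks the coordinate arithmetic.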
PHA03247 super family cl33720
large tegument protein UL36; Provisional
772-1079 1.04e-08

large tegument protein UL36; Provisional


The actual alignment was detected with superfamily member PHA03247:

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 59.95  E-value: 1.04e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  772 PPVQQLRDRLFHAQGSAVLGQQSPPFPYPRIVVGAIPHSKETSyRLGSQPSHQVPTPSPRPRVFTPQSSPAMPLAPSHPS 851
Cdd:PHA03247  2626 PPPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRAR-RLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPP 2704
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  852 PYQgprmqnisdyrasgsqaiqPLPLGPGVRPASSQPQLLGGQRVQAPNPVGFPGTWPLPGSLLLMACPdiTQPGSTSLS 931
Cdd:PHA03247  2705 PPT-------------------PEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGP--ARPARPPTT 2763
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  932 ETPRlfpllplrppgpshmvshAPAPPVSFLVPYPPGGPVAPCSSVLPTTGILTPHPGPQDSWKEAPAPGGNLQRNKLPE 1011
Cdd:PHA03247  2764 AGPP------------------APAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPA 2825
                          250       260       270       280       290       300       310
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1622966953 1012 TFMAPAPITAPVMSLTPE--LQGILPLQ---PPVSGVSHAPPGAPGELSLQQLQHLPPEKMERKELPPEHQSL 1079
Cdd:PHA03247  2826 GPLPPPTSAQPTAPPPPPgpPPPSLPLGgsvAPGGDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESF 2898
ACE1-Sec16-like super family cl14807
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ...
572-692 1.36e-04

Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. In the same way, Sec13 interacts with Sec31 and Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.


The actual alignment was detected with superfamily member cd09233:

Pssm-ID: 449359 [Multi-domain]  Cd Length: 314  Bit Score: 45.33  E-value: 1.36e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  572 LLLGELGPAVELYLKEERFADAIILAQAGGADLLKQTQERYlAKKKTKISSLLACVVQ---KNWKDVVCTCS-------- 640
Cdd:cd09233     73 LLTGNRKEALELALDNGLWAHALLLASSLGKETWAEVVSRF-ARSESKLNDPLQTLYQlfsGNSPEAITELAdnpaeaew 151
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 1622966953  641 -LKNWREALALLLTYSSTEKfpelcDM-----LGTRMEQEGgraLTSEARLCYVCSGS 692
Cdd:cd09233    152 aLGNWREHLAIILSNRTSNL-----DLealveLGDLLAQRG---LVEAAHICYLLAGV 201
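The three superfamily hits listed above each come as a title line (name plus accession) and an "Interval E-value" line. As a rough sketch, assuming exactly that two-line shape, the records can be parsed in Python and checked for overlap on the query. The `Hit` class and the helper names are hypothetical and only serve this illustration.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    name: str
    accession: str
    start: int
    end: int
    evalue: float


def parse_hit(title_line, interval_line):
    """Build a Hit from a 'Name Accession' line and an 'Interval E-value' line."""
    # The accession is assumed to be the last whitespace-separated token.
    name, accession = title_line.rsplit(None, 1)
    interval, evalue = interval_line.split()
    start, end = (int(x) for x in interval.split("-"))
    return Hit(name, accession, start, end, float(evalue))


def overlaps(a, b):
    """True when two hits share at least one query residue."""
    return a.start <= b.end and b.start <= a.end


# Values copied from the concise hit list above.
hits = [
    parse_hit("WD40 super family cl29593", "13-335 2.64e-30"),
    parse_hit("PHA03247 super family cl33720", "772-1079 1.04e-08"),
    parse_hit("ACE1-Sec16-like super family cl14807", "572-692 1.36e-04"),
]

if __name__ == "__main__":
    # List hits in query order, which reads as a crude domain architecture.
    for hit in sorted(hits, key=lambda h: h.start):
        print(f"{hit.start:>5}-{hit.end:<5} {hit.accession:<8} {hit.evalue:.2e}")
```

Sorting by `start` rather than by E-value gives the N- to C-terminal order of the hits along the query.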
 
Name Accession Description Interval E-value
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
13-335 2.64e-30

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed-ring propeller structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 122.06  E-value: 2.64e-30
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953   13 AWSPASQYplyLATGtsaqqldssfSTNGTLEIFEVDFRDPSLDLK-HKGVLSassrfhKLIWGSFGSgllessgVIAGG 91
Cdd:cd00200     16 AFSPDGKL---LATG----------SGDGTIKVWDLETGELLRTLKgHTGPVR------DVAASADGT-------YLASG 69
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953   92 GDNGMLILYNVthilsSGKEPVIAQKQkHTGAVRALDFNPfqvlQGNLLASGASDSEVFIWDLNNLNVPMTLGSKSQqpp 171
Cdd:cd00200     70 SSDKTIRLWDL-----ETGECVRTLTG-HTSYVSSVAFSP----DGRILSSSSRDKTIKVWDVETGKCLTTLRGHTD--- 136
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  172 eDVKALSWNrQAQHILSSAHPSGKAVVWDLRKNEPIIKVSDHSNRMHCsgLAWHPDiATQLVLCSEDDrlpVIQLWDLRf 251
Cdd:cd00200    137 -WVNSVAFS-PDGTFVASSSQDGTIKLWDLRTGKCVATLTGHTGEVNS--VAFSPD-GEKLLSSSSDG---TIKLWDLS- 207
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  252 ASSPLKVLESHSRGILSVSWSQaDAELLLTSAKDSQILCLNLESSEVVYKLPTQSSWCFEVQWCPRDPSVFSaASFDGWI 331
Cdd:cd00200    208 TGKCLGTLRGHENGVNSVAFSP-DGYLLASGSEDGTIRVWDLRTGECVQTLSGHTNSVTSLAWSPDGKRLAS-GSADGTI 285

                   ....
gi 1622966953  332 SLYS 335
Cdd:cd00200    286 RIWD 289
WD40 COG2319
WD40 repeat [General function prediction only];
6-336 5.42e-27

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 115.01  E-value: 5.42e-27
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953    6 LERPAVQAWSPASQYPLYLATGTSAQQLDSSFSTNGTLEIFEVDFRDPSLDLKHKGVLSASSRFHklIWGSFGSGLLESS 85
Cdd:COG2319     13 SADLALALLAAALGALLLLLLGLAAAVASLAASPDGARLAAGAGDLTLLLLDAAAGALLATLLGH--TAAVLSVAFSPDG 90
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953   86 GVIAGGGDNGMLILYNVthilSSGKEPviAQKQKHTGAVRALDFNPfqvlQGNLLASGASDSEVFIWDLNNLNVPMTLgs 165
Cdd:COG2319     91 RLLASASADGTVRLWDL----ATGLLL--RTLTGHTGAVRSVAFSP----DGKTLASGSADGTVRLWDLATGKLLRTL-- 158
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  166 ksQQPPEDVKALSWNRQAQhILSSAHPSGKAVVWDLRKNEPIIKVSDHSNRMHCsgLAWHPDiATQLVLCSEDDRlpvIQ 245
Cdd:COG2319    159 --TGHSGAVTSVAFSPDGK-LLASGSDDGTVRLWDLATGKLLRTLTGHTGAVRS--VAFSPD-GKLLASGSADGT---VR 229
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  246 LWDLRfASSPLKVLESHSRGILSVSWSqADAELLLTSAKDSQILCLNLESSEVVYKLPTQSSWCFEVQWCPrDPSVFSAA 325
Cdd:COG2319    230 LWDLA-TGKLLRTLTGHSGSVRSVAFS-PDGRLLASGSADGTVRLWDLATGELLRTLTGHSGGVNSVAFSP-DGKLLASG 306
                          330
                   ....*....|.
gi 1622966953  326 SFDGWISLYSV 336
Cdd:COG2319    307 SDDGTVRLWDL 317
PHA03247 PHA03247
large tegument protein UL36; Provisional
772-1079 1.04e-08

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 59.95  E-value: 1.04e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  772 PPVQQLRDRLFHAQGSAVLGQQSPPFPYPRIVVGAIPHSKETSyRLGSQPSHQVPTPSPRPRVFTPQSSPAMPLAPSHPS 851
Cdd:PHA03247  2626 PPPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRAR-RLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPP 2704
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  852 PYQgprmqnisdyrasgsqaiqPLPLGPGVRPASSQPQLLGGQRVQAPNPVGFPGTWPLPGSLLLMACPdiTQPGSTSLS 931
Cdd:PHA03247  2705 PPT-------------------PEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGP--ARPARPPTT 2763
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  932 ETPRlfpllplrppgpshmvshAPAPPVSFLVPYPPGGPVAPCSSVLPTTGILTPHPGPQDSWKEAPAPGGNLQRNKLPE 1011
Cdd:PHA03247  2764 AGPP------------------APAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPA 2825
                          250       260       270       280       290       300       310
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1622966953 1012 TFMAPAPITAPVMSLTPE--LQGILPLQ---PPVSGVSHAPPGAPGELSLQQLQHLPPEKMERKELPPEHQSL 1079
Cdd:PHA03247  2826 GPLPPPTSAQPTAPPPPPgpPPPSLPLGgsvAPGGDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESF 2898
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
826-1064 4.58e-08

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 57.47  E-value: 4.58e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  826 PTPSPRPRVFTPQSSPAMPLAPSHPSPYQGPR--MQNISDYRA----SGSQAIQPLPLGPgvRPASSQPQLLGGQRVQAP 899
Cdd:pfam03154  197 AGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHtlIQQTPTLHPqrlpSPHPPLQPMTQPP--PPSQVSPQPLPQPSLHGQ 274
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  900 NPvgfPGTWPLPGSLLLMACPDITQP-GSTSLSETPRLFPLLPLRPPGPSHMVSHAPAPPVSFLVPYPPGgpvapcSSVL 978
Cdd:pfam03154  275 MP---PMPHSLQTGPSHMQHPVPPQPfPLTPQSSQSQVPPGPSPAAPGQSQQRIHTPPSQSQLQSQQPPR------EQPL 345
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  979 PTTGILTPH--PGPQDSWKEAPAPggnlQRNKLPETFMAPAPITAPVmSLTPElqgilPLQPPVSGVS-HAPPGA-PGEL 1054
Cdd:pfam03154  346 PPAPLSMPHikPPPTTPIPQLPNP----QSHKHPPHLSGPSPFQMNS-NLPPP-----PALKPLSSLStHHPPSAhPPPL 415
                          250
                   ....*....|.
gi 1622966953 1055 SLQ-QLQHLPP 1064
Cdd:pfam03154  416 QLMpQSQQLPP 426
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
120-153 7.72e-05

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 41.14  E-value: 7.72e-05
                            10        20        30
                    ....*....|....*....|....*....|....
gi 1622966953   120 HTGAVRALDFNPfqvlQGNLLASGASDSEVFIWD 153
Cdd:smart00320   11 HTGPVTSVAFSP----DGKYLASGSDDGTIKLWD 40
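A short alignment such as the single-repeat smart00320 hit above is convenient for illustrating how identity over aligned columns could be counted. The Python sketch below assumes that lowercase letters mark residues not aligned to the domain model and that `-` marks a gap, so both are skipped. The function name `aligned_identity` is illustrative, and the count is a plain column comparison, not the scoring that produced the reported bit score.

```python
def aligned_identity(query, subject):
    """Return (identical_columns, aligned_columns) for two aligned strings.

    Columns containing a gap ('-') or a lowercase (assumed unaligned)
    residue in either row are ignored.
    """
    if len(query) != len(subject):
        raise ValueError("aligned strings must have the same length")
    matches = 0
    aligned = 0
    for q, s in zip(query, subject):
        if q == "-" or s == "-" or q.islower() or s.islower():
            continue
        aligned += 1
        if q == s:
            matches += 1
    return matches, aligned


if __name__ == "__main__":
    # Rows copied from the smart00320 alignment above.
    query = "HTGAVRALDFNPfqvlQGNLLASGASDSEVFIWD"
    model = "HTGPVTSVAFSP----DGKYLASGSDDGTIKLWD"
    matches, aligned = aligned_identity(query, model)
    print(f"{matches}/{aligned} identical aligned columns")
```

Because the rows are compared column by column, both strings must be pasted with their original spacing and gap characters intact.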
ACE1-Sec16-like cd09233
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ...
572-692 1.36e-04

Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. In the same way, Sec13 interacts with Sec31 and Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.


Pssm-ID: 187750 [Multi-domain]  Cd Length: 314  Bit Score: 45.33  E-value: 1.36e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  572 LLLGELGPAVELYLKEERFADAIILAQAGGADLLKQTQERYlAKKKTKISSLLACVVQ---KNWKDVVCTCS-------- 640
Cdd:cd09233     73 LLTGNRKEALELALDNGLWAHALLLASSLGKETWAEVVSRF-ARSESKLNDPLQTLYQlfsGNSPEAITELAdnpaeaew 151
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 1622966953  641 -LKNWREALALLLTYSSTEKfpelcDM-----LGTRMEQEGgraLTSEARLCYVCSGS 692
Cdd:cd09233    152 aLGNWREHLAIILSNRTSNL-----DLealveLGDLLAQRG---LVEAAHICYLLAGV 201
PTZ00420 PTZ00420
coronin; Provisional
80-205 2.01e-04

coronin; Provisional


Pssm-ID: 240412 [Multi-domain]  Cd Length: 568  Bit Score: 45.33  E-value: 2.01e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953   80 GLLESSGVIA------GGGDNGMLILYNVTHilssgKEPVIAQKqKHTGAVRALDFNPfqvLQGNLLASGASDSEVFIWD 153
Cdd:PTZ00420    33 GIACSSGFVAvpweveGGGLIGAIRLENQMR-----KPPVIKLK-GHTSSILDLQFNP---CFSEILASGSEDLTIRVWE 103
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1622966953  154 L--NNLNV-----PMTL--GSKSQqppedVKALSWNRQAQHILSSAHPSGKAVVWDLrKNE 205
Cdd:PTZ00420   104 IphNDESVkeikdPQCIlkGHKKK-----ISIIDWNPMNYYIMCSSGFDSFVNIWDI-ENE 158
WD40 pfam00400
WD domain, G-beta repeat;
120-153 5.43e-04

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 38.48  E-value: 5.43e-04
                           10        20        30
                   ....*....|....*....|....*....|....
gi 1622966953  120 HTGAVRALDFNPfqvlQGNLLASGASDSEVFIWD 153
Cdd:pfam00400   10 HTGSVTSLAFSP----DGKLLASGSDDGTVKVWD 39
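Every hit above is introduced by a summary line of the form "Pssm-ID: … Cd Length: … Bit Score: … E-value: …". As a hedged sketch, that line can be picked apart with a regular expression, and the fraction of the domain model covered by the alignment can be derived from the `Cdd:` row coordinates and the Cd Length. The pattern and the names `parse_summary` and `model_coverage` are assumptions based on the lines shown here, not a documented output format.

```python
import re

# Pattern inferred from the summary lines on this page.
SUMMARY = re.compile(
    r"Pssm-ID:\s*(?P<pssm>\d+).*?"
    r"Cd Length:\s*(?P<length>\d+)\s+"
    r"Bit Score:\s*(?P<bits>[\d.]+)\s+"
    r"E-value:\s*(?P<evalue>\S+)"
)


def parse_summary(line):
    """Extract Pssm-ID, model length, bit score and E-value from a summary line."""
    m = SUMMARY.search(line)
    if m is None:
        raise ValueError(f"not a summary line: {line!r}")
    return {
        "pssm_id": int(m["pssm"]),
        "cd_length": int(m["length"]),
        "bit_score": float(m["bits"]),
        "evalue": float(m["evalue"]),
    }


def model_coverage(cdd_start, cdd_end, cd_length):
    """Fraction of the domain model spanned by the aligned Cdd coordinates."""
    return (cdd_end - cdd_start + 1) / cd_length


if __name__ == "__main__":
    line = "Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 38.48  E-value: 5.43e-04"
    info = parse_summary(line)
    # The pfam00400 row above spans model positions 10 to 39.
    print(info, f"coverage={model_coverage(10, 39, info['cd_length']):.2f}")
```

Coverage is a useful companion to the E-value: a hit that spans only a small part of a long model, such as the PHA03247 hit against a 3151-residue model, deserves more caution than one that spans nearly the whole model.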
 
Name Accession Description Interval E-value
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
84-337 2.96e-30

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed-ring propeller structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 121.67  E-value: 2.96e-30
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953   84 SSGVIAGGGDNGMLILYNVThilssGKEPVIAQKQkHTGAVRALDFNPFqvlqGNLLASGASDSEVFIWDLNNLNVPMTL 163
Cdd:cd00200     20 DGKLLATGSGDGTIKVWDLE-----TGELLRTLKG-HTGPVRDVAASAD----GTYLASGSSDKTIRLWDLETGECVRTL 89
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  164 GSKSQqppeDVKALSWNrQAQHILSSAHPSGKAVVWDLRKNEPIIKVSDHSNRMHCsgLAWHPDiaTQLVLCSEDDRLpv 243
Cdd:cd00200     90 TGHTS----YVSSVAFS-PDGRILSSSSRDKTIKVWDVETGKCLTTLRGHTDWVNS--VAFSPD--GTFVASSSQDGT-- 158
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  244 IQLWDLRfASSPLKVLESHSRGILSVSWSqADAELLLTSAKDSQILCLNLESSEVVYKLPTQSSWCFEVQWCPrDPSVFS 323
Cdd:cd00200    159 IKLWDLR-TGKCVATLTGHTGEVNSVAFS-PDGEKLLSSSSDGTIKLWDLSTGKCLGTLRGHENGVNSVAFSP-DGYLLA 235
                          250
                   ....*....|....
gi 1622966953  324 AASFDGWISLYSVM 337
Cdd:cd00200    236 SGSEDGTIRVWDLR 249
WD40 COG2319
WD40 repeat [General function prediction only];
88-336 6.23e-27

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 115.01  E-value: 6.23e-27
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953   88 IAGGGDNGMLILYNVThilsSGKEpvIAQKQKHTGAVRALDFNPfqvlQGNLLASGASDSEVFIWDLNNLNVPMTLGSKS 167
Cdd:COG2319    177 LASGSDDGTVRLWDLA----TGKL--LRTLTGHTGAVRSVAFSP----DGKLLASGSADGTVRLWDLATGKLLRTLTGHS 246
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  168 QQppedVKALSWNRQAQHILsSAHPSGKAVVWDLRKNEPIIKVSDHSNRMHcsGLAWHPDiATQLVLCSEDDRlpvIQLW 247
Cdd:COG2319    247 GS----VRSVAFSPDGRLLA-SGSADGTVRLWDLATGELLRTLTGHSGGVN--SVAFSPD-GKLLASGSDDGT---VRLW 315
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  248 DLRfASSPLKVLESHSRGILSVSWSqADAELLLTSAKDSQILCLNLESSEVVYKLPTQSSWCFEVQWCPrDPSVFSAASF 327
Cdd:COG2319    316 DLA-TGKLLRTLTGHTGAVRSVAFS-PDGKTLASGSDDGTVRLWDLATGELLRTLTGHTGAVTSVAFSP-DGRTLASGSA 392

                   ....*....
gi 1622966953  328 DGWISLYSV 336
Cdd:COG2319    393 DGTVRLWDL 401
WD40 COG2319
WD40 repeat [General function prediction only];
88-336 2.12e-25

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 110.39  E-value: 2.12e-25
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953   88 IAGGGDNGMLILYNvthiLSSGKEpvIAQKQKHTGAVRALDFNPfqvlQGNLLASGASDSEVFIWDLNNLNVPMTLgsks 167
Cdd:COG2319    135 LASGSADGTVRLWD----LATGKL--LRTLTGHSGAVTSVAFSP----DGKLLASGSDDGTVRLWDLATGKLLRTL---- 200
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  168 QQPPEDVKALSWNRQAQhILSSAHPSGKAVVWDLRKNEPIIKVSDHSNRMHCsgLAWHPDiATQLVLCSEDDRlpvIQLW 247
Cdd:COG2319    201 TGHTGAVRSVAFSPDGK-LLASGSADGTVRLWDLATGKLLRTLTGHSGSVRS--VAFSPD-GRLLASGSADGT---VRLW 273
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  248 DLRfASSPLKVLESHSRGILSVSWSqADAELLLTSAKDSQILCLNLESSEVVYKLPTQSSWCFEVQWCPRDPSVFSaASF 327
Cdd:COG2319    274 DLA-TGELLRTLTGHSGGVNSVAFS-PDGKLLASGSDDGTVRLWDLATGKLLRTLTGHTGAVRSVAFSPDGKTLAS-GSD 350

                   ....*....
gi 1622966953  328 DGWISLYSV 336
Cdd:COG2319    351 DGTVRLWDL 359
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
826-1064 4.58e-08

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 57.47  E-value: 4.58e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  826 PTPSPRPRVFTPQSSPAMPLAPSHPSPYQGPR--MQNISDYRA----SGSQAIQPLPLGPgvRPASSQPQLLGGQRVQAP 899
Cdd:pfam03154  197 AGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHtlIQQTPTLHPqrlpSPHPPLQPMTQPP--PPSQVSPQPLPQPSLHGQ 274
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  900 NPvgfPGTWPLPGSLLLMACPDITQP-GSTSLSETPRLFPLLPLRPPGPSHMVSHAPAPPVSFLVPYPPGgpvapcSSVL 978
Cdd:pfam03154  275 MP---PMPHSLQTGPSHMQHPVPPQPfPLTPQSSQSQVPPGPSPAAPGQSQQRIHTPPSQSQLQSQQPPR------EQPL 345
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  979 PTTGILTPH--PGPQDSWKEAPAPggnlQRNKLPETFMAPAPITAPVmSLTPElqgilPLQPPVSGVS-HAPPGA-PGEL 1054
Cdd:pfam03154  346 PPAPLSMPHikPPPTTPIPQLPNP----QSHKHPPHLSGPSPFQMNS-NLPPP-----PALKPLSSLStHHPPSAhPPPL 415
                          250
                   ....*....|.
gi 1622966953 1055 SLQ-QLQHLPP 1064
Cdd:pfam03154  416 QLMpQSQQLPP 426
PHA03247 PHA03247
large tegument protein UL36; Provisional
792-1074 5.95e-08

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 57.64  E-value: 5.95e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  792 QQSPPFPYPRIVVGAiphsketsyrlgsqpshqvpTPSPRPRVFTPQSSPAMPLAPSHPSPYQGPrmqnisdyrasgsqA 871
Cdd:PHA03247  2704 PPPTPEPAPHALVSA--------------------TPLPPGPAAARQASPALPAAPAPPAVPAGP--------------A 2749
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  872 IQPLPLGPGVRPASSQPqllggqrvQAPNPVGFPGTWPLPGslllmacpdITQPGSTSLSETPRLFPLLPLRPPGPSHMV 951
Cdd:PHA03247  2750 TPGGPARPARPPTTAGP--------PAPAPPAAPAAGPPRR---------LTRPAVASLSESRESLPSPWDPADPPAAVL 2812
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  952 SHAPAPPVSflvpYPPGGPVAPCSSVLPTTGILTPHP-GPQDSWKEAPAPGGNLQRnKLPETFMAPAPIT---------- 1020
Cdd:PHA03247  2813 APAAALPPA----ASPAGPLPPPTSAQPTAPPPPPGPpPPSLPLGGSVAPGGDVRR-RPPSRSPAAKPAAparppvrrla 2887
                          250       260       270       280       290
                   ....*....|....*....|....*....|....*....|....*....|....*.
gi 1622966953 1021 APVMSLTPELQGILPLQP--PVSGVSHAPPGAPGELSLQQLQHLPPEKMERKELPP 1074
Cdd:PHA03247  2888 RPAVSRSTESFALPPDQPerPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPL 2943
PHA03247 PHA03247
large tegument protein UL36; Provisional
821-1051 1.18e-07

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 56.49  E-value: 1.18e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  821 PSHQVPTPSPRPRVFTP--QSSPAMPLAPSHPSPYQGPRmqnisDYRASGSQAIQPLPLGPGVRPASSQPQllggQRVQA 898
Cdd:PHA03247  2564 PDRSVPPPRPAPRPSEPavTSRARRPDAPPQSARPRAPV-----DDRGDPRGPAPPSPLPPDTHAPDPPPP----SPSPA 2634
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  899 PNPVGFPGTWPLPGSLLLMACPditQPGSTSLSETPRLFPLLPLRPPGPSHMVSHAPAPPVSFLVPY--PPGGPVAPCSS 976
Cdd:PHA03247  2635 ANEPDPHPPPTVPPPERPRDDP---APGRVSRPRRARRLGRAAQASSPPQRPRRRAARPTVGSLTSLadPPPPPPTPEPA 2711
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1622966953  977 VLPTTGILTPHPGPQDSWKEAPAPggnlqrnklPETFMAPAPITAPVMSLTPELQGI--LPLQPPVSGVSHAPPGAP 1051
Cdd:PHA03247  2712 PHALVSATPLPPGPAAARQASPAL---------PAAPAPPAVPAGPATPGGPARPARppTTAGPPAPAPPAAPAAGP 2779
PHA03247 PHA03247
large tegument protein UL36; Provisional
795-1052 2.24e-07

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 55.71  E-value: 2.24e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  795 PPFPYPRIVVGAIpHSKETSYRLGSQPSH-QVPT--PSPRPRVFTPQSSPAMPLAPSHPSPYQGPRmqniSDYRASGSQA 871
Cdd:PHA03247  2570 PPRPAPRPSEPAV-TSRARRPDAPPQSARpRAPVddRGDPRGPAPPSPLPPDTHAPDPPPPSPSPA----ANEPDPHPPP 2644
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  872 IQPLPLGPGVRPASSQPQLLGGQRVQ--APNPVGFPGTW------PLPGSLLLMACPdiTQPGSTSLSETPRLFPLLPLR 943
Cdd:PHA03247  2645 TVPPPERPRDDPAPGRVSRPRRARRLgrAAQASSPPQRPrrraarPTVGSLTSLADP--PPPPPTPEPAPHALVSATPLP 2722
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  944 PPGPSHMVShAPAPPVSFLVPYPPGGPVAPCSSVLP-----TTGILTPHP------GPQDSWKEAPAPGGNLQRNKLP-- 1010
Cdd:PHA03247  2723 PGPAAARQA-SPALPAAPAPPAVPAGPATPGGPARParpptTAGPPAPAPpaapaaGPPRRLTRPAVASLSESRESLPsp 2801
                          250       260       270       280
                   ....*....|....*....|....*....|....*....|...
gi 1622966953 1011 -ETFMAPAPITAPVMSLTPELQGILPLQPPVSGVSHAPPGAPG 1052
Cdd:PHA03247  2802 wDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPG 2844
PHA03247 PHA03247
large tegument protein UL36; Provisional
732-1096 3.15e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 51.86  E-value: 3.15e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  732 PHGVSPGPATTYRVTQYANLLAAQGSLATAMSfLPHDCAQPPVqqlrdrlfhAQGSAVLGQQSPPfpyPRIVVGAIPHSK 811
Cdd:PHA03247  2703 PPPPTPEPAPHALVSATPLPPGPAAARQASPA-LPAAPAPPAV---------PAGPATPGGPARP---ARPPTTAGPPAP 2769
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  812 ETSYRLGSQPSHQVPTPSPRPRVFTPQSSPAMPLAPSHPSPYQGPRMQNISDYRASGSQA--IQPLPLGPGVRPASSQPQ 889
Cdd:PHA03247  2770 APPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPppTSAQPTAPPPPPGPPPPS 2849
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  890 L-LGG---------QRVQAPNPVGFPGTWPLPgSLLLMACPDITQPGSTSLSETPRLFPLLPLRPPGPSHMVSHAPAPPV 959
Cdd:PHA03247  2850 LpLGGsvapggdvrRRPPSRSPAAKPAAPARP-PVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQ 2928
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  960 SFLVPYPPGGPVAPCSSVLPTTGILTPHPGPQDSWKEAPAPGG-NLQRNKLPEtfmaPAPITAPVMSLTPELQGIlplqp 1038
Cdd:PHA03247  2929 PQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWLGALVPGRvAVPRFRVPQ----PAPSREAPASSTPPLTGH----- 2999
                          330       340       350       360       370       380
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1622966953 1039 PVSGVS--------HAPPgAPGELSLQQLQHLPPEKmerkelppEHQSLKSSFEALLQRCSLSATD 1096
Cdd:PHA03247  3000 SLSRVSswasslalHEET-DPPPVSLKQTLWPPDDT--------EDSDADSLFDSDSERSDLEALD 3056
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
729-1077 9.74e-06

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 50.15  E-value: 9.74e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  729 LRGPHGVSPGPATTYRVTQYANLLAAQGSLATAMSFLPHDCAQPPVQQLRDRLFHA--QGSAVLGQQSPPFPYPRIV-VG 805
Cdd:pfam03154  174 LQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTliQQTPTLHPQRLPSPHPPLQpMT 253
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  806 AIPHSKETSYRLGSQPSHQVPTP-----------------SPRPRVFTPQSSPA-MPLAPSHPSPYQGPRMQNISDYRAS 867
Cdd:pfam03154  254 QPPPPSQVSPQPLPQPSLHGQMPpmphslqtgpshmqhpvPPQPFPLTPQSSQSqVPPGPSPAAPGQSQQRIHTPPSQSQ 333
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  868 GSQAI----QPLPLGPGVRP------ASSQPQLLGGQ------RVQAPNPVGFPGTWPLPGSLLLMAC-----PDITQPG 926
Cdd:pfam03154  334 LQSQQppreQPLPPAPLSMPhikpppTTPIPQLPNPQshkhppHLSGPSPFQMNSNLPPPPALKPLSSlsthhPPSAHPP 413
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  927 STSLSETPRLFPLLPLRPPGPSHMVSHAP----APPVSFLVPYPPGGPVAPCSSVLPTTGILTPHPGPQDSwkeAPAPGG 1002
Cdd:pfam03154  414 PLQLMPQSQQLPPPPAQPPVLTQSQSLPPpaasHPPTSGLHQVPSQSPFPQHPFVPGGPPPITPPSGPPTS---TSSAMP 490
                          330       340       350       360       370       380       390
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1622966953 1003 NLQrnklpetfmapapitaPVMSLTPELQGILPLQPPVSgvshAPPgapgelslQQLQHLPPEKMERKELPPEHQ 1077
Cdd:pfam03154  491 GIQ----------------PPSSASVSSSGPVPAAVSCP----LPP--------VQIKEEALDEAEEPESPPPPP 537
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
846-1079 2.35e-05

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 48.61  E-value: 2.35e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  846 APSHPSPyqgprMQNISDYRASGSQAI---QPLPLgPGVRPASSQPQLLGGQRVQAPNPVGFPGTWPLPGSlllmACPDI 922
Cdd:pfam03154  145 SPSIPSP-----QDNESDSDSSAQQQIlqtQPPVL-QAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQ----GSPAT 214
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  923 TQPGSTSLSetprlfpllplrppgpshmvshaPAPPVSFLVPYPPGGPVAPCSSVLPTTGILTPHPGPQDSWKEAPAPGG 1002
Cdd:pfam03154  215 SQPPNQTQS-----------------------TAAPHTLIQQTPTLHPQRLPSPHPPLQPMTQPPPPSQVSPQPLPQPSL 271
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953 1003 NLQrnklpetfMAPAPitAPVMSLTPELQGILPLQP----PVSGVSHAPPGAPGELS--LQQLQHLPPEKME-RKELPPE 1075
Cdd:pfam03154  272 HGQ--------MPPMP--HSLQTGPSHMQHPVPPQPfpltPQSSQSQVPPGPSPAAPgqSQQRIHTPPSQSQlQSQQPPR 341

                   ....
gi 1622966953 1076 HQSL 1079
Cdd:pfam03154  342 EQPL 345
WD40 COG2319
WD40 repeat [General function prediction only];
13-156 3.02e-05

WD40 repeat [General function prediction only];


Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 47.98  E-value: 3.02e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953   13 AWSPASQYplyLATGtsaqqldssfSTNGTLEIFEVDFRDPSLDLKHKGVLSASSRFHkliwgsfgsgllESSGVIAGGG 92
Cdd:COG2319    295 AFSPDGKL---LASG----------SDDGTVRLWDLATGKLLRTLTGHTGAVRSVAFS------------PDGKTLASGS 349
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1622966953   93 DNGMLILYNvthiLSSGKEpvIAQKQKHTGAVRALDFNPfqvlQGNLLASGASDSEVFIWDLNN 156
Cdd:COG2319    350 DDGTVRLWD----LATGEL--LRTLTGHTGAVTSVAFSP----DGRTLASGSADGTVRLWDLAT 403
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
255-338 4.21e-05

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel β-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 46.94  E-value: 4.21e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  255 PLKVLESHSRGILSVSWSqADAELLLTSAKDSQILCLNLESSEVVYKLPTQSSWCFEVQWCPRDPSVFSaASFDGWISLY 334
Cdd:cd00200      1 LRRTLKGHTGGVTCVAFS-PDGKLLATGSGDGTIKVWDLETGELLRTLKGHTGPVRDVAASADGTYLAS-GSSDKTIRLW 78

                   ....
gi 1622966953  335 SVMG 338
Cdd:cd00200     79 DLET 82
WD40 smart00320
WD40 repeats; Note that these repeats are permuted with respect to the structural repeats ...
120-153 7.72e-05

WD40 repeats; Note that these repeats are permuted with respect to the structural repeats (blades) of the beta propeller domain.


Pssm-ID: 197651 [Multi-domain]  Cd Length: 40  Bit Score: 41.14  E-value: 7.72e-05
                            10        20        30
                    ....*....|....*....|....*....|....
gi 1622966953   120 HTGAVRALDFNPfqvlQGNLLASGASDSEVFIWD 153
Cdd:smart00320   11 HTGPVTSVAFSP----DGKYLASGSDDGTIKLWD 40
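The short smart00320 alignment above is a convenient case for checking how an aligned pair of rows can be scored by hand. The following is a minimal Python sketch, not part of the CD-Search output: the function name `identity` is hypothetical, and it assumes that `-` marks a gap and that lowercase residues are unaligned insertions that should be skipped, which is how these alignments are conventionally displayed.

```python
def identity(query_row, subject_row):
    """Count identical residues over the aligned columns of two alignment rows.

    Assumption: '-' is a gap and lowercase letters are unaligned residues,
    so columns containing either are not counted as aligned.
    """
    if len(query_row) != len(subject_row):
        raise ValueError("aligned rows must have equal length")
    aligned = 0
    identical = 0
    for q, s in zip(query_row, subject_row):
        if q == "-" or s == "-" or q.islower() or s.islower():
            continue  # skip gap or unaligned (lowercase) columns
        aligned += 1
        if q == s:
            identical += 1
    return identical, aligned


# Rows copied from the smart00320 alignment (query residues 120-153).
query = "HTGAVRALDFNPfqvlQGNLLASGASDSEVFIWD"
subject = "HTGPVTSVAFSP----DGKYLASGSDDGTIKLWD"

identical, aligned = identity(query, subject)
print(f"{identical}/{aligned} identical aligned columns")
```

The same function can be applied block by block to the longer alignments, summing the two counts across blocks before taking a ratio.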
ACE1-Sec16-like cd09233
Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; COPII coat ...
572-692 1.36e-04

Ancestral coatomer element 1 (ACE1) of COPII coat complex assembly protein Sec16; The COPII coat complex plays an important role in vesicular traffic of newly synthesized proteins from the endoplasmic reticulum (ER) to the Golgi apparatus by mediating the formation of transport vesicles. COPII consists of an outer coat, made up of the scaffold proteins Sec31 and Sec13, and the cargo adaptor complex, Sec23 and Sec24, which are recruited by the small GTPase Sar1. Sec16 is involved in the early steps of the assembly process. Sec16 forms elongated heterotetramers with Sec13, Sec13-(Sec16)2-Sec13. It interacts with Sec13 by insertion of a single beta-blade to close the six-bladed beta propeller of Sec13. In the same way, Sec13 interacts with Sec31 and Nup145C, a nuclear pore protein; all of these contain a structurally related ancestral coatomer element 1 (ACE1). Sec16 is believed to be a key component in maintaining the integrity of the ER exit site.


Pssm-ID: 187750 [Multi-domain]  Cd Length: 314  Bit Score: 45.33  E-value: 1.36e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  572 LLLGELGPAVELYLKEERFADAIILAQAGGADLLKQTQERYlAKKKTKISSLLACVVQ---KNWKDVVCTCS-------- 640
Cdd:cd09233     73 LLTGNRKEALELALDNGLWAHALLLASSLGKETWAEVVSRF-ARSESKLNDPLQTLYQlfsGNSPEAITELAdnpaeaew 151
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 1622966953  641 -LKNWREALALLLTYSSTEKfpelcDM-----LGTRMEQEGgraLTSEARLCYVCSGS 692
Cdd:cd09233    152 aLGNWREHLAIILSNRTSNL-----DLealveLGDLLAQRG---LVEAAHICYLLAGV 201
PTZ00420 PTZ00420
coronin; Provisional
80-205 2.01e-04

coronin; Provisional


Pssm-ID: 240412 [Multi-domain]  Cd Length: 568  Bit Score: 45.33  E-value: 2.01e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953   80 GLLESSGVIA------GGGDNGMLILYNVTHilssgKEPVIAQKqKHTGAVRALDFNPfqvLQGNLLASGASDSEVFIWD 153
Cdd:PTZ00420    33 GIACSSGFVAvpweveGGGLIGAIRLENQMR-----KPPVIKLK-GHTSSILDLQFNP---CFSEILASGSEDLTIRVWE 103
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1622966953  154 L--NNLNV-----PMTL--GSKSQqppedVKALSWNRQAQHILSSAHPSGKAVVWDLrKNE 205
Cdd:PTZ00420   104 IphNDESVkeikdPQCIlkGHKKK-----ISIIDWNPMNYYIMCSSGFDSFVNIWDI-ENE 158
WD40 pfam00400
WD domain, G-beta repeat;
120-153 5.43e-04

WD domain, G-beta repeat;


Pssm-ID: 459801 [Multi-domain]  Cd Length: 39  Bit Score: 38.48  E-value: 5.43e-04
                           10        20        30
                   ....*....|....*....|....*....|....
gi 1622966953  120 HTGAVRALDFNPfqvlQGNLLASGASDSEVFIWD 153
Cdd:pfam00400   10 HTGSVTSLAFSP----DGKLLASGSDDGTVKVWD 39
PHA03378 PHA03378
EBNA-3B; Provisional
810-1111 2.32e-03

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 42.36  E-value: 2.32e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  810 SKETSYRLGSQPSH-QVPTPSPRPRVfTPQSSPAMPLAPSHPSPYQGPRMQNISDYRasgSQAIQPLPLGPGVRPASSQP 888
Cdd:PHA03378   579 SPTTSQLASSAPSYaQTPWPVPHPSQ-TPEPPTTQSHIPETSAPRQWPMPLRPIPMR---PLRMQPITFNVLVFPTPHQP 654
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  889 QllggqrvqAPNPVGFPGTWPLPGSLllMACPDITQPGSTSLSETPRLFPLLPLRPPGPSHMVSHAPA---PPVSFLVPY 965
Cdd:PHA03378   655 P--------QVEITPYKPTWTQIGHI--PYQPSPTGANTMLPIQWAPGTMQPPPRAPTPMRPPAAPPGraqRPAAATGRA 724
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  966 PPggPVAPCSSVLPTTGILTPHPGPQDSWKEAPAPGGNLQRNKLPE-TFMAPAPITAPVMsltpelqGILPLQPPVSGVS 1044
Cdd:PHA03378   725 RP--PAAAPGRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPPAaAPGAPTPQPPPQA-------PPAPQQRPRGAPT 795
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953 1045 HAPP----GAPGELSLQQL--QHLPPEKMERKELPPEHQSLKSS--FEALLQRcsLSATDLVLALGAGVGIH------FF 1110
Cdd:PHA03378   796 PQPPpqagPTSMQLMPRAApgQQGPTKQILRQLLTGGVKRGRPSlkKPAALER--QAAAGPTPSPGSGTSDKivqapvFY 873

                   .
gi 1622966953 1111 P 1111
Cdd:PHA03378   874 P 874
PRK10263 PRK10263
DNA translocase FtsK; Provisional
793-1052 5.47e-03

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 41.22  E-value: 5.47e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  793 QSPPFPyprivvGAIPHSKETSYRLGSQPSHQVPTPS--PRPRVFTPQSSPAMPLAPsHPSPYQGPRMQNISDYRASGSQ 870
Cdd:PRK10263   342 QTPPVA------SVDVPPAQPTVAWQPVPGPQTGEPViaPAPEGYPQQSQYAQPAVQ-YNEPLQQPVQPQQPYYAPAAEQ 414
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  871 AIQPLPLGPGVRPASSQPQLlggqrVQAPNPVGFPGTWPLPGSLLLMACPDITQPGSTslsetprlfpllplrppgpshM 950
Cdd:PRK10263   415 PAQQPYYAPAPEQPAQQPYY-----APAPEQPVAGNAWQAEEQQSTFAPQSTYQTEQT---------------------Y 468
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622966953  951 VSHAPAPPvsflvPYPPGGPVAPCSSVLPTTGILTPHPG-PQDSWKEAPAPGGNLQRNKL-------PETFMAPAPITAP 1022
Cdd:PRK10263   469 QQPAAQEP-----LYQQPQPVEQQPVVEPEPVVEETKPArPPLYYFEEVEEKRAREREQLaawyqpiPEPVKEPEPIKSS 543
                          250       260       270
                   ....*....|....*....|....*....|
gi 1622966953 1023 VMSLTPelqgilPLQPPVSGVSHAPPGAPG 1052
Cdd:PRK10263   544 LKAPSV------AAVPPVEAAAAVSPLASG 567
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
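
Each hit in the list above carries a summary line of the form `Pssm-ID: … Cd Length: … Bit Score: … E-value: …`, and the preset E-value threshold is 0.01. The following Python sketch shows one way such lines could be parsed and filtered when working with a saved copy of this page. The names `parse_summary` and `hits_below` are hypothetical helpers, not part of any NCBI tool, and whether the cutoff is inclusive is an assumption.

```python
import re

# Pattern for the per-hit summary line as it appears in this listing.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+).*?"
    r"Cd Length:\s*(?P<cd_length>\d+)\s+"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s+"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the numeric fields of a summary line, or None if it does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m["pssm_id"]),
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "e_value": float(m["e_value"]),
    }


def hits_below(lines, threshold=0.01):
    """Keep parsed hits whose E-value is at or below the threshold (assumed inclusive)."""
    parsed = (parse_summary(line) for line in lines)
    return [h for h in parsed if h is not None and h["e_value"] <= threshold]


# Two summary lines copied from the hit list above.
sample = [
    "Pssm-ID: 441893 [Multi-domain]  Cd Length: 403  Bit Score: 110.39  E-value: 2.12e-25",
    "Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 41.22  E-value: 5.47e-03",
]

for hit in hits_below(sample):
    print(hit["pssm_id"], hit["bit_score"], hit["e_value"])
```

Passing a stricter `threshold` (for example `1e-10`) narrows the list to the strongest hits, which here are the WD40 models.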
