Conserved domains on  [gi|149031233|gb|EDL86240|]

rCG41932 [Rattus norvegicus]

Protein Classification

SEC24 family transport protein (domain architecture ID 1001573)

SEC24 family transport protein is a component of the coat protein complex II (COPII), which promotes the formation of transport vesicles from the endoplasmic reticulum (ER).


List of domain hits

Name Accession Description Interval E-value
COG5028 super family cl34873
Vesicle coat complex COPII, subunit SEC24/subunit SFB2/subunit SFB3 [Intracellular trafficking ...
181-1091 6.74e-171

Vesicle coat complex COPII, subunit SEC24/subunit SFB2/subunit SFB3 [Intracellular trafficking and secretion];


The actual alignment was detected with superfamily member COG5028:

Pssm-ID: 227361 [Multi-domain]  Cd Length: 861  Bit Score: 524.36  E-value: 6.74e-171
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  181 TYPQSQAPPLSQAQGHPGVQPPLRSAPPLAS--SFTSPASGGPrmpsmPGPLPPGQgfgslpvSQANRvsSPPAHALPpg 258
Cdd:COG5028     2 SQHKKGVYPQAQSQVHTGAASSKKSARPHRAyaNFSAGQMGMP-----PYTTPPLQ-------QQSRR--QIDQAATA-- 65
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  259 tqmtgppappppMHspqQPGYQLQQNGSFGPARGPQPNYESPYPGAPTFGTqpgppqplppkrLDPDAIPS-PIQVIEDd 337
Cdd:COG5028    66 ------------MH---NTGANNPAPSVMSPAFQSQQKFSSPYGGSMADGT------------APKPTNPLvPVDLFED- 117
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  338 rnnrgSEPFVTGVRG----QVPPLvTTNFLVKDQGNASPRYIRCTSYNIPCTSDMAKQAQVPLAAVIKPLARLPPEEASP 413
Cdd:COG5028   118 -----QPPPISDLFLppppIVPPL-TTNFVGSEQSNCSPKYVRSTMYAIPETNDLLKKSKIPFGLVIRPFLELYPEEDPV 191
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  414 YVVDHGEsgPLRCNRCKAYMCPLMTFIEGGRRFQCSFCSCINDVPPQYFQHLDHTGKRVDAYDRPELSLGSYEFLATVDY 493
Cdd:COG5028   192 PLVEDGS--IVRCRRCRSYINPFVQFIEQGRKWRCNICRSKNDVPEGFDNPSGPNDPRSDRYSRPELKSGVVDFLAPKEY 269
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  494 ckNNKFPSPPAFIFMIDVSYNAIRTGLVRLLCEELKSLLDYLPREGGAeesaIRVGFVTYNKVLHFYNVKSSLaQPQMMV 573
Cdd:COG5028   270 --SLRQPPPPVYVFLIDVSFEAIKNGLVKAAIRAILENLDQIPNFDPR----TKIAIICFDSSLHFFKLSPDL-DEQMLI 342
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  574 VSDVADMFVPLLDG-FLVNVSESRAVITSLLDQIPEMFADTRETETVFAPviqagmeALKAA-----ECAGKLFLFHTSL 647
Cdd:COG5028   343 VSDLDEPFLPFPSGlFVLPLKSCKQIIETLLDRVPRIFQDNKSPKNALGP-------ALKAAksligGTGGKIIVFLSTL 415
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  648 PIAeAPGKLKNRDDrklintdKEKTLFQPQTGTYQTLAKECVAQGCCVDLFLFPNQYVDVATLSVVPQLTGGSVYKYACF 727
Cdd:COG5028   416 PNM-GIGKLQLRED-------KESSLLSCKDSFYKEFAIECSKVGISVDLFLTSEDYIDVATLSHLCRYTGGQTYFYPNF 487
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  728 QVE--NDQERFLSDLRRDVQKVVGFDAVMRVRTSTGIRAVDFFGAFYMSNTTDVELAGLDGDKTVTVEFKHDDRLNEeSG 805
Cdd:COG5028   488 SATrpNDATKLANDLVSHLSMEIGYEAVMRVRCSTGLRVSSFYGNFFNRSSDLCAFSTMPRDTSLLVEFSIDEKLMT-SD 566
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  806 ALLQCALLYTSCAGQRRLRIHNLALNCCTQLADLYRNCETDTLINYMAKFAYRAVVSSPVKTVRDTLITQCAQILACYRK 885
Cdd:COG5028   567 VYFQVALLYTLNDGERRIRVVNLSLPTSSSIREVYASADQLAIACILAKKASTKALNSSLKEARVLINKSMVDILKAYKK 646
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  886 NCASPSSAGQLILPECMKLLPVYLNCVLKSDVLQPGAeVTTDDRAYVRQLVSSMDVAETNVFFYPRLLPLTKSPLDS--- 962
Cdd:COG5028   647 ELVKSNTSTQLPLPANLKLLPLLMLALLKSSAFRSGS-TPSDIRISALNRLTSLPLKQLMRNIYPTLYALHDMPIEAglp 725
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  963 ----TTEPPAVRASEERLSSGDIYLLENGLNLFVWVGASVQQGVVQSLFNVSSFSQITSGLSVLPVLDNPLSKKVRGLID 1038
Cdd:COG5028   726 deglLVLPSPINATSSLLESGGLYLIDTGQKIFLWFGKDAVPSLLQDLFGVDSLSDIPSGKFTLPPTGNEFNERVRNIIG 805
                         890       900       910       920       930
                  ....*....|....*....|....*....|....*....|....*....|....*.
gi 149031233 1039 TLRAQ-RTRYMKLIVVKQ--EDKLEMLFKHFLVEDKSLsGGASYVDFLCHMHKEIR 1091
Cdd:COG5028   806 ELRSVnDDSTLPLVLVRGggDPSLRLWFFSTLVEDKTL-NIPSYLDYLQILHEKIK 860
PHA03247 super family cl33720
large tegument protein UL36; Provisional
7-305 1.73e-13

large tegument protein UL36; Provisional


The actual alignment was detected with superfamily member PHA03247:

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 75.75  E-value: 1.73e-13
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    7 APPVPPFGQNQPIYPGYHQSNYGGQPGPAAPATPYGAYNGPVPGYQQAP--PQGVPRAPPCSGAPPASAAQVPCGqttyg 84
Cdd:PHA03247 2703 PPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPggPARPARPPTTAGPPAPAPPAAPAA----- 2777
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   85 qfgqgdiqnGPSSTAQMPRVpGSQQFGPPLAPVVSQPAVLQPYGPPPTSTQVTAQLAAMQISGAVAQAPPPSGLGYGPPT 164
Cdd:PHA03247 2778 ---------GPPRRLTRPAV-ASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPP 2847
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  165 SLASASGNFPNSGPYSTYPQSQAPPLS-QAQGHPGVQPPLRSAPPLAS-SFTSPASGGPRMPSMPGPLPPGQGFGSLPVS 242
Cdd:PHA03247 2848 PSLPLGGSVAPGGDVRRRPPSRSPAAKpAAPARPPVRRLARPAVSRSTeSFALPPDQPERPPQPQAPPPPQPQPQPPPPP 2927
                         250       260       270       280       290       300
                  ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 149031233  243 QANRVSSPPAHALPPGTQMTGPPAPpppmhSPQQPGYQLQQNGSFGPARGPQPNYESPYPGAP 305
Cdd:PHA03247 2928 QPQPPPPPPPRPQPPLAPTTDPAGA-----GEPSGAVPQPWLGALVPGRVAVPRFRVPQPAPS 2985
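
The statistics line printed under each hit above follows a fixed layout (Pssm-ID, Cd Length, Bit Score, E-value), so it can be turned into a record with a regular expression. The following is a minimal Python sketch, not an NCBI tool: the function name `parse_hit_stats` and the pattern `STATS_RE` are hypothetical and assume the layout shown on this page; the sample line is copied from the COG5028 hit.

```python
import re

# Hypothetical pattern for the per-hit statistics line on this page.
STATS_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*\[(?P<kind>[^\]]+)\]\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)

def parse_hit_stats(line):
    """Return the fields of one statistics line as a dict, or None if no match."""
    m = STATS_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m["pssm_id"]),
        "kind": m["kind"],
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "e_value": float(m["e_value"]),
    }

sample = "Pssm-ID: 227361 [Multi-domain]  Cd Length: 861  Bit Score: 524.36  E-value: 6.74e-171"
print(parse_hit_stats(sample))
```

Lines that do not carry all four fields return `None`, so the function can be applied to every line of a saved page without pre-filtering.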
 
Name Accession Description Interval E-value
Sec24-like cd01479
Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the ...
500-759 8.40e-124

Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the budding and fusion of intracellular transport vesicles that selectively carry cargo proteins and lipids from donor to acceptor organelles. The two main classes of vesicular carriers within the endocytic and biosynthetic pathways are COP- and clathrin-coated vesicles. Formation of COPII vesicles requires the ordered assembly of a coat built from several cytosolic components: the GTPase Sar1 and the Sec23-Sec24 and Sec13-Sec31 complexes. The process is initiated by the exchange of GDP for GTP on the GTPase Sar1, which then recruits the heterodimeric Sec23-Sec24 complex. This heterodimeric complex generates the pre-budding complex. The final step, leading to membrane deformation and budding of COPII-coated vesicles, is carried out by the heterodimeric complex Sec13-Sec31. The members of this CD belong to the Sec23-like family. Sec24 is very similar to Sec23. The Sec23 and Sec24 polypeptides fold into five distinct domains: a beta-barrel, a zinc finger, a vWA or trunk domain, an all-helical region, and a carboxy-terminal gelsolin domain. The members of this subgroup carry a partial MIDAS motif and have the overall Para-Rossmann type fold that is characteristic of this superfamily.


Pssm-ID: 238756 [Multi-domain]  Cd Length: 244  Bit Score: 378.54  E-value: 8.40e-124
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  500 PSPPAFIFMIDVSYNAIRTGLVRLLCEELKSLLDYLPREggaeESAIRVGFVTYNKVLHFYNVKSSLAQPQMMVVSDVAD 579
Cdd:cd01479     1 PQPAVYVFLIDVSYNAIKSGLLATACEALLSNLDNLPGD----DPRTRVGFITFDSTLHFFNLKSSLEQPQMMVVSDLDD 76
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  580 MFVPLLDGFLVNVSESRAVITSLLDQIPEMFADTRETETVFAPVIQAGMEALKaaECAGKLFLFHTSLPIAEApGKLKNR 659
Cdd:cd01479    77 PFLPLPDGLLVNLKESRQVIEDLLDQIPEMFQDTKETESALGPALQAAFLLLK--ETGGKIIVFQSSLPTLGA-GKLKSR 153
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  660 DDRKLINTDKEKTLFQPQTGTYQTLAKECVAQGCCVDLFLFPNQYVDVATLSVVPQLTGGSVYKYAcfqvendqeRFLSD 739
Cdd:cd01479   154 EDPKLLSTDKEKQLLQPQTDFYKKLALECVKSQISVDLFLFSNQYVDVATLGCLSRLTGGQVYYYP---------SFNFS 224
                         250       260
                  ....*....|....*....|
gi 149031233  740 LRRDVQKVVGFDAVMRVRTS 759
Cdd:cd01479   225 APNDVEKLVNELARYLTRKI 244
Sec23_trunk pfam04811
Sec23/Sec24 trunk domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum ...
500-744 1.11e-115

Sec23/Sec24 trunk domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface.


Pssm-ID: 398467 [Multi-domain]  Cd Length: 241  Bit Score: 357.33  E-value: 1.11e-115
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   500 PSPPAFIFMIDVSYNAIRTGLVRLLCEELKSLLDYLPREggaeeSAIRVGFVTYNKVLHFYNVKSSLAQPQMMVVSDVAD 579
Cdd:pfam04811    1 PQPPVFLFVIDVSYNAIKSGLLAALKESLLQSLDLLPGD-----PRARVGFITFDSTVHFFNLGSSLRQPQMLVVSDLQD 75
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   580 MFVPLLDGFLVNVSESRAVITSLLDQIPEMFADTRETETVFAPVIQAGMEALKAAECAGKLFLFHTSLPIAEAPGKLKNR 659
Cdd:pfam04811   76 MFLPLPDRFLVPLSECRFVLEDLLEQLPPMFPVTKRPERCLGPALQAAFLLLKAAFTGGKIMVFQGGLPTVGPGGKLKSR 155
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   660 DDRKLINTDKEKTLFQPQT-GTYQTLAKECVAQGCCVDLFLFPNQYVDVATLSVVPQLTGGSVYKYACFQVENDQERFLS 738
Cdd:pfam04811  156 LDESHHGTDKEKAKLVKKAdKFYKSLAKECVKQGHSVDLFAFSLDYVDVATLGQLSRLTGGQVYLYPSFQADVDGSKFKQ 235

                   ....*.
gi 149031233   739 DLRRDV 744
Cdd:pfam04811  236 DLQRYF 241
PTZ00395 PTZ00395
Sec24-related protein; Provisional
19-1092 5.66e-49

Sec24-related protein; Provisional


Pssm-ID: 185594 [Multi-domain]  Cd Length: 1560  Bit Score: 190.67  E-value: 5.66e-49
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   19 IYPGYHqsnyGGQPGPAAPATPYGAYNGPVPG--YQQAPP--QGVPRAPPCSGAPPASAAQVPCGQTTYGQfgqgdiqng 94
Cdd:PTZ00395  338 IYGGFH----DGSPNAASAGAPFNGLGNQADGghINQVHPdaRGAWAGGPHSNASYNCAAYSNAAQSNAAQ--------- 404
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   95 psSTAQMPRVPGSQQfGPPLAPVVSQPAVLQPYGPPPTSTQVTAQlaamqisgavaqaPPPSGlgygPPTSlasasgNFP 174
Cdd:PTZ00395  405 --SNAGFSNAGYSNP-GNSNPGYNNAPNSNTPYNNPPNSNTPYSN-------------PPNSN----PPYS------NLP 458
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  175 NSG-PYSTYPQSQAPPlSQAQGHPGVqpplrsappLASSFTSPASGGPRMPSMPGPLPPGQGF-GSLPVSQANRVSSPPA 252
Cdd:PTZ00395  459 YSNtPYSNAPLSNAPP-SSAKDHHSA---------YHAAYQHRAANQPAANLPTANQPAANNFhGAAGNSVGNPFASRPF 528
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  253 HALPPGTQMTGPPAPPPPMHSPQQPG-----YQLQQNGSFGPARGPQPNYESPYPGAPTFGTQPGPPQPLPPKRLDPDAI 327
Cdd:PTZ00395  529 GSAPYGGNAATTADPNGIAKREDHPEggtnrQKYEQSDEESVESSSSENSSENENEVTDKGEEIYSLLKKTINRIDMNKI 608
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  328 PSPIQVIEDDRNNRGSEPFVTgVRGQVPPLVTTNFLVKDQGNASPRYIRCTSYNIPCTSDMAKQAQVPLAAVIKPLARLP 407
Cdd:PTZ00395  609 PRPIINTQEKKKKKNLKVFET-CKYISPPSYYQPYISIDTGKADPRFLKSTLYQIPLFSETLKLSQIPFGIIVNPFACLN 687
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  408 PEEASPYV-----VDHGESGP--LRCNRCKAYMcpLMTFIEG-GRRFQCSFCSC---IND-------------------- 456
Cdd:PTZ00395  688 EGEGIDKIdmkdiINDKEENIeiLRCPKCLGYL--HATILEDiSSSVQCVFCDTdflINEnvlfdifqynekighkesdh 765
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  457 ------------------VPPQYFQHLD-------HTGKRV--------------------------------------- 472
Cdd:PTZ00395  766 nehgnslspllkgsvdiiIPPIYYHNVNkfkltytYLNKNInqtafmitnkimsftkhisnslvandskggnkatsasaf 845
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  473 -DAYDRPELSLGSY--------------------------------------------EFLATVD------------YCK 495
Cdd:PTZ00395  846 gDSGDANFLAGGGYtnyggaggyntydnqsgynnhdvvnnrggsgagnhlygkdhdvqNFDNVMDnanftihdmknlICE 925
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  496 NN---------------KFPS-----PPAFIFMIDVSYNAIRTGLVRLLCEELKSLLDYL--PReggaeesaIRVGFVTY 553
Cdd:PTZ00395  926 KNgepdsakirrnsflaKYPQvknmlPPYFVFVVECSYNAIYNNITYTILEGIRYAVQNVkcPQ--------TKIAIITF 997
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  554 NKVLHFYNVKSSLAQP-------------QMMVVSDVADMFVPL-LDGFLVNVSESRAVITSLLDQIPEMFADTRETETV 619
Cdd:PTZ00395  998 NSSIYFYHCKGGKGVSgeegdggggsgnhQVIVMSDVDDPFLPLpLEDLFFGCVEEIDKINTLIDTIKSVSTTMQSYGSC 1077
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  620 FAPVIQAGMEALKAAECAGKLFLFHTSLPIAeAPGKLKnrddrKLINTDKEKTLFQPQTGTYQTLAKECVAQGCCVDLFL 699
Cdd:PTZ00395 1078 GNSALKIAMDMLKERNGLGSICMFYTTTPNC-GIGAIK-----ELKKDLQENFLEVKQKIFYDSLLLDLYAFNISVDIFI 1151
                         890       900       910       920       930       940       950       960
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  700 FP--NQYVDVATLSVVPQLTGGSVYKYACFQVEND-QERFLSDLRRDVQKVVGFDAVMRVRTSTGIRAVDFFGAFYMSNT 776
Cdd:PTZ00395 1152 ISsnNVRVCVPSLQYVAQNTGGKILFVENFLWQKDyKEIYMNIMDTLTSEDIAYCCELKLRYSHHMSVKKLFCCNNNFNS 1231
                         970       980       990      1000      1010      1020      1030      1040
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  777 T----DVELAGLDGDKTVTVEFKHDDRLNEESGALLQCALLYTSCAGQRRLRIHNLALNCCTQLADLYRNCETDTLINYM 852
Cdd:PTZ00395 1232 IisvdTIKIPKIRHDQTFAFLLNYSDISESKKQIYFQCACIYTNLWGDRFVRLHTTHMNLTSSLSTVFRYTDAEALMNIL 1311
                        1050      1060      1070      1080      1090      1100      1110      1120
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  853 AKFAYRAVVSSpvKTVRDTLITQCAQILACYRKNCASPSSAGQLILPECMKLLPVYLNCVLKSDVLQpgAEVTTDDRAYV 932
Cdd:PTZ00395 1312 IKQLCTNILHN--DNYSKIIIDNLAAILFSYRINCASSAHSGQLILPDTLKLLPLFTSSLLKHNVTK--KEILHDLKVYS 1387
                        1130      1140      1150      1160      1170      1180      1190      1200
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  933 RQLVSSMDVAETNVFFYPRLLPL----TKSPLDSTTE------PPAVRASEERLSSGDIYLLENGLNLFVWVGASVQQGV 1002
Cdd:PTZ00395 1388 LIKLLSMPIISSLLYVYPVMYVIhikgKTNEIDSMDVdddlfiPKTIPSSAEKIYSNGIYLLDACTHFYLYFGFHSDANF 1467
                        1210      1220      1230      1240      1250      1260      1270      1280
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233 1003 VQSLFNVSSFSQITSGLSvlpVLDNPLSKKVRGLIDTLRA--QRTRYMKLIVVKQEDKLEMLFKHFLVEDKSlSGGASYV 1080
Cdd:PTZ00395 1468 AKEIVGDIPTEKNAHELN---LTDTPNAQKVQRIIKNLSRihHFNKYVPLVMVAPKSNEEEHLISLCVEDKA-DKEYSYV 1543
                        1290
                  ....*....|..
gi 149031233 1081 DFLCHMHKEIRQ 1092
Cdd:PTZ00395 1544 NFLCFIHKLVHK 1555
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
5-305 1.32e-12

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly by altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 72.49  E-value: 1.32e-12
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233     5 QSAPPVPPfgqnQPIYPGYHQSNYGGqPGPAAPATPygayNGPVPGYQQAPPQGVPRAPPCS---GAPPASAAQVPCGQT 81
Cdd:pfam03154  177 QSGAASPP----SPPPPGTTQAATAG-PTPSAPSVP----PQGSPATSQPPNQTQSTAAPHTliqQTPTLHPQRLPSPHP 247
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    82 TYGQFGQG--DIQNGPSST------AQMPRVPGSQQFGPPLAPvvsQPAVLQPYGPPPTSTQvtaqlaamqisgavAQAP 153
Cdd:pfam03154  248 PLQPMTQPppPSQVSPQPLpqpslhGQMPPMPHSLQTGPSHMQ---HPVPPQPFPLTPQSSQ--------------SQVP 310
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   154 PPsglgygPPTSLASASGNFPNSGPYSTYPQSQAPPLSQA-----QGHPGVQPPLRSA-PPLASSFT---SPASGGPRMP 224
Cdd:pfam03154  311 PG------PSPAAPGQSQQRIHTPPSQSQLQSQQPPREQPlppapLSMPHIKPPPTTPiPQLPNPQShkhPPHLSGPSPF 384
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   225 SMPGPLPPG---QGFGSLPVSQANRVSSPPAHALPPGTQMTGPpappppmhsPQQPGYqLQQNGSFGPARGPQPNYESPY 301
Cdd:pfam03154  385 QMNSNLPPPpalKPLSSLSTHHPPSAHPPPLQLMPQSQQLPPP---------PAQPPV-LTQSQSLPPPAASHPPTSGLH 454

                   ....
gi 149031233   302 PGAP 305
Cdd:pfam03154  455 QVPS 458
COG3416 COG3416
Uncharacterized conserved protein, DUF2076 domain [Function unknown];
13-62 8.36e-04

Uncharacterized conserved protein, DUF2076 domain [Function unknown];


Pssm-ID: 442642 [Multi-domain]  Cd Length: 237  Bit Score: 42.32  E-value: 8.36e-04
                          10        20        30        40        50
                  ....*....|....*....|....*....|....*....|....*....|....
gi 149031233   13 FGQNQPIYPGYHQSNYGGQPGPAAPATPYGAYNGPVPGYQQ----APPQGVPRA 62
Cdd:COG3416    90 FGGGQRPPPAPQPSQPGPQQQPAPPSGPWGQAAPQQPGYGQpqygQPAAGPSGG 143
GEL smart00262
Gelsolin homology domain; Gelsolin/severin/villin homology domain. Calcium-binding and ...
964-1000 6.92e-03

Gelsolin homology domain; Gelsolin/severin/villin homology domain. Calcium-binding and actin-binding. Both intra- and extracellular domains.


Pssm-ID: 214590 [Multi-domain]  Cd Length: 90  Bit Score: 36.89  E-value: 6.92e-03
                            10        20        30
                    ....*....|....*....|....*....|....*....
gi 149031233    964 TEPPAVRASEERLSSGDIYLLENGLNLFVWVG--ASVQQ 1000
Cdd:smart00262   11 VRVPEVPFSQGSLNSGDCYILDTGSEIYVWVGkkSSQDE 49
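
The paired rows in the alignments above can be compared column by column to estimate how many aligned positions are identical. The following is a minimal Python sketch, assuming the usual reading of this display: uppercase letters mark aligned residues, lowercase letters mark unaligned residues, and `-` marks a gap. The function name `alignment_identity` is hypothetical; the two rows are copied from the GEL (smart00262) hit.

```python
def alignment_identity(query_row, subject_row):
    """Return (identical, aligned) column counts for one pair of alignment rows."""
    if len(query_row) != len(subject_row):
        raise ValueError("rows must have the same number of columns")
    identical = aligned = 0
    for q, s in zip(query_row, subject_row):
        # Count a column only when both rows hold an aligned (uppercase) residue;
        # gaps ('-') and lowercase unaligned residues are skipped.
        if q.isupper() and s.isupper():
            aligned += 1
            if q == s:
                identical += 1
    return identical, aligned

# Sequence portions of the two rows of the GEL hit (residue numbers stripped).
query = "TEPPAVRASEERLSSGDIYLLENGLNLFVWVG--ASVQQ"
subject = "VRVPEVPFSQGSLNSGDCYILDTGSEIYVWVGkkSSQDE"
identical, aligned = alignment_identity(query, subject)
print(f"{identical} of {aligned} aligned columns are identical")
```

For hits that span several display blocks, the sequence portions of each block would need to be concatenated before calling the function.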
 
Cdd:COG5028   567 VYFQVALLYTLNDGERRIRVVNLSLPTSSSIREVYASADQLAIACILAKKASTKALNSSLKEARVLINKSMVDILKAYKK 646
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  886 NCASPSSAGQLILPECMKLLPVYLNCVLKSDVLQPGAeVTTDDRAYVRQLVSSMDVAETNVFFYPRLLPLTKSPLDS--- 962
Cdd:COG5028   647 ELVKSNTSTQLPLPANLKLLPLLMLALLKSSAFRSGS-TPSDIRISALNRLTSLPLKQLMRNIYPTLYALHDMPIEAglp 725
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  963 ----TTEPPAVRASEERLSSGDIYLLENGLNLFVWVGASVQQGVVQSLFNVSSFSQITSGLSVLPVLDNPLSKKVRGLID 1038
Cdd:COG5028   726 deglLVLPSPINATSSLLESGGLYLIDTGQKIFLWFGKDAVPSLLQDLFGVDSLSDIPSGKFTLPPTGNEFNERVRNIIG 805
                         890       900       910       920       930
                  ....*....|....*....|....*....|....*....|....*....|....*.
gi 149031233 1039 TLRAQ-RTRYMKLIVVKQ--EDKLEMLFKHFLVEDKSLsGGASYVDFLCHMHKEIR 1091
Cdd:COG5028   806 ELRSVnDDSTLPLVLVRGggDPSLRLWFFSTLVEDKTL-NIPSYLDYLQILHEKIK 860
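
Each hit on this page carries a summary row with a Pssm-ID, model length, bit score, and E-value. As a minimal sketch, assuming every summary row follows the layout shown on this page, the Python below parses such a row and estimates how much of the domain model an alignment spans. The names `parse_summary` and `model_coverage` are hypothetical helpers introduced for illustration, not part of any NCBI tool.

```python
import re

# Hypothetical pattern: mirrors the "Pssm-ID ... E-value" rows on this page.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*(?:\[(?P<kind>[^\]]+)\])?\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Parse one hit summary row into typed fields."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a hit summary line: {line!r}")
    return {
        "pssm_id": int(match["pssm_id"]),
        "kind": match["kind"],
        "cd_length": int(match["cd_length"]),
        "bit_score": float(match["bit_score"]),
        "e_value": float(match["e_value"]),
    }


def model_coverage(first, last, cd_length):
    """Fraction of the domain model spanned by positions first..last (inclusive)."""
    return (last - first + 1) / cd_length


line = "Pssm-ID: 227361 [Multi-domain]  Cd Length: 861  Bit Score: 524.36  E-value: 6.74e-171"
hit = parse_summary(line)
# The COG5028 rows of the alignment above run from model position 2 to 860.
print(hit["pssm_id"], round(model_coverage(2, 860, hit["cd_length"]), 3))
```

The coverage figure counts the span between the first and last aligned model positions, so internal gaps are not subtracted; it is a rough upper bound rather than an exact aligned fraction.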
Sec24-like cd01479
Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the ...
500-759 8.40e-124

Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the budding and fusion of intracellular transport vesicles that selectively carry cargo proteins and lipids from donor to acceptor organelles. The two main classes of vesicular carriers within the endocytic and the biosynthetic pathways are COP- and clathrin-coated vesicles. Formation of COPII vesicles requires the ordered assembly of a coat built from several cytosolic components: the GTPase Sar1 and the Sec23-Sec24 and Sec13-Sec31 complexes. The process is initiated by the exchange of GDP for GTP on the GTPase Sar1, which then recruits the heterodimeric Sec23-Sec24 complex. This heterodimeric complex generates the pre-budding complex. The final step, leading to membrane deformation and budding of COPII-coated vesicles, is carried out by the heterodimeric complex Sec13-Sec31. The members of this CD belong to the Sec23-like family. Sec24 is very similar to Sec23. The Sec23 and Sec24 polypeptides fold into five distinct domains: a beta-barrel, a zinc finger, a vWA or trunk domain, an all-helical region, and a carboxy-terminal gelsolin domain. The members of this subgroup carry a partial MIDAS motif and have the overall Para-Rossmann type fold that is characteristic of this superfamily.


Pssm-ID: 238756 [Multi-domain]  Cd Length: 244  Bit Score: 378.54  E-value: 8.40e-124
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  500 PSPPAFIFMIDVSYNAIRTGLVRLLCEELKSLLDYLPREggaeESAIRVGFVTYNKVLHFYNVKSSLAQPQMMVVSDVAD 579
Cdd:cd01479     1 PQPAVYVFLIDVSYNAIKSGLLATACEALLSNLDNLPGD----DPRTRVGFITFDSTLHFFNLKSSLEQPQMMVVSDLDD 76
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  580 MFVPLLDGFLVNVSESRAVITSLLDQIPEMFADTRETETVFAPVIQAGMEALKaaECAGKLFLFHTSLPIAEApGKLKNR 659
Cdd:cd01479    77 PFLPLPDGLLVNLKESRQVIEDLLDQIPEMFQDTKETESALGPALQAAFLLLK--ETGGKIIVFQSSLPTLGA-GKLKSR 153
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  660 DDRKLINTDKEKTLFQPQTGTYQTLAKECVAQGCCVDLFLFPNQYVDVATLSVVPQLTGGSVYKYAcfqvendqeRFLSD 739
Cdd:cd01479   154 EDPKLLSTDKEKQLLQPQTDFYKKLALECVKSQISVDLFLFSNQYVDVATLGCLSRLTGGQVYYYP---------SFNFS 224
                         250       260
                  ....*....|....*....|
gi 149031233  740 LRRDVQKVVGFDAVMRVRTS 759
Cdd:cd01479   225 APNDVEKLVNELARYLTRKI 244
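
The multi-domain layout described for Sec23/Sec24 can be compared with the Pfam hit intervals reported on this page. As a rough sketch, the Python below sorts those intervals along the query sequence, checks that they do not overlap, and reports the linker lengths between them. The interval values are copied from the hit rows on this page; the function names `ordered_architecture` and `linker_lengths` are hypothetical.

```python
# Intervals copied from the Pfam hit rows on this page (query residue numbering).
PFAM_HITS = {
    "zf-Sec23_Sec24 (pfam04810)": (423, 460),
    "Sec23_trunk (pfam04811)": (500, 744),
    "Sec23_BS (pfam08033)": (749, 832),
    "Sec23_helical (pfam04815)": (846, 944),
    "Gelsolin (pfam00626)": (962, 1037),
}


def ordered_architecture(hits):
    """Return (name, (start, end)) pairs sorted by start; raise if two overlap."""
    ordered = sorted(hits.items(), key=lambda item: item[1][0])
    for (name_a, (_, end_a)), (name_b, (start_b, _)) in zip(ordered, ordered[1:]):
        if start_b <= end_a:
            raise ValueError(f"{name_a} overlaps {name_b}")
    return ordered


def linker_lengths(ordered):
    """Number of residues left unannotated between consecutive hits."""
    return [
        nxt[1][0] - cur[1][1] - 1
        for cur, nxt in zip(ordered, ordered[1:])
    ]


architecture = ordered_architecture(PFAM_HITS)
for name, (start, end) in architecture:
    print(f"{start:>5}-{end:<5} {name}")
print("linkers:", linker_lengths(architecture))
```

Only the specific Pfam hits are used here because the broader multi-domain models (COG5028, PTZ00395, COG5047) span several of these regions at once and would trip the overlap check by design.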
Sec23_trunk pfam04811
Sec23/Sec24 trunk domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum ...
500-744 1.11e-115

Sec23/Sec24 trunk domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface.


Pssm-ID: 398467 [Multi-domain]  Cd Length: 241  Bit Score: 357.33  E-value: 1.11e-115
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   500 PSPPAFIFMIDVSYNAIRTGLVRLLCEELKSLLDYLPREggaeeSAIRVGFVTYNKVLHFYNVKSSLAQPQMMVVSDVAD 579
Cdd:pfam04811    1 PQPPVFLFVIDVSYNAIKSGLLAALKESLLQSLDLLPGD-----PRARVGFITFDSTVHFFNLGSSLRQPQMLVVSDLQD 75
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   580 MFVPLLDGFLVNVSESRAVITSLLDQIPEMFADTRETETVFAPVIQAGMEALKAAECAGKLFLFHTSLPIAEAPGKLKNR 659
Cdd:pfam04811   76 MFLPLPDRFLVPLSECRFVLEDLLEQLPPMFPVTKRPERCLGPALQAAFLLLKAAFTGGKIMVFQGGLPTVGPGGKLKSR 155
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   660 DDRKLINTDKEKTLFQPQT-GTYQTLAKECVAQGCCVDLFLFPNQYVDVATLSVVPQLTGGSVYKYACFQVENDQERFLS 738
Cdd:pfam04811  156 LDESHHGTDKEKAKLVKKAdKFYKSLAKECVKQGHSVDLFAFSLDYVDVATLGQLSRLTGGQVYLYPSFQADVDGSKFKQ 235

                   ....*.
gi 149031233   739 DLRRDV 744
Cdd:pfam04811  236 DLQRYF 241
trunk_domain cd01468
trunk domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi ...
500-742 2.44e-103

trunk domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface. Some members of this family possess a partial MIDAS motif that is a characteristic feature of most vWA domain proteins.


Pssm-ID: 238745 [Multi-domain]  Cd Length: 239  Bit Score: 324.20  E-value: 2.44e-103
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  500 PSPPAFIFMIDVSYNAIRTGLVRLLCEELKSLLDYLPREGGAeesaiRVGFVTYNKVLHFYNVKSSLAQPQMMVVSDVAD 579
Cdd:cd01468     1 PQPPVFVFVIDVSYEAIKEGLLQALKESLLASLDLLPGDPRA-----RVGLITYDSTVHFYNLSSDLAQPKMYVVSDLKD 75
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  580 MFVPLLDGFLVNVSESRAVITSLLDQIPEMFAD--TRETETVFAPVIQAGMEALKAAECAGKLFLFHTSLPIAEaPGKLK 657
Cdd:cd01468    76 VFLPLPDRFLVPLSECKKVIHDLLEQLPPMFWPvpTHRPERCLGPALQAAFLLLKGTFAGGRIIVFQGGLPTVG-PGKLK 154
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  658 NRDDRKLINTDKEKTLFQPQTGTYQTLAKECVAQGCCVDLFLFPNQYVDVATLSVVPQLTGGSVYKYACFQVENDQERFL 737
Cdd:cd01468   155 SREDKEPIRSHDEAQLLKPATKFYKSLAKECVKSGICVDLFAFSLDYVDVATLKQLAKSTGGQVYLYDSFQAPNDGSKFK 234

                  ....*
gi 149031233  738 SDLRR 742
Cdd:cd01468   235 QDLQR 239
PTZ00395 PTZ00395
Sec24-related protein; Provisional
19-1092 5.66e-49

Sec24-related protein; Provisional


Pssm-ID: 185594 [Multi-domain]  Cd Length: 1560  Bit Score: 190.67  E-value: 5.66e-49
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   19 IYPGYHqsnyGGQPGPAAPATPYGAYNGPVPG--YQQAPP--QGVPRAPPCSGAPPASAAQVPCGQTTYGQfgqgdiqng 94
Cdd:PTZ00395  338 IYGGFH----DGSPNAASAGAPFNGLGNQADGghINQVHPdaRGAWAGGPHSNASYNCAAYSNAAQSNAAQ--------- 404
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   95 psSTAQMPRVPGSQQfGPPLAPVVSQPAVLQPYGPPPTSTQVTAQlaamqisgavaqaPPPSGlgygPPTSlasasgNFP 174
Cdd:PTZ00395  405 --SNAGFSNAGYSNP-GNSNPGYNNAPNSNTPYNNPPNSNTPYSN-------------PPNSN----PPYS------NLP 458
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  175 NSG-PYSTYPQSQAPPlSQAQGHPGVqpplrsappLASSFTSPASGGPRMPSMPGPLPPGQGF-GSLPVSQANRVSSPPA 252
Cdd:PTZ00395  459 YSNtPYSNAPLSNAPP-SSAKDHHSA---------YHAAYQHRAANQPAANLPTANQPAANNFhGAAGNSVGNPFASRPF 528
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  253 HALPPGTQMTGPPAPPPPMHSPQQPG-----YQLQQNGSFGPARGPQPNYESPYPGAPTFGTQPGPPQPLPPKRLDPDAI 327
Cdd:PTZ00395  529 GSAPYGGNAATTADPNGIAKREDHPEggtnrQKYEQSDEESVESSSSENSSENENEVTDKGEEIYSLLKKTINRIDMNKI 608
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  328 PSPIQVIEDDRNNRGSEPFVTgVRGQVPPLVTTNFLVKDQGNASPRYIRCTSYNIPCTSDMAKQAQVPLAAVIKPLARLP 407
Cdd:PTZ00395  609 PRPIINTQEKKKKKNLKVFET-CKYISPPSYYQPYISIDTGKADPRFLKSTLYQIPLFSETLKLSQIPFGIIVNPFACLN 687
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  408 PEEASPYV-----VDHGESGP--LRCNRCKAYMcpLMTFIEG-GRRFQCSFCSC---IND-------------------- 456
Cdd:PTZ00395  688 EGEGIDKIdmkdiINDKEENIeiLRCPKCLGYL--HATILEDiSSSVQCVFCDTdflINEnvlfdifqynekighkesdh 765
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  457 ------------------VPPQYFQHLD-------HTGKRV--------------------------------------- 472
Cdd:PTZ00395  766 nehgnslspllkgsvdiiIPPIYYHNVNkfkltytYLNKNInqtafmitnkimsftkhisnslvandskggnkatsasaf 845
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  473 -DAYDRPELSLGSY--------------------------------------------EFLATVD------------YCK 495
Cdd:PTZ00395  846 gDSGDANFLAGGGYtnyggaggyntydnqsgynnhdvvnnrggsgagnhlygkdhdvqNFDNVMDnanftihdmknlICE 925
                         650       660       670       680       690       700       710       720
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  496 NN---------------KFPS-----PPAFIFMIDVSYNAIRTGLVRLLCEELKSLLDYL--PReggaeesaIRVGFVTY 553
Cdd:PTZ00395  926 KNgepdsakirrnsflaKYPQvknmlPPYFVFVVECSYNAIYNNITYTILEGIRYAVQNVkcPQ--------TKIAIITF 997
                         730       740       750       760       770       780       790       800
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  554 NKVLHFYNVKSSLAQP-------------QMMVVSDVADMFVPL-LDGFLVNVSESRAVITSLLDQIPEMFADTRETETV 619
Cdd:PTZ00395  998 NSSIYFYHCKGGKGVSgeegdggggsgnhQVIVMSDVDDPFLPLpLEDLFFGCVEEIDKINTLIDTIKSVSTTMQSYGSC 1077
                         810       820       830       840       850       860       870       880
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  620 FAPVIQAGMEALKAAECAGKLFLFHTSLPIAeAPGKLKnrddrKLINTDKEKTLFQPQTGTYQTLAKECVAQGCCVDLFL 699
Cdd:PTZ00395 1078 GNSALKIAMDMLKERNGLGSICMFYTTTPNC-GIGAIK-----ELKKDLQENFLEVKQKIFYDSLLLDLYAFNISVDIFI 1151
                         890       900       910       920       930       940       950       960
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  700 FP--NQYVDVATLSVVPQLTGGSVYKYACFQVEND-QERFLSDLRRDVQKVVGFDAVMRVRTSTGIRAVDFFGAFYMSNT 776
Cdd:PTZ00395 1152 ISsnNVRVCVPSLQYVAQNTGGKILFVENFLWQKDyKEIYMNIMDTLTSEDIAYCCELKLRYSHHMSVKKLFCCNNNFNS 1231
                         970       980       990      1000      1010      1020      1030      1040
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  777 T----DVELAGLDGDKTVTVEFKHDDRLNEESGALLQCALLYTSCAGQRRLRIHNLALNCCTQLADLYRNCETDTLINYM 852
Cdd:PTZ00395 1232 IisvdTIKIPKIRHDQTFAFLLNYSDISESKKQIYFQCACIYTNLWGDRFVRLHTTHMNLTSSLSTVFRYTDAEALMNIL 1311
                        1050      1060      1070      1080      1090      1100      1110      1120
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  853 AKFAYRAVVSSpvKTVRDTLITQCAQILACYRKNCASPSSAGQLILPECMKLLPVYLNCVLKSDVLQpgAEVTTDDRAYV 932
Cdd:PTZ00395 1312 IKQLCTNILHN--DNYSKIIIDNLAAILFSYRINCASSAHSGQLILPDTLKLLPLFTSSLLKHNVTK--KEILHDLKVYS 1387
                        1130      1140      1150      1160      1170      1180      1190      1200
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  933 RQLVSSMDVAETNVFFYPRLLPL----TKSPLDSTTE------PPAVRASEERLSSGDIYLLENGLNLFVWVGASVQQGV 1002
Cdd:PTZ00395 1388 LIKLLSMPIISSLLYVYPVMYVIhikgKTNEIDSMDVdddlfiPKTIPSSAEKIYSNGIYLLDACTHFYLYFGFHSDANF 1467
                        1210      1220      1230      1240      1250      1260      1270      1280
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233 1003 VQSLFNVSSFSQITSGLSvlpVLDNPLSKKVRGLIDTLRA--QRTRYMKLIVVKQEDKLEMLFKHFLVEDKSlSGGASYV 1080
Cdd:PTZ00395 1468 AKEIVGDIPTEKNAHELN---LTDTPNAQKVQRIIKNLSRihHFNKYVPLVMVAPKSNEEEHLISLCVEDKA-DKEYSYV 1543
                        1290
                  ....*....|..
gi 149031233 1081 DFLCHMHKEIRQ 1092
Cdd:PTZ00395 1544 NFLCFIHKLVHK 1555
Sec23_helical pfam04815
Sec23/Sec24 helical domain; COPII-coated vesicles carry proteins from the endoplasmic ...
846-944 6.73e-35

Sec23/Sec24 helical domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is composed of five alpha helices.


Pssm-ID: 461441 [Multi-domain]  Cd Length: 103  Bit Score: 128.39  E-value: 6.73e-35
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   846 DTLINYMAKFAYRAVVSSPVKTVRDTLITQCAQILACYRKNCASPSSAGQLILPECMKLLPVYLNCVLKSDVLQPGAEVT 925
Cdd:pfam04815    3 EAIAVLLAKKAVEKALSSSLSDAREALDNKLVDILAAYRKYCASSSSPGQLILPESLKLLPLYMLALLKSPALRGGNSSP 82
                           90
                   ....*....|....*....
gi 149031233   926 TDDRAYVRQLVSSMDVAET 944
Cdd:pfam04815   83 SDERAYARHLLLSLPVEEL 101
Sec23_BS pfam08033
Sec23/Sec24 beta-sandwich domain;
749-832 3.62e-28

Sec23/Sec24 beta-sandwich domain;


Pssm-ID: 429794 [Multi-domain]  Cd Length: 86  Bit Score: 108.78  E-value: 3.62e-28
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   749 GFDAVMRVRTSTGIRAVDFFGAFYMSNTTD-VELAGLDGDKTVTVEFKHDDRLNEESGALLQCALLYTSCAGQRRLRIHN 827
Cdd:pfam08033    1 GFNAVLRVRTSKGLKVSGFIGNFVSRSSGDtWKLPSLDPDTSYAFEFDIDEPLPNGSNAYIQFALLYTHSSGERRIRVTT 80

                   ....*
gi 149031233   828 LALNC 832
Cdd:pfam08033   81 VALPV 85
PLN00162 PLN00162
transport protein sec23; Provisional
379-721 3.54e-17

transport protein sec23; Provisional


Pssm-ID: 215083 [Multi-domain]  Cd Length: 761  Bit Score: 86.92  E-value: 3.54e-17
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  379 SYNI-PCTSDMAKQAQVPLAAVIKPLARLPPEEASPYvvdhgesGPLRCNRCKAYMCPLMTFIEGGRRFQCSFCSCINDV 457
Cdd:PLN00162   15 SWNVwPSSKIEASKCVIPLAALYTPLKPLPELPVLPY-------DPLRCRTCRAVLNPYCRVDFQAKIWICPFCFQRNHF 87
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  458 PPQYF----QHLDhtgkrvdaydrPELslgsYEFLATVDY---CKNNKFPSPPAFIFMIDVSynAIRTGLvRLLCEELKS 530
Cdd:PLN00162   88 PPHYSsiseTNLP-----------AEL----FPQYTTVEYtlpPGSGGAPSPPVFVFVVDTC--MIEEEL-GALKSALLQ 149
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  531 LLDYLPreggaeESAiRVGFVTY----------------------------NKVLHFYNVKSSLAQPQMMVVSDVADMFV 582
Cdd:PLN00162  150 AIALLP------ENA-LVGLITFgthvhvhelgfsecsksyvfrgnkevskDQILEQLGLGGKKRRPAGGGIAGARDGLS 222
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  583 PL-LDGFLVNVSESRAVITSLLDQI-PEMF---ADTRETE-TVFAPVIQAGMEALKAAECAGKLFLFhTSLPIAEAPGKL 656
Cdd:PLN00162  223 SSgVNRFLLPASECEFTLNSALEELqKDPWpvpPGHRPARcTGAALSVAAGLLGACVPGTGARIMAF-VGGPCTEGPGAI 301
                         330       340       350       360       370       380       390
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  657 KNRDDRKLINTDKE-----KTLFQPQTGTYQTLAKECVAQGCCVDLFLFPNQYVDVATLSVVPQLTGGSV 721
Cdd:PLN00162  302 VSKDLSEPIRSHKDldkdaAPYYKKAVKFYEGLAKQLVAQGHVLDVFACSLDQVGVAEMKVAVERTGGLV 371
zf-Sec23_Sec24 pfam04810
Sec23/Sec24 zinc finger; COPII-coated vesicles carry proteins from the endoplasmic reticulum ...
423-460 5.11e-15

Sec23/Sec24 zinc finger; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is a zinc-binding domain.


Pssm-ID: 461437 [Multi-domain]  Cd Length: 38  Bit Score: 69.78  E-value: 5.11e-15
                           10        20        30
                   ....*....|....*....|....*....|....*...
gi 149031233   423 PLRCNRCKAYMCPLMTFIEGGRRFQCSFCSCINDVPPQ 460
Cdd:pfam04810    1 PVRCRRCRAYLNPFCQFDFGGKKWTCNFCGTRNPVPPE 38
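
The paired `gi` and `Cdd` rows can be compared column by column. As a minimal sketch, assuming that `-` marks a gap and that lowercase letters mark residues left unaligned (an assumption about this display, not something stated on the page), the Python below counts identical positions in the zinc finger alignment above. The function name `count_identities` is hypothetical.

```python
def count_identities(query_row, model_row):
    """Return (identical, aligned) column counts for two equal-length rows.

    A column counts as aligned only when both rows hold an uppercase letter;
    gaps ('-') and lowercase residues are skipped (assumed display convention).
    """
    if len(query_row) != len(model_row):
        raise ValueError("alignment rows must have equal length")
    identical = aligned = 0
    for q, m in zip(query_row, model_row):
        if q.isalpha() and q.isupper() and m.isalpha() and m.isupper():
            aligned += 1
            if q == m:
                identical += 1
    return identical, aligned


# Rows copied from the pfam04810 alignment above (query residues 423-460).
QUERY = "PLRCNRCKAYMCPLMTFIEGGRRFQCSFCSCINDVPPQ"
MODEL = "PVRCRRCRAYLNPFCQFDFGGKKWTCNFCGTRNPVPPE"

identical, aligned = count_identities(QUERY, MODEL)
print(f"{identical}/{aligned} identical ({identical / aligned:.1%})")
```

For these two rows the count is 18 identical positions out of 38 aligned columns. Identity against a model row is only a rough guide, since the reported bit score comes from a position-specific scoring matrix rather than from simple residue matches.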
SEC23 COG5047
Vesicle coat complex COPII, subunit SEC23 [Intracellular trafficking and secretion];
375-1000 6.34e-15

Vesicle coat complex COPII, subunit SEC23 [Intracellular trafficking and secretion];


Pssm-ID: 227380 [Multi-domain]  Cd Length: 755  Bit Score: 79.54  E-value: 6.34e-15
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  375 IRCTSYNIPCTSDMAKQAQVPLAAVIKPLARLPPEEASPYvvdhgesGPLRCNR-CKAYMCPLMTFIEGGRRFQCSFCSC 453
Cdd:COG5047    12 IRLTWNVFPATRGDATRTVIPIACLYTPLHEDDALTVNYY-------EPVKCTApCKAVLNPYCHIDERNQSWICPFCNQ 84
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  454 INDVPPQYfqhldhtgkrvDAYDRPELSLGSYEFLATVDYCKNNKFPSPPAFIFMIDVSYNAIRtglVRLLCEELKSLLD 533
Cdd:COG5047    85 RNTLPPQY-----------RDISNANLPLELLPQSSTIEYTLSKPVILPPVFFFVVDACCDEEE---LTALKDSLIVSLS 150
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  534 YLPREggaeesAIrVGFVTYNKVLHFYNVkSSLAQPQMMVVSDVADMFVPLLD--------------------------- 586
Cdd:COG5047   151 LLPPE------AL-VGLITYGTSIQVHEL-NAENHRRSYVFSGNKEYTKENLQellalskptksggfeskisgigqfass 222
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  587 GFLVNVSESRAVITSLLDQI-PEMF---ADTRETE-TVFAPVIQAGMEALKAAECAGKLFLFhTSLPIAEAPGKLKNRDD 661
Cdd:COG5047   223 RFLLPTQQCEFKLLNILEQLqPDPWpvpAGKRPLRcTGSALNIASSLLEQCFPNAGCHIVLF-AGGPCTVGPGTVVSTEL 301
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  662 RK------LINTDKEKtLFQPQTGTYQTLAKECVAQGCCVDLFLFPNQYVDVATLSVVPQLTGGSVYKYACFQVENDQER 735
Cdd:COG5047   302 KEpmrshhDIESDSAQ-HSKKATKFYKGLAERVANQGHALDIFAGCLDQIGIMEMEPLTTSTGGALVLSDSFTTSIFKQS 380
                         410       420       430       440       450       460       470       480
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  736 FLSDLRRDVQK--VVGFDAVMRVRTSTGIRAVDFFG---------------AFYMSNTTDVELAGLDGDKTVTVEFKHDD 798
Cdd:COG5047   381 FQRIFNRDSEGylKMGFNANMEVKTSKNLKIKGLIGhavsvkkkannisdsEIGIGATNSWKMASLSPKSNYALYFEIAL 460
                         490       500       510       520       530       540       550       560
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  799 RLNEESG-----ALLQCALLYTSCAGQRRLRIHNLALNCCTQLADL-YRNCETDTLINYMAKFA-YRAVVSSPVKTVR-- 869
Cdd:COG5047   461 GAASGSAqrpaeAYIQFITTYQHSSGTYRIRVTTVARMFTDGGLPKiNRSFDQEAAAVFMARIAaFKAETEDIIDVFRwi 540
                         570       580       590       600       610       620       630       640
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  870 -DTLITQCaQILACYRKNcaSPSSAGqliLPECMKLLPVYLNCVLKSDVLQPGAEvTTDDRAYVRQLVSSMDVAETNVFF 948
Cdd:COG5047   541 dRNLIRLC-QKFADYRKD--DPSSFR---LDPNFTLYPQFMYHLRRSPFLSVFNN-SPDETAFYRHMLNNADVNDSLIMI 613
                         650       660       670       680       690
                  ....*....|....*....|....*....|....*....|....*....|....*...
gi 149031233  949 YPRLLPLT--KSP----LDSTTEPPAVraseerlssgdIYLLENGLNLFVWVGASVQQ 1000
Cdd:COG5047   614 QPTLQSYSfeKGGvpvlLDSVSVKPDV-----------ILLLDTFFHILIFHGSYIAQ 660
PHA03247 PHA03247
large tegument protein UL36; Provisional
7-305 1.73e-13

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 75.75  E-value: 1.73e-13
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    7 APPVPPFGQNQPIYPGYHQSNYGGQPGPAAPATPYGAYNGPVPGYQQAP--PQGVPRAPPCSGAPPASAAQVPCGqttyg 84
Cdd:PHA03247 2703 PPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPggPARPARPPTTAGPPAPAPPAAPAA----- 2777
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   85 qfgqgdiqnGPSSTAQMPRVpGSQQFGPPLAPVVSQPAVLQPYGPPPTSTQVTAQLAAMQISGAVAQAPPPSGLGYGPPT 164
Cdd:PHA03247 2778 ---------GPPRRLTRPAV-ASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGPPP 2847
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  165 SLASASGNFPNSGPYSTYPQSQAPPLS-QAQGHPGVQPPLRSAPPLAS-SFTSPASGGPRMPSMPGPLPPGQGFGSLPVS 242
Cdd:PHA03247 2848 PSLPLGGSVAPGGDVRRRPPSRSPAAKpAAPARPPVRRLARPAVSRSTeSFALPPDQPERPPQPQAPPPPQPQPQPPPPP 2927
                         250       260       270       280       290       300
                  ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 149031233  243 QANRVSSPPAHALPPGTQMTGPPAPpppmhSPQQPGYQLQQNGSFGPARGPQPNYESPYPGAP 305
Cdd:PHA03247 2928 QPQPPPPPPPRPQPPLAPTTDPAGA-----GEPSGAVPQPWLGALVPGRVAVPRFRVPQPAPS 2985
PHA03247 PHA03247
large tegument protein UL36; Provisional
7-308 3.96e-13

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 74.59  E-value: 3.96e-13
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    7 APPVPPFGQNQPIYPGYHQSNYGGQPGPAAPATPYGAYNGPVPGYQQAPPQGVPRAPPCSGAPPASAAQVPCGQTTygqf 86
Cdd:PHA03247 2709 EPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLT---- 2784
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   87 gqgdIQNGPSSTAQMPRVPGSQQFGPPLAPVVSQPAVLQPYG------PPPTSTQVTAqlaamqisgavaqAPPPSGLgy 160
Cdd:PHA03247 2785 ----RPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAAspagplPPPTSAQPTA-------------PPPPPGP-- 2845
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  161 gPPTSLASASGNFPnSGPYSTYPQSQAPPlsqaqghpgVQPPLRSAPPlASSFTSPAsggPRMPSMPGPLPPgqgfgslP 240
Cdd:PHA03247 2846 -PPPSLPLGGSVAP-GGDVRRRPPSRSPA---------AKPAAPARPP-VRRLARPA---VSRSTESFALPP-------D 2903
                         250       260       270       280       290       300
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 149031233  241 VSQANRVSSPPAHALPPgTQMTGPPAPPPPMHSPQQPGYQLQQNGSFGPARGPQPNYESPYPGAPTFG 308
Cdd:PHA03247 2904 QPERPPQPQAPPPPQPQ-PQPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWLGALVPG 2970
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
5-305 1.32e-12

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 72.49  E-value: 1.32e-12
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233     5 QSAPPVPPfgqnQPIYPGYHQSNYGGqPGPAAPATPygayNGPVPGYQQAPPQGVPRAPPCS---GAPPASAAQVPCGQT 81
Cdd:pfam03154  177 QSGAASPP----SPPPPGTTQAATAG-PTPSAPSVP----PQGSPATSQPPNQTQSTAAPHTliqQTPTLHPQRLPSPHP 247
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    82 TYGQFGQG--DIQNGPSST------AQMPRVPGSQQFGPPLAPvvsQPAVLQPYGPPPTSTQvtaqlaamqisgavAQAP 153
Cdd:pfam03154  248 PLQPMTQPppPSQVSPQPLpqpslhGQMPPMPHSLQTGPSHMQ---HPVPPQPFPLTPQSSQ--------------SQVP 310
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   154 PPsglgygPPTSLASASGNFPNSGPYSTYPQSQAPPLSQA-----QGHPGVQPPLRSA-PPLASSFT---SPASGGPRMP 224
Cdd:pfam03154  311 PG------PSPAAPGQSQQRIHTPPSQSQLQSQQPPREQPlppapLSMPHIKPPPTTPiPQLPNPQShkhPPHLSGPSPF 384
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   225 SMPGPLPPG---QGFGSLPVSQANRVSSPPAHALPPGTQMTGPpappppmhsPQQPGYqLQQNGSFGPARGPQPNYESPY 301
Cdd:pfam03154  385 QMNSNLPPPpalKPLSSLSTHHPPSAHPPPLQLMPQSQQLPPP---------PAQPPV-LTQSQSLPPPAASHPPTSGLH 454

                   ....
gi 149031233   302 PGAP 305
Cdd:pfam03154  455 QVPS 458
Gelsolin pfam00626
Gelsolin repeat;
962-1037 5.82e-12

Gelsolin repeat;


Pssm-ID: 395501 [Multi-domain]  Cd Length: 76  Bit Score: 62.33  E-value: 5.82e-12
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 149031233   962 STTEPPAVRASEERLSSGDIYLLENGLNLFVWVGASVQQgvVQSLFNVSSFSQI-TSGLSVLPVLDN-PLSKKVRGLI 1037
Cdd:pfam00626    1 KFVLPPPVPLSQESLNSGDCYLLDNGFTIFLWVGKGSSL--LEKLFAALLAAQLdDDERFPLPEVIRvPQGKEPARFL 76
PHA03247 PHA03247
large tegument protein UL36; Provisional
7-356 3.56e-09

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 61.49  E-value: 3.56e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    7 APPVPPFGqnqpiypGYHQSNYGGQPGPAAPATPYGAYNGPVPG---------------YQQAPPQG-VPRAPPCSGAPP 70
Cdd:PHA03247 2623 APDPPPPS-------PSPAANEPDPHPPPTVPPPERPRDDPAPGrvsrprrarrlgraaQASSPPQRpRRRAARPTVGSL 2695
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   71 ASAAQVPCGQTTygqfgqgdiqNGPSSTAQMPRVPGSQqfGPPLAPVVSQPAVLQPYGPPPTSTQVT-----AQLAAMQI 145
Cdd:PHA03247 2696 TSLADPPPPPPT----------PEPAPHALVSATPLPP--GPAAARQASPALPAAPAPPAVPAGPATpggpaRPARPPTT 2763
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  146 SGAVAQAPPPSGLGYGPPTSLASASGNFPNSGPYSTYPQSQAPPLSQAQGHPGVQPPLRS-----APPLASSFTSPA-SG 219
Cdd:PHA03247 2764 AGPPAPAPPAAPAAGPPRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASpagplPPPTSAQPTAPPpPP 2843
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  220 GPRMPSMP--GPLPPGQGFGSLPVSQANrVSSPPAHALPPGTQMTGPPAPPPPMHSPQQPGYQLQQNGSFGPARgPQPNY 297
Cdd:PHA03247 2844 GPPPPSLPlgGSVAPGGDVRRRPPSRSP-AAKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPP-PQPQP 2921
                         330       340       350       360       370       380
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 149031233  298 ESPYPGAPTfgtqpgPPQPLPPK-------RLDPDAIPSPIQVIEDDRNNRGSEPFVTGVRGQVPP 356
Cdd:PHA03247 2922 QPPPPPQPQ------PPPPPPPRpqpplapTTDPAGAGEPSGAVPQPWLGALVPGRVAVPRFRVPQ 2981
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
119-305 4.88e-09

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly by altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is with the short glutamine repeat in the transcriptional coactivator CREB-binding protein (CBP). This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 60.55  E-value: 4.88e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   119 SQPAVLQPYGPPPtstqVTAQLAAMQISGAVAQAPPPSGLGYGPPTSlasasgnfPNSGPYSTYPQSQAPPLSQAQG--- 195
Cdd:pfam03154  169 TQPPVLQAQSGAA----SPPSPPPPGTTQAATAGPTPSAPSVPPQGS--------PATSQPPNQTQSTAAPHTLIQQtpt 236
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   196 ---------HPGVQPPLRSAPPL---ASSFTSPASGGPrMPSMPGPLPPG----------QGFGSLPVSQANRVSSPPAH 253
Cdd:pfam03154  237 lhpqrlpspHPPLQPMTQPPPPSqvsPQPLPQPSLHGQ-MPPMPHSLQTGpshmqhpvppQPFPLTPQSSQSQVPPGPSP 315
                          170       180       190       200       210
                   ....*....|....*....|....*....|....*....|....*....|..
gi 149031233   254 ALPPGTQMTgppappppmhsPQQPGYQLQqngsfgpARGPQPNYESPYPGAP 305
Cdd:pfam03154  316 AAPGQSQQR-----------IHTPPSQSQ-------LQSQQPPREQPLPPAP 349
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
29-240 1.14e-08

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 59.12  E-value: 1.14e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   29 GGQPGPAAPATpygAYNGPVPgyQQAPPQGVPR-APPCSGAPPASAAQVPCGQTTYGQFGQGDIQNGPSSTA-QMPRVPG 106
Cdd:PRK12323  366 GQSGGGAGPAT---AAAAPVA--QPAPAAAAPAaAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEAlAAARQAS 440
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  107 SQQFGPPLAPVVSQPAVLQPYGPPPTSTQVTAQLAAMQISGAVAQAPPPSGLGYGPPtSLASASGNFPNSGPYSTYPqSQ 186
Cdd:PRK12323  441 ARGPGGAPAPAPAPAAAPAAAARPAAAGPRPVAAAAAAAPARAAPAAAPAPADDDPP-PWEELPPEFASPAPAQPDA-AP 518
                         170       180       190       200       210
                  ....*....|....*....|....*....|....*....|....*....|....*.
gi 149031233  187 APPLSQAQGHPGVQPPLRSAPPLASSFTSPASGGPRMPSMP--GPLPPGQGFGSLP 240
Cdd:PRK12323  519 AGWVAESIPDPATADPDDAFETLAPAPAAAPAPRAAAATEPvvAPRPPRASASGLP 574
PHA03247 PHA03247
large tegument protein UL36; Provisional
8-330 1.26e-08

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 59.57  E-value: 1.26e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    8 PPVPPfGQNQPIYPGyhqSNYGGQPGPAAPATPYGAYNGPVPgyqqAPPQGVPRA--PPCSGAPPASAAQVPCGQTTYGQ 85
Cdd:PHA03247 2589 PDAPP-QSARPRAPV---DDRGDPRGPAPPSPLPPDTHAPDP----PPPSPSPAAnePDPHPPPTVPPPERPRDDPAPGR 2660
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   86 FG------QGDIQNGPSSTAQMPRVPGSQqfgPPLAPVVS----QPAVLQPYGPPPTSTQVTAQLAAMQISGAVAQAPPP 155
Cdd:PHA03247 2661 VSrprrarRLGRAAQASSPPQRPRRRAAR---PTVGSLTSladpPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPA 2737
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  156 SGLGYGPPTSLASASGNFPNSGPystyPQSQAPPLSQAQGHPGVQPPLRSAPPLASSFtspASGGPRMPSMPGPLPPGQG 235
Cdd:PHA03247 2738 APAPPAVPAGPATPGGPARPARP----PTTAGPPAPAPPAAPAAGPPRRLTRPAVASL---SESRESLPSPWDPADPPAA 2810
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  236 fgSLPVSQANRVSSPPAHALPPGTQMTGPPAPP-----PpmhSPQQPGYQLQQNGSFgpARGPQPNYESPYPGAPTFGTQ 310
Cdd:PHA03247 2811 --VLAPAAALPPAASPAGPLPPPTSAQPTAPPPppgppP---PSLPLGGSVAPGGDV--RRRPPSRSPAAKPAAPARPPV 2883
                         330       340
                  ....*....|....*....|
gi 149031233  311 PGPPQPLPPKRLDPDAIPSP 330
Cdd:PHA03247 2884 RRLARPAVSRSTESFALPPD 2903
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
4-296 2.47e-08

ARC105 or Med15 subunit of Mediator complex non-fungal; The approximately 70-residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 58.10  E-value: 2.47e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233     4 NQSAPPVPPFGQNQPIYPGYHQSNYGGQPGP---AAPATPYGAYNGPVPGYQQAPPQGVPRAPPCSGAP-------PASA 73
Cdd:pfam09606  112 QQMGGPGTASNLLASLGRPQMPMGGAGFPSQmsrVGRMQPGGQAGGMMQPSSGQPGSGTPNQMGPNGGPgqgqaggMNGG 191
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    74 AQVPCGQTTYGQFGQGdIQNGPSST-AQMPRVPGSQQFGPPLAPVVSQPAVLQPYGPP-PTSTQVTAQLAA---MQISGA 148
Cdd:pfam09606  192 QQGPMGGQMPPQMGVP-GMPGPADAgAQMGQQAQANGGMNPQQMGGAPNQVAMQQQQPqQQGQQSQLGMGInqmQQMPQG 270
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   149 VAQAPPPSGLG--YGPPTSLASASGNFPNSGPYSTYPQSQAPPLSQAQG--HPGVQP-----------PLRSAPPLASSF 213
Cdd:pfam09606  271 VGGGAGQGGPGqpMGPPGQQPGAMPNVMSIGDQNNYQQQQTRQQQQQQGgnHPAAHQqqmnqsvgqggQVVALGGLNHLE 350
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   214 TSPAS--GGPRMPSMpGPLPPGQGFGSLPVSQANRVSSPPAHALPPGTQmtgppappPPMHSPQQPGYQLQQNGSFG--- 288
Cdd:pfam09606  351 TWNPGnfGGLGANPM-QRGQPGMMSSPSPVPGQQVRQVTPNQFMRQSPQ--------PSVPSPQGPGSQPPQSHPGGmip 421

                   ....*....
gi 149031233   289 -PARGPQPN 296
Cdd:pfam09606  422 sPALIPSPS 430
PRK07003 PRK07003
DNA polymerase III subunit gamma/tau;
29-257 2.94e-08

DNA polymerase III subunit gamma/tau;


Pssm-ID: 235906 [Multi-domain]  Cd Length: 830  Bit Score: 57.94  E-value: 2.94e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   29 GGQPGPAAPATPYGAYNGPVP------GYQQAPPQGVPRAPPCSGAPPASAAQVPCGQTTYGQFGQGDIQNGPSSTAQMP 102
Cdd:PRK07003  365 GGAPGGGVPARVAGAVPAPGAraaaavGASAVPAVTAVTGAAGAALAPKAAAAAAATRAEAPPAAPAPPATADRGDDAAD 444
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  103 RVPGSQQFGPPLAPVVSQPAvlqPYGPPPTSTQVTAQLAAMQISGAVAQAPPPSGLGYGPPTSLASASGNFPNSGPYSTY 182
Cdd:PRK07003  445 GDAPVPAKANARASADSRCD---ERDAQPPADSGSASAPASDAPPDAAFEPAPRAAAPSAATPAAVPDARAPAAASREDA 521
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  183 PQSQAPPLSQA-QGHPGV-QPPLRS----------------------------APPLASSFTSPASGGPRmPSMPGPLP- 231
Cdd:PRK07003  522 PAAAAPPAPEArPPTPAAaAPAARAggaaaaldvlrnagmrvssdrgaraaaaAKPAAAPAAAPKPAAPR-VAVQVPTPr 600
                         250       260       270
                  ....*....|....*....|....*....|....
gi 149031233  232 --------PGQGFGSLPVSQANRVSSPPAHALPP 257
Cdd:PRK07003  601 araatgdaPPNGAARAEQAAESRGAPPPWEDIPP 634
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
5-306 4.95e-08

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly by altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is with the short glutamine repeat in the transcriptional coactivator CREB-binding protein (CBP). This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 57.47  E-value: 4.95e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233     5 QSAPPVPPFGQN--QPIYPGYHQsnyggqpGPAAPAtPYGAYNGPVPGYQQAPPQGVPRAPPCSGA--PPASAAQVPcgq 80
Cdd:pfam03154  250 QPMTQPPPPSQVspQPLPQPSLH-------GQMPPM-PHSLQTGPSHMQHPVPPQPFPLTPQSSQSqvPPGPSPAAP--- 318
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    81 ttyGQFGQGDIQNGPSSTAQMPRVPGSQqfgpPLAPVvsqPAVLQPYGPPPTSTqvTAQLAAMQISGAVAQAPPPSGLGY 160
Cdd:pfam03154  319 ---GQSQQRIHTPPSQSQLQSQQPPREQ----PLPPA---PLSMPHIKPPPTTP--IPQLPNPQSHKHPPHLSGPSPFQM 386
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   161 G----PPTSLASASgNFPNSGPYSTYPqsqaPPLSQAQGHPGVQPPLRSAPPLASSFTSPASGGprmpSMPGPLPPGQGF 236
Cdd:pfam03154  387 NsnlpPPPALKPLS-SLSTHHPPSAHP----PPLQLMPQSQQLPPPPAQPPVLTQSQSLPPPAA----SHPPTSGLHQVP 457
                          250       260       270       280       290       300       310
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   237 GSLPVSQANRVSSPPAHALPPGTqmtgppapPPPMHSPQQPGYQlqqngsfgPARGPQPNYESPYPGAPT 306
Cdd:pfam03154  458 SQSPFPQHPFVPGGPPPITPPSG--------PPTSTSSAMPGIQ--------PPSSASVSSSGPVPAAVS 511
PHA03377 PHA03377
EBNA-3C; Provisional
21-224 9.16e-08

EBNA-3C; Provisional


Pssm-ID: 177614 [Multi-domain]  Cd Length: 1000  Bit Score: 56.60  E-value: 9.16e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   21 PGYHQSNYGGQPGPAAPATPYGAYNGPVP------GYQQAPPQGVPRAP-PCSGAPPASAAQVPCGQTTYgqfgqgdiqn 93
Cdd:PHA03377  741 PPSHQAPYSGHEEPQAQQAPYPGYWEPRPpqapylGYQEPQAQGVQVSSyPGYAGPWGLRAQHPRYRHSW---------- 810
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   94 gpsstAQMPRVPGsqqFGPPLAPVVSQPAVLQPYGPPptstqvTAQLAAMQISGAVAQAPPPsglgyGPPTSLASASGNF 173
Cdd:PHA03377  811 -----AYWSQYPG---HGHPQGPWAPRPPHLPPQWDG------SAGHGQDQVSQFPHLQSET-----GPPRLQLSQVPQL 871
                         170       180       190       200       210
                  ....*....|....*....|....*....|....*....|....*....|....*....
gi 149031233  174 PNSGPYSTypqSQAPPLSQAQGHPGVQP-PLRSAPP-------LASSFTSPASGGPRMP 224
Cdd:PHA03377  872 PYSQTLVS---SSAPSWSSPQPRAPIRPiPTRFPPPpmplqdsMAVGCDSSGTACPSMP 927
Retinal pfam15449
Retinal protein; This family of proteins is found in the photoreceptor cells of the retina. ...
15-251 1.81e-07

Retinal protein; This family of proteins is found in the photoreceptor cells of the retina. Mutations of the gene encoding this protein have been associated with retinal disorders such as retinitis pigmentosa and late-onset progressive retinal atrophy. The function of this family of proteins is unknown, but it is likely to be important in the development and function of the retina.


Pssm-ID: 464722 [Multi-domain]  Cd Length: 1293  Bit Score: 55.55  E-value: 1.81e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    15 QNQPIYPGYHQSNYGGQPGPAAPAT--PYGAYNGPvpgyqQAPPQGVPRAPPCS-GAPPASAAQVPcgQTTYGQFGQgdi 91
Cdd:pfam15449  967 QPRKAIPWHHSSHTSGQSRTSEPSLarPTRGPHSP-----EAPRQSQERSPPLVrKASPTRAHWAP--RADKRHPSL--- 1036
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    92 qngPSS--TAQmPRVPGSQ-QFGPPLAPVVSQPAVLQPYGPPPTSTQVTAQLAAMQISGAVAQAPPPSGLGYGPPTSLAS 168
Cdd:pfam15449 1037 ---PSShrPAQ-PSLPTVQrSPSPPLSPRAPSPPRSPRVLSPPTSKKRTSPPPQHKLPSPPPESPPAQHKLSSPPTQRTE 1112
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   169 ASGnfPNSGPystypqSQAPPLSQAQGHPGV------QPPLRSAPPLASSFTSPASG------GPRMPSMPGPLP-PGQG 235
Cdd:pfam15449 1113 ASS--PSSGP------SPSPPTSPSQGHKETrdsedsQAATAKASGNTCSIFCPATSslfeakSPFSTAHPLLPPeAGGP 1184
                          250
                   ....*....|....*.
gi 149031233   236 FGSLPVSQanRVSSPP 251
Cdd:pfam15449 1185 LETPAGCW--RSSSGP 1198
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
71-308 5.55e-07

ARC105 or Med15 subunit of Mediator complex non-fungal; The approximately 70-residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 53.86  E-value: 5.55e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    71 ASAAQVPCGQTTYGQFGQGD---------IQNGPSstaQMPRVPGSQQFGP-PLAPVVSQpavlqpYGPPPTSTQVTAQL 140
Cdd:pfam09606   57 AAQQQQPQGGQGNGGMGGGQqgmpdpinaLQNLAG---QGTRPQMMGPMGPgPGGPMGQQ------MGGPGTASNLLASL 127
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   141 --AAMQISGA--------VAQAPPPSGLGYGPPTSLASASGNFPNS-GPYSTYPQSQAP-PLSQAQGHPGVQPPLRSAPP 208
Cdd:pfam09606  128 grPQMPMGGAgfpsqmsrVGRMQPGGQAGGMMQPSSGQPGSGTPNQmGPNGGPGQGQAGgMNGGQQGPMGGQMPPQMGVP 207
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   209 LASSFTSPASGGPRMPSMPGPLPPGQGFGslpvSQANRvsspPAHALPPGTQMTgPPAPPPPMHSPQQ-PGYQLQQNGSF 287
Cdd:pfam09606  208 GMPGPADAGAQMGQQAQANGGMNPQQMGG----APNQV----AMQQQQPQQQGQ-QSQLGMGINQMQQmPQGVGGGAGQG 278
                          250       260
                   ....*....|....*....|.
gi 149031233   288 GPARGPQPNYESPYPGAPTFG 308
Cdd:pfam09606  279 GPGQPMGPPGQQPGAMPNVMS 299
PHA03378 PHA03378
EBNA-3B; Provisional
8-295 6.99e-07

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 53.53  E-value: 6.99e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    8 PPVPPFGQNQPIYPGYHQSNYGGQPGPAAPATPYGAYNGPVPGYQQAPPQgvPRAPPCSGAPPASAAQvpCGQTTYGQFG 87
Cdd:PHA03378  526 PPSPPQPRAGRRAPCVYTEDLDIESDEPASTEPVHDQLLPAPGLGPLQIQ--PLTSPTTSQLASSAPS--YAQTPWPVPH 601
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   88 QGDIQNGPSSTAQMPRVPGSQQFGPPLAPVVSQPAVLQPYGPPPTSTQVTAQLAAMQISGAVAQAPPPSGLGYGPPTSLA 167
Cdd:PHA03378  602 PSQTPEPPTTQSHIPETSAPRQWPMPLRPIPMRPLRMQPITFNVLVFPTPHQPPQVEITPYKPTWTQIGHIPYQPSPTGA 681
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  168 SASgNFPNSGPYSTYPQSQAP-PLSQAQGHPGVQPPLRSAPPLASSFTSPASGGPRMPSMPGPLPPGQGfgslPVSQANR 246
Cdd:PHA03378  682 NTM-LPIQWAPGTMQPPPRAPtPMRPPAAPPGRAQRPAAATGRARPPAAAPGRARPPAAAPGRARPPAA----APGRARP 756
                         250       260       270       280
                  ....*....|....*....|....*....|....*....|....*....
gi 149031233  247 VSSPPAHALPPGTQmTGPPAPPPPMHSPQQPgyqlQQNGSFGPARGPQP 295
Cdd:PHA03378  757 PAAAPGRARPPAAA-PGAPTPQPPPQAPPAP----QQRPRGAPTPQPPP 800
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
29-224 7.83e-07

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 53.45  E-value: 7.83e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   29 GGQPGPAAPATPYGAYNGPVPGYQQAPPQGVPRAPPCSGAPPASAAQVPCGQTTYGQFGQGDIQNGPSSTAQMPRVPGSQ 108
Cdd:PRK07764  595 AGGEGPPAPASSGPPEEAARPAAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWPAKAG 674
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  109 QFGPPLAPVVSQPAVLQPYGPPPTSTQVTAQLAAMQISGAVAQAPPPSglgyGPPTSLASASGNFPNSGPYSTYPQSQAP 188
Cdd:PRK07764  675 GAAPAAPPPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPP----QAAQGASAPSPAADDPVPLPPEPDDPPD 750
                         170       180       190
                  ....*....|....*....|....*....|....*.
gi 149031233  189 PLSQAQGHPGVQPPLRSAPPLASSFTSPASGGPRMP 224
Cdd:PRK07764  751 PAGAPAQPPPPPAPAPAAAPAAAPPPSPPSEEEEMA 786
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
6-306 9.03e-07

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 53.25  E-value: 9.03e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    6 SAPPVPPFGQNQPIYPGyhQSNYGGQPGPAAPATPYGAYNGPVPGY---QQAPPQGVPRAPPCSGAPPASAAQVPCGQTT 82
Cdd:PHA03307   91 SLSTLAPASPAREGSPT--PPGPSSPDPPPPTPPPASPPPSPAPDLsemLRPVGSPGPPPAASPPAAGASPAAVASDAAS 168
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   83 YGQfgQGDIQNGPSSTAQMPRVPgsqqfgPPLAPVVSQPAVLQPYGPPPTSTQVTAQLAAmQISGAVAQAPPPSGLGYGP 162
Cdd:PHA03307  169 SRQ--AALPLSSPEETARAPSSP------PAEPPPSTPPAAASPRPPRRSSPISASASSP-APAPGRSAADDAGASSSDS 239
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  163 PTSLASASGNFPNSGPYSTYPQSQAPPLSQAQGHPGVQPPLRsAPPLASSFTSPASGGPRMPSMPG-PLPPGQGFGSL-- 239
Cdd:PHA03307  240 SSSESSGCGWGPENECPLPRPAPITLPTRIWEASGWNGPSSR-PGPASSSSSPRERSPSPSPSSPGsGPAPSSPRASSss 318
                         250       260       270       280       290       300
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 149031233  240 -PVSQANRVSSPPAHAlPPGTQMTGPPAPPPPMHSPQQPgyqlqqNGSFGPARGPQPNYESPYPGAPT 306
Cdd:PHA03307  319 sSSRESSSSSTSSSSE-SSRGAAVSPGPSPSRSPSPSRP------PPPADPSSPRKRPRPSRAPSSPA 379
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
65-257 9.60e-07

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 53.07  E-value: 9.60e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   65 CSGAPPASAAQvPCGQTTYGQFGQGDIQNGPSSTAQMPRVPGSQQFGPPLAPVVSQPAVLQPYGPPPTSTQVTAQLAAMQ 144
Cdd:PRK07764  586 AVVGPAPGAAG-GEGPPAPASSGPPEEAARPAAPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASD 664
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  145 -ISGAVAQAPPPSGLGYGPPTSLASASGNFPNSGPYSTYPQSQAPPLSQAQghpgvQPPLRSAPPLASSFTSPASGGPRM 223
Cdd:PRK07764  665 gGDGWPAKAGGAAPAAPPPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQAD-----DPAAQPPQAAQGASAPSPAADDPV 739
                         170       180       190
                  ....*....|....*....|....*....|....
gi 149031233  224 PSMPGPLPPGQGFGSLPVSQANRVSSPPAHALPP 257
Cdd:PRK07764  740 PLPPEPDDPPDPAGAPAQPPPPPAPAPAAAPAAA 773
PHA03247 PHA03247
large tegument protein UL36; Provisional
29-329 1.22e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 53.02  E-value: 1.22e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   29 GGQPGPAAPatpygayngPVPGYQQapPQGVPRAPPCSGAPPASAAQVpcgqTTYGQFGQGDIQNGPSSTAQMPRVPGSQ 108
Cdd:PHA03247 2501 GGPPDPDAP---------PAPSRLA--PAILPDEPVGEPVHPRMLTWI----RGLEELASDDAGDPPPPLPPAAPPAAPD 2565
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  109 QFGPP--LAPVVSQPAV----LQPYGPPPTSTQVTAQLAAMQISGAVAQAP-PPSGLGYGPPTSLASASGNFPNSGPYST 181
Cdd:PHA03247 2566 RSVPPprPAPRPSEPAVtsraRRPDAPPQSARPRAPVDDRGDPRGPAPPSPlPPDTHAPDPPPPSPSPAANEPDPHPPPT 2645
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  182 YPQSQAPPLSQAQGHpgVQPPLRSAPPLASSFTSPASGGPRMPSMPGPLPPGQGFGSLPVSQANRVSSP----PAHALPP 257
Cdd:PHA03247 2646 VPPPERPRDDPAPGR--VSRPRRARRLGRAAQASSPPQRPRRRAARPTVGSLTSLADPPPPPPTPEPAPhalvSATPLPP 2723
                         250       260       270       280       290       300       310
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 149031233  258 GTQMTGPPAPPPpmhsPQQPGYQLQQNGSFGPArGPQPNYESPYPGAPTFGTQPGPPQPLPPKRLDPDAIPS 329
Cdd:PHA03247 2724 GPAAARQASPAL----PAAPAPPAVPAGPATPG-GPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVAS 2790
PHA03247 PHA03247
large tegument protein UL36; Provisional
6-229 1.38e-06

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 53.02  E-value: 1.38e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    6 SAPPVPPFGQNQPIYPGYHQSNYG----GQPGPAAPATPYGAYNGPV--------PGYQQAPPQGVPRAPPCSGAPPASA 73
Cdd:PHA03247 2769 PAPPAAPAAGPPRRLTRPAVASLSesreSLPSPWDPADPPAAVLAPAaalppaasPAGPLPPPTSAQPTAPPPPPGPPPP 2848
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   74 AQVPCGQTTYGqfgqGDIQNGPSSTA--------------QMPRVPGSQQFGPPLAPVVSQPAVLQPYGPPPTSTQVTAQ 139
Cdd:PHA03247 2849 SLPLGGSVAPG----GDVRRRPPSRSpaakpaaparppvrRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPP 2924
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  140 LAAMQisgavAQAPPPSGLgygPPTSLASASGNFPNSGPYSTYPQSQAPPLSqaqghPG-VQPPLRSAPPLASSFTSPAS 218
Cdd:PHA03247 2925 PPPQP-----QPPPPPPPR---PQPPLAPTTDPAGAGEPSGAVPQPWLGALV-----PGrVAVPRFRVPQPAPSREAPAS 2991
                         250
                  ....*....|.
gi 149031233  219 GGPRMPSMPGP 229
Cdd:PHA03247 2992 STPPLTGHSLS 3002
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
47-305 4.32e-06

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 50.80  E-value: 4.32e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    47 PVPGYQQAPPQGVPRAPPCSGAPPASAAQVPCGQTTYGQFGQ-GDIQngpsSTAQMPRVPGSQQFGPPLAPVVS-QPAVL 124
Cdd:pfam09770  111 AAQSSAQPPASSLPQYQYASQQSQQPSKPVRTGYEKYKEPEPiPDLQ----VDASLWGVAPKKAAAPAPAPQPAaQPASL 186
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   125 QPYGPPPTSTQ-VTAQLAAmQISGAVAQAPPPSglgYGPPTSLASASGNFPNSGPYSTYPQSQApplsQAQGHPGVQPPL 203
Cdd:pfam09770  187 PAPSRKMMSLEeVEAAMRA-QAKKPAQQPAPAP---AQPPAAPPAQQAQQQQQFPPQIQQQQQP----QQQPQQPQQHPG 258
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   204 RSAPPlaSSFTSPASGGPRmPSMPGPLPPGQGFGSLPvsqanrvssPPAhalppgtqmtgppappppmhsPQQPGYQLQQ 283
Cdd:pfam09770  259 QGHPV--TILQRPQSPQPD-PAQPSIQPQAQQFHQQP---------PPV---------------------PVQPTQILQN 305
                          250       260
                   ....*....|....*....|..
gi 149031233   284 NGSFGPARGPQPNYESPYPGAP 305
Cdd:pfam09770  306 PNRLSAARVGYPQNPQPGVQPA 327
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
67-257 6.67e-06

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 50.26  E-value: 6.67e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   67 GAPPASAAQVPCGQTTygqfgqgdiqngPSSTAQMPRV--PGSQQFGPPLAPVVSQPAVLQPYGPPPTSTQVTAQLAAMQ 144
Cdd:PRK12323  371 GAGPATAAAAPVAQPA------------PAAAAPAAAApaPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQ 438
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  145 IS---GAVAQAPPPSGLGYGPPTSLASASGNFPNSGPYSTYPQSQAPPLSQAQGHPGV-----QPPLRSAPPLASSFTSP 216
Cdd:PRK12323  439 ASargPGGAPAPAPAPAAAPAAAARPAAAGPRPVAAAAAAAPARAAPAAAPAPADDDPppweeLPPEFASPAPAQPDAAP 518
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|.
gi 149031233  217 ASGGPRMPSMPGPLPPGQGFGSLPVSQANRVSSPPAHALPP 257
Cdd:PRK12323  519 AGWVAESIPDPATADPDDAFETLAPAPAAAPAPRAAAATEP 559
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
184-308 8.61e-06

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 50.15  E-value: 8.61e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   184 QSQAPPLSQAQGHPGVQPPLRSAPPLASSFTSPASGGPRMPSMPGP---LPPGQGFGSL---------PVSQANRVSSP- 250
Cdd:pfam03154  167 LQTQPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPatsQPPNQTQSTAaphtliqqtPTLHPQRLPSPh 246
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 149031233   251 ----PAHALPPGTQMTGPPAPPPPMHSPQQP-GYQLQQngsfGPARGPQPNYESPYPGAPTFG 308
Cdd:pfam03154  247 pplqPMTQPPPPSQVSPQPLPQPSLHGQMPPmPHSLQT----GPSHMQHPVPPQPFPLTPQSS 305
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
6-139 1.21e-05

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 49.60  E-value: 1.21e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    6 SAPPVPPFGQNQPIYPGYHQSNYGGQPGPAAPATPyGAYNGPVPGYQQAPPQGVPRAPPcsgAPPASAAQVPCGQTTYGQ 85
Cdd:PRK07764  650 PEHHPKHVAVPDASDGGDGWPAKAGGAAPAAPPPA-PAPAAPAAPAGAAPAQPAPAPAA---TPPAGQADDPAAQPPQAA 725
                          90       100       110       120       130
                  ....*....|....*....|....*....|....*....|....*....|....
gi 149031233   86 FGQGDIQNGPSSTAQMPRVPGSQQFGPPLAPVVSQPAVLQPYGPPPTSTQVTAQ 139
Cdd:PRK07764  726 QGASAPSPAADDPVPLPPEPDDPPDPAGAPAQPPPPPAPAPAAAPAAAPPPSPP 779
PHA03377 PHA03377
EBNA-3C; Provisional
8-197 1.21e-05

EBNA-3C; Provisional


Pssm-ID: 177614 [Multi-domain]  Cd Length: 1000  Bit Score: 49.67  E-value: 1.21e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    8 PPVPPFGQNQPIYPGYHQSNYGGQPGPAAPATPYGAYNGPVPGYQQAPPQGVPRAP--PCSGAPPASAAQVPC-GQTTYG 84
Cdd:PHA03377  770 PQAPYLGYQEPQAQGVQVSSYPGYAGPWGLRAQHPRYRHSWAYWSQYPGHGHPQGPwaPRPPHLPPQWDGSAGhGQDQVS 849
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   85 QFGQGDIQNGPSS--TAQMPRVPGSQQFGPPLAPVVSQPAVLQPYGPPPTstqvtaqlaamqisgavaQAPPPSGLGYGP 162
Cdd:PHA03377  850 QFPHLQSETGPPRlqLSQVPQLPYSQTLVSSSAPSWSSPQPRAPIRPIPT------------------RFPPPPMPLQDS 911
                         170       180       190
                  ....*....|....*....|....*....|....*.
gi 149031233  163 PTSLASASGNFPNSGPY-STYPQSQAPPLSQAQGHP 197
Cdd:PHA03377  912 MAVGCDSSGTACPSMPFaSDYSQGAFTPLDINAQTP 947
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
27-305 1.37e-05

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 49.21  E-value: 1.37e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   27 NYGGQPGPAAPATPYGAYNGPVPgyqQAPPQGVPRAPPCSGAPPASAAQVPcgqttygqfgqgdiqnGPSSTAQMPRVPG 106
Cdd:PRK07764  386 GVAGGAGAPAAAAPSAAAAAPAA---APAPAAAAPAAAAAPAPAAAPQPAP----------------APAPAPAPPSPAG 446
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  107 SQQFGPPLAPVVSQPAVLQPyGPPPTSTQVTAQLAAMQISGAVAQAPPPsglgyGPPTSLASASGNFPNSGPYSTYPQ-- 184
Cdd:PRK07764  447 NAPAGGAPSPPPAAAPSAQP-APAPAAAPEPTAAPAPAPPAAPAPAAAP-----AAPAAPAAPAGADDAATLRERWPEil 520
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  185 ---------------SQA----------------PPLSQAQGHPGVQPPLRSA-----------------PPLASSFTSP 216
Cdd:PRK07764  521 aavpkrsrktwaillPEAtvlgvrgdtlvlgfstGGLARRFASPGNAEVLVTAlaeelggdwqveavvgpAPGAAGGEGP 600
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  217 ASGGPRMPSMPGPLPPGQGFGSLPVSQAnRVSSPPAHALPPGTQMTGPPAPPPPMHSPQQPGYQLQQNGSFGPARGPQPN 296
Cdd:PRK07764  601 PAPASSGPPEEAARPAAPAAPAAPAAPA-PAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWPAKAGGAAPA 679

                  ....*....
gi 149031233  297 YESPYPGAP 305
Cdd:PRK07764  680 APPPAPAPA 688
Glutenin_hmw pfam03157
High molecular weight glutenin subunit; Members of this family include high molecular weight ...
24-294 1.61e-05

High molecular weight glutenin subunit; Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.


Pssm-ID: 367362 [Multi-domain]  Cd Length: 786  Bit Score: 49.18  E-value: 1.61e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    24 HQSNYGGQPGPAAPatpygaynGPVPGYQQAPPQGVPRAPPCSGAPPASAAQVPCGQTTYgQFGQGdiqngpsstaQMPR 103
Cdd:pfam03157  339 QQPAQGQQPGQGQP--------GYYPTSPQQPGQGQPGYYPTSQQQPQQGQQPEQGQQGQ-QQGQG----------QQGQ 399
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   104 VPGSQQfgpplapvvsQPAVLQPyGPPPTSTQVTAQ------LAAMQISGAVAQAPPPSGLGYGPPtslasASGNFPNSG 177
Cdd:pfam03157  400 QPGQGQ----------QPGQGQP-GYYPTSPQQSGQgqpgyyPTSPQQSGQGQQPGQGQQPGQEQP-----GQGQQPGQG 463
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   178 PYSTYPQSQAPPLSQAQGHPGVQPplrsapplassfTSPASGGPrmpsmpgplppGQGFGSLPVSQANRVSSPPAHALPP 257
Cdd:pfam03157  464 QQGQQPGQPEQGQQPGQGQPGYYP------------TSPQQSGQ-----------GQQLGQWQQQGQGQPGYYPTSPLQP 520
                          250       260       270
                   ....*....|....*....|....*....|....*..
gi 149031233   258 GTQMTGPPAPpppmhSPQQPGYQLQQNGSFGPARGPQ 294
Cdd:pfam03157  521 GQGQPGYYPT-----SPQQPGQGQQLGQLQQPTQGQQ 552
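
The glutenin description above names three repeated motifs in the central elastomeric domain: PGQGQQ, GYYPTS[P/L]QQ and GQQ. As a rough illustration, the sketch below counts them in a sequence string. The helper `count_motifs` and the toy sequence are hypothetical and are not taken from the alignment; note that GQQ is also counted where it occurs inside PGQGQQ.

```python
import re

# The three repeat motifs named in the pfam03157 description.
# "[P/L]" in the text means "P or L", written as the character class [PL].
MOTIFS = {
    "PGQGQQ": re.compile(r"PGQGQQ"),
    "GYYPTS[P/L]QQ": re.compile(r"GYYPTS[PL]QQ"),
    "GQQ": re.compile(r"GQQ"),
}


def count_motifs(sequence):
    """Count occurrences of each motif after removing alignment gaps ('-')."""
    seq = sequence.replace("-", "").upper()
    return {name: len(pattern.findall(seq)) for name, pattern in MOTIFS.items()}


# Toy sequence built by hand for illustration only (not a real glutenin).
toy = "PGQGQQGYYPTSLQQGQQ"
counts = count_motifs(toy)
```

Running the same count on the query region (residues 24-294) would show whether the hit reflects actual glutenin-like repeats or only a similar amino acid composition; the low bit score above suggests the latter is the more cautious reading.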
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
29-345 2.54e-05

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 48.44  E-value: 2.54e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   29 GGQPGPAAPATPYGAYNGPVPGYQQAPPQGVPRAPPCSGAP----PASAAQVPCGQTTYGQFGQGDIQNGPSSTAQmPRV 104
Cdd:PRK07764  390 GAGAPAAAAPSAAAAAPAAAPAPAAAAPAAAAAPAPAAAPQpapaPAPAPAPPSPAGNAPAGGAPSPPPAAAPSAQ-PAP 468
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  105 PGSQQFGPPLAPVVSQPAVLQPYGPPPTSTQVTAQLAAM----------QISGAV--------------AQAPPP----- 155
Cdd:PRK07764  469 APAAAPEPTAAPAPAPPAAPAPAAAPAAPAAPAAPAGADdaatlrerwpEILAAVpkrsrktwaillpeATVLGVrgdtl 548
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  156 ------SGLGY-----------------------------GPPTSLASASGNfPNSGPYSTYPQSQAPPLSQAQGHPgvQ 200
Cdd:PRK07764  549 vlgfstGGLARrfaspgnaevlvtalaeelggdwqveavvGPAPGAAGGEGP-PAPASSGPPEEAARPAAPAAPAAP--A 625
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  201 PPLRSAPPLASSFTSPASGGPRMPSMPGPLPPGQGFGSLPVSQANRVSSPPAHALPPGTQMTGPPAPPppmhSPQQPGYQ 280
Cdd:PRK07764  626 APAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWPAKAGGAAPAAPPPAPAPAAPAAP----AGAAPAQP 701
                         330       340       350       360       370       380
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 149031233  281 LQQNGSFGPARGPQPNYESPYPGAPTFGTQPGPPQPLPPKRLDPDAIPSPIQVIEDDRNNRGSEP 345
Cdd:PRK07764  702 APAPAATPPAGQADDPAAQPPQAAQGASAPSPAADDPVPLPPEPDDPPDPAGAPAQPPPPPAPAP 766
Gag_spuma pfam03276
Spumavirus gag protein;
92-251 2.72e-05

Spumavirus gag protein;


Pssm-ID: 460872 [Multi-domain]  Cd Length: 614  Bit Score: 48.20  E-value: 2.72e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    92 QNGPSSTAQMPRVPGSQQFGPPLAPVVSQPAVLQPYGPPPTSTqvtaqlaAMQISGAVAQAPPPSGLGYGPPTSLASasg 171
Cdd:pfam03276  175 LAEISPGAQGGIPPGASFSGLPSLPAIGGIHLPAIPGIHARAP-------PGNIARSLGDDIMPSLGDAGMPQPRFA--- 244
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   172 nFPNSGPYSTYPQSqaPPLSQAQGHPGVQP--PLRSApPLASSFTSPASGGPRMPSMPGPLPPGQGFGSLPVSQANRVSS 249
Cdd:pfam03276  245 -FHPGNPFAEAEGH--PFAEAEGERPRDIPraPRIDA-PSAPAIPAIQPIAPPMIPPIGAPIPIPHGASIPGEHIRNPRE 320

                   ..
gi 149031233   250 PP 251
Cdd:pfam03276  321 EP 322
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
5-305 2.78e-05

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 48.44  E-value: 2.78e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    5 QSAPPVPPFGQNQPIYPGYhqsnyGGQPGPAAPATPygayngpVPGYQQAPPQGVPRAPPCSGAPPASAAQVPCGQTTYG 84
Cdd:PRK07764  436 APAPAPPSPAGNAPAGGAP-----SPPPAAAPSAQP-------APAPAAAPEPTAAPAPAPPAAPAPAAAPAAPAAPAAP 503
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   85 QFGQGDIQ---------------NGPSSTAQMPRVPGSQQFG---------PPLAPVVSQP--------AVLQPYG---- 128
Cdd:PRK07764  504 AGADDAATlrerwpeilaavpkrSRKTWAILLPEATVLGVRGdtlvlgfstGGLARRFASPgnaevlvtALAEELGgdwq 583
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  129 ------PPPTSTQVTAQLAAMQISGAVAQAPPPSGLGYGPPTSLASASGNfPNSGPYSTYPQSQAPPLSQAQGHPGVQPP 202
Cdd:PRK07764  584 veavvgPAPGAAGGEGPPAPASSGPPEEAARPAAPAAPAAPAAPAPAGAA-AAPAEASAAPAPGVAAPEHHPKHVAVPDA 662
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  203 LRSAPPLASSFTSPASGGPRMPSMPGPLPPGQGFGSLPVSQANRVSSPPAHALPPGTQMTG--------PPAPPPPMHSP 274
Cdd:PRK07764  663 SDGGDGWPAKAGGAAPAAPPPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQaaqgasapSPAADDPVPLP 742
                         330       340       350
                  ....*....|....*....|....*....|.
gi 149031233  275 QQPGYQLQQNGSFGPARGPQPNYESPYPGAP 305
Cdd:PRK07764  743 PEPDDPPDPAGAPAQPPPPPAPAPAAAPAAA 773
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
31-194 4.25e-05

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 47.72  E-value: 4.25e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    31 QPGPAAPATPYGAYNGPVPGYQQAPPQgvprappcsgappASAAQVPCGQTTYGQFGQGdiQNGPSSTAQMPRvpgSQQF 110
Cdd:pfam09770  213 QPAPAPAQPPAAPPAQQAQQQQQFPPQ-------------IQQQQQPQQQPQQPQQHPG--QGHPVTILQRPQ---SPQP 274
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   111 GPPlAPVVSQPAVLQPYGPPPTSTQVTAQLAAMQ-ISGAVAQAP--PPSGLGYGPPTSLASASGNFPNSGPYSTYPQsQA 187
Cdd:pfam09770  275 DPA-QPSIQPQAQQFHQQPPPVPVQPTQILQNPNrLSAARVGYPqnPQPGVQPAPAHQAHRQQGSFGRQAPIITHPQ-QL 352

                   ....*..
gi 149031233   188 PPLSQAQ 194
Cdd:pfam09770  353 AQLSEEE 359
SP6_N cd22544
N-terminal domain of transcription factor Specificity Protein (SP) 6; Specificity Proteins ...
89-278 5.05e-05

N-terminal domain of transcription factor Specificity Protein (SP) 6; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP6, also known as epiprofin, shows a specific expression pattern in hair follicles and the apical ectodermal ridge (AER) of the developing limbs. SP6 null mice are nude and show defects in skin, teeth, limbs (syndactyly and oligodactyly), and lung alveoli. SP6 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP6.


Pssm-ID: 411693 [Multi-domain]  Cd Length: 245  Bit Score: 46.07  E-value: 5.05e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   89 GDIQNGPSSTaqmPRVPGSQQFGPPLapvvsqpavlQPYGPPPTSTQVTAQLAAmqiSGAVAQAPPP------SGLGYGP 162
Cdd:cd22544     7 GSLGNQHSET---PRASPPTLDLQPL----------QPYQIHSSPEAGDYPSPL---QPTELQSLPLgpgvdfSARESYE 70
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  163 PTSLASASGNFPNSGPYSTYPQSQAPPLSQAQ--------GHPGVQPPLRSAPPL-----ASSFT--SPASGGPRMPSMP 227
Cdd:cd22544    71 PHSSRRTCLDLESDLPLGPFPKLLHPPPDMAHpyeswfrpPHPGGSGEEGGVPSWwdlhaGSSWMdlQHGQGGLQSPGPP 150
                         170       180       190       200       210
                  ....*....|....*....|....*....|....*....|....*....|..
gi 149031233  228 GPL-PPGQGFGSlpvsqANRVSSPPAHALPPGTQMTGPPAPPPPMHSPQQPG 278
Cdd:cd22544   151 GGLqPPLGGYGS-----EHQLCGPPHHLLPPAQHLMGQEGPKLLEHPAEDPS 197
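
The SP6 description above gives the SP/KLF binding consensus as NNRCRCCYY, with N any nucleotide, R = A/G and Y = C/T. The sketch below shows one way such a degenerate motif can be expanded into a regular expression, using only the three ambiguity codes defined in the text. The names `motif_to_regex` and `matches` and the example sites are hypothetical, and only the strand as written is checked (no reverse complement).

```python
import re

# Ambiguity codes as defined in the description above, plus the four plain bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # purine: A or G
    "Y": "[CT]",    # pyrimidine: C or T
    "N": "[ACGT]",  # any nucleotide
}


def motif_to_regex(motif):
    """Expand a degenerate DNA motif into a compiled regular expression."""
    return re.compile("".join(IUPAC[base] for base in motif.upper()))


SP_KLF_CONSENSUS = motif_to_regex("NNRCRCCYY")


def matches(site):
    """True if the 9-mer matches the consensus on the strand as written."""
    return SP_KLF_CONSENSUS.fullmatch(site.upper()) is not None


# Hand-made example sites, not taken from any real promoter.
ok = matches("AAGCACCCT")   # satisfies every position of NNRCRCCYY
bad = matches("AAGCACCAT")  # 'A' at position 8 violates Y (C/T)
```

A fuller version would also test the reverse complement of each candidate site, since binding-site motifs are often reported on either strand.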
PRK10263 PRK10263
DNA translocase FtsK; Provisional
80-305 5.09e-05

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 47.77  E-value: 5.09e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   80 QTTYGQFGQGD-IQNGPSSTAQMPRVP----GSQQFGPPLAPVVSQPAVLQPyGPPPTSTQVTAQLAAMQISGAVAQAPP 154
Cdd:PRK10263  298 RATQPEYDEYDpLLNGAPITEPVAVAAaattATQSWAAPVEPVTQTPPVASV-DVPPAQPTVAWQPVPGPQTGEPVIAPA 376
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  155 PSGLGYGPPTSLASASGNFPNSGPYSTYPQSQAPPLSQAQGHPGVQPPlrSAPPLASSFTSPASGGPRMPSMPGPLPPGQ 234
Cdd:PRK10263  377 PEGYPQQSQYAQPAVQYNEPLQQPVQPQQPYYAPAAEQPAQQPYYAPA--PEQPAQQPYYAPAPEQPVAGNAWQAEEQQS 454
                         170       180       190       200       210       220       230
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 149031233  235 GFGSLPVSQANRVSSPPAhalppgtqmtgppappppmhsPQQPGYQLQQNGSFGPARGPQPNYESPYPGAP 305
Cdd:PRK10263  455 TFAPQSTYQTEQTYQQPA---------------------AQEPLYQQPQPVEQQPVVEPEPVVEETKPARP 504
PHA03378 PHA03378
EBNA-3B; Provisional
3-234 5.85e-05

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 47.37  E-value: 5.85e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    3 VNQSAPPVPPFGQNQ-PIYPGYHQSNYGGQPGPAAPATPYGA-YNGPVPGYQQAPPQGVPRAPPCSGAPPAsaaQVPCGQ 80
Cdd:PHA03378  600 PHPSQTPEPPTTQSHiPETSAPRQWPMPLRPIPMRPLRMQPItFNVLVFPTPHQPPQVEITPYKPTWTQIG---HIPYQP 676
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   81 TTYGQFGQGDIQNGPSSTAQMPRVPGSQQfgPPLAPVVS--QPAVLQPYGPPPTSTQVTAQlaamqisgavaqaPPPSGL 158
Cdd:PHA03378  677 SPTGANTMLPIQWAPGTMQPPPRAPTPMR--PPAAPPGRaqRPAAATGRARPPAAAPGRAR-------------PPAAAP 741
                         170       180       190       200       210       220       230
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 149031233  159 GYGPPTSLASASGNFPNSGPYSTYPQSQAPPLSQAQGHPGVQPPLRSAPPLASSFTSPASGGPRMPSMPGPLPPGQ 234
Cdd:PHA03378  742 GRARPPAAAPGRARPPAAAPGRARPPAAAPGAPTPQPPPQAPPAPQQRPRGAPTPQPPPQAGPTSMQLMPRAAPGQ 817
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
6-314 7.09e-05

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 46.90  E-value: 7.09e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    6 SAPPVPPFGQNQPIYPGYHQSNYGGQPGPAAPATPygayngpVPGYQQAPPQGVPRAPPCSGAPPASAAQVPcGQTTYGQ 85
Cdd:PRK07764  409 APAPAAAAPAAAAAPAPAAAPQPAPAPAPAPAPPS-------PAGNAPAGGAPSPPPAAAPSAQPAPAPAAA-PEPTAAP 480
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   86 FGQGDIQNGPSSTAQMPRVPGSQQFGPPLAPV----------------------VSQPAVLQPYGPPPTSTQVTAQLAAM 143
Cdd:PRK07764  481 APAPPAAPAPAAAPAAPAAPAAPAGADDAATLrerwpeilaavpkrsrktwailLPEATVLGVRGDTLVLGFSTGGLARR 560
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  144 ----------------------QI--------SGAVAQAPPPSGLGYGPPTSLASASGNFPNSGPYSTYPQSQAPPLSQA 193
Cdd:PRK07764  561 faspgnaevlvtalaeelggdwQVeavvgpapGAAGGEGPPAPASSGPPEEAARPAAPAAPAAPAAPAPAGAAAAPAEAS 640
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  194 QGHPGVQPPLRSAPPLASSFTSPASGGPRMPSMPGPLPPGQGFGSLPVSQANRVSSPPAHALPPGTQMTGPPAPPPPMHS 273
Cdd:PRK07764  641 AAPAPGVAAPEHHPKHVAVPDASDGGDGWPAKAGGAAPAAPPPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQ 720
                         330       340       350       360
                  ....*....|....*....|....*....|....*....|.
gi 149031233  274 PQQPGYQLQQNGSFGPARGPQPNYESPYPGAPTFGTQPGPP 314
Cdd:PRK07764  721 PPQAAQGASAPSPAADDPVPLPPEPDDPPDPAGAPAQPPPP 761
MISS pfam15822
MAPK-interacting and spindle-stabilising protein-like; MISS is a family of eukaryotic ...
4-258 7.18e-05

MAPK-interacting and spindle-stabilising protein-like; MISS is a family of eukaryotic MAPK-interacting and spindle-stabilising protein-like proteins. MISS is rich in prolines and has four potential MAPK-phosphorylation sites, a MAPK-docking site, a PEST sequence (PEST motif) and a bipartite nuclear localization signal. The endogenous protein accumulates during mouse meiotic maturation and is found as discrete dots on the MII spindle. MISS is the first example of a physiological MAPK-substrate that is stabilized in MII that specifically regulates MII spindle integrity during the CSF arrest.


Pssm-ID: 318115 [Multi-domain]  Cd Length: 238  Bit Score: 45.75  E-value: 7.18e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233     4 NQSAPPVPPFGQNqpiypgyhqsnyggqPGPAAPATPYGAynGPVPGYQQAPPQGVPRAPPcsgaPPASAAQVPCGQtty 83
Cdd:pfam15822   38 NPSAPPAVPSGLP---------------PSTAPSTVPFGP--APTGMYPSIPLTGPSPGPP----APFPPSGPSCPP--- 93
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    84 gqfgqgdiqngPSSTAQMPRVPGSQQFGPPLAPVVSQPAVLQPYGPPPTStqvtAQLAAMQISGAVAQAPPPSGLGYGPP 163
Cdd:pfam15822   94 -----------PGGPYPAPTVPGPGPIGPYPTPNMPFPELPRPYGAPTDP----AAAAPSGPWGSMSSGPWAPGMGGQYP 158
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   164 TslasASGNFPNSGPYSTYPQSQAPPlsqaqghpgvqpplrSAPPLASSFTSPASGGPrmpsmPGPLPPGQGFGSLP--- 240
Cdd:pfam15822  159 A----PNMPYPSPGPYPAVPPPQSPG---------------AAPPVPWGTVPPGPWGP-----PAPYPDPTGSYPMPgly 214
                          250       260
                   ....*....|....*....|.
gi 149031233   241 --VSQANRVSSPPAHALP-PG 258
Cdd:pfam15822  215 ptPNNPFQVPSGPSGAPPmPG 235
Glutenin_hmw pfam03157
High molecular weight glutenin subunit; Members of this family include high molecular weight ...
14-305 7.83e-05

High molecular weight glutenin subunit; Members of this family include high molecular weight subunits of glutenin. This group of gluten proteins is thought to be largely responsible for the elastic properties of gluten, and hence, doughs. Indeed, glutenin high molecular weight subunits are classified as elastomeric proteins, because the glutenin network can withstand significant deformations without breaking, and return to the original conformation when the stress is removed. Elastomeric proteins differ considerably in amino acid sequence, but they are all polymers whose subunits consist of elastomeric domains, composed of repeated motifs, and non-elastic domains that mediate cross-linking between the subunits. The elastomeric domain motifs are all rich in glycine residues in addition to other hydrophobic residues. High molecular weight glutenin subunits have an extensive central elastomeric domain, flanked by two terminal non-elastic domains that form disulphide cross-links. The central elastomeric domain is characterized by the following three repeated motifs: PGQGQQ, GYYPTS[P/L]QQ, GQQ. It possesses overlapping beta-turns within and between the repeated motifs, and assumes a regular helical secondary structure with a diameter of approx. 1.9 nm and a pitch of approx. 1.5 nm.


Pssm-ID: 367362 [Multi-domain]  Cd Length: 786  Bit Score: 46.86  E-value: 7.83e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    14 GQNQP-IYPGYHQSNYGGQPGpAAPATPYGAYNGPVPGYQQAPPQGVP--RAPPCSGAPPASAAQVPCGQTTyGQFGQGD 90
Cdd:pfam03157  408 GQGQPgYYPTSPQQSGQGQPG-YYPTSPQQSGQGQQPGQGQQPGQEQPgqGQQPGQGQQGQQPGQPEQGQQP-GQGQPGY 485
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    91 IQNGPSSTAQMPRVPGSQQFG---------PPLAPVVSQP-----AVLQP-YGPPPTSTQVTAQLAAMQISGAVAQAPPP 155
Cdd:pfam03157  486 YPTSPQQSGQGQQLGQWQQQGqgqpgyyptSPLQPGQGQPgyyptSPQQPgQGQQLGQLQQPTQGQQGQQSGQGQQGQQP 565
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   156 SGLGYG-----PPTSLASASGNFPNSGPYSTYPQSQAPPLSQAQGHPGVQPPLRSAPPLASSFTSPASGG----PRMPSM 226
Cdd:pfam03157  566 GQGQQGqqpgqGQQGQQPGQGQQPGQGQPGYYPTSPQQSGQGQQPGQWQQPGQGQPGYYPTSSLQLGQGQqgyyPTSPQQ 645
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   227 PGPLP-PGQGFGSLPVSQANRVSSP--------------PAHALPPGTQMTGPPAPpppmhSPQQPGyQLQQNGSFGPAR 291
Cdd:pfam03157  646 PGQGQqPGQWQQSGQGQQGYYPTSPqqsgqaqqpgqgqqPGQWLQPGQGQQGYYPT-----SPQQPG-QGQQLGQGQQSG 719
                          330
                   ....*....|....
gi 149031233   292 GPQPNYESPYPGAP 305
Cdd:pfam03157  720 QGQQGYYPTSPGQG 733
PHA03247 PHA03247
large tegument protein UL36; Provisional
34-330 7.84e-05

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 47.24  E-value: 7.84e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   34 PAAPATPYGAYNGPVPGyQQAPPQgvPRAPPCSGAP-PASAAQVPCGQTTY----------GQFGQGDIQNGPSSTAQMP 102
Cdd:PHA03247 2483 PAEARFPFAAGAAPDPG-GGGPPD--PDAPPAPSRLaPAILPDEPVGEPVHprmltwirglEELASDDAGDPPPPLPPAA 2559
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  103 RVPGSQQFGPP--LAPVVSQPAV----LQPYGPPPTSTQVTAQLAAMQISGAVAQAP-PPSGLGYGPPTSLASASGNFPN 175
Cdd:PHA03247 2560 PPAAPDRSVPPprPAPRPSEPAVtsraRRPDAPPQSARPRAPVDDRGDPRGPAPPSPlPPDTHAPDPPPPSPSPAANEPD 2639
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  176 SGPYSTYPQSQAPPLSQAQGHpgVQPPLRSAPPLASSFTSPASGGPRMPSMPGPLPPGQGFgslpvsqanrvSSPPAHAL 255
Cdd:PHA03247 2640 PHPPPTVPPPERPRDDPAPGR--VSRPRRARRLGRAAQASSPPQRPRRRAARPTVGSLTSL-----------ADPPPPPP 2706
                         250       260       270       280       290       300       310
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 149031233  256 PPgtqmtgppappPPMHSPQQPGYQLQQNGSFGPARGPQPNYeSPYPGAPTFGTQPGPPQPLPPKRLDPDAIPSP 330
Cdd:PHA03247 2707 TP-----------EPAPHALVSATPLPPGPAAARQASPALPA-APAPPAVPAGPATPGGPARPARPPTTAGPPAP 2769
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
99-239 7.85e-05

DNA polymerase III subunits gamma and tau; Validated


Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 46.90  E-value: 7.85e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   99 AQMPRVPGSQQFGPPLAPVVSQPAVLQPYGPPPTSTQVTAQLAAMQISGAVAQAPPPSGLgyGPPTSLASASGNFPNSGP 178
Cdd:PRK07764  376 ARLERLERRLGVAGGAGAPAAAAPSAAAAAPAAAPAPAAAAPAAAAAPAPAAAPQPAPAP--APAPAPPSPAGNAPAGGA 453
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 149031233  179 YSTYPQSQAPPLSQAQGHPGVQPPLRSAPPLASSFTSPASggPRMPSMPGPLPPGQGFGSL 239
Cdd:PRK07764  454 PSPPPAAAPSAQPAPAPAAAPEPTAAPAPAPPAAPAPAAA--PAAPAAPAAPAGADDAATL 512
DUF4813 pfam16072
Domain of unknown function (DUF4813); This family of proteins is functionally uncharacterized. ...
23-258 1.60e-04

Domain of unknown function (DUF4813); This family of proteins is functionally uncharacterized. This family of proteins is found in eukaryotes. Proteins in this family are typically between 345 and 672 amino acids in length.


Pssm-ID: 435117 [Multi-domain]  Cd Length: 288  Bit Score: 44.75  E-value: 1.60e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    23 YHQSNYGGQPGPAAPATPYGAYNGPVP-GYQQAPPQGVPRAppCSGAPPASAAQVPCGqTTYGQFGQGdIQNGPSSTAQM 101
Cdd:pfam16072    4 YHPAGATYHPGGYAPAGATYHPAGQVPaGATYYPSGGVPHG--ATYYPQAPVAAVPAG-ATYLPAGAA-IPAGATYYPQA 79
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   102 PRVPGSQQFGPPLAPVVSQPAVLQpYGPPPTSTQVTAQLAAMQISGAVAQAPPPSG------LGYGPPTSLASASGNFPN 175
Cdd:pfam16072   80 PKSSSGLGLGTGLIAGALGGAILG-HALTPTQTRVVEHAPSSGGGGGGGGYSNGNNedkiiiINNGPPGSVTTTSAGSGT 158
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   176 SGPYSTYPQSQAPPLSQAQGHPGVQPPLRSAPPLAssftsPASGGPRMPSmpGPLPPGQGFGSLPVSQANR----VSSPP 251
Cdd:pfam16072  159 TVINAGGQQPAAPAAPAYPVAPAAYPAQAPAAAPA-----PAPGAPQTPL--APLNPVAAAPAAAAGAAAApvvaAAAPA 231

                   ....*..
gi 149031233   252 AHALPPG 258
Cdd:pfam16072  232 AAAPPPP 238
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
128-305 1.95e-04

Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 45.64  E-value: 1.95e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  128 GPPPTSTQVTAQLAAMQISGAVAQAPPPSGLGYGPPTSLASASGNFPNSGPYSTYPQSQAppLSQAQGHPGVQPPLRSAP 207
Cdd:PRK12323  373 GPATAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEA--LAAARQASARGPGGAPAP 450
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  208 PLASSfTSPASGGPrmPSMPGPLPPGQGFGSLPVSQANRVSSPPA-HALPPGTQMTGPPAPPPPMHSPQQPGYQLQQNGS 286
Cdd:PRK12323  451 APAPA-AAPAAAAR--PAAAGPRPVAAAAAAAPARAAPAAAPAPAdDDPPPWEELPPEFASPAPAQPDAAPAGWVAESIP 527
                         170
                  ....*....|....*....
gi 149031233  287 FGPARGPQPNYESPYPGAP 305
Cdd:PRK12323  528 DPATADPDDAFETLAPAPA 546
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
85-259 2.98e-04

Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 44.87  E-value: 2.98e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   85 QFGQGDIQNGPSSTAQMPRvpgSQQFGPPLAPVVSQPAVLQPYGPPPTSTQVTAQlAAMQISGAVAQAPPPSGLgygPPT 164
Cdd:PRK12323  364 RPGQSGGGAGPATAAAAPV---AQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAA-ARAVAAAPARRSPAPEAL---AAA 436
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  165 SLASASGNFPNSGPYSTYPQS----QAPPLSQAQGHP--GVQPPLRSAPPLASS----FTSPASGGPRMPSMPGPLPPGQ 234
Cdd:PRK12323  437 RQASARGPGGAPAPAPAPAAApaaaARPAAAGPRPVAaaAAAAPARAAPAAAPApaddDPPPWEELPPEFASPAPAQPDA 516
                         170       180
                  ....*....|....*....|....*
gi 149031233  235 GFGSLPVSQANRvsspPAHALPPGT 259
Cdd:PRK12323  517 APAGWVAESIPD----PATADPDDA 537
PHA03379 PHA03379
EBNA-3A; Provisional
5-309 3.28e-04

Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 45.05  E-value: 3.28e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    5 QSAPPVP-PFGQNQPIYPGYHQSnygGQPGPAAPATPYGAYNGP-------VPGYQQAP--PQGVPRAP----------P 64
Cdd:PHA03379  468 AQLPPGPlQDLEPGDQLPGVVQD---GRPACAPVPAPAGPIVRPweaslsqVPGVAFAPvmPQPMPVEPvpvptvalerP 544
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   65 CSGAPPASAAQvpcgqttygqfgqgdiqnGPSSTAQMPRVPGSQQfGPPLAPVVSQPavlqpygppPTSTQVTAQLAAMQ 144
Cdd:PHA03379  545 VCPAPPLIAMQ------------------GPGETSGIVRVRERWR-PAPWTPNPPRS---------PSQMSVRDRLARLR 596
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  145 ISGAVAQAPPPSglgygPPTSLASASGNFPNSGP-------YSTYPQSQAPPLSQAQGHPGVQPPLRSAPpLASSFTSPA 217
Cdd:PHA03379  597 AEAQPYQASVEV-----QPPQLTQVSPQQPMEYPlepeqqmFPGSPFSQVADVMRAGGVPAMQPQYFDLP-LQQPISQGA 670
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  218 SGGPRMPSMpGPLPP----GQGFGSLPVSQANRVSSPPAHALPPgTQMTGPPAPPPPMHspqqPGYQLQQNGSFGPARGP 293
Cdd:PHA03379  671 PLAPLRASM-GPVPPvpatQPQYFDIPLTEPINQGASAAHFLPQ-QPMEGPLVPERWMF----QGATLSQSVRPGVAQSQ 744
                         330
                  ....*....|....*....
gi 149031233  294 Q---PNYESPYPGAPTFGT 309
Cdd:PHA03379  745 YfdlPLTQPINHGAPAAHF 763
PHA03379 PHA03379
EBNA-3A; Provisional
43-251 3.37e-04

Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 44.66  E-value: 3.37e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   43 AYNGPVPGYQQAPPQgVPRAppCSGAPPASAAQVPCGQTTYgQFGQGDIQN----GPSSTAQMPRVP------GSQQFGP 112
Cdd:PHA03379  412 TYGTPRPPVEKPRPE-VPQS--LETATSHGSAQVPEPPPVH-DLEPGPLHDqhsmAPCPVAQLPPGPlqdlepGDQLPGV 487
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  113 PLAPVVSQPAVLQPYGP-----PPTSTQVTAQLAAMQISGAVAQAPPP-----SGLGYGPPTSLASASGNFPNSGPYSTY 182
Cdd:PHA03379  488 VQDGRPACAPVPAPAGPivrpwEASLSQVPGVAFAPVMPQPMPVEPVPvptvaLERPVCPAPPLIAMQGPGETSGIVRVR 567
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  183 PQSQAPPLSQAQGHPGVQPPLRSAP----PLASSFTSP-ASGGPRMPSMP------GPL-PPGQGFGSLPVSQANRVSSP 250
Cdd:PHA03379  568 ERWRPAPWTPNPPRSPSQMSVRDRLarlrAEAQPYQASvEVQPPQLTQVSpqqpmeYPLePEQQMFPGSPFSQVADVMRA 647

                  .
gi 149031233  251 P 251
Cdd:PHA03379  648 G 648
PRK10263 PRK10263
DNA translocase FtsK; Provisional
35-209 5.34e-04

Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 44.31  E-value: 5.34e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   35 AAPATPYGAYNGPVPGYQQAPPqgVPRAPPCSGAPPASAAQVPCGQTtygqfGQGDIQNGPSSTAQMPRV--PGSQQFGP 112
Cdd:PRK10263  324 AAATTATQSWAAPVEPVTQTPP--VASVDVPPAQPTVAWQPVPGPQT-----GEPVIAPAPEGYPQQSQYaqPAVQYNEP 396
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  113 PLAPVVSQPAVLQPYGPPPTSTQVTAQLAAMQ-------------ISGAVAQAPPPSGLGYGPPTSLASASGNFPNSGPY 179
Cdd:PRK10263  397 LQQPVQPQQPYYAPAAEQPAQQPYYAPAPEQPaqqpyyapapeqpVAGNAWQAEEQQSTFAPQSTYQTEQTYQQPAAQEP 476
                         170       180       190
                  ....*....|....*....|....*....|
gi 149031233  180 STYPQSQAPPLSQAQGHPGVQPPLRSAPPL 209
Cdd:PRK10263  477 LYQQPQPVEQQPVVEPEPVVEETKPARPPL 506
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
32-173 5.85e-04

Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 43.93  E-value: 5.85e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   32 PGPAAPATPYGAYNGPVPGYQQAP---PQGVPRAPPCSGAPPASAAQVPCGQTTYGQfgqgdiqngPSSTAQMPRVPgsq 108
Cdd:PRK14951  366 PAAAAEAAAPAEKKTPARPEAAAPaaaPVAQAAAAPAPAAAPAAAASAPAAPPAAAP---------PAPVAAPAAAA--- 433
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 149031233  109 qfGPPLAPVVSQPAVLQPYGPPPTSTQVTAQLAAMQISGAVAQAPPPSGLGYGPPTSLASASGNF 173
Cdd:PRK14951  434 --PAAAPAAAPAAVALAPAPPAQAAPETVAIPVRVAPEPAVASAAPAPAAAPAAARLTPTEEGDV 496
PRK07003 PRK07003
DNA polymerase III subunit gamma/tau;
47-258 5.93e-04

Pssm-ID: 235906 [Multi-domain]  Cd Length: 830  Bit Score: 44.07  E-value: 5.93e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   47 PVPGYQQAPPQGVPRAPPcsGAPPASAAQVPcgqttygqfgqgdiqNGPSSTAQMPRVPGSQQFGPPLAPVVSQPAVLQP 126
Cdd:PRK07003  360 PAVTGGGAPGGGVPARVA--GAVPAPGARAA---------------AAVGASAVPAVTAVTGAAGAALAPKAAAAAAATR 422
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  127 YGPPPTSTQVTAqlaamqiSGAVAQAPPPSGlgygPPTSLASASGNFPNSGPYSTYPQSQAPPLSQAQghpgvqpPLRSA 206
Cdd:PRK07003  423 AEAPPAAPAPPA-------TADRGDDAADGD----APVPAKANARASADSRCDERDAQPPADSGSASA-------PASDA 484
                         170       180       190       200       210
                  ....*....|....*....|....*....|....*....|....*....|..
gi 149031233  207 PPLASSFTSPASGGPRMPSMPGPLPPGQGFGSLPVSQANRVSSPPAHALPPG 258
Cdd:PRK07003  485 PPDAAFEPAPRAAAPSAATPAAVPDARAPAAASREDAPAAAAPPAPEARPPT 536
PHA02682 PHA02682
ORF080 virion core protein; Provisional
56-231 5.93e-04

Pssm-ID: 177464 [Multi-domain]  Cd Length: 280  Bit Score: 42.93  E-value: 5.93e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   56 PQGVPRAP--PCsgapPASAAQVPCGQTTYGQFGQGDIQNGPSSTAQMPRVPGSQQFGPplAPVVSQPAVLQPYGPPPTs 133
Cdd:PHA02682   30 PQATIPAPaaPC----PPDADVDPLDKYSVKEAGRYYQSRLKANSACMQRPSGQSPLAP--SPACAAPAPACPACAPAA- 102
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  134 tqvtaqLAAMQISGAVAQAPPPSGLGYGPPTSLASASGNFPNSGPYSTYPQSQAPPLSQAQGHPGVQPPL--RSAPPLas 211
Cdd:PHA02682  103 ------PAPAVTCPAPAPACPPATAPTCPPPAVCPAPARPAPACPPSTRQCPPAPPLPTPKPAPAAKPIFlhNQLPPP-- 174
                         170       180
                  ....*....|....*....|
gi 149031233  212 sfTSPASGGPRMPSMPGPLP 231
Cdd:PHA02682  175 --DYPAASCPTIETAPAASP 192
PHA03379 PHA03379
EBNA-3A; Provisional
32-243 6.80e-04

Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 43.89  E-value: 6.80e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   32 PGPAAPATPYGAYNGPVPGYQQAPPQGVPRA---PPCSGAPPASAAQVPCGQTTygQFGQGDIQNGPSSTAQMPRVPG-- 106
Cdd:PHA03379  577 PNPPRSPSQMSVRDRLARLRAEAQPYQASVEvqpPQLTQVSPQQPMEYPLEPEQ--QMFPGSPFSQVADVMRAGGVPAmq 654
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  107 SQQFGPPLAPVVSQPAVLQPY----GPPPTSTQVTAQLAAMQISGAVAQ-APPPSGLGYGPPTS-LASASGNFPNSgpys 180
Cdd:PHA03379  655 PQYFDLPLQQPISQGAPLAPLrasmGPVPPVPATQPQYFDIPLTEPINQgASAAHFLPQQPMEGpLVPERWMFQGA---- 730
                         170       180       190       200       210       220
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 149031233  181 TYPQSQAPPLSQAQ--GHPGVQPPLRSAPplassftspASGGPRMPSMPGPLPPGQG-FGSLPVSQ 243
Cdd:PHA03379  731 TLSQSVRPGVAQSQyfDLPLTQPINHGAP---------AAHFLHQPPMEGPWVPEQWmFQGAPPSQ 787
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
6-214 7.41e-04

Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 43.71  E-value: 7.41e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    6 SAPPVPPFGQNQPIYPGYHQSNYGGQPGPAAPAtpygayngpVPGYQQAPPQGVPRAPpcsgAPPASAAQVPCGQTtygq 85
Cdd:PRK12323  400 AAPPAAPAAAPAAAAAARAVAAAPARRSPAPEA---------LAAARQASARGPGGAP----APAPAPAAAPAAAA---- 462
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   86 fgqgdiqngPSSTAQMPRVPGSQQFGPPLAPVVSQPAVlQPYGPPPTStQVTAQLAAMQISGAVAqAPPPSGLGYGPPTS 165
Cdd:PRK12323  463 ---------RPAAAGPRPVAAAAAAAPARAAPAAAPAP-ADDDPPPWE-ELPPEFASPAPAQPDA-APAGWVAESIPDPA 530
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....*....
gi 149031233  166 LASASGNFPNSGPYSTYPQSQAPPLSQAQGHPGVqPPLRSAPPLASSFT 214
Cdd:PRK12323  531 TADPDDAFETLAPAPAAAPAPRAAAATEPVVAPR-PPRASASGLPDMFD 578
COG3416 COG3416
Uncharacterized conserved protein, DUF2076 domain [Function unknown];
13-62 8.36e-04

Pssm-ID: 442642 [Multi-domain]  Cd Length: 237  Bit Score: 42.32  E-value: 8.36e-04
                          10        20        30        40        50
                  ....*....|....*....|....*....|....*....|....*....|....
gi 149031233   13 FGQNQPIYPGYHQSNYGGQPGPAAPATPYGAYNGPVPGYQQ----APPQGVPRA 62
Cdd:COG3416    90 FGGGQRPPPAPQPSQPGPQQQPAPPSGPWGQAAPQQPGYGQpqygQPAAGPSGG 143
PRK14959 PRK14959
DNA polymerase III subunits gamma and tau; Provisional
146-259 9.59e-04

Pssm-ID: 184923 [Multi-domain]  Cd Length: 624  Bit Score: 43.13  E-value: 9.59e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  146 SGAVAQAPPPSGLGygpPTSLASASGNFPNSGPYSTYPQSQAPPlsqaqghPGVqpPLRSAPPLASSFTSPASGGPRMP- 224
Cdd:PRK14959  391 SGGAATIPTPGTQG---PQGTAPAAGMTPSSAAPATPAPSAAPS-------PRV--PWDDAPPAPPRSGIPPRPAPRMPe 458
                          90       100       110
                  ....*....|....*....|....*....|....*..
gi 149031233  225 --SMPGPlppgqgfgslPVSQANRVSSPPAHALPPGT 259
Cdd:PRK14959  459 asPVPGA----------PDSVASASDAPPTLGDPSDT 485
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
6-306 9.62e-04

Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 43.62  E-value: 9.62e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    6 SAPPVPPFGQNQPIYPGYHQSNYGGQPGPAAPATPYGAYNGPVPGYQQAPPQGVPRAP-PCSGAPPASAA-QVPCGQTTY 83
Cdd:PHA03307  125 SPPPSPAPDLSEMLRPVGSPGPPPAASPPAAGASPAAVASDAASSRQAALPLSSPEETaRAPSSPPAEPPpSTPPAAASP 204
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   84 GQFGQGDIQNGPSSTAQmPRVPGSQQFGPPLAPVVSQPAVLQPYGPPPTSTQVTAQLAAMQISGAVAQAPPPSGLGYGPP 163
Cdd:PHA03307  205 RPPRRSSPISASASSPA-PAPGRSAADDAGASSSDSSSSESSGCGWGPENECPLPRPAPITLPTRIWEASGWNGPSSRPG 283
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  164 TSlaSASGNFPNSGPYSTYPQSQAPPLSQAQGHPGVQPPLRSAPPLASSFTSPASGGPRMPSMPGPLPPGQGFGSLPVSQ 243
Cdd:PHA03307  284 PA--SSSSSPRERSPSPSPSSPGSGPAPSSPRASSSSSSSRESSSSSTSSSSESSRGAAVSPGPSPSRSPSPSRPPPPAD 361
                         250       260       270       280       290       300
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 149031233  244 ANRVSSPPAHALPPGTQMTGPPAPPPPMHSPQQPGYQLQQNGSF---GPARGPQPNYESPYPGAPT 306
Cdd:PHA03307  362 PSSPRKRPRPSRAPSSPAASAGRPTRRRARAAVAGRARRRDATGrfpAGRPRPSPLDAGAASGAFY 427
PHA03247 PHA03247
large tegument protein UL36; Provisional
3-249 1.39e-03

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 43.00  E-value: 1.39e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    3 VNQSAPPVPPFGQNQPiypgyhqsnyggQPGPAAPATPygayNGPVPGYQQAPPQGVPRAPPCSGAPPASAAQVPCGQTT 82
Cdd:PHA03247 2886 LARPAVSRSTESFALP------------PDQPERPPQP----QAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDP 2949
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   83 YGQFGQGDIQNGPSSTAQMP-RVPGSQQFGPPLAPVVSQPAvlqPYGPPPTSTQVTAQLA-AMQISGAVAQAPPPSGL-- 158
Cdd:PHA03247 2950 AGAGEPSGAVPQPWLGALVPgRVAVPRFRVPQPAPSREAPA---SSTPPLTGHSLSRVSSwASSLALHEETDPPPVSLkq 3026
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  159 GYGPPTSLASASGNFPNSGPYSTYPQSQAPPLSQAQGHPGVQPPlRSAPPLASSFTSPASggprmpsmpgplppgqGFGS 238
Cdd:PHA03247 3027 TLWPPDDTEDSDADSLFDSDSERSDLEALDPLPPEPHDPFAHEP-DPATPEAGARESPSS----------------QFGP 3089
                         250
                  ....*....|.
gi 149031233  239 LPVSqANRVSS 249
Cdd:PHA03247 3090 PPLS-ANAALS 3099
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
5-174 1.60e-03

Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 42.56  E-value: 1.60e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    5 QSAPPVPPFGQNQPIYPGYHQSNYGGQPGPAAPATPygaynGPVPGYQQAPPQGVPRAPPcSGAPPASAAQVPCGQTTYG 84
Cdd:PRK12323  419 VAAAPARRSPAPEALAAARQASARGPGGAPAPAPAP-----AAAPAAAARPAAAGPRPVA-AAAAAAPARAAPAAAPAPA 492
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   85 QFGQGDIQNGPSSTAQMPRVPGSQQFGPPLAPVVSQPAVLQPYGPPPTSTQVTAQLAAMQISGAVAQAPPPSGLGYGPPT 164
Cdd:PRK12323  493 DDDPPPWEELPPEFASPAPAQPDAAPAGWVAESIPDPATADPDDAFETLAPAPAAAPAPRAAAATEPVVAPRPPRASASG 572
                         170
                  ....*....|
gi 149031233  165 SLASASGNFP 174
Cdd:PRK12323  573 LPDMFDGDWP 582
PRK10263 PRK10263
DNA translocase FtsK; Provisional
127-341 1.66e-03

Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 42.76  E-value: 1.66e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  127 YGPPPTSTQVTAQLAAMQISGAVAQApppsglgYGPPTSLASASGNFPNSGPYSTYPQSQAPPLSQAQ-GHPGVQPPLRS 205
Cdd:PRK10263  307 YDPLLNGAPITEPVAVAAAATTATQS-------WAAPVEPVTQTPPVASVDVPPAQPTVAWQPVPGPQtGEPVIAPAPEG 379
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  206 APPlASSFTSPASGGPRMPSMP-GPLPPGQGFGSLPVSQANRVSSPPAHALP---PGTQMTGPPAPPPPMHSPQQPGYQL 281
Cdd:PRK10263  380 YPQ-QSQYAQPAVQYNEPLQQPvQPQQPYYAPAAEQPAQQPYYAPAPEQPAQqpyYAPAPEQPVAGNAWQAEEQQSTFAP 458
                         170       180       190       200       210       220
                  ....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  282 QqngsfgPARGPQPNYESPYPGAPTFgtqpgppqpLPPKRLDPDAIPSPIQVIEDDRNNR 341
Cdd:PRK10263  459 Q------STYQTEQTYQQPAAQEPLY---------QQPQPVEQQPVVEPEPVVEETKPAR 503
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
29-278 2.44e-03

Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 42.08  E-value: 2.44e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   29 GGQPGPAAPATPYGAYNGPVPGyqqAPPQGVPRAPPCSGAPPASAAqvpcgqttygqfgqgdiqnGPSSTAQMPRVPGSq 108
Cdd:PHA03307   17 GGEFFPRPPATPGDAADDLLSG---SQGQLVSDSAELAAVTVVAGA-------------------AACDRFEPPTGPPP- 73
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  109 qfGPPLAPVVSQPAVLQPYGPPPtstqvtaqLAAMQISGAVAQAPPPSGLGYGPPTSLASASGnfpnsgpystyPQSQAP 188
Cdd:PHA03307   74 --GPGTEAPANESRSTPTWSLST--------LAPASPAREGSPTPPGPSSPDPPPPTPPPASP-----------PPSPAP 132
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  189 PLSQAQ--GHPGVQPPLRSAPPLASSFTSPASG--GPRMPSMPGPLPPgqgfgslpvsQANRVSSPPAHALPPGTQMTGP 264
Cdd:PHA03307  133 DLSEMLrpVGSPGPPPAASPPAAGASPAAVASDaaSSRQAALPLSSPE----------ETARAPSSPPAEPPPSTPPAAA 202
                         250
                  ....*....|....
gi 149031233  265 PAPPPPMHSPQQPG 278
Cdd:PHA03307  203 SPRPPRRSSPISAS 216
PLN03209 PLN03209
translocon at the inner envelope of chloroplast subunit 62; Provisional
29-245 2.67e-03

Pssm-ID: 178748 [Multi-domain]  Cd Length: 576  Bit Score: 41.84  E-value: 2.67e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   29 GGQPGPAAPATPygayNGPVPGYQQAPPQG---VPR-----------APPCSGAP--PASAAQVPCGQTTYGQFGQGDIQ 92
Cdd:PLN03209  338 GPKPVPTKPVTP----EAPSPPIEEEPPQPkavVPRplspytayedlKPPTSPIPtpPSSSPASSKSVDAVAKPAEPDVV 413
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   93 NGPSSTAQMPRV---PGSQQFGPPLAPVVSQPAVLQPYGPPPTstqvtaqlaamqisgavaqapPPSGLGygPPTSLASA 169
Cdd:PLN03209  414 PSPGSASNVPEVepaQVEAKKTRPLSPYARYEDLKPPTSPSPT---------------------APTGVS--PSVSSTSS 470
                         170       180       190       200       210       220       230
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 149031233  170 SGNFPNSGPYSTYPQSQAPPLSQAQGHPGVQPPLRSAPPLASsftSPASGGPRMPSMPGPLPPGQGFGSLPVSQAN 245
Cdd:PLN03209  471 VPAVPDTAPATAATDAAAPPPANMRPLSPYAVYDDLKPPTSP---SPAAPVGKVAPSSTNEVVKVGNSAPPTALAD 543
PRK07764 PRK07764
DNA polymerase III subunits gamma and tau; Validated
6-155 3.19e-03

Pssm-ID: 236090 [Multi-domain]  Cd Length: 824  Bit Score: 41.51  E-value: 3.19e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    6 SAPPVPPFGQNQPIYPGYHQSNYGGQPGPAAPATPYGAYNG-PVPGYQQAPPQGVPRAPPcSGAPPASAAQVPCGQTTYG 84
Cdd:PRK07764  634 AAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGWPAKAGgAAPAAPPPAPAPAAPAAP-AGAAPAQPAPAPAATPPAG 712
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 149031233   85 QfGQGDIQNGPSSTAQMPRVPGSQQFGPPLAPVVSQPAVLQPYGPPPTSTQVTAQLAAMQISGAVAQAPPP 155
Cdd:PRK07764  713 Q-ADDPAAQPPQAAQGASAPSPAADDPVPLPPEPDDPPDPAGAPAQPPPPPAPAPAAAPAAAPPPSPPSEE 782
SOBP pfam15279
Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual ...
106-302 3.94e-03

Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual disability. It carries a zinc-finger of the zf-C2H2 type at the N-terminus, and a highly characteristic C-terminal PhPhPhPhPhPh motif. The deduced 873-amino acid protein contains an N-terminal nuclear localization signal (NLS), followed by 2 FCS-type zinc finger motifs, a proline-rich region (PR1), a putative RNA-binding motif region, and a C-terminal NLS embedded in a second proline-rich motif. SOBP is expressed in various human tissues, including developing mouse brain at embryonic day 14. In postnatal and adult mouse brain SOBP is expressed in all neurons, with intense staining in the limbic system. Highest expression is in layer V cortical neurons, hippocampus, pyriform cortex, dorsomedial nucleus of thalamus, amygdala, and hypothalamus. Postnatal expression of SOBP in the limbic system corresponds to a time of active synaptogenesis. The family is also referred to as Jackson circler, JXC1. In seven affected siblings from a consanguineous Israeli Arab family with mental retardation, anterior maxillary protrusion, and strabismus, mutations were found in this protein.


Pssm-ID: 464609 [Multi-domain]  Cd Length: 325  Bit Score: 40.57  E-value: 3.94e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   106 GSQQFGPPLAPVVSQ-PAVLQPYGPPPTSTQVTAQLAA-MQISGAVAQAPPPSGLGYGPPTSLASASGNFPNSGPYSTYP 183
Cdd:pfam15279   91 ESVSPGPSSSASPSSsPTSSNSSKPLISVASSSKLLAPkPHEPPSLPPPPLPPKKGRRHRPGLHPPLGRPPGSPPMSMTP 170
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   184 QS---------QAPPLSQAQGHPGVQPPLRSAPPLASSFTSPASggPRMPSMPGPLPPGQGFGSLPVSQANRVSSPPAHA 254
Cdd:pfam15279  171 RGllgkpqqhpPPSPLPAFMEPSSMPPPFLRPPPSIPQPNSPLS--NPMLPGIGPPPKPPRNLGPPSNPMHRPPFSPHHP 248
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 149031233   255 LPPGTqMTGPPAPPPPMH-----------SPQQPGYQLQQ--NGSFGPARG--PQPNYESPYP 302
Cdd:pfam15279  249 PPPPT-PPGPPPGLPPPPprgftppfgppFPPVNMMPNPPemNFGLPSLAPlvPPVTVLVPYP 310
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
156-306 5.07e-03

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 41.06  E-value: 5.07e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   156 SGLGYGPPTSLASASGNFPNSGP----YSTYPQSQ--APPLSQAQ-GHPGVQPPLRSAPPLASSFTSPASGGPRMPS--M 226
Cdd:pfam05109  394 SGLGTAPKTLIITRTATNATTTThkviFSKAPESTttSPTLNTTGfAAPNTTTGLPSSTHVPTNLTAPASTGPTVSTadV 473
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   227 PGPLPPGQGFGSLPV------------SQANRVSSPPAHALPPGTQMTGPPAPPPPmHSPQQPGYQLQQNGSFGPARGPQ 294
Cdd:pfam05109  474 TSPTPAGTTSGASPVtpspsprdngteSKAPDMTSPTSAVTTPTPNATSPTPAVTT-PTPNATSPTLGKTSPTSAVTTPT 552
                          170
                   ....*....|..
gi 149031233   295 PNYESPYPGAPT 306
Cdd:pfam05109  553 PNATSPTPAVTT 564
PHA03247 PHA03247
large tegument protein UL36; Provisional
29-262 5.62e-03

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 41.08  E-value: 5.62e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   29 GGQPGPAAPATPygayngPVPGYQQAPPQGVPRAPP------CSGAPPA-SAAQVPCGQTTYGQFGQGDIQNGP---SST 98
Cdd:PHA03247  266 DRAPETARGATG------PPPPPEAAAPNGAAAPPDgvwgaaLAGAPLAlPAPPDPPPPAPAGDAEEEDDEDGAmevVSP 339
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   99 AQMPRVPGSQQFGPPLAPVVSQPAVLQ---------PYGPPPTSTQVTAQLAAmqisgavaqAPPPSGLGYGPPTSLASA 169
Cdd:PHA03247  340 LPRPRQHYPLGFPKRRRPTWTPPSSLEdlsagrhhpKRASLPTRKRRSARHAA---------TPFARGPGGDDQTRPAAP 410
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  170 SgnfpnsgPYSTyPQSQAPPLSQAQGHPGVQPPLRSAPPLASSFTSPASGGPRMPSMPgPLPPGQGFGSLPVSQANRVSS 249
Cdd:PHA03247  411 V-------PASV-PTPAPTPVPASAPPPPATPLPSAEPGSDDGPAPPPERQPPAPATE-PAPDDPDDATRKALDALRERR 481
                         250
                  ....*....|...
gi 149031233  250 PPAhalPPGTQMT 262
Cdd:PHA03247  482 PPE---PPGADLA 491
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
112-234 5.86e-03


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 40.47  E-value: 5.86e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233  112 PPLAPVVSQPAVlqpyGPPPTSTQVTAQLAAMQISGAVAQAPPPSGLGYGPPTSLASASGNFPNSGPYSTYPQSQAPPLS 191
Cdd:PRK14951  380 TPARPEAAAPAA----APVAQAAAAPAPAAAPAAAASAPAAPPAAAPPAPVAAPAAAAPAAAPAAAPAAVALAPAPPAQA 455
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|...
gi 149031233  192 QAQghpGVQPPLRSAPPLASSFTSPASGGPRMPSMPGPLPPGQ 234
Cdd:PRK14951  456 APE---TVAIPVRVAPEPAVASAAPAPAAAPAAARLTPTEEGD 495
DAZAP2 pfam11029
DAZ associated protein 2 (DAZAP2); DAZ associated protein 2 has a highly conserved sequence ...
51-144 6.63e-03

DAZ associated protein 2 (DAZAP2); DAZ associated protein 2 has a highly conserved sequence throughout evolution including a conserved polyproline region and several SH2/SH3 binding sites. It occurs as a single copy gene with a four-exon organization and is located on chromosome 12. It encodes a ubiquitously expressed protein and binds to DAZ and DAZL1 through DAZ repeats.


Pssm-ID: 402565 [Multi-domain]  Cd Length: 129  Bit Score: 37.79  E-value: 6.63e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    51 YQQAPPQGVPRAPPCSGAPPASAAQVPCGQTTYGQFGQGDIQNGPSSTAQMPRVPGSQQFGPPLAPVV------------ 118
Cdd:pfam11029    1 YPDAPPAYSQLYQPRYAHPPYAASVAPSMSAAYPPAQAYPPMAQMASPPPMAYQPPGPAQPPGQTVVVpggfdagarfga 80
                           90       100
                   ....*....|....*....|....*.
gi 149031233   119 SQPAVLQPygPPPTSTQVTAQLAAMQ 144
Cdd:pfam11029   81 GSQPSIPP--PPPGCAPNAAQLAAMQ 104
GEL smart00262
Gelsolin homology domain; Gelsolin/severin/villin homology domain. Calcium-binding and ...
964-1000 6.92e-03

Gelsolin homology domain; Gelsolin/severin/villin homology domain. Calcium-binding and actin-binding. Both intra- and extracellular domains.


Pssm-ID: 214590 [Multi-domain]  Cd Length: 90  Bit Score: 36.89  E-value: 6.92e-03
                            10        20        30
                    ....*....|....*....|....*....|....*....
gi 149031233    964 TEPPAVRASEERLSSGDIYLLENGLNLFVWVG--ASVQQ 1000
Cdd:smart00262   11 VRVPEVPFSQGSLNSGDCYILDTGSEIYVWVGkkSSQDE 49
Treacle pfam03546
Treacher Collins syndrome protein Treacle;
34-411 8.44e-03


Pssm-ID: 460967 [Multi-domain]  Cd Length: 531  Bit Score: 40.06  E-value: 8.44e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    34 PAAPATPYGAYNGPVPGYQQA-------PPQGVPRAPPCSGAPPASAAQV------------------------------ 76
Cdd:pfam03546   39 PAAKTPLQAKPSGKTPQVRAAsapakesPRKGAPPVPPGKTGPAAAQAQAgkpeedsessseesdsdgetpaaatlttsp 118
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233    77 ----PCGQTTYGQFGQGdIQNGPSSTAQMPRVPGSqqfgppLAPVVSQPAVLQPYGPPPTSTQ-------VTAQLAAMQI 145
Cdd:pfam03546  119 aqvkPLGKNSQVRPAST-VGKGPSGKGANPAPPGK------AGSAAPLVQVGKKEEDSESSSEesdsegeAPPAATQAKP 191
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   146 SGAVAQAPPPSGLGYG----PPTSL--------ASASGNFPNSGPYSTYPQSQAPP-LSQAQGHPGVQPPLRSAPPLASS 212
Cdd:pfam03546  192 SGKILQVRPASGPAKGaapaPPQKAgpvatqvkAERSKEDSESSEESSDSEEEAPAaATPAQAKPALKTPQTKASPRKGT 271
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   213 FTSPAS--GGPRMPSMPGPLPPGQgfgslpVSQANRVSSPpahALPPGTQMTGPPAPPPPMHSPQQ---PGYQLQQNGSF 287
Cdd:pfam03546  272 PITPTSakVPPVRVGTPAPWKAGT------VTSPACASSP---AVARGAQRPEEDSSSSEESESEEetaPAAAVGQAKSV 342
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   288 GpaRGPQPNYESpypgAPTFGTQPGPPQPLPPKRLDPDAIPSPIQVIEDDRNNR---GSEPfVTGVRGQVPPLVTTNflv 364
Cdd:pfam03546  343 G--KGLQGKAAS----APTKGPSGQGTAPVPPGKTGPAVAQVKAEAQEDSESSEeesDSEE-AAATPAQVKASGKTP--- 412
                          410       420       430       440       450
                   ....*....|....*....|....*....|....*....|....*....|..
gi 149031233   365 kdQGNASPRYIRCTSYNIPCTS-----DMAKQAQVPLAAVIKPLARLPPEEA 411
Cdd:pfam03546  413 --QAKANPAPTKASSAKGAASApgkvvAAAAQAKQGSPAKVKPPARTPQNSA 462
DUF4645 pfam15488
Domain of unknown function (DUF4645); This family of proteins is found in eukaryotes. Proteins ...
116-258 9.63e-03

Domain of unknown function (DUF4645); This family of proteins is found in eukaryotes. Proteins in this family are typically between 200 and 298 amino acids in length.


Pssm-ID: 406050 [Multi-domain]  Cd Length: 294  Bit Score: 39.46  E-value: 9.63e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   116 PVVSQPAVLQPYGPPPTSTQ---VTAQ------LAAMQISGA----VAQ---APPPSGLGYGPPTSLASASGNFPNSGPY 179
Cdd:pfam15488   82 PVDSSRALRHPYGPPPAVAEeslATAEvnssegLAGWRQKGQdsinVSQefsGSPPALMVGGTRVSNGGTERGGNNAKLY 161
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 149031233   180 STYPQSQA---PPLSQAQGHPGVqPPLRSA-----PPLASSFTSPAS--------GGPRMP--SMPGPLPpgqgfgsLPV 241
Cdd:pfam15488  162 SALPRGQGffpPRGPQVRGPPHI-PTLRSGimmevPPGNTRMAGKERlahvsfplGGPRHPmdNWPRPIP-------LSS 233
                          170
                   ....*....|....*...
gi 149031233   242 SQANRVSSPPAHA-LPPG 258
Cdd:pfam15488  234 STPGLPSCSTAHCfIPPR 251
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
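
The per-hit summary lines above (Pssm-ID, Cd Length, Bit Score, E-value) can be checked against the E-value threshold of 0.01 listed in the search parameters. The following is a minimal sketch of that check, not NCBI's own code: the function name `parse_summary`, the regular expression, and the choice of an inclusive comparison (`<=`) are assumptions made for illustration. The input strings are copied from the hits listed on this page.

```python
import re

# Summary lines copied verbatim from the domain hits listed above.
SUMMARY_LINES = [
    "Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 41.08  E-value: 5.62e-03",
    "Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 40.47  E-value: 5.86e-03",
    "Pssm-ID: 402565 [Multi-domain]  Cd Length: 129  Bit Score: 37.79  E-value: 6.63e-03",
    "Pssm-ID: 214590 [Multi-domain]  Cd Length: 90  Bit Score: 36.89  E-value: 6.92e-03",
    "Pssm-ID: 460967 [Multi-domain]  Cd Length: 531  Bit Score: 40.06  E-value: 8.44e-03",
    "Pssm-ID: 406050 [Multi-domain]  Cd Length: 294  Bit Score: 39.46  E-value: 9.63e-03",
]

# Hypothetical pattern for the summary line layout seen on this page.
SUMMARY_PATTERN = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*\[(?P<kind>[^\]]+)\]\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)

# Threshold taken from the "Preset Options" line of this page.
E_VALUE_THRESHOLD = 0.01


def parse_summary(line):
    """Parse one summary line into a dict of typed fields."""
    match = SUMMARY_PATTERN.search(line)
    if match is None:
        raise ValueError(f"unrecognized summary line: {line!r}")
    return {
        "pssm_id": int(match.group("pssm_id")),
        "kind": match.group("kind"),
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


hits = [parse_summary(line) for line in SUMMARY_LINES]

# Assumption: a hit is kept when its E-value does not exceed the threshold.
reported = [hit for hit in hits if hit["e_value"] <= E_VALUE_THRESHOLD]

for hit in reported:
    print(hit["pssm_id"], hit["bit_score"], hit["e_value"])
```

Every hit in this part of the listing has an E-value below 0.01, so under this assumption none of them would be filtered out.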
