|
Name |
Accession |
Description |
Interval |
E-value |
| COG5028 |
COG5028 |
Vesicle coat complex COPII, subunit SEC24/subunit SFB2/subunit SFB3 [Intracellular trafficking ... |
141-1040 |
5.33e-168 |
|
Vesicle coat complex COPII, subunit SEC24/subunit SFB2/subunit SFB3 [Intracellular trafficking and secretion];
Pssm-ID: 227361 [Multi-domain] Cd Length: 861 Bit Score: 515.50 E-value: 5.33e-168
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 141 MAHAGSGMAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLLPPMYRPDGLSG-PPPPNSQYQ 219
Cdd:COG5028 1 MSQHKKGVYPQAQSQVHTGAASSKKSARPHRAYANFSAGQMGMPPYTTPPLQQQSRRQIDQAATAMHNTGaNNPAPSVMS 80
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 220 PPPLPGQTLGAGYPPQQAANSGPQMAGaqlsypggfPGGPAQLAgPPQPQKKLDPDSIPSPiqviendrasrggqvyatn 299
Cdd:COG5028 81 PAFQSQQKFSSPYGGSMADGTAPKPTN---------PLVPVDLF-EDQPPPISDLFLPPPP------------------- 131
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 300 trgQIPPLvTTDCIIQDQGNASPRFIRCTTYCFPCTSDMAKQAQIPLAAVIKPFATIPSNESPLYLVNHGEsgPVRCNRC 379
Cdd:COG5028 132 ---IVPPL-TTNFVGSEQSNCSPKYVRSTMYAIPETNDLLKKSKIPFGLVIRPFLELYPEEDPVPLVEDGS--IVRCRRC 205
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 380 KAYMCPFMQFIEGGRRYQCGFCNCVNDVPPFYFQHLDHIGRRLDHYEKPELSLGSYEYVATLDYcrKNKPPNPPAFIFMI 459
Cdd:COG5028 206 RSYINPFVQFIEQGRKWRCNICRSKNDVPEGFDNPSGPNDPRSDRYSRPELKSGVVDFLAPKEY--SLRQPPPPVYVFLI 283
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 460 DVSYSNIKNGLVKLICEELKTMLEKLPKEEQEemsaIRVGFITYNKVLHFFNVKSNLaQPQMMVVTDVGEVFVPLLDG-F 538
Cdd:COG5028 284 DVSFEAIKNGLVKAAIRAILENLDQIPNFDPR----TKIAIICFDSSLHFFKLSPDL-DEQMLIVSDLDEPFLPFPSGlF 358
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 539 LVNYQESQSVIHNLLDQIPDMFADSNENETVFAPVIQAGMEALKAadCPGKLFIFHSSLPTAeAPGKLKNRDDKklvntd 618
Cdd:COG5028 359 VLPLKSCKQIIETLLDRVPRIFQDNKSPKNALGPALKAAKSLIGG--TGGKIIVFLSTLPNM-GIGKLQLREDK------ 429
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 619 kEKILFQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGGTLYKYNNFQM--HLDRQQFLNDLRNDIEK 696
Cdd:COG5028 430 -ESSLLSCKDSFYKEFAIECSKVGISVDLFLTSEDYIDVATLSHLCRYTGGQTYFYPNFSAtrPNDATKLANDLVSHLSM 508
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 697 KIGFDAIMRVRTSTGFRATDFFGGILMNNTTDVEMAAIDCDKAVTVEFKHDDKLSeDTGALIQCAVLYTTISGQRRLRIH 776
Cdd:COG5028 509 EIGYEAVMRVRCSTGLRVSSFYGNFFNRSSDLCAFSTMPRDTSLLVEFSIDEKLM-TSDVYFQVALLYTLNDGERRIRVV 587
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 777 NLGLNCSSQLADLYKSCETDALINFFAKSAFKAVLHQPLKVIREILVNQTAHMLACYRKNCASPSAASQLILPDSMKVLP 856
Cdd:COG5028 588 NLSLPTSSSIREVYASADQLAIACILAKKASTKALNSSLKEARVLINKSMVDILKAYKKELVKSNTSTQLPLPANLKLLP 667
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 857 VYMNCLLKNcVLLSRPEISTDERAYQRQLVMTMGVADSQLFFYPQLLPIHTL-------DVKSTMLPAAVRCSESRLSEE 929
Cdd:COG5028 668 LLMLALLKS-SAFRSGSTPSDIRISALNRLTSLPLKQLMRNIYPTLYALHDMpieaglpDEGLLVLPSPINATSSLLESG 746
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 930 GIFLLANGLHMFLWLGVSSPPELIQGIFNVPSFAHINTDMTLLPEVGNPYSQQLRMIMGIIQQKRPYS-MKLTIVKQREQ 1008
Cdd:COG5028 747 GLYLIDTGQKIFLWFGKDAVPSLLQDLFGVDSLSDIPSGKFTLPPTGNEFNERVRNIIGELRSVNDDStLPLVLVRGGGD 826
|
890 900 910
....*....|....*....|....*....|....
gi 1622941236 1009 P--EMVFRQFLVEDKgLYGGSSYVDFLCCVHKEI 1040
Cdd:COG5028 827 PslRLWFFSTLVEDK-TLNIPSYLDYLQILHEKI 859
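
Each hit above is summarized by an interval on the query (for example `141-1040`) and a line of the form `Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...`. A minimal Python sketch for pulling those fields out of such text is shown below. The names `SUMMARY_RE`, `parse_summary`, and `query_span` are hypothetical helpers written for this illustration; they are not part of any NCBI tool, and the pattern assumes the fields appear in exactly the order seen in this listing.

```python
import re

# Assumed field order, as it appears in this listing:
# Pssm-ID, optional [tag], Cd Length, Bit Score, E-value.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*"
    r"(?:\[(?P<kind>[^\]]+)\]\s*)?"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the summary fields as a dict, or None if the line does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m["pssm_id"]),
        "kind": m["kind"],  # e.g. "Multi-domain"; None if the tag is absent
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "e_value": float(m["e_value"]),
    }


def query_span(interval):
    """Number of query residues covered by an inclusive 'start-end' interval."""
    start, end = (int(x) for x in interval.split("-"))
    return end - start + 1
```

Applied to the COG5028 hit, `query_span("141-1040")` counts the query residues covered by the alignment, which can then be compared with the `Cd Length` value to judge how much of the domain model is represented.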
|
|
| Sec24-like |
cd01479 |
Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the ... |
450-709 |
1.11e-118 |
|
Sec24-like: Protein and membrane traffic in eukaryotes is mediated at least in part by the budding and fusion of intracellular transport vesicles that selectively carry cargo proteins and lipids from donor to acceptor organelles. The two main classes of vesicular carriers within the endocytic and the biosynthetic pathways are COP- and clathrin-coated vesicles. Formation of COPII vesicles requires the ordered assembly of a coat built from several cytosolic components: the GTPase Sar1 and the Sec23-Sec24 and Sec13-Sec31 complexes. The process is initiated by the exchange of GDP for GTP on the GTPase Sar1, which then recruits the heterodimeric complex of Sec23 and Sec24. This heterodimeric complex generates the pre-budding complex. The final step, leading to membrane deformation and budding of COPII-coated vesicles, is carried out by the heterodimeric complex Sec13-Sec31. The members of this CD belong to the Sec23-like family. Sec24 is very similar to Sec23. The Sec23 and Sec24 polypeptides fold into five distinct domains: a beta-barrel, a zinc finger, a vWA or trunk domain, an all-helical region and a carboxy-terminal gelsolin domain. The members of this subgroup carry a partial MIDAS motif and have the overall Para-Rossmann type fold that is characteristic of this superfamily.
Pssm-ID: 238756 [Multi-domain] Cd Length: 244 Bit Score: 364.29 E-value: 1.11e-118
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 450 PNPPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKLPKEEqeemSAIRVGFITYNKVLHFFNVKSNLAQPQMMVVTDVGE 529
Cdd:cd01479 1 PQPAVYVFLIDVSYNAIKSGLLATACEALLSNLDNLPGDD----PRTRVGFITFDSTLHFFNLKSSLEQPQMMVVSDLDD 76
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 530 VFVPLLDGFLVNYQESQSVIHNLLDQIPDMFADSNENETVFAPVIQAGMEALKaaDCPGKLFIFHSSLPTAEApGKLKNR 609
Cdd:cd01479 77 PFLPLPDGLLVNLKESRQVIEDLLDQIPEMFQDTKETESALGPALQAAFLLLK--ETGGKIIVFQSSLPTLGA-GKLKSR 153
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 610 DDKKLVNTDKEKILFQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGGTLYKYNNFQmhldrqqflND 689
Cdd:cd01479 154 EDPKLLSTDKEKQLLQPQTDFYKKLALECVKSQISVDLFLFSNQYVDVATLGCLSRLTGGQVYYYPSFN---------FS 224
|
250 260
....*....|....*....|
gi 1622941236 690 LRNDIEKKIGFDAIMRVRTS 709
Cdd:cd01479 225 APNDVEKLVNELARYLTRKI 244
|
|
| Sec23_trunk |
pfam04811 |
Sec23/Sec24 trunk domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum ... |
450-694 |
6.28e-111 |
|
Sec23/Sec24 trunk domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface.
Pssm-ID: 398467 [Multi-domain] Cd Length: 241 Bit Score: 343.46 E-value: 6.28e-111
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 450 PNPPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKLPKEeqeemSAIRVGFITYNKVLHFFNVKSNLAQPQMMVVTDVGE 529
Cdd:pfam04811 1 PQPPVFLFVIDVSYNAIKSGLLAALKESLLQSLDLLPGD-----PRARVGFITFDSTVHFFNLGSSLRQPQMLVVSDLQD 75
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 530 VFVPLLDGFLVNYQESQSVIHNLLDQIPDMFADSNENETVFAPVIQAGMEALKAADCPGKLFIFHSSLPTAEAPGKLKNR 609
Cdd:pfam04811 76 MFLPLPDRFLVPLSECRFVLEDLLEQLPPMFPVTKRPERCLGPALQAAFLLLKAAFTGGKIMVFQGGLPTVGPGGKLKSR 155
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 610 DDKKLVNTDKEKILFQPQTN-VYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGGTLYKYNNFQMHLDRQQFLN 688
Cdd:pfam04811 156 LDESHHGTDKEKAKLVKKADkFYKSLAKECVKQGHSVDLFAFSLDYVDVATLGQLSRLTGGQVYLYPSFQADVDGSKFKQ 235
|
....*.
gi 1622941236 689 DLRNDI 694
Cdd:pfam04811 236 DLQRYF 241
|
|
| trunk_domain |
cd01468 |
trunk domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi ... |
450-690 |
5.22e-98 |
|
trunk domain. COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is known as the trunk domain and has an alpha/beta vWA fold and forms the dimer interface. Some members of this family possess a partial MIDAS motif that is a characteristic feature of most vWA domain proteins.
Pssm-ID: 238745 [Multi-domain] Cd Length: 239 Bit Score: 309.18 E-value: 5.22e-98
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 450 PNPPAFIFMIDVSYSNIKNGLVKLICEELKTMLEKLPKEeqeemSAIRVGFITYNKVLHFFNVKSNLAQPQMMVVTDVGE 529
Cdd:cd01468 1 PQPPVFVFVIDVSYEAIKEGLLQALKESLLASLDLLPGD-----PRARVGLITYDSTVHFYNLSSDLAQPKMYVVSDLKD 75
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 530 VFVPLLDGFLVNYQESQSVIHNLLDQIPDMFAD--SNENETVFAPVIQAGMEALKAADCPGKLFIFHSSLPTAEaPGKLK 607
Cdd:cd01468 76 VFLPLPDRFLVPLSECKKVIHDLLEQLPPMFWPvpTHRPERCLGPALQAAFLLLKGTFAGGRIIVFQGGLPTVG-PGKLK 154
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 608 NRDDKKLVNTDKEKILFQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGGTLYKYNNFQMHLDRQQFL 687
Cdd:cd01468 155 SREDKEPIRSHDEAQLLKPATKFYKSLAKECVKSGICVDLFAFSLDYVDVATLKQLAKSTGGQVYLYDSFQAPNDGSKFK 234
|
...
gi 1622941236 688 NDL 690
Cdd:cd01468 235 QDL 237
|
|
| PTZ00395 |
PTZ00395 |
Sec24-related protein; Provisional |
452-1038 |
2.59e-50 |
|
Sec24-related protein; Provisional
Pssm-ID: 185594 [Multi-domain] Cd Length: 1560 Bit Score: 194.52 E-value: 2.59e-50
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 452 PPAFIFMIDVSYSNIKNGLVKLICEELKTMLE--KLPKeeqeemsaIRVGFITYNKVLHFFNVKSNLAQP---------- 519
Cdd:PTZ00395 952 PPYFVFVVECSYNAIYNNITYTILEGIRYAVQnvKCPQ--------TKIAIITFNSSIYFYHCKGGKGVSgeegdggggs 1023
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 520 ---QMMVVTDVGEVFVPL-LDGFLVNYQESQSVIHNLLDQIPDMFADSNENETVFAPVIQAGMEALKAADCPGKLFIFHS 595
Cdd:PTZ00395 1024 gnhQVIVMSDVDDPFLPLpLEDLFFGCVEEIDKINTLIDTIKSVSTTMQSYGSCGNSALKIAMDMLKERNGLGSICMFYT 1103
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 596 SLPTAeAPGKLKnrddkKLVNTDKEKILFQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVA--SLGLVPQLTGGTLYK 673
Cdd:PTZ00395 1104 TTPNC-GIGAIK-----ELKKDLQENFLEVKQKIFYDSLLLDLYAFNISVDIFIISSNNVRVCvpSLQYVAQNTGGKILF 1177
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 674 YNNFQMHLDRQQ-FLNDLRNDIEKKIGFDAIMRVRTSTG------FRATDFFGGILMNNTtdVEMAAIDCDKAVTVEFKH 746
Cdd:PTZ00395 1178 VENFLWQKDYKEiYMNIMDTLTSEDIAYCCELKLRYSHHmsvkklFCCNNNFNSIISVDT--IKIPKIRHDQTFAFLLNY 1255
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 747 DDKLSEDTGALIQCAVLYTTISGQRRLRIHNLGLNCSSQLADLYKSCETDALINFFAKSAFKAVLHQplKVIREILVNQT 826
Cdd:PTZ00395 1256 SDISESKKQIYFQCACIYTNLWGDRFVRLHTTHMNLTSSLSTVFRYTDAEALMNILIKQLCTNILHN--DNYSKIIIDNL 1333
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 827 AHMLACYRKNCASPSAASQLILPDSMKVLPVYMNCLLKNCVllSRPEISTDERAYQRQLVMTMGVADSQLFFYPQLLPIH 906
Cdd:PTZ00395 1334 AAILFSYRINCASSAHSGQLILPDTLKLLPLFTSSLLKHNV--TKKEILHDLKVYSLIKLLSMPIISSLLYVYPVMYVIH 1411
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 907 ---------TLDVKSTM-LPAAVRCSESRLSEEGIFLLANGLHMFLWLGVSSPPELIQGIF-NVPSFAHINTdmtlLPEV 975
Cdd:PTZ00395 1412 ikgktneidSMDVDDDLfIPKTIPSSAEKIYSNGIYLLDACTHFYLYFGFHSDANFAKEIVgDIPTEKNAHE----LNLT 1487
|
570 580 590 600 610 620
....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1622941236 976 GNPYSQQLRMIMGIIQQKRPYS--MKLTIVKQREQPEMVFRQFLVEDKGlYGGSSYVDFLCCVHK 1038
Cdd:PTZ00395 1488 DTPNAQKVQRIIKNLSRIHHFNkyVPLVMVAPKSNEEEHLISLCVEDKA-DKEYSYVNFLCFIHK 1551
|
|
| Sec23_helical |
pfam04815 |
Sec23/Sec24 helical domain; COPII-coated vesicles carry proteins from the endoplasmic ... |
796-896 |
1.56e-32 |
|
Sec23/Sec24 helical domain; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is composed of five alpha helices.
Pssm-ID: 461441 [Multi-domain] Cd Length: 103 Bit Score: 121.84 E-value: 1.56e-32
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 796 DALINFFAKSAFKAVLHQPLKVIREILVNQTAHMLACYRKNCASPSAASQLILPDSMKVLPVYMNCLLKNCVLLSRPEIS 875
Cdd:pfam04815 3 EAIAVLLAKKAVEKALSSSLSDAREALDNKLVDILAAYRKYCASSSSPGQLILPESLKLLPLYMLALLKSPALRGGNSSP 82
|
90 100
....*....|....*....|.
gi 1622941236 876 TDERAYQRQLVMTMGVADSQL 896
Cdd:pfam04815 83 SDERAYARHLLLSLPVEELLL 103
|
|
| Sec23_BS |
pfam08033 |
Sec23/Sec24 beta-sandwich domain; |
699-783 |
6.18e-29 |
|
Sec23/Sec24 beta-sandwich domain;
Pssm-ID: 429794 [Multi-domain] Cd Length: 86 Bit Score: 110.70 E-value: 6.18e-29
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 699 GFDAIMRVRTSTGFRATDFFGGILMNNTTD-VEMAAIDCDKAVTVEFKHDDKLSEDTGALIQCAVLYTTISGQRRLRIHN 777
Cdd:pfam08033 1 GFNAVLRVRTSKGLKVSGFIGNFVSRSSGDtWKLPSLDPDTSYAFEFDIDEPLPNGSNAYIQFALLYTHSSGERRIRVTT 80
|
....*.
gi 1622941236 778 LGLNCS 783
Cdd:pfam08033 81 VALPVT 86
|
|
| SEC23 |
COG5047 |
Vesicle coat complex COPII, subunit SEC23 [Intracellular trafficking and secretion]; |
325-913 |
1.48e-21 |
|
Vesicle coat complex COPII, subunit SEC23 [Intracellular trafficking and secretion];
Pssm-ID: 227380 [Multi-domain] Cd Length: 755 Bit Score: 101.11 E-value: 1.48e-21
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 325 IRCTTYCFPCTSDMAKQAQIPLAAVIKPFatipsNESPLYLVNHGEsgPVRCNR-CKAYMCPFMQFIEGGRRYQCGFCNC 403
Cdd:COG5047 12 IRLTWNVFPATRGDATRTVIPIACLYTPL-----HEDDALTVNYYE--PVKCTApCKAVLNPYCHIDERNQSWICPFCNQ 84
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 404 VNDVPPFYfqhldhigRRLDHYEKP-ELSLGS--YEYVAtldycrkNKPPN-PPAFIFMIDVSYSNIKNGLVKlicEELK 479
Cdd:COG5047 85 RNTLPPQY--------RDISNANLPlELLPQSstIEYTL-------SKPVIlPPVFFFVVDACCDEEELTALK---DSLI 146
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 480 TMLEKLPKEeqeemsAIrVGFITYNKVLHFFNV------KSNLAQP----QMMVVTDVGEVFVPLLDG------------ 537
Cdd:COG5047 147 VSLSLLPPE------AL-VGLITYGTSIQVHELnaenhrRSYVFSGnkeyTKENLQELLALSKPTKSGgfeskisgigqf 219
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 538 ----FLVNYQESQSVIHNLLDQI-PDMFADSNENE----TVFAPVIQAGMEALKAADCPGKLFIFHSSlPTAEAPGKLKN 608
Cdd:COG5047 220 assrFLLPTQQCEFKLLNILEQLqPDPWPVPAGKRplrcTGSALNIASSLLEQCFPNAGCHIVLFAGG-PCTVGPGTVVS 298
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 609 RDDKK------LVNTDKEKiLFQPQTNVYDSLAKDCVAHGCSVTLFLfpSQYVDVASLGLVP--QLTGGTLYKYNNFQMH 680
Cdd:COG5047 299 TELKEpmrshhDIESDSAQ-HSKKATKFYKGLAERVANQGHALDIFA--GCLDQIGIMEMEPltTSTGGALVLSDSFTTS 375
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 681 LDRQQFLN--DLRNDIEKKIGFDAIMRVRTSTGFRATDFFG---------------GILMNNTTDVEMAAIDCDKAVTVE 743
Cdd:COG5047 376 IFKQSFQRifNRDSEGYLKMGFNANMEVKTSKNLKIKGLIGhavsvkkkannisdsEIGIGATNSWKMASLSPKSNYALY 455
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 744 FK-----HDDKLSEDTGALIQCAVLYTTISGQRRLRIHNLGLNCSSQLADL-YKSCETDALINFFAK-SAFKAVLHQPLK 816
Cdd:COG5047 456 FEialgaASGSAQRPAEAYIQFITTYQHSSGTYRIRVTTVARMFTDGGLPKiNRSFDQEAAAVFMARiAAFKAETEDIID 535
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 817 VIREI---LVNQTAHmLACYRKNcaspsAASQLILPDSMKVLPVYMnCLLKNCVLLSRPEISTDERAYQRQLVMTMGVAD 893
Cdd:COG5047 536 VFRWIdrnLIRLCQK-FADYRKD-----DPSSFRLDPNFTLYPQFM-YHLRRSPFLSVFNNSPDETAFYRHMLNNADVND 608
|
650 660
....*....|....*....|....*...
gi 1622941236 894 SQLFFYPQLLPIH--------TLDVKST 913
Cdd:COG5047 609 SLIMIQPTLQSYSfekggvpvLLDSVSV 636
|
|
| zf-Sec23_Sec24 |
pfam04810 |
Sec23/Sec24 zinc finger; COPII-coated vesicles carry proteins from the endoplasmic reticulum ... |
373-409 |
9.57e-16 |
|
Sec23/Sec24 zinc finger; COPII-coated vesicles carry proteins from the endoplasmic reticulum to the Golgi complex. This vesicular transport can be reconstituted by using three cytosolic components containing five proteins: the small GTPase Sar1p, the Sec23p/24p complex, and the Sec13p/Sec31p complex. This domain is found to be a zinc-binding domain.
Pssm-ID: 461437 [Multi-domain] Cd Length: 38 Bit Score: 71.71 E-value: 9.57e-16
10 20 30
....*....|....*....|....*....|....*..
gi 1622941236 373 PVRCNRCKAYMCPFMQFIEGGRRYQCGFCNCVNDVPP 409
Cdd:pfam04810 1 PVRCRRCRAYLNPFCQFDFGGKKWTCNFCGTRNPVPP 37
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
8-306 |
8.50e-12 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 69.97 E-value: 8.50e-12
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 8 ATPPYSQPQPGIGLSPPHYGhygDPSHTASPTGMMKPAGPLGVTATGGmlppgppppgppppgphqfgQNGAHAAGHPQQ 87
Cdd:PHA03247 2718 ATPLPPGPAAARQASPALPA---APAPPAVPAGPATPGGPARPARPPT--------------------TAGPPAPAPPAA 2774
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 88 RFPGPPPVNNV--ASSYAPYQPSAQSSYLSPMSTSSVTQLGSQLSAMQinsygsgmahagsgmAPPSQGPPGPLSATSLQ 165
Cdd:PHA03247 2775 PAAGPPRRLTRpaVASLSESRESLPSPWDPADPPAAVLAPAAALPPAA---------------SPAGPLPPPTSAQPTAP 2839
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 166 TPPRPPQPSILQPGSQVLPPPPTTLNGPG----ASPLLPPMYRPDGLSGPPPPNSQyQPPPLPGQTLGAGYPPQQAANSG 241
Cdd:PHA03247 2840 PPPPGPPPPSLPLGGSVAPGGDVRRRPPSrspaAKPAAPARPPVRRLARPAVSRST-ESFALPPDQPERPPQPQAPPPPQ 2918
|
250 260 270 280 290 300
....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1622941236 242 PQMAGAQLSYPGGFPGGPAQLAGPPQPQKKLDPDSIPSPIQVIENDRASRGGQVYAtnTRGQIPP 306
Cdd:PHA03247 2919 PQPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWLGALVPGRVAV--PRFRVPQ 2981
|
|
| PLN00162 |
PLN00162 |
transport protein sec23; Provisional |
325-669 |
2.04e-11 |
|
transport protein sec23; Provisional
Pssm-ID: 215083 [Multi-domain] Cd Length: 761 Bit Score: 68.04 E-value: 2.04e-11
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 325 IRCTTYCFPCTSDMAKQAQIPLAAVIKPFAtiPSNESPL--YlvnhgesGPVRCNRCKAYMCPFMQFIEGGRRYQCGFCN 402
Cdd:PLN00162 12 VRMSWNVWPSSKIEASKCVIPLAALYTPLK--PLPELPVlpY-------DPLRCRTCRAVLNPYCRVDFQAKIWICPFCF 82
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 403 CVNDVPPFYFQhldhIGrrlDHYEKPELslgsYEYVATLDY---CRKNKPPNPPAFIFMIDVSYSNIKNGLVKlicEELK 479
Cdd:PLN00162 83 QRNHFPPHYSS----IS---ETNLPAEL----FPQYTTVEYtlpPGSGGAPSPPVFVFVVDTCMIEEELGALK---SALL 148
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 480 TMLEKLPKEeqeemsaIRVGFITY----------------------------NKVLHFFNVKSNLAQPQMMVVTDVGEVF 531
Cdd:PLN00162 149 QAIALLPEN-------ALVGLITFgthvhvhelgfsecsksyvfrgnkevskDQILEQLGLGGKKRRPAGGGIAGARDGL 221
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 532 VPL-LDGFLVNYQESQSVIHNLLDQI-PDMFADSNENE----TVFAPVIQAGMEALKAADCPGKLFIFHSSlPTAEAPGK 605
Cdd:PLN00162 222 SSSgVNRFLLPASECEFTLNSALEELqKDPWPVPPGHRparcTGAALSVAAGLLGACVPGTGARIMAFVGG-PCTEGPGA 300
|
330 340 350 360 370 380
....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1622941236 606 LKNRDDKKLVNTDKEKI-----LFQPQTNVYDSLAKDCVAHGCSVTLFLFPSQYVDVASLGLVPQLTGG 669
Cdd:PLN00162 301 IVSKDLSEPIRSHKDLDkdaapYYKKAVKFYEGLAKQLVAQGHVLDVFACSLDQVGVAEMKVAVERTGG 369
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
81-322 |
6.97e-10 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 63.80 E-value: 6.97e-10
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 81 AAGHPQQRFPGPPPVNNVASSYAPYQPSAQSSYLSPMSTSSVTQLGSQLSAMQINSYGSGMAHAGSGmaPPSQGPPGPLS 160
Cdd:PHA03247 2699 ADPPPPPPTPEPAPHALVSATPLPPGPAAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAG--PPAPAPPAAPA 2776
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 161 ATSLQTPPRPPQPSiLQPGSQVLP----PPPTTLNGPGASPLLPPMYRPDGLSGPPPPNSQYQPPPLPG-----QTLGAG 231
Cdd:PHA03247 2777 AGPPRRLTRPAVAS-LSESRESLPspwdPADPPAAVLAPAAALPPAASPAGPLPPPTSAQPTAPPPPPGppppsLPLGGS 2855
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 232 Y---------PPQQAANSGP----QMAGAQLSYPGGFPGGPAQLAGPPQPQKKLDPDSIPSPIQVIENDRASRGGQVYAT 298
Cdd:PHA03247 2856 VapggdvrrrPPSRSPAAKPaapaRPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPP 2935
|
250 260
....*....|....*....|....
gi 1622941236 299 NTRGQIPPLVTTDciIQDQGNASP 322
Cdd:PHA03247 2936 PPRPQPPLAPTTD--PAGAGEPSG 2957
|
|
| Gelsolin |
pfam00626 |
Gelsolin repeat; |
912-987 |
1.07e-09 |
|
Gelsolin repeat;
Pssm-ID: 395501 [Multi-domain] Cd Length: 76 Bit Score: 55.78 E-value: 1.07e-09
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1622941236 912 STMLPAAVRCSESRLSEEGIFLLANGLHMFLWLGVSSppELIQGIFNVPSFAHINTDM-TLLPEVGN-PYSQQLRMIM 987
Cdd:pfam00626 1 KFVLPPPVPLSQESLNSGDCYLLDNGFTIFLWVGKGS--SLLEKLFAALLAAQLDDDErFPLPEVIRvPQGKEPARFL 76
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
3-279 |
6.56e-09 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 60.17 E-value: 6.56e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 3 QQGYVATPPYSQPQPGIGLSPPhyghYGDPSHTASPTGMMKPAGPlGVTATGGMLPPGPPPPGPPPPGPHQFGQNGAHAa 82
Cdd:pfam03154 164 QQILQTQPPVLQAQSGAASPPS----PPPPGTTQAATAGPTPSAP-SVPPQGSPATSQPPNQTQSTAAPHTLIQQTPTL- 237
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 83 gHPQqRFPGP-PPVNNVASSYAPYQPSAQSsYLSPMSTSSVTQLGSQLSA--MQINSYGSGMAHAGSGMAPPSQGPPGPL 159
Cdd:pfam03154 238 -HPQ-RLPSPhPPLQPMTQPPPPSQVSPQP-LPQPSLHGQMPPMPHSLQTgpSHMQHPVPPQPFPLTPQSSQSQVPPGPS 314
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 160 SATSLQ------TPPRPPQPSILQPG-SQVLPPPPTTL---NGPGASPlLPPMYRPDG------LSGPPP--PNSQYQPP 221
Cdd:pfam03154 315 PAAPGQsqqrihTPPSQSQLQSQQPPrEQPLPPAPLSMphiKPPPTTP-IPQLPNPQShkhpphLSGPSPfqMNSNLPPP 393
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 222 PL--PGQTLGAGYPP-------------QQAANSGPQ-----------MAGAQLSYPGGFPGGPAQlagPPQPQKKLDPD 275
Cdd:pfam03154 394 PAlkPLSSLSTHHPPsahppplqlmpqsQQLPPPPAQppvltqsqslpPPAASHPPTSGLHQVPSQ---SPFPQHPFVPG 470
|
....
gi 1622941236 276 SIPS 279
Cdd:pfam03154 471 GPPP 474
|
|
| PAT1 |
pfam09770 |
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... |
91-267 |
4.57e-08 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 57.35 E-value: 4.57e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 91 GPPPVNNVASSYAPYQPSAQSSYLSP----MSTSSVTqlgsqlSAMQINSYGSGmAHAGSGMAPPSQGPPGPLSATSLQT 166
Cdd:pfam09770 164 GVAPKKAAAPAPAPQPAAQPASLPAPsrkmMSLEEVE------AAMRAQAKKPA-QQPAPAPAQPPAAPPAQQAQQQQQF 236
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 167 PPRPPQPSILQPGSQVLPP------PPTTLNGPGASPLLPPMYRPdglsGPPPPNSQYQPPPLPGQTL------------ 228
Cdd:pfam09770 237 PPQIQQQQQPQQQPQQPQQhpgqghPVTILQRPQSPQPDPAQPSI----QPQAQQFHQQPPPVPVQPTqilqnpnrlsaa 312
|
170 180 190
....*....|....*....|....*....|....*....
gi 1622941236 229 GAGYPPQQAANSGPQMAGAQLSYPGGFPGGPAQLAGPPQ 267
Cdd:pfam09770 313 RVGYPQNPQPGVQPAPAHQAHRQQGSFGRQAPIITHPQQ 351
|
|
| SOBP |
pfam15279 |
Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual ... |
93-281 |
5.35e-08 |
|
Sine oculis-binding protein; SOBP is associated with syndromic and nonsyndromic intellectual disability. It carries a zinc-finger of the zf-C2H2 type at the N-terminus, and a highly characteristic C-terminal PhPhPhPhPhPh motif. The deduced 873-amino acid protein contains an N-terminal nuclear localization signal (NLS), followed by 2 FCS-type zinc finger motifs, a proline-rich region (PR1), a putative RNA-binding motif region, and a C-terminal NLS embedded in a second proline-rich motif. SOBP is expressed in various human tissues, and in developing mouse brain at embryonic day 14. In postnatal and adult mouse brain SOBP is expressed in all neurons, with intense staining in the limbic system. Highest expression is in layer V cortical neurons, hippocampus, pyriform cortex, dorsomedial nucleus of thalamus, amygdala, and hypothalamus. Postnatal expression of SOBP in the limbic system corresponds to a time of active synaptogenesis. The family is also referred to as Jackson circler, JXC1. In seven affected siblings from a consanguineous Israeli Arab family with mental retardation, anterior maxillary protrusion, and strabismus, mutations were found in this protein.
Pssm-ID: 464609 [Multi-domain] Cd Length: 325 Bit Score: 55.98 E-value: 5.35e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 93 PPVNNVASSYAPYQPSAQSSYLSPMSTSSVTQLGSQLS---AMQINSYGSGMAHAGSGMAPPSQGPPGPLSAT----SLQ 165
Cdd:pfam15279 117 ISVASSSKLLAPKPHEPPSLPPPPLPPKKGRRHRPGLHpplGRPPGSPPMSMTPRGLLGKPQQHPPPSPLPAFmepsSMP 196
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 166 TPPRPPQPSILQPGSQVLPP------PPTTLNGPGASPLLPPMYRPDGLSGPPPPNSQYQPPPLPGQTLGAGYPPQQAA- 238
Cdd:pfam15279 197 PPFLRPPPSIPQPNSPLSNPmlpgigPPPKPPRNLGPPSNPMHRPPFSPHHPPPPPTPPGPPPGLPPPPPRGFTPPFGPp 276
|
170 180 190 200
....*....|....*....|....*....|....*....|....*..
gi 1622941236 239 -NSGPQMAGAQLSYPgGFPgGPAQLAGPPQ---PQKKLDPDSIPSPI 281
Cdd:pfam15279 277 fPPVNMMPNPPEMNF-GLP-SLAPLVPPVTvlvPYPVIIPLPVPIPI 321
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
9-302 |
5.82e-08 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 57.08 E-value: 5.82e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 9 TPPYSQPQPGIGLSPPHYGH--YGDPSHTASPTgmmkPAGPLGVTATGGMLPPGPPPPGPPPPGPHQFGQNGAHAAGHPQ 86
Cdd:pfam03154 261 VSPQPLPQPSLHGQMPPMPHslQTGPSHMQHPV----PPQPFPLTPQSSQSQVPPGPSPAAPGQSQQRIHTPPSQSQLQS 336
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 87 QRFP---------------GPPPVNNVassyaPYQPSAQSSYLSP-MSTSSVTQLGSQLSAMQINSYGSGMahagSGMAP 150
Cdd:pfam03154 337 QQPPreqplppaplsmphiKPPPTTPI-----PQLPNPQSHKHPPhLSGPSPFQMNSNLPPPPALKPLSSL----STHHP 407
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 151 PSQGPPgPLS--ATSLQTPPRPPQPSILQ------PGSQVLPPPPTTLNGPGASPLLPPMYRPdglSGPPPPNSQYQPPP 222
Cdd:pfam03154 408 PSAHPP-PLQlmPQSQQLPPPPAQPPVLTqsqslpPPAASHPPTSGLHQVPSQSPFPQHPFVP---GGPPPITPPSGPPT 483
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 223 LPGQTLGAGYPPQQA--ANSGPQMAGAQLSYPggfpggPAQLAGPP-----QPQKKLDPDSIPSPIQVIEN--DRASRGG 293
Cdd:pfam03154 484 STSSAMPGIQPPSSAsvSSSGPVPAAVSCPLP------PVQIKEEAldeaeEPESPPPPPRSPSPEPTVVNtpSHASQSA 557
|
....*....
gi 1622941236 294 QVYATNTRG 302
Cdd:pfam03154 558 RFYKHLDRG 566
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
92-272 |
8.67e-08 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 56.70 E-value: 8.67e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 92 PPPVNNVASSYAPYQPSAQSSYLSPMSTSSVTQLGSQLSAMQINSygsgmahagsgmAPPSQGPPGPLSATSLQTP-PRP 170
Cdd:pfam03154 134 PKDIDQDNRSTSPSIPSPQDNESDSDSSAQQQILQTQPPVLQAQS------------GAASPPSPPPPGTTQAATAgPTP 201
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 171 PQPSILQPGSQVLPPPPTTLNGPgASPLL----PPMYRPDGLSGPPPPNSQYQPPPLPGQTLGAGYPPQQAANSGP---- 242
Cdd:pfam03154 202 SAPSVPPQGSPATSQPPNQTQST-AAPHTliqqTPTLHPQRLPSPHPPLQPMTQPPPPSQVSPQPLPQPSLHGQMPpmph 280
|
170 180 190
....*....|....*....|....*....|....*..
gi 1622941236 243 --QMAGAQLSYPG---GFPGGP--AQLAGPPQPQKKL 272
Cdd:pfam03154 281 slQTGPSHMQHPVppqPFPLTPqsSQSQVPPGPSPAA 317
|
|
| Med15 |
pfam09606 |
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ... |
1-274 |
9.19e-08 |
|
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.
Pssm-ID: 312941 [Multi-domain] Cd Length: 732 Bit Score: 56.17 E-value: 9.19e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 1 MSQQGYVATP--PYSQPQPGIGLSPPHYGHYGDPSHTASPTGMMKPAGPlgvtatggmlppgppppGPPPPGPHQFGQNG 78
Cdd:pfam09606 162 SSGQPGSGTPnqMGPNGGPGQGQAGGMNGGQQGPMGGQMPPQMGVPGMP-----------------GPADAGAQMGQQAQ 224
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 79 AHAAGHPQQRFPGPPpvnNVASSYAPYQPSAQSSylspmstssvtQLGSQLSAMQINSYGSGMAH----AGSGMAPPSQG 154
Cdd:pfam09606 225 ANGGMNPQQMGGAPN---QVAMQQQQPQQQGQQS-----------QLGMGINQMQQMPQGVGGGAgqggPGQPMGPPGQQ 290
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 155 PPGPLSATSLQTPPRPPQPSILQPGSQvlppppTTLNGPGASPLLPPMYRPDGLSGPPPPNSQYQPPPLPGQTLGAGYPP 234
Cdd:pfam09606 291 PGAMPNVMSIGDQNNYQQQQTRQQQQQ------QGGNHPAAHQQQMNQSVGQGGQVVALGGLNHLETWNPGNFGGLGANP 364
|
250 260 270 280
....*....|....*....|....*....|....*....|
gi 1622941236 235 QQAANsgPQMAGAQLSYPGGFPGGPAQLAGPPQPQKKLDP 274
Cdd:pfam09606 365 MQRGQ--PGMMSSPSPVPGQQVRQVTPNQFMRQSPQPSVP 402
|
|
| PABP-1234 |
TIGR01628 |
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ... |
107-244 |
9.98e-08 |
|
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens which are, respectively, expressed in testis, expressed in platelets, broadly expressed, and of unknown tissue range.
Pssm-ID: 130689 [Multi-domain] Cd Length: 562 Bit Score: 55.97 E-value: 9.98e-08
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 107 PSAQSSYLSPMSTS-SVTQLGSQLSAMQINSYGSGmaHAGSGMAPPSQGPPgplsatslqtppRPPQPSILQPGSQVLPP 185
Cdd:TIGR01628 381 RMRQLPMGSPMGGAmGQPPYYGQGPQQQFNGQPLG--WPRMSMMPTPMGPG------------GPLRPNGLAPMNAVRAP 446
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|....*....
gi 1622941236 186 PPTTLNGPGASPLLPPMYRPDGLSGPPPPNSQyQPPPLPGQTLGAGYPPQQAANSGPQM 244
Cdd:TIGR01628 447 SRNAQNAAQKPPMQPVMYPPNYQSLPLSQDLP-QPQSTASQGGQNKKLAQVLASATPQM 504
|
|
| PHA03247 |
PHA03247 |
large tegument protein UL36; Provisional |
8-362 |
2.02e-07 |
|
large tegument protein UL36; Provisional
Pssm-ID: 223021 [Multi-domain] Cd Length: 3151 Bit Score: 55.71 E-value: 2.02e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 8 ATPPYSQPQPGIGLSPphYGHYGDPSHTASPTGMMKPAGPLGVTATGGmlppgppppgppppGPHQFGQNGAHAAGHPqq 87
Cdd:PHA03247 2586 ARRPDAPPQSARPRAP--VDDRGDPRGPAPPSPLPPDTHAPDPPPPSP--------------SPAANEPDPHPPPTVP-- 2647
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 88 rfpgPPPVNNVASSYAPYQPSAQSSYLSPMSTSSVTQLGSQLSAMQiNSYGSGMAHAGSGMAPPSQGPPGPLSATSLQTP 167
Cdd:PHA03247 2648 ----PPERPRDDPAPGRVSRPRRARRLGRAAQASSPPQRPRRRAAR-PTVGSLTSLADPPPPPPTPEPAPHALVSATPLP 2722
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 168 PRPPQPSILQPGSQVLPPPPTTLNGPGASPLLPPMYRPDGLSGPPPPNSQYQPP--PLPGQTLGAGYPPQQAANSGP--- 242
Cdd:PHA03247 2723 PGPAAARQASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAagPPRRLTRPAVASLSESRESLPspw 2802
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 243 QMAGAQLSYPGGFPG-----GPAQLAGPPQPQKKLDPDSIPSPIQVIENDRAS--RGGQVYATNTRGQIPPLVTTDCIIQ 315
Cdd:PHA03247 2803 DPADPPAAVLAPAAAlppaaSPAGPLPPPTSAQPTAPPPPPGPPPPSLPLGGSvaPGGDVRRRPPSRSPAAKPAAPARPP 2882
|
330 340 350 360
....*....|....*....|....*....|....*....|....*...
gi 1622941236 316 DQGNASPRFIRcTTYCFPCTSD-MAKQAQIPLAAVIKPFATIPSNESP 362
Cdd:PHA03247 2883 VRRLARPAVSR-STESFALPPDqPERPPQPQAPPPPQPQPQPPPPPQP 2929
|
|
| Atrophin-1 |
pfam03154 |
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ... |
86-281 |
2.28e-07 |
|
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.
Pssm-ID: 460830 [Multi-domain] Cd Length: 991 Bit Score: 55.16 E-value: 2.28e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 86 QQRFPGPPPV---NNVASSYAPYQPSAQSSYLSPMSTSSVTQLGSQLSAmqinsygsgmahagsGMAPPSQGPPGPLSAT 162
Cdd:pfam03154 164 QQILQTQPPVlqaQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSP---------------ATSQPPNQTQSTAAPH 228
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 163 SL--QTPPRPPQ--PSILQPGSQVLPPPPTTLNGPGASPlLPPMYRPdglsGPPPPNS-QYQPP--PLPGQTLGAGYPPQ 235
Cdd:pfam03154 229 TLiqQTPTLHPQrlPSPHPPLQPMTQPPPPSQVSPQPLP-QPSLHGQ----MPPMPHSlQTGPShmQHPVPPQPFPLTPQ 303
|
170 180 190 200
....*....|....*....|....*....|....*....|....*..
gi 1622941236 236 QAANSGPQMAGAQLSYPG-GFPGGPAQLAGPPQPQKKLDPDSIPSPI 281
Cdd:pfam03154 304 SSQSQVPPGPSPAAPGQSqQRIHTPPSQSQLQSQQPPREQPLPPAPL 350
|
|
| PHA03378 |
PHA03378 |
EBNA-3B; Provisional |
85-280 |
5.71e-07 |
|
EBNA-3B; Provisional
Pssm-ID: 223065 [Multi-domain] Cd Length: 991 Bit Score: 53.92 E-value: 5.71e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 85 PQQRFPGPPPVNNVASSYAPYQPSAQSSylspmstssvtqlgsqlSAMQINSYGSGMAH---AGSGMAPPSQGPPGPLS- 160
Cdd:PHA03378 654 PPQVEITPYKPTWTQIGHIPYQPSPTGA-----------------NTMLPIQWAPGTMQpppRAPTPMRPPAAPPGRAQr 716
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 161 ATSLQTPPRPPQ--PSILQPGSQV---LPPP---PTTLNGPGASPllPPMYRPDGLSGPPPPNSQYQPPPLPGQTLGAG- 231
Cdd:PHA03378 717 PAAATGRARPPAaaPGRARPPAAApgrARPPaaaPGRARPPAAAP--GRARPPAAAPGAPTPQPPPQAPPAPQQRPRGAp 794
|
170 180 190 200 210 220
....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1622941236 232 ---------------YPPQQAANSGPQMAGAQLSYPGGFPGGPAQLAGPPQPQKKLDPDSIPSP 280
Cdd:PHA03378 795 tpqpppqagptsmqlMPRAAPGQQGPTKQILRQLLTGGVKRGRPSLKKPAALERQAAAGPTPSP 858
|
|
| PRK07764 |
PRK07764 |
DNA polymerase III subunits gamma and tau; Validated |
75-297 |
6.50e-07 |
|
DNA polymerase III subunits gamma and tau; Validated
Pssm-ID: 236090 [Multi-domain] Cd Length: 824 Bit Score: 53.45 E-value: 6.50e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 75 GQNGAHAAGHPQQRFPGPPPVNN-----VASSYAPYQPSAQSSYLSPMSTSSVTQLGSQLSAMQINSYGSGMAHAGSGMA 149
Cdd:PRK07764 590 PAPGAAGGEGPPAPASSGPPEEAarpaaPAAPAAPAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPDASDGGDGW 669
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 150 PPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLLPPMYRPDGLSGPPPPNSQYQPPP----LPG 225
Cdd:PRK07764 670 PAKAGGAAPAAPPPAPAPAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAADDPVPLPpepdDPP 749
|
170 180 190 200 210 220 230
....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1622941236 226 QTLGAGYPPQQA-ANSGPQmagaqlsypggfPGGPAQLAGPPQPQKKLDPDSIPSPiqvieNDRASRGGQVYA 297
Cdd:PRK07764 750 DPAGAPAQPPPPpAPAPAA------------APAAAPPPSPPSEEEEMAEDDAPSM-----DDEDRRDAEEVA 805
|
|
| Drf_FH1 |
pfam06346 |
Formin Homology Region 1; This region is found in some of the Diaphanous related formins (Drfs) ... |
147-268 |
8.10e-07 |
|
Formin Homology Region 1; This region is found in some of the Diaphanous related formins (Drfs). It consists of low complexity repeats of around 12 residues.
Pssm-ID: 461881 [Multi-domain] Cd Length: 157 Bit Score: 49.87 E-value: 8.10e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 147 GMAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTlngPGASPLLPPMYRPDGLSGPPPP-----NSQYQPP 221
Cdd:pfam06346 9 DSSTIPLPPGACIPTPPPLPGGGGPPPPPPLPGSAAIPPPPPL---PGGTSIPPPPPLPGAASIPPPPplpgsTGIPPPP 85
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|....*.
gi 1622941236 222 PLPGqtlGAGYPPQQAANSG---------PQMAGAQLSYPGGFPGGPAQLAGPPQP 268
Cdd:pfam06346 86 PLPG---GAGIPPPPPPLPGgagvpppppPLPGGPGIPPPPPFPGGPGIPPPPPGM 138
|
|
| PHA03378 |
PHA03378 |
EBNA-3B; Provisional |
146-282 |
1.92e-06 |
|
EBNA-3B; Provisional
Pssm-ID: 223065 [Multi-domain] Cd Length: 991 Bit Score: 51.99 E-value: 1.92e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 146 SGMAPPSQGPpgplsaTSLQTPPRPPQPsilqpgsqVLPP--PPTTLNGPGASPllPPMYRPDGLSGPP-PPNSQYQPPP 222
Cdd:PHA03378 682 NTMLPIQWAP------GTMQPPPRAPTP--------MRPPaaPPGRAQRPAAAT--GRARPPAAAPGRArPPAAAPGRAR 745
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1622941236 223 LPGQTLGAGYPPQQAANSGPQmagaqlsyPGGFPGGPA---QLAGPPQPQKKldPDSIPSPIQ 282
Cdd:PHA03378 746 PPAAAPGRARPPAAAPGRARP--------PAAAPGAPTpqpPPQAPPAPQQR--PRGAPTPQP 798
|
|
| PHA03307 |
PHA03307 |
transcriptional regulator ICP4; Provisional |
149-293 |
4.37e-06 |
|
transcriptional regulator ICP4; Provisional
Pssm-ID: 223039 [Multi-domain] Cd Length: 1352 Bit Score: 50.94 E-value: 4.37e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 149 APPSQGPPGPLSATSLQTPPRPPQPsilqpgsqvLPPPPTTlngpgaSPLLPPMYRPDGLSGPPPPNSQYQPPPLPGQTL 228
Cdd:PHA03307 99 SPAREGSPTPPGPSSPDPPPPTPPP---------ASPPPSP------APDLSEMLRPVGSPGPPPAASPPAAGASPAAVA 163
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1622941236 229 GAGYPPQQAA---NSGPQMAGAQLSYPGGFPGGPAQLAGPPQPQKKLDPDSIPSPIQVIENDRASRGG 293
Cdd:PHA03307 164 SDAASSRQAAlplSSPEETARAPSSPPAEPPPSTPPAAASPRPPRRSSPISASASSPAPAPGRSAADD 231
|
|
| PPE |
COG5651 |
PPE-repeat protein [Function unknown]; |
89-266 |
5.60e-06 |
|
PPE-repeat protein [Function unknown];
Pssm-ID: 444372 [Multi-domain] Cd Length: 385 Bit Score: 49.89 E-value: 5.60e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 89 FPGPPPVN----------NVASSYAPYQPSAQSSYLSPMSTSSVTQLGSQLSAMQINSYGSGMAHAGSGMAPPSQGPPGP 158
Cdd:COG5651 167 FTQPPPTItnpggllgaqNAGSGNTSSNPGFANLGLTGLNQVGIGGLNSGSGPIGLNSGPGNTGFAGTGAAAGAAAAAAA 246
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 159 LSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLLPPmyrPDGLSGPPPPNSQYQPPPLPGQTLGAGYPPQQAA 238
Cdd:COG5651 247 AAAAAGAGASAALASLAATLLNASSLGLAATAASSAATNLGLA---GSPLGLAGGGAGAAAATGLGLGAGGAAGAAGATG 323
|
170 180
....*....|....*....|....*...
gi 1622941236 239 NSGPQMAGAQLSYPGGFPGGPAQLAGPP 266
Cdd:COG5651 324 AGAALGAGAAAAAAGAAAGAGAAAAAAA 351
|
|
| PAT1 |
pfam09770 |
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... |
74-306 |
7.22e-06 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 50.03 E-value: 7.22e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 74 FGQNG--AHAAGHPQQRFP--GPPPVNNVASSYAPYQPSAQSSYLSPMSTSSVTQLGSQLSAMQINSYGSGMaHAGS--- 146
Cdd:pfam09770 85 FGQTAkvSDAIEEEQVRFNrqQPAARAAQSSAQPPASSLPQYQYASQQSQQPSKPVRTGYEKYKEPEPIPDL-QVDAslw 163
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 147 GMAPPSQGPPGPLSATSLQtPPRPPQPS--IL-------QPGSQVLPPPPTTLNGPGASPLLPPMYRPDGLSGPPPPNSQ 217
Cdd:pfam09770 164 GVAPKKAAAPAPAPQPAAQ-PASLPAPSrkMMsleeveaAMRAQAKKPAQQPAPAPAQPPAAPPAQQAQQQQQFPPQIQQ 242
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 218 YQPPPLPGQtlgagyPPQQAANSGPQMAgaQLSYPGGFPGGPAQLAGPPQPQ--KKLDPDSIPSPIQVIEN-DRASRGGQ 294
Cdd:pfam09770 243 QQQPQQQPQ------QPQQHPGQGHPVT--ILQRPQSPQPDPAQPSIQPQAQqfHQQPPPVPVQPTQILQNpNRLSAARV 314
|
250
....*....|..
gi 1622941236 295 VYATNTRGQIPP 306
Cdd:pfam09770 315 GYPQNPQPGVQP 326
|
|
| Pro-rich |
pfam15240 |
Proline-rich protein; This family includes several eukaryotic proline-rich proteins. |
127-282 |
8.04e-06 |
|
Proline-rich protein; This family includes several eukaryotic proline-rich proteins.
Pssm-ID: 464580 [Multi-domain] Cd Length: 167 Bit Score: 47.34 E-value: 8.04e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 127 SQLSAMQINSYGSGMAHAGSgmaPPSQGPPGPLSATSLQTPPRP--PQPSILQPGSQV--LPPPPTTLNGPGASPLLPPM 202
Cdd:pfam15240 30 SLISEEEGQSQQGGQGPQGP---PPGGFPPQPPASDDPPGPPPPggPQQPPPQGGKQKpqGPPPQGGPRPPPGKPQGPPP 106
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 203 YRPDGLSGPPPPNSQYQPPPLPGQTLGAGYPPQQAansgpqmagaqlsypggfpggPAQLAGPPQ--PQKKLDPDSIPSP 280
Cdd:pfam15240 107 QGGNQQQGPPPPGKPQGPPPQGGGPPPQGGNQQGP---------------------PPPPPGNPQgpPQRPPQPGNPQGP 165
|
..
gi 1622941236 281 IQ 282
Cdd:pfam15240 166 PQ 167
|
|
| PRK07764 |
PRK07764 |
DNA polymerase III subunits gamma and tau; Validated |
139-277 |
1.18e-05 |
|
DNA polymerase III subunits gamma and tau; Validated
Pssm-ID: 236090 [Multi-domain] Cd Length: 824 Bit Score: 49.60 E-value: 1.18e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 139 SGMAHAGSGMAPPSQGPPGPLSATSLQTPPRPPQPsilqPGSQVLPPPPTTLNGPGASPLLPPMYRPDGLSGPPPPNSQY 218
Cdd:PRK07764 383 RRLGVAGGAGAPAAAAPSAAAAAPAAAPAPAAAAP----AAAAAPAPAAAPQPAPAPAPAPAPPSPAGNAPAGGAPSPPP 458
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|....*....
gi 1622941236 219 QPPPLPGQTLGAGYPPQQAANSGPQMAGAQLSypggfPGGPAQLAGPPQPQKKLDPDSI 277
Cdd:PRK07764 459 AAAPSAQPAPAPAAAPEPTAAPAPAPPAAPAP-----AAAPAAPAAPAAPAGADDAATL 512
|
|
| PRK12323 |
PRK12323 |
DNA polymerase III subunit gamma/tau; |
136-306 |
1.21e-05 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237057 [Multi-domain] Cd Length: 700 Bit Score: 49.49 E-value: 1.21e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 136 SYGSGMAHAGSGMAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLLPPMYRPDGLSGPPPPN 215
Cdd:PRK12323 366 GQSGGGAGPATAAAAPVAQPAPAAAAPAAAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARGPG 445
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 216 SQYQPPPLPGQTLGAGYPPQQAANSGPQMAGAQLSYPGGFPGGPAQLAGPPQPQKKLdPDSIPSPiQVIENDRASRGGQV 295
Cdd:PRK12323 446 GAPAPAPAPAAAPAAAARPAAAGPRPVAAAAAAAPARAAPAAAPAPADDDPPPWEEL-PPEFASP-APAQPDAAPAGWVA 523
|
170
....*....|.
gi 1622941236 296 YATNTRGQIPP 306
Cdd:PRK12323 524 ESIPDPATADP 534
|
|
| PRK07764 |
PRK07764 |
DNA polymerase III subunits gamma and tau; Validated |
142-306 |
3.73e-05 |
|
DNA polymerase III subunits gamma and tau; Validated
Pssm-ID: 236090 [Multi-domain] Cd Length: 824 Bit Score: 47.67 E-value: 3.73e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 142 AHAGSGMAPPSQGPPGPLSATSLQTPPRPPQPSilQPGSqvlPPPPTTLNGPGASPLLPPMYRPDGLSGPPPPNSQYQPP 221
Cdd:PRK07764 587 VVGPAPGAAGGEGPPAPASSGPPEEAARPAAPA--APAA---PAAPAPAGAAAAPAEASAAPAPGVAAPEHHPKHVAVPD 661
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 222 PlPGQTLGAGYPPQQAANSGPQMAGAqlsyPGGFPGGPAQLAGPPQPQKKLDPDSIPSPIQVIENDRASRGGQVYATNTR 301
Cdd:PRK07764 662 A-SDGGDGWPAKAGGAAPAAPPPAPA----PAAPAAPAGAAPAQPAPAPAATPPAGQADDPAAQPPQAAQGASAPSPAAD 736
|
....*
gi 1622941236 302 GQIPP 306
Cdd:PRK07764 737 DPVPL 741
|
|
| Med15 |
pfam09606 |
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ... |
3-243 |
4.08e-05 |
|
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.
Pssm-ID: 312941 [Multi-domain] Cd Length: 732 Bit Score: 47.70 E-value: 4.08e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 3 QQGYVATPPYSQPQPGIGLSPPHYGHYGDPShtASPTGMmkPAGPLGVTATG------GMLPPGPPPPGPPPPGPHQFGQ 76
Cdd:pfam09606 243 MQQQQPQQQGQQSQLGMGINQMQQMPQGVGG--GAGQGG--PGQPMGPPGQQpgampnVMSIGDQNNYQQQQTRQQQQQQ 318
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 77 NGAHAAGHPQQrfpgpppvnnvassyaPYQPSAQSSYLSPMStssvtQLGSQLSAMQINSYGSGMAHAGSG----MAPPS 152
Cdd:pfam09606 319 GGNHPAAHQQQ----------------MNQSVGQGGQVVALG-----GLNHLETWNPGNFGGLGANPMQRGqpgmMSSPS 377
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 153 QGPPGPL-SATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLLPPMYRPDGLSGPPPPNSQYQPPplPGQTLGAg 231
Cdd:pfam09606 378 PVPGQQVrQVTPNQFMRQSPQPSVPSPQGPGSQPPQSHPGGMIPSPALIPSPSPQMSQQPAQQRTIGQDS--PGGSLNT- 454
|
250
....*....|....
gi 1622941236 232 yPPQQAANS--GPQ 243
Cdd:pfam09606 455 -PGQSAVNSplNPQ 467
|
|
| Med15 |
pfam09606 |
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ... |
140-269 |
6.29e-05 |
|
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.
Pssm-ID: 312941 [Multi-domain] Cd Length: 732 Bit Score: 46.93 E-value: 6.29e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 140 GMAHAGSGMAPPSQGPPGPLSAtsLQTPP----RPPQPSILQPGSQvlPPPPTTLNGPG-ASPLLPPMYRPD---GLSGP 211
Cdd:pfam09606 64 QGGQGNGGMGGGQQGMPDPINA--LQNLAgqgtRPQMMGPMGPGPG--GPMGQQMGGPGtASNLLASLGRPQmpmGGAGF 139
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 212 PPPNSQYQPPPLPGQTLGAGYPPQQAANSGP--QMAGAQLSYPGGFPGGPAQLAGPPQPQ 269
Cdd:pfam09606 140 PSQMSRVGRMQPGGQAGGMMQPSSGQPGSGTpnQMGPNGGPGQGQAGGMNGGQQGPMGGQ 199
|
|
| PAT1 |
pfam09770 |
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ... |
79-239 |
8.56e-05 |
|
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.
Pssm-ID: 401645 [Multi-domain] Cd Length: 846 Bit Score: 46.57 E-value: 8.56e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 79 AHAAGHPQQRFPGPPPVNNVASSYAPYQPSAQSSYLSPMSTSSVTQLGSQLS---AMQInsygsgMAHAGSGMAPPSQGP 155
Cdd:pfam09770 207 AKKPAQQPAPAPAQPPAAPPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGqghPVTI------LQRPQSPQPDPAQPS 280
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 156 PGPLSATSLQTPPRPPQpsilQPgSQVLPPPpttlNGPGASPLLPPMYRPDGlSGPPPPNSQYQPPPLPGQTLGAGYPPQ 235
Cdd:pfam09770 281 IQPQAQQFHQQPPPVPV----QP-TQILQNP----NRLSAARVGYPQNPQPG-VQPAPAHQAHRQQGSFGRQAPIITHPQ 350
|
....
gi 1622941236 236 QAAN 239
Cdd:pfam09770 351 QLAQ 354
|
|
| DUF3824 |
pfam12868 |
Domain of unknown function (DUF3824); This is a repeating domain found in fungal proteins. It ... |
193-268 |
9.91e-05 |
|
Domain of unknown function (DUF3824); This is a repeating domain found in fungal proteins. It is proline-rich, and the function is not known.
Pssm-ID: 372351 [Multi-domain] Cd Length: 145 Bit Score: 43.58 E-value: 9.91e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 193 PGASPLLPPM----YRPDGLSGPPPPNSQYQPPPLPgqTLGAGYPPQQAANSGPQMAGAQLSYPGGFPGGPAQLAGPPQP 268
Cdd:pfam12868 61 YPPSPAGPYAsqgqYYPETNYFPPPPGSTPQPPVDP--QPNAPPPPYNPADYPPPPGAAPPPQPYQYPPPPGPDPYAPRP 138
|
|
| PHA02682 |
PHA02682 |
ORF080 virion core protein; Provisional |
103-276 |
1.51e-04 |
|
ORF080 virion core protein; Provisional
Pssm-ID: 177464 [Multi-domain] Cd Length: 280 Bit Score: 44.85 E-value: 1.51e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 103 APYQPSAQSSYLSPMSTSSVTQLG----SQLSA----MQINSYGSGMAHAGSGMAPPSQGP-PGPLSATSLQTPPrPPQP 173
Cdd:PHA02682 36 APAAPCPPDADVDPLDKYSVKEAGryyqSRLKAnsacMQRPSGQSPLAPSPACAAPAPACPaCAPAAPAPAVTCP-APAP 114
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 174 SILQPGSQVLPPPPTTLNGPGASPLLPPMYRpDGLSGPPPPNSQYQPPPLPGQTLGAGYPPQQAANSGPQMAGAQLSYPg 253
Cdd:PHA02682 115 ACPPATAPTCPPPAVCPAPARPAPACPPSTR-QCPPAPPLPTPKPAPAAKPIFLHNQLPPPDYPAASCPTIETAPAASP- 192
|
170 180
....*....|....*....|...
gi 1622941236 254 gfpggpaqLAGPPQPQKKLDPDS 276
Cdd:PHA02682 193 --------VLEPRIPDKIIDADN 207
|
|
| SP6_N |
cd22544 |
N-terminal domain of transcription factor Specificity Protein (SP) 6; Specificity Proteins ... |
117-266 |
1.58e-04 |
|
N-terminal domain of transcription factor Specificity Protein (SP) 6; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP6, also known as epiprofin, shows a specific expression pattern in hair follicles and the apical ectodermal ridge (AER) of the developing limbs. SP6 null mice are nude and show defects in skin, teeth, limbs (syndactyly and oligodactyly), and lung alveoli. SP6 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. This model represents the N-terminal domain of SP6.
Pssm-ID: 411693 [Multi-domain] Cd Length: 245 Bit Score: 44.53 E-value: 1.58e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 117 MSTSSVTQLGSQLSAMQINSYGSGMAHAGSGMAPPSQGPPG----PLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNG 192
Cdd:cd22544 1 MLTAVCGSLGNQHSETPRASPPTLDLQPLQPYQIHSSPEAGdypsPLQPTELQSLPLGPGVDFSARESYEPHSSRRTCLD 80
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 193 PGasPLLPPMYRPDGLSGPPPPNSQYQP---PPLPGQTLGAG----------------YPPQQAANSGPQMAGAQLSYPG 253
Cdd:cd22544 81 LE--SDLPLGPFPKLLHPPPDMAHPYESwfrPPHPGGSGEEGgvpswwdlhagsswmdLQHGQGGLQSPGPPGGLQPPLG 158
|
170
....*....|...
gi 1622941236 254 GFpGGPAQLAGPP 266
Cdd:cd22544 159 GY-GSEHQLCGPP 170
|
|
| PRK12323 |
PRK12323 |
DNA polymerase III subunit gamma/tau; |
81-264 |
1.66e-04 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237057 [Multi-domain] Cd Length: 700 Bit Score: 45.64 E-value: 1.66e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 81 AAGHPQQRFPGPPPVNNVASSYAPYQPSAQSSYLSPMSTSSVTQLGSQLSAMQINSYGSGMAHAGSGMAPPSQGPPGPLS 160
Cdd:PRK12323 394 AAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARGPGGAPAPAPAPAAAPAAAARPAAAGPRPVA 473
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 161 ATSLQtPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLLPPM---YRPDGLSGPPPPNSQYQPPPLPGQTLGAGYPPQQA 237
Cdd:PRK12323 474 AAAAA-APARAAPAAAPAPADDDPPPWEELPPEFASPAPAQPdaaPAGWVAESIPDPATADPDDAFETLAPAPAAAPAPR 552
|
170 180
....*....|....*....|....*..
gi 1622941236 238 ANSGPQMAGAQLSYPGGFPGGPAQLAG 264
Cdd:PRK12323 553 AAAATEPVVAPRPPRASASGLPDMFDG 579
|
|
| PHA03307 |
PHA03307 |
transcriptional regulator ICP4; Provisional |
81-268 |
1.93e-04 |
|
transcriptional regulator ICP4; Provisional
Pssm-ID: 223039 [Multi-domain] Cd Length: 1352 Bit Score: 45.55 E-value: 1.93e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 81 AAGHPQQRFPGPPPVNNVASSYAPYQPSAQSSYLSPMSTSSVTQLGSQLSAMQINSYGSGMAHAGSGMAPPSQGP----- 155
Cdd:PHA03307 271 EASGWNGPSSRPGPASSSSSPRERSPSPSPSSPGSGPAPSSPRASSSSSSSRESSSSSTSSSSESSRGAAVSPGPspsrs 350
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 156 PGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPLLPPMYRPdglSGPPPPNSQYQPPPLPgqtLGAGYPPQ 235
Cdd:PHA03307 351 PSPSRPPPPADPSSPRKRPRPSRAPSSPAASAGRPTRRRARAAVAGRARR---RDATGRFPAGRPRPSP---LDAGAASG 424
|
170 180 190
....*....|....*....|....*....|....*
gi 1622941236 236 QAANSGPqmagaqLSYPGG--FPGgpaqlAGPPQP 268
Cdd:PHA03307 425 AFYARYP------LLTPSGepWPG-----SPPPPP 448
|
|
| PRK13729 |
PRK13729 |
conjugal transfer pilus assembly protein TraB; Provisional |
122-242 |
3.13e-04 |
|
conjugal transfer pilus assembly protein TraB; Provisional
Pssm-ID: 184281 [Multi-domain] Cd Length: 475 Bit Score: 44.43 E-value: 3.13e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 122 VTQLGSQLSAM--QINSYGSGMAHAGSGMAP-PSQGPPGPLSatslqTPPRPPQPSILQPGSQVLPPPPTtlngpgasPL 198
Cdd:PRK13729 106 IEKLGQDNAALaeQVKALGANPVTATGEPVPqMPASPPGPEG-----EPQPGNTPVSFPPQGSVAVPPPT--------AF 172
|
90 100 110 120
....*....|....*....|....*....|....*....|....*.
gi 1622941236 199 LPpmyrpdGLSGPPPPNSQYQPPPLPG--QTLGAGYPPQQAANSGP 242
Cdd:PRK13729 173 YP------GNGVTPPPQVTYQSVPVPNriQRKTFTYNEGKKGPSLP 212
|
|
| PRK10263 |
PRK10263 |
DNA translocase FtsK; Provisional |
119-281 |
3.20e-04 |
|
DNA translocase FtsK; Provisional
Pssm-ID: 236669 [Multi-domain] Cd Length: 1355 Bit Score: 45.08 E-value: 3.20e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 119 TSSVTQLGSQLSAMQINSYgSGMAHAGSGMAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGASPL 198
Cdd:PRK10263 693 AAAEAELARQFAQTQQQRY-SGEQPAGANPFSLDDFEFSPMKALLDDGPHEPLFTPIVEPVQQPQQPVAPQQQYQQPQQP 771
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 199 LPPMYRPDGLSGPPPPNSQYQPPPLPGQTLGAGYPPQQaansgPQMAGAQLSYPGGFPGGPAQLAGPPQPQKKLDPDSIP 278
Cdd:PRK10263 772 VAPQPQYQQPQQPVAPQPQYQQPQQPVAPQPQYQQPQQ-----PVAPQPQYQQPQQPVAPQPQYQQPQQPVAPQPQDTLL 846
|
...
gi 1622941236 279 SPI 281
Cdd:PRK10263 847 HPL 849
|
|
| Pro-rich |
pfam15240 |
Proline-rich protein; This family includes several eukaryotic proline-rich proteins. |
172-274 |
3.62e-04 |
|
Proline-rich protein; This family includes several eukaryotic proline-rich proteins.
Pssm-ID: 464580 [Multi-domain] Cd Length: 167 Bit Score: 42.33 E-value: 3.62e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 172 QPSILQPGSQVLPPPPTTLNGPGASPLLPPMYRPDGLSGPPPPNSQYQPPPLPGQTLGAGYPPQQAANSGPqmaGAQLSY 251
Cdd:pfam15240 28 SPSLISEEEGQSQQGGQGPQGPPPGGFPPQPPASDDPPGPPPPGGPQQPPPQGGKQKPQGPPPQGGPRPPP---GKPQGP 104
|
90 100
....*....|....*....|...
gi 1622941236 252 PggfPGGPAQLAGPPQPQKKLDP 274
Cdd:pfam15240 105 P---PQGGNQQQGPPPPGKPQGP 124
|
|
| BimA_second |
NF040983 |
trimeric autotransporter actin-nucleating factor BimA; This HMM describes BimA (Burkholderia ... |
165-266 |
4.14e-04 |
|
trimeric autotransporter actin-nucleating factor BimA; This HMM describes BimA (Burkholderia intracellular motility A), WP_004266405.1-like proteins in Burkholderia mallei or B. pseudomallei. The term BimA has also been used for WP_011205626.1-like homologs that have a very different N-terminal half.
Pssm-ID: 468913 [Multi-domain] Cd Length: 382 Bit Score: 44.12 E-value: 4.14e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 165 QTPPRPPQPsilqpgsqvlPPPPTTLNGPGASPLLPPMYRPDGLSGPPPPNSQYQPPPlPGQTLGAGYPPQQAANSgpqm 244
Cdd:NF040983 88 KVPPPPPPP----------PPPPPPPPTPPPPPPPPPPPPPPSPPPPPPPSPPPSPPP-PTTTPPTRTTPSTTTPT---- 152
|
90 100
....*....|....*....|..
gi 1622941236 245 agaqlsyPGGFPGGPAQLAGPP 266
Cdd:NF040983 153 -------PSMHPIQPTQLPSIP 167
|
|
| PHA03321 |
PHA03321 |
tegument protein VP11/12; Provisional |
81-298 |
4.43e-04 |
|
tegument protein VP11/12; Provisional
Pssm-ID: 223041 [Multi-domain] Cd Length: 694 Bit Score: 44.18 E-value: 4.43e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 81 AAGHPQQRFpgpppvnnvASSYAPYQPSAQSSYLSPMSTSSVTQLGSQLsAMQINSYGSGMAHAGSGMAPPSQ-GPPGPL 159
Cdd:PHA03321 365 AAVERQERF---------CRTTAPLFPTMTASSWARMERSIKAWFEAAL-ATELFRTGVPSEHYEASLRLLSSrQPPGAP 434
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 160 SATSLQTPPRPPQPsilQPGSqvlPPppttlngpgASPLLPPMYRPDGlSGPPPPNSQYQPPPLPgqtLGAGYPPQQAAN 239
Cdd:PHA03321 435 APRRDNDPPPPPRA---RPGS---TP---------ACARRARAQRARD-AGPEYVDPLGALRRLP---AGAAPPPEPAAA 495
|
170 180 190 200 210 220
....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1622941236 240 SGP-----QMAGAQLSYPGGFPGGPaQLAGPPQPQKKLDPDSIPSPIQVIENDRASR--GGQVYAT 298
Cdd:PHA03321 496 PSPatyytRMGGGPPRLPPRNRATE-TLRPDWGPPAAAPPEQMEDPYLEPDDDRFDRrdGAAAAAT 560
|
|
| Drf_FH1 |
pfam06346 |
Formin Homology Region 1; This region is found in some of the Diaphanous related formins (Drfs) ... |
155-224 |
6.18e-04 |
|
Formin Homology Region 1; This region is found in some of the Diaphanous related formins (Drfs). It consists of low complexity repeats of around 12 residues.
Pssm-ID: 461881 [Multi-domain] Cd Length: 157 Bit Score: 41.39 E-value: 6.18e-04
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 155 PPGPLSATSLQTPPRPPQPsilqpGSQVLPPPPTTLngPGASPLLPPMYRPDGLSGPPPPNSQYQPPPLP 224
Cdd:pfam06346 83 PPPPLPGGAGIPPPPPPLP-----GGAGVPPPPPPL--PGGPGIPPPPPFPGGPGIPPPPPGMGMPPPPP 145
|
|
| PHA03379 |
PHA03379 |
EBNA-3A; Provisional |
92-268 |
7.13e-04 |
|
EBNA-3A; Provisional
Pssm-ID: 223066 [Multi-domain] Cd Length: 935 Bit Score: 43.89 E-value: 7.13e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 92 PPPVNNVASSYA---PYQPSAQSSYLSPMSTSSVTQLGSQLSAMQINSYGSGMAHagsgmaPPSQGPP-GPLSATSLQTP 167
Cdd:PHA03379 610 PPQLTQVSPQQPmeyPLEPEQQMFPGSPFSQVADVMRAGGVPAMQPQYFDLPLQQ------PISQGAPlAPLRASMGPVP 683
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 168 PRPP-QPSILQ-PGSQVLPPPPTTLNG----PGASPLLPP--MYRPDGLSGPPPP---NSQY------QP---------- 220
Cdd:PHA03379 684 PVPAtQPQYFDiPLTEPINQGASAAHFlpqqPMEGPLVPErwMFQGATLSQSVRPgvaQSQYfdlpltQPinhgapaahf 763
|
170 180 190 200 210
....*....|....*....|....*....|....*....|....*....|....*.
gi 1622941236 221 ---PPLPG-----QTLGAGYPPQQAANSGPQMAGAqLSYPggfpggPAQLAGPPQP 268
Cdd:PHA03379 764 lhqPPMEGpwvpeQWMFQGAPPSQGTDVVQHQLDA-LGYV------LHVLNHPGVP 812
|
|
| PRK14971 |
PRK14971 |
DNA polymerase III subunit gamma/tau; |
146-238 |
7.45e-04 |
|
DNA polymerase III subunit gamma/tau;
Pssm-ID: 237874 [Multi-domain] Cd Length: 614 Bit Score: 43.61 E-value: 7.45e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 146 SGMAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLP-PPPTTLNGPGASPLLPPMYRPDglSGPPPPNSQYQPPPLP 224
Cdd:PRK14971 384 TQPAAAPQPSAAAAASPSPSQSSAAAQPSAPQSATQPAGtPPTVSVDPPAAVPVNPPSTAPQ--AVRPAQFKEEKKIPVS 461
|
90 100
....*....|....*....|
gi 1622941236 225 GQ------TLGAGYPPQQAA 238
Cdd:PRK14971 462 KVsslgpsTLRPIQEKAEQA 481
|
|
| Med15 |
pfam09606 |
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ... |
160-306 |
7.56e-04 |
|
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.
Pssm-ID: 312941 [Multi-domain] Cd Length: 732 Bit Score: 43.46 E-value: 7.56e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 160 SATSLQTPPRPPQPSILQPGSQVLPPPPTTL-NGPGASPlLPPMYRPDGLSGPPPPNSQYQPPPLPGQTLGAGYPPQQAA 238
Cdd:pfam09606 56 KAAQQQQPQGGQGNGGMGGGQQGMPDPINALqNLAGQGT-RPQMMGPMGPGPGGPMGQQMGGPGTASNLLASLGRPQMPM 134
|
90 100 110 120 130 140 150
....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1622941236 239 NSG----PQMAGAQLSYPGGFPGGPAQLAGPPQPQKKLDPDSIPSPIQViendrASRGGQVYATNTRGQIPP 306
Cdd:pfam09606 135 GGAgfpsQMSRVGRMQPGGQAGGMMQPSSGQPGSGTPNQMGPNGGPGQG-----QAGGMNGGQQGPMGGQMP 201
|
|
| PBP1 |
COG5180 |
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification]; ... |
139-288 |
9.83e-04 |
|
PAB1-binding protein, interacts with poly(A)-binding protein [RNA processing and modification];
Pssm-ID: 444064 [Multi-domain] Cd Length: 548 Bit Score: 43.13 E-value: 9.83e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 139 SGMAHAGSGMAPPSQGPPGPLSATSLQTPPRPPQ------------------PSILQPGSQVLP--PPPTTLNGPGASPL 198
Cdd:COG5180 300 RPIDVKGVASAPPATRPVRPPGGARDPGTPRPGQpterpagvpeaasdagqpPSAYPPAEEAVPgkPLEQGAPRPGSSGG 379
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 199 LPPMYRPDglSGPPPPNSQYQPPPLPGQTLGAGYPP----QQAANSGPQMAGAQLSYPGGFPGGPAQLAGPPQPQKKLDP 274
Cdd:COG5180 380 DGAPFQPP--NGAPQPGLGRRGAPGPPMGAGDLVQAaldgGGRETASLGGAAGGAGQGPKADFVPGDAESVSGPAGLADQ 457
|
170
....*....|....
gi 1622941236 275 DSIPSPIQVIENDR 288
Cdd:COG5180 458 AGAAASTAMADFVA 471
|
|
| DUF3729 |
pfam12526 |
Protein of unknown function (DUF3729); This family of proteins is found in viruses. Proteins ... |
148-226 |
1.00e-03 |
|
Protein of unknown function (DUF3729); This family of proteins is found in viruses. Proteins in this family are typically between 145 and 1707 amino acids in length. The family is found in association with pfam01443, pfam01661, pfam05417, pfam01660, pfam00978. There is a single completely conserved residue L that may be functionally important.
Pssm-ID: 372164 [Multi-domain] Cd Length: 115 Bit Score: 40.06 E-value: 1.00e-03
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1622941236 148 MAPPSQGPPGPLSATSLQTPPRPPQPSilqPGSQVLPPPpttlngpgasPLLPPMYRPDGLSGPPPPNSQYQPPPLPGQ 226
Cdd:pfam12526 39 PPPPVGDPRPPVVDTPPPVSAVWVLPP---PSEPAAPEP----------DLVPPVTGPAGPPSPLAPPAPAQKPPLPPP 104
|
|
| PHA03307 |
PHA03307 |
transcriptional regulator ICP4; Provisional |
144-280 |
1.02e-03 |
|
transcriptional regulator ICP4; Provisional
Pssm-ID: 223039 [Multi-domain] Cd Length: 1352 Bit Score: 43.24 E-value: 1.02e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 144 AGSGMAPPSQGPPGPlsatslqtpprPPQPSILQPGSQVLPPPPTTLNGPGASPLLPPMYRPDGLSGPPPPNSQYQPPPL 223
Cdd:PHA03307 57 AGAAACDRFEPPTGP-----------PPGPGTEAPANESRSTPTWSLSTLAPASPAREGSPTPPGPSSPDPPPPTPPPAS 125
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|....*..
gi 1622941236 224 PGQTLGAGYPPQQAANSGPQMAGAQLSYPGGFPGGPAQlAGPPQPQKKLDPDSIPSP 280
Cdd:PHA03307 126 PPPSPAPDLSEMLRPVGSPGPPPAASPPAAGASPAAVA-SDAASSRQAALPLSSPEE 181
|
|
| PABP-1234 |
TIGR01628 |
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ... |
165-272 |
1.02e-03 |
|
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens, which are expressed in testis, expressed in platelets, broadly expressed, and of unknown tissue range.
Pssm-ID: 130689 [Multi-domain] Cd Length: 562 Bit Score: 42.87 E-value: 1.02e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 165 QTPPRPPQPSILQP--GSQVLPP----------PPTTLNGPGASPLLPPMY--RPDGLSGPPPPNSQYQPP----PLPGQ 226
Cdd:TIGR01628 377 QLQPRMRQLPMGSPmgGAMGQPPyygqgpqqqfNGQPLGWPRMSMMPTPMGpgGPLRPNGLAPMNAVRAPSrnaqNAAQK 456
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|..
gi 1622941236 227 TLGAGYPPQQAANSGPQMAGAQLSYPGGFPGGP-----AQLAG-PPQPQKKL 272
Cdd:TIGR01628 457 PPMQPVMYPPNYQSLPLSQDLPQPQSTASQGGQnkklaQVLASaTPQMQKQV 508
|
|
| Drf_FH1 |
pfam06346 |
Formin Homology Region 1; This region is found in some of the Diaphanous related formins (Drfs) ... |
151-234 |
1.32e-03 |
|
Formin Homology Region 1; This region is found in some of the Diaphanous related formins (Drfs). It consists of low complexity repeats of around 12 residues.
Pssm-ID: 461881 [Multi-domain] Cd Length: 157 Bit Score: 40.62 E-value: 1.32e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 151 PSQGPPGPLSAtSLQTPPRPPQPSilqpGSQVLPPPPTTLNGPGASPLLPPMYRPDGLSGPPP-PNSQYQPPPLPGQtlg 229
Cdd:pfam06346 67 ASIPPPPPLPG-STGIPPPPPLPG----GAGIPPPPPPLPGGAGVPPPPPPLPGGPGIPPPPPfPGGPGIPPPPPGM--- 138
|
....*
gi 1622941236 230 aGYPP 234
Cdd:pfam06346 139 -GMPP 142
|
|
| PHA03369 |
PHA03369 |
capsid maturational protease; Provisional |
125-252 |
1.58e-03 |
|
capsid maturational protease; Provisional
Pssm-ID: 223061 [Multi-domain] Cd Length: 663 Bit Score: 42.29 E-value: 1.58e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 125 LGSQLSAMQINSYGSGMAHAGSGMAPPSQGPPGPLSATSLQTPPRPPqpsilqPGSQVLPPPPTTLNGPGASPLLPPMYR 204
Cdd:PHA03369 341 LKAHNEILKTASLTAPSRVLAAAAKVAVIAAPQTHTGPADRQRPQRP------DGIPYSVPARSPMTAYPPVPQFCGDPG 414
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|...
gi 1622941236 205 PDGLSGPPPPNSQYQPPPLpgqtlgAGYPPQ-----QAANSGPQMAGAQLSYP 252
Cdd:PHA03369 415 LVSPYNPQSPGTSYGPEPV------GPVPPQptnpyVMPISMANMVYPGHPQE 461
|
|
| SAV_2336_NTERM |
NF041121 |
SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 ... |
142-226 |
2.11e-03 |
|
SAV_2336 family N-terminal domain; This HMM describes an N-terminal domain shared by SAV_2336 (BAC70047.1) whose C-terminal region suggests restriction enzyme activity (PMID: 18456708), and with other proteins with unrelated C-terminal regions. A member protein was also identified in a kanamycin biosynthetic gene cluster (PMID:16766657), while N-terminal regions of two other member proteins were named Trypco1 in a bioinformatic study (PMID:32101166) of predicted bacterial conflict systems.
Pssm-ID: 469044 [Multi-domain] Cd Length: 473 Bit Score: 41.91 E-value: 2.11e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 142 AHAGSGMAPPSQGPPGPLSATSLQTPPRPPQPSiLQPGSQVLPPPPTTLNGPGASP--LLPPMYRPDGLSGPPPPNSQ-- 217
Cdd:NF041121 13 AQMGRAAAPPSPEGPAPTAASQPATPPPPAAPP-SPPGDPPEPPAPEPAPLPAPYPgsLAPPPPPPPGPAGAAPGAALpv 91
|
90
....*....|.
gi 1622941236 218 --YQPPPLPGQ 226
Cdd:NF041121 92 rvPAPPALPNP 102
|
|
| dnaA |
PRK14086 |
chromosomal replication initiator protein DnaA; |
150-278 |
2.53e-03 |
|
chromosomal replication initiator protein DnaA;
Pssm-ID: 237605 [Multi-domain] Cd Length: 617 Bit Score: 41.73 E-value: 2.53e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 150 PPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGPGAsPLLPPMY-----RPDGLSGPPPPNS---QYQ-- 219
Cdd:PRK14086 94 EPAPPPPHARRTSEPELPRPGRRPYEGYGGPRADDRPPGLPRQDQL-PTARPAYpayqqRPEPGAWPRAADDygwQQQrl 172
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1622941236 220 --PPPLPGQTLGAGYPPQQAANSGPQMAGAQLSYPGGFPGGPAQLAGPPQPQKKLDPDSIP 278
Cdd:PRK14086 173 gfPPRAPYASPASYAPEQERDREPYDAGRPEYDQRRRDYDHPRPDWDRPRRDRTDRPEPPP 233
|
|
| SSDP |
pfam04503 |
Single-stranded DNA binding protein, SSDP; This is a family of eukaryotic single-stranded DNA ... |
85-268 |
2.59e-03 |
|
Single-stranded DNA binding protein, SSDP; This is a family of eukaryotic single-stranded DNA binding proteins with specificity to a pyrimidine-rich element found in the promoter region of the alpha2(I) collagen gene.
Pssm-ID: 461334 [Multi-domain] Cd Length: 293 Bit Score: 41.10 E-value: 2.59e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 85 PQQRFPGPPPVNNVASSYAPYQPSAQSSylSPMSTSSVTQLGSQLSA-------MQINSYGSGMAHAGSGMAP----PSQ 153
Cdd:pfam04503 40 PPGFFQSPPSHPSSQPSPHAQPPPHNPA--TMMGPHSQPFMGPRYPGgprpsvrMPQQGNDFNGPPGQQPMMPnsmdPTR 117
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 154 GPPGPLSATSLQ--TPPRPPQPSILQPGS---QVLPPPPTTLNGPGAsplLPPMYRPDGLSGPPP-PNSQ---------- 217
Cdd:pfam04503 118 PGGHPNMGGPMQrmNPPRGPGMGPMGPQSygpGMRGPPPNSTDGPGG---MPPMNMGPGGRRPWPqPNASnplpyssssp 194
|
170 180 190 200 210 220
....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1622941236 218 --YQPPPLPGQTLGAG---YPPQQAANSGPQM-----AGAQLSYPGGFPGGPAqLAGPPQP 268
Cdd:pfam04503 195 gsYGGPPGGGGPPGPTpimPSPQDSTNSGENMytlmnPVGPGGNRANFPMGPG-LEGPMGP 254
|
|
| Med15 |
pfam09606 |
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ... |
27-274 |
2.64e-03 |
|
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators uses the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.
Pssm-ID: 312941 [Multi-domain] Cd Length: 732 Bit Score: 41.92 E-value: 2.64e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 27 GHYGDPSHTASPTGMMKPAGPLGVTATGG----MLPPGPPPPGPPPpgphqfgqngahaaghPQQRFPGPPPVNNVASSY 102
Cdd:pfam09606 65 GGQGNGGMGGGQQGMPDPINALQNLAGQGtrpqMMGPMGPGPGGPM----------------GQQMGGPGTASNLLASLG 128
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 103 APYQPSAQSSYLSPMSTSSVTQLGSQLSAMQINSYGSGMAHAGSGMApPSQGPPGPLSATSLQTPPRPPQpsILQPGSQV 182
Cdd:pfam09606 129 RPQMPMGGAGFPSQMSRVGRMQPGGQAGGMMQPSSGQPGSGTPNQMG-PNGGPGQGQAGGMNGGQQGPMG--GQMPPQMG 205
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 183 LPPPPTTLNGPGAspllppMYRPDGLSGPPPPNSQYQPPPlpgQTLGAGYPPQQAANSGP-QMAGAQL-SYPGGFP---- 256
Cdd:pfam09606 206 VPGMPGPADAGAQ------MGQQAQANGGMNPQQMGGAPN---QVAMQQQQPQQQGQQSQlGMGINQMqQMPQGVGggag 276
|
250
....*....|....*....
gi 1622941236 257 -GGPAQLAGPPQPQKKLDP 274
Cdd:pfam09606 277 qGGPGQPMGPPGQQPGAMP 295
|
|
| Gag_spuma |
pfam03276 |
Spumavirus gag protein; |
146-284 |
3.02e-03 |
|
Spumavirus gag protein;
Pssm-ID: 460872 [Multi-domain] Cd Length: 614 Bit Score: 41.66 E-value: 3.02e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 146 SGMAPPSQGPPGPLSATSlqtppRPPQPSILQPGSQVLPPPPTTLNGPGASPLLPPMYRPDGLSGPPPPNSQYQPPPLPG 225
Cdd:pfam03276 176 AEISPGAQGGIPPGASFS-----GLPSLPAIGGIHLPAIPGIHARAPPGNIARSLGDDIMPSLGDAGMPQPRFAFHPGNP 250
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1622941236 226 QTLGAGYPPQQAANSGP--QMAGAQLSYPGGFPGGPAQLAGPPQPQkkldPDSIPSPIQVI 284
Cdd:pfam03276 251 FAEAEGHPFAEAEGERPrdIPRAPRIDAPSAPAIPAIQPIAPPMIP----PIGAPIPIPHG 307
|
|
| FAP |
pfam07174 |
Fibronectin-attachment protein (FAP); This family contains bacterial fibronectin-attachment ... |
149-240 |
4.04e-03 |
|
Fibronectin-attachment protein (FAP); This family contains bacterial fibronectin-attachment proteins (FAP). Family members are rich in alanine and proline, are approximately 300 residues long, and seem to be restricted to mycobacteria. These proteins contain a fibronectin-binding motif that allows mycobacteria to bind to fibronectin in the extracellular matrix.
Pssm-ID: 429334 Cd Length: 301 Bit Score: 40.68 E-value: 4.04e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 149 APPSQGPPGPLSATSLQTPPRPPQPsilqPGSQVLPPPPTTLNGPGaspllppmyrpdglsgPPPPNSQYQPPPLPGQTL 228
Cdd:pfam07174 39 ADPEPAPPPPSTATAPPAPPPPPPA----PAAPAPPPPPAAPNAPN----------------APPPPADPNAPPPPPADP 98
|
90
....*....|..
gi 1622941236 229 GAGYPPQQAANS 240
Cdd:pfam07174 99 NAPPPPAVDPNA 110
|
|
| DUF2076 |
pfam09849 |
Uncharacterized protein conserved in bacteria (DUF2076); This domain, found in various ... |
191-263 |
4.36e-03 |
|
Uncharacterized protein conserved in bacteria (DUF2076); This domain, found in various hypothetical prokaryotic proteins, has no known function. The domain, however, is found in various periplasmic ligand-binding sensor proteins.
Pssm-ID: 430876 Cd Length: 263 Bit Score: 40.11 E-value: 4.36e-03
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1622941236 191 NGPGASPLLPPMYRPDGLSGPPPPNSQYQPPPlPGQtlGAGYPPQQAANSGPQMAGAQLSYPGGFPGGPAQLA 263
Cdd:pfam09849 97 GGSQSRPPPPPQARPAWPAGQAPGQPQPYPGQ-PGY--AQQGQPQYGQPAQPPRGPWGPGGGGGFLGGALQTA 166
|
|
| GGN |
pfam15685 |
Gametogenetin; GGN is a family of proteins largely found in mammals. It reacts with POG in the ... |
92-267 |
5.24e-03 |
|
Gametogenetin; GGN is a family of proteins largely found in mammals. It reacts with POG in the maturation of sperm and is expressed virtually only in the testis. It is found to be associated with the intracellular membrane, binds with GGNBP1 and may be involved in vesicular trafficking.
Pssm-ID: 434857 [Multi-domain] Cd Length: 668 Bit Score: 40.91 E-value: 5.24e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 92 PPP--VNNVASSYAPYQP-----SAQSSYLSPMSTSsVTQLGSQLSAmqinsygsgmAHAGSGmAPPSQGPPGPLSATSL 164
Cdd:pfam15685 86 PPPeeAAAAAVSTAPPPAvgsllPAPSKWRKPTGTA-VARIRGLLEA----------SHRGQG-DPLSLRPLLPLLPRQL 153
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 165 -QTPPRPPQPSilqpgsqvlPPPPTtlngpgasPLLPpmYRPdglsgPPPPNSQYQPPplpgqtlGAGYPPQQAAN-SGP 242
Cdd:pfam15685 154 iEKDPAPGAPA---------PPPPT--------PLEP--RKP-----PPLPPSDRQPP-------NRGITPALATSaTSP 202
|
170 180
....*....|....*....|....*
gi 1622941236 243 QMAGAQLSYPGGFPGGpAQLAGPPQ 267
Cdd:pfam15685 203 TDSQAKHIAEGKTAGG-ACGGAPPQ 226
|
|
| PHA03132 |
PHA03132 |
thymidine kinase; Provisional |
149-272 |
5.82e-03 |
|
thymidine kinase; Provisional
Pssm-ID: 222997 [Multi-domain] Cd Length: 580 Bit Score: 40.52 E-value: 5.82e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 149 APPSQGPPGPLSATSLQTPPRPPQPS-------ILQPGSQVLPPPPTTLNGPGASPLLPPMYRPDGLSgpppPNSQYQPP 221
Cdd:PHA03132 56 PPRETGSGGGVATSTIYTVPRPPRGPeqtldkpDSLPASRELPPGPTPVPPGGFRGASSPRLGADSTS----PRFLYQVN 131
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|....*..
gi 1622941236 222 PLPGQT-LGAGYPPQQAAN-----SGPQMAGAQLSYPGGfPGGPAQLAGPPQPQKKL 272
Cdd:PHA03132 132 FPVILApIGESNSSSEELSeeeehSRPPPSESLKVKNGG-KVYPKGFSKHKTHKRSE 187
|
|
| SPT5 |
COG5164 |
Transcription elongation factor SPT5 [Transcription]; |
66-270 |
6.52e-03 |
|
Transcription elongation factor SPT5 [Transcription];
Pssm-ID: 444063 [Multi-domain] Cd Length: 495 Bit Score: 40.40 E-value: 6.52e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 66 PPPPGPHQFGQNGAHAAGHPQQRFPGPPPVNNvaSSYAPyqPSAQSSYLSPMSTSSVTQLGSQLSAMQINSYGS-GMAHA 144
Cdd:COG5164 11 PSDPGGVTTPAGSQGSTKPAQNQGSTRPAGNT--GGTRP--AQNQGSTTPAGNTGGTRPAGNQGATGPAQNQGGtTPAQN 86
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 145 GSGMAPP-SQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPTTLNGP----GASPLLPPMYRPDGLSGPPPPNSQYQ 219
Cdd:COG5164 87 QGGTRPAgNTGGTTPAGDGGATGPPDDGGATGPPDDGGSTTPPSGGSTTPpgdgGSTPPGPGSTGPGGSTTPPGDGGSTT 166
|
170 180 190 200 210 220
....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1622941236 220 PPPLPGQTL-----GAGYPPQQA--------ANSGPQMAGAQLSYPGGFPGGP----AQLAGPPQPQK 270
Cdd:COG5164 167 PPGPGGSTTppddgGSTTPPNKGetgtdiptGGTPRQGPDGPVKKDDKNGKGNppddRGGKTGPKDQR 234
|
|
| PRK11901 |
PRK11901 |
hypothetical protein; Reviewed |
115-290 |
6.53e-03 |
|
hypothetical protein; Reviewed
Pssm-ID: 237015 [Multi-domain] Cd Length: 327 Bit Score: 40.05 E-value: 6.53e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 115 SPMSTSSVTQLGSQLSAMQINSYGSGMAHAGSGMAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSqvlpPPPTTlngpg 194
Cdd:PRK11901 60 SPTEHESQQSSNNAGAEKNIDLSGSSSLSSGNQSSPSAANNTSDGHDASGVKNTAPPQDISAPPIS----PTPTQ----- 130
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 195 ASPllppmyrpdglsgPPPPNSQyQPPPLPGQTLGAGYPPQQAANSGPQMAGAQLSypgGFPGGPAQLAGPPQPQKKLDP 274
Cdd:PRK11901 131 AAP-------------PQTPNGQ-QRIELPGNISDALSQQQGQVNAASQNAQGNTS---TLPTAPATVAPSKGAKVPATA 193
|
170
....*....|....*.
gi 1622941236 275 DSIPSPIQVIENDRAS 290
Cdd:PRK11901 194 ETHPTPPQKPATKKPA 209
|
|
| COG3416 |
COG3416 |
Uncharacterized conserved protein, DUF2076 domain [Function unknown]; |
122-263 |
9.29e-03 |
|
Uncharacterized conserved protein, DUF2076 domain [Function unknown];
Pssm-ID: 442642 [Multi-domain] Cd Length: 237 Bit Score: 38.85 E-value: 9.29e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622941236 122 VTQLGSQLSAMQinsygsgmahagsgmAPPSQGPPGPLSATSLQTPPRPPQPSILQPGSQVLPPPPttlngpgaspllpp 201
Cdd:COG3416 64 IQELEAQLAQLQ---------------QQQPQSSGGFLSGLFGGGQRPPPAPQPSQPGPQQQPAPP-------------- 114
|
90 100 110 120 130 140
....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1622941236 202 myrpdglsgPPPPNSQYQPPPLPGQtlgagypPQQAAnsgPQMAGAQlsyPGGFPGGPAQLA 263
Cdd:COG3416 115 ---------SGPWGQAAPQQPGYGQ-------PQYGQ---PAAGPSG---GGGFLGGALQTA 154
|
|
|