Conserved domains on  [gi|1622844908|ref|XP_028685302|]

protein FAM186A [Macaca mulatta]



List of domain hits

Name Accession Description Interval E-value
PTZ00121 super family  cl31754  MAEBL; Provisional  386-999  2.24e-15


The actual alignment was detected with superfamily member PTZ00121:

Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 83.27  E-value: 2.24e-15
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  386 VDEVQRKETKDSGIKWESSISYIAQAE--RTPDLTELQQQPVASEDISEDSTKDNVSLKEGD-VYQEDEIDEYQSWKRKH 462
Cdd:PTZ00121  1232 AEEAKKDAEEAKKAEEERNNEEIRKFEeaRMAHFARRQAAIKAEEARKADELKKAEEKKKADeAKKAEEKKKADEAKKKA 1311
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  463 TKGTHVSETSgpnlsdnkggQRVSEAKlsqyYELQALKKKRKEMKSFPEDKsKSPTEAKRKHLFLTETKSQGGKSGTSMM 542
Cdd:PTZ00121  1312 EEAKKADEAK----------KKAEEAK----KKADAAKKKAEEAKKAAEAA-KAEAEAAADEAEAAEEKAEAAEKKKEEA 1376
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  543 LEQFRMVKRESPFDKRPTAAEFKVEPTIESLD--KEGEGEISSLVEPLNMIQFDDTAEPQKGKIKGKKhriSSGTTTSKE 620
Cdd:PTZ00121  1377 KKKADAAKKKAEEKKKADEAKKKAEEDKKKADelKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAK---KADEAKKKA 1453
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  621 ETTEEKEVLTKQVKSHRLVKSLSRVAKETSESTRVLESpdGESEQSNLEEFQKAIMAflKQKIDNTGKPFDKKtvpKEEA 700
Cdd:PTZ00121  1454 EEAKKAEEAKKKAEEAKKADEAKKKAEEAKKADEAKKK--AEEAKKKADEAKKAAEA--KKKADEAKKAEEAK---KADE 1526
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  701 LLKRTEAEKLGiiKAKMEEYFQKVAEtvtkiLRKYKEIKKEERVgeKPIKQKKVVSFMPGLHFQKSPISAKSESSTFLSH 780
Cdd:PTZ00121  1527 AKKAEEAKKAD--EAKKAEEKKKADE-----LKKAEELKKAEEK--KKAEEAKKAEEDKNMALRKAEEAKKAEEARIEEV 1597
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  781 ESTDPVINNLMQMILAEIESERdiptVSAVQKDHKETEKQRREQY-SQEGQEQMSGMSLKQQflEERNLLKERYEKISEN 859
Cdd:PTZ00121  1598 MKLYEEEKKMKAEEAKKAEEAK----IKAEELKKAEEEKKKVEQLkKKEAEEKKKAEELKKA--EEENKIKAAEEAKKAE 1671
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  860 WEEKKAQLQMKEGKQEqqkqkqwqkeemwKKEQKQTTPKQAEREEKQKQRGQEEEELSKSSLQRLEEGTRKMKAQgllle 939
Cdd:PTZ00121  1672 EDKKKAEEAKKAEEDE-------------KKAAEALKKEAEEAKKAEELKKKEAEEKKKAEELKKAEEENKIKAE----- 1733
                          570       580       590       600       610       620
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1622844908  940 kengQMRQIEKEVKHLGPNMRREKGKEK--QKPERGLEDLRRQIKTKEQMQMKETQPKELEK 999
Cdd:PTZ00121  1734 ----EAKKEAEEDKKKAEEAKKDEEEKKkiAHLKKEEEKKAEEIRKEKEAVIEEELDEEDEK 1791
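
Each hit on this page carries a one-line summary (Pssm-ID, Cd Length, Bit Score, E-value) and a query interval such as 386-999. The following is a minimal sketch of how those two pieces could be pulled out of the text; the regular expression is inferred only from the lines shown here, and the function names `parse_summary` and `parse_interval` are hypothetical helpers, not part of any NCBI tool.

```python
import re

# Field layout inferred from the summary lines in this dump (an assumption,
# not a published format specification).
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the fields of one summary line as a dict, or None if it does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "flag": m.group("flag"),  # e.g. "Multi-domain"; None when absent
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


def parse_interval(text):
    """Split a query interval such as '386-999' into (start, end, length)."""
    start, end = (int(part) for part in text.split("-"))
    return start, end, end - start + 1


if __name__ == "__main__":
    line = ("Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  "
            "Bit Score: 83.27  E-value: 2.24e-15")
    print(parse_summary(line))
    print(parse_interval("386-999"))
```

The interval length is computed inclusively, so 386-999 covers 614 query residues.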
PHA03379 super family  cl33730  EBNA-3A; Provisional  1215-1602  5.14e-11


The actual alignment was detected with superfamily member PHA03379:

Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 68.55  E-value: 5.14e-11
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1215 GIPLTPQQAQALGIPLTLQQAQELGIPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQALGIPLTPQQAQELGIPLTPQq 1294
Cdd:PHA03379   414 GTPRPPVEKPRPEVPQSLETATSHGSAQVPEPPPVHDLEPGPLHDQHSMAPCPVAQLPPGPLQDLEPGDQLPGVVQDGR- 492
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1295 aqaLGITLTPQQAQALGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQEL----GIPLTPQQAQEL 1370
Cdd:PHA03379   493 ---PACAPVPAPAGPIVRPWEASLSQVPGVAFAPVMPQPMPVEPVPVPTVALERPVCPAPPLIAmqgpGETSGIVRVRER 569
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1371 -----GIPLTPQQAQELGIPLTPQQAQELGIPLT-PQQAQELGIPLTPQQaQELGIPLTPQQAQELGIPLTPQQ--AQEL 1442
Cdd:PHA03379   570 wrpapWTPNPPRSPSQMSVRDRLARLRAEAQPYQaSVEVQPPQLTQVSPQ-QPMEYPLEPEQQMFPGSPFSQVAdvMRAG 648
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1443 GIPLTPQQAQELGVtltpQQAQELGIPLTPQQAQELGIPLTPQ-QAQELGIPLTPQQAQ-ALGIPLTPQQAQElgIPLTP 1520
Cdd:PHA03379   649 GVPAMQPQYFDLPL----QQPISQGAPLAPLRASMGPVPPVPAtQPQYFDIPLTEPINQgASAAHFLPQQPME--GPLVP 722
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1521 QQAQALGIPLS------LQQAQELGIPLT-PQQAQALGIPLTPQQAQELgiPLTPQQAQSQGIPLTPqqaqelGIPLTPQ 1593
Cdd:PHA03379   723 ERWMFQGATLSqsvrpgVAQSQYFDLPLTqPINHGAPAAHFLHQPPMEG--PWVPEQWMFQGAPPSQ------GTDVVQH 794

                   ....*....
gi 1622844908 1594 QAQALGIPL 1602
Cdd:PHA03379   795 QLDALGYVL 803
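
The alignment blocks pair a query row (`gi ...`) with a model row (`Cdd:...`); in this dump, lowercase residues appear opposite `-` characters, which suggests they mark unaligned positions. A rough sketch of counting aligned and identical columns in one block follows. The helpers `row_sequence` and `column_stats` are hypothetical names, and the case handling is an assumption drawn from the layout above rather than from a format specification.

```python
def row_sequence(line):
    """Extract the sequence field from one alignment row.

    Assumes the row ends with '<sequence> <end-coordinate>', as in the
    blocks above, so the sequence is the second-to-last token.
    """
    return line.split()[-2]


def column_stats(query_row, subject_row):
    """Count total, aligned, identical, and gapped columns for one block."""
    if len(query_row) != len(subject_row):
        raise ValueError("rows of one alignment block must have equal length")
    aligned = identical = gaps = 0
    for q, s in zip(query_row, subject_row):
        if q == "-" or s == "-":
            gaps += 1  # one side has no residue in this column
        else:
            aligned += 1
            if q.upper() == s.upper():
                identical += 1
    return {
        "columns": len(query_row),
        "aligned": aligned,
        "identical": identical,
        "gaps": gaps,
    }


if __name__ == "__main__":
    # Final block of the PHA03379 alignment shown above.
    q = row_sequence("gi 1622844908 1594 QAQALGIPL 1602")
    s = row_sequence("Cdd:PHA03379   795 QLDALGYVL 803")
    print(column_stats(q, s))
```

Summing these counts over all blocks of a hit would give a crude percent identity; it is not the score the page itself reports.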
PHA03247 super family  cl33720  large tegument protein UL36; Provisional  1397-1963  1.00e-10


The actual alignment was detected with superfamily member PHA03247:

Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 68.04  E-value: 1.00e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1397 PLTPQQAQELGIPlTPQQAQELGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELGVTLTPQQAQELGIPLTPQ-QA 1475
Cdd:PHA03247  2557 PAAPPAAPDRSVP-PPRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPSpAA 2635
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1476 QELGIPLTPQQAQelgiPLTPQQAQALGIPLTPQQAQELGIPL----TPQQAQALGIPLSLQQAQELGIPLTPQQAqalg 1551
Cdd:PHA03247  2636 NEPDPHPPPTVPP----PERPRDDPAPGRVSRPRRARRLGRAAqassPPQRPRRRAARPTVGSLTSLADPPPPPPT---- 2707
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1552 iPLTPQQAQELGIPLTP-QQAQSQGIPLTPQQaqelgiPLTPQQAQALGIPLTPQQMQAQGITLTPQQAQALGIPLTPQQ 1630
Cdd:PHA03247  2708 -PEPAPHALVSATPLPPgPAAARQASPALPAA------PAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPP 2780
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1631 LQAQGITLTPQQAQALGVPITPVNAWVSAVTLTPEQTqvlespinleqaqeqlsklgVPLTLDKAHTLGSPLTLKEVQwS 1710
Cdd:PHA03247  2781 RRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAA--------------------LPPAASPAGPLPPPTSAQPTA-P 2839
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1711 HKPFQKPKASLPTGQSIISRLSPSLRLSLASSVPTPEKSSILPISRVPLNQGPFPPgkplemgilsEPGKLGAPQTLRSs 1790
Cdd:PHA03247  2840 PPPPGPPPPSLPLGGSVAPGGDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRST----------ESFALPPDQPERP- 2908
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1791 gqtlvyggqstsvqfPAPQAPPTPGQLPKFGAPPTPGQPfeleafssrelfitrasltPPPPQMSNAPLAPrQRLIAGVP 1870
Cdd:PHA03247  2909 ---------------PQPQAPPPPQPQPQPPPPPQPQPP-------------------PPPPPRPQPPLAP-TTDPAGAG 2953
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1871 PTSGQIPSLW-APLSPGQRLVPEaSSIPgdllesgpltfseqlqefqPPATVEQSPYLQAPTSGQHLAPwTLPGLASSLW 1949
Cdd:PHA03247  2954 EPSGAVPQPWlGALVPGRVAVPR-FRVP-------------------QPAPSREAPASSTPPLTGHSLS-RVSSWASSLA 3012
                          570
                   ....*....|....*....
gi 1622844908 1950 IPPTSRHPP-----TLWPS 1963
Cdd:PHA03247  3013 LHEETDPPPvslkqTLWPP 3031
SP1-4_N super family  cl41773  N-terminal domain of transcription factor Specificity Proteins (SP) 1-4  1084-1311  3.88e-03

N-terminal domain of transcription factor Specificity Proteins (SP) 1-4; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. SPs belong to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP1-4.


The actual alignment was detected with superfamily member cd22553:

Pssm-ID: 425404 [Multi-domain]  Cd Length: 384  Bit Score: 41.94  E-value: 3.88e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1084 ITLTPQQAQAqgiMLTIQQAQELGIPLTLQQAQALEIPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQALGITL----- 1158
Cdd:cd22553    101 IQLAPGGTQA---ILANQQTLIRPNTVQGQANASNVLQNIAQIASGGNAVQLPLNNMTQTIPVQVPVSTANGQTVyqtiq 177
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1159 TPQQAQELGIPLTPQ---QAQALGITLTPQQAQELGIPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQALGIPLTLQQA 1235
Cdd:cd22553    178 VPIQAIQSGNAGGGNqalQAQVIPQLAQAAQLQPQQLAQVSSQGYIQQIPANASQQQPQMVQQGPNQSGQIIGQVASASS 257
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1622844908 1236 QELGIPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQALGIPLTPQQAqelgIPLTPQQAQALGITLTP-QQAQALG 1311
Cdd:cd22553    258 IQAAAIPLTVYTGALAGQNGSNQQQVGQIVTSPIQGMTQGLTAPASSS----IPTVVQQQAIQGNPLPPgTQIIAAG 330
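
The SP1-4_N description above gives the Buttonhead (BTD) box as the pattern CXCPXC. Reading X as "any residue" (an assumption; the description does not define it), the pattern can be scanned for with a regular expression. The sketch below is illustrative only: `find_btd_boxes` is a hypothetical helper and the test sequence is made up, not taken from any SP protein.

```python
import re

# CXCPXC with X treated as any single residue (assumption).
BTD_BOX = re.compile(r"C.CP.C")


def find_btd_boxes(protein):
    """Return (1-based position, matched text) for each non-overlapping hit."""
    return [(m.start() + 1, m.group()) for m in BTD_BOX.finditer(protein.upper())]


if __name__ == "__main__":
    toy = "MSTACRCPNCQQ"  # made-up sequence for illustration
    print(find_btd_boxes(toy))
```

Because `finditer` reports non-overlapping matches only, overlapping candidates would need a lookahead-based pattern instead.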
 
Name Accession Description Interval E-value
PTZ00121  PTZ00121  MAEBL; Provisional  386-999  2.24e-15

PHA03379  PHA03379  EBNA-3A; Provisional  1215-1602  5.14e-11

PHA03247  PHA03247  large tegument protein UL36; Provisional  1397-1963  1.00e-10

Atrophin-1  pfam03154  Atrophin-1 family  1715-1969  2.11e-09

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 63.25  E-value: 2.11e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1715 QKPKASLPTGQSIISRLSPSLRLSLASSVPTPEKSSILPISRVPLNQGPFPPGKPLEMGILSEPGKLGAPQTLRSSGQTL 1794
Cdd:pfam03154  170 QPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTLIQQTPTLHPQRLPSPHPPL 249
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1795 VYGGQSTSVQFPAPQAPPTP---GQLPKFGAP----------PTPGQPFELEAFSSRELF-----------ITRASLTPP 1850
Cdd:pfam03154  250 QPMTQPPPPSQVSPQPLPQPslhGQMPPMPHSlqtgpshmqhPVPPQPFPLTPQSSQSQVppgpspaapgqSQQRIHTPP 329
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1851 PPQMSNAPLAPRQRLIAGV--------PPTSGQIPSLWAPLS---PGQRLVPEASSIPGDLLESGPLTFSEQLQEFQPPA 1919
Cdd:pfam03154  330 SQSQLQSQQPPREQPLPPAplsmphikPPPTTPIPQLPNPQShkhPPHLSGPSPFQMNSNLPPPPALKPLSSLSTHHPPS 409
                          250       260       270       280       290
                   ....*....|....*....|....*....|....*....|....*....|...
gi 1622844908 1920 TveQSPYLQAPTSGQHL--APWTLPGLASSLWIPPT-SRHPPTLWPSPAPGKP 1969
Cdd:pfam03154  410 A--HPPPLQLMPQSQQLppPPAQPPVLTQSQSLPPPaASHPPTSGLHQVPSQS 460
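
The Atrophin-1 description above ties an expanded CAG repeat to an extended polyglutamine tract; CAG is a codon for glutamine (Q), so each in-frame CAG adds one Q. The sketch below finds the longest uninterrupted CAG run in a DNA string. The helper names are hypothetical, the input is a made-up toy sequence, and no repeat-length threshold is applied because the source states none.

```python
import re


def longest_cag_run(dna):
    """Return (0-based start, repeat count) of the longest uninterrupted CAG run.

    Returns (-1, 0) when the sequence contains no CAG at all. The reading
    frame is ignored, so the caller must check that the run is in frame.
    """
    best_start, best_count = -1, 0
    for m in re.finditer(r"(?:CAG)+", dna.upper()):
        count = len(m.group()) // 3
        if count > best_count:
            best_start, best_count = m.start(), count
    return best_start, best_count


def polyq_from_run(repeat_count):
    """Translate an in-frame run of CAG codons into its polyglutamine string."""
    return "Q" * repeat_count


if __name__ == "__main__":
    toy = "ATGCAGCAGCAGCAGTAA"  # made-up sequence: ATG, four CAG codons, stop
    start, count = longest_cag_run(toy)
    print(start, count, polyq_from_run(count))
```

In the toy sequence the run starts at index 3, directly after the ATG, so it is in frame and would encode four glutamines.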
AvrBs3  NF041308  type III secretion system effector avirulence protein AvrBs3  1072-1698  2.57e-07


Pssm-ID: 469205 [Multi-domain]  Cd Length: 1179  Bit Score: 56.50  E-value: 2.57e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1072 IHLTPQQAQEVGITLTPQQAQAQGIMLTIQQAQEL-GIPLTLQQAQALEIP-----LTPQQAQALGIPLTpqqaqelGIP 1145
Cdd:NF041308   216 IAVLPEATHKDIVEVGKQWSGARALQALLMVAEELrGPPLQLDTGQLIKIAkrggaPAVEAVHASRNALT-------GAP 288
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1146 L--TPQQAQALGITLTPQQAQE---LGIP--------LTPQQAQALGITLTPQQA--------QELGIP---LTPQQAQA 1201
Cdd:NF041308   289 LhlTPHQVVAIASNNGGKQALEtvqRLLPvlcqpphgLTPEQVVAIASNDGGKQAletvqrllPVLCQAehgLTPDQVVA 368
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1202 LGIPLTPQQAQELGIPLTPQQAQAlgiPLTLQQAQELGIPLTPQQAQALGI--PLTPQQAQELGiPLTPQQAQALGIPLT 1279
Cdd:NF041308   369 IASNIGGKPALETVQRLLPVLCQP---PHGLTPDQVVAIASNDGGKQALETvqRLLPVLCQAPH-GLTPDQVVAIASNDG 444
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1280 PQQAQELGIPLTPQQAQALGitLTPQQAQALGIPLTPQQAQELGIPLTPQQAQeLGIPLTPQQAQELGIPLTPQQAQELG 1359
Cdd:NF041308   445 GKQALETVQRLLPELCQAHG--LTPDQVVAIASNGGGKQALETVQRLLPVLCQ-PPHGLTPEQVVAIASNGGGKQALETV 521
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1360 IPLTPQQAQELGiPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQElGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQA 1439
Cdd:NF041308   522 QRLLPVLCQPPH-GLTPEQVVAIASHDGGKQALETVHRLLPVLCQA-PHGLTPEQVVAIASHNGGKQALETVQRLLPVLC 599
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1440 QElGIPLTPQQAQELGVTLTPQQAQELGIPLTPQQAQElGIPLTPQQ----AQELGIPLTPQQAQALgIPLTPQQAQELg 1515
Cdd:NF041308   600 QR-PYGLTPNQVVAIASNDGGKQALETVQRLLPVLCQA-PHGLTPDQvvaiASNGGGKQALETVQRL-LPVLCQRPHGL- 675
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1516 iplTPQQAQAL----GIPLSLQQAQELgIP--------LTPQQAQALGIPLTPQQA--------QELGIP---LTPQQAQ 1572
Cdd:NF041308   676 ---TPHQVVAIasndGGKQALETVQRL-LPvlcqppygLTPEQVVAIASNNGGKQAletvqrllPVLCQRphgLTPDQVV 751
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1573 SQGIPLTPQQAQELGIPLTPQQAQALgIPLTPQQMQAQGITLTPQQAQALGIPLTPQQLQAQGiTLTPQQA--------- 1643
Cdd:NF041308   752 AIASNDGGKQALETVQRLLPVLCQPP-HGLTPDQVVAIASNDGGKQALETVQRLLPVLCDAPH-GLTPHQVvaiasnigg 829
                          650       660       670       680       690
                   ....*....|....*....|....*....|....*....|....*....|....*.
gi 1622844908 1644 -QALGVPITPVNAWVSAVTLTPEQTQVLESPINLEQAQEQLSKLgVPLTLDKAHTL 1698
Cdd:NF041308   830 rQALETVQRLLPVLCQAHGLTPDQVVAIASNNGGKQALETVQRL-LPVLCQPPHGL 884
Med15  pfam09606  ARC105 or Med15 subunit of Mediator complex non-fungal  1341-1654  9.93e-07

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 54.24  E-value: 9.93e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1341 QQAQELGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELGipltPQQAQELGIPLTPQQAQeLGI 1420
Cdd:pfam09606   62 QPQGGQGNGGMGGGQQGMPDPINALQNLAGQGTRPQMMGPMGPGPGGPMGQQMGG----PGTASNLLASLGRPQMP-MGG 136
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1421 PLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELGVTLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQ 1500
Cdd:pfam09606  137 AGFPSQMSRVGRMQPGGQAGGMMQPSSGQPGSGTPNQMGPNGGPGQGQAGGMNGGQQGPMGGQMPPQMGVPGMPGPADAG 216
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1501 ALGipltPQQAQELGIPltPQQAQALGIPLSLQQAQELGipLTPQQAQALGIPLTPQQAQELGIPLTPQQAQSQGIPLTP 1580
Cdd:pfam09606  217 AQM----GQQAQANGGM--NPQQMGGAPNQVAMQQQQPQ--QQGQQSQLGMGINQMQQMPQGVGGGAGQGGPGQPMGPPG 288
                          250       260       270       280       290       300       310
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1622844908 1581 QQAQELGIPLTPQQAQALGIPLTPQQMQAQGITLTPQQaqalgiPLTPQQLQAQGITLTPQQAQALGVPITPVN 1654
Cdd:pfam09606  289 QQPGAMPNVMSIGDQNNYQQQQTRQQQQQQGGNHPAAH------QQQMNQSVGQGGQVVALGGLNHLETWNPGN 356
SMC_N  pfam02463  RecF/RecN/SMC N terminal domain  298-1005  1.39e-05

RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The SMC (structural maintenance of chromosomes) superfamily proteins have ATP-binding domains at the N- and C-termini, and two extended coiled-coil domains separated by a hinge in the middle. The eukaryotic SMC proteins form two kinds of heterodimers: the SMC1/SMC3 and the SMC2/SMC4 types. These heterodimers constitute an essential part of higher order complexes, which are involved in chromatin and DNA dynamics. This family also includes the RecF and RecN proteins that are involved in DNA metabolism and recombination.


Pssm-ID: 426784 [Multi-domain]  Cd Length: 1161  Bit Score: 50.74  E-value: 1.39e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  298 EKELSLKVIQDLSNENEMLQQKLHDAEEKCEQliRSKIVTEQAYAILSTSSTLKVLPGPSPQSSRAIIKvgdiEDNMDNI 377
Cdd:pfam02463  238 RIDLLQELLRDEQEEIESSKQEIEKEEEKLAQ--VLKENKEEEKEKKLQEEELKLLAKEEEELKSELLK----LERRKVD 311
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  378 LDKDLENIVDEVQRKEtkdsgIKWESSISYIAQAERTPDLTELQQQPV--ASEDISEDSTKDNVSLKEGDVYQEDEIDEY 455
Cdd:pfam02463  312 DEEKLKESEKEKKKAE-----KELKKEKEEIEELEKELKELEIKREAEeeEEEELEKLQEKLEQLEEELLAKKKLESERL 386
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  456 QSWKRKHTKGThvsetsgpNLSDNKGGQRVSEAKLSQYYELqALKKKRKEMKSFPEDKSKSPTEAKRKHLFLTETKsqgg 535
Cdd:pfam02463  387 SSAAKLKEEEL--------ELKSEEEKEAQLLLELARQLED-LLKEEKKEELEILEEEEESIELKQGKLTEEKEEL---- 453
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  536 KSGTSMMLEQFRMVKRESpfDKRPTAAEFKVEPTIESLDKEGEGEISSLVEplnmiqfddtaepQKGKIKGKKHRISSGT 615
Cdd:pfam02463  454 EKQELKLLKDELELKKSE--DLLKETQLVKLQEQLELLLSRQKLEERSQKE-------------SKARSGLKVLLALIKD 518
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  616 TTSKEETTEEKEVLTKQVkshrlVKSLSRVAKETSESTRVLESPDGESEQSNL---EEFQKAIMAFLKQKIDNTGKPfdK 692
Cdd:pfam02463  519 GVGGRIISAHGRLGDLGV-----AVENYKVAISTAVIVEVSATADEVEERQKLvraLTELPLGARKLRLLIPKLKLP--L 591
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  693 KTVPKEEALLKRTEAEKLGIIKAKMEEyfQKVAETVTKILRKYKEIKKEERVGEKPIKQKKVVSFMPGLhFQKSPISAKS 772
Cdd:pfam02463  592 KSIAVLEIDPILNLAQLDKATLEADED--DKRAKVVEGILKDTELTKLKESAKAKESGLRKGVSLEEGL-AEKSEVKASL 668
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  773 ESSTFLSHESTDPVINNLMQMILAEIESERdiptvsavQKDHKETEKQRREQYSQEGQEQMSGMSLKQQFLEERNLLKER 852
Cdd:pfam02463  669 SELTKELLEIQELQEKAESELAKEEILRRQ--------LEIKKKEQREKEELKKLKLEAEELLADRVQEAQDKINEELKL 740
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  853 YEKISENWEEKKAQLQMKEGKQEQQKQKQWQKEEMWKKEQKQTTPKQAEREEKQKQRGQEEEELSKSSLQRLEEGTRKmK 932
Cdd:pfam02463  741 LKQKIDEEEEEEEKSRLKKEEKEEEKSELSLKEKELAEEREKTEKLKVEEEKEEKLKAQEEELRALEEELKEEAELLE-E 819
                          650       660       670       680       690       700       710
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1622844908  933 AQGLLLEKENGQMRQIEKEVKHLGPNMRREKGKEKQKpERGLEDLRRQIKTKEQMQMKETQPKELEKLVTQTP 1005
Cdd:pfam02463  820 EQLLIEQEEKIKEEELEELALELKEEQKLEKLAEEEL-ERLEEEITKEELLQELLLKEEELEEQKLKDELESK 891
AvrBs3 NF041308
type III secretion system effector avirulence protein AvrBs3;
1146-1633 1.82e-05

type III secretion system effector avirulence protein AvrBs3;


Pssm-ID: 469205 [Multi-domain]  Cd Length: 1179  Bit Score: 50.34  E-value: 1.82e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1146 LTPQQAQALGITLTPQQAQELGIPLTPQQAQALGitLTPQQAQELGIPLTPQQAQALGIPLTPQQAQeLGIPLTPQQAQA 1225
Cdd:NF041308   431 LTPDQVVAIASNDGGKQALETVQRLLPELCQAHG--LTPDQVVAIASNGGGKQALETVQRLLPVLCQ-PPHGLTPEQVVA 507
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1226 LGIPLTLQQAQELGIPLTPQQAQAlgipltPQQaqelgipLTPQQAQALGIPLTPQQAQELGIPLTPQQAQAlGITLTPQ 1305
Cdd:NF041308   508 IASNGGGKQALETVQRLLPVLCQP------PHG-------LTPEQVVAIASHDGGKQALETVHRLLPVLCQA-PHGLTPE 573
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1306 QAQALGIPLTPQQAQELGIPLTPQQAQElGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQElGIPLTPQQAQELGIP 1385
Cdd:NF041308   574 QVVAIASHNGGKQALETVQRLLPVLCQR-PYGLTPNQVVAIASNDGGKQALETVQRLLPVLCQA-PHGLTPDQVVAIASN 651
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1386 LTPQQAQELGIPLTPQQAQElGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQElGIPLTPQQAQELGVTLTPQQAQE 1465
Cdd:NF041308   652 GGGKQALETVQRLLPVLCQR-PHGLTPHQVVAIASNDGGKQALETVQRLLPVLCQP-PYGLTPEQVVAIASNNGGKQALE 729
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1466 LGIPLTPQQAQElGIPLTPQQAQELGIPLTPQQAQALGIPLTPQQAQELgIPLTPQQAQALGIPLSLQQAQELGIPLTPQ 1545
Cdd:NF041308   730 TVQRLLPVLCQR-PHGLTPDQVVAIASNDGGKQALETVQRLLPVLCQPP-HGLTPDQVVAIASNDGGKQALETVQRLLPV 807
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1546 QAQALGiPLTPQQAQELGIPLTPQQAQSQGIPLTPQQAQELGipLTPQQAQALGIPLTPQQMQAQGITLTPQQAQAlGIP 1625
Cdd:NF041308   808 LCDAPH-GLTPHQVVAIASNIGGRQALETVQRLLPVLCQAHG--LTPDQVVAIASNNGGKQALETVQRLLPVLCQP-PHG 883

                   ....*...
gi 1622844908 1626 LTPQQLQA 1633
Cdd:NF041308   884 LTPHQVVA 891
SP2_N cd22540
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins ...
1089-1472 4.72e-05

N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate or, in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9.
The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.
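
The degenerate consensus motif NNRCRCCYY quoted above can be expanded into a regular expression for scanning a DNA sequence. The following is a minimal sketch only: the function names (motif_to_regex, find_motif) and the example sequence are hypothetical and not part of the CDD record, and only the three degenerate codes defined in the description (N, R, Y) plus the four plain bases are handled. It scans the strand exactly as given; a practical search would probably also check the reverse complement.

```python
import re

# Codes defined in the description: N = any nucleotide, R = A/G, Y = C/T.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]", "Y": "[CT]", "N": "[ACGT]",
}

def motif_to_regex(motif):
    """Translate a degenerate motif such as NNRCRCCYY into a regex string."""
    return "".join(IUPAC[base] for base in motif.upper())

def find_motif(motif, sequence):
    """Return (start, matched_text) for every occurrence, including overlaps.

    Positions are 0-based offsets into `sequence`; only the given strand
    is scanned.
    """
    # A lookahead with a capture group lets finditer report overlapping hits.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(sequence.upper())]

if __name__ == "__main__":
    # Made-up sequence for illustration only.
    for start, hit in find_motif("NNRCRCCYY", "GGAAGCGCCTTGG"):
        print(start, hit)
```

Because the first two positions are N, any fixed 9-mer is tested mainly on positions 3 to 9, so the motif is expected to match frequently in GC-rich sequence.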


Pssm-ID: 411776 [Multi-domain]  Cd Length: 511  Bit Score: 48.38  E-value: 4.72e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1089 QQAQAQGIMLTIQQAQelGIPLTlQQAQALEIPLTPQQAQA-LGIPLTPQQAQELGIPLTPQQAQALGITLTPQQAQELG 1167
Cdd:cd22540    130 PQIQAAGQINNSGQIQ--IIPGT-NQAIITPVQVLQQPQQAhKPVPIKPAPLQTSNTNSASLQVPGNVIKLQSGGNVALT 206
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1168 IPLTPQQAQALGITLTPQQAQELGIP-LTPQQAQALGIPLTPQQAQE--LGIPLTPQQAQALGIPLTLQQAQELGIPLTP 1244
Cdd:cd22540    207 LPVNNLVGTQDGATQLQLAAAPSKPSkKIRKKSAQAAQPAVTVAEQVetVLIETTADNIIQAGNNLLIVQSPGTGQPAVL 286
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1245 QQAQALgipltpQQAQELGIPLTPQQAqalgipLTPQQAQELGIPLTPQQaqalgitlTPQQAQALGIPLTPQQAQeLGI 1324
Cdd:cd22540    287 QQVQVL------QPKQEQQVVQIPQQA------LRVVQAASATLPTVPQK--------PLQNIQIQNSEPTPTQVY-IKT 345
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1325 PLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELG--IPLTPQQAQELGiPLTPQQAQELGIplTPQQAQELGIplTPQQ 1402
Cdd:cd22540    346 PSGEVQTVLLQEAPAATATPSSSTSTVQQQVTANNgtGTSKPNYNVRKE-RTLPKIAPAGGI--ISLNAAQLAA--AAQA 420
                          330       340       350       360       370       380       390
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1403 AQELGIPLTPQQAQELGIPLTPQQAQeLGIPLTPQQAQELGiPLTPQQAQElgvtltpQQAQELGIPLTP 1472
Cdd:cd22540    421 IQTININGVQVQGVPVTITNAGGQQQ-LTVQTVSSNNLTIS-GLSPTQIQL-------QMEQALEIETQP 481
SMC_prok_A TIGR02169
chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of ...
272-611 1.93e-04

chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. It is found in a single copy and is homodimeric in prokaryotes, but six paralogs (excluded from this family) are found in eukaryotes, where SMC proteins are heterodimeric. This family represents the SMC protein of archaea and a few bacteria (Aquifex, Synechocystis, etc); the SMC of other bacteria is described by TIGR02168. The N- and C-terminal domains of this protein are well conserved, but the central hinge region is skewed in composition and highly divergent. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


Pssm-ID: 274009 [Multi-domain]  Cd Length: 1164  Bit Score: 46.98  E-value: 1.93e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  272 LSMLNDELKGVNFQSSTvCVQETSEAEKEL---------SLKVIQDLSNENEMLQQKLHDAEEKCEQLIRSKIVTEQayA 342
Cdd:TIGR02169  676 LQRLRERLEGLKRELSS-LQSELRRIENRLdelsqelsdASRKIGEIEKEIEQLEQEEEKLKERLEELEEDLSSLEQ--E 752
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  343 ILSTSSTLKVLPGPSPQSSRAIIKvgdIEDNMDNILDKDLENIVDEVQRK--ETKDSGIKWESSISYIAQaertpDLTEL 420
Cdd:TIGR02169  753 IENVKSELKELEARIEELEEDLHK---LEEALNDLEARLSHSRIPEIQAElsKLEEEVSRIEARLREIEQ-----KLNRL 824
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  421 QQQPVASEDISEDSTKDNVSLKEGDVYQEDEIDEYQSWKRKhtKGTHVSETsgpnlsdnkggqrvsEAKLSQYY-ELQAL 499
Cdd:TIGR02169  825 TLEKEYLEKEIQELQEQRIDLKEQIKSIEKEIENLNGKKEE--LEEELEEL---------------EAALRDLEsRLGDL 887
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  500 KKKRKEMKS------FPEDKSKSPTEAKRKHLFLTETKSQGGKSGTSMMLEQFRMVKRESPfdkrPTAAEFKVEPTIESL 573
Cdd:TIGR02169  888 KKERDELEAqlreleRKIEELEAQIEKKRKRLSELKAKLEALEEELSEIEDPKGEDEEIPE----EELSLEDVQAELQRV 963
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|
gi 1622844908  574 dkegEGEISSLvEPLNMIQFDDTAEPQK--GKIKGKKHRI 611
Cdd:TIGR02169  964 ----EEEIRAL-EPVNMLAIQEYEEVLKrlDELKEKRAKL 998
KREPA2 cd23959
Kinetoplastid RNA Editing Protein A2 (KREPA2); The KREPA2 (TbMP63) protein is a component of ...
1778-1931 3.59e-04

Kinetoplastid RNA Editing Protein A2 (KREPA2); The KREPA2 (TbMP63) protein is a component of the parasitic protozoan's KREPA RNA editing catalytic complex (RECC). Kinetoplastid RNA editing (KRE) proteins occur as pairs or sets of related proteins in multiple complexes. The KREPA complex is composed of six components (KREPA1-6), which share a conserved C-terminal region containing an oligonucleotide-binding (OB)-fold-like domain. KREPAs are responsible for the site-specific insertion and deletion of U nucleotides in the kinetoplastid mitochondrial pre-messenger RNA. Apart from the conserved C-terminal OB-fold domain, KREPA1, KREPA2, and KREPA3 contain two conserved C2H2 zinc-finger domains. KREPA2 and kinetoplastid RNA editing ligase 1 (KREL1) are specific for ligation post-U-deletion and are paralogous to KREL2 and KREPA1 that are specific for ligation post-U-insertion. KREPA2 is critical for RECC stability and KREL1 integration into the complex.


Pssm-ID: 467780 [Multi-domain]  Cd Length: 424  Bit Score: 45.63  E-value: 3.59e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1778 PGKLGAPQTLRSSGQTLVYGGQSTSVQFPAPQAPPTP--GQLPKFGAPPTPGQPFELEAfssrelfitRASLTPPPPQMS 1855
Cdd:cd23959    112 AARVPNPFSASSSTQRETHKTAQVAPPKAEPQTAPVTpfGQLPMFGQHPPPAKPLPAAA---------AAQQSSASPGEV 182
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1622844908 1856 NAPLAPrqrliaGVPPTSgqipslwaPLSPGQRLVPEASSIPGDLLE-SGPLTFSEQLQEFQPPATVEQSPYLQAPT 1931
Cdd:cd23959    183 ASPFAS------GTVSAS--------PFATATDTAPSSGAPDGFPAEaSAPSPFAAPASAASFPAAPVANGEAATPT 245
SP1-4_arthropods_N cd22553
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ...
1084-1311 3.88e-03

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.


Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  Bit Score: 41.94  E-value: 3.88e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1084 ITLTPQQAQAqgiMLTIQQAQELGIPLTLQQAQALEIPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQALGITL----- 1158
Cdd:cd22553    101 IQLAPGGTQA---ILANQQTLIRPNTVQGQANASNVLQNIAQIASGGNAVQLPLNNMTQTIPVQVPVSTANGQTVyqtiq 177
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1159 TPQQAQELGIPLTPQ---QAQALGITLTPQQAQELGIPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQALGIPLTLQQA 1235
Cdd:cd22553    178 VPIQAIQSGNAGGGNqalQAQVIPQLAQAAQLQPQQLAQVSSQGYIQQIPANASQQQPQMVQQGPNQSGQIIGQVASASS 257
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1622844908 1236 QELGIPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQALGIPLTPQQAqelgIPLTPQQAQALGITLTP-QQAQALG 1311
Cdd:cd22553    258 IQAAAIPLTVYTGALAGQNGSNQQQVGQIVTSPIQGMTQGLTAPASSS----IPTVVQQQAIQGNPLPPgTQIIAAG 330
PABP-1234 TIGR01628
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ...
1521-1628 9.37e-03

polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens: one expressed in testis, one expressed in platelets, one broadly expressed, and one of unknown tissue range.


Pssm-ID: 130689 [Multi-domain]  Cd Length: 562  Bit Score: 41.33  E-value: 9.37e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1521 QQAQALGIPLSLQQAQELGIPLTPQ-QAQALGI-PLTPQQAQELGIPLTPQQAQSQGIPLTPQQAQELGIPLTPQQA--- 1595
Cdd:TIGR01628  405 PQQQFNGQPLGWPRMSMMPTPMGPGgPLRPNGLaPMNAVRAPSRNAQNAAQKPPMQPVMYPPNYQSLPLSQDLPQPQsta 484
                           90       100       110
                   ....*....|....*....|....*....|....*
gi 1622844908 1596 -QALGIPLTPQQMQAqgitLTPQ-QAQALGIPLTP 1628
Cdd:TIGR01628  485 sQGGQNKKLAQVLAS----ATPQmQKQVLGERLFP 515
 
Name Accession Description Interval E-value
PTZ00121 PTZ00121
MAEBL; Provisional
386-999 2.24e-15

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 83.27  E-value: 2.24e-15
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  386 VDEVQRKETKDSGIKWESSISYIAQAE--RTPDLTELQQQPVASEDISEDSTKDNVSLKEGD-VYQEDEIDEYQSWKRKH 462
Cdd:PTZ00121  1232 AEEAKKDAEEAKKAEEERNNEEIRKFEeaRMAHFARRQAAIKAEEARKADELKKAEEKKKADeAKKAEEKKKADEAKKKA 1311
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  463 TKGTHVSETSgpnlsdnkggQRVSEAKlsqyYELQALKKKRKEMKSFPEDKsKSPTEAKRKHLFLTETKSQGGKSGTSMM 542
Cdd:PTZ00121  1312 EEAKKADEAK----------KKAEEAK----KKADAAKKKAEEAKKAAEAA-KAEAEAAADEAEAAEEKAEAAEKKKEEA 1376
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  543 LEQFRMVKRESPFDKRPTAAEFKVEPTIESLD--KEGEGEISSLVEPLNMIQFDDTAEPQKGKIKGKKhriSSGTTTSKE 620
Cdd:PTZ00121  1377 KKKADAAKKKAEEKKKADEAKKKAEEDKKKADelKKAAAAKKKADEAKKKAEEKKKADEAKKKAEEAK---KADEAKKKA 1453
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  621 ETTEEKEVLTKQVKSHRLVKSLSRVAKETSESTRVLESpdGESEQSNLEEFQKAIMAflKQKIDNTGKPFDKKtvpKEEA 700
Cdd:PTZ00121  1454 EEAKKAEEAKKKAEEAKKADEAKKKAEEAKKADEAKKK--AEEAKKKADEAKKAAEA--KKKADEAKKAEEAK---KADE 1526
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  701 LLKRTEAEKLGiiKAKMEEYFQKVAEtvtkiLRKYKEIKKEERVgeKPIKQKKVVSFMPGLHFQKSPISAKSESSTFLSH 780
Cdd:PTZ00121  1527 AKKAEEAKKAD--EAKKAEEKKKADE-----LKKAEELKKAEEK--KKAEEAKKAEEDKNMALRKAEEAKKAEEARIEEV 1597
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  781 ESTDPVINNLMQMILAEIESERdiptVSAVQKDHKETEKQRREQY-SQEGQEQMSGMSLKQQflEERNLLKERYEKISEN 859
Cdd:PTZ00121  1598 MKLYEEEKKMKAEEAKKAEEAK----IKAEELKKAEEEKKKVEQLkKKEAEEKKKAEELKKA--EEENKIKAAEEAKKAE 1671
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  860 WEEKKAQLQMKEGKQEqqkqkqwqkeemwKKEQKQTTPKQAEREEKQKQRGQEEEELSKSSLQRLEEGTRKMKAQgllle 939
Cdd:PTZ00121  1672 EDKKKAEEAKKAEEDE-------------KKAAEALKKEAEEAKKAEELKKKEAEEKKKAEELKKAEEENKIKAE----- 1733
                          570       580       590       600       610       620
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1622844908  940 kengQMRQIEKEVKHLGPNMRREKGKEK--QKPERGLEDLRRQIKTKEQMQMKETQPKELEK 999
Cdd:PTZ00121  1734 ----EAKKEAEEDKKKAEEAKKDEEEKKkiAHLKKEEEKKAEEIRKEKEAVIEEELDEEDEK 1791
PHA03379 PHA03379
EBNA-3A; Provisional
1215-1602 5.14e-11

EBNA-3A; Provisional


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 68.55  E-value: 5.14e-11
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1215 GIPLTPQQAQALGIPLTLQQAQELGIPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQALGIPLTPQQAQELGIPLTPQq 1294
Cdd:PHA03379   414 GTPRPPVEKPRPEVPQSLETATSHGSAQVPEPPPVHDLEPGPLHDQHSMAPCPVAQLPPGPLQDLEPGDQLPGVVQDGR- 492
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1295 aqaLGITLTPQQAQALGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQEL----GIPLTPQQAQEL 1370
Cdd:PHA03379   493 ---PACAPVPAPAGPIVRPWEASLSQVPGVAFAPVMPQPMPVEPVPVPTVALERPVCPAPPLIAmqgpGETSGIVRVRER 569
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1371 -----GIPLTPQQAQELGIPLTPQQAQELGIPLT-PQQAQELGIPLTPQQaQELGIPLTPQQAQELGIPLTPQQ--AQEL 1442
Cdd:PHA03379   570 wrpapWTPNPPRSPSQMSVRDRLARLRAEAQPYQaSVEVQPPQLTQVSPQ-QPMEYPLEPEQQMFPGSPFSQVAdvMRAG 648
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1443 GIPLTPQQAQELGVtltpQQAQELGIPLTPQQAQELGIPLTPQ-QAQELGIPLTPQQAQ-ALGIPLTPQQAQElgIPLTP 1520
Cdd:PHA03379   649 GVPAMQPQYFDLPL----QQPISQGAPLAPLRASMGPVPPVPAtQPQYFDIPLTEPINQgASAAHFLPQQPME--GPLVP 722
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1521 QQAQALGIPLS------LQQAQELGIPLT-PQQAQALGIPLTPQQAQELgiPLTPQQAQSQGIPLTPqqaqelGIPLTPQ 1593
Cdd:PHA03379   723 ERWMFQGATLSqsvrpgVAQSQYFDLPLTqPINHGAPAAHFLHQPPMEG--PWVPEQWMFQGAPPSQ------GTDVVQH 794

                   ....*....
gi 1622844908 1594 QAQALGIPL 1602
Cdd:PHA03379   795 QLDALGYVL 803
PHA03247 PHA03247
large tegument protein UL36; Provisional
1397-1963 1.00e-10

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 68.04  E-value: 1.00e-10
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1397 PLTPQQAQELGIPlTPQQAQELGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELGVTLTPQQAQELGIPLTPQ-QA 1475
Cdd:PHA03247  2557 PAAPPAAPDRSVP-PPRPAPRPSEPAVTSRARRPDAPPQSARPRAPVDDRGDPRGPAPPSPLPPDTHAPDPPPPSPSpAA 2635
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1476 QELGIPLTPQQAQelgiPLTPQQAQALGIPLTPQQAQELGIPL----TPQQAQALGIPLSLQQAQELGIPLTPQQAqalg 1551
Cdd:PHA03247  2636 NEPDPHPPPTVPP----PERPRDDPAPGRVSRPRRARRLGRAAqassPPQRPRRRAARPTVGSLTSLADPPPPPPT---- 2707
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1552 iPLTPQQAQELGIPLTP-QQAQSQGIPLTPQQaqelgiPLTPQQAQALGIPLTPQQMQAQGITLTPQQAQALGIPLTPQQ 1630
Cdd:PHA03247  2708 -PEPAPHALVSATPLPPgPAAARQASPALPAA------PAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPP 2780
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1631 LQAQGITLTPQQAQALGVPITPVNAWVSAVTLTPEQTqvlespinleqaqeqlsklgVPLTLDKAHTLGSPLTLKEVQwS 1710
Cdd:PHA03247  2781 RRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAA--------------------LPPAASPAGPLPPPTSAQPTA-P 2839
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1711 HKPFQKPKASLPTGQSIISRLSPSLRLSLASSVPTPEKSSILPISRVPLNQGPFPPgkplemgilsEPGKLGAPQTLRSs 1790
Cdd:PHA03247  2840 PPPPGPPPPSLPLGGSVAPGGDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRST----------ESFALPPDQPERP- 2908
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1791 gqtlvyggqstsvqfPAPQAPPTPGQLPKFGAPPTPGQPfeleafssrelfitrasltPPPPQMSNAPLAPrQRLIAGVP 1870
Cdd:PHA03247  2909 ---------------PQPQAPPPPQPQPQPPPPPQPQPP-------------------PPPPPRPQPPLAP-TTDPAGAG 2953
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1871 PTSGQIPSLW-APLSPGQRLVPEaSSIPgdllesgpltfseqlqefqPPATVEQSPYLQAPTSGQHLAPwTLPGLASSLW 1949
Cdd:PHA03247  2954 EPSGAVPQPWlGALVPGRVAVPR-FRVP-------------------QPAPSREAPASSTPPLTGHSLS-RVSSWASSLA 3012
                          570
                   ....*....|....*....
gi 1622844908 1950 IPPTSRHPP-----TLWPS 1963
Cdd:PHA03247  3013 LHEETDPPPvslkqTLWPP 3031
PHA03379 PHA03379
EBNA-3A; Provisional
1301-1812 1.35e-09

EBNA-3A; Provisional


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 63.92  E-value: 1.35e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1301 TLTPQQAQALGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQ 1380
Cdd:PHA03379   404 ALEKASEPTYGTPRPPVEKPRPEVPQSLETATSHGSAQVPEPPPVHDLEPGPLHDQHSMAPCPVAQLPPGPLQDLEPGDQ 483
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1381 ELGIPLTPQqaqeLGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQEL----GV 1456
Cdd:PHA03379   484 LPGVVQDGR----PACAPVPAPAGPIVRPWEASLSQVPGVAFAPVMPQPMPVEPVPVPTVALERPVCPAPPLIAmqgpGE 559
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1457 TLTPQQAQEL-----GIPLTPQQAQELGIPLTPQQAQELGIPLT-PQQAQALGIPLTPQQaQELGIPLTPQQAQALGIPL 1530
Cdd:PHA03379   560 TSGIVRVRERwrpapWTPNPPRSPSQMSVRDRLARLRAEAQPYQaSVEVQPPQLTQVSPQ-QPMEYPLEPEQQMFPGSPF 638
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1531 SlqqaqelgipLTPQQAQALGIPLtpQQAQELGIPLtpQQAQSQGIPLTPQQAQELGIPLTPQ-QAQALGIPLTPQQMQA 1609
Cdd:PHA03379   639 S----------QVADVMRAGGVPA--MQPQYFDLPL--QQPISQGAPLAPLRASMGPVPPVPAtQPQYFDIPLTEPINQG 704
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1610 QGITLTPQQAQALGiPLTPQQLQAQGITLTPQ------QAQALGVPIT-PVNAWVSAVTL----------TPEQTQVLES 1672
Cdd:PHA03379   705 ASAAHFLPQQPMEG-PLVPERWMFQGATLSQSvrpgvaQSQYFDLPLTqPINHGAPAAHFlhqppmegpwVPEQWMFQGA 783
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1673 PIN--LEQAQEQLSKLGVPLTLDKAHTLGSPLTLKEVQWSHKPFQKPKASLPTGQ-SIISRLSPSLRLSL-------ASS 1742
Cdd:PHA03379   784 PPSqgTDVVQHQLDALGYVLHVLNHPGVPVSPAVNQYHVSQAAFGLPIDEDESGEgSDTSEPCEALDLSIhgrpcpqAPE 863
                          490       500       510       520       530       540       550
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1622844908 1743 VPTPEKSSILPISRVPLNQGPFPPGKPLEMGILSEPGK-LGAPQTLRSSGQTLVYGGQSTsvQFPAPQAPP 1812
Cdd:PHA03379   864 WPVQGEGGQDATEVLDLSIHGRPRPRTPEWPVQGEDGQnVTGAESRRVVVSALVHMCQDD--EFPDLQDPP 932
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
1715-1969 2.11e-09

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 63.25  E-value: 2.11e-09
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1715 QKPKASLPTGQSIISRLSPSLRLSLASSVPTPEKSSILPISRVPLNQGPFPPGKPLEMGILSEPGKLGAPQTLRSSGQTL 1794
Cdd:pfam03154  170 QPPVLQAQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTLIQQTPTLHPQRLPSPHPPL 249
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1795 VYGGQSTSVQFPAPQAPPTP---GQLPKFGAP----------PTPGQPFELEAFSSRELF-----------ITRASLTPP 1850
Cdd:pfam03154  250 QPMTQPPPPSQVSPQPLPQPslhGQMPPMPHSlqtgpshmqhPVPPQPFPLTPQSSQSQVppgpspaapgqSQQRIHTPP 329
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1851 PPQMSNAPLAPRQRLIAGV--------PPTSGQIPSLWAPLS---PGQRLVPEASSIPGDLLESGPLTFSEQLQEFQPPA 1919
Cdd:pfam03154  330 SQSQLQSQQPPREQPLPPAplsmphikPPPTTPIPQLPNPQShkhPPHLSGPSPFQMNSNLPPPPALKPLSSLSTHHPPS 409
                          250       260       270       280       290
                   ....*....|....*....|....*....|....*....|....*....|...
gi 1622844908 1920 TveQSPYLQAPTSGQHL--APWTLPGLASSLWIPPT-SRHPPTLWPSPAPGKP 1969
Cdd:pfam03154  410 A--HPPPLQLMPQSQQLppPPAQPPVLTQSQSLPPPaASHPPTSGLHQVPSQS 460
PHA03247 PHA03247
large tegument protein UL36; Provisional
1732-2122 1.37e-08

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 60.72  E-value: 1.37e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1732 SPSLRLSLASSVPTPEKSSILPISRVPLNQGPFPPGKPLEMGILSEPGKLGAPQTLRSSGQTlvyGGQSTSVQFPAPQA- 1810
Cdd:PHA03247  2612 APPSPLPPDTHAPDPPPPSPSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGRA---AQASSPPQRPRRRAa 2688
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1811 PPTPGQLPKFGAPPTPGQPFEleafsSRELFITRASLTPPPPQMSN--APLAPrqrlIAGVPPTSGQIPSLWA----PLS 1884
Cdd:PHA03247  2689 RPTVGSLTSLADPPPPPPTPE-----PAPHALVSATPLPPGPAAARqaSPALP----AAPAPPAVPAGPATPGgparPAR 2759
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1885 PGQRLVPEASSIPGDLLESGPLTFSeqlqefqPPATVEQSPYLQAPTSGQHLAPWTLPGLASSLWIPPTSR----HPPTL 1960
Cdd:PHA03247  2760 PPTTAGPPAPAPPAAPAAGPPRRLT-------RPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASpagpLPPPT 2832
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1961 WPSPAPGKPQKGWSP-------SVA-----KKRSAIISSLTSKSALIHPRAPAFKVAQVPFTTKKFQMPevsdtseetqi 2028
Cdd:PHA03247  2833 SAQPTAPPPPPGPPPpslplggSVApggdvRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALP----------- 2901
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 2029 lrdPFAMESFRTFQSHLTKYRTPVSQTPYTGEGALPTLMKPTSLSSLTTLLKTsqisplewyQKSRFPPIDKPWILSSVS 2108
Cdd:PHA03247  2902 ---PDQPERPPQPQAPPPPQPQPQPPPPPQPQPPPPPPPRPQPPLAPTTDPAG---------AGEPSGAVPQPWLGALVP 2969
                          410
                   ....*....|....*
gi 1622844908 2109 GTKK-PKIMVPPSSP 2122
Cdd:PHA03247  2970 GRVAvPRFRVPQPAP 2984
PHA03247 PHA03247
large tegument protein UL36; Provisional
1744-2024 6.97e-08

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 58.41  E-value: 6.97e-08
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1744 PTPEKSSILPISRVPLNQGPFPPGKplemgilSEPGKLGAPQTLRSSGQTLVYGGQSTSVQFPAPQAP--PTPGQLPKFG 1821
Cdd:PHA03247  2706 PTPEPAPHALVSATPLPPGPAAARQ-------ASPALPAAPAPPAVPAGPATPGGPARPARPPTTAGPpaPAPPAAPAAG 2778
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1822 APPTPGQPFELEAFSSRELFITRASLTPPPPQMS--NAPLAPRQRLIAGVPP--TSGQIPSLWAPLSPGQRLVPEASSIP 1897
Cdd:PHA03247  2779 PPRRLTRPAVASLSESRESLPSPWDPADPPAAVLapAAALPPAASPAGPLPPptSAQPTAPPPPPGPPPPSLPLGGSVAP 2858
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1898 G-DLLESGPL--TFSEQLQEFQPPATVEQSPYLQAPTSGQHLAPWTLPGLAS-SLWIPPTSRHPPTLWPSPAPGKPQKGW 1973
Cdd:PHA03247  2859 GgDVRRRPPSrsPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPPQpQAPPPPQPQPQPPPPPQPQPPPPPPPR 2938
                          250       260       270       280       290
                   ....*....|....*....|....*....|....*....|....*....|.
gi 1622844908 1974 SPSVAKKRSAIISSLTSKSALIHPRAPAFKVAQVPFTtkKFQMPEVSDTSE 2024
Cdd:PHA03247  2939 PQPPLAPTTDPAGAGEPSGAVPQPWLGALVPGRVAVP--RFRVPQPAPSRE 2987
AvrBs3 NF041308
type III secretion system effector avirulence protein AvrBs3;
1072-1698 2.57e-07

type III secretion system effector avirulence protein AvrBs3;


Pssm-ID: 469205 [Multi-domain]  Cd Length: 1179  Bit Score: 56.50  E-value: 2.57e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1072 IHLTPQQAQEVGITLTPQQAQAQGIMLTIQQAQEL-GIPLTLQQAQALEIP-----LTPQQAQALGIPLTpqqaqelGIP 1145
Cdd:NF041308   216 IAVLPEATHKDIVEVGKQWSGARALQALLMVAEELrGPPLQLDTGQLIKIAkrggaPAVEAVHASRNALT-------GAP 288
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1146 L--TPQQAQALGITLTPQQAQE---LGIP--------LTPQQAQALGITLTPQQA--------QELGIP---LTPQQAQA 1201
Cdd:NF041308   289 LhlTPHQVVAIASNNGGKQALEtvqRLLPvlcqpphgLTPEQVVAIASNDGGKQAletvqrllPVLCQAehgLTPDQVVA 368
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1202 LGIPLTPQQAQELGIPLTPQQAQAlgiPLTLQQAQELGIPLTPQQAQALGI--PLTPQQAQELGiPLTPQQAQALGIPLT 1279
Cdd:NF041308   369 IASNIGGKPALETVQRLLPVLCQP---PHGLTPDQVVAIASNDGGKQALETvqRLLPVLCQAPH-GLTPDQVVAIASNDG 444
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1280 PQQAQELGIPLTPQQAQALGitLTPQQAQALGIPLTPQQAQELGIPLTPQQAQeLGIPLTPQQAQELGIPLTPQQAQELG 1359
Cdd:NF041308   445 GKQALETVQRLLPELCQAHG--LTPDQVVAIASNGGGKQALETVQRLLPVLCQ-PPHGLTPEQVVAIASNGGGKQALETV 521
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1360 IPLTPQQAQELGiPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQElGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQA 1439
Cdd:NF041308   522 QRLLPVLCQPPH-GLTPEQVVAIASHDGGKQALETVHRLLPVLCQA-PHGLTPEQVVAIASHNGGKQALETVQRLLPVLC 599
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1440 QElGIPLTPQQAQELGVTLTPQQAQELGIPLTPQQAQElGIPLTPQQ----AQELGIPLTPQQAQALgIPLTPQQAQELg 1515
Cdd:NF041308   600 QR-PYGLTPNQVVAIASNDGGKQALETVQRLLPVLCQA-PHGLTPDQvvaiASNGGGKQALETVQRL-LPVLCQRPHGL- 675
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1516 iplTPQQAQAL----GIPLSLQQAQELgIP--------LTPQQAQALGIPLTPQQA--------QELGIP---LTPQQAQ 1572
Cdd:NF041308   676 ---TPHQVVAIasndGGKQALETVQRL-LPvlcqppygLTPEQVVAIASNNGGKQAletvqrllPVLCQRphgLTPDQVV 751
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1573 SQGIPLTPQQAQELGIPLTPQQAQALgIPLTPQQMQAQGITLTPQQAQALGIPLTPQQLQAQGiTLTPQQA--------- 1643
Cdd:NF041308   752 AIASNDGGKQALETVQRLLPVLCQPP-HGLTPDQVVAIASNDGGKQALETVQRLLPVLCDAPH-GLTPHQVvaiasnigg 829
                          650       660       670       680       690
                   ....*....|....*....|....*....|....*....|....*....|....*.
gi 1622844908 1644 -QALGVPITPVNAWVSAVTLTPEQTQVLESPINLEQAQEQLSKLgVPLTLDKAHTL 1698
Cdd:NF041308   830 rQALETVQRLLPVLCQAHGLTPDQVVAIASNNGGKQALETVQRL-LPVLCQPPHGL 884
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
1559-1966 9.10e-07

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 54.39  E-value: 9.10e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1559 AQELGIPLTPQQAQSQGIPLTPQQaqeLGIPLTPQQAQALGIPLTPQqMQAQGITLTPQQAQALGIPLTPQQLQAQGITL 1638
Cdd:pfam03154  162 AQQQILQTQPPVLQAQSGAASPPS---PPPPGTTQAATAGPTPSAPS-VPPQGSPATSQPPNQTQSTAAPHTLIQQTPTL 237
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1639 TPQQAQALGVPITPVnawvsavTLTPEQTQVLESPINLEQAQEQLSKLGVPLTLDKAH------TLGSPLTLKEVQWSHK 1712
Cdd:pfam03154  238 HPQRLPSPHPPLQPM-------TQPPPPSQVSPQPLPQPSLHGQMPPMPHSLQTGPSHmqhpvpPQPFPLTPQSSQSQVP 310
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1713 PFQKPKASLPTGQSIISRLSPSlrlslASSVPTPEKSSILPISRVPLNQGPFPPGKPLEmgilsepgKLGAPQtlrssgq 1792
Cdd:pfam03154  311 PGPSPAAPGQSQQRIHTPPSQS-----QLQSQQPPREQPLPPAPLSMPHIKPPPTTPIP--------QLPNPQ------- 370
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1793 tlvyggqstSVQFPAPQAPPTPGQLPKFGAPPTPGQPfeLEAFSSRElfitRASLTPPP----PQMSNAPLAPRQrliag 1868
Cdd:pfam03154  371 ---------SHKHPPHLSGPSPFQMNSNLPPPPALKP--LSSLSTHH----PPSAHPPPlqlmPQSQQLPPPPAQ----- 430
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1869 vPPTSGQIPSLWAPLS---PGQRLVPEASSIPGDLLESGPLTFSEQLQEFQPPATVEQS-PYLQAPTSGQHLAPWTLPGL 1944
Cdd:pfam03154  431 -PPVLTQSQSLPPPAAshpPTSGLHQVPSQSPFPQHPFVPGGPPPITPPSGPPTSTSSAmPGIQPPSSASVSSSGPVPAA 509
                          410       420       430
                   ....*....|....*....|....*....|....*
gi 1622844908 1945 ASSLWIP-------------PTSRHPPTLWPSPAP 1966
Cdd:pfam03154  510 VSCPLPPvqikeealdeaeePESPPPPPRSPSPEP 544
Med15 pfam09606
ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of ...
1341-1654 9.93e-07

ARC105 or Med15 subunit of Mediator complex non-fungal; The approx. 70 residue Med15 domain of the ARC-Mediator co-activator is a three-helix bundle with marked similarity to the KIX domain. The sterol regulatory element binding protein (SREBP) family of transcription activators use the ARC105 subunit to activate target genes in the regulation of cholesterol and fatty acid homeostasis. In addition, Med15 is a critical transducer of gene activation signals that control early metazoan development.


Pssm-ID: 312941 [Multi-domain]  Cd Length: 732  Bit Score: 54.24  E-value: 9.93e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1341 QQAQELGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELGipltPQQAQELGIPLTPQQAQeLGI 1420
Cdd:pfam09606   62 QPQGGQGNGGMGGGQQGMPDPINALQNLAGQGTRPQMMGPMGPGPGGPMGQQMGG----PGTASNLLASLGRPQMP-MGG 136
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1421 PLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELGVTLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQ 1500
Cdd:pfam09606  137 AGFPSQMSRVGRMQPGGQAGGMMQPSSGQPGSGTPNQMGPNGGPGQGQAGGMNGGQQGPMGGQMPPQMGVPGMPGPADAG 216
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1501 ALGipltPQQAQELGIPltPQQAQALGIPLSLQQAQELGipLTPQQAQALGIPLTPQQAQELGIPLTPQQAQSQGIPLTP 1580
Cdd:pfam09606  217 AQM----GQQAQANGGM--NPQQMGGAPNQVAMQQQQPQ--QQGQQSQLGMGINQMQQMPQGVGGGAGQGGPGQPMGPPG 288
                          250       260       270       280       290       300       310
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1622844908 1581 QQAQELGIPLTPQQAQALGIPLTPQQMQAQGITLTPQQaqalgiPLTPQQLQAQGITLTPQQAQALGVPITPVN 1654
Cdd:pfam09606  289 QQPGAMPNVMSIGDQNNYQQQQTRQQQQQQGGNHPAAH------QQQMNQSVGQGGQVVALGGLNHLETWNPGN 356
PHA03378 PHA03378
EBNA-3B; Provisional
1487-1976 1.86e-06

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 53.53  E-value: 1.86e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1487 AQELGIPLTPQQAQAlgiPLTPQQA----QELGIPLTPQQAQALGIPLSLQQAQELGIPLTPQQAQALGIPLTPQQ---- 1558
Cdd:PHA03378   459 TQPLEGPTGPLSVQA---PLEPWQPlphpQVTPVILHQPPAQGVQAHGSMLDLLEKDDEDMEQRVMATLLPPSPPQprag 535
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1559 -------AQELGI----PLTPQQAQSQGIP---LTPQQAQELGIPLTPQQAqalgiplTPQQMQAQGITLTPQQAQALGI 1624
Cdd:PHA03378   536 rrapcvyTEDLDIesdePASTEPVHDQLLPapgLGPLQIQPLTSPTTSQLA-------SSAPSYAQTPWPVPHPSQTPEP 608
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1625 PLTPQQLQAqgiTLTPQQAQALGVPITPVNAWVSAVTLTPEQTQVLESPINLEQAQEQLSKLGVPLTLDKAHTLGsPLTL 1704
Cdd:PHA03378   609 PTTQSHIPE---TSAPRQWPMPLRPIPMRPLRMQPITFNVLVFPTPHQPPQVEITPYKPTWTQIGHIPYQPSPTG-ANTM 684
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1705 KEVQWSHKPFQKPKASlPTGQSiiSRLSPSLRLSLASSVPTPEKSSILPISRVPLNQGPFPPGKPLEMGILSEPGKLGAP 1784
Cdd:PHA03378   685 LPIQWAPGTMQPPPRA-PTPMR--PPAAPPGRAQRPAAATGRARPPAAAPGRARPPAAAPGRARPPAAAPGRARPPAAAP 761
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1785 QTLRSSGQTlvyGGQSTSVqfPAPQAPPTPGQLPKfgAPPTPGQPFELEAFSsrelfiTRASLTPPPPQMSNAPLAPRQR 1864
Cdd:PHA03378   762 GRARPPAAA---PGAPTPQ--PPPQAPPAPQQRPR--GAPTPQPPPQAGPTS------MQLMPRAAPGQQGPTKQILRQL 828
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1865 LIAGVPP--TSGQIPSLWAPLSPGQrLVPEASSIPGDLLESGPLTFSEQLQEFQ------PPATVEQSPYLQAPTsgQHL 1936
Cdd:PHA03378   829 LTGGVKRgrPSLKKPAALERQAAAG-PTPSPGSGTSDKIVQAPVFYPPVLQPIQvmrqlgSVRAAAASTVTQAPT--EYT 905
                          490       500       510       520
                   ....*....|....*....|....*....|....*....|
gi 1622844908 1937 APWTLPGLASSLWIPPTSRHPPTLWPSPAPgkPQKGWSPS 1976
Cdd:PHA03378   906 GERRGVGPMHPTDIPPSKRAKTDAYVESQP--PHGGQSHS 943
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
1485-1887 3.92e-06

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, which is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 52.46  E-value: 3.92e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1485 QQAQELGIPLTPQQAQALGIPLTPQqaqelgiPLTPQQAQALGIPLSLQQAQELGIPLTPQQAQALGIPLTPQQAQELGI 1564
Cdd:pfam03154  163 QQQILQTQPPVLQAQSGAASPPSPP-------PPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTLIQQTP 235
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1565 PLTPQQAQSQGIPLTPqqaqelgIPLTPQQAQALGIPLTPQQMQAQGITLtPQQAQAlGIPLTPQQLQAQGITLTPQQAQ 1644
Cdd:pfam03154  236 TLHPQRLPSPHPPLQP-------MTQPPPPSQVSPQPLPQPSLHGQMPPM-PHSLQT-GPSHMQHPVPPQPFPLTPQSSQ 306
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1645 AlGVPITPVNAwvsavtLTPEQTQVLESPINLEQAQEQLSKLGVPLTldkahtlGSPLTLKEVQwshKPFQKPKASLPTG 1724
Cdd:pfam03154  307 S-QVPPGPSPA------APGQSQQRIHTPPSQSQLQSQQPPREQPLP-------PAPLSMPHIK---PPPTTPIPQLPNP 369
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1725 QSII--SRLSPSLRLSLASSVPTPekSSILPISRVPLNQGPFPPGKPLEMGILSEPGKlgaPQTLRSSGQTlvyggQSTS 1802
Cdd:pfam03154  370 QSHKhpPHLSGPSPFQMNSNLPPP--PALKPLSSLSTHHPPSAHPPPLQLMPQSQQLP---PPPAQPPVLT-----QSQS 439
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1803 VQFPAPQAPPTPGQLPKFGAPPTPGQPFeleafssreLFITRASLTPP--PPQMSNAPLAPRQRLIAGVPPTSGQIP-SL 1879
Cdd:pfam03154  440 LPPPAASHPPTSGLHQVPSQSPFPQHPF---------VPGGPPPITPPsgPPTSTSSAMPGIQPPSSASVSSSGPVPaAV 510

                   ....*...
gi 1622844908 1880 WAPLSPGQ 1887
Cdd:pfam03154  511 SCPLPPVQ 518
PHA03379 PHA03379
EBNA-3A; Provisional
1184-1392 5.49e-06

EBNA-3A; Provisional


Pssm-ID: 223066 [Multi-domain]  Cd Length: 935  Bit Score: 51.98  E-value: 5.49e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1184 PQQAQELGIPLTPQQaQALGIPLTPQQAQELGIPLTPQQ--AQALGIP--------LTLQQAQELGIPLTPQQAQALGIP 1253
Cdd:PHA03379   605 SVEVQPPQLTQVSPQ-QPMEYPLEPEQQMFPGSPFSQVAdvMRAGGVPamqpqyfdLPLQQPISQGAPLAPLRASMGPVP 683
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1254 LTPQ-QAQELGIPLTPQQAQ-ALGIPLTPQQAQElgIPLTPQQAQALGITLTPQ------QAQALGIPLT-PQQAQELGI 1324
Cdd:PHA03379   684 PVPAtQPQYFDIPLTEPINQgASAAHFLPQQPME--GPLVPERWMFQGATLSQSvrpgvaQSQYFDLPLTqPINHGAPAA 761
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 1622844908 1325 PLTPQQAQELgiPLTPQQAQELGIPLTPqqaqelGIPLTPQQAQELGIPltPQQAQELGIPLTPQQAQ 1392
Cdd:PHA03379   762 HFLHQPPMEG--PWVPEQWMFQGAPPSQ------GTDVVQHQLDALGYV--LHVLNHPGVPVSPAVNQ 819
SMC_N pfam02463
RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The ...
298-1005 1.39e-05

RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The SMC (structural maintenance of chromosomes) superfamily proteins have ATP-binding domains at the N- and C-termini, and two extended coiled-coil domains separated by a hinge in the middle. The eukaryotic SMC proteins form two kinds of heterodimers: the SMC1/SMC3 and the SMC2/SMC4 types. These heterodimers constitute an essential part of higher order complexes, which are involved in chromatin and DNA dynamics. This family also includes the RecF and RecN proteins that are involved in DNA metabolism and recombination.


Pssm-ID: 426784 [Multi-domain]  Cd Length: 1161  Bit Score: 50.74  E-value: 1.39e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  298 EKELSLKVIQDLSNENEMLQQKLHDAEEKCEQliRSKIVTEQAYAILSTSSTLKVLPGPSPQSSRAIIKvgdiEDNMDNI 377
Cdd:pfam02463  238 RIDLLQELLRDEQEEIESSKQEIEKEEEKLAQ--VLKENKEEEKEKKLQEEELKLLAKEEEELKSELLK----LERRKVD 311
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  378 LDKDLENIVDEVQRKEtkdsgIKWESSISYIAQAERTPDLTELQQQPV--ASEDISEDSTKDNVSLKEGDVYQEDEIDEY 455
Cdd:pfam02463  312 DEEKLKESEKEKKKAE-----KELKKEKEEIEELEKELKELEIKREAEeeEEEELEKLQEKLEQLEEELLAKKKLESERL 386
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  456 QSWKRKHTKGThvsetsgpNLSDNKGGQRVSEAKLSQYYELqALKKKRKEMKSFPEDKSKSPTEAKRKHLFLTETKsqgg 535
Cdd:pfam02463  387 SSAAKLKEEEL--------ELKSEEEKEAQLLLELARQLED-LLKEEKKEELEILEEEEESIELKQGKLTEEKEEL---- 453
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  536 KSGTSMMLEQFRMVKRESpfDKRPTAAEFKVEPTIESLDKEGEGEISSLVEplnmiqfddtaepQKGKIKGKKHRISSGT 615
Cdd:pfam02463  454 EKQELKLLKDELELKKSE--DLLKETQLVKLQEQLELLLSRQKLEERSQKE-------------SKARSGLKVLLALIKD 518
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  616 TTSKEETTEEKEVLTKQVkshrlVKSLSRVAKETSESTRVLESPDGESEQSNL---EEFQKAIMAFLKQKIDNTGKPfdK 692
Cdd:pfam02463  519 GVGGRIISAHGRLGDLGV-----AVENYKVAISTAVIVEVSATADEVEERQKLvraLTELPLGARKLRLLIPKLKLP--L 591
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  693 KTVPKEEALLKRTEAEKLGIIKAKMEEyfQKVAETVTKILRKYKEIKKEERVGEKPIKQKKVVSFMPGLhFQKSPISAKS 772
Cdd:pfam02463  592 KSIAVLEIDPILNLAQLDKATLEADED--DKRAKVVEGILKDTELTKLKESAKAKESGLRKGVSLEEGL-AEKSEVKASL 668
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  773 ESSTFLSHESTDPVINNLMQMILAEIESERdiptvsavQKDHKETEKQRREQYSQEGQEQMSGMSLKQQFLEERNLLKER 852
Cdd:pfam02463  669 SELTKELLEIQELQEKAESELAKEEILRRQ--------LEIKKKEQREKEELKKLKLEAEELLADRVQEAQDKINEELKL 740
                          570       580       590       600       610       620       630       640
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  853 YEKISENWEEKKAQLQMKEGKQEQQKQKQWQKEEMWKKEQKQTTPKQAEREEKQKQRGQEEEELSKSSLQRLEEGTRKmK 932
Cdd:pfam02463  741 LKQKIDEEEEEEEKSRLKKEEKEEEKSELSLKEKELAEEREKTEKLKVEEEKEEKLKAQEEELRALEEELKEEAELLE-E 819
                          650       660       670       680       690       700       710
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 1622844908  933 AQGLLLEKENGQMRQIEKEVKHLGPNMRREKGKEKQKpERGLEDLRRQIKTKEQMQMKETQPKELEKLVTQTP 1005
Cdd:pfam02463  820 EQLLIEQEEKIKEEELEELALELKEEQKLEKLAEEEL-ERLEEEITKEELLQELLLKEEELEEQKLKDELESK 891
GGN pfam15685
Gametogenetin; GGN is a family of proteins largely found in mammals. It reacts with POG in the ...
1499-1976 1.41e-05

Gametogenetin; GGN is a family of proteins largely found in mammals. It reacts with POG in the maturation of sperm and is expressed virtually only in the testis. It is found to be associated with the intracellular membrane, binds with GGNBP1 and may be involved in vesicular trafficking.


Pssm-ID: 434857 [Multi-domain]  Cd Length: 668  Bit Score: 50.54  E-value: 1.41e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1499 AQALGIPLtPQQAQELGIPLTPQ-QAQALGIPLSLqqaqELGIPLTPqqaqalgiplTPQQAQELGIPLTPQQAQSQGIP 1577
Cdd:pfam15685   45 AQGLGVWF-PGSSAPPGLLVPPEpQASPSPLPLTL----ELPLPVTP----------PPEEAAAAAVSTAPPPAVGSLLP 109
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1578 lTPQQAQElgiPLTPQQAQALGIPLTPQQMQAQGITLTPQqaqalgIPLTPQQLQAQG-----------ITLTPQQAQal 1646
Cdd:pfam15685  110 -APSKWRK---PTGTAVARIRGLLEASHRGQGDPLSLRPL------LPLLPRQLIEKDpapgapappppTPLEPRKPP-- 177
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1647 gvPITPVNAWVSAVTLTPEQTQVLESPInleQAQEQLSKLGvplTLDKAHTLGSPLTLKEVQWSHKPFQKPKASLPTGQS 1726
Cdd:pfam15685  178 --PLPPSDRQPPNRGITPALATSATSPT---DSQAKHIAEG---KTAGGACGGAPPQAGEGEMARFAASESGLSLLCKVT 249
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1727 IISR--LSPSlrlslASSVPTPEKSSI--------------LPISRVpLNQGPFPPGKPLEMGILSEpgklGAPQTLRSS 1790
Cdd:pfam15685  250 FKSAapLCPA-----AASGPLAAKASLggggggglfaasgaISCAEV-LKQGPLAPGAARPLGEVPR----AALETEGGE 319
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1791 GQTLVYGG-----QSTSVQFPAPQAPPTPGQLPKF---GAPPTPGQPFELE-------AFSSRELFITRASLTPPPPQMS 1855
Cdd:pfam15685  320 GDGEGCSGgpaapASHARALPPPAYTTFPGSKPKFdwvSPPDGPERHFRFNgagggigAPRRRAAALSGPWGSPPPPPGK 399
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1856 NAPL-APRQRLIAGVPPtsgqiPSLWAPLSPGQRLVPEASSIPGDLLESGPLTFSEQLQEFQPPATvEQSPYLQaPTSGQ 1934
Cdd:pfam15685  400 AHPIpGPRRPAPALLAP-----PMFIFPAPTNGEPVRPGPPAPQALLPRPPPPTPPATPPPVPPPI-PQLPALQ-PMPLA 472
                          490       500       510       520
                   ....*....|....*....|....*....|....*....|....*.
gi 1622844908 1935 HLAPWTL---PGLASSLWIP-PTSRHPPTLWPSPAPGkPQKGWSPS 1976
Cdd:pfam15685  473 AARPPTPrpcPGHGESALAPaPTAPLPPALAADQAPA-PALAAAPA 517
AvrBs3 NF041308
type III secretion system effector avirulence protein AvrBs3;
1146-1633 1.82e-05

type III secretion system effector avirulence protein AvrBs3;


Pssm-ID: 469205 [Multi-domain]  Cd Length: 1179  Bit Score: 50.34  E-value: 1.82e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1146 LTPQQAQALGITLTPQQAQELGIPLTPQQAQALGitLTPQQAQELGIPLTPQQAQALGIPLTPQQAQeLGIPLTPQQAQA 1225
Cdd:NF041308   431 LTPDQVVAIASNDGGKQALETVQRLLPELCQAHG--LTPDQVVAIASNGGGKQALETVQRLLPVLCQ-PPHGLTPEQVVA 507
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1226 LGIPLTLQQAQELGIPLTPQQAQAlgipltPQQaqelgipLTPQQAQALGIPLTPQQAQELGIPLTPQQAQAlGITLTPQ 1305
Cdd:NF041308   508 IASNGGGKQALETVQRLLPVLCQP------PHG-------LTPEQVVAIASHDGGKQALETVHRLLPVLCQA-PHGLTPE 573
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1306 QAQALGIPLTPQQAQELGIPLTPQQAQElGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQElGIPLTPQQAQELGIP 1385
Cdd:NF041308   574 QVVAIASHNGGKQALETVQRLLPVLCQR-PYGLTPNQVVAIASNDGGKQALETVQRLLPVLCQA-PHGLTPDQVVAIASN 651
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1386 LTPQQAQELGIPLTPQQAQElGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQElGIPLTPQQAQELGVTLTPQQAQE 1465
Cdd:NF041308   652 GGGKQALETVQRLLPVLCQR-PHGLTPHQVVAIASNDGGKQALETVQRLLPVLCQP-PYGLTPEQVVAIASNNGGKQALE 729
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1466 LGIPLTPQQAQElGIPLTPQQAQELGIPLTPQQAQALGIPLTPQQAQELgIPLTPQQAQALGIPLSLQQAQELGIPLTPQ 1545
Cdd:NF041308   730 TVQRLLPVLCQR-PHGLTPDQVVAIASNDGGKQALETVQRLLPVLCQPP-HGLTPDQVVAIASNDGGKQALETVQRLLPV 807
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1546 QAQALGiPLTPQQAQELGIPLTPQQAQSQGIPLTPQQAQELGipLTPQQAQALGIPLTPQQMQAQGITLTPQQAQAlGIP 1625
Cdd:NF041308   808 LCDAPH-GLTPHQVVAIASNIGGRQALETVQRLLPVLCQAHG--LTPDQVVAIASNNGGKQALETVQRLLPVLCQP-PHG 883

                   ....*...
gi 1622844908 1626 LTPQQLQA 1633
Cdd:NF041308   884 LTPHQVVA 891
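
Each hit in this listing is introduced by a summary line with the fields `Pssm-ID`, `Cd Length`, `Bit Score`, and `E-value`. A minimal Python sketch of extracting those fields with a regular expression follows; it assumes the field order and labels seen on this page, and `SUMMARY` and `parse_summary` are hypothetical names chosen for illustration.

```python
import re

# Assumed layout, copied from the summary lines on this page:
# "Pssm-ID: <int> [<kind>]  Cd Length: <int>  Bit Score: <float>  E-value: <float>"
SUMMARY = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*\[(?P<kind>[^\]]+)\]\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>\d+(?:\.\d+)?)\s*"
    r"E-value:\s*(?P<e_value>[0-9.]+(?:[eE][+-]?\d+)?)"
)


def parse_summary(line):
    """Return the summary fields as a dict, or None if the line does not match."""
    m = SUMMARY.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "kind": m.group("kind"),
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


if __name__ == "__main__":
    line = "Pssm-ID: 469205 [Multi-domain]  Cd Length: 1179  Bit Score: 50.34  E-value: 1.82e-05"
    print(parse_summary(line))
```

Collecting these dictionaries for every hit makes it straightforward to sort or filter the list, for example by E-value.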
SP2_N cd22540
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins ...
1089-1472 4.72e-05

N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate or, in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.


Pssm-ID: 411776 [Multi-domain]  Cd Length: 511  Bit Score: 48.38  E-value: 4.72e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1089 QQAQAQGIMLTIQQAQelGIPLTlQQAQALEIPLTPQQAQA-LGIPLTPQQAQELGIPLTPQQAQALGITLTPQQAQELG 1167
Cdd:cd22540    130 PQIQAAGQINNSGQIQ--IIPGT-NQAIITPVQVLQQPQQAhKPVPIKPAPLQTSNTNSASLQVPGNVIKLQSGGNVALT 206
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1168 IPLTPQQAQALGITLTPQQAQELGIP-LTPQQAQALGIPLTPQQAQE--LGIPLTPQQAQALGIPLTLQQAQELGIPLTP 1244
Cdd:cd22540    207 LPVNNLVGTQDGATQLQLAAAPSKPSkKIRKKSAQAAQPAVTVAEQVetVLIETTADNIIQAGNNLLIVQSPGTGQPAVL 286
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1245 QQAQALgipltpQQAQELGIPLTPQQAqalgipLTPQQAQELGIPLTPQQaqalgitlTPQQAQALGIPLTPQQAQeLGI 1324
Cdd:cd22540    287 QQVQVL------QPKQEQQVVQIPQQA------LRVVQAASATLPTVPQK--------PLQNIQIQNSEPTPTQVY-IKT 345
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1325 PLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELG--IPLTPQQAQELGiPLTPQQAQELGIplTPQQAQELGIplTPQQ 1402
Cdd:cd22540    346 PSGEVQTVLLQEAPAATATPSSSTSTVQQQVTANNgtGTSKPNYNVRKE-RTLPKIAPAGGI--ISLNAAQLAA--AAQA 420
                          330       340       350       360       370       380       390
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1403 AQELGIPLTPQQAQELGIPLTPQQAQeLGIPLTPQQAQELGiPLTPQQAQElgvtltpQQAQELGIPLTP 1472
Cdd:cd22540    421 IQTININGVQVQGVPVTITNAGGQQQ-LTVQTVSSNNLTIS-GLSPTQIQL-------QMEQALEIETQP 481
SP2_N cd22540
N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins ...
1305-1672 6.36e-05

N-terminal domain of transcription factor Specificity Protein (SP) 2; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. SP2 contains the least conserved DNA-binding domain within the SP subfamily of proteins, and its DNA sequence specificity differs from the other SP proteins. It localizes primarily within subnuclear foci associated with the nuclear matrix, and can activate or, in some cases, repress expression from different promoters. The transcription factor SP2 serves as a paradigm for indirect genomic binding. It does not require its DNA-binding domain for genomic DNA binding and occupies target promoters independently of whether they contain a cognate DNA-binding motif. SP2 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP2.


Pssm-ID: 411776 [Multi-domain]  Cd Length: 511  Bit Score: 48.00  E-value: 6.36e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1305 QQAQALGIPltPQQAQELGIPLTPQ-QAQELGIPLTPQQAQELgIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELG 1383
Cdd:cd22540    130 PQIQAAGQI--NNSGQIQIIPGTNQaIITPVQVLQQPQQAHKP-VPIKPAPLQTSNTNSASLQVPGNVIKLQSGGNVALT 206
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1384 IPLTPQQAQELGipltPQQAQELGIPLTPQQAqelgiplTPQQAQELGIPLTPQQAQE--LGIPLTPQQAQELGVTLTPQ 1461
Cdd:cd22540    207 LPVNNLVGTQDG----ATQLQLAAAPSKPSKK-------IRKKSAQAAQPAVTVAEQVetVLIETTADNIIQAGNNLLIV 275
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1462 QAQELGIPLTPQQAQELgipltpQQAQELGIPLTPQQAqalgipLTPQQAQELGIPLTPQQAQAlgiPLSLQQAQELGIP 1541
Cdd:cd22540    276 QSPGTGQPAVLQQVQVL------QPKQEQQVVQIPQQA------LRVVQAASATLPTVPQKPLQ---NIQIQNSEPTPTQ 340
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1542 L---TPQ---QAQAL--GIPLTPQQAQELGIPLTPQQAQSQGIPLTPQQAQELGiPLTPQQAQALG-IPLTPQQMQAqgi 1612
Cdd:cd22540    341 VyikTPSgevQTVLLqeAPAATATPSSSTSTVQQQVTANNGTGTSKPNYNVRKE-RTLPKIAPAGGiISLNAAQLAA--- 416
                          330       340       350       360       370       380
                   ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1622844908 1613 tlTPQQAQALGIPLTpqQLQAQGITLT--PQQAQALGVPITPVNAWVSAVTLTPEQTQVLES 1672
Cdd:cd22540    417 --AAQAIQTININGV--QVQGVPVTITnaGGQQQLTVQTVSSNNLTISGLSPTQIQLQMEQA 474
PHA03378 PHA03378
EBNA-3B; Provisional
1161-1645 6.75e-05

EBNA-3B; Provisional


Pssm-ID: 223065 [Multi-domain]  Cd Length: 991  Bit Score: 48.52  E-value: 6.75e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1161 QQAQELGIPLTPQQAQALGITLTPQQAQELGIPLTPQQAQAlgiPLTPQQA----QELGIPLTPQQAQALGIPLTLQQAQ 1236
Cdd:PHA03378   433 KKAARTEQPRATPHSQAPTVVLHRPPTQPLEGPTGPLSVQA---PLEPWQPlphpQVTPVILHQPPAQGVQAHGSMLDLL 509
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1237 ELGIPLTPQQAQALGIPLTPQQ-----------AQELGI------PLTPQQAQALGIP-LTPQQAQELGIPLTPQQAqal 1298
Cdd:PHA03378   510 EKDDEDMEQRVMATLLPPSPPQpragrrapcvyTEDLDIesdepaSTEPVHDQLLPAPgLGPLQIQPLTSPTTSQLA--- 586
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1299 giTLTPQQAQALGipLTPQQAQELGIPLTPQQAQELGIPltpqqaQELGIPLTPqqaqelgIPLTPQQAQEL--GIPLTP 1376
Cdd:PHA03378   587 --SSAPSYAQTPW--PVPHPSQTPEPPTTQSHIPETSAP------RQWPMPLRP-------IPMRPLRMQPItfNVLVFP 649
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1377 QQAQELGIPLTPQQAQELGIPLTPQQaqelgiPLTPQQAQELGIPLTPQQAQElgIPLTPQQAQELGIPLTPQQAQELGV 1456
Cdd:PHA03378   650 TPHQPPQVEITPYKPTWTQIGHIPYQ------PSPTGANTMLPIQWAPGTMQP--PPRAPTPMRPPAAPPGRAQRPAAAT 721
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1457 TLTPQQAqelGIPLTPQQAQELGIPLTPQQaqelGIPLTPQQAQALGIPLTPQQAqelgipltpqqaqALGIPLSLQQAQ 1536
Cdd:PHA03378   722 GRARPPA---AAPGRARPPAAAPGRARPPA----AAPGRARPPAAAPGRARPPAA-------------APGAPTPQPPPQ 781
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1537 elGIPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQSQGIPlTPQQAQELGIPLTPQQAQALGIPLTPQQMQAQGITLTP 1616
Cdd:PHA03378   782 --APPAPQQRPRGAPTPQPPPQAGPTSMQLMPRAAPGQQGP-TKQILRQLLTGGVKRGRPSLKKPAALERQAAAGPTPSP 858
                          490       500       510
                   ....*....|....*....|....*....|....*...
gi 1622844908 1617 Q--------QAQALGIP-LTPQQLQAQGITLTPQQAQA 1645
Cdd:PHA03378   859 GsgtsdkivQAPVFYPPvLQPIQVMRQLGSVRAAAAST 896
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
1753-1993 1.17e-04

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 47.86  E-value: 1.17e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1753 PISRVPLNQGPFPPGKPlEMGILSEPGKLGAPQTLRSSGQTLVYGGQSTSVQFPAPQAPPTPGQLPKFGAPPT--PGQPF 1830
Cdd:PHA03307    49 ELAAVTVVAGAAACDRF-EPPTGPPPGPGTEAPANESRSTPTWSLSTLAPASPAREGSPTPPGPSSPDPPPPTppPASPP 127
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1831 ELEAFSSRELFITRASLTPPPPQMSNAPLAPRQRLIAGvPPTSGQI--------PSLWAPLSPGQRLVPEASSIPGD--- 1899
Cdd:PHA03307   128 PSPAPDLSEMLRPVGSPGPPPAASPPAAGASPAAVASD-AASSRQAalplsspeETARAPSSPPAEPPPSTPPAAASprp 206
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1900 ----------LLESGPLtfSEQLQEFQPPATVEQSPYLQAPTSGQ---------HLAPWTLPG--LASSLWIPPTSRHPP 1958
Cdd:PHA03307   207 prrsspisasASSPAPA--PGRSAADDAGASSSDSSSSESSGCGWgpenecplpRPAPITLPTriWEASGWNGPSSRPGP 284
                          250       260       270       280
                   ....*....|....*....|....*....|....*....|...
gi 1622844908 1959 T--------LWPSPAPGKPQKGWSPSVAKKRSAIISSLTSKSA 1993
Cdd:PHA03307   285 AssssspreRSPSPSPSSPGSGPAPSSPRASSSSSSSRESSSS 327
SMC_prok_A TIGR02169
chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of ...
272-611 1.93e-04

chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. SMC is found in a single copy and is homodimeric in prokaryotes, but six paralogs (excluded from this family) are found in eukaryotes, where SMC proteins are heterodimeric. This family represents the SMC protein of archaea and a few bacteria (Aquifex, Synechocystis, etc.); the SMC of other bacteria is described by TIGR02168. The N- and C-terminal domains of this protein are well conserved, but the central hinge region is skewed in composition and highly divergent. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


Pssm-ID: 274009 [Multi-domain]  Cd Length: 1164  Bit Score: 46.98  E-value: 1.93e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  272 LSMLNDELKGVNFQSSTvCVQETSEAEKEL---------SLKVIQDLSNENEMLQQKLHDAEEKCEQLIRSKIVTEQayA 342
Cdd:TIGR02169  676 LQRLRERLEGLKRELSS-LQSELRRIENRLdelsqelsdASRKIGEIEKEIEQLEQEEEKLKERLEELEEDLSSLEQ--E 752
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  343 ILSTSSTLKVLPGPSPQSSRAIIKvgdIEDNMDNILDKDLENIVDEVQRK--ETKDSGIKWESSISYIAQaertpDLTEL 420
Cdd:TIGR02169  753 IENVKSELKELEARIEELEEDLHK---LEEALNDLEARLSHSRIPEIQAElsKLEEEVSRIEARLREIEQ-----KLNRL 824
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  421 QQQPVASEDISEDSTKDNVSLKEGDVYQEDEIDEYQSWKRKhtKGTHVSETsgpnlsdnkggqrvsEAKLSQYY-ELQAL 499
Cdd:TIGR02169  825 TLEKEYLEKEIQELQEQRIDLKEQIKSIEKEIENLNGKKEE--LEEELEEL---------------EAALRDLEsRLGDL 887
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  500 KKKRKEMKS------FPEDKSKSPTEAKRKHLFLTETKSQGGKSGTSMMLEQFRMVKRESPfdkrPTAAEFKVEPTIESL 573
Cdd:TIGR02169  888 KKERDELEAqlreleRKIEELEAQIEKKRKRLSELKAKLEALEEELSEIEDPKGEDEEIPE----EELSLEDVQAELQRV 963
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|
gi 1622844908  574 dkegEGEISSLvEPLNMIQFDDTAEPQK--GKIKGKKHRI 611
Cdd:TIGR02169  964 ----EEEIRAL-EPVNMLAIQEYEEVLKrlDELKEKRAKL 998
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
1783-2006 1.96e-04

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 47.09  E-value: 1.96e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1783 APQTLRSSGQTLVYG--GQSTSVQFPAPQAPPTPGQLPKFGAPPTPGQPFELEAFSSRELFITRASLTPPPPQMsnAPLA 1860
Cdd:PHA03307    24 PPATPGDAADDLLSGsqGQLVSDSAELAAVTVVAGAAACDRFEPPTGPPPGPGTEAPANESRSTPTWSLSTLAP--ASPA 101
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1861 PRQRLIAGVPPTSGQIPSLWAPLSPGQRLVPEASSIPGDLLESGPltfseqlqefqPPATVEQSPYLQAPTSGQHLAPWT 1940
Cdd:PHA03307   102 REGSPTPPGPSSPDPPPPTPPPASPPPSPAPDLSEMLRPVGSPGP-----------PPAASPPAAGASPAAVASDAASSR 170
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1622844908 1941 LPGLASSlwIPPTSRHPPTLWPSPAPGKPQKGWSPSVAKKRSAIISSLTSKSALIHPRAPAFKVAQ 2006
Cdd:PHA03307   171 QAALPLS--SPEETARAPSSPPAEPPPSTPPAAASPRPPRRSSPISASASSPAPAPGRSAADDAGA 234
KREPA2 cd23959
Kinetoplastid RNA Editing Protein A2 (KREPA2); The KREPA2 (TbMP63) protein is a component of ...
1778-1931 3.59e-04

Kinetoplastid RNA Editing Protein A2 (KREPA2); The KREPA2 (TbMP63) protein is a component of the parasitic protozoan's KREPA RNA editing catalytic complex (RECC). Kinetoplastid RNA editing (KRE) proteins occur as pairs or sets of related proteins in multiple complexes. The KREPA complex is composed of six components (KREPA1-6), which share a conserved C-terminal region containing an oligonucleotide-binding (OB)-fold-like domain. KREPAs are responsible for the site-specific insertion and deletion of U nucleotides in the kinetoplastid mitochondrial pre-messenger RNA. Apart from the conserved C-terminal OB-fold domain, KREPA1, KREPA2, and KREPA3 contain two conserved C2H2 zinc-finger domains. KREPA2 and kinetoplastid RNA editing ligase 1 (KREL1) are specific for ligation post-U-deletion and are paralogous to KREL2 and KREPA1, which are specific for ligation post-U-insertion. KREPA2 is critical for RECC stability and KREL1 integration into the complex.


Pssm-ID: 467780 [Multi-domain]  Cd Length: 424  Bit Score: 45.63  E-value: 3.59e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1778 PGKLGAPQTLRSSGQTLVYGGQSTSVQFPAPQAPPTP--GQLPKFGAPPTPGQPFELEAfssrelfitRASLTPPPPQMS 1855
Cdd:cd23959    112 AARVPNPFSASSSTQRETHKTAQVAPPKAEPQTAPVTpfGQLPMFGQHPPPAKPLPAAA---------AAQQSSASPGEV 182
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1622844908 1856 NAPLAPrqrliaGVPPTSgqipslwaPLSPGQRLVPEASSIPGDLLE-SGPLTFSEQLQEFQPPATVEQSPYLQAPT 1931
Cdd:cd23959    183 ASPFAS------GTVSAS--------PFATATDTAPSSGAPDGFPAEaSAPSPFAAPASAASFPAAPVANGEAATPT 245
PTZ00121 PTZ00121
MAEBL; Provisional
645-1000 6.46e-04

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 45.52  E-value: 6.46e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  645 VAKETSESTRVLESPDGESEQSNLEEFQKAIMAFLK----QKIDNTGKPFDKKTVpkEEAllKRTEAEKLGIIKAKMEEy 720
Cdd:PTZ00121  1088 RADEATEEAFGKAEEAKKTETGKAEEARKAEEAKKKaedaRKAEEARKAEDARKA--EEA--RKAEDAKRVEIARKAED- 1162
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  721 fQKVAEtvtkILRKYKEIKKEERvGEKPIKQKKVVSFMPGLHFQKSPISAKSESSTFLSHestdpvinnlmqmiLAEIES 800
Cdd:PTZ00121  1163 -ARKAE----EARKAEDAKKAEA-ARKAEEVRKAEELRKAEDARKAEAARKAEEERKAEE--------------ARKAED 1222
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  801 ERDIPTVSAVQKDHKETEKQRREQYSQEGQEQMSGMSLKQQFLEERNLLKERYEKISENwEEKKAQLQMKEGKQEQQKQK 880
Cdd:PTZ00121  1223 AKKAEAVKKAEEAKKDAEEAKKAEEERNNEEIRKFEEARMAHFARRQAAIKAEEARKAD-ELKKAEEKKKADEAKKAEEK 1301
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  881 QWQKEEMWKKEQKQTTPKQAEREEKQKQRGQE-----EEELSKSSLQRLEEGTRKMKAQGLLLEKENGQMRQIEKEVKhl 955
Cdd:PTZ00121  1302 KKADEAKKKAEEAKKADEAKKKAEEAKKKADAakkkaEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEAKKK-- 1379
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|....*
gi 1622844908  956 gPNMRREKGKEKQKPERGLEDLRRQIKTKEQMQMKETQPKELEKL 1000
Cdd:PTZ00121  1380 -ADAAKKKAEEKKKADEAKKKAEEDKKKADELKKAAAAKKKADEA 1423
SMC_N pfam02463
RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The ...
815-1000 9.13e-04

RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The SMC (structural maintenance of chromosomes) superfamily proteins have ATP-binding domains at the N- and C-termini, and two extended coiled-coil domains separated by a hinge in the middle. The eukaryotic SMC proteins form two kinds of heterodimers: the SMC1/SMC3 and the SMC2/SMC4 types. These heterodimers constitute an essential part of higher order complexes, which are involved in chromatin and DNA dynamics. This family also includes the RecF and RecN proteins that are involved in DNA metabolism and recombination.


Pssm-ID: 426784 [Multi-domain]  Cd Length: 1161  Bit Score: 44.58  E-value: 9.13e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  815 KETEKQRREQYSQEGQEQMSGMSLKQQFLEERNLLKERYEKISENWEEKKAQLQMKEGKQEQQKQKQWqkeemwKKEQKQ 894
Cdd:pfam02463  193 EELKLQELKLKEQAKKALEYYQLKEKLELEEEYLLYLDYLKLNEERIDLLQELLRDEQEEIESSKQEI------EKEEEK 266
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  895 TTPKQAEREEKQKQRGQEEEELSKSSLQRLEEGTRKMKAQG---LLLEKENGQMRQIEKEVKHLgpNMRREKGKEKQKpE 971
Cdd:pfam02463  267 LAQVLKENKEEEKEKKLQEEELKLLAKEEEELKSELLKLERrkvDDEEKLKESEKEKKKAEKEL--KKEKEEIEELEK-E 343
                          170       180
                   ....*....|....*....|....*....
gi 1622844908  972 RGLEDLRRQIKTKEQMQMKETQPKELEKL 1000
Cdd:pfam02463  344 LKELEIKREAEEEEEEELEKLQEKLEQLE 372
PHA03247 PHA03247
large tegument protein UL36; Provisional
1191-1445 9.22e-04

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 44.93  E-value: 9.22e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1191 GIPLTPQQAQALGIPLTPQQAqelgiPLTPQQAQALGIPLTLQQAQELGIPLTPQQAQALGIPLTPQQAQELGIPLTPQQ 1270
Cdd:PHA03247  2718 ATPLPPGPAAARQASPALPAA-----PAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPPRRLTRPAVASLS 2792
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1271 AQALGIPLTPQQAQelgiplTPQQAQALGITLTPQQAQALGIPlTPQQAQELGIPLTPQQAQElgiPLTPQQAQELGIPL 1350
Cdd:PHA03247  2793 ESRESLPSPWDPAD------PPAAVLAPAAALPPAASPAGPLP-PPTSAQPTAPPPPPGPPPP---SLPLGGSVAPGGDV 2862
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1351 T--PQQAQELGIPLTPQQ--AQELGIPLTPQQAQELGIPLTPQQAqelgiPLTPQQAQElgiPLTPQQAQELGIPLTPQQ 1426
Cdd:PHA03247  2863 RrrPPSRSPAAKPAAPARppVRRLARPAVSRSTESFALPPDQPER-----PPQPQAPPP---PQPQPQPPPPPQPQPPPP 2934
                          250
                   ....*....|....*....
gi 1622844908 1427 AQELGIPLTPQQAQELGIP 1445
Cdd:PHA03247  2935 PPPRPQPPLAPTTDPAGAG 2953
SP4_N cd22536
N-terminal domain of transcription factor Specificity Protein (SP) 4; Specificity Proteins ...
1484-1692 9.71e-04

N-terminal domain of transcription factor Specificity Protein (SP) 4; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. Human SP4 is a risk gene of multiple psychiatric disorders including schizophrenia, bipolar disorder, and major depression. SP4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. SP factors may be separated into three groups based on their domain architecture and the similarity of their N-terminal transactivation domains: SP1-4, SP5, and SP6-9. The transactivation domains between the three groups are not homologous to one another. SP1-4 have similar N-terminal transactivation domains characterized by glutamine-rich regions, which, in most cases, have adjacent serine/threonine-rich regions. This model represents the N-terminal domain of SP4.


Pssm-ID: 411773 [Multi-domain]  Cd Length: 623  Bit Score: 44.52  E-value: 9.71e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1484 PQQAQELGIPLTPQQAQALGIPLTPQQAQELgipltpQQAQALGIPLSLQQaqelgIPLTPQQAQalgIPLTPQQAQELG 1563
Cdd:cd22536    336 PQTSAAESEAQSSSQLQSNGLQNVQDQSNSL------QQVQIVGQPILQQI-----QIQQPQQQI---IQAIQPQSFQLQ 401
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1564 IPLTPQQAQSQgipltPQQAQELGIPLTPQQAQALGIPLTP------QQMQAQGI-TLTPQQAQALGIP----LTPQQLQ 1632
Cdd:cd22536    402 SGQTIQTIQQQ-----PLQNVQLQAVQSPTQVLIRAPTLTPsgqiswQTVQVQNIqSLSNLQVQNAGLPqqltLTPVSSS 476
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1633 AQGITLTpqQAQALGVPITPVNAWVSAVTLTPEQTQVleSPINLEQAQEQLSklGVPLTL 1692
Cdd:cd22536    477 AGGTTIA--QIAPVAVAGTPITLNAAQLASVPNLQTV--NVANLGAAGVQVQ--GVPVTI 530
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
1377-1858 1.26e-03

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 44.37  E-value: 1.26e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1377 QQAQELGIPLTPQQAQELGIPLTPqqaqelGIPLTPQQAQELGIPLTPQQAQElGIPLTPQQAQELGIPLTPQQAQELGV 1456
Cdd:pfam03154  163 QQQILQTQPPVLQAQSGAASPPSP------PPPGTTQAATAGPTPSAPSVPPQ-GSPATSQPPNQTQSTAAPHTLIQQTP 235
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1457 TLTPQQaqeLGIPLTPQQaqelgiPLTPqqaqelgiPLTPQQaqalgIPLTPQQAQELGIPLTPqqaqalgIPLSLQQaq 1536
Cdd:pfam03154  236 TLHPQR---LPSPHPPLQ------PMTQ--------PPPPSQ-----VSPQPLPQPSLHGQMPP-------MPHSLQT-- 284
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1537 elGIPLTPQqaqalgiPLTPQqaqelGIPLTPQQAQSQGiPLTPQqaqelgiPLTPQQAQALgIPLTPQQMQAQgiTLTP 1616
Cdd:pfam03154  285 --GPSHMQH-------PVPPQ-----PFPLTPQSSQSQV-PPGPS-------PAAPGQSQQR-IHTPPSQSQLQ--SQQP 339
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1617 QQAQalgiPLTPQQLQAQGITLTPQqaqalgvpiTPVnawvsavtltpeqtqvleSPINLEQAQEQLSKLGVPLTLDKAH 1696
Cdd:pfam03154  340 PREQ----PLPPAPLSMPHIKPPPT---------TPI------------------PQLPNPQSHKHPPHLSGPSPFQMNS 388
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1697 TLGSPLTLKEVQW--SHKP---FQKPKASLPTGQSI--------ISRLSPSLRLSlASSVPTPEKSSILPiSRVPLNQGP 1763
Cdd:pfam03154  389 NLPPPPALKPLSSlsTHHPpsaHPPPLQLMPQSQQLppppaqppVLTQSQSLPPP-AASHPPTSGLHQVP-SQSPFPQHP 466
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1764 FPPGKPlemgilsepgklgaPQTLRSSGQTLVYGGQSTSVQFPAPQAPPTPGQLPKFGAPPTPGQPFELEAFSSRElfit 1843
Cdd:pfam03154  467 FVPGGP--------------PPITPPSGPPTSTSSAMPGIQPPSSASVSSSGPVPAAVSCPLPPVQIKEEALDEAE---- 528
                          490
                   ....*....|....*
gi 1622844908 1844 rASLTPPPPQMSNAP 1858
Cdd:pfam03154  529 -EPESPPPPPRSPSP 542
PHA03377 PHA03377
EBNA-3C; Provisional
1782-2000 1.42e-03

EBNA-3C; Provisional


Pssm-ID: 177614 [Multi-domain]  Cd Length: 1000  Bit Score: 43.89  E-value: 1.42e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1782 GAPQTLRSSGQTLVYGGQSTS----VQFPAPQAPPTPGQLPKFGAPPTPGQPFeleafSSRElfitraslTPPPPQmsna 1857
Cdd:PHA03377   696 GRAQPSEESHLSSMSPTQPISheeqPRYEDPDDPLDLSLHPDQAPPPSHQAPY-----SGHE--------EPQAQQ---- 758
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1858 plAPRQRLIAGVPPtsgQIPSLWAPLSPGQRLvpEASSIPGDlleSGPLTFSEQLQEFQPPatveQSPYLQAPTSGQHLA 1937
Cdd:PHA03377   759 --APYPGYWEPRPP---QAPYLGYQEPQAQGV--QVSSYPGY---AGPWGLRAQHPRYRHS----WAYWSQYPGHGHPQG 824
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1622844908 1938 PWTlpglasslwipPTSRHPPTLW-PSPAPGKPQKGWSPSVAKKRSAIISSLTSKSALIHPRAP 2000
Cdd:PHA03377   825 PWA-----------PRPPHLPPQWdGSAGHGQDQVSQFPHLQSETGPPRLQLSQVPQLPYSQTL 877
SMC_prok_B TIGR02168
chromosome segregation protein SMC, common bacterial type; SMC (structural maintenance of ...
697-1003 1.95e-03

chromosome segregation protein SMC, common bacterial type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. This family represents the SMC protein of most bacteria. The smc gene is often associated with scpB (TIGR00281) and scpA genes, where scp stands for segregation and condensation protein. SMC was shown (in Caulobacter crescentus) to be induced early in S phase but present and bound to DNA throughout the cell cycle. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


Pssm-ID: 274008 [Multi-domain]  Cd Length: 1179  Bit Score: 43.51  E-value: 1.95e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  697 KEEALLK--RTEA--EKLGIIKAKME---EYFQKVAETVTKilrkYKEIKKEERVGEKPIKQKKVVSFMPGLHFQKSpiS 769
Cdd:TIGR02168  174 RKETERKleRTREnlDRLEDILNELErqlKSLERQAEKAER----YKELKAELRELELALLVLRLEELREELEELQE--E 247
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  770 AKSESSTFLSHEStdpvinnLMQMILAEIESERDiptvsAVQKDHKETEKQRREQYSQegQEQMSGMSLKQQFLEERnll 849
Cdd:TIGR02168  248 LKEAEEELEELTA-------ELQELEEKLEELRL-----EVSELEEEIEELQKELYAL--ANEISRLEQQKQILRER--- 310
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  850 keryekiSENWEEKKAQLQMKEGKQEQQKQKQWQKEEMWKKEQKQTTPKQAEREEKQKQRGQEEEELSKSSLQRLEEGTR 929
Cdd:TIGR02168  311 -------LANLERQLEELEAQLEELESKLDELAEELAELEEKLEELKEELESLEAELEELEAELEELESRLEELEEQLET 383
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  930 KMKAQGLLLEKE---NGQMRQIEKEVKHLGPNMRR---EKGKEKQKPERG-LEDLRRQIKTKEQMQmkETQPKELEKLVT 1002
Cdd:TIGR02168  384 LRSKVAQLELQIaslNNEIERLEARLERLEDRRERlqqEIEELLKKLEEAeLKELQAELEELEEEL--EELQEELERLEE 461

                   .
gi 1622844908 1003 Q 1003
Cdd:TIGR02168  462 A 462
DUF4670 pfam15709
Domain of unknown function (DUF4670); This family of proteins is found in eukaryotes. Proteins ...
811-969 2.14e-03

Domain of unknown function (DUF4670); This family of proteins is found in eukaryotes. Proteins in this family are typically between 373 and 763 amino acids in length.


Pssm-ID: 464815 [Multi-domain]  Cd Length: 522  Bit Score: 43.02  E-value: 2.14e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  811 QKDHKETEKQRREQYSQEGQEQMSGMSLKQQFLEERNLLKERyekisenwEEKKAQLQMKEGKQEQQKQKQWQKEEMWKK 890
Cdd:pfam15709  364 QQEQLERAEKMREELELEQQRRFEEIRLRKQRLEEERQRQEE--------EERKQRLQLQAAQERARQQQEEFRRKLQEL 435
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1622844908  891 EQKQTTpKQAEREEKQKQRGQEEEELSKSSLQRLEEGTRKMKAQGLLLEKENGQMRQIEKEvkhlgpnMRREKGKEKQK 969
Cdd:pfam15709  436 QRKKQQ-EEAERAEAEKQRQKELEMQLAEEQKRLMEMAEEERLEYQRQKQEAEEKARLEAE-------ERRQKEEEAAR 506
SMC_N pfam02463
RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The ...
688-1043 2.71e-03

RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The SMC (structural maintenance of chromosomes) superfamily proteins have ATP-binding domains at the N- and C-termini, and two extended coiled-coil domains separated by a hinge in the middle. The eukaryotic SMC proteins form two kinds of heterodimers: the SMC1/SMC3 and the SMC2/SMC4 types. These heterodimers constitute an essential part of higher order complexes, which are involved in chromatin and DNA dynamics. This family also includes the RecF and RecN proteins that are involved in DNA metabolism and recombination.


Pssm-ID: 426784 [Multi-domain]  Cd Length: 1161  Bit Score: 43.04  E-value: 2.71e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  688 KPFDKKTVPKEEA---LLKRTEAEKLgiiKAKME--EYFQKVAETVTKILRKYKEIKKEERVGEKPIKQKKVVSfmpglh 762
Cdd:pfam02463  150 MKPERRLEIEEEAagsRLKRKKKEAL---KKLIEetENLAELIIDLEELKLQELKLKEQAKKALEYYQLKEKLE------ 220
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  763 fqkspisAKSESSTFLSHESTDPVINNLMQMILAEIESERDIptVSAVQKDHKETEKQRREQYSQEGQEQMSGMSLKQQF 842
Cdd:pfam02463  221 -------LEEEYLLYLDYLKLNEERIDLLQELLRDEQEEIES--SKQEIEKEEEKLAQVLKENKEEEKEKKLQEEELKLL 291
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  843 LEERNLLKERYEKISENWEEKKAQLQMKEGKQEQQKQKQwqkeemwKKEQKQTTPKQAEREEKQKQRGQEEEELSKSSLQ 922
Cdd:pfam02463  292 AKEEEELKSELLKLERRKVDDEEKLKESEKEKKKAEKEL-------KKEKEEIEELEKELKELEIKREAEEEEEEELEKL 364
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  923 RLEEGTRKMKAqgLLLEKENGQMRQIEKEVKHLGPNMRREKGKEKQKP---ERGLEDLRRQIKTKEQMQMKETQpKELEK 999
Cdd:pfam02463  365 QEKLEQLEEEL--LAKKKLESERLSSAAKLKEEELELKSEEEKEAQLLlelARQLEDLLKEEKKEELEILEEEE-ESIEL 441
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|....
gi 1622844908 1000 LVTQTPMTLSPRWKSVLKDVPWLYEGKESHRNLKTLENLPDEKE 1043
Cdd:pfam02463  442 KQGKLTEEKEELEKQELKLLKDELELKKSEDLLKETQLVKLQEQ 485
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
1472-1632 2.73e-03

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 43.10  E-value: 2.73e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1472 PQQAQELGIPLTPQQaqelgIPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQALGIPLSLQQAqelgiPLTPQQAQALG 1551
Cdd:pfam09770  210 PAQQPAPAPAQPPAA-----PPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPVTILQR-----PQSPQPDPAQP 279
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1552 IPLTPQQAQELGIPltPQQAQSQGIPLTPQQAQELGIPLtPQQAQALGIPLTPQQMQAQGitltPQQAQALGIPLTPQQL 1631
Cdd:pfam09770  280 SIQPQAQQFHQQPP--PVPVQPTQILQNPNRLSAARVGY-PQNPQPGVQPAPAHQAHRQQ----GSFGRQAPIITHPQQL 352

                   .
gi 1622844908 1632 Q 1632
Cdd:pfam09770  353 A 353
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
1445-1608 3.34e-03

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 42.72  E-value: 3.34e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1445 PLTPQQAQELGVTLTPQQAQelgipltPQQAQELGIPLTPQQAQELGIPLTPQQAQALGIPLTPQQAqelgiPLTPQQAQ 1524
Cdd:pfam09770  209 KPAQQPAPAPAQPPAAPPAQ-------QAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPVTILQR-----PQSPQPDP 276
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1525 ALGIPLSLQQAQELGIPLTPQQ-AQALGIPLTPQQAQeLGIPLTPQQAQSQGIPLTPQQAQelgipltPQQAQALGIPLT 1603
Cdd:pfam09770  277 AQPSIQPQAQQFHQQPPPVPVQpTQILQNPNRLSAAR-VGYPQNPQPGVQPAPAHQAHRQQ-------GSFGRQAPIITH 348

                   ....*
gi 1622844908 1604 PQQMQ 1608
Cdd:pfam09770  349 PQQLA 353
DUF5401 pfam17380
Family of unknown function (DUF5401); This is a family of unknown function found in ...
791-1015 3.62e-03

Family of unknown function (DUF5401); This is a family of unknown function found in Chromadorea.


Pssm-ID: 375164 [Multi-domain]  Cd Length: 722  Bit Score: 42.80  E-value: 3.62e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  791 MQMILAEIESERDIPTVSAVQKDHKETEKQRREQysQEGQEQMSGMSLKQQFLEERNLLKERYEKISENWEEKKAQ-LQM 869
Cdd:pfam17380  422 MEQIRAEQEEARQREVRRLEEERAREMERVRLEE--QERQQQVERLRQQEEERKRKKLELEKEKRDRKRAEEQRRKiLEK 499
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  870 KEGKQEQQKQKQWQKEEMWKKEQKQttpKQAEREEKQKQRGQEEEelsKSSLQRLEEgTRKMKAQGLLLEKENGQMRQIE 949
Cdd:pfam17380  500 ELEERKQAMIEEERKRKLLEKEMEE---RQKAIYEEERRREAEEE---RRKQQEMEE-RRRIQEQMRKATEERSRLEAME 572
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1622844908  950 KEVKHlgpnMRREKGKEKQKPERGLEDLRRQIKTKEQMQMKETQPKELE----KLVTQT-------PMTLSPRWKSV 1015
Cdd:pfam17380  573 REREM----MRQIVESEKARAEYEATTPITTIKPIYRPRISEYQPPDVEshmiRFTTQSpewatpsPATWNPEWNTV 645
PRK03918 PRK03918
DNA double-strand break repair ATPase Rad50;
787-1043 3.78e-03

DNA double-strand break repair ATPase Rad50;


Pssm-ID: 235175 [Multi-domain]  Cd Length: 880  Bit Score: 42.74  E-value: 3.78e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  787 INNLMQMILAEIESERDIptvsavqKDHKETEKQRREQYSQEGQEQMSGMSLKQQFLEERNLLKERYEKISENWEEKKAQ 866
Cdd:PRK03918   174 IKRRIERLEKFIKRTENI-------EELIKEKEKELEEVLREINEISSELPELREELEKLEKEVKELEELKEEIEELEKE 246
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  867 LQMKegkqeqqkqkqwqkeemwKKEQKQTTPKQAEREEKQKQRGQEEEELsKSSLQRLEEgtrkmkaqgllLEKENGQMR 946
Cdd:PRK03918   247 LESL------------------EGSKRKLEEKIRELEERIEELKKEIEEL-EEKVKELKE-----------LKEKAEEYI 296
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  947 QIEKEvkhlgpnmRREKGKEKQKPERGLEDLRRQIKT-KEQMQMKETQPKELEKlvtqtpmtLSPRWKSVLKDvpwLYEG 1025
Cdd:PRK03918   297 KLSEF--------YEEYLDELREIEKRLSRLEEEINGiEERIKELEEKEERLEE--------LKKKLKELEKR---LEEL 357
                          250
                   ....*....|....*...
gi 1622844908 1026 KESHRNLKTLENLPDEKE 1043
Cdd:PRK03918   358 EERHELYEEAKAKKEELE 375
SP1-4_arthropods_N cd22553
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ...
1084-1311 3.88e-03

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.
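
The degenerate notation in the description above (N is any nucleotide, R is A/G, Y is C/T) can be expanded mechanically into a regular expression. The following is a minimal sketch, not part of the CDD record: the helper names motif_to_regex and find_motif and the example sequence are hypothetical, chosen only for illustration. It performs a literal, single-strand text match and says nothing about actual binding specificity.

```python
import re

# Degenerate nucleotide codes named in the description (N, R, Y),
# plus the four unambiguous bases.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "[AG]",    # A or G
    "Y": "[CT]",    # C or T
    "N": "[ACGT]",  # any nucleotide
}

def motif_to_regex(motif: str) -> str:
    """Expand a degenerate motif such as NNRCRCCYY into a regex string."""
    return "".join(IUPAC[base] for base in motif.upper())

def find_motif(motif: str, sequence: str) -> list[int]:
    """Return 0-based start positions of all (possibly overlapping) matches
    of the motif on the given strand only."""
    # A zero-width lookahead lets overlapping occurrences be reported.
    pattern = re.compile("(?=(" + motif_to_regex(motif) + "))")
    return [m.start() for m in pattern.finditer(sequence.upper())]

if __name__ == "__main__":
    # Hypothetical sequence constructed to contain one match at offset 2.
    print(motif_to_regex("NNRCRCCYY"))
    print(find_motif("NNRCRCCYY", "AATTGCGCCCTAA"))
```

Only the forward strand is scanned here; searching the reverse complement as well would be a separate step.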


Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  Bit Score: 41.94  E-value: 3.88e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1084 ITLTPQQAQAqgiMLTIQQAQELGIPLTLQQAQALEIPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQALGITL----- 1158
Cdd:cd22553    101 IQLAPGGTQA---ILANQQTLIRPNTVQGQANASNVLQNIAQIASGGNAVQLPLNNMTQTIPVQVPVSTANGQTVyqtiq 177
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1159 TPQQAQELGIPLTPQ---QAQALGITLTPQQAQELGIPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQALGIPLTLQQA 1235
Cdd:cd22553    178 VPIQAIQSGNAGGGNqalQAQVIPQLAQAAQLQPQQLAQVSSQGYIQQIPANASQQQPQMVQQGPNQSGQIIGQVASASS 257
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1622844908 1236 QELGIPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQALGIPLTPQQAqelgIPLTPQQAQALGITLTP-QQAQALG 1311
Cdd:cd22553    258 IQAAAIPLTVYTGALAGQNGSNQQQVGQIVTSPIQGMTQGLTAPASSS----IPTVVQQQAIQGNPLPPgTQIIAAG 330
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
1172-1322 3.99e-03

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 42.72  E-value: 3.99e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1172 PQQAQALGITLTPQQaqelgIPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQALGIPLTLQQAqelgiPLTPQQAQALG 1251
Cdd:pfam09770  210 PAQQPAPAPAQPPAA-----PPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPVTILQR-----PQSPQPDPAQP 279
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1622844908 1252 IPLTPQQAQELGIPLTPQQ-AQALGIPLTPQQAQELGIPLTPQQAQALGITLTPQQAQALGIPL----TPQQAQEL 1322
Cdd:pfam09770  280 SIQPQAQQFHQQPPPVPVQpTQILQNPNRLSAARVGYPQNPQPGVQPAPAHQAHRQQGSFGRQApiitHPQQLAQL 355
SP1-4_arthropods_N cd22553
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ...
1110-1379 4.69e-03

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.


Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  Bit Score: 41.94  E-value: 4.69e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1110 LTLQQAQALEIPltpqqAQALGIPLTPQQAqelgIPLTPQQAQALgitltpqqaqelgipLTPQQAQALGITLTPQQAQE 1189
Cdd:cd22553     76 VTVDGHEAIFIP-----ANSGLLQTNNQQA----IQLAPGGTQAI---------------LANQQTLIRPNTVQGQANAS 131
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1190 LGIPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQALG--IPLTLQ---QAQELGIPLTPQQA-QALGIP--LTPQQAQE 1261
Cdd:cd22553    132 NVLQNIAQIASGGNAVQLPLNNMTQTIPVQVPVSTANGqtVYQTIQvpiQAIQSGNAGGGNQAlQAQVIPqlAQAAQLQP 211
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1262 LGIPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQAL-GITLTPQQAQALGIPLTPQQaQELGIPLTPQQAQELGIPLTP 1340
Cdd:cd22553    212 QQLAQVSSQGYIQQIPANASQQQPQMVQQGPNQSGQIiGQVASASSIQAAAIPLTVYT-GALAGQNGSNQQQVGQIVTSP 290
                          250       260       270
                   ....*....|....*....|....*....|....*....
gi 1622844908 1341 QQAQELGIPLTPQQAqelgIPLTPQQAQELGIPLTPQQA 1379
Cdd:cd22553    291 IQGMTQGLTAPASSS----IPTVVQQQAIQGNPLPPGTQ 325
SP1-4_arthropods_N cd22553
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ...
1436-1632 5.34e-03

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.


Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  Bit Score: 41.55  E-value: 5.34e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1436 PQQAQELGIPLTPQQAqeLGVTLTPQQAQelgipltPQQAQELGIPLTPQQAQELGIPLTPQQAQALGIPLTPQQAQELG 1515
Cdd:cd22553    179 PIQAIQSGNAGGGNQA--LQAQVIPQLAQ-------AAQLQPQQLAQVSSQGYIQQIPANASQQQPQMVQQGPNQSGQII 249
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1516 IPLTPQQAQALGIPLSLQQAQELGIPLTPQQAQALGIPLTPQQAQELGIPLTPqqaqSQGIPLTPQQAQELGIPLTP-QQ 1594
Cdd:cd22553    250 GQVASASSIQAAAIPLTVYTGALAGQNGSNQQQVGQIVTSPIQGMTQGLTAPA----SSSIPTVVQQQAIQGNPLPPgTQ 325
                          170       180       190
                   ....*....|....*....|....*....|....*...
gi 1622844908 1595 AQALGipltpQQMQAQGITLTPQQAQALGIPLTPQQLQ 1632
Cdd:cd22553    326 IIAAG-----QQLQQDPNDPTKWQVVADGTPGSKKRLR 358
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
1304-1466 5.74e-03

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 41.95  E-value: 5.74e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1304 PQQAQALGIPLTPQQaqelgIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELGiPLTPQQAQelg 1383
Cdd:pfam09770  210 PAQQPAPAPAQPPAA-----PPAQQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQGHPVTILQRPQSP-QPDPAQPS--- 280
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1384 iPLTPQQAQELGIPLTPQQ-AQELGIPLTPQQAQeLGIPLTPQQAQELGIPLTPQQAQelgipltPQQAQELGVTLTPQQ 1462
Cdd:pfam09770  281 -IQPQAQQFHQQPPPVPVQpTQILQNPNRLSAAR-VGYPQNPQPGVQPAPAHQAHRQQ-------GSFGRQAPIITHPQQ 351

                   ....
gi 1622844908 1463 AQEL 1466
Cdd:pfam09770  352 LAQL 355
SP1-4_arthropods_N cd22553
N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; ...
1076-1211 5.97e-03

N-terminal domain of transcription factor Specificity Protein (SP) 1-4 from arthropods; Specificity Proteins (SPs) are transcription factors that are involved in many cellular processes, including cell differentiation, cell growth, apoptosis, immune responses, response to DNA damage, and chromatin remodeling. There are many SPs in vertebrates (9 SPs in humans and mice, 7 SPs in the chicken, and 11 SPs in teleost fish), but arthropods only have 3 SPs. One SP is clade SP1-4, which is expressed ubiquitously throughout development. SP1-4 belongs to a family of proteins, called the SP/Kruppel or Krueppel-like Factor (KLF) family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. SP factors preferentially bind GC boxes, while KLFs bind CACCC boxes. Another characteristic hallmark of SP factors is the presence of the Buttonhead (BTD) box CXCPXC, just N-terminal to the zinc fingers. The function of the BTD box is unknown, but it is thought to play an important physiological role. Another feature of most SP factors is the presence of a conserved amino acid stretch, the so-called SP box, located close to the N-terminus. This model represents the N-terminal domain of SP1-4 from arthropods.


Pssm-ID: 411778 [Multi-domain]  Cd Length: 384  Bit Score: 41.55  E-value: 5.97e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1076 PQQAQEVGITLTPQQAQAQGIMLTIQQAQELGIPLTLQQAQALEIPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQALG 1155
Cdd:cd22553    194 ALQAQVIPQLAQAAQLQPQQLAQVSSQGYIQQIPANASQQQPQMVQQGPNQSGQIIGQVASASSIQAAAIPLTVYTGALA 273
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*.
gi 1622844908 1156 ITLTPQQAQELGIPLTPQQAQALGITLTPQQAqelgIPLTPQQAQALGIPLTPQQA 1211
Cdd:cd22553    274 GQNGSNQQQVGQIVTSPIQGMTQGLTAPASSS----IPTVVQQQAIQGNPLPPGTQ 325
PHA03247 PHA03247
large tegument protein UL36; Provisional
1055-1424 6.85e-03

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 41.85  E-value: 6.85e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1055 PSSPGALPISGQPLTRCIHLTPQQAQEVGITLTPQQAQAQG----IMLTIQQAQELGIPLTLQQAQALEIPLTPQQAqal 1130
Cdd:PHA03247  2631 PSPAANEPDPHPPPTVPPPERPRDDPAPGRVSRPRRARRLGraaqASSPPQRPRRRAARPTVGSLTSLADPPPPPPT--- 2707
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1131 giPLTPQQAQELGIPLTPQQAQALGITLTPQQAqelgiPLTPQQAQALGITLTPQQAQELGIPLTPQQAQALGIPLTPQQ 1210
Cdd:PHA03247  2708 --PEPAPHALVSATPLPPGPAAARQASPALPAA-----PAPPAVPAGPATPGGPARPARPPTTAGPPAPAPPAAPAAGPP 2780
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1211 AQELGIPLTPQQAQALGIPLTLQQAQelgiPLTPQQAQALGIPLTPQQAQELGIPLTPQQAQAlgiPLTPQQAQElgiPL 1290
Cdd:PHA03247  2781 RRLTRPAVASLSESRESLPSPWDPAD----PPAAVLAPAAALPPAASPAGPLPPPTSAQPTAP---PPPPGPPPP---SL 2850
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1291 TPQQAQALG--ITLTPQQAQALGIPLTPQQ--AQELGIPLTPQQAQELgiPLTPQQAQELGIPLTPQQAQELgiPLTPQQ 1366
Cdd:PHA03247  2851 PLGGSVAPGgdVRRRPPSRSPAAKPAAPARppVRRLARPAVSRSTESF--ALPPDQPERPPQPQAPPPPQPQ--PQPPPP 2926
                          330       340       350       360       370       380
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1622844908 1367 AQELGIPLTPQQAQELGIPLTPQQAQELGIPLTPQQAQELGIP--------LTPQQAQELGIPLTP 1424
Cdd:PHA03247  2927 PQPQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWLGALVPgrvavprfRVPQPAPSREAPASS 2992
PRK03918 PRK03918
DNA double-strand break repair ATPase Rad50;
640-1043 8.10e-03

DNA double-strand break repair ATPase Rad50;


Pssm-ID: 235175 [Multi-domain]  Cd Length: 880  Bit Score: 41.59  E-value: 8.10e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  640 KSLSRVAKETSESTRVLESPDGESE--QSNLEEFQKaimafLKQKIDNTgkpfdKKTVPKEEALLKRTEAEKLGIikakm 717
Cdd:PRK03918   200 KELEEVLREINEISSELPELREELEklEKEVKELEE-----LKEEIEEL-----EKELESLEGSKRKLEEKIREL----- 264
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  718 EEYFQKVAETVTKILRKYKEIKKEERVGEKPIKQKKvvsfmpglhFQKSPISAKSESSTFLSHESTDpvINNLmQMILAE 797
Cdd:PRK03918   265 EERIEELKKEIEELEEKVKELKELKEKAEEYIKLSE---------FYEEYLDELREIEKRLSRLEEE--INGI-EERIKE 332
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  798 IESERDipTVSAVQKDHKETEKQ--RREQYSQEGQEQMSGMSLKQQFLEER-NLLKERYEKISENWEEKKAQLQmkegkq 874
Cdd:PRK03918   333 LEEKEE--RLEELKKKLKELEKRleELEERHELYEEAKAKKEELERLKKRLtGLTPEKLEKELEELEKAKEEIE------ 404
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  875 eqqkqkqwqkeemwkKEQKQTTPKQAEREEKQKQRGQEEEELSKSSL------QRLEEGTRKMkaqglLLEKENGQMRQI 948
Cdd:PRK03918   405 ---------------EEISKITARIGELKKEIKELKKAIEELKKAKGkcpvcgRELTEEHRKE-----LLEEYTAELKRI 464
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908  949 EKEVKHLGpNMRREKGKEKQKPERGLEDLRRQIKTKEQM-QMKETQPK-------ELEK------LVTQTPMTLSPRWKS 1014
Cdd:PRK03918   465 EKELKEIE-EKERKLRKELRELEKVLKKESELIKLKELAeQLKELEEKlkkynleELEKkaeeyeKLKEKLIKLKGEIKS 543
                          410       420
                   ....*....|....*....|....*....
gi 1622844908 1015 VLKDvpwLYEGKESHRNLKTLENLPDEKE 1043
Cdd:PRK03918   544 LKKE---LEKLEELKKKLAELEKKLDELE 569
PABP-1234 TIGR01628
polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins ...
1521-1628 9.37e-03

polyadenylate binding protein, human types 1, 2, 3, 4 family; These eukaryotic proteins recognize the poly-A tail of mRNA and consist of four tandem RNA recognition domains at the N-terminus (rrm: pfam00076) followed by a PABP-specific domain (pfam00658) at the C-terminus. The protein is involved in the transport of mRNAs from the nucleus to the cytoplasm. There are four paralogs in Homo sapiens: one expressed in testis, one in platelets, one broadly expressed, and one of unknown tissue range.


Pssm-ID: 130689 [Multi-domain]  Cd Length: 562  Bit Score: 41.33  E-value: 9.37e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1622844908 1521 QQAQALGIPLSLQQAQELGIPLTPQ-QAQALGI-PLTPQQAQELGIPLTPQQAQSQGIPLTPQQAQELGIPLTPQQA--- 1595
Cdd:TIGR01628  405 PQQQFNGQPLGWPRMSMMPTPMGPGgPLRPNGLaPMNAVRAPSRNAQNAAQKPPMQPVMYPPNYQSLPLSQDLPQPQsta 484
                           90       100       110
                   ....*....|....*....|....*....|....*
gi 1622844908 1596 -QALGIPLTPQQMQAqgitLTPQ-QAQALGIPLTP 1628
Cdd:TIGR01628  485 sQGGQNKKLAQVLAS----ATPQmQKQVLGERLFP 515
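Each alignment row in the listing above carries a start coordinate, the aligned residues, and an end coordinate, so the rows can be sanity-checked mechanically. The following is a minimal Python sketch, not part of the CD-Search output: the function names are hypothetical, and it assumes that `-` marks a gap and that lowercase letters mark residues outside the aligned columns.

```python
def parse_row(line):
    """Split one alignment row into (start, aligned_text, end).

    Works for both 'gi <id> <start> <text> <end>' and
    'Cdd:<name> <start> <text> <end>' rows, because the last three
    whitespace-separated tokens are always start, text, end.
    """
    tokens = line.split()
    return int(tokens[-3]), tokens[-2], int(tokens[-1])


def residue_count(aligned):
    """Count residues in an aligned string, ignoring gap characters."""
    return sum(1 for ch in aligned if ch != "-")


def coords_consistent(line):
    """Check that the residue count matches end - start + 1."""
    start, aligned, end = parse_row(line)
    return residue_count(aligned) == end - start + 1


def identical_columns(query_line, subject_line):
    """Count columns where both rows show the same uppercase residue."""
    _, query, _ = parse_row(query_line)
    _, subject, _ = parse_row(subject_line)
    return sum(1 for a, b in zip(query, subject) if a.isupper() and a == b)


# The final row pair of the TIGR01628 alignment, copied from the listing.
query_row = "gi 1622844908 1596 -QALGIPLTPQQMQAqgitLTPQ-QAQALGIPLTP 1628"
subject_row = "Cdd:TIGR01628  485 sQGGQNKKLAQVLAS----ATPQmQKQVLGERLFP 515"

print(coords_consistent(query_row), coords_consistent(subject_row))
print(identical_columns(query_row, subject_row))
```

A check like this is mainly useful for catching rows that were damaged when the page was copied, since a mismatch between the coordinates and the residue count means characters were lost or added.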
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
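To relate the E-value threshold to the individual hits, the per-hit summary lines (those beginning with "Pssm-ID:") can be parsed and compared against it. This is a minimal Python sketch under stated assumptions: the regular expression reflects the line layout as it appears in this listing, the function names are hypothetical, and treating a hit as reported when its E-value is at or below the threshold is an assumption about the comparison, not documented CD-Search behavior.

```python
import re

# Layout assumed from this listing:
# "Pssm-ID: <int> [<tag>]  Cd Length: <int>  Bit Score: <float>  E-value: <float>"
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r".*?Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the numeric fields of a hit summary line, or None if no match."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    return {
        "pssm_id": int(match.group("pssm_id")),
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


def passes_threshold(hit, threshold=0.01):
    """Assumed rule: keep a hit whose E-value is at or below the threshold."""
    return hit["e_value"] <= threshold


# Summary lines copied from the PRK03918 and TIGR01628 hits in the listing.
lines = [
    "Pssm-ID: 235175 [Multi-domain]  Cd Length: 880  Bit Score: 41.59  E-value: 8.10e-03",
    "Pssm-ID: 130689 [Multi-domain]  Cd Length: 562  Bit Score: 41.33  E-value: 9.37e-03",
]
hits = [parse_summary(line) for line in lines]
for hit in hits:
    print(hit["pssm_id"], hit["e_value"], passes_threshold(hit))
```

Both of these hits sit just under the 0.01 cutoff, which is worth keeping in mind when weighing them against the much stronger PTZ00121 match at the top of the list.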

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D1):D384-D388.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D1):D265-D268.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D1):D200-D203.