Conserved domains on  [gi|113674054|ref|NP_001038232|]

histone-lysine N-methyltransferase SETDB1-A [Danio rerio]


List of domain hits

Name Accession Description Interval E-value
SET_SETDB1 cd10517
SET domain (including pre-SET and post-SET domains) found in SET domain bifurcated 1 (SETDB1) ...
1022-1436 5.47e-174

SET domain (including pre-SET and post-SET domains) found in SET domain bifurcated 1 (SETDB1) and similar proteins; SETDB1 (EC 2.1.1.43; also termed ERG-associated protein with SET domain (ESET), histone H3-K9 methyltransferase 4, H3-K9-HMTase 4, or lysine N-methyltransferase 1E (KMT1E)) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It mainly functions in euchromatin regions, thereby playing a central role in the silencing of euchromatic genes.



Pssm-ID: 380915 [Multi-domain]  Cd Length: 288  Bit Score: 519.92  E-value: 5.47e-174
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1022 QPHLYLPDISEGKEVMPVPCVNEVDNTLAPNVTYTKDRVPARGVFINTSSDFMVGCDCTDGCRDRSKCACHKLTIEATSL 1101
Cdd:cd10517     1 KPYYYICDISYGKEGVPIPCVNEIDNSSPPYVEYSKERIPGKGVNINLDPDFLVGCDCTDGCRDKSKCACQQLTIEATAA 80
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1102 CTGGPVDVSAGYTHKRLPTSLPTGVYECNPLCRCDpRMCSNRLVQHGMQLRLELFMTQHKGWGIRCKDDVPKGTFVCVFT 1181
Cdd:cd10517    81 TPGGQINPSAGYQYRRLMEKLPTGVYECNSRCKCD-KRCYNRVVQNGLQVRLQVFKTEKKGWGIRCLDDIPKGSFVCIYA 159
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1182 GKIVNEDKMNEDDTMSGNEYLANLDFIEGVEKLKEGYESEAycsdtevesskktitmktgpllknslykedsssgeepme 1261
Cdd:cd10517   160 GQILTEDEANEEGLQYGDEYFAELDYIEVVEKLKEGYESDV--------------------------------------- 200
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1262 vdtakdkvkvhdkplgerklpnkphetpkdtqkkiselrkndgqessgpkrcfaiksfqrrvkplesteaqkektktpkn 1341
Cdd:cd10517       --------------------------------------------------------------------------------
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1342 trglfndEDACYIIDARQEGNLGRYINHSCSPNLFVQNVFVDTHDLRFPWVAFFASKRIKAGTELTWDYNYEVGSVEGKV 1421
Cdd:cd10517   201 -------EEHCYIIDAKSEGNLGRYLNHSCSPNLFVQNVFVDTHDLRFPWVAFFASRYIRAGTELTWDYNYEVGSVPGKV 273
                         410
                  ....*....|....*
gi 113674054 1422 LLCCCGSLRCTGRLL 1436
Cdd:cd10517   274 LYCYCGSSNCRGRLL 288
Tudor_SF super family cl02573
Tudor domain superfamily; The Tudor domain is a conserved structural domain, originally ...
636-718 4.50e-27

Tudor domain superfamily; The Tudor domain is a conserved structural domain, originally identified in the Tudor protein of Drosophila, that adopts a beta-barrel-like core structure containing four short beta-strands followed by an alpha-helical region. It binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions. Tudor domain-containing proteins may mediate protein-protein interactions required for various DNA-templated biological processes, such as RNA metabolism, as well as histone modification and the DNA damage response. Members of this superfamily contain one or more copies of the Tudor domain.


The actual alignment was detected with superfamily member cd20382:

Pssm-ID: 470623  Cd Length: 82  Bit Score: 105.83  E-value: 4.50e-27
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  636 IGCRVVASAKSENGKSLYNAGVMVELPERKNRMRFLVFFDDGLATYLALPDLYFVCKQTKKVWREIkDESSRKQVKDYLQ 715
Cdd:cd20382     1 VGSRVVAQYKDEGNQVWLYAGIVAEPPKVKNRYRYLIFFDDGYAQYVTPSDVYLVCQQSKKVWEDI-HEDSRDFIREYLE 79

                  ...
gi 113674054  716 VYP 718
Cdd:cd20382    80 AYP 82
HMT_MBD cd01395
Methyl-CpG binding domains (MBD) present in putative histone methyltransferases (HMT) such as ...
943-1003 1.22e-19

Methyl-CpG binding domains (MBD) present in putative histone methyltransferases (HMT) such as CLLD8 and SETDB1 proteins; CLLD8 contains a MBD, a PreSET and a bifurcated SET domain, suggesting that CLLD8 might be associated with methylation-mediated transcriptional repression. SETDB1 and other proteins in this group have a similar domain architecture. SETDB1 is a novel KAP-1-associated histone H3, lysine 9-specific methyltransferase that contributes to HP1-mediated silencing of euchromatic genes by KRAB zinc-finger proteins.



Pssm-ID: 238689  Cd Length: 60  Bit Score: 83.97  E-value: 1.22e-19
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 113674054  943 PLLIPLLFKFRRMTARRRIdGKLFFHIFYRSPCGRSLCDMQEVQDYLFETrCDFLFLEMFC 1003
Cdd:cd01395     1 PLHTPLLCGFQRMKYRARV-GKVKKHVIYKAPCGRSLRNMSEVHRYLRET-CSFLTVDNFS 59
Tudor_4 super family cl39701
Histone methyltransferase Tudor domain; This is a Tudor domain found in histone-lysine ...
725-773 3.28e-11

Histone methyltransferase Tudor domain; This is a Tudor domain found in histone-lysine N-methyltransferase SETDB1 proteins (EC:2.1.1.43), also known as Eggless in Drosophila. In Drosophila, SetdB1 (Egg) is important for oogenesis and the silencing of chromosome 4.


The actual alignment was detected with superfamily member pfam18358:

Pssm-ID: 408159  Cd Length: 50  Bit Score: 59.68  E-value: 3.28e-11
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....*....
gi 113674054   725 LRLGQETKAVRNGQFEDCTVLQLDGSLVQICYKNDKQKEWIYKGSDKLE 773
Cdd:pfam18358    1 LKKGQTVKTEWNGKWWTARVLEVDASLVKVYFLSDKRTEWIYRGSTRLE 49
PRK08581 super family cl35718
amidase domain-containing protein;
351-546 2.63e-09

amidase domain-containing protein;


The actual alignment was detected with superfamily member PRK08581:

Pssm-ID: 236304 [Multi-domain]  Cd Length: 619  Bit Score: 61.73  E-value: 2.63e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  351 DLLESDSEQSDNAATKTRFKPSEVTASSKLKSSGDHNSASASLNRTDPKVRPVTPSGTPPPSKSPPAVDNTASVETNQTD 430
Cdd:PRK08581   29 DPQKDSTAKTTSHDSKKSNDDETSKDTSSKDTDKADNNNTSNQDNNDKKFSTIDSSTSDSNNIIDFIYKNLPQTNINQLL 108
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  431 SELPTETPVEESTLPSNpkeavIMSDAESTDKTEKPQTRKKSSKP-SVTTTSPESRLTSSKSPPVTKTSSTQKETARAQS 509
Cdd:PRK08581  109 TKNKYDDNYSLTTLIQN-----LFNLNSDISDYEQPRNSEKSTNDsNKNSDSSIKNDTDTQSSKQDKADNQKAPSSNNTK 183
                         170       180       190
                  ....*....|....*....|....*....|....*..
gi 113674054  510 PSDSIDESADMEDSPDEPSNSPTESPTKTPDKTTRND 546
Cdd:PRK08581  184 PSTSNKQPNSPKPTQPNQSNSQPASDDTANQKSSSKD 220
Herpes_BLLF1 super family cl37540
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
129-500 5.72e-07

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


The actual alignment was detected with superfamily member pfam05109:

Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 54.15  E-value: 5.72e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   129 VVIDLGATKETLEPMLEKVTVAIQKSSKLVQDLVQMVSKTSMGATSPLSTSSSDINRPSSSSTPEIVRPesVTPKLEITN 208
Cdd:pfam05109  419 VIFSKAPESTTTSPTLNTTGFAAPNTTTGLPSSTHVPTNLTAPASTGPTVSTADVTSPTPAGTTSGASP--VTPSPSPRD 496
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   209 SITIVKTESLSSvpKISSLFNSSEQCKSiadhdsyfkPTiktePEWTPLTPWEDS----ESSPFEKLIKTESQSTDVTPS 284
Cdd:pfam05109  497 NGTESKAPDMTS--PTSAVTTPTPNATS---------PT----PAVTTPTPNATSptlgKTSPTSAVTTPTPNATSPTPA 561
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   285 VMTPNKQPELLSFQSTTKIK----PEPQSTQANTELSSPPSNSKlleNHNSLSIAAIKN-ESQLKASVSEVDLLESDSEQ 359
Cdd:pfam05109  562 VTTPTPNATIPTLGKTSPTSavttPTPNATSPTVGETSPQANTT---NHTLGGTSSTPVvTSPPKNATSAVTTGQHNITS 638
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   360 SDNAATKTRfkPSEVTASSKLKSSGDHNSASASLNRTDP----KVRPVTPSGTPP---PSKSP---PAVDNTASVETNQT 429
Cdd:pfam05109  639 SSTSSMSLR--PSSISETLSPSTSDNSTSHMPLLTSAHPtggeNITQVTPASTSThhvSTSSPaprPGTTSQASGPGNSS 716
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   430 DSELPTETPVEESTLPSNPKEavimSDAESTDKTEKPQTRKKSSKPSVTT-------------TSPESRLTSSKSPPVTK 496
Cdd:pfam05109  717 TSTKPGEVNVTKGTPPKNATS----PQAPSGQKTAVPTVTSTGGKANSTTggkhttghgartsTEPTTDYGGDSTTPRTR 792

                   ....
gi 113674054   497 TSST 500
Cdd:pfam05109  793 YNAT 796
DUF5604 super family cl39647
Domain of unknown function (DUF5604); This domain is often found in the N-terminal region of ...
570-624 3.96e-06

Domain of unknown function (DUF5604); This domain is often found in the N-terminal region of proteins carrying the SET domain (pfam00856), such as the SETDB1 protein present in Homo sapiens. SETDB1 is a histone methyltransferase that suppresses gene expression and modulates heterochromatin formation through H3K9me2/3.


The actual alignment was detected with superfamily member pfam18300:

Pssm-ID: 408109  Cd Length: 58  Bit Score: 45.45  E-value: 3.96e-06
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*.
gi 113674054   570 KEIKLKVGAAVLGKKRHNHWSRGTVQEVETEDDGNTYKVEF-KKGKTIvLSANHVA 624
Cdd:pfam18300    2 KDGDLIVSMRILGKKRTKTWHKGTLIAIQTVGPGKKYKVKFdNKGKSL-LSGNHIA 56
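Each hit above carries a statistics line (Pssm-ID, Cd Length, Bit Score, E-value) followed by paired query/model alignment rows. The following is a minimal Python sketch for reading those two pieces from a text dump like this one. It is not an NCBI tool: the function names `parse_stats` and `count_identities` are hypothetical, and it assumes that `-` marks a gap and that lowercase letters mark residues outside the aligned region.

```python
import re

# Pattern for lines such as:
#   Pssm-ID: 380915 [Multi-domain]  Cd Length: 288  Bit Score: 519.92  E-value: 5.47e-174
# The bracketed flag is optional; some hits do not carry it.
STATS_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_stats(line):
    """Parse one statistics line into a dict, or return None if it does not match."""
    m = STATS_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m["pssm_id"]),
        "multi_domain": m["flag"] == "Multi-domain",
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "e_value": float(m["e_value"]),
    }


def count_identities(query, subject):
    """Return (identical, aligned) over columns where both rows hold an
    uppercase residue. Gap columns ('-') and lowercase (assumed unaligned)
    residues are skipped."""
    identical = aligned = 0
    for q, s in zip(query, subject):
        if not (q.isupper() and s.isupper()):
            continue
        aligned += 1
        if q == s:
            identical += 1
    return identical, aligned
```

For example, `parse_stats` applied to the cd10517 statistics line yields a Cd Length of 288 and a bit score of 519.92, and `count_identities("VYP", "AYP")` compares the last three aligned columns of the cd20382 hit.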
 
Name Accession Description Interval E-value
SET_SETDB1 cd10517
SET domain (including pre-SET and post-SET domains) found in SET domain bifurcated 1 (SETDB1) ...
1022-1436 5.47e-174

SET domain (including pre-SET and post-SET domains) found in SET domain bifurcated 1 (SETDB1) and similar proteins; SETDB1 (EC 2.1.1.43; also termed ERG-associated protein with SET domain (ESET), histone H3-K9 methyltransferase 4, H3-K9-HMTase 4, or lysine N-methyltransferase 1E (KMT1E)) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It mainly functions in euchromatin regions, thereby playing a central role in the silencing of euchromatic genes.


Pssm-ID: 380915 [Multi-domain]  Cd Length: 288  Bit Score: 519.92  E-value: 5.47e-174
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1022 QPHLYLPDISEGKEVMPVPCVNEVDNTLAPNVTYTKDRVPARGVFINTSSDFMVGCDCTDGCRDRSKCACHKLTIEATSL 1101
Cdd:cd10517     1 KPYYYICDISYGKEGVPIPCVNEIDNSSPPYVEYSKERIPGKGVNINLDPDFLVGCDCTDGCRDKSKCACQQLTIEATAA 80
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1102 CTGGPVDVSAGYTHKRLPTSLPTGVYECNPLCRCDpRMCSNRLVQHGMQLRLELFMTQHKGWGIRCKDDVPKGTFVCVFT 1181
Cdd:cd10517    81 TPGGQINPSAGYQYRRLMEKLPTGVYECNSRCKCD-KRCYNRVVQNGLQVRLQVFKTEKKGWGIRCLDDIPKGSFVCIYA 159
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1182 GKIVNEDKMNEDDTMSGNEYLANLDFIEGVEKLKEGYESEAycsdtevesskktitmktgpllknslykedsssgeepme 1261
Cdd:cd10517   160 GQILTEDEANEEGLQYGDEYFAELDYIEVVEKLKEGYESDV--------------------------------------- 200
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1262 vdtakdkvkvhdkplgerklpnkphetpkdtqkkiselrkndgqessgpkrcfaiksfqrrvkplesteaqkektktpkn 1341
Cdd:cd10517       --------------------------------------------------------------------------------
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1342 trglfndEDACYIIDARQEGNLGRYINHSCSPNLFVQNVFVDTHDLRFPWVAFFASKRIKAGTELTWDYNYEVGSVEGKV 1421
Cdd:cd10517   201 -------EEHCYIIDAKSEGNLGRYLNHSCSPNLFVQNVFVDTHDLRFPWVAFFASRYIRAGTELTWDYNYEVGSVPGKV 273
                         410
                  ....*....|....*
gi 113674054 1422 LLCCCGSLRCTGRLL 1436
Cdd:cd10517   274 LYCYCGSSNCRGRLL 288
Tudor_SETDB1_rpt1 cd20382
first Tudor domain found in SET domain bifurcated 1 (SETDB1) and similar proteins; SETDB1, ...
636-718 4.50e-27

first Tudor domain found in SET domain bifurcated 1 (SETDB1) and similar proteins; SETDB1, also called ERG-associated protein with SET domain (ESET), histone H3-K9 methyltransferase 4, H3-K9-HMTase 4, or lysine N-methyltransferase 1E (KMT1E), acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It mainly functions in euchromatin regions, thereby playing a central role in the silencing of euchromatic genes. It contains two Tudor domains. This model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.


Pssm-ID: 410453  Cd Length: 82  Bit Score: 105.83  E-value: 4.50e-27
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  636 IGCRVVASAKSENGKSLYNAGVMVELPERKNRMRFLVFFDDGLATYLALPDLYFVCKQTKKVWREIkDESSRKQVKDYLQ 715
Cdd:cd20382     1 VGSRVVAQYKDEGNQVWLYAGIVAEPPKVKNRYRYLIFFDDGYAQYVTPSDVYLVCQQSKKVWEDI-HEDSRDFIREYLE 79

                  ...
gi 113674054  716 VYP 718
Cdd:cd20382    80 AYP 82
Pre-SET pfam05033
Pre-SET motif; This protein motif is a zinc binding motif. It contains 9 conserved cysteines ...
1030-1143 6.87e-24

Pre-SET motif; This protein motif is a zinc binding motif. It contains 9 conserved cysteines that coordinate three zinc ions. It is thought that this region plays a structural role in stabilising SET domains.


Pssm-ID: 461530 [Multi-domain]  Cd Length: 99  Bit Score: 97.49  E-value: 6.87e-24
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  1030 ISEGKEVMPVPCVNEVDNTLAP-NVTYTKDRVPARGVFintsSDFMVGCDCTDgCrDRSKCACHKLtieatslcTGGpvD 1108
Cdd:pfam05033    1 ISKGKENVPIPVVNEVDDEPPPpDFTYITSYIYPKEFL----LIIPQGCDCGD-C-SSEKCSCAQL--------NGG--E 64
                           90       100       110
                   ....*....|....*....|....*....|....*.
gi 113674054  1109 VSAGYTHK-RLPTSLPTGVYECNPLCRCdPRMCSNR 1143
Cdd:pfam05033   65 FRFPYDKDgLLVPESKPPIYECNPLCGC-PPSCPNR 99
PreSET smart00468
N-terminal to some SET domains; A Cys-rich putative Zn2+-binding domain that occurs N-terminal ...
1027-1134 1.84e-21

N-terminal to some SET domains; A Cys-rich putative Zn2+-binding domain that occurs N-terminal to some SET domains. Function is unknown. Unpublished.


Pssm-ID: 128744 [Multi-domain]  Cd Length: 98  Bit Score: 90.55  E-value: 1.84e-21
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   1027 LPDISEGKEVMPVPCVNEVDNTLAP-NVTYTKDRVPARGVFINTSSDFMVGCDCTDGCRDRSKCACHKLTieatslCTGG 1105
Cdd:smart00468    1 CLDISNGKENVPVPLVNEVDEDPPPpDFEYISEYIYGQGVPIDRSPSPLVGCSCSGDCSSSNKCECARKN------GGEF 74
                            90       100
                    ....*....|....*....|....*....
gi 113674054   1106 PVDVSAGYTHKRlptslPTGVYECNPLCR 1134
Cdd:smart00468   75 AYELNGGLRLKR-----KPLIYECNSRCS 98
Tudor_5 pfam18359
Histone methyltransferase Tudor domain 1; This is the first TUDOR domain found in SETDB1 ...
634-686 3.35e-20

Histone methyltransferase Tudor domain 1; This is the first Tudor domain found in SETDB1 enzymes (EC:2.1.1.43) in Homo sapiens, also known as Eggless in Drosophila. In Drosophila, SetdB1 (Egg) is important for oogenesis and the silencing of chromosome 4. SET domain, bifurcated 1 (SETDB1) is a histone methyltransferase (HMT) that methylates lysine 9 on histone H3 (H3K9). The enzymatic activity of SETDB1, in association with MBD1-containing chromatin-associated factor 1 (MCAF1), converts H3K9me2 to H3K9me3 and represses subsequent transcription. SETDB1 is amplified in cancers such as melanoma and lung cancer, and increased expression of SETDB1 promotes tumorigenesis in a zebrafish melanoma model. In addition, SETDB1 is required for endogenous retrovirus silencing during early embryogenesis, inhibition of adipocyte differentiation, and differentiation of mesenchymal cells into osteoblasts. The tandem Tudor domains in the N-terminal region are involved in protein-protein interactions. The second Tudor domain is pfam18385.


Pssm-ID: 465723  Cd Length: 53  Bit Score: 85.34  E-value: 3.35e-20
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|...
gi 113674054   634 LYIGCRVVASAKSENGKSLYNAGVMVELPERKNRMRFLVFFDDGLATYLALPD 686
Cdd:pfam18359    1 LPVGTRVIAKYKDSNGKSAYYAGVIAEPPKDLNRYRYLVFFDDGYAQYVVHKD 53
HMT_MBD cd01395
Methyl-CpG binding domains (MBD) present in putative histone methyltransferases (HMT) such as ...
943-1003 1.22e-19

Methyl-CpG binding domains (MBD) present in putative histone methyltransferases (HMT) such as CLLD8 and SETDB1 proteins; CLLD8 contains a MBD, a PreSET and a bifurcated SET domain, suggesting that CLLD8 might be associated with methylation-mediated transcriptional repression. SETDB1 and other proteins in this group have a similar domain architecture. SETDB1 is a novel KAP-1-associated histone H3, lysine 9-specific methyltransferase that contributes to HP1-mediated silencing of euchromatic genes by KRAB zinc-finger proteins.


Pssm-ID: 238689  Cd Length: 60  Bit Score: 83.97  E-value: 1.22e-19
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 113674054  943 PLLIPLLFKFRRMTARRRIdGKLFFHIFYRSPCGRSLCDMQEVQDYLFETrCDFLFLEMFC 1003
Cdd:cd01395     1 PLHTPLLCGFQRMKYRARV-GKVKKHVIYKAPCGRSLRNMSEVHRYLRET-CSFLTVDNFS 59
SET COG2940
SET domain-containing protein (function unknown) [General function prediction only];
1345-1434 2.27e-15

SET domain-containing protein (function unknown) [General function prediction only];


Pssm-ID: 442183 [Multi-domain]  Cd Length: 134  Bit Score: 74.23  E-value: 2.27e-15
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1345 LFNDEDACYIiDARQEGNLGRYINHSCSPNLFVqnvfvdthDLRFPWVAFFASKRIKAGTELTWDYNYEvgsVEGKVLLC 1424
Cdd:COG2940    59 LFELDDDGVI-DGALGGNPARFINHSCDPNCEA--------DEEDGRIFIVALRDIAAGEELTYDYGLD---YDEEEYPC 126
                          90
                  ....*....|
gi 113674054 1425 CCGSlrCTGR 1434
Cdd:COG2940   127 RCPN--CRGT 134
MBD smart00391
Methyl-CpG binding domain; Methyl-CpG binding domain, also known as the TAM (TTF-IIP5, ARBP, ...
941-1016 1.68e-14

Methyl-CpG binding domain; Methyl-CpG binding domain, also known as the TAM (TTF-IIP5, ARBP, MeCP1) domain


Pssm-ID: 128673  Cd Length: 77  Bit Score: 69.71  E-value: 1.68e-14
                            10        20        30        40        50        60        70
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 113674054    941 KNPLLIPLLFKFRRMTARRRI-DGKLFFHIFYRSPCGRSLCDMQEVQDYLFETRCDFLFLEMFCMDPFVLVNRARPP 1016
Cdd:smart00391    1 GDPLRLPLPCGWRRETKQRKSgRSAGKFDVYYISPCGKKLRSKSELARYLHKNGDLSLDLECFDFNATVPVGPKFTP 77
MBD pfam01429
Methyl-CpG binding domain; The Methyl-CpG binding domain (MBD) binds to DNA that contains one ...
939-1012 1.76e-12

Methyl-CpG binding domain; The Methyl-CpG binding domain (MBD) binds to DNA that contains one or more symmetrically methylated CpGs. DNA methylation in animals is associated with alterations in chromatin structure and silencing of gene expression. MBD has negligible non-specific affinity for DNA. In vitro foot-printing with MeCP2 showed the MBD can protect a 12 nucleotide region surrounding a methyl CpG pair. MBDs are found in several Methyl-CpG binding proteins and also DNA demethylase.


Pssm-ID: 396147 [Multi-domain]  Cd Length: 76  Bit Score: 63.92  E-value: 1.76e-12
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 113674054   939 HGKNPLLIPLLFKFRRMTARRRIDGKLF-FHIFYRSPCGRSLCDMQEVQDYLFETRCDFLFLEMFCMDPFVLVNR 1012
Cdd:pfam01429    2 ERKREDRLPLPPGWRREERQRKSGSKAGkVDVFYYSPTGKKLRSKSEVARYLEANGGTSPKLEDFSFTVRSEVGR 76
Tudor_4 pfam18358
Histone methyltransferase Tudor domain; This is a Tudor domain found in histone-lysine ...
725-773 3.28e-11

Histone methyltransferase Tudor domain; This is a Tudor domain found in histone-lysine N-methyltransferase SETDB1 proteins (EC:2.1.1.43), also known as Eggless in Drosophila. In Drosophila, SetdB1 (Egg) is important for oogenesis and the silencing of chromosome 4.


Pssm-ID: 408159  Cd Length: 50  Bit Score: 59.68  E-value: 3.28e-11
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....*....
gi 113674054   725 LRLGQETKAVRNGQFEDCTVLQLDGSLVQICYKNDKQKEWIYKGSDKLE 773
Cdd:pfam18358    1 LKKGQTVKTEWNGKWWTARVLEVDASLVKVYFLSDKRTEWIYRGSTRLE 49
Tudor_SETDB1_rpt2 cd21181
second Tudor domain found in SET domain bifurcated 1 (SETDB1) and similar proteins; SETDB1, ...
725-773 3.38e-10

second Tudor domain found in SET domain bifurcated 1 (SETDB1) and similar proteins; SETDB1, also called ERG-associated protein with SET domain (ESET), histone H3-K9 methyltransferase 4, H3-K9-HMTase 4, or lysine N-methyltransferase 1E (KMT1E), acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It mainly functions in euchromatin regions, thereby playing a central role in the silencing of euchromatic genes. It contains two Tudor domains. This model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.


Pssm-ID: 410548  Cd Length: 54  Bit Score: 56.95  E-value: 3.38e-10
                          10        20        30        40
                  ....*....|....*....|....*....|....*....|....*....
gi 113674054  725 LRLGQETKAVRNGQFEDCTVLQLDGSLVQICYKNDKQKEWIYKGSDKLE 773
Cdd:cd21181     1 LKVGQLIKTEWNGKWWKARVEEVDGSLVKMLFLDDKRTEWIYRGSTRLE 49
PRK08581 PRK08581
amidase domain-containing protein;
351-546 2.63e-09

amidase domain-containing protein;


Pssm-ID: 236304 [Multi-domain]  Cd Length: 619  Bit Score: 61.73  E-value: 2.63e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  351 DLLESDSEQSDNAATKTRFKPSEVTASSKLKSSGDHNSASASLNRTDPKVRPVTPSGTPPPSKSPPAVDNTASVETNQTD 430
Cdd:PRK08581   29 DPQKDSTAKTTSHDSKKSNDDETSKDTSSKDTDKADNNNTSNQDNNDKKFSTIDSSTSDSNNIIDFIYKNLPQTNINQLL 108
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  431 SELPTETPVEESTLPSNpkeavIMSDAESTDKTEKPQTRKKSSKP-SVTTTSPESRLTSSKSPPVTKTSSTQKETARAQS 509
Cdd:PRK08581  109 TKNKYDDNYSLTTLIQN-----LFNLNSDISDYEQPRNSEKSTNDsNKNSDSSIKNDTDTQSSKQDKADNQKAPSSNNTK 183
                         170       180       190
                  ....*....|....*....|....*....|....*..
gi 113674054  510 PSDSIDESADMEDSPDEPSNSPTESPTKTPDKTTRND 546
Cdd:PRK08581  184 PSTSNKQPNSPKPTQPNQSNSQPASDDTANQKSSSKD 220
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
129-500 5.72e-07

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 54.15  E-value: 5.72e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   129 VVIDLGATKETLEPMLEKVTVAIQKSSKLVQDLVQMVSKTSMGATSPLSTSSSDINRPSSSSTPEIVRPesVTPKLEITN 208
Cdd:pfam05109  419 VIFSKAPESTTTSPTLNTTGFAAPNTTTGLPSSTHVPTNLTAPASTGPTVSTADVTSPTPAGTTSGASP--VTPSPSPRD 496
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   209 SITIVKTESLSSvpKISSLFNSSEQCKSiadhdsyfkPTiktePEWTPLTPWEDS----ESSPFEKLIKTESQSTDVTPS 284
Cdd:pfam05109  497 NGTESKAPDMTS--PTSAVTTPTPNATS---------PT----PAVTTPTPNATSptlgKTSPTSAVTTPTPNATSPTPA 561
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   285 VMTPNKQPELLSFQSTTKIK----PEPQSTQANTELSSPPSNSKlleNHNSLSIAAIKN-ESQLKASVSEVDLLESDSEQ 359
Cdd:pfam05109  562 VTTPTPNATIPTLGKTSPTSavttPTPNATSPTVGETSPQANTT---NHTLGGTSSTPVvTSPPKNATSAVTTGQHNITS 638
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   360 SDNAATKTRfkPSEVTASSKLKSSGDHNSASASLNRTDP----KVRPVTPSGTPP---PSKSP---PAVDNTASVETNQT 429
Cdd:pfam05109  639 SSTSSMSLR--PSSISETLSPSTSDNSTSHMPLLTSAHPtggeNITQVTPASTSThhvSTSSPaprPGTTSQASGPGNSS 716
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   430 DSELPTETPVEESTLPSNPKEavimSDAESTDKTEKPQTRKKSSKPSVTT-------------TSPESRLTSSKSPPVTK 496
Cdd:pfam05109  717 TSTKPGEVNVTKGTPPKNATS----PQAPSGQKTAVPTVTSTGGKANSTTggkhttghgartsTEPTTDYGGDSTTPRTR 792

                   ....
gi 113674054   497 TSST 500
Cdd:pfam05109  793 YNAT 796
Treacle pfam03546
Treacher Collins syndrome protein Treacle;
259-522 9.44e-07

Treacher Collins syndrome protein Treacle;


Pssm-ID: 460967 [Multi-domain]  Cd Length: 531  Bit Score: 53.15  E-value: 9.44e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   259 PWEDSESSPFEkliktESQSTDVTPSVMTP------NKQPELLSFQSTTKIKPEPQSTQANTELSSPPSnskllenhnsl 332
Cdd:pfam03546   20 PEEDSESSSEE-----ESDSEEETPAAKTPlqakpsGKTPQVRAASAPAKESPRKGAPPVPPGKTGPAA----------- 83
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   333 siaaikneSQLKASVSEVDLlESDSEQSDnAATKTRFKPSEVTASSKLKSSGDhnsasaslnrtDPKVRPVTPSGTPPPS 412
Cdd:pfam03546   84 --------AQAQAGKPEEDS-ESSSEESD-SDGETPAAATLTTSPAQVKPLGK-----------NSQVRPASTVGKGPSG 142
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   413 K----SPPAVDNTASVETNQTDSELPTETPVEEStlpsnpkeavimsdaESTDKTEKPQTRKKSSK--PSVTTTSPESRL 486
Cdd:pfam03546  143 KganpAPPGKAGSAAPLVQVGKKEEDSESSSEES---------------DSEGEAPPAATQAKPSGkiLQVRPASGPAKG 207
                          250       260       270
                   ....*....|....*....|....*....|....*.
gi 113674054   487 TSSKSPPVTKTSSTQKETARAQSPSDSIDESADMED 522
Cdd:pfam03546  208 AAPAPPQKAGPVATQVKAERSKEDSESSEESSDSEE 243
DUF5604 pfam18300
Domain of unknown function (DUF5604); This domain is often found in the N-terminal region of ...
570-624 3.96e-06

Domain of unknown function (DUF5604); This domain is often found in the N-terminal region of proteins carrying the SET domain (pfam00856), such as the SETDB1 protein present in Homo sapiens. SETDB1 is a histone methyltransferase that suppresses gene expression and modulates heterochromatin formation through H3K9me2/3.


Pssm-ID: 408109  Cd Length: 58  Bit Score: 45.45  E-value: 3.96e-06
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*.
gi 113674054   570 KEIKLKVGAAVLGKKRHNHWSRGTVQEVETEDDGNTYKVEF-KKGKTIvLSANHVA 624
Cdd:pfam18300    2 KDGDLIVSMRILGKKRTKTWHKGTLIAIQTVGPGKKYKVKFdNKGKSL-LSGNHIA 56
PTZ00449 PTZ00449
104 kDa microneme/rhoptry antigen; Provisional
153-475 3.72e-05

104 kDa microneme/rhoptry antigen; Provisional


Pssm-ID: 185628 [Multi-domain]  Cd Length: 943  Bit Score: 48.53  E-value: 3.72e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  153 KSSKLVQDL-VQMVSKTSMGATSPLSTSSsdinrPSSSSTPEivRPESvtPKleitnsitIVKTESLSSVPKISslFNSS 231
Cdd:PTZ00449  610 KSPKLPELLdIPKSPKRPESPKSPKRPPP-----PQRPSSPE--RPEG--PK--------IIKSPKPPKSPKPP--FDPK 670
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  232 EQCKSiadHDSYFKPTIKTEPEWTPLTPWEDSESSPFEKLikTESQSTDVTPSVMTPNKQPELLSFQSTTKIKPEPQSTq 311
Cdd:PTZ00449  671 FKEKF---YDDYLDAAAKSKETKTTVVLDESFESILKETL--PETPGTPFTTPRPLPPKLPRDEEFPFEPIGDPDAEQP- 744
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  312 ANTELSSPPSNSKLL--ENHNSLSIAAIKNESqlkasVSEVDLlESDSEQSDNAATKTRfKPSEVTAssklKSSGDHNSA 389
Cdd:PTZ00449  745 DDIEFFTPPEEERTFfhETPADTPLPDILAEE-----FKEEDI-HAETGEPDEAMKRPD-SPSEHED----KPPGDHPSL 813
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  390 SASLNR------------TDPKVRPVTPSGTPPPSKSPPAVDNTASVETNQTDSELPTETPVE--------ESTLPSNPK 449
Cdd:PTZ00449  814 PKKRHRldglalsttdleSDAGRIAKDASGKIVKLKRSKSFDDLTTVEEAEEMGAEARKIVVDddgteaddEDTHPPEEK 893
                         330       340
                  ....*....|....*....|....*.
gi 113674054  450 EAVIMSDAESTDKTEKPQTRKKSSKP 475
Cdd:PTZ00449  894 HKSEVRRRRPPKKPSKPKKPSKPKKP 919
rad2 TIGR00600
DNA excision repair protein (rad2); All proteins in this family for which functions are known ...
226-540 4.81e-03

DNA excision repair protein (rad2); All proteins in this family for which functions are known are flap endonucleases that generate the 3' incision next to DNA damage as part of nucleotide excision repair. This family is related to many other flap endonuclease families including the fen1 family. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]


Pssm-ID: 273166 [Multi-domain]  Cd Length: 1034  Bit Score: 41.42  E-value: 4.81e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   226 SLFNSSEQCKSIADHDSYFkPTIKTEPEWTPLTPWEdsESSPFEKLIKTESQSTDvtPSVMTPNKQPELLSFQSTTKIKP 305
Cdd:TIGR00600  312 SLPSLSSQLDSNSEDLKSS-PWEKLKPESESIVEAE--PPSPRTLLAKQAAMSES--SSEDSDESEWERQELKRNNVAFV 386
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   306 EPQSTQANTElsspPSNSKLLENHNSLSIAAiKNESQLKASVSEVDLLESDSEQSDNA-ATKTRFKPSEVTASSKLKSSG 384
Cdd:TIGR00600  387 DDGSLSPRTL----QAIGQALDDDEDKKVSA-SSDDQASPSKKTKMLLISRIEVEDDDlDYLDQGEGIPLMAALQLSSVN 461
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   385 DHNSASASLNRTdpkvRPVTPSGTpppSKSPPAVDNTASVETNqtDSELPTEtpveeSTLPSNPKEAVImsdaestdkte 464
Cdd:TIGR00600  462 SKPEAVASTKIA----REVTSSGH---EAVPKAVQSLLLGATN--DSPIPSE-----FTILDRKSELSI----------- 516
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   465 kpqtrKKSSKPSVTTTSPESRLTSSKSPPVTKTSSTQK-----------------ETARAQSPSDSIDESADMEdSPDEP 527
Cdd:TIGR00600  517 -----ERTVKPVSSEFGLPSQREDKLAIPTEGTQNLQGisdhpeqfefqnelsplETKNNESNLSSDAETEGSP-NPEMP 590
                          330
                   ....*....|...
gi 113674054   528 SNSPTESPTKTPD 540
Cdd:TIGR00600  591 SWSSVTVPSEALD 603
MSCRAMM_ClfA NF033609
MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial ...
380-543 5.72e-03

MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules). It is heavily studied in Staphylococcus aureus both for its biological role in adhesion and for its potential for vaccination. Features of the sequence, shared with other MSCRAMM adhesins, include a long run of Ser-Asp dipeptide repeats and a C-terminal cell wall anchoring LPXTG motif.


Pssm-ID: 468110 [Multi-domain]  Cd Length: 934  Bit Score: 41.43  E-value: 5.72e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  380 LKSSGDHNSASASLNRTDPKVRPvTPSGTPPPSKSPPAVDNTasvetNQTDSELPTETPVEESTLPSNPKEavimsdaes 459
Cdd:NF033609   31 LLSSKEADASENSVTQSDSASNE-SKSNDSSSVSAAPKTDDT-----NVSDTKTSSNTNNGETSVAQNPAQ--------- 95
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  460 TDKTEKPQTRKKSSKPSVTTTSPESRLTSSKSPPVTKTSSTQKETARAQSPSDSIDESADMEDSPDEPSNSP-TESPTKT 538
Cdd:NF033609   96 QETTQSASTNATTEETPVTGEATTTATNQANTPATTQSSNTNAEELVNQTSNETTSNDTNTVSSVNSPQNSTnAENVSTT 175

                  ....*
gi 113674054  539 PDKTT 543
Cdd:NF033609  176 QDTST 180
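Each hit above carries a one-line summary of the form "Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...". A minimal Python sketch for pulling those fields out of the text is shown below. The function name parse_hit_summary and the regular expression are assumptions based only on the lines visible on this page, not an NCBI-provided parser.

```python
import re

# Pattern inferred from the summary lines on this page (an assumption,
# not an official format specification).
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*"
    r"(?P<multi>\[Multi-domain\])?\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>\S+)"
)


def parse_hit_summary(line: str) -> dict:
    """Parse one 'Pssm-ID: ...' summary line into typed fields."""
    match = SUMMARY_RE.search(line)
    if match is None:
        raise ValueError(f"not a hit summary line: {line!r}")
    return {
        "pssm_id": int(match.group("pssm_id")),
        "multi_domain": match.group("multi") is not None,
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


# Summary lines copied from the ClfA and DUF5604 hits.
clfa = parse_hit_summary(
    "Pssm-ID: 468110 [Multi-domain]  Cd Length: 934  Bit Score: 41.43  E-value: 5.72e-03"
)
duf = parse_hit_summary(
    "Pssm-ID: 408109  Cd Length: 58  Bit Score: 45.45  E-value: 3.96e-06"
)
print(clfa)
print(duf)
```

Parsing the E-value as a float makes it straightforward to filter hits against a threshold, for example to separate the weak low-complexity matches from the SET-domain hits.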
 
Name Accession Description Interval E-value
SET_SETDB1 cd10517
SET domain (including pre-SET and post-SET domains) found in SET domain bifurcated 1 (SETDB1) ...
1022-1436 5.47e-174

SET domain (including pre-SET and post-SET domains) found in SET domain bifurcated 1 (SETDB1) and similar proteins; SETDB1 (EC 2.1.1.43; also termed ERG-associated protein with SET domain (ESET), histone H3-K9 methyltransferase 4, H3-K9-HMTase 4, or lysine N-methyltransferase 1E (KMT1E)) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It mainly functions in euchromatin regions, thereby playing a central role in the silencing of euchromatic genes.


Pssm-ID: 380915 [Multi-domain]  Cd Length: 288  Bit Score: 519.92  E-value: 5.47e-174
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1022 QPHLYLPDISEGKEVMPVPCVNEVDNTLAPNVTYTKDRVPARGVFINTSSDFMVGCDCTDGCRDRSKCACHKLTIEATSL 1101
Cdd:cd10517     1 KPYYYICDISYGKEGVPIPCVNEIDNSSPPYVEYSKERIPGKGVNINLDPDFLVGCDCTDGCRDKSKCACQQLTIEATAA 80
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1102 CTGGPVDVSAGYTHKRLPTSLPTGVYECNPLCRCDpRMCSNRLVQHGMQLRLELFMTQHKGWGIRCKDDVPKGTFVCVFT 1181
Cdd:cd10517    81 TPGGQINPSAGYQYRRLMEKLPTGVYECNSRCKCD-KRCYNRVVQNGLQVRLQVFKTEKKGWGIRCLDDIPKGSFVCIYA 159
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1182 GKIVNEDKMNEDDTMSGNEYLANLDFIEGVEKLKEGYESEAycsdtevesskktitmktgpllknslykedsssgeepme 1261
Cdd:cd10517   160 GQILTEDEANEEGLQYGDEYFAELDYIEVVEKLKEGYESDV--------------------------------------- 200
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1262 vdtakdkvkvhdkplgerklpnkphetpkdtqkkiselrkndgqessgpkrcfaiksfqrrvkplesteaqkektktpkn 1341
Cdd:cd10517       --------------------------------------------------------------------------------
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1342 trglfndEDACYIIDARQEGNLGRYINHSCSPNLFVQNVFVDTHDLRFPWVAFFASKRIKAGTELTWDYNYEVGSVEGKV 1421
Cdd:cd10517   201 -------EEHCYIIDAKSEGNLGRYLNHSCSPNLFVQNVFVDTHDLRFPWVAFFASRYIRAGTELTWDYNYEVGSVPGKV 273
                         410
                  ....*....|....*
gi 113674054 1422 LLCCCGSLRCTGRLL 1436
Cdd:cd10517   274 LYCYCGSSNCRGRLL 288
SET_SETDB cd10541
SET domain (including pre-SET and post-SET domains) found in SET domain bifurcated 1 (SETDB1), ...
1022-1435 6.80e-133

SET domain (including pre-SET and post-SET domains) found in SET domain bifurcated 1 (SETDB1), SET domain bifurcated 2 (SETDB2), and similar proteins; SETDB1 (EC 2.1.1.43; also termed ERG-associated protein with SET domain (ESET), histone H3-K9 methyltransferase 4, H3-K9-HMTase 4, or lysine N-methyltransferase 1E (KMT1E)) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It mainly functions in euchromatin regions, thereby playing a central role in the silencing of euchromatic genes. SETDB2 (EC 2.1.1.43; also termed chronic lymphocytic leukemia deletion region gene 8 protein (CLLD8), or lysine N-methyltransferase 1F (KMT1F)) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It is involved in left-right axis specification in early development and mitosis.


Pssm-ID: 380939 [Multi-domain]  Cd Length: 236  Bit Score: 408.86  E-value: 6.80e-133
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1022 QPHLYLPDISEGKevmpvpcvnevdntlapnvtytkdrvpargvfintssdFMVGCDCTDGCRDRSKCACHKLTIEATSL 1101
Cdd:cd10541     1 KPFYYIPDISYGK--------------------------------------FLVGCDCTDGCRDKSKCACHQLTIQATAC 42
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1102 CTGGPVDVSAGYTHKRLPTSLPTGVYECNPLCRCDPRMCSNRLVQHGMQLRLELFMTQHKGWGIRCKDDVPKGTFVCVFT 1181
Cdd:cd10541    43 TPGGQDNPTAGYQYKRLEECLPTGVYECNKLCKCDPNMCQNRLVQHGLQVRLQLFKTQNKGWGIRCLDDIAKGTFVCIYA 122
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1182 GKIVNEDKMNEDDTMSGNEYLANLDFIegveklkegyeseaycsdtevesskktitmktgpllknslykedsssgeepme 1261
Cdd:cd10541   123 GKILTDDFADKEGLEMGDEYFANLDHI----------------------------------------------------- 149
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1262 vdtakdkvkvhdkplgerklpnkphetpkdtqkkiselrkndgqessgpkrcfaiksfqrrvkplesteaqkektktpkn 1341
Cdd:cd10541       --------------------------------------------------------------------------------
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1342 trglfndEDACYIIDARQEGNLGRYINHSCSPNLFVQNVFVDTHDLRFPWVAFFASKRIKAGTELTWDYNYEVGSVEGKV 1421
Cdd:cd10541   150 -------EESCYIIDAKLEGNLGRYLNHSCSPNLFVQNVFVDTHDLRFPWVAFFASKRIKAGTELTWDYNYEVGSVEGKE 222
                         410
                  ....*....|....
gi 113674054 1422 LLCCCGSLRCTGRL 1435
Cdd:cd10541   223 LLCCCGSNECRGRL 236
SET_SETDB2 cd10523
SET domain (including pre-SET and post-SET domains) found in SET domain bifurcated 2 (SETDB2) ...
1047-1435 2.49e-78

SET domain (including pre-SET and post-SET domains) found in SET domain bifurcated 2 (SETDB2) and similar proteins; SETDB2 (EC 2.1.1.43; also termed chronic lymphocytic leukemia deletion region gene 8 protein (CLLD8), or lysine N-methyltransferase 1F (KMT1F)) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It is involved in left-right axis specification in early development and mitosis.


Pssm-ID: 380921 [Multi-domain]  Cd Length: 266  Bit Score: 259.76  E-value: 2.49e-78
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1047 NTLAPNVTYTKDRVPARGVFINTSSDFMVGCDCTDGCRDRSKCACHKLTIEATSLCTGGPVDVSAGYTHKRLPTSLPTGV 1126
Cdd:cd10523     4 NTYVQLDRNPQDQQQLVDDFDISNGAFVDSCDCTDGCIDILKCACLQLTARAFSKSESSPSKGGRGYKYKRLQEPIPSGL 83
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1127 YECNPLCRCDPRMCSNRLVQHGMQLRLELFMTQHKGWGIRCKDDVPKGTFVCVFTGKIVNEdkmneddTMSGNEYLANLD 1206
Cdd:cd10523    84 YECNVSCKCNRMLCQNRVVQHGLQVRLQVFKTEKKGWGVRCLDDIDKGTFVCIYAGRVLSR-------ARSPTEPLPPKL 156
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1207 fiegveklkegyeseaycsDTEVESSKKTITMKTGPllknslykedsssgeepmevdtakdkvkvhdkplgerklpnkph 1286
Cdd:cd10523   157 -------------------ELPSENEVEVVTSWLIL-------------------------------------------- 173
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1287 etpkdtqkkiselrkndgqessgpkrcfaiksfqrrvkplesteaqkektktPKNTRGlfndEDACYIIDARQEGNLGRY 1366
Cdd:cd10523   174 ----------------------------------------------------SKKRKL----RENVCFLDASKEGNVGRF 197
                         330       340       350       360       370       380
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 113674054 1367 INHSCSPNLFVQNVFVDTHDLRFPWVAFFASKRIKAGTELTWDYNYEVGSVEGKVLLCCCGSLRCTGRL 1435
Cdd:cd10523   198 LNHSCCPNLFVQNVFVDTHDKNFPWVAFFTNRVVKAGTELTWDYSYDAGTSPEQEIPCLCGVNKCQKKI 266
SET_SETDB-like cd10538
SET domain (including pre-SET and post-SET domains) found in SET domain bifurcated 1 (SETDB1) ...
1051-1411 7.73e-59

SET domain (including pre-SET and post-SET domains) found in SET domain bifurcated 1 (SETDB1) and 2 (SETDB2), suppressor of variegation 3-9 homologs, SUV39H1 and SUV39H2, euchromatic histone-lysine N-methyltransferase EHMT1 and EHMT2, and similar proteins; The family includes SET domain bifurcated 1 (SETDB1) and 2 (SETDB2), suppressor of variegation 3-9 homologs SUV39H1 and SUV39H2, and euchromatic histone-lysine N-methyltransferases EHMT1 and EHMT2. SETDB1 (EC 2.1.1.43; also termed ERG-associated protein with SET domain (ESET), histone H3-K9 methyltransferase 4, H3-K9-HMTase 4, or lysine N-methyltransferase 1E (KMT1E)) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It mainly functions in euchromatin regions, thereby playing a central role in the silencing of euchromatic genes. SETDB2 (EC 2.1.1.43; also termed chronic lymphocytic leukemia deletion region gene 8 protein (CLLD8), or lysine N-methyltransferase 1F (KMT1F)) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It is involved in left-right axis specification in early development and mitosis. SUV39H1 (also termed histone H3-K9 methyltransferase 1, H3-K9-HMTase 1, lysine N-methyltransferase 1A, KMT1A, position-effect variegation 3-9 homolog, SUV39H, or Su(var)3-9 homolog 1) and SUV39H2 (also termed histone H3-K9 methyltransferase 2, H3-K9-HMTase 2, lysine N-methyltransferase 1B, KMT1B, or Su(var)3-9 homolog 2) both act as histone-lysine N-methyltransferases that specifically trimethylate 'Lys-9' of histone H3 (H3K9me3) using monomethylated H3 'Lys-9' as substrate. They mainly function in heterochromatin regions, thereby playing central roles in the establishment of constitutive heterochromatin at pericentric and telomere regions. EHMT1 (also termed Eu-HMTase1, G9a-like protein 1, GLP, GLP1, histone H3-K9 methyltransferase 5, H3-K9-HMTase 5, lysine N-methyltransferase 1D, or KMT1D) and EHMT2 (also termed Eu-HMTase2, HLA-B-associated transcript 8, histone H3-K9 methyltransferase 3, H3-K9-HMTase 3, lysine N-methyltransferase 1C, KMT1C, or protein G9a) both act as histone-lysine N-methyltransferases that specifically mono- and dimethylate 'Lys-9' of histone H3 (H3K9me1 and H3K9me2, respectively) in euchromatin. This family also includes the pre-SET domain, which is found in a number of histone methyltransferases (HMTase), N-terminal to the SET domain. The pre-SET domain is a zinc-binding motif that contains 9 conserved cysteines that coordinate three zinc ions. It is thought that this region plays a structural role in stabilizing SET domains. Most family members, except for Arabidopsis thaliana SUVH9, contain a post-SET domain which harbors a zinc-binding site.


Pssm-ID: 380936 [Multi-domain]  Cd Length: 217  Bit Score: 201.83  E-value: 7.73e-59
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1051 PNVTYTKDRVPARGVFINTSSDFMVGCDCTDGCRDrSKCACHKLTieatslctggpvDVSAGYTHKRL--PTSLPTGVYE 1128
Cdd:cd10538     1 PSFTYIKDNIVGKNVQPFSNIIDSVGCKCKDDCLD-SKCACAAES------------DGIFAYTKNGLlrLNNSPPPIFE 67
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1129 CNPLCRCDPrMCSNRLVQHGMQLRLELFMTQHKGWGIRCKDDVPKGTFVCVFTGKI---VNEDKMNEDDTMSGNEYLANL 1205
Cdd:cd10538    68 CNSKCSCDD-DCKNRVVQRGLQARLQVFRTSKKGWGVRSLEFIPKGSFVCEYVGEVittSEADRRGKIYDKSGGSYLFDL 146
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1206 DfiegveklkegyeseaycsdtevesskktitmktgpllknslykEDSSSGEEPMevdtakdkvkvhdkplgerklpnkp 1285
Cdd:cd10538   147 D--------------------------------------------EFSDSDGDGE------------------------- 157
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1286 hetpkdtqkkiselrkndgqessgpkrcfaiksfqrrvkplesteaqkektktpkntrglfndedaCYIIDARQEGNLGR 1365
Cdd:cd10538   158 ------------------------------------------------------------------ELCVDATFCGNVSR 171
                         330       340       350       360
                  ....*....|....*....|....*....|....*....|....*.
gi 113674054 1366 YINHSCSPNLFVQNVFVDTHDLRFPWVAFFASKRIKAGTELTWDYN 1411
Cdd:cd10538   172 FINHSCDPNLFPFNVVIDHDDLRYPRIALFATRDILPGEELTFDYG 217
SET_AtSUVH-like cd10545
SET domain found in Arabidopsis thaliana histone H3-K9 methyltransferases (SUVHs) and similar ...
1076-1412 7.71e-51

SET domain found in Arabidopsis thaliana histone H3-K9 methyltransferases (SUVHs) and similar proteins; Arabidopsis thaliana SUVH protein (also termed suppressor of variegation 3-9 homolog protein) is a histone-lysine N-methyltransferase that methylates 'Lys-9' of histone H3. H3 'Lys-9' methylation represents a specific tag for epigenetic transcriptional repression. Most family members, except for Arabidopsis thaliana SUVH9, contain a post-SET domain which harbors a zinc-binding site.


Pssm-ID: 380943 [Multi-domain]  Cd Length: 232  Bit Score: 179.52  E-value: 7.71e-51
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1076 GCDCTDGCRDRSK-CAChkltieaTSLCTGGPVDVSAGYTHKRLPTslptgVYECNPLCRCDPRmCSNRLVQHGMQLRLE 1154
Cdd:cd10545    23 GCDCKNRCTDGASdCAC-------VKKNGGEIPYNFNGRLIRAKPA-----IYECGPLCKCPPS-CYNRVTQKGLRYRLE 89
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1155 LFMTQHKGWGIRCKDDVPKGTFVCVFTGKIVNEDKMNEddTMSGNEYLANLDFIEGVEKLKEGYESEAYCSDtevesskk 1234
Cdd:cd10545    90 VFKTAERGWGVRSWDSIPAGSFICEYVGELLDTSEADT--RSGNDDYLFDIDNRQTNRGWDGGQRLDVGMSD-------- 159
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1235 titmktgpllknslykedsssGEEPMEVDTakdkvkvhdkplgerklpnkphetpkdtqkkiselrkndgqESSGpkrcf 1314
Cdd:cd10545   160 ---------------------GERSSAEDE-----------------------------------------ESSE----- 172
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1315 aiksfqrrvkplesteaqkektktpkntrglfndedacYIIDARQEGNLGRYINHSCSPNLFVQNVFVDTHDLRFPWVAF 1394
Cdd:cd10545   173 --------------------------------------FTIDAGSFGNVARFINHSCSPNLFVQCVLYDHNDLRLPRVML 214
                         330
                  ....*....|....*...
gi 113674054 1395 FASKRIKAGTELTWDYNY 1412
Cdd:cd10545   215 FAADNIPPLQELTYDYGY 232
SET_SUV39H cd10542
SET domain (including pre-SET and post-SET domains) found in suppressor of variegation 3-9 ...
1051-1435 6.63e-46

SET domain (including pre-SET and post-SET domains) found in suppressor of variegation 3-9 homologs, SUV39H1, SUV39H2 and similar proteins; This family includes SUV39H1 (also termed histone H3-K9 methyltransferase 1, H3-K9-HMTase 1, lysine N-methyltransferase 1A, KMT1A, position-effect variegation 3-9 homolog, SUV39H, or Su(var)3-9 homolog 1) and SUV39H2 (also termed histone H3-K9 methyltransferase 2, H3-K9-HMTase 2, lysine N-methyltransferase 1B, KMT1B, or Su(var)3-9 homolog 2), both of which act as histone-lysine N-methyltransferases that specifically trimethylate 'Lys-9' of histone H3 (H3K9me3) using monomethylated H3 'Lys-9' as substrate. They mainly function in heterochromatin regions, thereby playing central roles in the establishment of constitutive heterochromatin at pericentric and telomere regions. Also included are Schizosaccharomyces pombe H3K9 methyltransferase Clr4 (SUV39H homolog) and Neurospora crassa DIM-5, both of which also methylate 'Lys-9' of histone H3.


Pssm-ID: 380940 [Multi-domain]  Cd Length: 245  Bit Score: 165.93  E-value: 6.63e-46
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1051 PNVTYTKDRVPARGVFINTssDFMVGCDCTDGCRDRSKCACHKLTieatslctggpvDVSAGYT---HKRLPTSLPtgVY 1127
Cdd:cd10542     1 PNFQYINDYIPGDGVKIPE--DFLVGCECTEDCHNNNPTCCPAES------------GVKFAYDkqgRLRLPPGTP--IY 64
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1128 ECNPLCRCDPRmCSNRLVQHGMQLRLELFMTQH-KGWGIRCKDDVPKGTFVCVFTGKIVNEDKMNEddtmSGNEYLANld 1206
Cdd:cd10542    65 ECNSRCKCGPD-CPNRVVQRGRKVPLCIFRTSNgRGWGVKTLEDIKKGTFVMEYVGEIITSEEAER----RGKIYDAN-- 137
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1207 fiegveklkegyeSEAYcsdtevesskktitmktgpllknsLYKEDsssgeepmevdtakdkvkvhdkplgerklpnkph 1286
Cdd:cd10542   138 -------------GRTY------------------------LFDLD---------------------------------- 146
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1287 etpkdtqkkiselrkndgqessgpkrcfaiksfqrrvkplesteaqkektktpkntrglFNDEDACYIIDARQEGNLGRY 1366
Cdd:cd10542   147 -----------------------------------------------------------YNDDDCEYTVDAAYYGNISHF 167
                         330       340       350       360       370       380       390
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 113674054 1367 INHSCSPNLFVQNVFVDTHDLRFPWVAFFASKRIKAGTELTWDYNY---------EVGSVEGKVLLCCCGSLRCTGRL 1435
Cdd:cd10542   168 INHSCDPNLAVYAVWINHLDPRLPRIAFFAKRDIKAGEELTFDYLMtgtggssesTIPKPKDVRVPCLCGSKNCRKYL 245
SET_EHMT cd10543
SET domain (including pre-SET and post-SET domains) found in euchromatic histone-lysine ...
1051-1431 1.61e-39

SET domain (including pre-SET and post-SET domains) found in euchromatic histone-lysine N-methyltransferase EHMT1, EHMT2 and similar proteins; This family includes EHMT1 (also termed Eu-HMTase1, G9a-like protein 1, GLP, GLP1, histone H3-K9 methyltransferase 5, H3-K9-HMTase 5, lysine N-methyltransferase 1D, or KMT1D) and EHMT2 (also termed Eu-HMTase2, HLA-B-associated transcript 8, histone H3-K9 methyltransferase 3, H3-K9-HMTase 3, lysine N-methyltransferase 1C, KMT1C, or protein G9a), both of which act as histone-lysine N-methyltransferases that specifically mono- and dimethylate 'Lys-9' of histone H3 (H3K9me1 and H3K9me2, respectively) in euchromatin.


Pssm-ID: 380941 [Multi-domain]  Cd Length: 231  Bit Score: 147.10  E-value: 1.61e-39
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1051 PNVTYTKDRVPARGVFINTSSDFMVGCDCTDGCRDRsKCACHKLTIEATslctggpvdvsagYTHK-RLPTSL----PTG 1125
Cdd:cd10543     1 PDFLYVTENCETSPLNIDRNITSLQTCSCRDDCSSD-NCVCGRLSVRCW-------------YDKEgRLLPDFnkldPPL 66
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1126 VYECNPLCRCDpRMCSNRLVQHGMQLRLELFMTQHKGWGIRCKDDVPKGTFVCVFTGKIVN--EDKMNEDDTmsgneYLA 1203
Cdd:cd10543    67 IFECNRACSCW-RNCRNRVVQNGIRYRLQLFRTRGMGWGVRALQDIPKGTFVCEYIGELISdsEADSREDDS-----YLF 140
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1204 NLDfiegveklkegyeseaycsdtevesskktitmktgpllknslykedsssgeepmevdtakdkvkvhdkplgerklpN 1283
Cdd:cd10543   141 DLD----------------------------------------------------------------------------N 144
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1284 KPHETpkdtqkkiselrkndgqessgpkrcfaiksfqrrvkplesteaqkektktpkntrglfndedacYIIDARQEGNL 1363
Cdd:cd10543   145 KDGET----------------------------------------------------------------YCIDARRYGNI 160
                         330       340       350       360       370       380
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 113674054 1364 GRYINHSCSPNLFVQNVFVDTHDLRFPWVAFFASKRIKAGTELTWDYNYEVGSVEGKVLLCCCGSLRC 1431
Cdd:cd10543   161 SRFINHLCEPNLIPVRVFVEHQDLRFPRIAFFASRDIKAGEELGFDYGEKFWRIKGKYFTCRCGSPKC 228
SET_SETMAR cd10544
SET domain (including pre-SET and post-SET domains) found in SET domain and mariner ...
1051-1435 1.50e-31

SET domain (including pre-SET and post-SET domains) found in SET domain and mariner transposase fusion protein (SETMAR) and similar proteins; SETMAR (also termed metnase) is a DNA-binding protein that is indirectly recruited to sites of DNA damage through protein-protein interactions. It has a sequence-specific DNA-binding activity recognizing the 19-mer core of the 5'-terminal inverted repeats (TIRs) of the Hsmar1 element and displays a DNA nicking and end joining activity. SETMAR also acts as a histone-lysine N-methyltransferase that methylates 'Lys-4' and 'Lys-36' of histone H3. It specifically mediates dimethylation of H3 'Lys-36' at sites of DNA double-strand break and may recruit proteins required for efficient DSB repair through non-homologous end-joining.


Pssm-ID: 380942 [Multi-domain]  Cd Length: 254  Bit Score: 124.72  E-value: 1.50e-31
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1051 PNVTYTKDRVPARGVFINTSSDFMVGCDCTDGCRDRSKCAChkltieatsLCTGGPVdvsagYTHKRL----PTSLPTGV 1126
Cdd:cd10544     1 PDFQYTPENVPGPGADTDPNEITFPGCDCKTSSCEPETCSC---------LRKYGPN-----YDDDGClldfDGKYSGPV 66
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1127 YECNPLCRCDpRMCSNRLVQHGMQLRLELFMTQHKGWGIRCKDDVPKGTFVCVFTGKIVNedkmneddtmsgneylanld 1206
Cdd:cd10544    67 FECNSMCKCS-ESCQNRVVQNGLQFKLQVFKTPKKGWGLRTLEFIPKGRFVCEYAGEVIG-------------------- 125
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1207 fiegveklkegyESEAYcsdtevesskktitmktgpllknslykedsssgeepmevdtakdkvkvhdkplgerklpnkph 1286
Cdd:cd10544   126 ------------FEEAR--------------------------------------------------------------- 130
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1287 etpkdtqKKISELRKNDGQessgpkRCFAIKSFQRRVKPLESteaqkektktpkntrglfndedacyIIDARQEGNLGRY 1366
Cdd:cd10544   131 -------RRTKSQTKGDMN------YIIVLREHLSSGKVLET-------------------------FVDPTYIGNIGRF 172
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1367 INHSCSPNLFVQNVFVDThdlRFPWVAFFASKRIKAGTELTWDY----NYEVGSVEGKVL-------LCCCGSLRCTGRL 1435
Cdd:cd10544   173 LNHSCEPNLFMVPVRVDS---MVPKLALFAARDIVAGEELSFDYsgefSNSVESVTLARQdesksrkPCLCGAENCRGFL 249
SET_EHMT1 cd10535
SET domain (including pre-SET and post-SET domains) found in euchromatic histone-lysine ...
1077-1431 2.82e-31

SET domain (including pre-SET and post-SET domains) found in euchromatic histone-lysine N-methyltransferase 1 (EHMT1) and similar proteins; EHMT1 (also termed Eu-HMTase1, G9a-like protein 1, GLP, GLP1, histone H3-K9 methyltransferase 5, H3-K9-HMTase 5, or lysine N-methyltransferase 1D (KMT1D)) acts as a histone-lysine N-methyltransferase that specifically mono- and dimethylates 'Lys-9' of histone H3 (H3K9me1 and H3K9me2, respectively) in euchromatin.


Pssm-ID: 380933 [Multi-domain]  Cd Length: 231  Bit Score: 123.12  E-value: 2.82e-31
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1077 CDCTDGCRDrSKCACHKLTIEATSLCTGgpvdvsagythKRLP---TSLPTGVYECNPLCRCdPRMCSNRLVQHGMQLRL 1153
Cdd:cd10535    27 CVCIDDCSS-SNCMCGQLSMRCWYDKDG-----------RLLPefnMAEPPLIFECNHACSC-WRNCRNRVVQNGLRARL 93
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1154 ELFMTQHKGWGIRCKDDVPKGTFVCVFTGKIVN--EDKMNEDDTmsgneYLANLDFIEGveklkegyesEAYCsdteves 1231
Cdd:cd10535    94 QLYRTRDMGWGVRSLQDIPPGTFVCEYVGELISdsEADVREEDS-----YLFDLDNKDG----------EVYC------- 151
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1232 skktitmktgpllknslykedsssgeepmevdtakdkvkvhdkplgerklpnkphetpkdtqkkiselrkndgqessgpk 1311
Cdd:cd10535       --------------------------------------------------------------------------------
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1312 rcfaiksfqrrvkplesteaqkektktpkntrglfndedacyiIDARQEGNLGRYINHSCSPNLFVQNVFVDTHDLRFPW 1391
Cdd:cd10535   152 -------------------------------------------IDARFYGNVSRFINHHCEPNLVPVRVFMAHQDLRFPR 188
                         330       340       350       360
                  ....*....|....*....|....*....|....*....|
gi 113674054 1392 VAFFASKRIKAGTELTWDYNYEVGSVEGKVLLCCCGSLRC 1431
Cdd:cd10535   189 IAFFSTRLIEAGEQLGFDYGERFWDIKGKLFSCRCGSPKC 228
SET_EHMT2 cd10533
SET domain (including pre-SET and post-SET domains) found in euchromatic histone-lysine ...
1077-1431 4.57e-28

SET domain (including pre-SET and post-SET domains) found in euchromatic histone-lysine N-methyltransferase 2 (EHMT2) and similar proteins; EHMT2 (also termed Eu-HMTase2, HLA-B-associated transcript 8, histone H3-K9 methyltransferase 3, H3-K9-HMTase 3, lysine N-methyltransferase 1C (KMT1C), or protein G9a) acts as a histone-lysine N-methyltransferase that specifically mono- and dimethylates 'Lys-9' of histone H3 (H3K9me1 and H3K9me2, respectively) in euchromatin.


Pssm-ID: 380931 [Multi-domain]  Cd Length: 239  Bit Score: 114.34  E-value: 4.57e-28
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1077 CDCTDGCRDrSKCACHKLTIEATSlctggpvdVSAGYTHKRLPTSLPTGVYECNPLCRCDpRMCSNRLVQHGMQLRLELF 1156
Cdd:cd10533    27 CTCVDDCSS-SNCLCGQLSIRCWY--------DKDGRLLQEFNKIEPPLIFECNQACSCW-RNCKNRVVQSGIKVRLQLY 96
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1157 MTQHKGWGIRCKDDVPKGTFVCVFTGKIVN--EDKMNEDDTmsgneYLANLDFIEGveklkegyesEAYCsdtevesskk 1234
Cdd:cd10533    97 RTAKMGWGVRALQTIPQGTFICEYVGELISdaEADVREDDS-----YLFDLDNKDG----------EVYC---------- 151
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1235 titmktgpllknslykedsssgeepmevdtakdkvkvhdkplgerklpnkphetpkdtqkkiselrkndgqessgpkrcf 1314
Cdd:cd10533       --------------------------------------------------------------------------------
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1315 aiksfqrrvkplesteaqkektktpkntrglfndedacyiIDARQEGNLGRYINHSCSPNLFVQNVFVDTHDLRFPWVAF 1394
Cdd:cd10533   152 ----------------------------------------IDARYYGNISRFINHLCDPNIIPVRVFMLHQDLRFPRIAF 191
                         330       340       350
                  ....*....|....*....|....*....|....*..
gi 113674054 1395 FASKRIKAGTELTWDYNYEVGSVEGKVLLCCCGSLRC 1431
Cdd:cd10533   192 FSSRDIRTGEELGFDYGDRFWDIKSKYFTCQCGSEKC 228
Tudor_SETDB1_rpt1 cd20382
first Tudor domain found in SET domain bifurcated 1 (SETDB1) and similar proteins; SETDB1, ...
636-718 4.50e-27

first Tudor domain found in SET domain bifurcated 1 (SETDB1) and similar proteins; SETDB1, also called ERG-associated protein with SET domain (ESET), histone H3-K9 methyltransferase 4, H3-K9-HMTase 4, or lysine N-methyltransferase 1E (KMT1E), acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It mainly functions in euchromatin regions, thereby playing a central role in the silencing of euchromatic genes. It contains two Tudor domains. This model corresponds to the first one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.


Pssm-ID: 410453  Cd Length: 82  Bit Score: 105.83  E-value: 4.50e-27
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  636 IGCRVVASAKSENGKSLYNAGVMVELPERKNRMRFLVFFDDGLATYLALPDLYFVCKQTKKVWREIkDESSRKQVKDYLQ 715
Cdd:cd20382     1 VGSRVVAQYKDEGNQVWLYAGIVAEPPKVKNRYRYLIFFDDGYAQYVTPSDVYLVCQQSKKVWEDI-HEDSRDFIREYLE 79

                  ...
gi 113674054  716 VYP 718
Cdd:cd20382    80 AYP 82
Pre-SET pfam05033
Pre-SET motif; This protein motif is a zinc binding motif. It contains 9 conserved cysteines ...
1030-1143 6.87e-24

Pre-SET motif; This protein motif is a zinc binding motif. It contains 9 conserved cysteines that coordinate three zinc ions. It is thought that this region plays a structural role in stabilising SET domains.


Pssm-ID: 461530 [Multi-domain]  Cd Length: 99  Bit Score: 97.49  E-value: 6.87e-24
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  1030 ISEGKEVMPVPCVNEVDNTLAP-NVTYTKDRVPARGVFintsSDFMVGCDCTDgCrDRSKCACHKLtieatslcTGGpvD 1108
Cdd:pfam05033    1 ISKGKENVPIPVVNEVDDEPPPpDFTYITSYIYPKEFL----LIIPQGCDCGD-C-SSEKCSCAQL--------NGG--E 64
                           90       100       110
                   ....*....|....*....|....*....|....*.
gi 113674054  1109 VSAGYTHK-RLPTSLPTGVYECNPLCRCdPRMCSNR 1143
Cdd:pfam05033   65 FRFPYDKDgLLVPESKPPIYECNPLCGC-PPSCPNR 99
SET_SUV39H1 cd10525
SET domain (including pre-SET and post-SET domains) found in suppressor of variegation 3-9 ...
1052-1418 5.14e-23

SET domain (including pre-SET and post-SET domains) found in suppressor of variegation 3-9 homolog 1 (SUV39H1) and similar proteins; SUV39H1 (EC 2.1.1.43; also termed histone H3-K9 methyltransferase 1, H3-K9-HMTase 1, lysine N-methyltransferase 1A (KMT1A), position-effect variegation 3-9 homolog (SUV39H), or Su(var)3-9 homolog 1) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3) using monomethylated H3 'Lys-9' as substrate. It mainly functions in heterochromatin regions, thereby playing a central role in the establishment of constitutive heterochromatin at pericentric and telomere regions.


Pssm-ID: 380923 [Multi-domain]  Cd Length: 255  Bit Score: 99.97  E-value: 5.14e-23
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1052 NVTYTKDRVPARGVFINTSSdfmVGCDCTDGCRDRSKCACHKLTIEATSLCTGGPVDVSAGythkrlptsLPtgVYECNP 1131
Cdd:cd10525     2 DFVYINEYKVGEGVTLNQVA---VGCECQDCLSQPVGGCCPGASKHRFAYNEQGQVKVRPG---------LP--IYECNS 67
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1132 LCRCDPRmCSNRLVQHGMQLRLELFMTQH-KGWGIRCKDDVPKGTFVCVFTGKIVNEDKMNEDDTM---SGNEYLANLDF 1207
Cdd:cd10525    68 RCRCGPD-CPNRVVQKGIQYDLCIFRTDNgRGWGVRTLEKIRKNSFVMEYVGEIITSEEAERRGQIydrQGATYLFDLDY 146
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1208 IEGVeklkegyeseaycsdtevesskktitmktgpllknslykedsssgeepmevdtakdkvkvhdkplgerklpnkphe 1287
Cdd:cd10525   147 VEDV---------------------------------------------------------------------------- 150
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1288 tpkdtqkkiselrkndgqessgpkrcfaiksfqrrvkplesteaqkektktpkntrglfndedacYIIDARQEGNLGRYI 1367
Cdd:cd10525   151 -----------------------------------------------------------------YTVDAAYYGNISHFV 165
                         330       340       350       360       370
                  ....*....|....*....|....*....|....*....|....*....|.
gi 113674054 1368 NHSCSPNLFVQNVFVDTHDLRFPWVAFFASKRIKAGTELTWDYNYEVGSVE 1418
Cdd:cd10525   166 NHSCDPNLQVYNVFIDNLDERLPRIALFATRTIRAGEELTFDYNMQVDPVD 216
SET_SUV39H_Clr4-like cd20073
SET domain (including pre-SET and post-SET domains) found in Schizosaccharomyces pombe H3K9 ...
1071-1435 1.29e-21

SET domain (including pre-SET and post-SET domains) found in Schizosaccharomyces pombe H3K9 methyltransferase Clr4 and similar proteins; This subfamily contains fission yeast Schizosaccharomyces pombe H3K9 methyltransferase Clr4 (also known as Suv39h), the sole homolog of the mammalian SUV39H1 and SUV39H2 enzymes, which has a critical role in preventing aberrant heterochromatin formation. It is known to di- and tri-methylate Lys-9 of histone H3, a central heterochromatic histone modification, with its specificity profile most similar to that of the human SUV39H2 homolog.


Pssm-ID: 380999 [Multi-domain]  Cd Length: 259  Bit Score: 96.10  E-value: 1.29e-21
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1071 SDFMVGCDCT--DGC--RDRSKCACHKLTIEAtslctggpvdvSAGY-THKRLPTSLPTGVYECNPLCRCDPRmCSNRLV 1145
Cdd:cd20073    20 PLFISGCSCSklGGCdlNNPGSCQCLEDSNEK-----------SFAYdEYGRVRANTGSIIYECNENCDCGIN-CPNRVV 87
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1146 QHGMQLRLELFMTQHKGWGIRCKDDVPKGTFVCVFTGKIVNE------DKMNEDDtmsGNEYLANLDFiegveklkegye 1219
Cdd:cd20073    88 QRGRKLPLEIFKTKHKGWGLRCPRFIKAGTFIGVYLGEVITQseaeirGKKYDNV---GVTYLFDLDL------------ 152
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1220 seaycsdtevesskktitmktgpllknslykedsssgeepmevdtakdkvkvhdkplgerklpnkphetpkdtqkkisel 1299
Cdd:cd20073       --------------------------------------------------------------------------------
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1300 rkndgqessgpkrcfaiksfqrrvkplesteaqkektktpkntrglFNDE-DACYIIDARQEGNLGRYINHSCSPNLFVQ 1378
Cdd:cd20073   153 ----------------------------------------------FEDQvDEYYTVDAQYCGDVTRFINHSCDPNLAIY 186
                         330       340       350       360       370       380       390
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 113674054 1379 NVFVDTHDLRFPWVAFFASKRIKAGTELTWDY-----NYEVGSVEGKV----------LLCCCGSLRCTGRL 1435
Cdd:cd20073   187 SVLRDKSDSKIYDLAFFAIKDIPALEELTFDYsgrnnFDQLGFIGNRSnskyinlknkRPCYCGSANCRGWL 258
PreSET smart00468
N-terminal to some SET domains; A Cys-rich putative Zn2+-binding domain that occurs N-terminal ...
1027-1134 1.84e-21

N-terminal to some SET domains; A Cys-rich putative Zn2+-binding domain that occurs N-terminal to some SET domains. Function is unknown. Unpublished.


Pssm-ID: 128744 [Multi-domain]  Cd Length: 98  Bit Score: 90.55  E-value: 1.84e-21
                            10        20        30        40        50        60        70        80
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   1027 LPDISEGKEVMPVPCVNEVDNTLAP-NVTYTKDRVPARGVFINTSSDFMVGCDCTDGCRDRSKCACHKLTieatslCTGG 1105
Cdd:smart00468    1 CLDISNGKENVPVPLVNEVDEDPPPpDFEYISEYIYGQGVPIDRSPSPLVGCSCSGDCSSSNKCECARKN------GGEF 74
                            90       100
                    ....*....|....*....|....*....
gi 113674054   1106 PVDVSAGYTHKRlptslPTGVYECNPLCR 1134
Cdd:smart00468   75 AYELNGGLRLKR-----KPLIYECNSRCS 98
SET_SETD2 cd19172
SET domain (including post-SET domain) found in SET domain-containing protein 2 (SETD2) and ...
1317-1433 3.03e-21

SET domain (including post-SET domain) found in SET domain-containing protein 2 (SETD2) and similar proteins; SETD2 (also termed HIF-1, huntingtin yeast partner B, huntingtin-interacting protein 1 (HIP-1), huntingtin-interacting protein B, lysine N-methyltransferase 3A or protein-lysine N-methyltransferase SETD2) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-36' of histone H3 (H3K36me3) using dimethylated 'Lys-36' (H3K36me2) as substrate. It has been shown that methylation is a posttranslational modification of dynamic microtubules and that SETD2 methylates alpha-tubulin at lysine 40, the same lysine that is marked by acetylation on microtubules. Methylation of microtubules occurs during mitosis and cytokinesis and can be ablated by SETD2 deletion, which causes mitotic spindle and cytokinesis defects, micronuclei, and polyploidy.


Pssm-ID: 380949 [Multi-domain]  Cd Length: 142  Bit Score: 91.49  E-value: 3.03e-21
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1317 KSFQRRVKplestEAQKEKTK-----TpkntrgLFNDEdacyIIDARQEGNLGRYINHSCSPNLFVQNVFVDtHDLRfpw 1391
Cdd:cd19172    39 KEFKRRMK-----EYAREGNRhyyfmA------LKSDE----IIDATKKGNLSRFINHSCEPNCETQKWTVN-GELR--- 99
                          90       100       110       120
                  ....*....|....*....|....*....|....*....|..
gi 113674054 1392 VAFFASKRIKAGTELTWDYNYEVGSVEGKVllCCCGSLRCTG 1433
Cdd:cd19172   100 VGFFAKRDIPAGEELTFDYQFERYGKEAQK--CYCGSPNCRG 139
SET smart00317
SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain; Putative methyl transferase, based on ...
1348-1413 7.76e-21

SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain; Putative methyl transferase, based on outlier plant homologues


Pssm-ID: 214614 [Multi-domain]  Cd Length: 124  Bit Score: 89.70  E-value: 7.76e-21
                            10        20        30        40        50        60
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 113674054   1348 DEDACYIIDARQEGNLGRYINHSCSPNLFVQNVFVDTHDlrfpWVAFFASKRIKAGTELTWDYNYE 1413
Cdd:smart00317   59 DIDSDLCIDARRKGNLARFINHSCEPNCELLFVEVNGDD----RIVIFALRDIKPGEELTIDYGSD 120
Tudor_5 pfam18359
Histone methyltransferase Tudor domain 1; This is the first TUDOR domain found in SETDB1 ...
634-686 3.35e-20

Histone methyltransferase Tudor domain 1; This is the first Tudor domain found in SETDB1 enzymes (EC:2.1.1.43) in Homo sapiens, also known as Eggless in Drosophila. In Drosophila, SetdB1 (Egg) is important for oogenesis and the silencing of chromosome 4. SET domain, bifurcated 1 (SETDB1) is a histone methyltransferase (HMT) that methylates lysine 9 on histone H3 (H3K9). The enzymatic activity of SETDB1, in association with MBD1-containing chromatin-associated factor 1 (MCAF1), converts H3K9me2 to H3K9me3 and represses subsequent transcription. SETDB1 is amplified in cancers such as melanoma and lung cancer, and increased expression of SETDB1 promotes tumorigenesis in a zebrafish melanoma model. In addition, SETDB1 is required for endogenous retrovirus silencing during early embryogenesis, inhibition of adipocyte differentiation, and differentiation of mesenchymal cells into osteoblasts. The tandem Tudor domains in the N-terminal region are involved in protein-protein interactions. The second Tudor domain is pfam18385.


Pssm-ID: 465723  Cd Length: 53  Bit Score: 85.34  E-value: 3.35e-20
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|...
gi 113674054   634 LYIGCRVVASAKSENGKSLYNAGVMVELPERKNRMRFLVFFDDGLATYLALPD 686
Cdd:pfam18359    1 LPVGTRVIAKYKDSNGKSAYYAGVIAEPPKDLNRYRYLVFFDDGYAQYVVHKD 53
SET_SUV39H_DIM5-like cd19473
SET domain (including pre-SET domain) found in Neurospora crassa (DIM-5) and similar proteins; ...
1051-1435 4.15e-20

SET domain (including pre-SET domain) found in Neurospora crassa (DIM-5) and similar proteins; This subfamily contains Neurospora crassa DIM-5 (also termed H3-K9-HMTase dim-5, or HKMT), which functions as a histone-lysine N-methyltransferase that specifically trimethylates histone H3 to form H3K9me3.


Pssm-ID: 380996 [Multi-domain]  Cd Length: 274  Bit Score: 91.99  E-value: 4.15e-20
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1051 PNVTYTKDRVPARGVfINTSSDFMVGCDCTDG--CRDRskcACHKLTIEATSLCTGGPVDVSAGYTH---------KRLP 1119
Cdd:cd19473     1 PDFRFIEKSILGEGV-ELADEEFRSGCECTDDedCMYS---GCLCLQDVDPDDDRDPGKKKNAYHSSgakkgclrgHMLN 76
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1120 TSLPtgVYECNPLCRCDPRmCSNRLVQHGMQLRLELFMTQ-HKGWGIRCKDDVPKGTFVCVFTGKIVNEDKMN---EDDT 1195
Cdd:cd19473    77 SRLP--IYECHEGCACSDD-CPNRVVERGRKVPLQIFRTSdGRGWGVRSTVDIKRGQFVDCYVGEIITPEEAQrrrDAAT 153
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1196 MSGNE--YLANLD-FiegveklkegyeseaycsdTEVESSKktitmktgPLLKNslykedsssgeEPMEVdtakdkvkvh 1272
Cdd:cd19473   154 IAQRKdvYLFALDkF-------------------SDPDSLD--------PRLRG-----------DPYEI---------- 185
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1273 dkplgerklpnkphetpkdtqkkiselrknDGQESSGPKrcfaiksfqrrvkplesteaqkektktpkntrglfndedac 1352
Cdd:cd19473   186 ------------------------------DGEFMSGPT----------------------------------------- 194
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1353 yiidarqegnlgRYINHSCSPNLFVQNVFVDTHDLRFPWVAFFASKRIKAGTELTWDY-------NYEVGSVEGKVLL-- 1423
Cdd:cd19473   195 ------------RFINHSCDPNLRIFARVGDHADKHIHDLAFFAIKDIPRGTELTFDYvdgvtglDDDAGDEEKEKEMtk 262
                         410
                  ....*....|..
gi 113674054 1424 CCCGSLRCTGRL 1435
Cdd:cd19473   263 CLCGSPKCRGYL 274
SET_SETD2-like cd10531
SET domain (including post-SET domain) found in SET domain-containing protein 2 (SETD2), ...
1353-1432 1.19e-19

SET domain (including post-SET domain) found in SET domain-containing protein 2 (SETD2), nuclear SETD2 (NSD2), ASH1-like protein (ASH1L) and similar proteins; This family includes SET domain-containing protein 2 (SETD2), nuclear SETD2 (NSD2) and ASH1-like protein (ASH1L), which function as histone-lysine N-methyltransferases. SETD2 specifically trimethylates 'Lys-36' of histone H3 (H3K36me3) using dimethylated 'Lys-36' (H3K36me2) as substrate. NSD2 shows histone H3 'Lys-27' (H3K27me) methyltransferase activity. ASH1L specifically methylates 'Lys-36' of histone H3 (H3K36me). The family also includes Arabidopsis thaliana ASH1-related protein 3 (ASHR3) and similar proteins.


Pssm-ID: 380929  Cd Length: 136  Bit Score: 86.54  E-value: 1.19e-19
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1353 YIIDARQEGNLGRYINHSCSPNLFVQNVFVDTHdlrfPWVAFFASKRIKAGTELTWDYNYEVGSVEGKVllCCCGSLRCT 1432
Cdd:cd10531    63 VVIDATRKGNLSRFINHSCEPNCETQKWIVNGE----YRIGIFALRDIPAGEELTFDYNFVNYNEAKQV--CLCGAQNCR 136
HMT_MBD cd01395
Methyl-CpG binding domains (MBD) present in putative histone methyltransferases (HMT) such as ...
943-1003 1.22e-19

Methyl-CpG binding domains (MBD) present in putative histone methyltransferases (HMT) such as CLLD8 and SETDB1 proteins; CLLD8 contains a MBD, a PreSET and a bifurcated SET domain, suggesting that CLLD8 might be associated with methylation-mediated transcriptional repression. SETDB1 and other proteins in this group have a similar domain architecture. SETDB1 is a novel KAP-1-associated histone H3, lysine 9-specific methyltransferase that contributes to HP1-mediated silencing of euchromatic genes by KRAB zinc-finger proteins.


Pssm-ID: 238689  Cd Length: 60  Bit Score: 83.97  E-value: 1.22e-19
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 113674054  943 PLLIPLLFKFRRMTARRRIdGKLFFHIFYRSPCGRSLCDMQEVQDYLFETrCDFLFLEMFC 1003
Cdd:cd01395     1 PLHTPLLCGFQRMKYRARV-GKVKKHVIYKAPCGRSLRNMSEVHRYLRET-CSFLTVDNFS 59
SET_SUV39H2 cd10532
SET domain (including pre-SET and post-SET domains) found in suppressor of variegation 3-9 ...
1055-1435 7.01e-19

SET domain (including pre-SET and post-SET domains) found in suppressor of variegation 3-9 homolog 2 (SUV39H2) and similar proteins; SUV39H2 (EC 2.1.1.43; also termed histone H3-K9 methyltransferase 2, H3-K9-HMTase 2, lysine N-methyltransferase 1B (KMT1B), or Su(var)3-9 homolog 2) acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3) using monomethylated H3 'Lys-9' as substrate. It mainly functions in heterochromatin regions, thereby playing a central role in the establishment of constitutive heterochromatin at pericentric and telomere regions.


Pssm-ID: 380930 [Multi-domain]  Cd Length: 243  Bit Score: 87.64  E-value: 7.01e-19
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1055 YTKDRVPARGvfINTSSDFMVGCDCTDgCRdRSKCachkltieatslCTGGPVDVSAGYTHKRLPTSLPTGVYECNPLCR 1134
Cdd:cd10532     5 YINEYKPAPG--INLDNEATVGCDCSD-CF-FGKC------------CPAEAGVLFAYNEHGQLKIPPGTPIYECNSRCK 68
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1135 CDPRmCSNRLVQHGMQLRLELFMTQH-KGWGIRCKDDVPKGTFVCVFTGKIVNEDKMNEDDTM---SGNEYLANLDfieg 1210
Cdd:cd10532    69 CGPD-CPNRVVQKGTQYSLCIFRTSNgRGWGVKTLQKIKKNSFVMEYVGEVITSEEAERRGQFydsKGITYLFDLD---- 143
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1211 veklkegYESeaycsdtevesskktitmktgpllknslykedsssgeepmevdtakdkvkvhdkplgerklpnkphetpk 1290
Cdd:cd10532   144 -------YES---------------------------------------------------------------------- 146
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1291 dtqkkiselrkndgqessgpkrcfaiksfqrrvkplesteaqkektktpkntrglfnDEdacYIIDARQEGNLGRYINHS 1370
Cdd:cd10532   147 ---------------------------------------------------------DE---FTVDAARYGNVSHFVNHS 166
                         330       340       350       360       370       380       390
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 113674054 1371 CSPNLFVQNVFVDTHDLRFPWVAFFASKRIKAGTELTWDY-----------NYEVGSVEGKVLLCC-CGSLRCTGRL 1435
Cdd:cd10532   167 CDPNLQVFNVFIDNLDTRLPRIALFSTRTIKAGEELTFDYqmkgsgdlssdSIDNSPAKKRVRTVCkCGAVTCRGYL 243
SET_ASH1L cd19174
SET domain (including post-SET domain) found in ASH1-like protein (ASH1L) and similar proteins; ...
1354-1433 7.71e-19

SET domain (including post-SET domain) found in ASH1-like protein (ASH1L) and similar proteins; ASH1L (EC 2.1.1.43; also termed absent small and homeotic disks protein 1 homolog, KMT2H, or lysine N-methyltransferase 2H) acts as a histone-lysine N-methyltransferase that specifically methylates 'Lys-36' of histone H3 (H3K36me). It plays important roles in development; heterozygous mutation of ASH1L is associated with severe intellectual disability (ID) and multiple congenital anomaly (MCA).


Pssm-ID: 380951 [Multi-domain]  Cd Length: 141  Bit Score: 84.27  E-value: 7.71e-19
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1354 IIDARQEGNLGRYINHSCSPNLFVQNVFVDTHdlrfPWVAFFASKRIKAGTELTWDYNYEVGSVEgKVLLCCCGSLRCTG 1433
Cdd:cd19174    63 VIDGYRMGNEARFVNHSCDPNCEMQKWSVNGV----YRIGLFALKDIPAGEELTYDYNFHSFNVE-KQQPCKCGSPNCRG 137
SET_SETD1-like cd10518
SET domain (including post-SET domain) found in SET domain-containing proteins (SETD1A/SETD1B), ...
1348-1431 3.42e-18

SET domain (including post-SET domain) found in SET domain-containing proteins (SETD1A/SETD1B), histone-lysine N-methyltransferases (KMT2A/KMT2B/KMT2C/KMT2D) and similar proteins; This family includes SET domain-containing protein 1A (SETD1A), 1B (SETD1B), as well as histone-lysine N-methyltransferase 2A (KMT2A), 2B (KMT2B), 2C (KMT2C), 2D (KMT2D). These proteins are histone-lysine N-methyltransferases (EC 2.1.1.43) that specifically methylate 'Lys-4' of histone H3 (H3K4me).


Pssm-ID: 380916  Cd Length: 150  Bit Score: 82.64  E-value: 3.42e-18
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1348 DEDacYIIDARQEGNLGRYINHSCSPNLFVQNVFVDTHDLrfpwVAFFASKRIKAGTELTWDYNYEvgSVEGKVLLCCCG 1427
Cdd:cd10518    74 DED--LVIDATKKGNIARFINHSCDPNCYAKIITVDGEKH----IVIFAKRDIAPGEELTYDYKFP--IEDEEKIPCLCG 145

                  ....
gi 113674054 1428 SLRC 1431
Cdd:cd10518   146 APNC 149
SET_NSD cd19173
SET domain (including post-SET domain) found in nuclear SET domain-containing proteins, NSD1, ...
1354-1435 2.47e-17

SET domain (including post-SET domain) found in nuclear SET domain-containing proteins, NSD1, NSD2, NSD3 and similar proteins; The nuclear receptor-binding SET Domain (NSD) family of histone H3 lysine 36 methyltransferases is comprised of NSD1, NSD2, and NSD3, which are primarily known to be involved in chromatin integrity and gene expression through mono-, di-, or tri-methylating lysine 36 of histone H3 (H3K36), respectively. NSD1 (EC 2.1.1.43; also termed histone-lysine N-methyltransferase H3 lysine-36 and H4 lysine-20 specific, androgen receptor coactivator 267 kDa protein (ARA267), androgen receptor-associated protein of 267 kDa, H3-K36-HMTase, H4-K20-HMTase, lysine N-methyltransferase 3B (KMT3B) or NR-binding SET domain-containing protein 1) functions as a histone-lysine N-methyltransferase that preferentially methylates 'Lys-36' of histone H3 and 'Lys-20' of histone H4. NSD2 (EC 2.1.1.43; also termed multiple myeloma SET domain-containing protein (MMSET), protein trithorax-5 (TRX5), or Wolf-Hirschhorn syndrome candidate 1 protein (WHSC1)) acts as a histone-lysine N-methyltransferase with histone H3 'Lys-27' (H3K27me) methyltransferase activity. NSD3 (EC 2.1.1.43; also termed protein whistle, WHSC1-like 1 isoform 9 with methyltransferase activity to lysine, Wolf-Hirschhorn syndrome candidate 1-like protein 1 (WHSC1L1), or WHSC1-like protein 1) functions as a histone-lysine N-methyltransferase that preferentially methylates 'Lys-4' and 'Lys-27' of histone H3.


Pssm-ID: 380950 [Multi-domain]  Cd Length: 142  Bit Score: 80.05  E-value: 2.47e-17
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1354 IIDARQEGNLGRYINHSCSPNLFVQNVFVDThDLRfpwVAFFASKRIKAGTELTWDYNYEVGSVEGKVllCCCGSLRCTG 1433
Cdd:cd19173    66 IIDAGPKGNLSRFMNHSCQPNCETQKWTVNG-DTR---VGLFAVRDIPAGEELTFNYNLDCLGNEKKV--CRCGAPNCSG 139

                  ..
gi 113674054 1434 RL 1435
Cdd:cd19173   140 FL 141
SET pfam00856
SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be ...
1348-1411 2.29e-16

SET domain; SET domains are protein lysine methyltransferase enzymes. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains have been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as SET-N and SET-C. SET-C forms an unusual and conserved knot-like structure of probably functional importance. Additionally to SET-N and SET-C, an insert region (SET-I) and flanking regions of high structural variability form part of the overall structure.


Pssm-ID: 459965 [Multi-domain]  Cd Length: 115  Bit Score: 76.41  E-value: 2.29e-16
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 113674054  1348 DEDACYIIDAR--QEGNLGRYINHSCSPNLFVQNVFVDthdlRFPWVAFFASKRIKAGTELTWDYN 1411
Cdd:pfam00856   54 DEDSEYCIDARalYYGNWARFINHSCDPNCEVRVVYVN----GGPRIVIFALRDIKPGEELTIDYG 115
SET_KMT2A_2B cd19170
SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2A (KMT2A), ...
1350-1435 3.49e-16

SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2A (KMT2A), 2B (KMT2B) and similar proteins; This family includes KMT2A and KMT2B. Both KMT2A (also termed ALL-1 or CXXC7 or MLL or MLL1 or TRX1 or HRX) and KMT2B (also termed MLL4 or TRX2) act as histone methyltransferases that methylate 'Lys-4' of histone H3 (H3K4me).


Pssm-ID: 380947 [Multi-domain]  Cd Length: 152  Bit Score: 77.05  E-value: 3.49e-16
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1350 DACYIIDARQEGNLGRYINHSCSPNLFVQNVFVDTHDlrfpWVAFFASKRIKAGTELTWDYNYEVGSVEgkvLLCCCGSL 1429
Cdd:cd19170    73 DDDEVVDATMHGNAARFINHSCEPNCYSRVVNIDGKK----HIVIFALRRILRGEELTYDYKFPIEDVK---IPCTCGSK 145

                  ....*.
gi 113674054 1430 RCTGRL 1435
Cdd:cd19170   146 KCRKYL 151
SET COG2940
SET domain-containing protein (function unknown) [General function prediction only];
1345-1434 2.27e-15

SET domain-containing protein (function unknown) [General function prediction only];


Pssm-ID: 442183 [Multi-domain]  Cd Length: 134  Bit Score: 74.23  E-value: 2.27e-15
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1345 LFNDEDACYIiDARQEGNLGRYINHSCSPNLFVqnvfvdthDLRFPWVAFFASKRIKAGTELTWDYNYEvgsVEGKVLLC 1424
Cdd:COG2940    59 LFELDDDGVI-DGALGGNPARFINHSCDPNCEA--------DEEDGRIFIVALRDIAAGEELTYDYGLD---YDEEEYPC 126
                          90
                  ....*....|
gi 113674054 1425 CCGSlrCTGR 1434
Cdd:COG2940   127 RCPN--CRGT 134
SET_ASHR3-like cd19175
SET domain (including post-SET domain) found in Arabidopsis thaliana ASH1-related protein 3 ...
1354-1435 2.41e-15

SET domain (including post-SET domain) found in Arabidopsis thaliana ASH1-related protein 3 (ASHR3) and similar proteins; This family includes Arabidopsis thaliana ASH1-related protein 3 (ASHR3, also termed protein SET DOMAIN GROUP 4 or protein stamen loss), ASH1 homolog 3 (ASHH3, also termed protein SET DOMAIN GROUP 7) and homolog 4 (ASHH4, also termed protein SET DOMAIN GROUP 24). They all function as histone-lysine N-methyltransferases (EC 2.1.1.43).


Pssm-ID: 380952 [Multi-domain]  Cd Length: 139  Bit Score: 74.38  E-value: 2.41e-15
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1354 IIDARQEGNLGRYINHSCSPNLFVQNVFVDThDLRfpwVAFFASKRIKAGTELTWDYNY-EVGSVEGkvllCCCGSLRCT 1432
Cdd:cd19175    64 VIDATFKGNLSRFINHSCDPNCELQKWQVDG-ETR---IGVFAIRDIKKGEELTYDYQFvQFGADQD----CHCGSKNCR 135

                  ...
gi 113674054 1433 GRL 1435
Cdd:cd19175   136 GKL 138
MBD cd00122
MeCP2, MBD1, MBD2, MBD3, MBD4, CLLD8-like, and BAZ2A-like proteins constitute a family of ...
943-1004 4.77e-15

MeCP2, MBD1, MBD2, MBD3, MBD4, CLLD8-like, and BAZ2A-like proteins constitute a family of proteins that share the methyl-CpG-binding domain (MBD). The MBD consists of about 70 residues and is defined as the minimal region required for binding to methylated DNA by a methyl-CpG-binding protein which binds specifically to methylated DNA. The MBD can recognize a single symmetrically methylated CpG either as naked DNA or within chromatin. MeCP2, MBD1 and MBD2 (and likely MBD3) form complexes with histone deacetylase and are involved in histone deacetylase-dependent repression of transcription. MBD4 is an endonuclease that forms a complex with the DNA mismatch-repair protein MLH1. The MBDs present in putative chromatin remodelling subunit, BAZ2A, and putative histone methyltransferase, CLLD8, represent two phylogenetically distinct groups within the MBD protein family.


Pssm-ID: 238069  Cd Length: 62  Bit Score: 70.82  E-value: 4.77e-15
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|..
gi 113674054  943 PLLIPLLFKFRRMTARRRIDGKLFFHIFYRSPCGRSLCDMQEVQDYLFETRCDFLFLEMFCM 1004
Cdd:cd00122     1 PLRDPLPPGWKRELVIRKSGSAGKGDVYYYSPCGKKLRSKPEVARYLEKTGPSSLDLENFSF 62
SET_KMT2C_2D cd19171
SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2C (KMT2C), ...
1333-1431 5.90e-15

SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2C (KMT2C), 2D (KMT2D) and similar proteins; This family includes KMT2C and KMT2D. Both KMT2C (also termed HALR or MLL3) and KMT2D (also termed ALR or MLL2) act as histone methyltransferases that methylate 'Lys-4' of histone H3 (H3K4me). They are subunits of the MLL2/3 complex, a coactivator complex of nuclear receptors, involved in transcriptional coactivation.


Pssm-ID: 380948 [Multi-domain]  Cd Length: 153  Bit Score: 73.62  E-value: 5.90e-15
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1333 KEKTKTPKNtRGLFN---DEDacYIIDARQEGNLGRYINHSCSPNLFVQNVFVDthdlRFPWVAFFASKRIKAGTELTWD 1409
Cdd:cd19171    56 REKIYESQN-RGIYMfriDND--WVIDATMTGGPARYINHSCNPNCVAEVVTFD----KEKKIIIISNRRIAKGEELTYD 128
                          90       100
                  ....*....|....*....|..
gi 113674054 1410 YNYEVGSVEGKVlLCCCGSLRC 1431
Cdd:cd19171   129 YKFDFEDDQHKI-PCLCGAPNC 149
MBD smart00391
Methyl-CpG binding domain; Methyl-CpG binding domain, also known as the TAM (TTF-IIP5, ARBP, ...
941-1016 1.68e-14

Methyl-CpG binding domain; Methyl-CpG binding domain, also known as the TAM (TTF-IIP5, ARBP, MeCP1) domain


Pssm-ID: 128673  Cd Length: 77  Bit Score: 69.71  E-value: 1.68e-14
                            10        20        30        40        50        60        70
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 113674054    941 KNPLLIPLLFKFRRMTARRRI-DGKLFFHIFYRSPCGRSLCDMQEVQDYLFETRCDFLFLEMFCMDPFVLVNRARPP 1016
Cdd:smart00391    1 GDPLRLPLPCGWRRETKQRKSgRSAGKFDVYYISPCGKKLRSKSELARYLHKNGDLSLDLECFDFNATVPVGPKFTP 77
SET_EZH cd10519
SET domain found in enhancer of zeste homolog 1 (EZH1), zeste homolog 2 (EZH2) and similar ...
1345-1412 5.40e-14

SET domain found in enhancer of zeste homolog 1 (EZH1), zeste homolog 2 (EZH2) and similar proteins; The family includes EZH1 and EZH2. EZH1 (EC 2.1.1.43; also termed ENX-2, or histone-lysine N-methyltransferase EZH1) is a catalytic subunit of the PRC2/EED-EZH1 complex, which methylates 'Lys-27' of histone H3, leading to transcriptional repression of the affected target gene. EZH2 (EC 2.1.1.43; also termed lysine N-methyltransferase 6, ENX-1, or histone-lysine N-methyltransferase EZH2) is a catalytic subunit of the PRC2/EED-EZH2 complex, which methylates 'Lys-9' (H3K9me) and 'Lys-27' (H3K27me) of histone H3, leading to transcriptional repression of the affected target gene. Both EZH1 and EZH2 can mono-, di- and trimethylate 'Lys-27' of histone H3 to form H3K27me1, H3K27me2 and H3K27me3, respectively.


Pssm-ID: 380917  Cd Length: 117  Bit Score: 69.58  E-value: 5.40e-14
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 113674054 1345 LFNDEDAcYIIDARQEGNLGRYINHSCSPNLFVQNVFVDtHDLRfpwVAFFASKRIKAGTELTWDYNY 1412
Cdd:cd10519    55 LFNLNDQ-FVVDATRKGNKIRFANHSSNPNCYAKVMMVN-GDHR---IGIFAKRDIEAGEELFFDYGY 117
SET_LegAS4-like cd10522
SET domain found in Legionella pneumophila type IV secretion system effector LegAS4 and ...
1346-1413 7.19e-13

SET domain found in Legionella pneumophila type IV secretion system effector LegAS4 and similar proteins; LegAS4 is a type IV secretion system effector of Legionella pneumophila. It contains a SET domain that is involved in the modification of Lys4 of histone H3 (H3K4) in the nucleolus of the host cell, thereby enhancing heterochromatic rDNA transcription. It also contains an ankyrin repeat domain of unknown function at its C-terminal region.


Pssm-ID: 380920 [Multi-domain]  Cd Length: 122  Bit Score: 66.60  E-value: 7.19e-13
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 113674054 1346 FNDEDACYIIDARQEGNLGRYINHSCSPNLFvqnvFVDTHDLRFPWVAFFASKRIKAGTELTWDYNYE 1413
Cdd:cd10522    56 FDLNGDILVIDAGKKGNLTRFINHSDQPNLE----LIVRTLKGEQHIGFVAIRDIKPGEELFISYGPK 119
MBD pfam01429
Methyl-CpG binding domain; The Methyl-CpG binding domain (MBD) binds to DNA that contains one ...
939-1012 1.76e-12

Methyl-CpG binding domain; The Methyl-CpG binding domain (MBD) binds to DNA that contains one or more symmetrically methylated CpGs. DNA methylation in animals is associated with alterations in chromatin structure and silencing of gene expression. MBD has negligible non-specific affinity for DNA. In vitro foot-printing with MeCP2 showed the MBD can protect a 12 nucleotide region surrounding a methyl CpG pair. MBDs are found in several Methyl-CpG binding proteins and also DNA demethylase.


Pssm-ID: 396147 [Multi-domain]  Cd Length: 76  Bit Score: 63.92  E-value: 1.76e-12
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 113674054   939 HGKNPLLIPLLFKFRRMTARRRIDGKLF-FHIFYRSPCGRSLCDMQEVQDYLFETRCDFLFLEMFCMDPFVLVNR 1012
Cdd:pfam01429    2 ERKREDRLPLPPGWRREERQRKSGSKAGkVDVFYYSPTGKKLRSKSEVARYLEANGGTSPKLEDFSFTVRSEVGR 76
SET cd08161
SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain superfamily; The Su(var)3-9, ...
1363-1411 1.93e-12

SET (Su(var)3-9, Enhancer-of-zeste, Trithorax) domain superfamily; The Su(var)3-9, Enhancer-of-zeste, Trithorax (SET) domain superfamily corresponds to SET domain-containing lysine methyltransferases, which catalyze site and state-specific methylation of lysine residues in histones that are fundamental in epigenetic regulation of gene activation and silencing in eukaryotic organisms. SET domains appear to be protein-protein interaction domains. It has been demonstrated that SET domains mediate interactions with a family of proteins that display similarity with dual-specificity phosphatases (dsPTPases). A subset of SET domains has been called PR domains. These domains are divergent in sequence from other SET domains, but also appear to mediate protein-protein interaction. The SET domain consists of two regions known as N-SET and C-SET. C-SET forms an unusual and conserved knot-like structure of probable functional importance. In addition to N-SET and C-SET, an insert region (I-SET) and flanking regions of high structural variability form part of the overall structure. Some family members contain a pre-SET domain, which is found in a number of histone methyltransferases (HMTase), and a post-SET domain, which harbors a zinc-binding site.


Pssm-ID: 380914 [Multi-domain]  Cd Length: 72  Bit Score: 63.81  E-value: 1.93e-12
                          10        20        30        40
                  ....*....|....*....|....*....|....*....|....*....
gi 113674054 1363 LGRYINHSCSPNLFVQNVFVDTHdlrfPWVAFFASKRIKAGTELTWDYN 1411
Cdd:cd08161    28 LARFINHSCEPNCEFEEVYVGGK----PRVFIVALRDIKAGEELTVDYG 72
SET_NSD1 cd19210
SET domain (including post-SET domain) found in nuclear receptor-binding SET domain-containing ...
1354-1435 3.32e-12

SET domain (including post-SET domain) found in nuclear receptor-binding SET domain-containing protein 1 (NSD1) and similar proteins; NSD1 (EC 2.1.1.43; also termed Histone-lysine N-methyltransferase H3 lysine-36 and H4 lysine-20 specific, androgen receptor coactivator 267 kDa protein (ARA267), androgen receptor-associated protein of 267 kDa, H3-K36-HMTase, H4-K20-HMTase, lysine N-methyltransferase 3B (KMT3B), or NR-binding SET domain-containing protein 1) functions as a histone-lysine N-methyltransferase that preferentially methylates 'Lys-36' of histone H3 and 'Lys-20' of histone H4. NSD1 is altered in approximately 10% of head and neck cancer patients, with a 55% decrease in risk of death in NSD1-mutated versus non-mutated patients; its disruption promotes favorable chemotherapeutic responses linked to hypomethylation.


Pssm-ID: 380987 [Multi-domain]  Cd Length: 142  Bit Score: 65.33  E-value: 3.32e-12
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1354 IIDARQEGNLGRYINHSCSPNLFVQNVFVDThDLRfpwVAFFASKRIKAGTELTWDYNYE-VGSveGKVlLCCCGSLRCT 1432
Cdd:cd19210    66 IIDAGPKGNYARFMNHCCQPNCETQKWTVNG-DTR---VGLFALCDIKAGTELTFNYNLEcLGN--GKT-VCKCGAPNCS 138

                  ...
gi 113674054 1433 GRL 1435
Cdd:cd19210   139 GFL 141
SET_SETD1 cd19169
SET domain (including post-SET domain) found in SET domain-containing protein 1 (SETD1) and ...
1350-1431 6.54e-12

SET domain (including post-SET domain) found in SET domain-containing protein 1 (SETD1) and similar proteins; This family includes SET domain-containing protein 1A (SETD1A) and SET domain-containing protein 1B (SETD1B). These proteins are histone-lysine N-methyltransferases that specifically methylate 'Lys-4' of histone H3 (H3K4me) when part of the SET1 histone methyltransferase (HMT) complex, but not if the neighboring 'Lys-9' residue is already methylated.


Pssm-ID: 380946  Cd Length: 148  Bit Score: 64.67  E-value: 6.54e-12
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1350 DACYIIDARQEGNLGRYINHSCSPNLFVQNVFVDTHDLrfpwVAFFASKRIKAGTELTWDYNYevgSVEGKVLLCCCGSL 1429
Cdd:cd19169    73 DDDTIIDATKCGNLARFINHSCNPNCYAKIITVESQKK----IVIYSKRPIAVNEEITYDYKF---PIEDEKIPCLCGAP 145

                  ..
gi 113674054 1430 RC 1431
Cdd:cd19169   146 QC 147
SET_SETD5-like cd10529
SET domain found in SET domain-containing protein 5 (SETD5), inactive histone-lysine ...
1345-1412 1.37e-11

SET domain found in SET domain-containing protein 5 (SETD5), inactive histone-lysine N-methyltransferase 2E (KMT2E) and similar proteins; SETD5 is a probable transcriptional regulator that acts via the formation of large multiprotein complexes that modify and/or remodel the chromatin. KMT2E (also termed inactive lysine N-methyltransferase 2E or myeloid/lymphoid or mixed-lineage leukemia protein 5 (MLL5)) associates with chromatin regions downstream of transcriptional start sites of active genes and thus regulates gene transcription. The family also includes Saccharomyces cerevisiae SET domain-containing proteins, SET3 and SET4, and Schizosaccharomyces pombe SET3. Most of these family members contain a post-SET domain which harbors a zinc-binding site.


Pssm-ID: 380927  Cd Length: 127  Bit Score: 63.06  E-value: 1.37e-11
                          10        20        30        40        50        60        70
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1345 LFNDEDACyiIDARQEGNLGRYINHSCSPNLFVQNVFVDTHDLRFpwvAFFASKRIKAGTELT--WDYNY 1412
Cdd:cd10529    63 GFEGLPLC--VDARKYGNEARFIRRSCRPNAELRHVVVSNGELRL---FIFALKDIRKGTEITipFDYDY 127
SET_NSD2 cd19211
SET domain (including post-SET domain) found in nuclear SET domain-containing protein 2 (NSD2) ...
1354-1435 2.25e-11

SET domain (including post-SET domain) found in nuclear SET domain-containing protein 2 (NSD2) and similar proteins; NSD2 (EC 2.1.1.43; also termed multiple myeloma SET domain-containing protein (MMSET), protein trithorax-5 (TRX5), or Wolf-Hirschhorn syndrome candidate 1 protein (WHSC1)) acts as a histone-lysine N-methyltransferase with histone H3 'Lys-36' (H3K36me) methyltransferase activity. NSD2 has been shown to mediate di- and trimethylation of H3K36 and dimethylation of H4K20 in different systems, and has been characterized as a transcriptional repressor interacting with histone deacetylase HDAC1 and histone demethylase LSD1. NSD2 mediates constitutive NF-kappaB signaling for cancer cell proliferation, survival and tumor growth. It is highly overexpressed in several types of human cancers, including small-cell lung cancers, neuroblastoma, carcinomas of stomach and colon, and bladder cancers, and its overexpression tends to be associated with tumor aggressiveness. WHSC1 is frequently deleted in Wolf-Hirschhorn syndrome (WHS).


Pssm-ID: 380988 [Multi-domain]  Cd Length: 142  Bit Score: 63.09  E-value: 2.25e-11
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1354 IIDARQEGNLGRYINHSCSPNLFVQNVFVDThDLRfpwVAFFASKRIKAGTELTWDYNYEVGSVEGKVllCCCGSLRCTG 1433
Cdd:cd19211    66 IIDAGPKGNYSRFMNHSCQPNCETQKWTVNG-DTR---VGLFAVCDIPAGTELTFNYNLDCLGNEKTV--CRCGAPNCSG 139

                  ..
gi 113674054 1434 RL 1435
Cdd:cd19211   140 FL 141
Tudor_4 pfam18358
Histone methyltransferase Tudor domain; This is a Tudor domain found in histone-lysine ...
725-773 3.28e-11

Histone methyltransferase Tudor domain; This is a Tudor domain found in histone-lysine N-methyltransferase SETDB1 proteins (EC:2.1.1.43), also known as Eggless in Drosophila. In Drosophila, SetdB1 (Egg) is important for oogenesis and the silencing of chromosome 4.


Pssm-ID: 408159  Cd Length: 50  Bit Score: 59.68  E-value: 3.28e-11
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....*....
gi 113674054   725 LRLGQETKAVRNGQFEDCTVLQLDGSLVQICYKNDKQKEWIYKGSDKLE 773
Cdd:pfam18358    1 LKKGQTVKTEWNGKWWTARVLEVDASLVKVYFLSDKRTEWIYRGSTRLE 49
SET_KMT2A cd19206
SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2A (KMT2A) ...
1354-1431 3.51e-11

SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2A (KMT2A) and similar proteins; KMT2A (EC 2.1.1.43; also termed lysine N-methyltransferase 2A, ALL-1, CXXC-type zinc finger protein 7 (CXXC7), myeloid/lymphoid or mixed-lineage leukemia (MLL), myeloid/lymphoid or mixed-lineage leukemia protein 1 (MLL1), trithorax-like protein (TRX1), or zinc finger protein HRX) acts as a histone methyltransferase that plays an essential role in early development and hematopoiesis. It is a catalytic subunit of the MLL1/MLL complex, a multiprotein complex that mediates both methylation of 'Lys-4' of histone H3 (H3K4me) and acetylation of 'Lys-16' of histone H4 (H4K16ac).


Pssm-ID: 380983 [Multi-domain]  Cd Length: 154  Bit Score: 62.73  E-value: 3.51e-11
                          10        20        30        40        50        60        70
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 113674054 1354 IIDARQEGNLGRYINHSCSPNLFVQNVFVDTHDlrfpWVAFFASKRIKAGTELTWDYNYEVGSVEGKvLLCCCGSLRC 1431
Cdd:cd19206    77 VVDATMHGNAARFINHSCEPNCYSRVINIDGQK----HIVIFAMRKIYRGEELTYDYKFPIEDASNK-LPCNCGAKKC 149
SET_KMT2B cd19207
SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2B (KMT2B) ...
1354-1431 5.90e-11

SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2B (KMT2B) and similar proteins; KMT2B (EC 2.1.1.43; also termed lysine N-methyltransferase 2B, myeloid/lymphoid or mixed-lineage leukemia protein 4 (MLL2/MLL4), trithorax homolog 2 (TRX2), or WW domain-binding protein 7 (WBP-7)) acts as a histone methyltransferase that methylates 'Lys-4' of histone H3 (H3K4me). It is required during the transcriptionally active period of oocyte growth for the establishment and/or maintenance of bulk H3K4 trimethylation (H3K4me3), global transcriptional silencing that precedes resumption of meiosis, oocyte survival and normal zygotic genome activation.


Pssm-ID: 380984 [Multi-domain]  Cd Length: 154  Bit Score: 62.35  E-value: 5.90e-11
                          10        20        30        40        50        60        70
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 113674054 1354 IIDARQEGNLGRYINHSCSPNLFVQNVFVDTHDlrfpWVAFFASKRIKAGTELTWDYNYEVGSVEGKvLLCCCGSLRC 1431
Cdd:cd19207    77 VVDATMHGNAARFINHSCEPNCYSRVIHVEGQK----HIVIFALRKIYRGEELTYDYKFPIEDASNK-LPCNCGAKRC 149
SET_NSD3 cd19212
SET domain (including post-SET domain) found in nuclear receptor-binding SET domain-containing ...
1354-1435 1.00e-10

SET domain (including post-SET domain) found in nuclear receptor-binding SET domain-containing protein 3 (NSD3) and similar proteins; NSD3 (EC 2.1.1.43; also termed protein whistle, WHSC1-like 1 isoform 9 with methyltransferase activity to lysine, Wolf-Hirschhorn syndrome candidate 1-like protein 1 (WHSC1L1), or WHSC1-like protein 1) functions as a histone-lysine N-methyltransferase that preferentially methylates 'Lys-4' and 'Lys-27' of histone H3. NSD3 is amplified and overexpressed in multiple cancer types, including acute myeloid leukemia (AML), breast, lung, pancreatic and bladder cancers, as well as squamous cell carcinoma of the head and neck (SCCHN). NSD3 contributes to tumorigenesis by interacting with bromodomain-containing protein 4 (BRD4), the bromodomain and extraterminal (BET) protein, which is a potential therapeutic target in acute myeloid leukemia (AML). NSD3 is amplified in primary tumors and cell lines from breast carcinoma, and can promote the cell viability of small-cell lung cancer and pancreatic ductal adenocarcinoma. High NSD3 expression is implicated in poor grade and heavy smoking history in SCCHN. Thus, NSD3 may serve as a potential druggable target for selective cancer therapy.


Pssm-ID: 380989 [Multi-domain]  Cd Length: 142  Bit Score: 61.09  E-value: 1.00e-10
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1354 IIDARQEGNLGRYINHSCSPNLFVQNVFVDThDLRfpwVAFFASKRIKAGTELTWDYNYE-VGSveGKVlLCCCGSLRCT 1432
Cdd:cd19212    66 IIDAGPKGNYSRFMNHSCNPNCETQKWTVNG-DVR---VGLFALCDIPAGMELTFNYNLDcLGN--GRT-ECHCGADNCS 138

                  ...
gi 113674054 1433 GRL 1435
Cdd:cd19212   139 GFL 141
SET_SET1 cd20072
SET domain (including post-SET domain) found in catalytic component of the Saccharomyces ...
1348-1431 1.51e-10

SET domain (including post-SET domain) found in catalytic component of the Saccharomyces cerevisiae COMPASS complex and similar proteins; The family contains mostly fungal SET domains, including SET1 found in the catalytic component of the Saccharomyces cerevisiae COMPASS (complex of proteins associated with Set1). SET1 is a histone-lysine N-methyltransferase that specifically methylates 'Lys-4' of histone H3 (H3K4me), when part of the SET1 histone methyltransferase (HMT) complex. The activity of this catalytic domain is established through forming a complex with a set of core proteins; it is extensively contacted by Cps60 (Bre2), Cps50 (Swd1), and Cps30 (Swd3).


Pssm-ID: 380998  Cd Length: 148  Bit Score: 60.90  E-value: 1.51e-10
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1348 DEDAcyIIDARQEGNLGRYINHSCSPNLFVQNVFVDTHDLrfpwVAFFASKRIKAGTELTWDYNYEVGSVEgkvLLCCCG 1427
Cdd:cd20072    73 DDDT--VVDATKKGNIARFINHCCDPNCTAKIIKVEGEKR----IVIYAKRDIAAGEELTYDYKFPREEDK---IPCLCG 143

                  ....
gi 113674054 1428 SLRC 1431
Cdd:cd20072   144 APNC 147
Tudor_SETDB1_rpt2 cd21181
second Tudor domain found in SET domain bifurcated 1 (SETDB1) and similar proteins; SETDB1, ...
725-773 3.38e-10

second Tudor domain found in SET domain bifurcated 1 (SETDB1) and similar proteins; SETDB1, also called ERG-associated protein with SET domain (ESET), histone H3-K9 methyltransferase 4, H3-K9-HMTase 4, or lysine N-methyltransferase 1E (KMT1E), acts as a histone-lysine N-methyltransferase that specifically trimethylates 'Lys-9' of histone H3 (H3K9me3). It mainly functions in euchromatin regions, thereby playing a central role in the silencing of euchromatic genes. It contains two Tudor domains. This model corresponds to the second one. The Tudor domain binds to proteins with dimethylated arginine or lysine residues, and may also bind methylated histone tails to facilitate protein-protein interactions.


Pssm-ID: 410548  Cd Length: 54  Bit Score: 56.95  E-value: 3.38e-10
                          10        20        30        40
                  ....*....|....*....|....*....|....*....|....*....
gi 113674054  725 LRLGQETKAVRNGQFEDCTVLQLDGSLVQICYKNDKQKEWIYKGSDKLE 773
Cdd:cd21181     1 LKVGQLIKTEWNGKWWKARVEEVDGSLVKMLFLDDKRTEWIYRGSTRLE 49
PRK08581 PRK08581
amidase domain-containing protein;
351-546 2.63e-09

amidase domain-containing protein;


Pssm-ID: 236304 [Multi-domain]  Cd Length: 619  Bit Score: 61.73  E-value: 2.63e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  351 DLLESDSEQSDNAATKTRFKPSEVTASSKLKSSGDHNSASASLNRTDPKVRPVTPSGTPPPSKSPPAVDNTASVETNQTD 430
Cdd:PRK08581   29 DPQKDSTAKTTSHDSKKSNDDETSKDTSSKDTDKADNNNTSNQDNNDKKFSTIDSSTSDSNNIIDFIYKNLPQTNINQLL 108
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  431 SELPTETPVEESTLPSNpkeavIMSDAESTDKTEKPQTRKKSSKP-SVTTTSPESRLTSSKSPPVTKTSSTQKETARAQS 509
Cdd:PRK08581  109 TKNKYDDNYSLTTLIQN-----LFNLNSDISDYEQPRNSEKSTNDsNKNSDSSIKNDTDTQSSKQDKADNQKAPSSNNTK 183
                         170       180       190
                  ....*....|....*....|....*....|....*..
gi 113674054  510 PSDSIDESADMEDSPDEPSNSPTESPTKTPDKTTRND 546
Cdd:PRK08581  184 PSTSNKQPNSPKPTQPNQSNSQPASDDTANQKSSSKD 220
SET_KMT2D cd19209
SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2D (KMT2D) ...
1332-1431 2.89e-09

SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2D (KMT2D) and similar proteins; KMT2D (EC 2.1.1.43; also termed lysine N-methyltransferase 2D, ALL1-related protein (ALR), or myeloid/lymphoid or mixed-lineage leukemia protein 2 (MLL2)) acts as a histone methyltransferase that methylates 'Lys-4' of histone H3 (H3K4me). It is a coactivator for estrogen receptor by being recruited by ESR1, thereby activating transcription. KMT2D is a subunit of the MLL2/3 complex, a coactivator complex of nuclear receptors, involved in transcriptional coactivation.


Pssm-ID: 380986 [Multi-domain]  Cd Length: 155  Bit Score: 57.40  E-value: 2.89e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1332 QKEKTKTPKNtRGLF----NDEdacYIIDARQEGNLGRYINHSCSPNLFVQNVFVDTHDLrfpwVAFFASKRIKAGTELT 1407
Cdd:cd19209    57 RREKIYEEQN-RGIYmfriNNE---HVIDATLTGGPARYINHSCAPNCVAEVVTFDKEDK----IIIISSRRIPKGEELT 128
                          90       100
                  ....*....|....*....|....
gi 113674054 1408 WDYNYEVGSVEGKVlLCCCGSLRC 1431
Cdd:cd19209   129 YDYQFDFEDDQHKI-PCHCGAWNC 151
SET_SETD1A cd19204
SET domain (including post-SET domain) found in SET domain-containing protein 1A (SETD1A) and ...
1348-1435 3.92e-09

SET domain (including post-SET domain) found in SET domain-containing protein 1A (SETD1A) and similar proteins; SETD1A (EC 2.1.1.43), also termed lysine N-methyltransferase 2F, or Set1/Ash2 histone methyltransferase complex subunit SET1, is a histone-lysine N-methyltransferase that specifically methylates 'Lys-4' of histone H3 (H3K4me), when part of the SET1 histone methyltransferase (HMT) complex, but not if the neighboring 'Lys-9' residue is already methylated. Human SET domain-containing protein 1A (hSETD1A) expression occurs at a high rate in hepatocellular carcinoma patients and controls tumor metastasis in breast cancer by activating MMP expression.


Pssm-ID: 380981 [Multi-domain]  Cd Length: 153  Bit Score: 56.96  E-value: 3.92e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1348 DEDAcyIIDARQEGNLGRYINHSCSPNLFVQNVFVDTHDLrfpwVAFFASKRIKAGTELTWDYNYevgSVEGKVLLCCCG 1427
Cdd:cd19204    74 DHDT--IIDATKCGNLARFINHCCTPNCYAKVITIESQKK----IVIYSKQPIGVNEEITYDYKF---PIEDNKIPCLCG 144

                  ....*...
gi 113674054 1428 SLRCTGRL 1435
Cdd:cd19204   145 TENCRGTL 152
SET_KMT2C cd19208
SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2C (KMT2C) ...
1332-1431 5.76e-09

SET domain (including post-SET domain) found in histone-lysine N-methyltransferase 2C (KMT2C) and similar proteins; KMT2C (EC 2.1.1.43; also termed lysine N-methyltransferase 2C, homologous to ALR protein (HALR), or myeloid/lymphoid or mixed-lineage leukemia protein 3 (MLL3)) acts as a histone methyltransferase that methylates 'Lys-4' of histone H3 (H3K4me) and may be involved in leukemogenesis and developmental disorders. KMT2C is a catalytic subunit of the MLL2/3 complex, a coactivator complex of nuclear receptors, involved in transcriptional coactivation. Overexpression of KMT2C is associated with estrogen receptor-positive breast cancer; KMT2C mediates the estrogen dependence of breast cancer through regulation of estrogen receptor alpha (ERalpha) enhancer function. KMT2C is frequently mutated in certain populations with diffuse-type gastric adenocarcinomas (DGA); its loss promotes epithelial-to-mesenchymal transition (EMT) and is associated with worse overall survival.


Pssm-ID: 380985 [Multi-domain]  Cd Length: 154  Bit Score: 56.56  E-value: 5.76e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1332 QKEKTKTPKNtRGLFN---DEDacYIIDARQEGNLGRYINHSCSPNLFVQNVFVDthdlRFPWVAFFASKRIKAGTELTW 1408
Cdd:cd19208    56 RKEKLYESQN-RGVYMfriDND--HVIDATLTGGPARYINHSCAPNCVAEVVTFE----KGHKIIISSSRRIQKGEELCY 128
                          90       100
                  ....*....|....*....|...
gi 113674054 1409 DYNYEVGSVEGKVlLCCCGSLRC 1431
Cdd:cd19208   129 DYKFDFEDDQHKI-PCHCGAVNC 150
SET_SETD8 cd10528
SET domain found in SET domain-containing protein 8 (SETD8) and similar proteins; SETD8 (EC 2. ...
1346-1410 8.73e-09

SET domain found in SET domain-containing protein 8 (SETD8) and similar proteins; SETD8 (EC 2.1.1.43; also termed N-lysine methyltransferase KMT5A, H4-K20-HMTase KMT5A, lysine N-methyltransferase 5A, lysine-specific methylase 5A, PR/SET domain-containing protein 07, PR-Set7 or PR/SET07) is a nucleosomal histone-lysine N-methyltransferase that specifically monomethylates 'Lys-20' of histone H4 (H4K20me1). It plays a central role in the silencing of euchromatic genes.


Pssm-ID: 380926 [Multi-domain]  Cd Length: 141  Bit Score: 55.66  E-value: 8.73e-09
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 113674054 1346 FNDEDACYIIDARQE-GNLGRYINHSC-SPNLFVQNVFVDTHdlrfPWVAFFASKRIKAGTELTWDY 1410
Cdd:cd10528    76 FQYKGKTYCVDATKEsGRLGRLINHSKkKPNLKTKLLVIDGV----PHLILVAKRDIKPGEELLYDY 138
SET_SETD1B cd19205
SET domain (including post-SET domain) found in SET domain-containing protein 1B (SETD1B) and ...
1354-1435 1.37e-08

SET domain (including post-SET domain) found in SET domain-containing protein 1B (SETD1B) and similar proteins; SETD1B (EC 2.1.1.43), also termed lysine N-methyltransferase 2G, is a histone-lysine N-methyltransferase that specifically methylates 'Lys-4' of histone H3 (H3K4me) when part of the SET1 histone methyltransferase (HMT) complex, but not if the neighboring 'Lys-9' residue is already methylated. Loss of SETD1B occurs in up to half of gastric and colorectal cancers, most commonly via SETD1B mutations, while de novo variants in SETD1B are associated with intellectual disability, epilepsy and autism.


Pssm-ID: 380982 [Multi-domain]  Cd Length: 153  Bit Score: 55.45  E-value: 1.37e-08
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1354 IIDARQEGNLGRYINHSCSPNLFVQNVFVDTHDLrfpwVAFFASKRIKAGTELTWDYNYevgSVEGKVLLCCCGSLRCTG 1433
Cdd:cd19205    78 IIDATKCGNFARFINHSCNPNCYAKVITVESQKK----IVIYSKQHINVNEEITYDYKF---PIEDVKIPCLCGSENCRG 150

                  ..
gi 113674054 1434 RL 1435
Cdd:cd19205   151 TL 152
PHA03255 PHA03255
BDLF3; Provisional
387-538 3.03e-07

BDLF3; Provisional


Pssm-ID: 165513 [Multi-domain]  Cd Length: 234  Bit Score: 52.98  E-value: 3.03e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  387 NSASASLNRTDpkVRPVTPSGTPPPSKSPPAvdntasveTNQTDSELPTETPVEESTLPSNPKEAVImsdaeSTDKTEKP 466
Cdd:PHA03255   26 SSGSSTASAGN--VTGTTAVTTPSPSASGPS--------TNQSTTLTTTSAPITTTAILSTNTTTVT-----STGTTVTP 90
                          90       100       110       120       130       140       150
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|...
gi 113674054  467 -QTRKKSSKPSVTTTSPESRLTSSKSPPVTKTSSTQKETARAQSPSDSIDESADMEDSPDEPSNSPTESPTKT 538
Cdd:PHA03255   91 vPTTSNASTINVTTKVTAQNITATEAGTGTSTGVTSNVTTRSSSTTSATTRITNATTLAPTLSSKGTSNATKT 163
SET_SMYD cd20071
SET domain (including SET domain and post-SET domain) found in SET and MYND domain-containing ...
1366-1431 3.55e-07

SET domain (including SET domain and post-SET domain) found in SET and MYND domain-containing protein, and similar proteins; The family includes SET and MYND domain-containing proteins, SMYD1-SMYD5. SMYD1 (EC 2.1.1.43; also termed BOP) is a heart- and muscle-specific SET-MYND domain-containing protein, which functions as a histone methyltransferase and regulates downstream gene transcription. It methylates histone H3 at 'Lys-4' (H3K4me) and appears able to perform mono-, di-, and trimethylation. SMYD2 (also termed HSKM-B, or lysine N-methyltransferase 3C (KMT3C)) functions as a histone methyltransferase that methylates both histones and non-histone proteins, including p53/TP53 and RB1. It specifically methylates histone H3 'Lys-4' (H3K4me) and dimethylates histone H3 'Lys-36' (H3K36me2). SMYD3 (also termed zinc finger MYND domain-containing protein 1) functions as a histone methyltransferase that specifically methylates 'Lys-4' of histone H3, inducing di- and tri-methylation, but not monomethylation. It also methylates 'Lys-5' of histone H4. SMYD3 plays an important role in transcriptional activation as a member of an RNA polymerase complex. SMYD4 functions as a potential tumor suppressor that plays a critical role in breast carcinogenesis at least partly through inhibiting the expression of PDGFR-alpha. SMYD5 (also termed protein NN8-4AG, or retinoic acid-induced protein 15) functions as a histone lysine methyltransferase that mediates H4K20me3 at heterochromatin regions.


Pssm-ID: 380997 [Multi-domain]  Cd Length: 122  Bit Score: 50.45  E-value: 3.55e-07
                          10        20        30        40        50        60        70
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 113674054 1366 YINHSCSPNLFVqnVFVDTHDLRfpwvaFFASKRIKAGTELTWDYNYEVGSVEG--KVLL------CCCgsLRC 1431
Cdd:cd20071    58 LLNHSCDPNAVV--VFDGNGTLR-----VRALRDIKAGEELTISYIDPLLPRTErrRELLekygftCSC--PRC 122
PRK08581 PRK08581
amidase domain-containing protein;
335-535 3.94e-07

amidase domain-containing protein;


Pssm-ID: 236304 [Multi-domain]  Cd Length: 619  Bit Score: 54.79  E-value: 3.94e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  335 AAIKNESQLKASVSEVDllESDSEQSDNAATKTRFKPSEVTASSKLKSSGDH-----NSASASLNRTDPKVRPVTPSGTP 409
Cdd:PRK08581   28 DDPQKDSTAKTTSHDSK--KSNDDETSKDTSSKDTDKADNNNTSNQDNNDKKfstidSSTSDSNNIIDFIYKNLPQTNIN 105
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  410 PPSKSPPAVDNTASVETNQTDSELPTETPVEESTLPSNPKEAvimSDAESTDKTEKPQTRKKSSKPSVTTTSPESRLTSS 489
Cdd:PRK08581  106 QLLTKNKYDDNYSLTTLIQNLFNLNSDISDYEQPRNSEKSTN---DSNKNSDSSIKNDTDTQSSKQDKADNQKAPSSNNT 182
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....*.
gi 113674054  490 KSPPVTKTSSTQKETARAQSPSDSIDESADMEDSPDEPSNSPTESP 535
Cdd:PRK08581  183 KPSTSNKQPNSPKPTQPNQSNSQPASDDTANQKSSSKDNQSMSDSA 228
PTZ00108 PTZ00108
DNA topoisomerase 2-like protein; Provisional
268-529 5.20e-07

DNA topoisomerase 2-like protein; Provisional


Pssm-ID: 240271 [Multi-domain]  Cd Length: 1388  Bit Score: 54.67  E-value: 5.20e-07
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  268 FEKLIKTESQSTDVTPSVMTPNKQPELLSFQSTTKIKPEPQSTQANTELSSPPSNSKLLenhnslsiAAIKNESQLKASV 347
Cdd:PTZ00108 1148 EEKEIAKEQRLKSKTKGKASKLRKPKLKKKEKKKKKSSADKSKKASVVGNSKRVDSDEK--------RKLDDKPDNKKSN 1219
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  348 SevdlleSDSEQSDNAATKTRFKPSEVTASSKLKSSGDHNSASASLNRTDPKVRPVTPSGTPPPSKSPPAVDNTASveTN 427
Cdd:PTZ00108 1220 S------SGSDQEDDEEQKTKPKKSSVKRLKSKKNNSSKSSEDNDEFSSDDLSKEGKPKNAPKRVSAVQYSPPPPS--KR 1291
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  428 QTDSELPTETPVEESTLPSNPKEAVIMSDAESTDKTEKPQTRKKSSKPSVTTTSPESRLTSSKSPPVTKTSSTQKEtara 507
Cdd:PTZ00108 1292 PDGESNGGSKPSSPTKKKVKKRLEGSLAALKKKKKSEKKTARKKKSKTRVKQASASQSSRLLRRPRKKKSDSSSED---- 1367
                         250       260
                  ....*....|....*....|..
gi 113674054  508 qSPSDSIDESADMEDSPDEPSN 529
Cdd:PTZ00108 1368 -DDDSEVDDSEDEDDEDDEDDD 1388
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
129-500 5.72e-07

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 54.15  E-value: 5.72e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   129 VVIDLGATKETLEPMLEKVTVAIQKSSKLVQDLVQMVSKTSMGATSPLSTSSSDINRPSSSSTPEIVRPesVTPKLEITN 208
Cdd:pfam05109  419 VIFSKAPESTTTSPTLNTTGFAAPNTTTGLPSSTHVPTNLTAPASTGPTVSTADVTSPTPAGTTSGASP--VTPSPSPRD 496
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   209 SITIVKTESLSSvpKISSLFNSSEQCKSiadhdsyfkPTiktePEWTPLTPWEDS----ESSPFEKLIKTESQSTDVTPS 284
Cdd:pfam05109  497 NGTESKAPDMTS--PTSAVTTPTPNATS---------PT----PAVTTPTPNATSptlgKTSPTSAVTTPTPNATSPTPA 561
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   285 VMTPNKQPELLSFQSTTKIK----PEPQSTQANTELSSPPSNSKlleNHNSLSIAAIKN-ESQLKASVSEVDLLESDSEQ 359
Cdd:pfam05109  562 VTTPTPNATIPTLGKTSPTSavttPTPNATSPTVGETSPQANTT---NHTLGGTSSTPVvTSPPKNATSAVTTGQHNITS 638
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   360 SDNAATKTRfkPSEVTASSKLKSSGDHNSASASLNRTDP----KVRPVTPSGTPP---PSKSP---PAVDNTASVETNQT 429
Cdd:pfam05109  639 SSTSSMSLR--PSSISETLSPSTSDNSTSHMPLLTSAHPtggeNITQVTPASTSThhvSTSSPaprPGTTSQASGPGNSS 716
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   430 DSELPTETPVEESTLPSNPKEavimSDAESTDKTEKPQTRKKSSKPSVTT-------------TSPESRLTSSKSPPVTK 496
Cdd:pfam05109  717 TSTKPGEVNVTKGTPPKNATS----PQAPSGQKTAVPTVTSTGGKANSTTggkhttghgartsTEPTTDYGGDSTTPRTR 792

                   ....
gi 113674054   497 TSST 500
Cdd:pfam05109  793 YNAT 796
Treacle pfam03546
Treacher Collins syndrome protein Treacle;
259-522 9.44e-07

Treacher Collins syndrome protein Treacle;


Pssm-ID: 460967 [Multi-domain]  Cd Length: 531  Bit Score: 53.15  E-value: 9.44e-07
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   259 PWEDSESSPFEkliktESQSTDVTPSVMTP------NKQPELLSFQSTTKIKPEPQSTQANTELSSPPSnskllenhnsl 332
Cdd:pfam03546   20 PEEDSESSSEE-----ESDSEEETPAAKTPlqakpsGKTPQVRAASAPAKESPRKGAPPVPPGKTGPAA----------- 83
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   333 siaaikneSQLKASVSEVDLlESDSEQSDnAATKTRFKPSEVTASSKLKSSGDhnsasaslnrtDPKVRPVTPSGTPPPS 412
Cdd:pfam03546   84 --------AQAQAGKPEEDS-ESSSEESD-SDGETPAAATLTTSPAQVKPLGK-----------NSQVRPASTVGKGPSG 142
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   413 K----SPPAVDNTASVETNQTDSELPTETPVEEStlpsnpkeavimsdaESTDKTEKPQTRKKSSK--PSVTTTSPESRL 486
Cdd:pfam03546  143 KganpAPPGKAGSAAPLVQVGKKEEDSESSSEES---------------DSEGEAPPAATQAKPSGkiLQVRPASGPAKG 207
                          250       260       270
                   ....*....|....*....|....*....|....*.
gi 113674054   487 TSSKSPPVTKTSSTQKETARAQSPSDSIDESADMED 522
Cdd:pfam03546  208 AAPAPPQKAGPVATQVKAERSKEDSESSEESSDSEE 243
SET_SpSET3-like cd19183
SET domain (including post-SET domain) found in Schizosaccharomyces pombe SET ...
1355-1410 1.04e-06

SET domain (including post-SET domain) found in Schizosaccharomyces pombe SET domain-containing protein 3 (SETD3) and similar proteins; Schizosaccharomyces pombe SETD3 functions as a transcriptional regulator that acts via the formation of large multiprotein complexes that modify and/or remodel the chromatin. It is required for both gene activation and repression.


Pssm-ID: 380960  Cd Length: 173  Bit Score: 50.48  E-value: 1.04e-06
                          10        20        30        40        50
                  ....*....|....*....|....*....|....*....|....*....|....*...
gi 113674054 1355 IDARQEGNLGRYINHSCSPNLFVQNVFVDthDLRFPWVAFFASKRIKAGTELT--WDY 1410
Cdd:cd19183    69 IDTRRSGSVARFIRRSCRPNAELVTVASD--SGSVLKFVLYASRDISPGEEITigWDW 124
SET_EZH1 cd19217
SET domain found in enhancer of zeste homolog 1 (EZH1) and similar proteins; EZH1 (EC 2.1.1.43) ...
1345-1412 2.84e-06

SET domain found in enhancer of zeste homolog 1 (EZH1) and similar proteins; EZH1 (EC 2.1.1.43), also termed ENX-2, or histone-lysine N-methyltransferase EZH1, is a catalytic subunit of the PRC2/EED-EZH1 complex, which methylates 'Lys-27' of histone H3, leading to transcriptional repression of the affected target gene. It can mono-, di- and trimethylate 'Lys-27' of histone H3 to form H3K27me1, H3K27me2 and H3K27me3, respectively.


Pssm-ID: 380994  Cd Length: 136  Bit Score: 48.14  E-value: 2.84e-06
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 113674054 1345 LFNDEDAcYIIDARQEGNLGRYINHSCSPNLFVQNVFVDThDLRfpwVAFFASKRIKAGTELTWDYNY 1412
Cdd:cd19217    60 LFNLNND-FVVDATRKGNKIRFANHSVNPNCYAKVVMVNG-DHR---IGIFAKRAIQQGEELFFDYRY 122
PHA03307 PHA03307
transcriptional regulator ICP4; Provisional
227-540 2.87e-06

transcriptional regulator ICP4; Provisional


Pssm-ID: 223039 [Multi-domain]  Cd Length: 1352  Bit Score: 52.10  E-value: 2.87e-06
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  227 LFNSSEQCKSIADHDSYFKPTIKTEPEWTPlTPWEDSESSPFEKLIKTESQSTDVTPSVMTPNKQPEllsfQSTTKIKPE 306
Cdd:PHA03307   42 QLVSDSAELAAVTVVAGAAACDRFEPPTGP-PPGPGTEAPANESRSTPTWSLSTLAPASPAREGSPT----PPGPSSPDP 116
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  307 PQSTqANTELSSPPSNSKLLEN-HNSLSIAAIKNESQLKASVSEVDL---LESDSEQSDNAATKTRFKPSEVTASSKLKS 382
Cdd:PHA03307  117 PPPT-PPPASPPPSPAPDLSEMlRPVGSPGPPPAASPPAAGASPAAVasdAASSRQAALPLSSPEETARAPSSPPAEPPP 195
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  383 SGDHNSASASLNRTDPKVRPVTPSGTPPPSKS-----PPAVDNTASVETNQTDSELPTETPV---EESTLPSNPKEAVIM 454
Cdd:PHA03307  196 STPPAAASPRPPRRSSPISASASSPAPAPGRSaaddaGASSSDSSSSESSGCGWGPENECPLprpAPITLPTRIWEASGW 275
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  455 SDAESTDKTEKPQTRKKSSKPSVTTTSPESRLTSSKSPPVTKTSStqketaraqSPSDSIDESADMEDSPDEPSNSPTES 534
Cdd:PHA03307  276 NGPSSRPGPASSSSSPRERSPSPSPSSPGSGPAPSSPRASSSSSS---------SRESSSSSTSSSSESSRGAAVSPGPS 346

                  ....*.
gi 113674054  535 PTKTPD 540
Cdd:PHA03307  347 PSRSPS 352
DUF5604 pfam18300
Domain of unknown function (DUF5604); This domain is often found in the N-terminal region of ...
570-624 3.96e-06

Domain of unknown function (DUF5604); This domain is often found in the N-terminal region of proteins carrying the SET domain (pfam00856), such as the SETDB1 protein present in Homo sapiens. SETDB1 is a histone methyltransferase that suppresses gene expression and modulates heterochromatin formation through H3K9me2/3.


Pssm-ID: 408109  Cd Length: 58  Bit Score: 45.45  E-value: 3.96e-06
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|....*.
gi 113674054   570 KEIKLKVGAAVLGKKRHNHWSRGTVQEVETEDDGNTYKVEF-KKGKTIvLSANHVA 624
Cdd:pfam18300    2 KDGDLIVSMRILGKKRTKTWHKGTLIAIQTVGPGKKYKVKFdNKGKSL-LSGNHIA 56
SET_EZH2 cd19218
SET domain found in enhancer of zeste homolog 2 (EZH2) and similar proteins; EZH2 (EC 2.1.1.43) ...
1345-1412 4.66e-06

SET domain found in enhancer of zeste homolog 2 (EZH2) and similar proteins; EZH2 (EC 2.1.1.43), also termed lysine N-methyltransferase 6, or ENX-1, or histone-lysine N-methyltransferase EZH2, is a catalytic subunit of the polycomb repressive complex 2 (PRC2)/EED-EZH2 complex, which methylates 'Lys-9' (H3K9me) and 'Lys-27' (H3K27me) of histone H3, leading to transcriptional repression of the affected target gene. It can mono-, di- and trimethylate 'Lys-27' of histone H3 to form H3K27me1, H3K27me2 and H3K27me3, respectively. PRC2 is involved in several cancers; EZH2 is overexpressed in breast, liver and prostate cancer, while point mutations in EZH2 alter the substrate preference and product specificity of PRC2 in non-Hodgkin lymphomas (NHLs). Thus, PRC2 is a popular target for cancer therapeutics.


Pssm-ID: 380995  Cd Length: 120  Bit Score: 47.21  E-value: 4.66e-06
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 113674054 1345 LFN-DEDacYIIDARQEGNLGRYINHSCSPNLFVQNVFVDThDLRfpwVAFFASKRIKAGTELTWDYNY 1412
Cdd:cd19218    58 LFNlNND--FVVDATRKGNKIRFANHSVNPNCYAKVMMVNG-DHR---IGIFAKRAIQTGEELFFDYRY 120
DUF5585 pfam17823
Family of unknown function (DUF5585); This is a family of unknown function found in chordata.
256-539 6.63e-06

Family of unknown function (DUF5585); This is a family of unknown function found in chordata.


Pssm-ID: 465521 [Multi-domain]  Cd Length: 506  Bit Score: 50.34  E-value: 6.63e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   256 PLTPWEDSESSPfeklIKTESQSTDVTPSVMTPNKQPELLSFQSTTKIKPEPQSTQANTELS--SPPSNSKLLENHNSLS 333
Cdd:pfam17823  150 CRANASAAPRAA----IAAASAPHAASPAPRTAASSTTAASSTTAASSAPTTAASSAPATLTpaRGISTAATATGHPAAG 225
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   334 --IAAIKNESQLKASVSevdllesdseqsdnAATKTRFKPSEVTASSklkSSGDHNSASASLNRTDPKVRPVTPSGTPPP 411
Cdd:pfam17823  226 taLAAVGNSSPAAGTVT--------------AAVGTVTPAALATLAA---AAGTVASAAGTINMGDPHARRLSPAKHMPS 288
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   412 SKSPPAVDNTASVETNQTDSELPTETPVEESTLPSNPkeavimSDAESTDKTEKPQTRKKSSKPSVTTT---------SP 482
Cdd:pfam17823  289 DTMARNPAAPMGAQAQGPIIQVSTDQPVHNTAGEPTP------SPSNTTLEPNTPKSVASTNLAVVTTTkaqakepsaSP 362
                          250       260       270       280       290       300
                   ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 113674054   483 ESRLTSSKSPPVTKTSSTQKE-----TARAQSPSDSI-DESADMEDSPDEPSNSPTESPTKTP 539
Cdd:pfam17823  363 VPVLHTSMIPEVEATSPTTQPspllpTQGAAGPGILLaPEQVATEATAGTASAGPTPRSSGDP 425
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
164-543 1.20e-05

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 49.91  E-value: 1.20e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   164 MVSKTSMGATSPLSTSSSDINRPSSSST---PEIVRPESVTPKLEITNSITIVKTESLSSVPKISSLFNSSEQCKSIAD- 239
Cdd:pfam05109  303 IVFSDEIPASQDMPTNTTDITYVGDNATysvPMVTSEDANSPNVTVTAFWAWPNNTETDFKCKWTLTSGTPSGCENISGa 382
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   240 --HDSYFKPTIK---TEPEWTPLTPWEDSESSPFEKLIKTES-QSTDVTPSVMTPnkqpellSFQSTTKIKPEPQSTQAN 313
Cdd:pfam05109  383 faSNRTFDITVSglgTAPKTLIITRTATNATTTTHKVIFSKApESTTTSPTLNTT-------GFAAPNTTTGLPSSTHVP 455
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   314 TELSSPPSNSkllenhnslsiaaiknesqlkASVSEVDLLESDSEQSDNAATKTRFKPSEVTASSKLKSSgDHNSASASL 393
Cdd:pfam05109  456 TNLTAPASTG---------------------PTVSTADVTSPTPAGTTSGASPVTPSPSPRDNGTESKAP-DMTSPTSAV 513
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   394 NRTDPKVRPVTPS-GTPPPSKSPPAVDNTA---SVETNQTDSELPT---ETPVEESTLPS----NPKEAVIMSDAESTDK 462
Cdd:pfam05109  514 TTPTPNATSPTPAvTTPTPNATSPTLGKTSptsAVTTPTPNATSPTpavTTPTPNATIPTlgktSPTSAVTTPTPNATSP 593
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   463 T---EKPQTRKK-------SSKPSVTT---------TSPESRLTSSKSPPVTKTSSTQKETAraqSPSDSIDESADMeds 523
Cdd:pfam05109  594 TvgeTSPQANTTnhtlggtSSTPVVTSppknatsavTTGQHNITSSSTSSMSLRPSSISETL---SPSTSDNSTSHM--- 667
                          410       420
                   ....*....|....*....|...
gi 113674054   524 PDEPSNSPT--ESPTK-TPDKTT 543
Cdd:pfam05109  668 PLLTSAHPTggENITQvTPASTS 690
SET_EZH-like cd19168
SET domain found in enhancer of zeste homolog 1 (EZH1) and zeste homolog 2 (EZH2) of polycomb ...
1355-1410 1.86e-05

SET domain found in enhancer of zeste homolog 1 (EZH1) and zeste homolog 2 (EZH2) of polycomb repressive complex 2 (PRC2), and similar proteins; The family includes EZH1 and EZH2. EZH1 (EC 2.1.1.43; also termed ENX-2, or histone-lysine N-methyltransferase EZH1) is a catalytic subunit of the PRC2/EED-EZH1 complex, which methylates 'Lys-27' of histone H3, leading to transcriptional repression of the affected target gene. EZH2 (EC 2.1.1.43; also termed lysine N-methyltransferase 6, ENX-1, or histone-lysine N-methyltransferase EZH2) is a catalytic subunit of the PRC2/EED-EZH2 complex, which methylates 'Lys-9' (H3K9me) and 'Lys-27' (H3K27me) of histone H3, leading to transcriptional repression of the affected target gene. Both EZH1 and EZH2 can mono-, di- and trimethylate 'Lys-27' of histone H3 to form H3K27me1, H3K27me2 and H3K27me3, respectively. PRC2 is involved in several cancers; EZH2 is overexpressed in breast, liver and prostate cancer, while point mutations in EZH2 alter the substrate preference and product specificity of PRC2 in non-Hodgkin lymphomas (NHLs). Thus, PRC2 is a popular target for cancer therapeutics.


Pssm-ID: 380945  Cd Length: 124  Bit Score: 45.64  E-value: 1.86e-05
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1355 IDARQEGNLGRYINH----SCSPNLFVQNVFvDTHDLRfpwVAFFASKRIKAGTELTWDY 1410
Cdd:cd19168    65 VDAAIYGNLSRYINHatdkVKTGNCMPKIMY-VNHEWR---IKFTAIKDIKIGEELFFNY 120
PRK14949 PRK14949
DNA polymerase III subunits gamma and tau; Provisional
252-546 3.13e-05

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237863 [Multi-domain]  Cd Length: 944  Bit Score: 48.57  E-value: 3.13e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  252 PEWTPLTPWEDSESSPFEKLIKTESQSTDVTPSVMT--PNKQPELLSFQSTTKIKPEPQSTQANTELSSPPSNSKLLENH 329
Cdd:PRK14949  377 PEGQTPSALAAAVQAPHANEPQFVNAAPAEKKTALTeqTTAQQQVQAANAEAVAEADASAEPADTVEQALDDESELLAAL 456
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  330 NS-----LSIAAikneSQ-LKASVSEVDLLESDSEQSDNAATK--------------TRFKPSEVTASSKLKSSGDHNSA 389
Cdd:PRK14949  457 NAeqaviLSQAQ----SQgFEASSSLDADNSAVPEQIDSTAEQsvvnpsvtdtqvddTSASNNSAADNTVDDNYSAEDTL 532
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  390 SASLNRTDPKVRPVTPSGTPPP-----------SKSPPAVDNTASVETNQTDSELPTETPVEESTLPSNPKEAV----IM 454
Cdd:PRK14949  533 ESNGLDEGDYAQDSAPLDAYQDdyvafssesynALSDDEQHSANVQSAQSAAEAQPSSQSLSPISAVTTAAASLadddIL 612
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  455 S-------------DAEST--DKTEKPQTRKKSSKPSVTTTSPESRLTSSKSPPVTKTSSTQKETARAQSPSDSIDESAD 519
Cdd:PRK14949  613 DavlaardsllsdlDALSPkeGDGKKSSADRKPKTPPSRAPPASLSKPASSPDASQTSASFDLDPDFELATHQSVPEAAL 692
                         330       340       350
                  ....*....|....*....|....*....|....*
gi 113674054  520 MEDSPDEPSNSPT--------ESPTKTPDKTTRND 546
Cdd:PRK14949  693 ASGSAPAPPPVPDpydrppweEAPEVASANDGPNN 727
PTZ00449 PTZ00449
104 kDa microneme/rhoptry antigen; Provisional
153-475 3.72e-05

104 kDa microneme/rhoptry antigen; Provisional


Pssm-ID: 185628 [Multi-domain]  Cd Length: 943  Bit Score: 48.53  E-value: 3.72e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  153 KSSKLVQDL-VQMVSKTSMGATSPLSTSSsdinrPSSSSTPEivRPESvtPKleitnsitIVKTESLSSVPKISslFNSS 231
Cdd:PTZ00449  610 KSPKLPELLdIPKSPKRPESPKSPKRPPP-----PQRPSSPE--RPEG--PK--------IIKSPKPPKSPKPP--FDPK 670
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  232 EQCKSiadHDSYFKPTIKTEPEWTPLTPWEDSESSPFEKLikTESQSTDVTPSVMTPNKQPELLSFQSTTKIKPEPQSTq 311
Cdd:PTZ00449  671 FKEKF---YDDYLDAAAKSKETKTTVVLDESFESILKETL--PETPGTPFTTPRPLPPKLPRDEEFPFEPIGDPDAEQP- 744
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  312 ANTELSSPPSNSKLL--ENHNSLSIAAIKNESqlkasVSEVDLlESDSEQSDNAATKTRfKPSEVTAssklKSSGDHNSA 389
Cdd:PTZ00449  745 DDIEFFTPPEEERTFfhETPADTPLPDILAEE-----FKEEDI-HAETGEPDEAMKRPD-SPSEHED----KPPGDHPSL 813
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  390 SASLNR------------TDPKVRPVTPSGTPPPSKSPPAVDNTASVETNQTDSELPTETPVE--------ESTLPSNPK 449
Cdd:PTZ00449  814 PKKRHRldglalsttdleSDAGRIAKDASGKIVKLKRSKSFDDLTTVEEAEEMGAEARKIVVDddgteaddEDTHPPEEK 893
                         330       340
                  ....*....|....*....|....*.
gi 113674054  450 EAVIMSDAESTDKTEKPQTRKKSSKP 475
Cdd:PTZ00449  894 HKSEVRRRRPPKKPSKPKKPSKPKKP 919
PHA03377 PHA03377
EBNA-3C; Provisional
246-532 4.42e-05

EBNA-3C; Provisional


Pssm-ID: 177614 [Multi-domain]  Cd Length: 1000  Bit Score: 48.13  E-value: 4.42e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  246 PTIKTEPEWTPLTPWEDS------ESSPFEKLIKTESQSTDVTPSVMTPNKQPELLSFQ-----STTKIKPEPQStqant 314
Cdd:PHA03377  422 PTPKTHPVKRTLVKTSGRsdeaeqAQSTPERPGPSDQPSVPVEPAHLTPVEHTTVILHQppqspPTVAIKPAPPP----- 496
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  315 elSSPPSNSKLLENHNSLSIaaiknesqlkasvseVDLLESDSEQSDNAATKTRFKPSevtassklksSGDHNSASASLN 394
Cdd:PHA03377  497 --SRRRRGACVVYDDDIIEV---------------IDVETTEEEESVTQPAKPHRKVQ----------DGFQRSGRRQKR 549
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  395 RTDPkvrPVTPSGTPPPSKSPPAVdntASVETNQTDSELPTETPVEESTLPSNPKEAVIMSD-AESTDKTEK--PQTRKK 471
Cdd:PHA03377  550 ATPP---KVSPSDRGPPKASPPVM---APPSTGPRVMATPSTGPRDMAPPSTGPRQQAKCKDgPPASGPHEKqpPSSAPR 623
                         250       260       270       280       290       300
                  ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 113674054  472 SSKPSVTTTSPESRLTSSKSPPVTKT---SSTQKETARAQSPSDSIDESAdMEDSPDEPSNSPT 532
Cdd:PHA03377  624 DMAPSVVRMFLRERLLEQSTGPKPKSfweMRAGRDGSGIQQEPSSRRQPA-TQSTPPRPSWLPS 686
PRK13914 PRK13914
invasion associated endopeptidase;
362-543 8.20e-05

invasion associated endopeptidase;


Pssm-ID: 237555 [Multi-domain]  Cd Length: 481  Bit Score: 47.10  E-value: 8.20e-05
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  362 NAATKTrfkPSEVTASSKlKSSGDHNSASASLNRTDpkVRPVTPSGTPPP----SKSPPAVDNTA--------------S 423
Cdd:PRK13914  141 DKVTST---PVAPTQEVK-KETTTQQAAPAAETKTE--VKQTTQATTPAPkvaeTKETPVVDQNAtthavksgdtiwalS 214
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  424 VETNQTDSELPTETPVEESTLPSNPKEAVIMSDAESTDK----TEKPQTRKKSSkPSVTTTSPESRLTSSKSPPVTKTSS 499
Cdd:PRK13914  215 VKYGVSVQDIMSWNNLSSSSIYVGQKLAIKQTANTATPKaevkTEAPAAEKQAA-PVVKENTNTNTATTEKKETTTQQQT 293
                         170       180       190       200
                  ....*....|....*....|....*....|....*....|....*...
gi 113674054  500 TQKETARAQ----SPSDSIDESADMEDSPDEPSNSPTESPTKTPDKTT 543
Cdd:PRK13914  294 APKAPTEAAkpapAPSTNTNANKTNTNTNTNTNNTNTSTPSKNTNTNT 341
DUF5585 pfam17823
Family of unknown function (DUF5585); This is a family of unknown function found in chordata.
309-543 9.54e-05

Family of unknown function (DUF5585); This is a family of unknown function found in chordata.


Pssm-ID: 465521 [Multi-domain]  Cd Length: 506  Bit Score: 46.88  E-value: 9.54e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   309 STQANTELSSPPSNSKllenhnSLSIAAIKNESQLKASVSEVDLLESDSEQSDNAATKTRFKPSEVTASSKLKSSgdhnS 388
Cdd:pfam17823  112 SRALAAAASSSPSSAA------QSLPAAIAALPSEAFSAPRAAACRANASAAPRAAIAAASAPHAASPAPRTAAS----S 181
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   389 ASASLNRTDPKVRPVTPSGTPPPSKSPPAVDNTASVETNQTDSELPTETPVEESTLPSNPKEAVIMSDAESTDKTEKPQT 468
Cdd:pfam17823  182 TTAASSTTAASSAPTTAASSAPATLTPARGISTAATATGHPAAGTALAAVGNSSPAAGTVTAAVGTVTPAALATLAAAAG 261
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 113674054   469 RKKSSKPSVTTTSPESR-LTSSKSPPVTKTSSTQKETARAQSPSDSIDESADmedspdEPSNSPTESPTKTPDKTT 543
Cdd:pfam17823  262 TVASAAGTINMGDPHARrLSPAKHMPSDTMARNPAAPMGAQAQGPIIQVSTD------QPVHNTAGEPTPSPSNTT 331
PRK08581 PRK08581
amidase domain-containing protein;
261-544 1.10e-04

amidase domain-containing protein;


Pssm-ID: 236304 [Multi-domain]  Cd Length: 619  Bit Score: 46.70  E-value: 1.10e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  261 EDSESSPFEKLIKTESQSTDVTPSVMTPNKQPEllSFQSTTKIKPEPQSTQANTELS-SPPSNSKLLENHNSLSIAAIKN 339
Cdd:PRK08581   29 DPQKDSTAKTTSHDSKKSNDDETSKDTSSKDTD--KADNNNTSNQDNNDKKFSTIDSsTSDSNNIIDFIYKNLPQTNINQ 106
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  340 E------------SQLKASVSEVDLLESDSEQSDNAATKTrfKPSEVTASSKLKSSGDHNSA--SASLNRTDPKVRPVTP 405
Cdd:PRK08581  107 LltknkyddnyslTTLIQNLFNLNSDISDYEQPRNSEKST--NDSNKNSDSSIKNDTDTQSSkqDKADNQKAPSSNNTKP 184
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  406 SGTPPPSKSPPAVDNTASVETNQTDSELPTETPVEESTLPSNPKEAVIMsDAESTDKTEKPQTRKKSSKPSVTTTSpesr 485
Cdd:PRK08581  185 STSNKQPNSPKPTQPNQSNSQPASDDTANQKSSSKDNQSMSDSALDSIL-DQYSEDAKKTQKDYASQSKKDKTETS---- 259
                         250       260       270       280       290       300
                  ....*....|....*....|....*....|....*....|....*....|....*....|....
gi 113674054  486 ltSSKSPPVTKTSSTQKETARAQS-----PSDSIDESADMEDSPDEPSNSPTESPTKTPDKTTR 544
Cdd:PRK08581  260 --NTKNPQLPTQDELKHKSKPAQSfendvNQSNTRSTSLFETGPSLSNNDDSGSFNVVDSKDTR 321
SET_SETD5 cd19181
SET domain (including post-SET domain) found in SET domain-containing protein 5 (SETD5) and ...
1346-1425 2.78e-04

SET domain (including post-SET domain) found in SET domain-containing protein 5 (SETD5) and similar proteins; SETD5 is a probable transcriptional regulator that acts via the formation of large multiprotein complexes that modify and/or remodel the chromatin. SETD5 loss-of-function mutations are a likely cause of a familial syndromic intellectual disability with variable phenotypic expression.


Pssm-ID: 380958  Cd Length: 150  Bit Score: 42.69  E-value: 2.78e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054 1346 FNDEDACyiIDARQEGNLGRYINHSCSPNLFVQNVFVD--THdlrfpwVAFFASKRIKAGTELTWDYNYEVGSVEGKVLL 1423
Cdd:cd19181    66 FNGVEMC--VDARTFGNDARFIRRSCTPNAEVRHMIADgmIH------LCIYAVAAIAKDAEVTIAFDYEYSNCNYKVDC 137

                  ..
gi 113674054 1424 CC 1425
Cdd:cd19181   138 AC 139
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
380-543 5.17e-04

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 44.76  E-value: 5.17e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   380 LKSSGDHNSASASLNRTDPKVRPVTPSG------TPPPSKSPPAVDNTASVETNQTDSELPTETPVEESTLPSNPKEAVI 453
Cdd:pfam03154   39 LRSSGRNSPSAASTSSNDSKAESMKKSSkkikeeAPSPLKSAKRQREKGASDTEEPERATAKKSKTQEISRPNSPSEGEG 118
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   454 MS-------DAESTDKTEKPQTrKKSSKPSVtttsPESRLTSSKSppvtkTSSTQKETARAQSPSDSIDESADMEDSPDE 526
Cdd:pfam03154  119 ESsdgrsvnDEGSSDPKDIDQD-NRSTSPSI----PSPQDNESDS-----DSSAQQQILQTQPPVLQAQSGAASPPSPPP 188
                          170
                   ....*....|....*..
gi 113674054   527 PSNSPTESPTKTPDKTT 543
Cdd:pfam03154  189 PGTTQAATAGPTPSAPS 205
AF-4 pfam05110
AF-4 proto-oncoprotein N-terminal region; This family consists of AF4 (Proto-oncogene AF4) and ...
404-546 7.38e-04

AF-4 proto-oncoprotein N-terminal region; This family consists of AF4 (Proto-oncogene AF4) and FMR2 (Fragile X E mental retardation syndrome) nuclear proteins. These proteins have been linked to human diseases such as acute lymphoblastic leukaemia and mental retardation. The family also contains a Drosophila AF4 protein homolog Lilliputian which contains an AT-hook domain. Lilliputian represents a novel pair-rule gene that acts in cytoskeleton regulation, segmentation and morphogenesis in Drosophila.


Pssm-ID: 461550 [Multi-domain]  Cd Length: 514  Bit Score: 43.96  E-value: 7.38e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   404 TPSgTPPPSKSP-PAVDN--TASVETNQTDSELPTETPVEESTLPSNPKEAVIMSDAESTDKTEKPQTRKKSSKPSVTTT 480
Cdd:pfam05110  369 TPS-TAEPSKFPfPTKESqhVTSGYQNQKQYDAPSKTLPTSQQGTSMLEDDLKLSSSEDSDDDQAPEKPPPSSAPPSAPQ 447
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 113674054   481 SPESRLTSSKSPPVTKTSSTQKETAraqSPSDSIDESAdmedSPDEPSNSPTESPTKTPDKTTRND 546
Cdd:pfam05110  448 SQPNSVASAHSSSGESGSSSDSESS---SESDSESESS----SSDSEANEPPRSATPEPEPPSSNK 506
PRK14948 PRK14948
DNA polymerase III subunit gamma/tau;
245-474 7.67e-04

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237862 [Multi-domain]  Cd Length: 620  Bit Score: 43.80  E-value: 7.67e-04
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  245 KPTIKTEPEWTPLTPWEDSESSPFEKLIKTESQSTDVTPSVMTPNKQPEllsfqsTTKIKPEPQSTQANTE--------- 315
Cdd:PRK14948  372 SAPANPTPAPNPSPPPAPIQPSAPKTKQAATTPSPPPAKASPPIPVPAE------PTEPSPTPPANAANAPpslnleelw 445
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  316 ---LSS--PPSNSKLLENHNSL-SIaaikNESQLKASVSE--VDLLESDSEQSDNAATKTRFKPSEVTASSKlkssgdHN 387
Cdd:PRK14948  446 qqiLAKleLPSTRMLLSQQAELvSL----DSNRAVIAVSPnwLGMVQSRKPLLEQAFAKVLGRSIKLNLESQ------SG 515
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  388 SASASLNRTDPKVRPVTPSGTPPPSKSPPAVDNTASVETNQTDSELPTETPVEESTLPSNPKEAVIMSDAESTDKTEKPQ 467
Cdd:PRK14948  516 SASNTAKTPPPPQKSPPPPAPTPPLPQPTATAPPPTPPPPPPTATQASSNAPAQIPADSSPPPPIPEEPTPSPTKDSSPE 595

                  ....*..
gi 113674054  468 TRKKSSK 474
Cdd:PRK14948  596 EIDKAAK 602
PHA03247 PHA03247
large tegument protein UL36; Provisional
402-539 1.31e-03

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 43.39  E-value: 1.31e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  402 PVTPSGTPPPSKSPPAVdNTASVETNQTDSElPTETPVEESTLPSNPKEAVIMSDA--ESTDKTEKPQTRKKSSKPSVTT 479
Cdd:PHA03247 2772 PAAPAAGPPRRLTRPAV-ASLSESRESLPSP-WDPADPPAAVLAPAAALPPAASPAgpLPPPTSAQPTAPPPPPGPPPPS 2849
                          90       100       110       120       130       140
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 113674054  480 TSPE---------SRLTSSKSPPVTKTSSTQKETARAQSPSDSiDESADMEDSPDEPSNSPTESPTKTP 539
Cdd:PHA03247 2850 LPLGgsvapggdvRRRPPSRSPAAKPAAPARPPVRRLARPAVS-RSTESFALPPDQPERPPQPQAPPPP 2917
TALPID3 pfam15324
Hedgehog signalling target; TALPID3 is a family of eukaryotic proteins that are targets for ...
396-541 1.89e-03

Hedgehog signalling target; TALPID3 is a family of eukaryotic proteins that are targets for Hedgehog signalling. Mutations in this gene, first noticed in chickens, lead to multiple abnormalities of development.


Pssm-ID: 434634 [Multi-domain]  Cd Length: 1288  Bit Score: 42.95  E-value: 1.89e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   396 TDPKVRPV-----TPSGTPPPSKSP-PAVDNTASVETNQTDSELP-----TETPVEESTlPSNPKE-----AVIMSDAES 459
Cdd:pfam15324 1037 TGPAVSLVitptvTPIATPPPAATPtPPLSENSIDKLKSPSPELPkpwedSDLPLEEEN-PNSEQEelhprAVVMSVARD 1115
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   460 TDktekpqtrkksskpsvtttsPESRLTSSkSPPVTKTSSTQKETARAQSPSdsidESADMEDSPDEPSNSPTESPTKTP 539
Cdd:pfam15324 1116 EE--------------------PESVVLPA-SPPEPKPLAPPPLGAAPPSPP----QSPSSSSSTLESSSSLTVTETETA 1170

                   ..
gi 113674054   540 DK 541
Cdd:pfam15324 1171 DR 1172
Metaviral_G pfam09595
Metaviral_G glycoprotein; This is a viral attachment glycoprotein from region G of metaviruses. ...
412-546 2.71e-03

Metaviral_G glycoprotein; This is a viral attachment glycoprotein from region G of metaviruses. It is high in serine and threonine, suggesting it is highly glycosylated.


Pssm-ID: 462833 [Multi-domain]  Cd Length: 183  Bit Score: 40.71  E-value: 2.71e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   412 SKSPPAVDN--TASVETNQTDSELPTETPVEESTLPSNPKEAVIMSDAESTDKtekpqtrkksskPSVTTTSPESRLTSS 489
Cdd:pfam09595   32 SLILIGESNkeAALIITDIIDININKQHPEQEHHENPPLNEAAKEAPSESEDA------------PDIDPNNQHPSQDRS 99
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*...
gi 113674054   490 KSPPVTKTSSTQKETARAQSPSDsideSADMEDSPDEPSNSPTESPTKT-PDKTTRND 546
Cdd:pfam09595  100 EAPPLEPAAKTKPSEHEPANPPD----ASNRLSPPDASTAAIREARTFRkPSTGKRNN 153
PTZ00449 PTZ00449
104 kDa microneme/rhoptry antigen; Provisional
239-543 2.86e-03

104 kDa microneme/rhoptry antigen; Provisional


Pssm-ID: 185628 [Multi-domain]  Cd Length: 943  Bit Score: 42.37  E-value: 2.86e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  239 DHDSYFKPTIKTEPEWTPLTPWEDSESSPFEKLIKTESQSTdvtpsvmTPNKQPELLSFQSTTKiKPEPQSTQAntelss 318
Cdd:PTZ00449  503 DSDKHDEPPEGPEASGLPPKAPGDKEGEEGEHEDSKESDEP-------KEGGKPGETKEGEVGK-KPGPAKEHK------ 568
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  319 pPSNSKLLEnhnslsiaaiKNESQLKASVSEVDLLESDSEQSDNAATK-TRFKPSEVTASSKLKSSGDHNSASASLNRTD 397
Cdd:PTZ00449  569 -PSKIPTLS----------KKPEFPKDPKHPKDPEEPKKPKRPRSAQRpTRPKSPKLPELLDIPKSPKRPESPKSPKRPP 637
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  398 PKVRPVTP-----SGTPPPSKSP--PAVDNTASVETNQTDSELPTETPVEESTLPSNPKEAVIMSDAESTDKTEKPQTRK 470
Cdd:PTZ00449  638 PPQRPSSPerpegPKIIKSPKPPksPKPPFDPKFKEKFYDDYLDAAAKSKETKTTVVLDESFESILKETLPETPGTPFTT 717
                         250       260       270       280       290       300       310
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 113674054  471 KSSKPSVTTTSPESRLTSSKSPPVTKTSSTQKETAraqspsdSIDESADMEDSP-DEPSNSPTESPTKTPDKTT 543
Cdd:PTZ00449  718 PRPLPPKLPRDEEFPFEPIGDPDAEQPDDIEFFTP-------PEEERTFFHETPaDTPLPDILAEEFKEEDIHA 784
rad2 TIGR00600
DNA excision repair protein (rad2); All proteins in this family for which functions are known ...
226-540 4.81e-03

DNA excision repair protein (rad2); All proteins in this family for which functions are known are flap endonucleases that generate the 3' incision next to DNA damage as part of nucleotide excision repair. This family is related to many other flap endonuclease families including the fen1 family. This family is based on the phylogenomic analysis of JA Eisen (1999, Ph.D. Thesis, Stanford University). [DNA metabolism, DNA replication, recombination, and repair]


Pssm-ID: 273166 [Multi-domain]  Cd Length: 1034  Bit Score: 41.42  E-value: 4.81e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   226 SLFNSSEQCKSIADHDSYFkPTIKTEPEWTPLTPWEdsESSPFEKLIKTESQSTDvtPSVMTPNKQPELLSFQSTTKIKP 305
Cdd:TIGR00600  312 SLPSLSSQLDSNSEDLKSS-PWEKLKPESESIVEAE--PPSPRTLLAKQAAMSES--SSEDSDESEWERQELKRNNVAFV 386
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   306 EPQSTQANTElsspPSNSKLLENHNSLSIAAiKNESQLKASVSEVDLLESDSEQSDNA-ATKTRFKPSEVTASSKLKSSG 384
Cdd:TIGR00600  387 DDGSLSPRTL----QAIGQALDDDEDKKVSA-SSDDQASPSKKTKMLLISRIEVEDDDlDYLDQGEGIPLMAALQLSSVN 461
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   385 DHNSASASLNRTdpkvRPVTPSGTpppSKSPPAVDNTASVETNqtDSELPTEtpveeSTLPSNPKEAVImsdaestdkte 464
Cdd:TIGR00600  462 SKPEAVASTKIA----REVTSSGH---EAVPKAVQSLLLGATN--DSPIPSE-----FTILDRKSELSI----------- 516
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   465 kpqtrKKSSKPSVTTTSPESRLTSSKSPPVTKTSSTQK-----------------ETARAQSPSDSIDESADMEdSPDEP 527
Cdd:TIGR00600  517 -----ERTVKPVSSEFGLPSQREDKLAIPTEGTQNLQGisdhpeqfefqnelsplETKNNESNLSSDAETEGSP-NPEMP 590
                          330
                   ....*....|...
gi 113674054   528 SNSPTESPTKTPD 540
Cdd:TIGR00600  591 SWSSVTVPSEALD 603
MSCRAMM_ClfA NF033609
MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial ...
380-543 5.72e-03

MSCRAMM family adhesin clumping factor ClfA; Clumping factor A is an MSCRAMM (Microbial Surface Components Recognizing Adhesive Matrix Molecules). It is heavily studied in Staphylococcus aureus both for its biological role in adhesion and for its potential for vaccination. Features of the sequence, but also of other MSCRAMM adhesins, include a long run of Ser-Asp dipeptide repeats and a C-terminal cell wall anchoring LPXTG motif.


Pssm-ID: 468110 [Multi-domain]  Cd Length: 934  Bit Score: 41.43  E-value: 5.72e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  380 LKSSGDHNSASASLNRTDPKVRPvTPSGTPPPSKSPPAVDNTasvetNQTDSELPTETPVEESTLPSNPKEavimsdaes 459
Cdd:NF033609   31 LLSSKEADASENSVTQSDSASNE-SKSNDSSSVSAAPKTDDT-----NVSDTKTSSNTNNGETSVAQNPAQ--------- 95
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  460 TDKTEKPQTRKKSSKPSVTTTSPESRLTSSKSPPVTKTSSTQKETARAQSPSDSIDESADMEDSPDEPSNSP-TESPTKT 538
Cdd:NF033609   96 QETTQSASTNATTEETPVTGEATTTATNQANTPATTQSSNTNAEELVNQTSNETTSNDTNTVSSVNSPQNSTnAENVSTT 175

                  ....*
gi 113674054  539 PDKTT 543
Cdd:NF033609  176 QDTST 180
SET_KMT2E cd19182
SET domain found in inactive histone-lysine N-methyltransferase 2E (KMT2E) and similar ...
1355-1415 5.86e-03

SET domain found in inactive histone-lysine N-methyltransferase 2E (KMT2E) and similar proteins; KMT2E (also termed inactive lysine N-methyltransferase 2E, myeloid/lymphoid or mixed-lineage leukemia protein 5 (MLL5)) plays a key role in hematopoiesis, spermatogenesis and cell cycle progression. It associates with chromatin regions downstream of transcriptional start sites of active genes and thus regulates gene transcription. Lack of key residues in the SET domain as well as the presence of an unusually large loop in the SET-I subdomain preclude the interaction of MLL5 SET with its cofactor and substrate thus making MLL5 devoid of any in vitro methyltransferase activity on full-length histones and histone H3 peptide.


Pssm-ID: 380959  Cd Length: 129  Bit Score: 38.33  E-value: 5.86e-03
                          10        20        30        40        50        60
                  ....*....|....*....|....*....|....*....|....*....|....*....|...
gi 113674054 1355 IDARQEGNLGRYINHSCSPNLFVQNVFVD--THdlrfpwVAFFASKRIKAGTELTWDYNYEVG 1415
Cdd:cd19182    73 VDARTFGNEARFIRRSCTPNAEVRHVIEDgtIH------LYIYSIRSIPKGTEITIAFDFDYG 129
DUF4045 pfam13254
Domain of unknown function (DUF4045); This presumed domain is functionally uncharacterized. ...
306-537 6.06e-03

Domain of unknown function (DUF4045); This presumed domain is functionally uncharacterized. This domain family is found in bacteria and eukaryotes, and is typically between 384 and 430 amino acids in length.


Pssm-ID: 433066 [Multi-domain]  Cd Length: 415  Bit Score: 40.92  E-value: 6.06e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   306 EPQSTQANTELSSPPSNSKLLENhnslsiaaiKNESQLKASVSEVDLLESDSEQSdNAATKTRFKPSEVTASSKLKSSgd 385
Cdd:pfam13254  121 TGSEEDSPSLPTSPPSPSKTMDP---------KRWSPTKSSWLESALNRPESPKP-KAQPSQPAQPAWMKELNKIRQS-- 188
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   386 hnSASASLNRTDPkVRPVTPSGtPPPSKSPPAVDNTASVETNQTDSELPTETPVEESTLPSNPKEaviMSDAESTDKTEK 465
Cdd:pfam13254  189 --RASVDLGRPNS-FKEVTPVG-LMRSPAPGGHSKSPSVSGISADSSPTKEEPSEEADTLSTDKE---QSPAPTSASEPP 261
                          170       180       190       200       210       220       230
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 113674054   466 PQTrKKSSKPSVTTTSPESRLTSSKSPPVTKTSSTQKETARAQSPSDSIDESADMEDSP------DEPSNSP-TESPTK 537
Cdd:pfam13254  262 PKT-KELPKDSEEPAAPSKSAEASTEKKEPDTESSPETSSEKSAPSLLSPVSKASIDKPlsspdrDPLSPKPkPQSPPK 339
PHA03247 PHA03247
large tegument protein UL36; Provisional
371-540 6.98e-03

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 41.08  E-value: 6.98e-03
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  371 PSEVTASSKLKSSGDHNSASASLNRTDPKVRPVTPSGTPPPSKSPPAVDNTasvetnqTDSELPTETPVEEStlPSNPKE 450
Cdd:PHA03247 2780 PRRLTRPAVASLSESRESLPSPWDPADPPAAVLAPAAALPPAASPAGPLPP-------PTSAQPTAPPPPPG--PPPPSL 2850
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054  451 AVIMSDAESTDKTEKPQTRKKSSKPSVTTTSPESRLTSSKSPPVTKTSSTQKETARAQSPSDSIDESADMEDSPDEPSNS 530
Cdd:PHA03247 2851 PLGGSVAPGGDVRRRPPSRSPAAKPAAPARPPVRRLARPAVSRSTESFALPPDQPERPPQPQAPPPPQPQPQPPPPPQPQ 2930
                         170
                  ....*....|
gi 113674054  531 PTESPTKTPD 540
Cdd:PHA03247 2931 PPPPPPPRPQ 2940
Pneumo_att_G pfam05539
Pneumovirinae attachment membrane glycoprotein G;
398-544 7.11e-03

Pneumovirinae attachment membrane glycoprotein G;


Pssm-ID: 114270 [Multi-domain]  Cd Length: 408  Bit Score: 40.42  E-value: 7.11e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 113674054   398 PKVRPVTPSGTPPPSKS--PPAVDNTASVETNQTDSELPTETPVEESTLPSnpkeavIMSDAESTDKTEKPQTRKKSSKP 475
Cdd:pfam05539  168 PKTAVTTSKTTSWPTEVshPTYPSQVTPQSQPATQGHQTATANQRLSSTEP------VGTQGTTTSSNPEPQTEPPPSQR 241
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 113674054   476 SVTTTSPEsrLTSSKSPPVTKTSSTQKETARAQSPS--DSIDESADMEDSPDEPSNSPTESPTKTPDKTTR 544
Cdd:pfam05539  242 GPSGSPQH--PPSTTSQDQSTTGDGQEHTQRRKTPPatSNRRSPHSTATPPPTTKRQETGRPTPRPTATTQ 310
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options:
Database: CDSEARCH/cdd
Low complexity filter: no
Composition Based Adjustment: yes
E-value threshold: 0.01
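
The hit list above is cut off at the E-value threshold of 0.01 given in these parameters. As a minimal sketch, the per-hit summary lines on this page (Pssm-ID, Cd Length, Bit Score, E-value) can be parsed and checked against that threshold. The helper name `parse_summary`, the regular expression, and the use of `<=` for the comparison are illustrative assumptions, not part of any NCBI tool; the sample lines are copied from the hits listed above.

```python
import re

# Hypothetical pattern for the summary lines as they appear on this page,
# e.g. "Pssm-ID: 236304 [Multi-domain]  Cd Length: 619  Bit Score: 46.70  E-value: 1.10e-04"
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)"
    r"(?:\s*\[(?P<flag>[^\]]+)\])?"
    r"\s+Cd Length:\s*(?P<cd_length>\d+)"
    r"\s+Bit Score:\s*(?P<bit_score>[\d.]+)"
    r"\s+E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the fields of one summary line as a dict, or None if it does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "multi_domain": m.group("flag") == "Multi-domain",
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


# Threshold taken from the search parameters; treating it as inclusive is an assumption.
E_VALUE_THRESHOLD = 0.01

lines = [
    "Pssm-ID: 236304 [Multi-domain]  Cd Length: 619  Bit Score: 46.70  E-value: 1.10e-04",
    "Pssm-ID: 380958  Cd Length: 150  Bit Score: 42.69  E-value: 2.78e-04",
    "Pssm-ID: 114270 [Multi-domain]  Cd Length: 408  Bit Score: 40.42  E-value: 7.11e-03",
]

hits = [parse_summary(line) for line in lines]
reported = [h for h in hits if h["e_value"] <= E_VALUE_THRESHOLD]

for h in reported:
    print(h["pssm_id"], h["bit_score"], h["e_value"])
```

All three sample hits fall under the threshold, which is consistent with their appearing in the list.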

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D1):D384-D388.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D1):D265-D268.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D1):D200-D203.