Conserved domains on  [gi|1841659210|ref|NP_001369477|]
transcription elongation regulator 1 isoform 3 [Homo sapiens]

List of domain hits

Name Accession Description Interval E-value
PRP40 super family cl34905
Splicing factor [RNA processing and modification];
421-1035 1.78e-23

Splicing factor [RNA processing and modification];


The actual alignment was detected with superfamily member COG5104:

Pssm-ID: 227435 [Multi-domain]  Cd Length: 590  Bit Score: 106.32  E-value: 1.78e-23
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  421 ASPATLAGATAVSEWTEYKTADGKTYYYNNRTLESTWEKPQElkekekleekikepIKEPSEEPLPMEteeedpkeepik 500
Cdd:COG5104      3 AALLGMASGEARSEWEELKAPDGRIYYYNKRTGKSSWEKPKE--------------LLKGSEEDLDVD------------ 56
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  501 eikeepkeeemteeekaaqkakpvatapipgtPWCVVWTGDERVFFYNPTTRLSMWDRpddligradvdkiiqePPHKKG 580
Cdd:COG5104     57 --------------------------------PWKECRTADGKVYYYNSITRESRWKI----------------PPERKK 88
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  581 MEELKKLRHPTPTMLSIQKWQFSMSAIkEEQELMEEINEDepvkakkrkrmskksfmwiaraslfrrddnkdidsEKEAA 660
Cdd:COG5104     89 VEPIAEQKHDERSMIGGNGNDMAITDH-ETSEPKYLLGRL-----------------------------------MSQYG 132
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  661 MEAEIKAARERAivpLEARMKQFKDMLLERGVSAFSTWEKELHKIVfDPRYLLL--NPKERKQVFDQYVKTRAEEERREK 738
Cdd:COG5104    133 ITSTKDAVYRLT---KEEAEKEFITMLKENQVDSTWPIFRAIEELR-DPRYWMVdtDPLWRKDLFKKYFENQEKDQREEE 208
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  739 KNKIMQAKEDFKKMME-EAKFNPRATFSEFAAKHAKDSRFKAIEKMKDREALFNEFVAAARKKEKEDSKTRGEKIKSDFF 817
Cdd:COG5104    209 ENKQRKYINEFCKMLAgNSHIKYYTDWFTFKSIFSKHPYYSSVVNEKTKRQTFQKYKDKLGCYEKYVGKHMGGTALGRLE 288
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  818 ELLSNHHLDSQSRWSKVKDKVESDPRYKAvdSSSM----REDLFKQYIeKIAKNLdsekekelerqarieaslrerEREV 893
Cdd:COG5104    289 EVLRSLGSETFIIWLLNHYVFDSVVRYLK--NKEMkpldRKDILFSFI-RYVRRL---------------------EKEL 344
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  894 QKARSEQTKEIDReREQHKREeaiqNFKALLSDMVRSSDVS----WSDTRRTLRKDHRWESGSLLEREEKEKLFNEHIEA 969
Cdd:COG5104    345 LSAIEERKAAAAQ-NARHHRD----EFRTLLRKLYSEGKIYyrmkWKNAYPLIKDDPRFLNLLGRTGSSPLDLFFDFIVD 419
                          570       580       590       600       610       620       630
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1841659210  970 LTKKKREHFRQLLDETSaITLTSTW--KEVKKIIKEDPRciKFSSSDRKKQREFEE---YIRDKYITAKAD 1035
Cdd:COG5104    420 LENMYGFARRSYERETR-TGQISPTdrRAVDEIFEAIAE--KKEEGEIKFDKVDKEdisLIVDGLIKQRNE 487
WW pfam00397
WW domain; The WW domain is a protein module with two highly conserved tryptophans that binds ...
137-162 7.56e-08

WW domain; The WW domain is a protein module with two highly conserved tryptophans that binds proline-rich peptide motifs in vitro.


Pssm-ID: 459800 [Multi-domain]  Cd Length: 30  Bit Score: 49.04  E-value: 7.56e-08
                           10        20
                   ....*....|....*....|....*.
gi 1841659210  137 WVENKTPDGKVYYYNARTRESAWTKP 162
Cdd:pfam00397    5 WEERWDPDGRVYYYNHETGETQWEKP 30
Herpes_BLLF1 super family cl37540
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
260-365 3.98e-06

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


The actual alignment was detected with superfamily member pfam05109:

Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 51.07  E-value: 3.98e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  260 TPTTSSPAPAVSTStssstpssttsttttatsvAQTVSTPTTQDQTPSSAVSVATP-----TVSVSTPAPTAT------- 327
Cdd:pfam05109  517 TPNATSPTPAVTTP-------------------TPNATSPTLGKTSPTSAVTTPTPnatspTPAVTTPTPNATiptlgkt 577
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*
gi 1841659210  328 -PVQTVPQPHPQTLPPAVPHSVPQPTT------AIPAFPPVMVPP 365
Cdd:pfam05109  578 sPTSAVTTPTPNATSPTVGETSPQANTtnhtlgGTSSTPVVTSPP 622
PRK13729 super family cl42933
conjugal transfer pilus assembly protein TraB; Provisional
323-413 2.91e-05

conjugal transfer pilus assembly protein TraB; Provisional


The actual alignment was detected with superfamily member PRK13729:

Pssm-ID: 184281 [Multi-domain]  Cd Length: 475  Bit Score: 47.90  E-value: 2.91e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  323 APTATPVQTVPQPHPQT-LPPAVPHSVPQPTTAIPAFPP---VMVPPFRVPLPGMPIPLPGVamMQIVSCPYVKTVATTK 398
Cdd:PRK13729   122 ALGANPVTATGEPVPQMpASPPGPEGEPQPGNTPVSFPPqgsVAVPPPTAFYPGNGVTPPPQ--VTYQSVPVPNRIQRKT 199
                           90
                   ....*....|....*
gi 1841659210  399 TGVLPGMAPPIVPMI 413
Cdd:PRK13729   200 FTYNEGKKGPSLPYI 214
FF smart00441
Contains two conserved F residues; A novel motif that often accompanies WW domains. Often ...
1030-1094 7.03e-05

Contains two conserved F residues; A novel motif that often accompanies WW domains. Often contains two conserved Phe (F) residues.


Pssm-ID: 128718 [Multi-domain]  Cd Length: 55  Bit Score: 41.41  E-value: 7.03e-05
                            10        20        30        40        50        60
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1841659210  1030 ITAKADFRTLLKETKFITYrskkliqesDQHLKDVEKILQNDKRYLVLDcVPEERRKLIVAYVDD 1094
Cdd:smart00441    1 EEAKEAFKELLKEHEVITP---------DTTWSEARKKLKNDPRYKALL-SESEREQLFEDHIEE 55
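
The per-hit lines in this listing follow a regular layout: an interval line such as `421-1035 1.78e-23`, and a summary line carrying the Pssm-ID, Cd Length, Bit Score, and E-value. The sketch below shows one way to pull those fields out of the plain text. It assumes the lines look exactly like the ones above; the helper names `parse_hit_summary` and `parse_interval` are hypothetical and are not part of any NCBI tool or API.

```python
import re

# Hypothetical pattern for the "Pssm-ID: ..." summary line as it appears in this listing.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*"
    r"(?:\[(?P<kind>[^\]]+)\])?\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)

# Hypothetical pattern for the "Interval E-value" line, e.g. "421-1035 1.78e-23".
INTERVAL_RE = re.compile(r"\s*(\d+)-(\d+)\s+([\d.eE+-]+)\s*")


def parse_hit_summary(line):
    """Return the summary fields as a dict, or None if the line does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "kind": m.group("kind"),  # e.g. "Multi-domain"; None if the bracket is absent
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


def parse_interval(line):
    """Return start, end, inclusive span, and E-value, or None if the line does not match."""
    m = INTERVAL_RE.fullmatch(line)
    if m is None:
        return None
    start, end = int(m.group(1)), int(m.group(2))
    return {
        "start": start,
        "end": end,
        "span": end - start + 1,  # residues covered on the query, counted inclusively
        "e_value": float(m.group(3)),
    }


summary = parse_hit_summary(
    "Pssm-ID: 227435 [Multi-domain]  Cd Length: 590  Bit Score: 106.32  E-value: 1.78e-23"
)
interval = parse_interval("421-1035 1.78e-23")
```

Applied to the PRP40 hit, `interval["span"]` is the number of query residues covered by the interval 421-1035, counted inclusively, and `summary["cd_length"]` is the length of the domain model it was aligned against.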
 
Name Accession Description Interval E-value
PRP40 COG5104
Splicing factor [RNA processing and modification];
421-1035 1.78e-23

Splicing factor [RNA processing and modification];


Pssm-ID: 227435 [Multi-domain]  Cd Length: 590  Bit Score: 106.32  E-value: 1.78e-23
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  421 ASPATLAGATAVSEWTEYKTADGKTYYYNNRTLESTWEKPQElkekekleekikepIKEPSEEPLPMEteeedpkeepik 500
Cdd:COG5104      3 AALLGMASGEARSEWEELKAPDGRIYYYNKRTGKSSWEKPKE--------------LLKGSEEDLDVD------------ 56
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  501 eikeepkeeemteeekaaqkakpvatapipgtPWCVVWTGDERVFFYNPTTRLSMWDRpddligradvdkiiqePPHKKG 580
Cdd:COG5104     57 --------------------------------PWKECRTADGKVYYYNSITRESRWKI----------------PPERKK 88
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  581 MEELKKLRHPTPTMLSIQKWQFSMSAIkEEQELMEEINEDepvkakkrkrmskksfmwiaraslfrrddnkdidsEKEAA 660
Cdd:COG5104     89 VEPIAEQKHDERSMIGGNGNDMAITDH-ETSEPKYLLGRL-----------------------------------MSQYG 132
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  661 MEAEIKAARERAivpLEARMKQFKDMLLERGVSAFSTWEKELHKIVfDPRYLLL--NPKERKQVFDQYVKTRAEEERREK 738
Cdd:COG5104    133 ITSTKDAVYRLT---KEEAEKEFITMLKENQVDSTWPIFRAIEELR-DPRYWMVdtDPLWRKDLFKKYFENQEKDQREEE 208
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  739 KNKIMQAKEDFKKMME-EAKFNPRATFSEFAAKHAKDSRFKAIEKMKDREALFNEFVAAARKKEKEDSKTRGEKIKSDFF 817
Cdd:COG5104    209 ENKQRKYINEFCKMLAgNSHIKYYTDWFTFKSIFSKHPYYSSVVNEKTKRQTFQKYKDKLGCYEKYVGKHMGGTALGRLE 288
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  818 ELLSNHHLDSQSRWSKVKDKVESDPRYKAvdSSSM----REDLFKQYIeKIAKNLdsekekelerqarieaslrerEREV 893
Cdd:COG5104    289 EVLRSLGSETFIIWLLNHYVFDSVVRYLK--NKEMkpldRKDILFSFI-RYVRRL---------------------EKEL 344
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  894 QKARSEQTKEIDReREQHKREeaiqNFKALLSDMVRSSDVS----WSDTRRTLRKDHRWESGSLLEREEKEKLFNEHIEA 969
Cdd:COG5104    345 LSAIEERKAAAAQ-NARHHRD----EFRTLLRKLYSEGKIYyrmkWKNAYPLIKDDPRFLNLLGRTGSSPLDLFFDFIVD 419
                          570       580       590       600       610       620       630
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1841659210  970 LTKKKREHFRQLLDETSaITLTSTW--KEVKKIIKEDPRciKFSSSDRKKQREFEE---YIRDKYITAKAD 1035
Cdd:COG5104    420 LENMYGFARRSYERETR-TGQISPTdrRAVDEIFEAIAE--KKEEGEIKFDKVDKEdisLIVDGLIKQRNE 487
FF pfam01846
FF domain; This domain has been predicted to be involved in protein-protein interaction. This ...
811-860 4.24e-14

FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions.


Pssm-ID: 426471 [Multi-domain]  Cd Length: 50  Bit Score: 67.48  E-value: 4.24e-14
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|
gi 1841659210  811 KIKSDFFELLSNHHLDSQSRWSKVKDKVESDPRYKAVDSSSMREDLFKQY 860
Cdd:pfam01846    1 KAREAFKELLKEHKITPYSTWSEIKKKIENDPRYKALLDGSEREELFEDY 50
FF smart00441
Contains two conserved F residues; A novel motif that often accompanies WW domains. Often ...
972-1027 2.81e-10

Contains two conserved F residues; A novel motif that often accompanies WW domains. Often contains two conserved Phe (F) residues.


Pssm-ID: 128718 [Multi-domain]  Cd Length: 55  Bit Score: 56.81  E-value: 2.81e-10
                            10        20        30        40        50
                    ....*....|....*....|....*....|....*....|....*....|....*.
gi 1841659210   972 KKKREHFRQLLDETSAITLTSTWKEVKKIIKEDPRCiKFSSSDRKKQREFEEYIRD 1027
Cdd:smart00441    1 EEAKEAFKELLKEHEVITPDTTWSEARKKLKNDPRY-KALLSESEREQLFEDHIEE 55
WW cd00201
Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; ...
433-460 2.71e-08

Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs; 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs.


Pssm-ID: 238122 [Multi-domain]  Cd Length: 31  Bit Score: 50.60  E-value: 2.71e-08
                           10        20
                   ....*....|....*....|....*...
gi 1841659210  433 SEWTEYKTADGKTYYYNNRTLESTWEKP 460
Cdd:cd00201      2 PGWEERWDPDGRVYYYNHNTKETQWEDP 29
WW pfam00397
WW domain; The WW domain is a protein module with two highly conserved tryptophans that binds ...
137-162 7.56e-08

WW domain; The WW domain is a protein module with two highly conserved tryptophans that binds proline-rich peptide motifs in vitro.


Pssm-ID: 459800 [Multi-domain]  Cd Length: 30  Bit Score: 49.04  E-value: 7.56e-08
                           10        20
                   ....*....|....*....|....*.
gi 1841659210  137 WVENKTPDGKVYYYNARTRESAWTKP 162
Cdd:pfam00397    5 WEERWDPDGRVYYYNHETGETQWEKP 30
WW smart00456
Domain with 2 conserved Trp (W) residues; Also known as the WWP or rsp5 domain. Binds ...
132-164 7.89e-08

Domain with 2 conserved Trp (W) residues; Also known as the WWP or rsp5 domain. Binds proline-rich polypeptides.


Pssm-ID: 197736 [Multi-domain]  Cd Length: 33  Bit Score: 49.14  E-value: 7.89e-08
                            10        20        30
                    ....*....|....*....|....*....|...
gi 1841659210   132 PTEEIWVENKTPDGKVYYYNARTRESAWTKPDG 164
Cdd:smart00456    1 PLPPGWEERKDPDGRPYYYNHETKETQWEKPRE 33
WW cd00201
Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; ...
137-164 1.99e-07

Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs; 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs.


Pssm-ID: 238122 [Multi-domain]  Cd Length: 31  Bit Score: 47.91  E-value: 1.99e-07
                           10        20
                   ....*....|....*....|....*...
gi 1841659210  137 WVENKTPDGKVYYYNARTRESAWTKPDG 164
Cdd:cd00201      4 WEERWDPDGRVYYYNHNTKETQWEDPRE 31
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
260-365 3.98e-06

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 51.07  E-value: 3.98e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  260 TPTTSSPAPAVSTStssstpssttsttttatsvAQTVSTPTTQDQTPSSAVSVATP-----TVSVSTPAPTAT------- 327
Cdd:pfam05109  517 TPNATSPTPAVTTP-------------------TPNATSPTLGKTSPTSAVTTPTPnatspTPAVTTPTPNATiptlgkt 577
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*
gi 1841659210  328 -PVQTVPQPHPQTLPPAVPHSVPQPTT------AIPAFPPVMVPP 365
Cdd:pfam05109  578 sPTSAVTTPTPNATSPTVGETSPQANTtnhtlgGTSSTPVVTSPP 622
PTZ00121 PTZ00121
MAEBL; Provisional
727-1067 5.23e-06

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 50.91  E-value: 5.23e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  727 VKTRAEEERR--EKKNKIMQAK--EDFKKMMEEAKFNPRATFSEFAAKHAKDSRFKAIEKMKDREAlfNEFVAAARKKEK 802
Cdd:PTZ00121  1423 AKKKAEEKKKadEAKKKAEEAKkaDEAKKKAEEAKKAEEAKKKAEEAKKADEAKKKAEEAKKADEA--KKKAEEAKKKAD 1500
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  803 EDSKTRGEKIKSDffELLSNHHLDSQSRWSKVKDKVESDPRYKAVDSSSMREDLFKQYIEKIAKNLDSEKEKELERQARI 882
Cdd:PTZ00121  1501 EAKKAAEAKKKAD--EAKKAEEAKKADEAKKAEEAKKADEAKKAEEKKKADELKKAEELKKAEEKKKAEEAKKAEEDKNM 1578
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  883 EASLREREREVQKARSEQTKEIDREREQHKREEAiqnfKALLSDMVRSSDVSWSDTRRtlRKDHRWESGSLLEREEKEKL 962
Cdd:PTZ00121  1579 ALRKAEEAKKAEEARIEEVMKLYEEEKKMKAEEA----KKAEEAKIKAEELKKAEEEK--KKVEQLKKKEAEEKKKAEEL 1652
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  963 FNEHIEALTKKKREHFRQLLDETSAitltstwKEVKKIIKEDPRCIKFSSSDRKKQREFEEyIRDKYITAKADFRTLLKE 1042
Cdd:PTZ00121  1653 KKAEEENKIKAAEEAKKAEEDKKKA-------EEAKKAEEDEKKAAEALKKEAEEAKKAEE-LKKKEAEEKKKAEELKKA 1724
                          330       340
                   ....*....|....*....|....*
gi 1841659210 1043 TKFITYRSKKLIQESDQHLKDVEKI 1067
Cdd:PTZ00121  1725 EEENKIKAEEAKKEAEEDKKKAEEA 1749
PRP40 COG5104
Splicing factor [RNA processing and modification];
136-173 1.24e-05

Splicing factor [RNA processing and modification];


Pssm-ID: 227435 [Multi-domain]  Cd Length: 590  Bit Score: 49.31  E-value: 1.24e-05
                           10        20        30
                   ....*....|....*....|....*....|....*...
gi 1841659210  136 IWVENKTPDGKVYYYNARTRESAWTKPDgvKVIQQSEL 173
Cdd:COG5104     16 EWEELKAPDGRIYYYNKRTGKSSWEKPK--ELLKGSEE 51
PRK13729 PRK13729
conjugal transfer pilus assembly protein TraB; Provisional
323-413 2.91e-05

conjugal transfer pilus assembly protein TraB; Provisional


Pssm-ID: 184281 [Multi-domain]  Cd Length: 475  Bit Score: 47.90  E-value: 2.91e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  323 APTATPVQTVPQPHPQT-LPPAVPHSVPQPTTAIPAFPP---VMVPPFRVPLPGMPIPLPGVamMQIVSCPYVKTVATTK 398
Cdd:PRK13729   122 ALGANPVTATGEPVPQMpASPPGPEGEPQPGNTPVSFPPqgsVAVPPPTAFYPGNGVTPPPQ--VTYQSVPVPNRIQRKT 199
                           90
                   ....*....|....*
gi 1841659210  399 TGVLPGMAPPIVPMI 413
Cdd:PRK13729   200 FTYNEGKKGPSLPYI 214
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
258-434 3.13e-05

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 47.95  E-value: 3.13e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  258 ASTPTTSSPAPAVSTSTSSSTPSSTTSTTTTATSVAQTVSTPTTQDQTPSSAVSVATPTVSVSTPAPTATP--VQTVPQP 335
Cdd:PRK12323   394 AAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARGPGGAPAPAPAPAAAPAAAARPaaAGPRPVA 473
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  336 HPQTLPPAVPHSVPQPTTAIPAFPPVMVPPFRVPLPGMPIPLPGVAMMQIVSCPYVKTVATTKTGVLPGMAPPIVPMihP 415
Cdd:PRK12323   474 AAAAAAPARAAPAAAPAPADDDPPPWEELPPEFASPAPAQPDAAPAGWVAESIPDPATADPDDAFETLAPAPAAAPA--P 551
                          170
                   ....*....|....*....
gi 1841659210  416 QVAIAASPATLAGATAVSE 434
Cdd:PRK12323   552 RAAAATEPVVAPRPPRASA 570
KLF3_N cd21577
N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called ...
306-390 5.86e-05

N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called Krueppel-like factor 3 and originally called Basic Kruppel-like Factor/BKLF) was the third member of the KLF family of zinc finger transcription factors to be discovered. KLF3 possesses a wide range of biological impacts on regulating apoptosis, differentiation, and proliferation in various tissues during the entire progression process. It has been proposed as a tumor suppressor in colorectal cancer. It appears to function predominantly as a repressor of transcription, turning genes off by recruiting the C-terminal Binding Protein co-repressors CtBP1 and CtBP2. CtBP docks onto a short motif (residues 61-65) in the N-terminus of KLF3, through the Proline-X-Aspartate-Leucine-Serine (PXDLS) motif. CtBP in turn recruits histone modifying enzymes to alter chromatin and repress gene expression. KLF3 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF3.


Pssm-ID: 410554 [Multi-domain]  Cd Length: 214  Bit Score: 45.41  E-value: 5.86e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  306 PSSAVSVATPTVSVSTPAPTATPvqtvPQPHPQTLPPAvPHSVPQPTTAIPaFPPVMVPPFRVPL--PGMPIPLPGVaMM 383
Cdd:cd21577     33 PSSSSSSSSSSSSSSSPSSRASP----PSPYSKSSPPS-PPQQRPLSPPLS-LPPPVAPPPLSPGsvPGGLPVISPV-MV 105

                   ....*..
gi 1841659210  384 QIVSCPY 390
Cdd:cd21577    106 QPVPVLY 112
FF smart00441
Contains two conserved F residues; A novel motif that often accompanies WW domains. Often ...
1030-1094 7.03e-05

Contains two conserved F residues; A novel motif that often accompanies WW domains. Often contains two conserved Phe (F) residues.


Pssm-ID: 128718 [Multi-domain]  Cd Length: 55  Bit Score: 41.41  E-value: 7.03e-05
                            10        20        30        40        50        60
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1841659210  1030 ITAKADFRTLLKETKFITYrskkliqesDQHLKDVEKILQNDKRYLVLDcVPEERRKLIVAYVDD 1094
Cdd:smart00441    1 EEAKEAFKELLKEHEVITP---------DTTWSEARKKLKNDPRYKALL-SESEREQLFEDHIEE 55
FF pfam01846
FF domain; This domain has been predicted to be involved in protein-protein interaction. This ...
1032-1091 7.63e-05

FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions.


Pssm-ID: 426471 [Multi-domain]  Cd Length: 50  Bit Score: 41.29  E-value: 7.63e-05
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210 1032 AKADFRTLLKETKfITYRSkkliqesdqHLKDVEKILQNDKRYLVLDcVPEERRKLIVAY 1091
Cdd:pfam01846    2 AREAFKELLKEHK-ITPYS---------TWSEIKKKIENDPRYKALL-DGSEREELFEDY 50
SMC_prok_A TIGR02169
chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of ...
861-1070 2.49e-03

chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. It is found in a single copy and is homodimeric in prokaryotes, but six paralogs (excluded from this family) are found in eukaryotes, where SMC proteins are heterodimeric. This family represents the SMC protein of archaea and a few bacteria (Aquifex, Synechocystis, etc); the SMC of other bacteria is described by TIGR02168. The N- and C-terminal domains of this protein are well conserved, but the central hinge region is skewed in composition and highly divergent. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


Pssm-ID: 274009 [Multi-domain]  Cd Length: 1164  Bit Score: 41.98  E-value: 2.49e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  861 IEKIAKNLDSEKEKELERQARIEASLREREREVQKARSEQT---KEIDREREQ-HKREEAIQNFKALLSD-MVRSSDVSW 935
Cdd:TIGR02169  721 IEKEIEQLEQEEEKLKERLEELEEDLSSLEQEIENVKSELKeleARIEELEEDlHKLEEALNDLEARLSHsRIPEIQAEL 800
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  936 SDtrrtLRKDHRWESGSL--LEREEKEKLFNEHIEaltKKKREHFRQLLDEtsaitLTSTWKEVKKIIKEDPRCIKFSSS 1013
Cdd:TIGR02169  801 SK----LEEEVSRIEARLreIEQKLNRLTLEKEYL---EKEIQELQEQRID-----LKEQIKSIEKEIENLNGKKEELEE 868
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210 1014 DRKKQREFEEYIRDKYITAKADFRTLLKETKFITYRSKKL---IQESDQHLKDVEKILQN 1070
Cdd:TIGR02169  869 ELEELEAALRDLESRLGDLKKERDELEAQLRELERKIEELeaqIEKKRKRLSELKAKLEA 928
half-pint TIGR01645
poly-U binding splicing factor, half-pint family; The proteins represented by this model ...
297-434 5.21e-03

poly-U binding splicing factor, half-pint family; The proteins represented by this model contain three RNA recognition motifs (rrm: pfam00076) and have been characterized as poly-pyrimidine tract binding proteins associated with RNA splicing factors. PUF60 (GP|6176532), in complex with p54 and in the presence of U2AF, facilitates association of U2 snRNP with pre-mRNA.


Pssm-ID: 130706 [Multi-domain]  Cd Length: 612  Bit Score: 40.82  E-value: 5.21e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  297 STPTTQDQTPSSAVSvATPTvSVSTPaPTATPVQTVPQPHPqTLPPAVPHSVPqpttaipafPPVMVPPFRVPLPGMPIP 376
Cdd:TIGR01645  322 AVLGPRAQSPATPSS-SLPT-DIGNK-AVVSSAKKEAEEVP-PLPQAAPAVVK---------PGPMEIPTPVPPPGLAIP 388
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1841659210  377 LpGVAMMQIVScpyvktvattKTGVLPG-MAPPIVPMIHPQVAIAASP--ATLAGATAVSE 434
Cdd:TIGR01645  389 S-LVAPPGLVA----------PTEINPSfLASPRKKMKREKLPVTFGAldDTLAWKEPSKE 438
 
FF pfam01846
FF domain; This domain has been predicted to be involved in protein-protein interaction. This ...
744-793 4.74e-12

FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions.


Pssm-ID: 426471 [Multi-domain]  Cd Length: 50  Bit Score: 61.70  E-value: 4.74e-12
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|
gi 1841659210  744 QAKEDFKKMMEEAKFNPRATFSEFAAKHAKDSRFKAIEKMKDREALFNEF 793
Cdd:pfam01846    1 KAREAFKELLKEHKITPYSTWSEIKKKIENDPRYKALLDGSEREELFEDY 50
FF smart00441
Contains two conserved F residues; A novel motif that often accompanies WW domains. Often ...
972-1027 2.81e-10

Contains two conserved F residues; A novel motif that often accompanies WW domains. Often contains two conserved Phe (F) residues.


Pssm-ID: 128718 [Multi-domain]  Cd Length: 55  Bit Score: 56.81  E-value: 2.81e-10
                            10        20        30        40        50
                    ....*....|....*....|....*....|....*....|....*....|....*.
gi 1841659210   972 KKKREHFRQLLDETSAITLTSTWKEVKKIIKEDPRCiKFSSSDRKKQREFEEYIRD 1027
Cdd:smart00441    1 EEAKEAFKELLKEHEVITPDTTWSEARKKLKNDPRY-KALLSESEREQLFEDHIEE 55
FF pfam01846
FF domain; This domain has been predicted to be involved in protein-protein interaction. This ...
679-726 3.33e-10

FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions.


Pssm-ID: 426471 [Multi-domain]  Cd Length: 50  Bit Score: 56.31  E-value: 3.33e-10
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....*....
gi 1841659210  679 RMKQFKDMLLERGVSAFSTWEKELHKIVFDPRYL-LLNPKERKQVFDQY 726
Cdd:pfam01846    2 AREAFKELLKEHKITPYSTWSEIKKKIENDPRYKaLLDGSEREELFEDY 50
FF smart00441
Contains two conserved F residues; A novel motif that often accompanies WW domains. Often ...
810-863 6.03e-10

Contains two conserved F residues; A novel motif that often accompanies WW domains. Often contains two conserved Phe (F) residues.


Pssm-ID: 128718 [Multi-domain]  Cd Length: 55  Bit Score: 55.66  E-value: 6.03e-10
                            10        20        30        40        50
                    ....*....|....*....|....*....|....*....|....*....|....*
gi 1841659210   810 EKIKSDFFELLSNHHLD-SQSRWSKVKDKVESDPRYKAVDSSSMREDLFKQYIEK 863
Cdd:smart00441    1 EEAKEAFKELLKEHEVItPDTTWSEARKKLKNDPRYKALLSESEREQLFEDHIEE 55
FF pfam01846
FF domain; This domain has been predicted to be involved in protein-protein interaction. This ...
915-966 1.19e-09

FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions.


Pssm-ID: 426471 [Multi-domain]  Cd Length: 50  Bit Score: 54.77  E-value: 1.19e-09
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|..
gi 1841659210  915 EAIQNFKALLSDMVRSSDVSWSDTRRTLRKDHRWEsgSLLEREEKEKLFNEH 966
Cdd:pfam01846    1 KAREAFKELLKEHKITPYSTWSEIKKKIENDPRYK--ALLDGSEREELFEDY 50
WW cd00201
Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; ...
433-460 2.71e-08

Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs; 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs.


Pssm-ID: 238122 [Multi-domain]  Cd Length: 31  Bit Score: 50.60  E-value: 2.71e-08
                           10        20
                   ....*....|....*....|....*...
gi 1841659210  433 SEWTEYKTADGKTYYYNNRTLESTWEKP 460
Cdd:cd00201      2 PGWEERWDPDGRVYYYNHNTKETQWEDP 29
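
The WW domain description above names five ligand classes. As a rough illustration only, the sketch below scans a peptide string for two of them, PPXY and PPLP, whose spelling maps directly onto a pattern. The regular expressions, the function name `find_ww_ligand_motifs`, and the example peptide are assumptions made for this sketch; they are not CDD or Pfam definitions, and the other three classes (PGM, PSP/PTP, PR) are left out because the description does not give them an exact pattern.

```python
import re

# Illustrative patterns for two of the five ligand classes named above.
# These regexes are assumptions for this sketch, not CDD or Pfam definitions.
WW_LIGAND_PATTERNS = {
    "PPXY": re.compile(r"PP.Y"),   # Pro-Pro-any-Tyr
    "PPLP": re.compile(r"PPLP"),   # Pro-Pro-Leu-Pro
}


def find_ww_ligand_motifs(sequence):
    """Return (class, 1-based start, matched text) for each motif occurrence."""
    hits = []
    for name, pattern in WW_LIGAND_PATTERNS.items():
        # Wrapping the pattern in a lookahead reports overlapping occurrences too.
        for match in re.finditer(f"(?=({pattern.pattern}))", sequence):
            hits.append((name, match.start() + 1, match.group(1)))
    # Order hits by position along the peptide.
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Hypothetical peptide, not taken from the protein on this page.
    for hit in find_ww_ligand_motifs("AAPPAYGGPPLPK"):
        print(hit)
```

A pattern match only flags a candidate ligand site; whether a given WW domain actually binds it depends on the variable loops and neighboring domains mentioned in the description.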
WW pfam00397
WW domain; The WW domain is a protein module with two highly conserved tryptophans that binds ...
433-460 4.41e-08

WW domain; The WW domain is a protein module with two highly conserved tryptophans that binds proline-rich peptide motifs in vitro.


Pssm-ID: 459800 [Multi-domain]  Cd Length: 30  Bit Score: 49.81  E-value: 4.41e-08
                           10        20
                   ....*....|....*....|....*...
gi 1841659210  433 SEWTEYKTADGKTYYYNNRTLESTWEKP 460
Cdd:pfam00397    3 PGWEERWDPDGRVYYYNHETGETQWEKP 30
FF smart00441
Contains two conserved F residues; A novel motif that often accompanies WW domains. Often ...
743-796 5.50e-08

Contains two conserved F residues; A novel motif that often accompanies WW domains. Often contains two conserved Phe (F) residues.


Pssm-ID: 128718 [Multi-domain]  Cd Length: 55  Bit Score: 50.26  E-value: 5.50e-08
                            10        20        30        40        50
                    ....*....|....*....|....*....|....*....|....*....|....*
gi 1841659210   743 MQAKEDFKKMMEEAKFN-PRATFSEFAAKHAKDSRFKAIEKMKDREALFNEFVAA 796
Cdd:smart00441    1 EEAKEAFKELLKEHEVItPDTTWSEARKKLKNDPRYKALLSESEREQLFEDHIEE 55
WW pfam00397
WW domain; The WW domain is a protein module with two highly conserved tryptophans that binds ...
137-162 7.56e-08

WW domain; The WW domain is a protein module with two highly conserved tryptophans that binds proline-rich peptide motifs in vitro.


Pssm-ID: 459800 [Multi-domain]  Cd Length: 30  Bit Score: 49.04  E-value: 7.56e-08
                           10        20
                   ....*....|....*....|....*.
gi 1841659210  137 WVENKTPDGKVYYYNARTRESAWTKP 162
Cdd:pfam00397    5 WEERWDPDGRVYYYNHETGETQWEKP 30
WW smart00456
Domain with 2 conserved Trp (W) residues; Also known as the WWP or rsp5 domain. Binds ...
132-164 7.89e-08

Domain with 2 conserved Trp (W) residues; Also known as the WWP or rsp5 domain. Binds proline-rich polypeptides.


Pssm-ID: 197736 [Multi-domain]  Cd Length: 33  Bit Score: 49.14  E-value: 7.89e-08
                            10        20        30
                    ....*....|....*....|....*....|...
gi 1841659210   132 PTEEIWVENKTPDGKVYYYNARTRESAWTKPDG 164
Cdd:smart00456    1 PLPPGWEERKDPDGRPYYYNHETKETQWEKPRE 33
FF pfam01846
FF domain; This domain has been predicted to be involved in protein-protein interaction. This ...
973-1024 1.22e-07

FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions.


Pssm-ID: 426471 [Multi-domain]  Cd Length: 50  Bit Score: 48.99  E-value: 1.22e-07
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|..
gi 1841659210  973 KKREHFRQLLDETSaITLTSTWKEVKKIIKEDPRCIKFSSSDRKKQrEFEEY 1024
Cdd:pfam01846    1 KAREAFKELLKEHK-ITPYSTWSEIKKKIENDPRYKALLDGSEREE-LFEDY 50
WW smart00456
Domain with 2 conserved Trp (W) residues; Also known as the WWP or rsp5 domain. Binds ...
435-461 1.61e-07

Domain with 2 conserved Trp (W) residues; Also known as the WWP or rsp5 domain. Binds proline-rich polypeptides.


Pssm-ID: 197736 [Multi-domain]  Cd Length: 33  Bit Score: 48.37  E-value: 1.61e-07
                            10        20
                    ....*....|....*....|....*..
gi 1841659210   435 WTEYKTADGKTYYYNNRTLESTWEKPQ 461
Cdd:smart00456    6 WEERKDPDGRPYYYNHETKETQWEKPR 32
WW cd00201
Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; ...
137-164 1.99e-07

Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs; 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs.


Pssm-ID: 238122 [Multi-domain]  Cd Length: 31  Bit Score: 47.91  E-value: 1.99e-07
                           10        20
                   ....*....|....*....|....*...
gi 1841659210  137 WVENKTPDGKVYYYNARTRESAWTKPDG 164
Cdd:cd00201      4 WEERWDPDGRVYYYNHNTKETQWEDPRE 31
FF smart00441
Contains two conserved F residues; A novel motif that often accompanies WW domains. Often ...
914-969 9.91e-07

Contains two conserved F residues; A novel motif that often accompanies WW domains. Often contains two conserved Phe (F) residues.


Pssm-ID: 128718 [Multi-domain]  Cd Length: 55  Bit Score: 46.80  E-value: 9.91e-07
                            10        20        30        40        50
                    ....*....|....*....|....*....|....*....|....*....|....*..
gi 1841659210   914 EEAIQNFKALLSDMVRS-SDVSWSDTRRTLRKDHRWESgsLLEREEKEKLFNEHIEA 969
Cdd:smart00441    1 EEAKEAFKELLKEHEVItPDTTWSEARKKLKNDPRYKA--LLSESEREQLFEDHIEE 55
FF smart00441
Contains two conserved F residues; A novel motif that often accompanies WW domains. Often ...
677-728 2.70e-06

Contains two conserved F residues; A novel motif that often accompanies WW domains. Often contains two conserved Phe (F) residues.


Pssm-ID: 128718 [Multi-domain]  Cd Length: 55  Bit Score: 45.64  E-value: 2.70e-06
                            10        20        30        40        50
                    ....*....|....*....|....*....|....*....|....*....|....
gi 1841659210   677 EARMKQFKDMLLERGVS-AFSTWEKELHKIVFDPRY-LLLNPKERKQVFDQYVK 728
Cdd:smart00441    1 EEAKEAFKELLKEHEVItPDTTWSEARKKLKNDPRYkALLSESEREQLFEDHIE 54
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
260-365 3.98e-06

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 51.07  E-value: 3.98e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  260 TPTTSSPAPAVSTStssstpssttsttttatsvAQTVSTPTTQDQTPSSAVSVATP-----TVSVSTPAPTAT------- 327
Cdd:pfam05109  517 TPNATSPTPAVTTP-------------------TPNATSPTLGKTSPTSAVTTPTPnatspTPAVTTPTPNATiptlgkt 577
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|....*
gi 1841659210  328 -PVQTVPQPHPQTLPPAVPHSVPQPTT------AIPAFPPVMVPP 365
Cdd:pfam05109  578 sPTSAVTTPTPNATSPTVGETSPQANTtnhtlgGTSSTPVVTSPP 622
PTZ00121 PTZ00121
MAEBL; Provisional
727-1067 5.23e-06

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 50.91  E-value: 5.23e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  727 VKTRAEEERR--EKKNKIMQAK--EDFKKMMEEAKFNPRATFSEFAAKHAKDSRFKAIEKMKDREAlfNEFVAAARKKEK 802
Cdd:PTZ00121  1423 AKKKAEEKKKadEAKKKAEEAKkaDEAKKKAEEAKKAEEAKKKAEEAKKADEAKKKAEEAKKADEA--KKKAEEAKKKAD 1500
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  803 EDSKTRGEKIKSDffELLSNHHLDSQSRWSKVKDKVESDPRYKAVDSSSMREDLFKQYIEKIAKNLDSEKEKELERQARI 882
Cdd:PTZ00121  1501 EAKKAAEAKKKAD--EAKKAEEAKKADEAKKAEEAKKADEAKKAEEKKKADELKKAEELKKAEEKKKAEEAKKAEEDKNM 1578
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  883 EASLREREREVQKARSEQTKEIDREREQHKREEAiqnfKALLSDMVRSSDVSWSDTRRtlRKDHRWESGSLLEREEKEKL 962
Cdd:PTZ00121  1579 ALRKAEEAKKAEEARIEEVMKLYEEEKKMKAEEA----KKAEEAKIKAEELKKAEEEK--KKVEQLKKKEAEEKKKAEEL 1652
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  963 FNEHIEALTKKKREHFRQLLDETSAitltstwKEVKKIIKEDPRCIKFSSSDRKKQREFEEyIRDKYITAKADFRTLLKE 1042
Cdd:PTZ00121  1653 KKAEEENKIKAAEEAKKAEEDKKKA-------EEAKKAEEDEKKAAEALKKEAEEAKKAEE-LKKKEAEEKKKAEELKKA 1724
                          330       340
                   ....*....|....*....|....*
gi 1841659210 1043 TKFITYRSKKLIQESDQHLKDVEKI 1067
Cdd:PTZ00121  1725 EEENKIKAEEAKKEAEEDKKKAEEA 1749
DUF5401 pfam17380
Family of unknown function (DUF5401); This is a family of unknown function found in ...
718-968 8.31e-06

Family of unknown function (DUF5401); This is a family of unknown function found in Chromadorea.


Pssm-ID: 375164 [Multi-domain]  Cd Length: 722  Bit Score: 50.12  E-value: 8.31e-06
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  718 ERKQVfDQYVKTRAEEERREKKNKIMQAKEdfKKMMEEAKFNPRATFSEFAAKHAKDSRFkAIEKMKDREALFNEfvaaa 797
Cdd:pfam17380  286 ERQQQ-EKFEKMEQERLRQEKEEKAREVER--RRKLEEAEKARQAEMDRQAAIYAEQERM-AMERERELERIRQE----- 356
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  798 rKKEKEDSKTRGEKIKSDFFEL--LSNHHLDSQSRWSKVKDKVESDPRYKAVDSSSMRE-DLFKQYIEKIaknldsEKEK 874
Cdd:pfam17380  357 -ERKRELERIRQEEIAMEISRMreLERLQMERQQKNERVRQELEAARKVKILEEERQRKiQQQKVEMEQI------RAEQ 429
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  875 ELERQARIEASLREREREVQKARSEQ---TKEIDREREQhkrEEAIQNFKALLSDMVRSSDVSWSDTRRTLRKDHRWESG 951
Cdd:pfam17380  430 EEARQREVRRLEEERAREMERVRLEEqerQQQVERLRQQ---EEERKRKKLELEKEKRDRKRAEEQRRKILEKELEERKQ 506
                          250
                   ....*....|....*..
gi 1841659210  952 SLLEREEKEKLFNEHIE 968
Cdd:pfam17380  507 AMIEEERKRKLLEKEME 523
Herpes_BLLF1 pfam05109
Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 ...
298-431 1.10e-05

Herpes virus major outer envelope glycoprotein (BLLF1); This family consists of the BLLF1 viral late glycoprotein, also termed gp350/220. It is the most abundantly expressed glycoprotein in the viral envelope of the Herpesviruses and is the major antigen responsible for stimulating the production of neutralising antibodies in vivo.


Pssm-ID: 282904 [Multi-domain]  Cd Length: 886  Bit Score: 49.53  E-value: 1.10e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  298 TPTTQDQ-TPSSAVSVATPTVSVSTPAPTAT-PVQTVPQPHPQTLPPAVPHSVPQPTTAIPafppvmVPPFRVPLPGMPI 375
Cdd:pfam05109  491 SPSPRDNgTESKAPDMTSPTSAVTTPTPNATsPTPAVTTPTPNATSPTLGKTSPTSAVTTP------TPNATSPTPAVTT 564
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*.
gi 1841659210  376 PLPGvAMMQIVSCPYVKTVATTKTgvlPGMAPPIVPMIHPQVaiAASPATLAGATA 431
Cdd:pfam05109  565 PTPN-ATIPTLGKTSPTSAVTTPT---PNATSPTVGETSPQA--NTTNHTLGGTSS 614
PRP40 COG5104
Splicing factor [RNA processing and modification];
136-173 1.24e-05

Splicing factor [RNA processing and modification];


Pssm-ID: 227435 [Multi-domain]  Cd Length: 590  Bit Score: 49.31  E-value: 1.24e-05
                           10        20        30
                   ....*....|....*....|....*....|....*...
gi 1841659210  136 IWVENKTPDGKVYYYNARTRESAWTKPDgvKVIQQSEL 173
Cdd:COG5104     16 EWEELKAPDGRIYYYNKRTGKSSWEKPK--ELLKGSEE 51
WW smart00456
Domain with 2 conserved Trp (W) residues; Also known as the WWP or rsp5 domain. Binds ...
533-561 1.34e-05

Domain with 2 conserved Trp (W) residues; Also known as the WWP or rsp5 domain. Binds proline-rich polypeptides.


Pssm-ID: 197736 [Multi-domain]  Cd Length: 33  Bit Score: 42.97  E-value: 1.34e-05
                            10        20
                    ....*....|....*....|....*....
gi 1841659210   533 PWCVVWTGDERVFFYNPTTRLSMWDRPDD 561
Cdd:smart00456    5 GWEERKDPDGRPYYYNHETKETQWEKPRE 33
WW cd00201
Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; ...
532-561 2.29e-05

Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs; 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs.


Pssm-ID: 238122 [Multi-domain]  Cd Length: 31  Bit Score: 42.13  E-value: 2.29e-05
                           10        20        30
                   ....*....|....*....|....*....|
gi 1841659210  532 TPWCVVWTGDERVFFYNPTTRLSMWDRPDD 561
Cdd:cd00201      2 PGWEERWDPDGRVYYYNHNTKETQWEDPRE 31
PRK13729 PRK13729
conjugal transfer pilus assembly protein TraB; Provisional
323-413 2.91e-05

conjugal transfer pilus assembly protein TraB; Provisional


Pssm-ID: 184281 [Multi-domain]  Cd Length: 475  Bit Score: 47.90  E-value: 2.91e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  323 APTATPVQTVPQPHPQT-LPPAVPHSVPQPTTAIPAFPP---VMVPPFRVPLPGMPIPLPGVamMQIVSCPYVKTVATTK 398
Cdd:PRK13729   122 ALGANPVTATGEPVPQMpASPPGPEGEPQPGNTPVSFPPqgsVAVPPPTAFYPGNGVTPPPQ--VTYQSVPVPNRIQRKT 199
                           90
                   ....*....|....*
gi 1841659210  399 TGVLPGMAPPIVPMI 413
Cdd:PRK13729   200 FTYNEGKKGPSLPYI 214
PRK12323 PRK12323
DNA polymerase III subunit gamma/tau;
258-434 3.13e-05

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237057 [Multi-domain]  Cd Length: 700  Bit Score: 47.95  E-value: 3.13e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  258 ASTPTTSSPAPAVSTSTSSSTPSSTTSTTTTATSVAQTVSTPTTQDQTPSSAVSVATPTVSVSTPAPTATP--VQTVPQP 335
Cdd:PRK12323   394 AAAPAPAAPPAAPAAAPAAAAAARAVAAAPARRSPAPEALAAARQASARGPGGAPAPAPAPAAAPAAAARPaaAGPRPVA 473
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  336 HPQTLPPAVPHSVPQPTTAIPAFPPVMVPPFRVPLPGMPIPLPGVAMMQIVSCPYVKTVATTKTGVLPGMAPPIVPMihP 415
Cdd:PRK12323   474 AAAAAAPARAAPAAAPAPADDDPPPWEELPPEFASPAPAQPDAAPAGWVAESIPDPATADPDDAFETLAPAPAAAPA--P 551
                          170
                   ....*....|....*....
gi 1841659210  416 QVAIAASPATLAGATAVSE 434
Cdd:PRK12323   552 RAAAATEPVVAPRPPRASA 570
DUF5585 pfam17823
Family of unknown function (DUF5585); This is a family of unknown function found in Chordata.
258-425 3.55e-05

Family of unknown function (DUF5585); This is a family of unknown function found in Chordata.


Pssm-ID: 465521 [Multi-domain]  Cd Length: 506  Bit Score: 47.65  E-value: 3.55e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  258 ASTPTTSSPAPAVSTSTSSSTPSSTTSTTTTATSVAQTVSTPTTQDQTPSSAVSVATPTVSV-------STPAP------ 324
Cdd:pfam17823  165 ASAPHAASPAPRTAASSTTAASSTTAASSAPTTAASSAPATLTPARGISTAATATGHPAAGTalaavgnSSPAAgtvtaa 244
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  325 -----------TATPVQTVP---------QPHPQTLPPAvpHSVPQPTTAIPAFPP-----------VMVPPFRVPLPGM 373
Cdd:pfam17823  245 vgtvtpaalatLAAAAGTVAsaagtinmgDPHARRLSPA--KHMPSDTMARNPAAPmgaqaqgpiiqVSTDQPVHNTAGE 322
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1841659210  374 PIPLPGVAMMQIVSCPYVKT-----VATTKTGVLPGMAPPiVPMIH----PQVAiAASPAT 425
Cdd:pfam17823  323 PTPSPSNTTLEPNTPKSVAStnlavVTTTKAQAKEPSASP-VPVLHtsmiPEVE-ATSPTT 381
PTZ00121 PTZ00121
MAEBL; Provisional
730-999 3.61e-05

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 48.21  E-value: 3.61e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  730 RAEEERR--EKKNKIMQAK--EDFKKMMEEAKFNPRATFSEFAAKHAKDSRFKAIEKMKDREALFNEFVAAARKKEKEDS 805
Cdd:PTZ00121  1297 KAEEKKKadEAKKKAEEAKkaDEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEA 1376
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  806 KTRGEKIKSDFFELLSNHHLDSQSRwskvKDKVESDPRYKAVDSSSMREDLfKQYIEKIAKNLDSEKEKELERQARieaS 885
Cdd:PTZ00121  1377 KKKADAAKKKAEEKKKADEAKKKAE----EDKKKADELKKAAAAKKKADEA-KKKAEEKKKADEAKKKAEEAKKAD---E 1448
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  886 LREREREVQKARSEQTKEIDREREQHKREEAIQNFKAllSDMVRSSDVSWSDTRRTLRKDHRWESGSLLEREEKEKLFNE 965
Cdd:PTZ00121  1449 AKKKAEEAKKAEEAKKKAEEAKKADEAKKKAEEAKKA--DEAKKKAEEAKKKADEAKKAAEAKKKADEAKKAEEAKKADE 1526
                          250       260       270
                   ....*....|....*....|....*....|....
gi 1841659210  966 HIEALTKKKREHFRQLLDETSAITLTSTwKEVKK 999
Cdd:PTZ00121  1527 AKKAEEAKKADEAKKAEEKKKADELKKA-EELKK 1559
PRP40 COG5104
Splicing factor [RNA processing and modification];
137-172 5.38e-05

Splicing factor [RNA processing and modification];


Pssm-ID: 227435 [Multi-domain]  Cd Length: 590  Bit Score: 47.38  E-value: 5.38e-05
                           10        20        30
                   ....*....|....*....|....*....|....*.
gi 1841659210  137 WVENKTPDGKVYYYNARTRESAWTKPDGVKVIQQSE 172
Cdd:COG5104     58 WKECRTADGKVYYYNSITRESRWKIPPERKKVEPIA 93
PRK03918 PRK03918
DNA double-strand break repair ATPase Rad50;
733-1087 5.51e-05

DNA double-strand break repair ATPase Rad50;


Pssm-ID: 235175 [Multi-domain]  Cd Length: 880  Bit Score: 47.37  E-value: 5.51e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  733 EERREKKNKIMQAKEDFKKMMEEAKFNPRATFSEFAAKHAKDSRF-----KAIEKMKDREALFNEFVAAARKKEKEDSKT 807
Cdd:PRK03918   175 KRRIERLEKFIKRTENIEELIKEKEKELEEVLREINEISSELPELreeleKLEKEVKELEELKEEIEELEKELESLEGSK 254
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  808 RGEKIKsdffelLSNhhldSQSRWSKVKDKVEsDPRYKAVDSSSMREDLfKQYIEkiaknLDSEKEKELERQARIE---A 884
Cdd:PRK03918   255 RKLEEK------IRE----LEERIEELKKEIE-ELEEKVKELKELKEKA-EEYIK-----LSEFYEEYLDELREIEkrlS 317
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  885 SLREREREVQKARSEQTKEIDREREQHKREEAIQNFKALLSDMVRSSDvswsDTRRTLRKDHRWESGslLEREEKEKLFN 964
Cdd:PRK03918   318 RLEEEINGIEERIKELEEKEERLEELKKKLKELEKRLEELEERHELYE----EAKAKKEELERLKKR--LTGLTPEKLEK 391
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  965 EhIEALTKKKREHFRQLLdetsaiTLTSTWKEVKKIIKEDPRCIKFSSSDRKK----QREFEEYIRDKYITA-KADFRTL 1039
Cdd:PRK03918   392 E-LEELEKAKEEIEEEIS------KITARIGELKKEIKELKKAIEELKKAKGKcpvcGRELTEEHRKELLEEyTAELKRI 464
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|....*...
gi 1841659210 1040 LKETKFITYRSKKLIQEsdqhLKDVEKILQNDKRYLVLDCVPEERRKL 1087
Cdd:PRK03918   465 EKELKEIEEKERKLRKE----LRELEKVLKKESELIKLKELAEQLKEL 508
KLF3_N cd21577
N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called ...
306-390 5.86e-05

N-terminal domain of Kruppel-like factor 3; Kruppel-like factor 3 (KLF3; also called Krueppel-like factor 3 and originally called Basic Kruppel-like Factor/BKLF), was the third member of the KLF family of zinc finger transcription factors to be discovered. KLF3 possesses a wide range of biological impacts on regulating apoptosis, differentiation, and proliferation in various tissues during the entire progression process. It has been proposed as a tumor suppressor in colorectal cancer. It appears to function predominantly as a repressor of transcription, turning genes off by recruiting the C-terminal Binding Protein co-repressors CtBP1 and CtBP2. CtBP docks onto a short motif (residues 61-65) in the N-terminus of KLF3, through the Proline-X-Aspartate-Leucine-Serine (PXDLS) motif. CtBP in turn recruits histone modifying enzymes to alter chromatin and repress gene expression. KLF3 belongs to a family of proteins, called the Specificity Protein (SP)/KLF family, characterized by a C-terminal DNA-binding domain of 81 amino acids consisting of three Kruppel-like C2H2 zinc fingers. These factors bind to a loose consensus motif, namely NNRCRCCYY (where N is any nucleotide; R is A/G, and Y is C/T), such as the recurring motifs in GC and GT boxes (5'-GGGGCGGGG-3' and 5'-GGTGTGGGG-3') that are present in promoters and more distal regulatory elements of mammalian genes. Members of the KLF family can act as activators or repressors of transcription depending on cell and promoter context. KLFs regulate various cellular functions, such as proliferation, differentiation, and apoptosis, as well as the development and homeostasis of several types of tissue. In addition to the C-terminal DNA-binding domain, each KLF also has a unique N-terminal activation/repression domain that confers specificity and allows it to bind specifically to a certain partner, leading to distinct activities in vivo. This model represents the N-terminal domain of KLF3.


Pssm-ID: 410554 [Multi-domain]  Cd Length: 214  Bit Score: 45.41  E-value: 5.86e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  306 PSSAVSVATPTVSVSTPAPTATPvqtvPQPHPQTLPPAvPHSVPQPTTAIPaFPPVMVPPFRVPL--PGMPIPLPGVaMM 383
Cdd:cd21577     33 PSSSSSSSSSSSSSSSPSSRASP----PSPYSKSSPPS-PPQQRPLSPPLS-LPPPVAPPPLSPGsvPGGLPVISPV-MV 105

                   ....*..
gi 1841659210  384 QIVSCPY 390
Cdd:cd21577    106 QPVPVLY 112
FF smart00441
Contains two conserved F residues; A novel motif that often accompanies WW domains. Often ...
1030-1094 7.03e-05

Contains two conserved F residues; A novel motif that often accompanies WW domains. Often contains two conserved Phe (F) residues.


Pssm-ID: 128718 [Multi-domain]  Cd Length: 55  Bit Score: 41.41  E-value: 7.03e-05
                            10        20        30        40        50        60
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1841659210  1030 ITAKADFRTLLKETKFITYrskkliqesDQHLKDVEKILQNDKRYLVLDcVPEERRKLIVAYVDD 1094
Cdd:smart00441    1 EEAKEAFKELLKEHEVITP---------DTTWSEARKKLKNDPRYKALL-SESEREQLFEDHIEE 55
FF pfam01846
FF domain; This domain has been predicted to be involved in protein-protein interaction. This ...
1032-1091 7.63e-05

FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions.


Pssm-ID: 426471 [Multi-domain]  Cd Length: 50  Bit Score: 41.29  E-value: 7.63e-05
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210 1032 AKADFRTLLKETKfITYRSkkliqesdqHLKDVEKILQNDKRYLVLDcVPEERRKLIVAY 1091
Cdd:pfam01846    2 AREAFKELLKEHK-ITPYS---------TWSEIKKKIENDPRYKALL-DGSEREELFEDY 50
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
306-424 7.97e-05

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 46.63  E-value: 7.97e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  306 PSSAVSVATPTVSVSTPAPTATPVqtVPQPHPQTLPPAVPHSVPQPTTAIPAFPPVMVPPFRVPLPGM--PIPLPGVAMM 383
Cdd:PRK14951   366 PAAAAEAAAPAEKKTPARPEAAAP--AAAPVAQAAAAPAPAAAPAAAASAPAAPPAAAPPAPVAAPAAaaPAAAPAAAPA 443
                           90       100       110       120
                   ....*....|....*....|....*....|....*....|.
gi 1841659210  384 QIVSCPYVKTVATTKTGVLPGMAPPIVPMIHPQVAIAASPA 424
Cdd:PRK14951   444 AVALAPAPPAQAAPETVAIPVRVAPEPAVASAAPAPAAAPA 484
PRK14951 PRK14951
DNA polymerase III subunits gamma and tau; Provisional
295-381 1.12e-04

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237865 [Multi-domain]  Cd Length: 618  Bit Score: 46.25  E-value: 1.12e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  295 TVSTPTTQDQTPSSAVSVATPTVSVSTPAPTATPVQTVP---QPHPQTLPPAVPHSVPQPtTAIPAFPPVMVPPFRVPLP 371
Cdd:PRK14951   385 EAAAPAAAPVAQAAAAPAPAAAPAAAASAPAAPPAAAPPapvAAPAAAAPAAAPAAAPAA-VALAPAPPAQAAPETVAIP 463
                           90
                   ....*....|
gi 1841659210  372 GMPIPLPGVA 381
Cdd:PRK14951   464 VRVAPEPAVA 473
SMC_N pfam02463
RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The ...
715-1107 1.14e-04

RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The SMC (structural maintenance of chromosomes) superfamily proteins have ATP-binding domains at the N- and C-termini, and two extended coiled-coil domains separated by a hinge in the middle. The eukaryotic SMC proteins form two kinds of heterodimers: the SMC1/SMC3 and the SMC2/SMC4 types. These heterodimers constitute an essential part of higher order complexes, which are involved in chromatin and DNA dynamics. This family also includes the RecF and RecN proteins that are involved in DNA metabolism and recombination.


Pssm-ID: 426784 [Multi-domain]  Cd Length: 1161  Bit Score: 46.50  E-value: 1.14e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  715 NPKERKQVFDQYVKTRAEEERREKKNKIMQAKEDfkkmmeeakfnpratfsefaakhakdsrfKAIEKMKDREALFNEFV 794
Cdd:pfam02463  151 KPERRLEIEEEAAGSRLKRKKKEALKKLIEETEN-----------------------------LAELIIDLEELKLQELK 201
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  795 AAARKKEKEDSKTRGEKIKSDFFELLSNHHLDSQSRWSKVKDKVESDPRYKavDSSSMREDLFKQYIEKIAKNLDSEKEK 874
Cdd:pfam02463  202 LKEQAKKALEYYQLKEKLELEEEYLLYLDYLKLNEERIDLLQELLRDEQEE--IESSKQEIEKEEEKLAQVLKENKEEEK 279
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  875 ELERQARIEASLREREREVQKAR--SEQTKEIDREREQHKREEAIQNFKALLSDMVRSSDvswsdtRRTLRKDHRWESGS 952
Cdd:pfam02463  280 EKKLQEEELKLLAKEEEELKSELlkLERRKVDDEEKLKESEKEKKKAEKELKKEKEEIEE------LEKELKELEIKREA 353
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  953 LLEREE----KEKLFNEHIEALTKKKREHFRQLLDETSAITLTSTWKEVKKIIkedprcikfsSSDRKKQREFEEYIRDK 1028
Cdd:pfam02463  354 EEEEEEelekLQEKLEQLEEELLAKKKLESERLSSAAKLKEEELELKSEEEKE----------AQLLLELARQLEDLLKE 423
                          330       340       350       360       370       380       390
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1841659210 1029 YITAKADFrtLLKETKFITYRSKKLIQESDqHLKDVEKILQNDKRYLVLDCVPEERRKLIVAYVDDLDRRGPPPPPTAS 1107
Cdd:pfam02463  424 EKKEELEI--LEEEEESIELKQGKLTEEKE-ELEKQELKLLKDELELKKSEDLLKETQLVKLQEQLELLLSRQKLEERS 499
PTZ00121 PTZ00121
MAEBL; Provisional
714-1086 1.22e-04

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 46.67  E-value: 1.22e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  714 LNPKERKQVFDQYVKTRAEEERREKKNKIMQAKEDFKKMMEEAKfnpratFSEFAAKHAKDSRfKAIEKMKDREALFNEf 793
Cdd:PTZ00121  1072 LKPSYKDFDFDAKEDNRADEATEEAFGKAEEAKKTETGKAEEAR------KAEEAKKKAEDAR-KAEEARKAEDARKAE- 1143
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  794 vaAARKKEkEDSKTRGEKIKSDFFELLSNHHLDSQSRWSKVKDKVE---SDPRYKAVDSSSMREDLFKQYIEKIAKNLDS 870
Cdd:PTZ00121  1144 --EARKAE-DAKRVEIARKAEDARKAEEARKAEDAKKAEAARKAEEvrkAEELRKAEDARKAEAARKAEEERKAEEARKA 1220
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  871 EKEKELERQARIEaSLREREREVQKARSEQTKEIDREREQHKREEAIQNFKALLSDMVRSSDvswsdtrrTLRK-DHRWE 949
Cdd:PTZ00121  1221 EDAKKAEAVKKAE-EAKKDAEEAKKAEEERNNEEIRKFEEARMAHFARRQAAIKAEEARKAD--------ELKKaEEKKK 1291
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  950 SGSLLEREEKEKLFNEHIEALTKKKREHFRQLLDET--SAITLTSTWKEVKKIIKEDPRCIKFSSSDRKKQREFEEYIRD 1027
Cdd:PTZ00121  1292 ADEAKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAkkKADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEK 1371
                          330       340       350       360       370       380
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1841659210 1028 KYITAKADFRTLLK--ETKFITYRSKKLIQESDQHLKDVEKILQNDKRYLVLDCVPEERRK 1086
Cdd:PTZ00121  1372 KKEEAKKKADAAKKkaEEKKKADEAKKKAEEDKKKADELKKAAAAKKKADEAKKKAEEKKK 1432
WW pfam00397
WW domain; The WW domain is a protein module with two highly conserved tryptophans that binds ...
532-559 2.48e-04

WW domain; The WW domain is a protein module with two highly conserved tryptophans that binds proline-rich peptide motifs in vitro.


Pssm-ID: 459800 [Multi-domain]  Cd Length: 30  Bit Score: 39.41  E-value: 2.48e-04
                           10        20
                   ....*....|....*....|....*...
gi 1841659210  532 TPWCVVWTGDERVFFYNPTTRLSMWDRP 559
Cdd:pfam00397    3 PGWEERWDPDGRVYYYNHETGETQWEKP 30
PHA02682 PHA02682
ORF080 virion core protein; Provisional
299-417 3.40e-04

ORF080 virion core protein; Provisional


Pssm-ID: 177464 [Multi-domain]  Cd Length: 280  Bit Score: 44.08  E-value: 3.40e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  299 PTTQDQ-TPSSAVsvATPTVSVSTPAPTA-TPVQTVPQPHPQTLPPAVPHSVPQPTTAIPAFPPVMVPPFRV---PLPGM 373
Cdd:PHA02682    76 PSGQSPlAPSPAC--AAPAPACPACAPAApAPAVTCPAPAPACPPATAPTCPPPAVCPAPARPAPACPPSTRqcpPAPPL 153
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 1841659210  374 PIPLPGVAMMQIV-------------SCPYVKTvattktgvlpgmAPPIVPMIHPQV 417
Cdd:PHA02682   154 PTPKPAPAAKPIFlhnqlpppdypaaSCPTIET------------APAASPVLEPRI 198
motB PRK12799
flagellar motor protein MotB; Reviewed
258-365 3.85e-04

flagellar motor protein MotB; Reviewed


Pssm-ID: 183756 [Multi-domain]  Cd Length: 421  Bit Score: 44.32  E-value: 3.85e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  258 ASTPTTSSPAPAVSTSTSSSTPSSTTSTTTTATSVAQTVSTPTTQdqTPSSAVSVATPTVSVSTPAPTATPVQTVPQPH- 336
Cdd:PRK12799   303 AVTPSSAVTQSSAITPSSAAIPSPAVIPSSVTTQSATTTQASAVA--LSSAGVLPSDVTLPGTVALPAAEPVNMQPQPMs 380
                           90       100       110
                   ....*....|....*....|....*....|...
gi 1841659210  337 -PQTLPPAVPHSVP---QPTTAIPAFPPVMVPP 365
Cdd:PRK12799   381 tTETQQSSTGNITStanGPTTSLPAAPASNIPV 413
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
258-383 3.86e-04

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA (OMIM:125370) is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1 that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 44.76  E-value: 3.86e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  258 ASTPTTSSPAPAVSTSTSSSTPSSTTSTTTTATSVAQTVSTPTTQDQTPSSAVSVATPTVSVST---PAPtATPVQTVPQ 334
Cdd:pfam03154  176 AQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQSTAAPHTLIQQTPTLHPqrlPSP-HPPLQPMTQ 254
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|...
gi 1841659210  335 PHPqtlPPAV-PHSVPQPT--TAIPAFP-PVMVPPFRVPLPGMPIPLPGVAMM 383
Cdd:pfam03154  255 PPP---PSQVsPQPLPQPSlhGQMPPMPhSLQTGPSHMQHPVPPQPFPLTPQS 304
HEC1 COG5185
Chromosome segregation protein NDC80, interacts with SMC proteins [Cell cycle control, cell ...
730-1012 5.06e-04

Chromosome segregation protein NDC80, interacts with SMC proteins [Cell cycle control, cell division, chromosome partitioning];


Pssm-ID: 444066 [Multi-domain]  Cd Length: 594  Bit Score: 44.18  E-value: 5.06e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  730 RAEEERREKKNKIMQAKEDFKKMMEEAKFNPRATFSEFAAKHAKDSrfKAIEKMKDREALFNEFVAAARKKEKEDSKTRG 809
Cdd:COG5185    257 KLVEQNTDLRLEKLGENAESSKRLNENANNLIKQFENTKEKIAEYT--KSIDIKKATESLEEQLAAAEAEQELEESKRET 334
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  810 EKIKSDFFELLSNHHLDSQSRWSKVKDKVESDPRYKAVDSSSMREDLFKQYIEKIAKNLDSekekelerqarIEASLRER 889
Cdd:COG5185    335 ETGIQNLTAEIEQGQESLTENLEAIKEEIENIVGEVELSKSSEELDSFKDTIESTKESLDE-----------IPQNQRGY 403
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  890 EREVQKARSEQTKEIDREREQHKR---------EEAIQNFKALLSDMVRSSDVSWSDTRRTLRKDHRWESGSLLEREEKE 960
Cdd:COG5185    404 AQEILATLEDTLKAADRQIEELQRqieqatssnEEVSKLLNELISELNKVMREADEESQSRLEEAYDEINRSVRSKKEDL 483
                          250       260       270       280       290       300
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 1841659210  961 --------------KLFNEHIEALTKKKREHFRQLLDETSAITLTSTWKEVKKIIKEDPRCIKFSS 1012
Cdd:COG5185    484 neeltqiesrvstlKATLEKLRAKLERQLEGVRSKLDQVAESLKDFMRARGYAHILALENLIPASE 549
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
293-416 7.17e-04

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 43.87  E-value: 7.17e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  293 AQTVSTPTTQDQTPSSAVSVATPTVSVSTPAPtATPVQTVPQPHPQTLPPAVPHSVPQPTTAIPAFPPVMVPPF------ 366
Cdd:pfam09770  228 QQAQQQQQFPPQIQQQQQPQQQPQQPQQHPGQ-GHPVTILQRPQSPQPDPAQPSIQPQAQQFHQQPPPVPVQPTqilqnp 306
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|.
gi 1841659210  367 -RVPLPGMPIPLPGVAMMQivscPYVKTVATTKTGVLPGMAPPIVpmiHPQ 416
Cdd:pfam09770  307 nRLSAARVGYPQNPQPGVQ----PAPAHQAHRQQGSFGRQAPIIT---HPQ 350
PRK10856 PRK10856
cytoskeleton protein RodZ;
293-371 7.35e-04

cytoskeleton protein RodZ;


Pssm-ID: 236776 [Multi-domain]  Cd Length: 331  Bit Score: 43.09  E-value: 7.35e-04
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 1841659210  293 AQTVSTPTTQDQTPSSAVSVATPTVSVSTPAPTATPVQTvPQPHPQTLPPAVPHSVPQPTTAIPAFPPVMVPPFRVPLP 371
Cdd:PRK10856   161 SVPLDTSTTTDPATTPAPAAPVDTTPTNSQTPAVATAPA-PAVDPQQNAVVAPSQANVDTAATPAPAAPATPDGAAPLP 238
PRK14971 PRK14971
DNA polymerase III subunit gamma/tau;
296-382 1.05e-03

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237874 [Multi-domain]  Cd Length: 614  Bit Score: 43.23  E-value: 1.05e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  296 VSTPTTQDQTPSSAVSVATPTVSVSTPAPTATPVQTVPQPHPQT--LPPAVPHSVPQPTTAIPAFPPvmvPPFRVPLPGM 373
Cdd:PRK14971   383 FTQPAAAPQPSAAAAASPSPSQSSAAAQPSAPQSATQPAGTPPTvsVDPPAAVPVNPPSTAPQAVRP---AQFKEEKKIP 459

                   ....*....
gi 1841659210  374 PIPLPGVAM 382
Cdd:PRK14971   460 VSKVSSLGP 468
DUF3729 pfam12526
Protein of unknown function (DUF3729); This family of proteins is found in viruses. Proteins ...
306-379 1.48e-03

Protein of unknown function (DUF3729); This family of proteins is found in viruses. Proteins in this family are typically between 145 and 1707 amino acids in length. The family is found in association with pfam01443, pfam01661, pfam05417, pfam01660, pfam00978. There is a single completely conserved residue L that may be functionally important.


Pssm-ID: 372164 [Multi-domain]  Cd Length: 115  Bit Score: 39.68  E-value: 1.48e-03
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1841659210  306 PSSAVSVATPTVSVSTPAPTATPVQTVPQPHPqtlpPAVPHSVPQPTTAIPAFPPVMVPPfRVPLPGMPIPLPG 379
Cdd:pfam12526   37 PDPPPPVGDPRPPVVDTPPPVSAVWVLPPPSE----PAAPEPDLVPPVTGPAGPPSPLAP-PAPAQKPPLPPPR 105
PRK03918 PRK03918
DNA double-strand break repair ATPase Rad50;
733-1067 1.50e-03

DNA double-strand break repair ATPase Rad50;


Pssm-ID: 235175 [Multi-domain]  Cd Length: 880  Bit Score: 42.74  E-value: 1.50e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  733 EERREKKNKIMQAKEDFKKMMEE-AKFNPRA-TFSEFAAKHAKDSRFKAIEKMKDREALFNEFVAAARKKEK--EDSKTR 808
Cdd:PRK03918   331 KELEEKEERLEELKKKLKELEKRlEELEERHeLYEEAKAKKEELERLKKRLTGLTPEKLEKELEELEKAKEEieEEISKI 410
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  809 GEKIKSdfFELLSNHHLDSQSRWSKVKDKVesdPRYKAVDSSSMREDLFKQYIEKIAKnldseKEKELERQARIEASLRE 888
Cdd:PRK03918   411 TARIGE--LKKEIKELKKAIEELKKAKGKC---PVCGRELTEEHRKELLEEYTAELKR-----IEKELKEIEEKERKLRK 480
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  889 REREVQKARS------------EQTKEIDREREQHKREEAIQNFKALLSDMVRSsdvswsdtrRTLRKDHRwesgSLLER 956
Cdd:PRK03918   481 ELRELEKVLKkeseliklkelaEQLKELEEKLKKYNLEELEKKAEEYEKLKEKL---------IKLKGEIK----SLKKE 547
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  957 EEKEKLFNEHIEALTKKKREHFRQLldetsaitltstwKEVKKIIKEdprcIKFSSSD--RKKQREFEEYIRdKYIT--- 1031
Cdd:PRK03918   548 LEKLEELKKKLAELEKKLDELEEEL-------------AELLKELEE----LGFESVEelEERLKELEPFYN-EYLElkd 609
                          330       340       350
                   ....*....|....*....|....*....|....*.
gi 1841659210 1032 AKADFRTLLKETKFITYRSKKLIQESDQHLKDVEKI 1067
Cdd:PRK03918   610 AEKELEREEKELKKLEEELDKAFEELAETEKRLEEL 645
SMC_N pfam02463
RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The ...
553-929 2.31e-03

RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The SMC (structural maintenance of chromosomes) superfamily proteins have ATP-binding domains at the N- and C-termini, and two extended coiled-coil domains separated by a hinge in the middle. The eukaryotic SMC proteins form two kinds of heterodimers: the SMC1/SMC3 and the SMC2/SMC4 types. These heterodimers constitute an essential part of higher order complexes, which are involved in chromatin and DNA dynamics. This family also includes the RecF and RecN proteins that are involved in DNA metabolism and recombination.


Pssm-ID: 426784 [Multi-domain]  Cd Length: 1161  Bit Score: 42.27  E-value: 2.31e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  553 LSMWDR----PDDLIGRADVDKiiqepphkKGMEELKKLRHPTPTMLSIQ-------------KWQFSmSAIKEEQELME 615
Cdd:pfam02463  148 AMMKPErrleIEEEAAGSRLKR--------KKKEALKKLIEETENLAELIidleelklqelklKEQAK-KALEYYQLKEK 218
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  616 EINEDEpvkAKKRKRMSKKSFMWIARASLFRRDDNKDIDSEKEAAMEAEIKAARERAIVPLEARMKQFKDMLLERGVSAF 695
Cdd:pfam02463  219 LELEEE---YLLYLDYLKLNEERIDLLQELLRDEQEEIESSKQEIEKEEEKLAQVLKENKEEEKEKKLQEEELKLLAKEE 295
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  696 STWEKELHKIVfdpryLLLNPKERKQVFDQYVKTRAEEErrekKNKIMQAKEDFKKMMEE--AKFNPRATFSEFAAKHAK 773
Cdd:pfam02463  296 EELKSELLKLE-----RRKVDDEEKLKESEKEKKKAEKE----LKKEKEEIEELEKELKEleIKREAEEEEEEELEKLQE 366
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  774 DSRFKAIEKMKDREALFNEFVAAARKKEKEDS-KTRGEKIKSDFFELLSNHHLDSQSRWSKVKDKVESDPRYKAVDSssm 852
Cdd:pfam02463  367 KLEQLEEELLAKKKLESERLSSAAKLKEEELElKSEEEKEAQLLLELARQLEDLLKEEKKEELEILEEEEESIELKQ--- 443
                          330       340       350       360       370       380       390
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*..
gi 1841659210  853 REDLFKQYIEKIAKNLDSEKEKELERQARIEAslREREREVQKARSEQTKEIDREREQHKREEAIQNFKALLSDMVR 929
Cdd:pfam02463  444 GKLTEEKEELEKQELKLLKDELELKKSEDLLK--ETQLVKLQEQLELLLSRQKLEERSQKESKARSGLKVLLALIKD 518
SMC_prok_A TIGR02169
chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of ...
861-1070 2.49e-03

chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. It is found in a single copy and is homodimeric in prokaryotes, but six paralogs (excluded from this family) are found in eukaryotes, where SMC proteins are heterodimeric. This family represents the SMC protein of archaea and a few bacteria (Aquifex, Synechocystis, etc); the SMC of other bacteria is described by TIGR02168. The N- and C-terminal domains of this protein are well conserved, but the central hinge region is skewed in composition and highly divergent. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


Pssm-ID: 274009 [Multi-domain]  Cd Length: 1164  Bit Score: 41.98  E-value: 2.49e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  861 IEKIAKNLDSEKEKELERQARIEASLREREREVQKARSEQT---KEIDREREQ-HKREEAIQNFKALLSD-MVRSSDVSW 935
Cdd:TIGR02169  721 IEKEIEQLEQEEEKLKERLEELEEDLSSLEQEIENVKSELKeleARIEELEEDlHKLEEALNDLEARLSHsRIPEIQAEL 800
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  936 SDtrrtLRKDHRWESGSL--LEREEKEKLFNEHIEaltKKKREHFRQLLDEtsaitLTSTWKEVKKIIKEDPRCIKFSSS 1013
Cdd:TIGR02169  801 SK----LEEEVSRIEARLreIEQKLNRLTLEKEYL---EKEIQELQEQRID-----LKEQIKSIEKEIENLNGKKEELEE 868
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210 1014 DRKKQREFEEYIRDKYITAKADFRTLLKETKFITYRSKKL---IQESDQHLKDVEKILQN 1070
Cdd:TIGR02169  869 ELEELEAALRDLESRLGDLKKERDELEAQLRELERKIEELeaqIEKKRKRLSELKAKLEA 928
PRK12727 PRK12727
flagellar biosynthesis protein FlhF;
262-431 2.51e-03

flagellar biosynthesis protein FlhF;


Pssm-ID: 237182 [Multi-domain]  Cd Length: 559  Bit Score: 41.90  E-value: 2.51e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  262 TTSSPAPAVSTSTSSSTPSSTTSTTTTATSVAQTVSTPTTQDQTpSSAVSVATPtVSVSTPAPTATPVQTVPQPHP---Q 338
Cdd:PRK12727    65 TAAAPAPAPQAPTKPAAPVHAPLKLSANANMSQRQRVASAAEDM-IAAMALRQP-VSVPRQAPAAAPVRAASIPSPaaqA 142
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  339 TLPPAVPHSVPQPTTAIPAFPPVMvppFRVPLPGMPIPLPGVAMMQIVSCPYVKTVATT--------------------- 397
Cdd:PRK12727   143 LAHAAAVRTAPRQEHALSAVPEQL---FADFLTTAPVPRAPVQAPVVAAPAPVPAIAAAlaahaayaqdddeqldddgfd 219
                          170       180       190
                   ....*....|....*....|....*....|....*
gi 1841659210  398 -KTGVLPGMAPPIVPmihPQVAIAASPATLAGATA 431
Cdd:PRK12727   220 lDDALPQILPPAALP---PIVVAPAAPAALAAVAA 251
PHA03247 PHA03247
large tegument protein UL36; Provisional
305-376 2.56e-03

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 42.23  E-value: 2.56e-03
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....
gi 1841659210  305 TPSSAVSVATPTVSVSTPAPTATPVQTVPQ-PHPQTLPPAVPHSVPQP-TTAIPAFPPVMVPPFRVPLPGMPIP 376
Cdd:PHA03247   393 TPFARGPGGDDQTRPAAPVPASVPTPAPTPvPASAPPPPATPLPSAEPgSDDGPAPPPERQPPAPATEPAPDDP 466
COG4913 COG4913
Uncharacterized conserved protein, contains a C-terminal ATPase domain [Function unknown];
853-984 3.13e-03

Uncharacterized conserved protein, contains a C-terminal ATPase domain [Function unknown];


Pssm-ID: 443941 [Multi-domain]  Cd Length: 1089  Bit Score: 41.82  E-value: 3.13e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  853 REDLFKQYIEKIAKNLDSEKEKELERQARIEAsLREREREVQKARSEQ--------TKEIDR-EREQHKREEAIQNFKAL 923
Cdd:COG4913    289 RLELLEAELEELRAELARLEAELERLEARLDA-LREELDELEAQIRGNggdrleqlEREIERlERELEERERRRARLEAL 367
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 1841659210  924 LSDMvrssDVSWSDTRRTLRKDHRwESGSLLER--EEKEKLFNEHIEALTKKK--REHFRQLLDE 984
Cdd:COG4913    368 LAAL----GLPLPASAEEFAALRA-EAAALLEAleEELEALEEALAEAEAALRdlRRELRELEAE 427
PRK14971 PRK14971
DNA polymerase III subunit gamma/tau;
293-387 3.15e-03

DNA polymerase III subunit gamma/tau;


Pssm-ID: 237874 [Multi-domain]  Cd Length: 614  Bit Score: 41.68  E-value: 3.15e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  293 AQTVSTPTTQDQTPSSAVSVATPTVSVSTPAPTATPVQTVPQPHPQTLPPAVPHSVPQPTtaiPAFPPVMVPPfRVPLPG 372
Cdd:PRK14971   376 KQHIKPVFTQPAAAPQPSAAAAASPSPSQSSAAAQPSAPQSATQPAGTPPTVSVDPPAAV---PVNPPSTAPQ-AVRPAQ 451
                           90
                   ....*....|....*
gi 1841659210  373 MPIPLPgVAMMQIVS 387
Cdd:PRK14971   452 FKEEKK-IPVSKVSS 465
PRK10263 PRK10263
DNA translocase FtsK; Provisional
266-433 3.45e-03

DNA translocase FtsK; Provisional


Pssm-ID: 236669 [Multi-domain]  Cd Length: 1355  Bit Score: 41.61  E-value: 3.45e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  266 PAPAVSTSTSSSTPSSTTSTTTTATSVAQTVSTPTTQdQTPSSAVSVATPTVSVSTPAPTATPVQTVPQPHPQTLPPAVP 345
Cdd:PRK10263   403 PQQPYYAPAAEQPAQQPYYAPAPEQPAQQPYYAPAPE-QPVAGNAWQAEEQQSTFAPQSTYQTEQTYQQPAAQEPLYQQP 481
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  346 HSVPQPTTAIPAFPPVMVPPFRVPLPGM-------------------PIPLPGVAmmqivSCPYVKTVATTKTGVLPGMA 406
Cdd:PRK10263   482 QPVEQQPVVEPEPVVEETKPARPPLYYFeeveekrarereqlaawyqPIPEPVKE-----PEPIKSSLKAPSVAAVPPVE 556
                          170       180
                   ....*....|....*....|....*..
gi 1841659210  407 PpiVPMIHPqVAIAASPATLAGATAVS 433
Cdd:PRK10263   557 A--AAAVSP-LASGVKKATLATGAAAT 580
DUF5585 pfam17823
Family of unknown function (DUF5585); This is a family of unknown function found in chordata.
258-487 3.53e-03

Family of unknown function (DUF5585); This is a family of unknown function found in chordata.


Pssm-ID: 465521 [Multi-domain]  Cd Length: 506  Bit Score: 41.48  E-value: 3.53e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  258 ASTPTTSSPAPAVSTSTSSSTPSSTTSTTTTATSVAQTVStPTTQDQTPSSAVSVAT-----------PTVSVSTPAPTA 326
Cdd:pfam17823  128 QSLPAAIAALPSEAFSAPRAAACRANASAAPRAAIAAASA-PHAASPAPRTAASSTTaassttaassaPTTAASSAPATL 206
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  327 TPVQTvpqphpqTLPPAVPHSVPQPTTAIPAFPPVMVPPFRVPLPGMPIPLPGVAMMqivsCPYVKTVATTKTGVlpGMA 406
Cdd:pfam17823  207 TPARG-------ISTAATATGHPAAGTALAAVGNSSPAAGTVTAAVGTVTPAALATL----AAAAGTVASAAGTI--NMG 273
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  407 PPIV----PMIH-PQVAIAASPATLAGATAVSEWTE-------YKTADGKTYYYNNRTLESTWEKPQELKEKEKLEEKIK 474
Cdd:pfam17823  274 DPHArrlsPAKHmPSDTMARNPAAPMGAQAQGPIIQvstdqpvHNTAGEPTPSPSNTTLEPNTPKSVASTNLAVVTTTKA 353
                          250
                   ....*....|...
gi 1841659210  475 EPiKEPSEEPLPM 487
Cdd:pfam17823  354 QA-KEPSASPVPV 365
half-pint TIGR01645
poly-U binding splicing factor, half-pint family; The proteins represented by this model ...
297-434 5.21e-03

poly-U binding splicing factor, half-pint family; The proteins represented by this model contain three RNA recognition motifs (rrm: pfam00076) and have been characterized as poly-pyrimidine tract binding proteins associated with RNA splicing factors. PUF60 (GP|6176532), in complex with p54 and in the presence of U2AF, facilitates association of U2 snRNP with pre-mRNA.


Pssm-ID: 130706 [Multi-domain]  Cd Length: 612  Bit Score: 40.82  E-value: 5.21e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  297 STPTTQDQTPSSAVSvATPTvSVSTPaPTATPVQTVPQPHPqTLPPAVPHSVPqpttaipafPPVMVPPFRVPLPGMPIP 376
Cdd:TIGR01645  322 AVLGPRAQSPATPSS-SLPT-DIGNK-AVVSSAKKEAEEVP-PLPQAAPAVVK---------PGPMEIPTPVPPPGLAIP 388
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 1841659210  377 LpGVAMMQIVScpyvktvattKTGVLPG-MAPPIVPMIHPQVAIAASP--ATLAGATAVSE 434
Cdd:TIGR01645  389 S-LVAPPGLVA----------PTEINPSfLASPRKKMKREKLPVTFGAldDTLAWKEPSKE 438
rne PRK10811
ribonuclease E; Reviewed
292-435 5.50e-03

ribonuclease E; Reviewed


Pssm-ID: 236766 [Multi-domain]  Cd Length: 1068  Bit Score: 40.79  E-value: 5.50e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  292 VAQTVSTPTTQDQTPSSAVSVATPTVSVSTPAPTATPVQTVPQPHPQTLPPAVPHSVPQPTTAIPAFPPVMVPPFRVPLP 371
Cdd:PRK10811   887 VVEAVAEVVEEPVVVAEPQPEEVVVVETTHPEVIAAPVTEQPQVITESDVAVAQEVAEHAEPVVEPQDETADIEEAAETA 966
                           90       100       110       120       130       140       150
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|..
gi 1841659210  372 GMPIPLPGVAMMQIVSCPYVKTVATTKTGVL-PGMAPPIVPMIHPQVAIAASPATLAGATA-------VSEW 435
Cdd:PRK10811   967 EVVVAEPEVVAQPAAPVVAEVAAEVETVTAVePEVAPAQVPEATVEHNHATAPMTRAPAPEyvpeaprHSDW 1038
PRK14950 PRK14950
DNA polymerase III subunits gamma and tau; Provisional
292-378 6.81e-03

DNA polymerase III subunits gamma and tau; Provisional


Pssm-ID: 237864 [Multi-domain]  Cd Length: 585  Bit Score: 40.56  E-value: 6.81e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  292 VAQTVSTPTTQDQTPSSAVsvATPTVSVSTPAPTATPVQTVPQPHPQTlpPAVPHSVPQPTTAIP--AFPPVMVPPFRVP 369
Cdd:PRK14950   355 VIEALLVPVPAPQPAKPTA--AAPSPVRPTPAPSTRPKAAAAANIPPK--EPVRETATPPPVPPRpvAPPVPHTPESAPK 430

                   ....*....
gi 1841659210  370 LPGMPIPLP 378
Cdd:PRK14950   431 LTRAAIPVD 439
PRK14959 PRK14959
DNA polymerase III subunits gamma and tau; Provisional
293-374 7.39e-03

Pssm-ID: 184923 [Multi-domain]  Cd Length: 624  Bit Score: 40.43  E-value: 7.39e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  293 AQTVSTPTTQ-DQTPSSAVSVATPTVSVSTPAPTATPVQTVPQphpQTLPPAVPHSvpqpttAIPAFPPVMVPPfRVPLP 371
Cdd:PRK14959   394 AATIPTPGTQgPQGTAPAAGMTPSSAAPATPAPSAAPSPRVPW---DDAPPAPPRS------GIPPRPAPRMPE-ASPVP 463

                   ...
gi 1841659210  372 GMP 374
Cdd:PRK14959   464 GAP 466
DUF5585 pfam17823
Family of unknown function (DUF5585); This is a family of unknown function found in chordata.
259-424 7.43e-03

Pssm-ID: 465521 [Multi-domain]  Cd Length: 506  Bit Score: 40.33  E-value: 7.43e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  259 STPTTS-----SPAPAVSTSTSSSTPSSTTSTTTTATSVAQT--VSTPTTqdQTPSSAVSVATPTVSVSTPAPTATPVQT 331
Cdd:pfam17823   98 SEPATRegaadGAASRALAAAASSSPSSAAQSLPAAIAALPSeaFSAPRA--AACRANASAAPRAAIAAASAPHAASPAP 175
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  332 VPQPHPQTLPPAVPHSVPQPTTAIPAFPPVMVP--PFRVPLPGMPIPLPGVAMMQIvscPYVKTVATTKTGVLPGMAPPI 409
Cdd:pfam17823  176 RTAASSTTAASSTTAASSAPTTAASSAPATLTParGISTAATATGHPAAGTALAAV---GNSSPAAGTVTAAVGTVTPAA 252
                          170
                   ....*....|....*
gi 1841659210  410 VPMIHPQVAIAASPA 424
Cdd:pfam17823  253 LATLAAAAGTVASAA 267
PLN02316 PLN02316
synthase/transferase
850-935 7.89e-03

Pssm-ID: 215180 [Multi-domain]  Cd Length: 1036  Bit Score: 40.24  E-value: 7.89e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  850 SSMREDLFKQYiekiaknLDSEKEKELERQARIEASlREREREVQKARSEQTKEIDREREQHKREEAIQNFKA--LLSDM 927
Cdd:PLN02316   239 GGMDEHSFEDF-------LLEEKRRELEKLAKEEAE-RERQAEEQRRREEEKAAMEADRAQAKAEVEKRREKLqnLLKKA 310

                   ....*...
gi 1841659210  928 VRSSDVSW 935
Cdd:PLN02316   311 SRSADNVW 318
PRK14971 PRK14971
DNA polymerase III subunit gamma/tau;
293-375 8.40e-03

Pssm-ID: 237874 [Multi-domain]  Cd Length: 614  Bit Score: 40.14  E-value: 8.40e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 1841659210  293 AQTVSTPTTQDQTPSSAVSVATPTVSVSTPAPTATPVQ-TVPQPHPQTLPPavPHSVPQPTTAIPAFPPVMVPPFRVPLP 371
Cdd:PRK14971   389 APQPSAAAAASPSPSQSSAAAQPSAPQSATQPAGTPPTvSVDPPAAVPVNP--PSTAPQAVRPAQFKEEKKIPVSKVSSL 466

                   ....
gi 1841659210  372 GMPI 375
Cdd:PRK14971   467 GPST 470
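The pairwise alignment blocks above can be summarized column by column. The following is a minimal sketch, not part of the CD-Search output: it assumes that lowercase letters in the query row mark residues that are not aligned to the domain model and that `-` marks a gap. The function name `alignment_stats` is hypothetical.

```python
def alignment_stats(query, subject):
    """Count aligned, identical, and gapped columns in one alignment block.

    Assumptions (not confirmed by the listing itself): lowercase letters
    are unaligned residues, and '-' is a gap in either row.
    """
    if len(query) != len(subject):
        raise ValueError("aligned rows must have equal length")
    aligned = identical = gaps = 0
    for q, s in zip(query, subject):
        if q == "-" or s == "-":
            # Column where one of the two rows has no residue.
            gaps += 1
        elif q.isupper() and s.isupper():
            # Column where both rows carry an aligned residue.
            aligned += 1
            if q == s:
                identical += 1
    return {"aligned": aligned, "identical": identical, "gaps": gaps}


# Final block of the PRK14950 hit (query residues 370-378).
stats = alignment_stats("LPGMPIPLP", "LTRAAIPVD")
print(stats)
```

Dividing `identical` by `aligned` gives a rough percent identity for a block. With E-values this close to the 0.01 cutoff and a query region rich in proline, serine, and threonine, such figures are best read as descriptive rather than as evidence of homology.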
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options:
  Database: CDSEARCH/cdd
  Low complexity filter: no
  Composition Based Adjustment: yes
  E-value threshold: 0.01
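To make the E-value threshold concrete, here is a small sketch that parses the per-hit summary line as it is printed in this listing and checks it against the 0.01 cutoff. The regular expression, the function names `parse_summary` and `passes_threshold`, and the inclusive comparison are assumptions made for illustration. They are not part of any NCBI API.

```python
import re

# Pattern for the "Pssm-ID: ... E-value: ..." line as it appears in this
# listing. The layout is inferred from the text above, not from a spec.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*"
    r"(?:\[(?P<kind>[^\]]+)\])?\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the fields of one summary line as a dict, or None if no match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m.group("pssm_id")),
        "kind": m.group("kind"),
        "cd_length": int(m.group("cd_length")),
        "bit_score": float(m.group("bit_score")),
        "e_value": float(m.group("e_value")),
    }


def passes_threshold(hit, threshold=0.01):
    """True if the hit's E-value is at or below the cutoff (inclusive by assumption)."""
    return hit["e_value"] <= threshold


line = "Pssm-ID: 236766 [Multi-domain]  Cd Length: 1068  Bit Score: 40.79  E-value: 5.50e-03"
hit = parse_summary(line)
print(hit, passes_threshold(hit))
```

Every hit in the list above falls under the 0.01 cutoff, which is why each one is reported. Several of them sit only slightly below it.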

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res. 51(D1):D384-D388.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res. 48(D1):D265-D268.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures", Nucleic Acids Res. 45(D1):D200-D203.