Conserved domains on [gi|2181405477|ref|NP_001387014|]

transcription elongation regulator 1 isoform 8 [Homo sapiens]

Protein Classification

pre-mRNA-processing factor 40 family protein; WW domain-containing protein (domain architecture ID 13915558)

pre-mRNA-processing factor 40 (PRPF40) family protein similar to mammalian PRPF40 homologs A and B that may be involved in pre-mRNA splicing; contains WW and FF domains | WW domain-containing protein; the WW domain mediates protein-protein interaction via proline-rich motifs, such as PPxY
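
The classification text names PPxY as an example of the proline-rich motifs that WW domains recognize. As a minimal sketch, a candidate binding partner could be scanned for that pattern with a regular expression. The function name `find_ppxy` and the example peptide are hypothetical and are not taken from this record; note that the motif sits on the ligand, not on the WW domain itself.

```python
import re

# PPxY: two prolines, any residue, then tyrosine.
# The lookahead lets overlapping occurrences be reported as well.
PPXY = re.compile(r"(?=(PP.Y))")

def find_ppxy(seq: str) -> list[tuple[int, str]]:
    """Return (1-based start, matched motif) for every PPxY occurrence."""
    return [(m.start() + 1, m.group(1)) for m in PPXY.finditer(seq.upper())]

# Hypothetical peptide for illustration only; not part of NP_001387014.
example = "GSPPPYTAPPLYQ"
print(find_ppxy(example))  # [(3, 'PPPY'), (9, 'PPLY')]
```

A plain pattern match like this only flags candidates; it says nothing about whether a given WW domain actually binds the site.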


List of domain hits

Name Accession Description Interval E-value
PRP40 super family cl34905
Splicing factor [RNA processing and modification];
372-986 2.08e-22


The actual alignment was detected with superfamily member COG5104:

Pssm-ID: 227435 [Multi-domain]  Cd Length: 590  Bit Score: 102.85  E-value: 2.08e-22
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  372 ASPATLAGATAVSEWTEYKTADGKTYYYNNRTLESTWEKPQElkekekleekikepIKEPSEEPLPMEteeedpkeepik 451
Cdd:COG5104      3 AALLGMASGEARSEWEELKAPDGRIYYYNKRTGKSSWEKPKE--------------LLKGSEEDLDVD------------ 56
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  452 eikeepkeeemteeekaaqkakpvatapipgtPWCVVWTGDERVFFYNPTTRLSMWDRpddligradvdkiiqePPHKKG 531
Cdd:COG5104     57 --------------------------------PWKECRTADGKVYYYNSITRESRWKI----------------PPERKK 88
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  532 MEELKKLRHPTPTMLSIQKWQFSMSAIkEEQELMEEINEDepvkakkrkrmskksfmwiaraslfrrddnkdidsEKEAA 611
Cdd:COG5104     89 VEPIAEQKHDERSMIGGNGNDMAITDH-ETSEPKYLLGRL-----------------------------------MSQYG 132
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  612 MEAEIKAARERAivpLEARMKQFKDMLLERGVSAFSTWEKELHKIVfDPRYLLL--NPKERKQVFDQYVKTRAEEERREK 689
Cdd:COG5104    133 ITSTKDAVYRLT---KEEAEKEFITMLKENQVDSTWPIFRAIEELR-DPRYWMVdtDPLWRKDLFKKYFENQEKDQREEE 208
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  690 KNKIMQAKEDFKKMME-EAKFNPRATFSEFAAKHAKDSRFKAIEKMKDREALFNEFVAAARKKEKEDSKTRGEKIKSDFF 768
Cdd:COG5104    209 ENKQRKYINEFCKMLAgNSHIKYYTDWFTFKSIFSKHPYYSSVVNEKTKRQTFQKYKDKLGCYEKYVGKHMGGTALGRLE 288
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  769 ELLSNHHLDSQSRWSKVKDKVESDPRYKAvdSSSM----REDLFKQYIeKIAKNLdsekekelerqarieaslrerEREV 844
Cdd:COG5104    289 EVLRSLGSETFIIWLLNHYVFDSVVRYLK--NKEMkpldRKDILFSFI-RYVRRL---------------------EKEL 344
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  845 QKARSEQTKEIDReREQHKREeaiqNFKALLSDMVRSSDVS----WSDTRRTLRKDHRWESGSLLEREEKEKLFNEHIEA 920
Cdd:COG5104    345 LSAIEERKAAAAQ-NARHHRD----EFRTLLRKLYSEGKIYyrmkWKNAYPLIKDDPRFLNLLGRTGSSPLDLFFDFIVD 419
                          570       580       590       600       610       620       630
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 2181405477  921 LTKKKREHFRQLLDETSaITLTSTW--KEVKKIIKEDPRciKFSSSDRKKQREFEE---YIRDKYITAKAD 986
Cdd:COG5104    420 LENMYGFARRSYERETR-TGQISPTdrRAVDEIFEAIAE--KKEEGEIKFDKVDKEdisLIVDGLIKQRNE 487
WW smart00456
132-164 7.26e-08

Domain with 2 conserved Trp (W) residues; Also known as the WWP or rsp5 domain. Binds proline-rich polypeptides.



Pssm-ID: 197736 [Multi-domain]  Cd Length: 33  Bit Score: 49.14  E-value: 7.26e-08
                            10        20        30
                    ....*....|....*....|....*....|...
gi 2181405477   132 PTEEIWVENKTPDGKVYYYNARTRESAWTKPDG 164
Cdd:smart00456    1 PLPPGWEERKDPDGRPYYYNHETKETQWEKPRE 33
FF smart00441
981-1045 1.81e-04

Contains two conserved F residues; A novel motif that often accompanies WW domains. Often contains two conserved Phe (F) residues.



Pssm-ID: 128718 [Multi-domain]  Cd Length: 55  Bit Score: 40.25  E-value: 1.81e-04
                            10        20        30        40        50        60
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 2181405477   981 ITAKADFRTLLKETKFITYrskkliqesDQHLKDVEKILQNDKRYLVLDcVPEERRKLIVAYVDD 1045
Cdd:smart00441    1 EEAKEAFKELLKEHEVITP---------DTTWSEARKKLKNDPRYKALL-SESEREQLFEDHIEE 55
PAT1 super family cl37801
300-379 2.39e-04

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


The actual alignment was detected with superfamily member pfam09770:

Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 45.41  E-value: 2.39e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  300 PVQTVPQPHPQTLPPAVPHSVPQPTTAIPAFPPVMV------------PPFRVPLPGMPIPLPGVLPG-MAPPIVPMIHP 366
Cdd:pfam09770  262 PVTILQRPQSPQPDPAQPSIQPQAQQFHQQPPPVPVqptqilqnpnrlSAARVGYPQNPQPGVQPAPAhQAHRQQGSFGR 341
                           90
                   ....*....|...
gi 2181405477  367 QVAIAASPATLAG 379
Cdd:pfam09770  342 QAPIITHPQQLAQ 354
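
Each hit above carries a statistics line of the form `Pssm-ID: ... Cd Length: ... Bit Score: ... E-value: ...`. The sketch below shows one way to pull those fields out of the plain-text page and to estimate how much of a domain model an alignment spans. The names `HitStats`, `parse_stats` and `model_coverage` are illustrative assumptions and not part of any NCBI tool, and the regular expression assumes the exact field order seen in this listing.

```python
import re
from typing import NamedTuple

class HitStats(NamedTuple):
    pssm_id: int
    cd_length: int      # length of the domain model, in positions
    bit_score: float
    e_value: float

# Field order is assumed to match the statistics lines in this listing.
STATS = re.compile(
    r"Pssm-ID:\s*(\d+).*?Cd Length:\s*(\d+)\s+"
    r"Bit Score:\s*([\d.]+)\s+E-value:\s*([\d.eE+-]+)"
)

def parse_stats(line: str) -> HitStats:
    """Parse one 'Pssm-ID: ...' statistics line into typed fields."""
    m = STATS.search(line)
    if m is None:
        raise ValueError(f"not a statistics line: {line!r}")
    return HitStats(int(m.group(1)), int(m.group(2)),
                    float(m.group(3)), float(m.group(4)))

def model_coverage(first: int, last: int, cd_length: int) -> float:
    """Fraction of the model spanned by aligned model positions first..last."""
    return (last - first + 1) / cd_length

line = ("Pssm-ID: 227435 [Multi-domain]  Cd Length: 590  "
        "Bit Score: 102.85  E-value: 2.08e-22")
stats = parse_stats(line)
# The COG5104 alignment above runs from model position 3 to 487.
print(stats, round(model_coverage(3, 487, stats.cd_length), 3))
```

The coverage figure counts the span between the first and last aligned model positions, so gaps inside the alignment are not subtracted.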
 
Name Accession Description Interval E-value
PRP40 COG5104
Splicing factor [RNA processing and modification];
372-986 2.08e-22


Pssm-ID: 227435 [Multi-domain]  Cd Length: 590  Bit Score: 102.85  E-value: 2.08e-22
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  372 ASPATLAGATAVSEWTEYKTADGKTYYYNNRTLESTWEKPQElkekekleekikepIKEPSEEPLPMEteeedpkeepik 451
Cdd:COG5104      3 AALLGMASGEARSEWEELKAPDGRIYYYNKRTGKSSWEKPKE--------------LLKGSEEDLDVD------------ 56
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  452 eikeepkeeemteeekaaqkakpvatapipgtPWCVVWTGDERVFFYNPTTRLSMWDRpddligradvdkiiqePPHKKG 531
Cdd:COG5104     57 --------------------------------PWKECRTADGKVYYYNSITRESRWKI----------------PPERKK 88
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  532 MEELKKLRHPTPTMLSIQKWQFSMSAIkEEQELMEEINEDepvkakkrkrmskksfmwiaraslfrrddnkdidsEKEAA 611
Cdd:COG5104     89 VEPIAEQKHDERSMIGGNGNDMAITDH-ETSEPKYLLGRL-----------------------------------MSQYG 132
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  612 MEAEIKAARERAivpLEARMKQFKDMLLERGVSAFSTWEKELHKIVfDPRYLLL--NPKERKQVFDQYVKTRAEEERREK 689
Cdd:COG5104    133 ITSTKDAVYRLT---KEEAEKEFITMLKENQVDSTWPIFRAIEELR-DPRYWMVdtDPLWRKDLFKKYFENQEKDQREEE 208
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  690 KNKIMQAKEDFKKMME-EAKFNPRATFSEFAAKHAKDSRFKAIEKMKDREALFNEFVAAARKKEKEDSKTRGEKIKSDFF 768
Cdd:COG5104    209 ENKQRKYINEFCKMLAgNSHIKYYTDWFTFKSIFSKHPYYSSVVNEKTKRQTFQKYKDKLGCYEKYVGKHMGGTALGRLE 288
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  769 ELLSNHHLDSQSRWSKVKDKVESDPRYKAvdSSSM----REDLFKQYIeKIAKNLdsekekelerqarieaslrerEREV 844
Cdd:COG5104    289 EVLRSLGSETFIIWLLNHYVFDSVVRYLK--NKEMkpldRKDILFSFI-RYVRRL---------------------EKEL 344
                          490       500       510       520       530       540       550       560
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  845 QKARSEQTKEIDReREQHKREeaiqNFKALLSDMVRSSDVS----WSDTRRTLRKDHRWESGSLLEREEKEKLFNEHIEA 920
Cdd:COG5104    345 LSAIEERKAAAAQ-NARHHRD----EFRTLLRKLYSEGKIYyrmkWKNAYPLIKDDPRFLNLLGRTGSSPLDLFFDFIVD 419
                          570       580       590       600       610       620       630
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|.
gi 2181405477  921 LTKKKREHFRQLLDETSaITLTSTW--KEVKKIIKEDPRciKFSSSDRKKQREFEE---YIRDKYITAKAD 986
Cdd:COG5104    420 LENMYGFARRSYERETR-TGQISPTdrRAVDEIFEAIAE--KKEEGEIKFDKVDKEdisLIVDGLIKQRNE 487
FF pfam01846
762-811 1.28e-13

FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions.


Pssm-ID: 426471 [Multi-domain]  Cd Length: 50  Bit Score: 65.94  E-value: 1.28e-13
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|
gi 2181405477  762 KIKSDFFELLSNHHLDSQSRWSKVKDKVESDPRYKAVDSSSMREDLFKQY 811
Cdd:pfam01846    1 KAREAFKELLKEHKITPYSTWSEIKKKIENDPRYKALLDGSEREELFEDY 50
FF smart00441
923-978 7.80e-10

Contains two conserved F residues; A novel motif that often accompanies WW domains. Often contains two conserved Phe (F) residues.


Pssm-ID: 128718 [Multi-domain]  Cd Length: 55  Bit Score: 55.27  E-value: 7.80e-10
                            10        20        30        40        50
                    ....*....|....*....|....*....|....*....|....*....|....*.
gi 2181405477   923 KKKREHFRQLLDETSAITLTSTWKEVKKIIKEDPRCiKFSSSDRKKQREFEEYIRD 978
Cdd:smart00441    1 EEAKEAFKELLKEHEVITPDTTWSEARKKLKNDPRY-KALLSESEREQLFEDHIEE 55
WW cd00201
384-411 2.61e-08

Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs.


Pssm-ID: 238122 [Multi-domain]  Cd Length: 31  Bit Score: 50.60  E-value: 2.61e-08
                           10        20
                   ....*....|....*....|....*...
gi 2181405477  384 SEWTEYKTADGKTYYYNNRTLESTWEKP 411
Cdd:cd00201      2 PGWEERWDPDGRVYYYNHNTKETQWEDP 29
WW pfam00397
137-162 7.30e-08

WW domain; The WW domain is a protein module with two highly conserved tryptophans that binds proline-rich peptide motifs in vitro.


Pssm-ID: 459800 [Multi-domain]  Cd Length: 30  Bit Score: 49.04  E-value: 7.30e-08
                           10        20
                   ....*....|....*....|....*.
gi 2181405477  137 WVENKTPDGKVYYYNARTRESAWTKP 162
Cdd:pfam00397    5 WEERWDPDGRVYYYNHETGETQWEKP 30
WW cd00201
137-164 1.92e-07

Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs.


Pssm-ID: 238122 [Multi-domain]  Cd Length: 31  Bit Score: 47.91  E-value: 1.92e-07
                           10        20
                   ....*....|....*....|....*...
gi 2181405477  137 WVENKTPDGKVYYYNARTRESAWTKPDG 164
Cdd:cd00201      4 WEERWDPDGRVYYYNHNTKETQWEDPRE 31
PTZ00121 PTZ00121
MAEBL; Provisional
678-1018 1.27e-05


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 49.75  E-value: 1.27e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  678 VKTRAEEERR--EKKNKIMQAK--EDFKKMMEEAKFNPRATFSEFAAKHAKDSRFKAIEKMKDREAlfNEFVAAARKKEK 753
Cdd:PTZ00121  1423 AKKKAEEKKKadEAKKKAEEAKkaDEAKKKAEEAKKAEEAKKKAEEAKKADEAKKKAEEAKKADEA--KKKAEEAKKKAD 1500
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  754 EDSKTRGEKIKSDffELLSNHHLDSQSRWSKVKDKVESDPRYKAVDSSSMREDLFKQYIEKIAKNLDSEKEKELERQARI 833
Cdd:PTZ00121  1501 EAKKAAEAKKKAD--EAKKAEEAKKADEAKKAEEAKKADEAKKAEEKKKADELKKAEELKKAEEKKKAEEAKKAEEDKNM 1578
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  834 EASLREREREVQKARSEQTKEIDREREQHKREEAiqnfKALLSDMVRSSDVSWSDTRRtlRKDHRWESGSLLEREEKEKL 913
Cdd:PTZ00121  1579 ALRKAEEAKKAEEARIEEVMKLYEEEKKMKAEEA----KKAEEAKIKAEELKKAEEEK--KKVEQLKKKEAEEKKKAEEL 1652
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  914 FNEHIEALTKKKREHFRQLLDETSAitltstwKEVKKIIKEDPRCIKFSSSDRKKQREFEEyIRDKYITAKADFRTLLKE 993
Cdd:PTZ00121  1653 KKAEEENKIKAAEEAKKAEEDKKKA-------EEAKKAEEDEKKAAEALKKEAEEAKKAEE-LKKKEAEEKKKAEELKKA 1724
                          330       340
                   ....*....|....*....|....*
gi 2181405477  994 TKFITYRSKKLIQESDQHLKDVEKI 1018
Cdd:PTZ00121  1725 EEENKIKAEEAKKEAEEDKKKAEEA 1749
PRP40 COG5104
Splicing factor [RNA processing and modification];
136-173 1.92e-05


Pssm-ID: 227435 [Multi-domain]  Cd Length: 590  Bit Score: 48.54  E-value: 1.92e-05
                           10        20        30
                   ....*....|....*....|....*....|....*...
gi 2181405477  136 IWVENKTPDGKVYYYNARTRESAWTKPDgvKVIQQSEL 173
Cdd:COG5104     16 EWEELKAPDGRIYYYNKRTGKSSWEKPK--ELLKGSEE 51
FF pfam01846
983-1042 1.89e-04

FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions.


Pssm-ID: 426471 [Multi-domain]  Cd Length: 50  Bit Score: 40.13  E-value: 1.89e-04
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  983 AKADFRTLLKETKfITYRSkkliqesdqHLKDVEKILQNDKRYLVLDcVPEERRKLIVAY 1042
Cdd:pfam01846    2 AREAFKELLKEHK-ITPYS---------TWSEIKKKIENDPRYKALL-DGSEREELFEDY 50
PAT1 pfam09770
300-379 2.39e-04

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 45.41  E-value: 2.39e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  300 PVQTVPQPHPQTLPPAVPHSVPQPTTAIPAFPPVMV------------PPFRVPLPGMPIPLPGVLPG-MAPPIVPMIHP 366
Cdd:pfam09770  262 PVTILQRPQSPQPDPAQPSIQPQAQQFHQQPPPVPVqptqilqnpnrlSAARVGYPQNPQPGVQPAPAhQAHRQQGSFGR 341
                           90
                   ....*....|...
gi 2181405477  367 QVAIAASPATLAG 379
Cdd:pfam09770  342 QAPIITHPQQLAQ 354
half-pint TIGR01645
299-385 7.37e-04

poly-U binding splicing factor, half-pint family; The proteins represented by this model contain three RNA recognition motifs (rrm: pfam00076) and have been characterized as poly-pyrimidine tract binding proteins associated with RNA splicing factors. In the case of PUF60 (GP|6176532), in complex with p54, and in the presence of U2AF, facilitates association of U2 snRNP with pre-mRNA.


Pssm-ID: 130706 [Multi-domain]  Cd Length: 612  Bit Score: 43.52  E-value: 7.37e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  299 TPVQTVPQPHPqTLPPAVPHSV-PQPTTAI-PAFPPVMVPPFRVPLPGMPIPLPGVLPGMAPPIVPMIHPQVAIAASP-- 374
Cdd:TIGR01645  349 SSAKKEAEEVP-PLPQAAPAVVkPGPMEIPtPVPPPGLAIPSLVAPPGLVAPTEINPSFLASPRKKMKREKLPVTFGAld 427
                           90
                   ....*....|.
gi 2181405477  375 ATLAGATAVSE 385
Cdd:TIGR01645  428 DTLAWKEPSKE 438
PHA02682 PHA02682
ORF080 virion core protein; Provisional
300-393 1.23e-03


Pssm-ID: 177464 [Multi-domain]  Cd Length: 280  Bit Score: 42.16  E-value: 1.23e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  300 PVQTVPQPHPQTLPPAVPHSVPQPTTAIPAFPPVMVPP-FRVPLPGMPIPLPGVLPGMAPPIV-PMIHPQVAIAASPATL 377
Cdd:PHA02682   105 PAVTCPAPAPACPPATAPTCPPPAVCPAPARPAPACPPsTRQCPPAPPLPTPKPAPAAKPIFLhNQLPPPDYPAASCPTI 184
                           90
                   ....*....|....*.
gi 2181405477  378 AGATAVSEWTEYKTAD 393
Cdd:PHA02682   185 ETAPAASPVLEPRIPD 200
SMC_prok_A TIGR02169
812-1021 6.74e-03

chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. It is found in a single copy and is homodimeric in prokaryotes, but six paralogs (excluded from this family) are found in eukaryotes, where SMC proteins are heterodimeric. This family represents the SMC protein of archaea and a few bacteria (Aquifex, Synechocystis, etc); the SMC of other bacteria is described by TIGR02168. The N- and C-terminal domains of this protein are well conserved, but the central hinge region is skewed in composition and highly divergent. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


Pssm-ID: 274009 [Multi-domain]  Cd Length: 1164  Bit Score: 40.44  E-value: 6.74e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  812 IEKIAKNLDSEKEKELERQARIEASLREREREVQKARSEQT---KEIDREREQ-HKREEAIQNFKALLSD-MVRSSDVSW 886
Cdd:TIGR02169  721 IEKEIEQLEQEEEKLKERLEELEEDLSSLEQEIENVKSELKeleARIEELEEDlHKLEEALNDLEARLSHsRIPEIQAEL 800
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  887 SDtrrtLRKDHRWESGSL--LEREEKEKLFNEHIEaltKKKREHFRQLLDEtsaitLTSTWKEVKKIIKEDPRCIKFSSS 964
Cdd:TIGR02169  801 SK----LEEEVSRIEARLreIEQKLNRLTLEKEYL---EKEIQELQEQRID-----LKEQIKSIEKEIENLNGKKEELEE 868
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  965 DRKKQREFEEYIRDKYITAKADFRTLLKETKFITYRSKKL---IQESDQHLKDVEKILQN 1021
Cdd:TIGR02169  869 ELEELEAALRDLESRLGDLKKERDELEAQLRELERKIEELeaqIEKKRKRLSELKAKLEA 928
 
Name Accession Description Interval E-value
FF pfam01846
695-744 1.31e-11

FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions.


Pssm-ID: 426471 [Multi-domain]  Cd Length: 50  Bit Score: 60.16  E-value: 1.31e-11
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|
gi 2181405477  695 QAKEDFKKMMEEAKFNPRATFSEFAAKHAKDSRFKAIEKMKDREALFNEF 744
Cdd:pfam01846    1 KAREAFKELLKEHKITPYSTWSEIKKKIENDPRYKALLDGSEREELFEDY 50
FF pfam01846
630-677 8.64e-10

FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions.


Pssm-ID: 426471 [Multi-domain]  Cd Length: 50  Bit Score: 55.16  E-value: 8.64e-10
                           10        20        30        40
                   ....*....|....*....|....*....|....*....|....*....
gi 2181405477  630 RMKQFKDMLLERGVSAFSTWEKELHKIVFDPRYL-LLNPKERKQVFDQY 677
Cdd:pfam01846    2 AREAFKELLKEHKITPYSTWSEIKKKIENDPRYKaLLDGSEREELFEDY 50
FF smart00441
761-814 1.69e-09

Contains two conserved F residues; A novel motif that often accompanies WW domains. Often contains two conserved Phe (F) residues.


Pssm-ID: 128718 [Multi-domain]  Cd Length: 55  Bit Score: 54.50  E-value: 1.69e-09
                            10        20        30        40        50
                    ....*....|....*....|....*....|....*....|....*....|....*
gi 2181405477   761 EKIKSDFFELLSNHHLD-SQSRWSKVKDKVESDPRYKAVDSSSMREDLFKQYIEK 814
Cdd:smart00441    1 EEAKEAFKELLKEHEVItPDTTWSEARKKLKNDPRYKALLSESEREQLFEDHIEE 55
FF pfam01846
866-917 3.05e-09

FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions.


Pssm-ID: 426471 [Multi-domain]  Cd Length: 50  Bit Score: 53.61  E-value: 3.05e-09
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|..
gi 2181405477  866 EAIQNFKALLSDMVRSSDVSWSDTRRTLRKDHRWEsgSLLEREEKEKLFNEH 917
Cdd:pfam01846    1 KAREAFKELLKEHKITPYSTWSEIKKKIENDPRYK--ALLDGSEREELFEDY 50
WW cd00201
Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; ...
384-411 2.61e-08

Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs.


Pssm-ID: 238122 [Multi-domain]  Cd Length: 31  Bit Score: 50.60  E-value: 2.61e-08
                           10        20
                   ....*....|....*....|....*...
gi 2181405477  384 SEWTEYKTADGKTYYYNNRTLESTWEKP 411
Cdd:cd00201      2 PGWEERWDPDGRVYYYNHNTKETQWEDP 29
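
The two rows of each alignment block can be compared column by column. The sketch below counts identical residues between a query row and a Cdd row; the function name `alignment_identity` is hypothetical, and it assumes that `-` marks a gap and that lowercase letters mark unaligned residues, so only columns where both rows carry an uppercase residue are counted.

```python
def alignment_identity(query_row, subject_row):
    """Return (identical, aligned) column counts for two equal-length alignment rows.

    Assumption: '-' is a gap and lowercase letters are unaligned residues,
    so a column is counted only when both rows hold an uppercase letter.
    """
    if len(query_row) != len(subject_row):
        raise ValueError("alignment rows must have the same length")
    identical = 0
    aligned = 0
    for q, s in zip(query_row, subject_row):
        if not (q.isupper() and s.isupper()):
            continue  # skip gaps and unaligned (lowercase) residues
        aligned += 1
        if q == s:
            identical += 1
    return identical, aligned


# Rows copied from the WW (cd00201) alignment at residues 384-411.
ww_identity = alignment_identity(
    "SEWTEYKTADGKTYYYNNRTLESTWEKP",
    "PGWEERWDPDGRVYYYNHNTKETQWEDP",
)
```

For the WW (cd00201) rows shown above, this gives 13 identical residues over 28 aligned columns.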
WW pfam00397
WW domain; The WW domain is a protein module with two highly conserved tryptophans that binds ...
384-411 4.26e-08

WW domain; The WW domain is a protein module with two highly conserved tryptophans that binds proline-rich peptide motifs in vitro.


Pssm-ID: 459800 [Multi-domain]  Cd Length: 30  Bit Score: 49.81  E-value: 4.26e-08
                           10        20
                   ....*....|....*....|....*...
gi 2181405477  384 SEWTEYKTADGKTYYYNNRTLESTWEKP 411
Cdd:pfam00397    3 PGWEERWDPDGRVYYYNHETGETQWEKP 30
WW smart00456
Domain with 2 conserved Trp (W) residues; Also known as the WWP or rsp5 domain. Binds ...
132-164 7.26e-08

Domain with 2 conserved Trp (W) residues; Also known as the WWP or rsp5 domain. Binds proline-rich polypeptides.


Pssm-ID: 197736 [Multi-domain]  Cd Length: 33  Bit Score: 49.14  E-value: 7.26e-08
                            10        20        30
                    ....*....|....*....|....*....|...
gi 2181405477   132 PTEEIWVENKTPDGKVYYYNARTRESAWTKPDG 164
Cdd:smart00456    1 PLPPGWEERKDPDGRPYYYNHETKETQWEKPRE 33
WW pfam00397
WW domain; The WW domain is a protein module with two highly conserved tryptophans that binds ...
137-162 7.30e-08

WW domain; The WW domain is a protein module with two highly conserved tryptophans that binds proline-rich peptide motifs in vitro.


Pssm-ID: 459800 [Multi-domain]  Cd Length: 30  Bit Score: 49.04  E-value: 7.30e-08
                           10        20
                   ....*....|....*....|....*.
gi 2181405477  137 WVENKTPDGKVYYYNARTRESAWTKP 162
Cdd:pfam00397    5 WEERWDPDGRVYYYNHETGETQWEKP 30
FF smart00441
Contains two conserved F residues; A novel motif that often accompanies WW domains. Often ...
694-747 1.36e-07

Contains two conserved F residues; A novel motif that often accompanies WW domains. Often contains two conserved Phe (F) residues.


Pssm-ID: 128718 [Multi-domain]  Cd Length: 55  Bit Score: 49.11  E-value: 1.36e-07
                            10        20        30        40        50
                    ....*....|....*....|....*....|....*....|....*....|....*
gi 2181405477   694 MQAKEDFKKMMEEAKFN-PRATFSEFAAKHAKDSRFKAIEKMKDREALFNEFVAA 747
Cdd:smart00441    1 EEAKEAFKELLKEHEVItPDTTWSEARKKLKNDPRYKALLSESEREQLFEDHIEE 55
WW smart00456
Domain with 2 conserved Trp (W) residues; Also known as the WWP or rsp5 domain. Binds ...
386-412 1.47e-07

Domain with 2 conserved Trp (W) residues; Also known as the WWP or rsp5 domain. Binds proline-rich polypeptides.


Pssm-ID: 197736 [Multi-domain]  Cd Length: 33  Bit Score: 48.37  E-value: 1.47e-07
                            10        20
                    ....*....|....*....|....*..
gi 2181405477   386 WTEYKTADGKTYYYNNRTLESTWEKPQ 412
Cdd:smart00456    6 WEERKDPDGRPYYYNHETKETQWEKPR 32
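
Several models (cd00201, pfam00397, smart00456) report the same WW region with slightly different intervals. As a rough sketch, overlapping intervals can be merged to obtain one footprint per region; the helper `merge_intervals` is an illustrative assumption, not part of any NCBI tool, and the intervals are copied from the WW hits listed up to this point.

```python
def merge_intervals(intervals):
    """Merge overlapping (start, end) residue intervals, inclusive on both ends."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            # Overlaps the previous footprint: extend it if needed.
            merged[-1] = (merged[-1][0], max(merged[-1][1], end))
        else:
            merged.append((start, end))
    return merged


# WW hit intervals from this listing (cd00201, pfam00397, smart00456).
ww_hits = [(384, 411), (384, 411), (132, 164), (137, 162), (386, 412)]
ww_footprints = merge_intervals(ww_hits)
```

With these five hits the result is two footprints, (132, 164) and (384, 412).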
WW cd00201
Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; ...
137-164 1.92e-07

Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs; 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs.


Pssm-ID: 238122 [Multi-domain]  Cd Length: 31  Bit Score: 47.91  E-value: 1.92e-07
                           10        20
                   ....*....|....*....|....*...
gi 2181405477  137 WVENKTPDGKVYYYNARTRESAWTKPDG 164
Cdd:cd00201      4 WEERWDPDGRVYYYNHNTKETQWEDPRE 31
FF pfam01846
FF domain; This domain has been predicted to be involved in protein-protein interaction. This ...
924-975 3.02e-07

FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions.


Pssm-ID: 426471 [Multi-domain]  Cd Length: 50  Bit Score: 47.84  E-value: 3.02e-07
                           10        20        30        40        50
                   ....*....|....*....|....*....|....*....|....*....|..
gi 2181405477  924 KKREHFRQLLDETSaITLTSTWKEVKKIIKEDPRCIKFSSSDRKKQrEFEEY 975
Cdd:pfam01846    1 KAREAFKELLKEHK-ITPYSTWSEIKKKIENDPRYKALLDGSEREE-LFEDY 50
FF smart00441
Contains two conserved F residues; A novel motif that often accompanies WW domains. Often ...
865-920 2.55e-06

Contains two conserved F residues; A novel motif that often accompanies WW domains. Often contains two conserved Phe (F) residues.


Pssm-ID: 128718 [Multi-domain]  Cd Length: 55  Bit Score: 45.64  E-value: 2.55e-06
                            10        20        30        40        50
                    ....*....|....*....|....*....|....*....|....*....|....*..
gi 2181405477   865 EEAIQNFKALLSDMVRS-SDVSWSDTRRTLRKDHRWESgsLLEREEKEKLFNEHIEA 920
Cdd:smart00441    1 EEAKEAFKELLKEHEVItPDTTWSEARKKLKNDPRYKA--LLSESEREQLFEDHIEE 55
FF smart00441
Contains two conserved F residues; A novel motif that often accompanies WW domains. Often ...
628-679 6.74e-06

Contains two conserved F residues; A novel motif that often accompanies WW domains. Often contains two conserved Phe (F) residues.


Pssm-ID: 128718 [Multi-domain]  Cd Length: 55  Bit Score: 44.10  E-value: 6.74e-06
                            10        20        30        40        50
                    ....*....|....*....|....*....|....*....|....*....|....
gi 2181405477   628 EARMKQFKDMLLERGVS-AFSTWEKELHKIVFDPRY-LLLNPKERKQVFDQYVK 679
Cdd:smart00441    1 EEAKEAFKELLKEHEVItPDTTWSEARKKLKNDPRYkALLSESEREQLFEDHIE 54
WW smart00456
Domain with 2 conserved Trp (W) residues; Also known as the WWP or rsp5 domain. Binds ...
484-512 1.22e-05

Domain with 2 conserved Trp (W) residues; Also known as the WWP or rsp5 domain. Binds proline-rich polypeptides.


Pssm-ID: 197736 [Multi-domain]  Cd Length: 33  Bit Score: 42.97  E-value: 1.22e-05
                            10        20
                    ....*....|....*....|....*....
gi 2181405477   484 PWCVVWTGDERVFFYNPTTRLSMWDRPDD 512
Cdd:smart00456    5 GWEERKDPDGRPYYYNHETKETQWEKPRE 33
PTZ00121 PTZ00121
MAEBL; Provisional
678-1018 1.27e-05

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 49.75  E-value: 1.27e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  678 VKTRAEEERR--EKKNKIMQAK--EDFKKMMEEAKFNPRATFSEFAAKHAKDSRFKAIEKMKDREAlfNEFVAAARKKEK 753
Cdd:PTZ00121  1423 AKKKAEEKKKadEAKKKAEEAKkaDEAKKKAEEAKKAEEAKKKAEEAKKADEAKKKAEEAKKADEA--KKKAEEAKKKAD 1500
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  754 EDSKTRGEKIKSDffELLSNHHLDSQSRWSKVKDKVESDPRYKAVDSSSMREDLFKQYIEKIAKNLDSEKEKELERQARI 833
Cdd:PTZ00121  1501 EAKKAAEAKKKAD--EAKKAEEAKKADEAKKAEEAKKADEAKKAEEKKKADELKKAEELKKAEEKKKAEEAKKAEEDKNM 1578
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  834 EASLREREREVQKARSEQTKEIDREREQHKREEAiqnfKALLSDMVRSSDVSWSDTRRtlRKDHRWESGSLLEREEKEKL 913
Cdd:PTZ00121  1579 ALRKAEEAKKAEEARIEEVMKLYEEEKKMKAEEA----KKAEEAKIKAEELKKAEEEK--KKVEQLKKKEAEEKKKAEEL 1652
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  914 FNEHIEALTKKKREHFRQLLDETSAitltstwKEVKKIIKEDPRCIKFSSSDRKKQREFEEyIRDKYITAKADFRTLLKE 993
Cdd:PTZ00121  1653 KKAEEENKIKAAEEAKKAEEDKKKA-------EEAKKAEEDEKKAAEALKKEAEEAKKAEE-LKKKEAEEKKKAEELKKA 1724
                          330       340
                   ....*....|....*....|....*
gi 2181405477  994 TKFITYRSKKLIQESDQHLKDVEKI 1018
Cdd:PTZ00121  1725 EEENKIKAEEAKKEAEEDKKKAEEA 1749
DUF5401 pfam17380
Family of unknown function (DUF5401); This is a family of unknown function found in ...
669-919 1.54e-05

Family of unknown function (DUF5401); This is a family of unknown function found in Chromadorea.


Pssm-ID: 375164 [Multi-domain]  Cd Length: 722  Bit Score: 48.97  E-value: 1.54e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  669 ERKQVfDQYVKTRAEEERREKKNKIMQAKEdfKKMMEEAKFNPRATFSEFAAKHAKDSRFkAIEKMKDREALFNEfvaaa 748
Cdd:pfam17380  286 ERQQQ-EKFEKMEQERLRQEKEEKAREVER--RRKLEEAEKARQAEMDRQAAIYAEQERM-AMERERELERIRQE----- 356
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  749 rKKEKEDSKTRGEKIKSDFFEL--LSNHHLDSQSRWSKVKDKVESDPRYKAVDSSSMRE-DLFKQYIEKIaknldsEKEK 825
Cdd:pfam17380  357 -ERKRELERIRQEEIAMEISRMreLERLQMERQQKNERVRQELEAARKVKILEEERQRKiQQQKVEMEQI------RAEQ 429
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  826 ELERQARIEASLREREREVQKARSEQ---TKEIDREREQhkrEEAIQNFKALLSDMVRSSDVSWSDTRRTLRKDHRWESG 902
Cdd:pfam17380  430 EEARQREVRRLEEERAREMERVRLEEqerQQQVERLRQQ---EEERKRKKLELEKEKRDRKRAEEQRRKILEKELEERKQ 506
                          250
                   ....*....|....*..
gi 2181405477  903 SLLEREEKEKLFNEHIE 919
Cdd:pfam17380  507 AMIEEERKRKLLEKEME 523
PRP40 COG5104
Splicing factor [RNA processing and modification];
136-173 1.92e-05

Splicing factor [RNA processing and modification];


Pssm-ID: 227435 [Multi-domain]  Cd Length: 590  Bit Score: 48.54  E-value: 1.92e-05
                           10        20        30
                   ....*....|....*....|....*....|....*...
gi 2181405477  136 IWVENKTPDGKVYYYNARTRESAWTKPDgvKVIQQSEL 173
Cdd:COG5104     16 EWEELKAPDGRIYYYNKRTGKSSWEKPK--ELLKGSEE 51
WW cd00201
Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; ...
483-512 2.21e-05

Two conserved tryptophans domain; also known as the WWP or rsp5 domain; around 40 amino acids; functions as an interaction module in a diverse set of signalling proteins; binds specific proline-rich sequences but at low affinities compared to other peptide recognition proteins such as antibodies and receptors; WW domains have a single groove formed by a conserved Trp and Tyr which recognizes a pair of residues of the sequence X-Pro; variable loops and neighboring domains confer specificity in this domain; there are five distinct groups based on binding: 1) PPXY motifs; 2) the PPLP motif; 3) PGM motifs; 4) PSP or PTP motifs; 5) PR motifs.


Pssm-ID: 238122 [Multi-domain]  Cd Length: 31  Bit Score: 42.13  E-value: 2.21e-05
                           10        20        30
                   ....*....|....*....|....*....|
gi 2181405477  483 TPWCVVWTGDERVFFYNPTTRLSMWDRPDD 512
Cdd:cd00201      2 PGWEERWDPDGRVYYYNHNTKETQWEDPRE 31
PRP40 COG5104
Splicing factor [RNA processing and modification];
137-172 7.43e-05

Splicing factor [RNA processing and modification];


Pssm-ID: 227435 [Multi-domain]  Cd Length: 590  Bit Score: 46.61  E-value: 7.43e-05
                           10        20        30
                   ....*....|....*....|....*....|....*.
gi 2181405477  137 WVENKTPDGKVYYYNARTRESAWTKPDGVKVIQQSE 172
Cdd:COG5104     58 WKECRTADGKVYYYNSITRESRWKIPPERKKVEPIA 93
PTZ00121 PTZ00121
MAEBL; Provisional
681-950 7.43e-05

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 47.06  E-value: 7.43e-05
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  681 RAEEERR--EKKNKIMQAK--EDFKKMMEEAKFNPRATFSEFAAKHAKDSRFKAIEKMKDREALFNEFVAAARKKEKEDS 756
Cdd:PTZ00121  1297 KAEEKKKadEAKKKAEEAKkaDEAKKKAEEAKKKADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEKKKEEA 1376
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  757 KTRGEKIKSDFFELLSNHHLDSQSRwskvKDKVESDPRYKAVDSSSMREDLfKQYIEKIAKNLDSEKEKELERQARieaS 836
Cdd:PTZ00121  1377 KKKADAAKKKAEEKKKADEAKKKAE----EDKKKADELKKAAAAKKKADEA-KKKAEEKKKADEAKKKAEEAKKAD---E 1448
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  837 LREREREVQKARSEQTKEIDREREQHKREEAIQNFKAllSDMVRSSDVSWSDTRRTLRKDHRWESGSLLEREEKEKLFNE 916
Cdd:PTZ00121  1449 AKKKAEEAKKAEEAKKKAEEAKKADEAKKKAEEAKKA--DEAKKKAEEAKKKADEAKKAAEAKKKADEAKKAEEAKKADE 1526
                          250       260       270
                   ....*....|....*....|....*....|....
gi 2181405477  917 HIEALTKKKREHFRQLLDETSAITLTSTwKEVKK 950
Cdd:PTZ00121  1527 AKKAEEAKKADEAKKAEEKKKADELKKA-EELKK 1559
FF smart00441
Contains two conserved F residues; A novel motif that often accompanies WW domains. Often ...
981-1045 1.81e-04

Contains two conserved F residues; A novel motif that often accompanies WW domains. Often contains two conserved Phe (F) residues.


Pssm-ID: 128718 [Multi-domain]  Cd Length: 55  Bit Score: 40.25  E-value: 1.81e-04
                            10        20        30        40        50        60
                    ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 2181405477   981 ITAKADFRTLLKETKFITYrskkliqesDQHLKDVEKILQNDKRYLVLDcVPEERRKLIVAYVDD 1045
Cdd:smart00441    1 EEAKEAFKELLKEHEVITP---------DTTWSEARKKLKNDPRYKALL-SESEREQLFEDHIEE 55
FF pfam01846
FF domain; This domain has been predicted to be involved in protein-protein interaction. This ...
983-1042 1.89e-04

FF domain; This domain has been predicted to be involved in protein-protein interaction. This domain was recently shown to bind the hyperphosphorylated C-terminal repeat domain of RNA polymerase II, confirming its role in protein-protein interactions.


Pssm-ID: 426471 [Multi-domain]  Cd Length: 50  Bit Score: 40.13  E-value: 1.89e-04
                           10        20        30        40        50        60
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  983 AKADFRTLLKETKfITYRSkkliqesdqHLKDVEKILQNDKRYLVLDcVPEERRKLIVAY 1042
Cdd:pfam01846    2 AREAFKELLKEHK-ITPYS---------TWSEIKKKIENDPRYKALL-DGSEREELFEDY 50
WW pfam00397
WW domain; The WW domain is a protein module with two highly conserved tryptophans that binds ...
483-510 2.37e-04

WW domain; The WW domain is a protein module with two highly conserved tryptophans that binds proline-rich peptide motifs in vitro.


Pssm-ID: 459800 [Multi-domain]  Cd Length: 30  Bit Score: 39.41  E-value: 2.37e-04
                           10        20
                   ....*....|....*....|....*...
gi 2181405477  483 TPWCVVWTGDERVFFYNPTTRLSMWDRP 510
Cdd:pfam00397    3 PGWEERWDPDGRVYYYNHETGETQWEKP 30
PAT1 pfam09770
Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate ...
300-379 2.39e-04

Topoisomerase II-associated protein PAT1; Members of this family are necessary for accurate chromosome transmission during cell division.


Pssm-ID: 401645 [Multi-domain]  Cd Length: 846  Bit Score: 45.41  E-value: 2.39e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  300 PVQTVPQPHPQTLPPAVPHSVPQPTTAIPAFPPVMV------------PPFRVPLPGMPIPLPGVLPG-MAPPIVPMIHP 366
Cdd:pfam09770  262 PVTILQRPQSPQPDPAQPSIQPQAQQFHQQPPPVPVqptqilqnpnrlSAARVGYPQNPQPGVQPAPAhQAHRQQGSFGR 341
                           90
                   ....*....|...
gi 2181405477  367 QVAIAASPATLAG 379
Cdd:pfam09770  342 QAPIITHPQQLAQ 354
PTZ00121 PTZ00121
MAEBL; Provisional
665-1037 3.04e-04

MAEBL; Provisional


Pssm-ID: 173412 [Multi-domain]  Cd Length: 2084  Bit Score: 45.13  E-value: 3.04e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  665 LNPKERKQVFDQYVKTRAEEERREKKNKIMQAKEDFKKMMEEAKfnpratFSEFAAKHAKDSRfKAIEKMKDREALFNEf 744
Cdd:PTZ00121  1072 LKPSYKDFDFDAKEDNRADEATEEAFGKAEEAKKTETGKAEEAR------KAEEAKKKAEDAR-KAEEARKAEDARKAE- 1143
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  745 vaAARKKEkEDSKTRGEKIKSDFFELLSNHHLDSQSRWSKVKDKVE---SDPRYKAVDSSSMREDLFKQYIEKIAKNLDS 821
Cdd:PTZ00121  1144 --EARKAE-DAKRVEIARKAEDARKAEEARKAEDAKKAEAARKAEEvrkAEELRKAEDARKAEAARKAEEERKAEEARKA 1220
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  822 EKEKELERQARIEaSLREREREVQKARSEQTKEIDREREQHKREEAIQNFKALLSDMVRSSDvswsdtrrTLRK-DHRWE 900
Cdd:PTZ00121  1221 EDAKKAEAVKKAE-EAKKDAEEAKKAEEERNNEEIRKFEEARMAHFARRQAAIKAEEARKAD--------ELKKaEEKKK 1291
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  901 SGSLLEREEKEKLFNEHIEALTKKKREHFRQLLDET--SAITLTSTWKEVKKIIKEDPRCIKFSSSDRKKQREFEEYIRD 978
Cdd:PTZ00121  1292 ADEAKKAEEKKKADEAKKKAEEAKKADEAKKKAEEAkkKADAAKKKAEEAKKAAEAAKAEAEAAADEAEAAEEKAEAAEK 1371
                          330       340       350       360       370       380
                   ....*....|....*....|....*....|....*....|....*....|....*....|.
gi 2181405477  979 KYITAKADFRTLLK--ETKFITYRSKKLIQESDQHLKDVEKILQNDKRYLVLDCVPEERRK 1037
Cdd:PTZ00121  1372 KKEEAKKKADAAKKkaEEKKKADEAKKKAEEDKKKADELKKAAAAKKKADEAKKKAEEKKK 1432
PRK03918 PRK03918
DNA double-strand break repair ATPase Rad50;
684-1038 6.18e-04

DNA double-strand break repair ATPase Rad50;


Pssm-ID: 235175 [Multi-domain]  Cd Length: 880  Bit Score: 43.90  E-value: 6.18e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  684 EERREKKNKIMQAKEDFKKMMEEAKFNPRATFSEFAAKHAKDSRF-----KAIEKMKDREALFNEFVAAARKKEKEDSKT 758
Cdd:PRK03918   175 KRRIERLEKFIKRTENIEELIKEKEKELEEVLREINEISSELPELreeleKLEKEVKELEELKEEIEELEKELESLEGSK 254
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  759 RGEKIKsdffelLSNhhldSQSRWSKVKDKVEsDPRYKAVDSSSMREDLfKQYIEkiaknLDSEKEKELERQARIE---A 835
Cdd:PRK03918   255 RKLEEK------IRE----LEERIEELKKEIE-ELEEKVKELKELKEKA-EEYIK-----LSEFYEEYLDELREIEkrlS 317
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  836 SLREREREVQKARSEQTKEIDREREQHKREEAIQNFKALLSDMVRSSDvswsDTRRTLRKDHRWESGslLEREEKEKLFN 915
Cdd:PRK03918   318 RLEEEINGIEERIKELEEKEERLEELKKKLKELEKRLEELEERHELYE----EAKAKKEELERLKKR--LTGLTPEKLEK 391
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  916 EhIEALTKKKREHFRQLLdetsaiTLTSTWKEVKKIIKEDPRCIKFSSSDRKK----QREFEEYIRDKYITA-KADFRTL 990
Cdd:PRK03918   392 E-LEELEKAKEEIEEEIS------KITARIGELKKEIKELKKAIEELKKAKGKcpvcGRELTEEHRKELLEEyTAELKRI 464
                          330       340       350       360
                   ....*....|....*....|....*....|....*....|....*...
gi 2181405477  991 LKETKFITYRSKKLIQEsdqhLKDVEKILQNDKRYLVLDCVPEERRKL 1038
Cdd:PRK03918   465 EKELKEIEEKERKLRKE----LRELEKVLKKESELIKLKELAEQLKEL 508
half-pint TIGR01645
poly-U binding splicing factor, half-pint family; The proteins represented by this model ...
299-385 7.37e-04

poly-U binding splicing factor, half-pint family; The proteins represented by this model contain three RNA recognition motifs (rrm: pfam00076) and have been characterized as poly-pyrimidine tract binding proteins associated with RNA splicing factors. PUF60 (GP|6176532), in complex with p54 and in the presence of U2AF, facilitates association of U2 snRNP with pre-mRNA.


Pssm-ID: 130706 [Multi-domain]  Cd Length: 612  Bit Score: 43.52  E-value: 7.37e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  299 TPVQTVPQPHPqTLPPAVPHSV-PQPTTAI-PAFPPVMVPPFRVPLPGMPIPLPGVLPGMAPPIVPMIHPQVAIAASP-- 374
Cdd:TIGR01645  349 SSAKKEAEEVP-PLPQAAPAVVkPGPMEIPtPVPPPGLAIPSLVAPPGLVAPTEINPSFLASPRKKMKREKLPVTFGAld 427
                           90
                   ....*....|.
gi 2181405477  375 ATLAGATAVSE 385
Cdd:TIGR01645  428 DTLAWKEPSKE 438
SMC_N pfam02463
RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The ...
666-1058 8.07e-04

RecF/RecN/SMC N terminal domain; This domain is found at the N terminus of SMC proteins. The SMC (structural maintenance of chromosomes) superfamily proteins have ATP-binding domains at the N- and C-termini, and two extended coiled-coil domains separated by a hinge in the middle. The eukaryotic SMC proteins form two kinds of heterodimers: the SMC1/SMC3 and the SMC2/SMC4 types. These heterodimers constitute an essential part of higher order complexes, which are involved in chromatin and DNA dynamics. This family also includes the RecF and RecN proteins that are involved in DNA metabolism and recombination.


Pssm-ID: 426784 [Multi-domain]  Cd Length: 1161  Bit Score: 43.81  E-value: 8.07e-04
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  666 NPKERKQVFDQYVKTRAEEERREKKNKIMQAKEDfkkmmeeakfnpratfsefaakhakdsrfKAIEKMKDREALFNEFV 745
Cdd:pfam02463  151 KPERRLEIEEEAAGSRLKRKKKEALKKLIEETEN-----------------------------LAELIIDLEELKLQELK 201
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  746 AAARKKEKEDSKTRGEKIKSDFFELLSNHHLDSQSRWSKVKDKVESDPRYKavDSSSMREDLFKQYIEKIAKNLDSEKEK 825
Cdd:pfam02463  202 LKEQAKKALEYYQLKEKLELEEEYLLYLDYLKLNEERIDLLQELLRDEQEE--IESSKQEIEKEEEKLAQVLKENKEEEK 279
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  826 ELERQARIEASLREREREVQKAR--SEQTKEIDREREQHKREEAIQNFKALLSDMVRSSDvswsdtRRTLRKDHRWESGS 903
Cdd:pfam02463  280 EKKLQEEELKLLAKEEEELKSELlkLERRKVDDEEKLKESEKEKKKAEKELKKEKEEIEE------LEKELKELEIKREA 353
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  904 LLEREE----KEKLFNEHIEALTKKKREHFRQLLDETSAITLTSTWKEVKKIIkedprcikfsSSDRKKQREFEEYIRDK 979
Cdd:pfam02463  354 EEEEEEelekLQEKLEQLEEELLAKKKLESERLSSAAKLKEEELELKSEEEKE----------AQLLLELARQLEDLLKE 423
                          330       340       350       360       370       380       390
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....
gi 2181405477  980 YITAKADFrtLLKETKFITYRSKKLIQESDqHLKDVEKILQNDKRYLVLDCVPEERRKLIVAYVDDLDRRGPPPPPTAS 1058
Cdd:pfam02463  424 EKKEELEI--LEEEEESIELKQGKLTEEKE-ELEKQELKLLKDELELKKSEDLLKETQLVKLQEQLELLLSRQKLEERS 499
PHA02682 PHA02682
ORF080 virion core protein; Provisional
300-393 1.23e-03

ORF080 virion core protein; Provisional


Pssm-ID: 177464 [Multi-domain]  Cd Length: 280  Bit Score: 42.16  E-value: 1.23e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  300 PVQTVPQPHPQTLPPAVPHSVPQPTTAIPAFPPVMVPP-FRVPLPGMPIPLPGVLPGMAPPIV-PMIHPQVAIAASPATL 377
Cdd:PHA02682   105 PAVTCPAPAPACPPATAPTCPPPAVCPAPARPAPACPPsTRQCPPAPPLPTPKPAPAAKPIFLhNQLPPPDYPAASCPTI 184
                           90
                   ....*....|....*.
gi 2181405477  378 AGATAVSEWTEYKTAD 393
Cdd:PHA02682   185 ETAPAASPVLEPRIPD 200
HEC1 COG5185
Chromosome segregation protein NDC80, interacts with SMC proteins [Cell cycle control, cell ...
681-963 1.40e-03

Chromosome segregation protein NDC80, interacts with SMC proteins [Cell cycle control, cell division, chromosome partitioning];


Pssm-ID: 444066 [Multi-domain]  Cd Length: 594  Bit Score: 42.64  E-value: 1.40e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  681 RAEEERREKKNKIMQAKEDFKKMMEEAKFNPRATFSEFAAKHAKDSrfKAIEKMKDREALFNEFVAAARKKEKEDSKTRG 760
Cdd:COG5185    257 KLVEQNTDLRLEKLGENAESSKRLNENANNLIKQFENTKEKIAEYT--KSIDIKKATESLEEQLAAAEAEQELEESKRET 334
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  761 EKIKSDFFELLSNHHLDSQSRWSKVKDKVESDPRYKAVDSSSMREDLFKQYIEKIAKNLDSekekelerqarIEASLRER 840
Cdd:COG5185    335 ETGIQNLTAEIEQGQESLTENLEAIKEEIENIVGEVELSKSSEELDSFKDTIESTKESLDE-----------IPQNQRGY 403
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  841 EREVQKARSEQTKEIDREREQHKR---------EEAIQNFKALLSDMVRSSDVSWSDTRRTLRKDHRWESGSLLEREEKE 911
Cdd:COG5185    404 AQEILATLEDTLKAADRQIEELQRqieqatssnEEVSKLLNELISELNKVMREADEESQSRLEEAYDEINRSVRSKKEDL 483
                          250       260       270       280       290       300
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 2181405477  912 --------------KLFNEHIEALTKKKREHFRQLLDETSAITLTSTWKEVKKIIKEDPRCIKFSS 963
Cdd:COG5185    484 neeltqiesrvstlKATLEKLRAKLERQLEGVRSKLDQVAESLKDFMRARGYAHILALENLIPASE 549
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
258-374 4.51e-03

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 41.29  E-value: 4.51e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  258 ASTPTTSSPAPAVSTSTSSSTPSSTTSTTTTATSVAQTVSTTPVQTV------------PQPHPQTLPPavPHSVPQPTT 325
Cdd:pfam03154  176 AQSGAASPPSPPPPGTTQAATAGPTPSAPSVPPQGSPATSQPPNQTQstaaphtliqqtPTLHPQRLPS--PHPPLQPMT 253
                           90       100       110       120       130
                   ....*....|....*....|....*....|....*....|....*....|....*....
gi 2181405477  326 AIPafPPVMVPP-------FRVPLPGMPIPL---PGVLPGMAPPIVPMIHPQVAIAASP 374
Cdd:pfam03154  254 QPP--PPSQVSPqplpqpsLHGQMPPMPHSLqtgPSHMQHPVPPQPFPLTPQSSQSQVP 310
Atrophin-1 pfam03154
Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian ...
296-367 5.57e-03

Atrophin-1 family; Atrophin-1 is the protein product of the dentatorubral-pallidoluysian atrophy (DRPLA) gene. DRPLA OMIM:125370 is a progressive neurodegenerative disorder. It is caused by the expansion of a CAG repeat in the DRPLA gene on chromosome 12p. This results in an extended polyglutamine region in atrophin-1, that is thought to confer toxicity to the protein, possibly through altering its interactions with other proteins. The expansion of a CAG repeat is also the underlying defect in six other neurodegenerative disorders, including Huntington's disease. One interaction of expanded polyglutamine repeats that is thought to be pathogenic is that with the short glutamine repeat in the transcriptional coactivator CREB binding protein, CBP. This interaction draws CBP away from its usual nuclear location to the expanded polyglutamine repeat protein aggregates that are characteristic of the polyglutamine neurodegenerative disorders. This interferes with CBP-mediated transcription and causes cytotoxicity.


Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 40.91  E-value: 5.57e-03
                           10        20        30        40        50        60        70
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 2181405477  296 VSTTPVQTVPQPHPQTLPPAVPHSVPQPTTAIPAFPPVMV------PPFRVPLPGMPIPLPGVLPGMAPPIVPMIHPQ 367
Cdd:pfam03154  293 VPPQPFPLTPQSSQSQVPPGPSPAAPGQSQQRIHTPPSQSqlqsqqPPREQPLPPAPLSMPHIKPPPTTPIPQLPNPQ 370
COG4913 COG4913
Uncharacterized conserved protein, contains a C-terminal ATPase domain [Function unknown];
804-935 6.32e-03

Uncharacterized conserved protein, contains a C-terminal ATPase domain [Function unknown];


Pssm-ID: 443941 [Multi-domain]  Cd Length: 1089  Bit Score: 40.67  E-value: 6.32e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  804 REDLFKQYIEKIAKNLDSEKEKELERQARIEAsLREREREVQKARSEQ--------TKEIDR-EREQHKREEAIQNFKAL 874
Cdd:COG4913    289 RLELLEAELEELRAELARLEAELERLEARLDA-LREELDELEAQIRGNggdrleqlEREIERlERELEERERRRARLEAL 367
                           90       100       110       120       130       140
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*
gi 2181405477  875 LSDMvrssDVSWSDTRRTLRKDHRwESGSLLER--EEKEKLFNEHIEALTKKK--REHFRQLLDE 935
Cdd:COG4913    368 LAAL----GLPLPASAEEFAALRA-EAAALLEAleEELEALEEALAEAEAALRdlRRELRELEAE 427
PHA03247 PHA03247
large tegument protein UL36; Provisional
293-386 6.45e-03

large tegument protein UL36; Provisional


Pssm-ID: 223021 [Multi-domain]  Cd Length: 3151  Bit Score: 40.69  E-value: 6.45e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  293 AQTVSTTPVQTVPQPHPQTLPPAVPHSVPQPTTAIPAFPPVMVPPFRVPLPGMPIPLPgvlpgmAPPIvpmihPQVAIAA 372
Cdd:PHA03247  2925 PPPQPQPPPPPPPRPQPPLAPTTDPAGAGEPSGAVPQPWLGALVPGRVAVPRFRVPQP------APSR-----EAPASST 2993
                           90
                   ....*....|....
gi 2181405477  373 SPATLAGATAVSEW 386
Cdd:PHA03247  2994 PPLTGHSLSRVSSW 3007
SMC_prok_A TIGR02169
chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of ...
812-1021 6.74e-03

chromosome segregation protein SMC, primarily archaeal type; SMC (structural maintenance of chromosomes) proteins bind DNA and act in organizing and segregating chromosomes for partition. SMC proteins are found in bacteria, archaea, and eukaryotes. It is found in a single copy and is homodimeric in prokaryotes, but six paralogs (excluded from this family) are found in eukaryotes, where SMC proteins are heterodimeric. This family represents the SMC protein of archaea and a few bacteria (Aquifex, Synechocystis, etc.); the SMC of other bacteria is described by TIGR02168. The N- and C-terminal domains of this protein are well conserved, but the central hinge region is skewed in composition and highly divergent. [Cellular processes, Cell division, DNA metabolism, Chromosome-associated proteins]


Pssm-ID: 274009 [Multi-domain]  Cd Length: 1164  Bit Score: 40.44  E-value: 6.74e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  812 IEKIAKNLDSEKEKELERQARIEASLREREREVQKARSEQT---KEIDREREQ-HKREEAIQNFKALLSD-MVRSSDVSW 886
Cdd:TIGR02169  721 IEKEIEQLEQEEEKLKERLEELEEDLSSLEQEIENVKSELKeleARIEELEEDlHKLEEALNDLEARLSHsRIPEIQAEL 800
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  887 SDtrrtLRKDHRWESGSL--LEREEKEKLFNEHIEaltKKKREHFRQLLDEtsaitLTSTWKEVKKIIKEDPRCIKFSSS 964
Cdd:TIGR02169  801 SK----LEEEVSRIEARLreIEQKLNRLTLEKEYL---EKEIQELQEQRID-----LKEQIKSIEKEIENLNGKKEELEE 868
                          170       180       190       200       210       220
                   ....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  965 DRKKQREFEEYIRDKYITAKADFRTLLKETKFITYRSKKL---IQESDQHLKDVEKILQN 1021
Cdd:TIGR02169  869 ELEELEAALRDLESRLGDLKKERDELEAQLRELERKIEELeaqIEKKRKRLSELKAKLEA 928
PLN02316 PLN02316
synthase/transferase
801-886 9.18e-03

synthase/transferase


Pssm-ID: 215180 [Multi-domain]  Cd Length: 1036  Bit Score: 40.24  E-value: 9.18e-03
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2181405477  801 SSMREDLFKQYiekiaknLDSEKEKELERQARIEASlREREREVQKARSEQTKEIDREREQHKREEAIQNFKA--LLSDM 878
Cdd:PLN02316   239 GGMDEHSFEDF-------LLEEKRRELEKLAKEEAE-RERQAEEQRRREEEKAAMEADRAQAKAEVEKRREKLqnLLKKA 310

                   ....*...
gi 2181405477  879 VRSSDVSW 886
Cdd:PLN02316   311 SRSADNVW 318
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd   Low complexity filter: no   Composition Based Adjustment: yes   E-value threshold: 0.01
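
The per-hit summary lines in this report (the ones starting with "Pssm-ID:") carry the CD length, bit score, and E-value, and the preset options list an E-value threshold of 0.01. A minimal Python sketch of reading those lines and applying such a cutoff might look like the following. The function names `parse_summary` and `hits_below_threshold` are hypothetical, the regular expression assumes the exact line layout shown on this page, and the strict `<` comparison is an assumption rather than documented CD-Search behavior.

```python
import re

# Matches the per-hit summary line as it appears in this report, e.g.
# "Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 40.91  E-value: 5.57e-03"
# (layout assumed from this page; other report versions may differ).
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*(?:\[(?P<kind>[^\]]+)\])?\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return a dict for one summary line, or None if the line does not match."""
    m = SUMMARY_RE.search(line)
    if m is None:
        return None
    return {
        "pssm_id": int(m["pssm_id"]),
        "multi_domain": m["kind"] == "Multi-domain",
        "cd_length": int(m["cd_length"]),
        "bit_score": float(m["bit_score"]),
        "e_value": float(m["e_value"]),
    }


def hits_below_threshold(lines, threshold=0.01):
    """Keep parsed hits whose E-value is below the threshold (strict '<' is an assumption)."""
    hits = (parse_summary(line) for line in lines)
    return [h for h in hits if h is not None and h["e_value"] < threshold]


# Three summary lines copied from the hit list above.
report_lines = [
    "Pssm-ID: 460830 [Multi-domain]  Cd Length: 991  Bit Score: 40.91  E-value: 5.57e-03",
    "Pssm-ID: 443941 [Multi-domain]  Cd Length: 1089  Bit Score: 40.67  E-value: 6.32e-03",
    "Pssm-ID: 215180 [Multi-domain]  Cd Length: 1036  Bit Score: 40.24  E-value: 9.18e-03",
]

for hit in hits_below_threshold(report_lines):
    print(hit["pssm_id"], hit["bit_score"], hit["e_value"])
```

Passing a stricter cutoff, for example `hits_below_threshold(report_lines, threshold=0.006)`, would keep only the hits with the smallest E-values.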
