Conserved domains on [gi|18410222|ref|NP_567015|]

Cleavage and polyadenylation specificity factor (CPSF) A subunit protein [Arabidopsis thaliana]

Protein Classification

DDB1/RSE1 family protein (domain architecture ID 10564462)

DDB1/RSE1 family proteins are nucleic acid-binding proteins with a beta-propeller fold; examples include human DNA damage-binding protein 1 (DDB1) and Neurospora crassa pre-mRNA-splicing factor RSE1.

CATH:  2.130.10.10
Gene Ontology:  GO:0003676
SCOP:  4004169


List of domain hits

Name Accession Description Interval E-value
MMS1_N pfam10433
Mono-functional DNA-alkylating methyl methanesulfonate N-term; MMS1 is a protein that protects ...
75-589 0e+00

Mono-functional DNA-alkylating methyl methanesulfonate N-term; MMS1 is a protein that protects against replication-dependent DNA damage in Saccharomyces cerevisiae. MMS1 belongs to the DDB1 family of cullin 4 adaptors, and the two proteins are homologous. MMS1 bridges the interaction of MMS22 and Crt10 with Cul8/Rtt101, a cullin protein involved in the regulation of DNA replication subsequent to DNA damage. The N-terminal region of MMS1 and the C-terminal region of MMS22 are required for the MMS1-MMS22 interaction. The HIV-1 virion-associated protein Vpr assembles with human DDB1 through interaction with DCAF1 (DDB1- and CUL4-associated factor 1) to form an E3 ubiquitin ligase that targets cellular substrates for proteasome-mediated degradation and subsequent G2 arrest.



Pssm-ID: 463091  Cd Length: 486  Bit Score: 565.74  E-value: 0e+00
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 18410222     75 DYIVVGSDSGRIVILEYNKEKNVFDKVHQ-ETFGKSGCRRIVPGQYVAVDPKGRAVMIGACEKQKLVYVLN----RDTTA 149
Cdd:pfam10433    1 DHLVVGTDSGRLVFLSWDPEKNQFETIHSrEDLGKSGSRRSQPGQYLAVDPKGRAIAVSAYEGVFLVYPLKqpqkLNRNE 80
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 18410222    150 RLTISSPLEAHKSHTICYSLCGVDCGFDNPIFAAIEldyseadQDPTGQaaseaqKHLTFYELDLGLNHVSR--KWSNPV 227
Cdd:pfam10433   81 ALLLSSPLEARKSEGFILSMVFLDPGYDNPIFALLE-------QDRTGK------THLKLYEWDLGLNHVVRgpKWSEPL 147
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 18410222    228 DN------GANMLVTVPGGadgPSGVLVCAENFVIYMNQG-HPDVRAV---IPRRTDLPAergvlvvSAAVHKQKTmFFF 297
Cdd:pfam10433  148 DFlpkedrGANLLIPVPKG---PGGVLVCGETIITYKDILdQPDIRCPpvaRPLRENATI-------FVAWHKLDN-FFI 216
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 18410222    298 LIQTEYGDVFKVTLDHNGDH-VSELKVKYFDTIPVASSICVLKLGFLFSASEFGNHGLYQFQAIGEEPdvessssnlmet 376
Cdd:pfam10433  217 LLADEYGDLYLLTIENDEDNvVTSIKIGYFGTTSVASALVILDNGFLFVASEFGDSQLYQIDARGDDD------------ 284
                          330       340       350       360       370       380       390       400
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 18410222    377 eegfqpvffqprrLKNLVRIDQVESLMPLMDMKVLNIFEEETPQIFSLCGRGPRSSLRILRPGLAITEMAVSQLPGQP-S 455
Cdd:pfam10433  285 -------------LSNLELVQTFSNWAPILDFVVMDLGGEDTARIYTCSGAGKRGSLRSLRHGVGAEELAVSEEPGSPiT 351
                          410       420       430       440       450       460       470       480
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 18410222    456 AVWTVKKNVSDEFDAYIVVSFTNATLVLSI-GEQVEEV-NDSGFLDTTPSLAVSLIGDDSLMQVHPNGIRHIREDGRINE 533
Cdd:pfam10433  352 GVWTLKSSPEDEYDDYLVVSFVNETRVLSIdGDGVEEVdEDSGFLLSVPTLAAGNLGDGRLLQVTPNGIRLIDSDKRISE 431
                          490       500       510       520       530
                   ....*....|....*....|....*....|....*....|....*....|....*..
gi 18410222    534 WRTPGKRSIVKVGYNRLQVVIALSGGELIYFEadMTGQLMEV-EKHEMSGDVACLDI 589
Cdd:pfam10433  432 WKPPGGKSITAAAANGRQVLLALSGGELVYFE--ISTQLIEVvERKDLSSQVSCISL 486
CPSF_A pfam03178
CPSF A subunit region; This family includes a region that lies towards the C-terminus of the ...
861-1180 4.45e-103

CPSF A subunit region; This family includes a region that lies towards the C-terminus of the cleavage and polyadenylation specificity factor (CPSF) A (160 kDa) subunit. CPSF is involved in mRNA polyadenylation and binds the AAUAAA conserved sequence in pre-mRNA. CPSF has also been found to be necessary for splicing of single-intron pre-mRNAs. The function of the aligned region is unknown but may be involved in RNA/DNA binding.



Pssm-ID: 427182  Cd Length: 319  Bit Score: 328.78  E-value: 4.45e-103
                           10        20        30        40        50        60        70        80
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 18410222    861 SCIRVLDPKTATTTCLLELQDNEAAYSVCTVNFHDKEYG----TLLAVGTVKGMQFWPKKNlvAGFIHIYRFVEDGKS-- 934
Cdd:pfam03178    2 SCIRLVDPITKEVIDTLELEENEAVLSVKSVNLEDSSTTkgkeEYLVVGTAFDLGEDPAAR--SGRILVFEIIEVPETnr 79
                           90       100       110       120       130       140       150       160
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 18410222    935 -LELLHKTQVEGVPLALCQFQGRLLAGIGPVLRLYDLG-KKRLLRKCENKLfPNTIISIQTYRDRIYVGDIQESFHYCKY 1012
Cdd:pfam03178   80 kLKLVHKTEVKGAVTALAEFQGRLLAGQGQKLRVYDLGeDKSLLPKAFLDT-GVYVVDLKVFGNRIIVGDLMKSVTFVGY 158
                          170       180       190       200       210       220       230       240
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 18410222   1013 RRDENQLYIFADDCVPRWLTASHHVDFDTMAGADKFGNVYFVRLPQDLSEEIEEDPtggkikweqgklngapnKVDEIVQ 1092
Cdd:pfam03178  159 DEEPYRLIEFARDTQPRWVTAAEFLDGDTVLVADKFGNLHVLRYDPDVPESLDGDP-----------------RLLVRAE 221
                          250       260       270       280       290       300       310       320
                   ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 18410222   1093 FHVGDVVTCLQKASMIPGGSES-----IMYGTVMGSIGALHAFTSRDDVDFFSHLEMHMRQEYPPLCGRDHMAYRSAYFP 1167
Cdd:pfam03178  222 FHLGETVTSFRKGSLVPGGSESpsspqLLYGTLDGSIGLLVPFISEEDYRFLQSLQQQLRDELPHLGGLDHRAFRSYYTP 301
                          330
                   ....*....|....*.
gi 18410222   1168 ---VKDVIDGDLCEQF 1180
Cdd:pfam03178  302 prtVKGVIDGDLLERF 317
WD40 super family cl29593
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
581-651 7.54e-03

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


The actual alignment was detected with superfamily member cd00200:

Pssm-ID: 475233 [Multi-domain]  Cd Length: 289  Bit Score: 40.01  E-value: 7.54e-03
                         10        20        30        40        50        60        70
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 18410222  581 SGDVACLDIAPVpegrkrSRFLAVGSYDNTVRILSLDPDDCLQ-----ILSVQSVSSAPESLLFLevqasIGGDDG 651
Cdd:cd00200    9 TGGVTCVAFSPD------GKLLATGSGDGTIKVWDLETGELLRtlkghTGPVRDVAASADGTYLA-----SGSSDK 73
 
WD40 cd00200
WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions ...
581-651 7.54e-03

WD40 domain, found in a number of eukaryotic proteins that cover a wide variety of functions including adaptor/regulatory modules in signal transduction, pre-mRNA processing and cytoskeleton assembly; typically contains a GH dipeptide 11-24 residues from its N-terminus and the WD dipeptide at its C-terminus and is 40 residues long, hence the name WD40; between GH and WD lies a conserved core; serves as a stable propeller-like platform to which proteins can bind either stably or reversibly; forms a propeller-like structure with several blades where each blade is composed of a four-stranded anti-parallel beta-sheet; instances with few detectable copies are hypothesized to form larger structures by dimerization; each WD40 sequence repeat forms the first three strands of one blade and the last strand in the next blade; the last C-terminal WD40 repeat completes the blade structure of the first WD40 repeat to create the closed ring propeller-structure; residues on the top and bottom surface of the propeller are proposed to coordinate interactions with other proteins and/or small ligands; 7 copies of the repeat are present in this alignment.


Pssm-ID: 238121 [Multi-domain]  Cd Length: 289  Bit Score: 40.01  E-value: 7.54e-03
                         10        20        30        40        50        60        70
                 ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*.
gi 18410222  581 SGDVACLDIAPVpegrkrSRFLAVGSYDNTVRILSLDPDDCLQ-----ILSVQSVSSAPESLLFLevqasIGGDDG 651
Cdd:cd00200    9 TGGVTCVAFSPD------GKLLATGSGDGTIKVWDLETGELLRtlkghTGPVRDVAASADGTYLA-----SGSSDK 73
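The alignment blocks above can be read programmatically. The following is a minimal Python sketch, not an NCBI tool: the function names `parse_row` and `column_stats` are hypothetical, and it assumes the display convention that uppercase letters mark aligned columns while lowercase letters and dashes mark unaligned residues and gaps. It uses the WD40 (cd00200) block as input.

```python
def parse_row(line):
    """Split one alignment row into (label, start, sequence, end)."""
    parts = line.split()
    # The last three whitespace-separated fields are start, sequence, end;
    # whatever precedes them is the row label (e.g. "gi 18410222").
    start, seq, end = parts[-3:]
    return " ".join(parts[:-3]), int(start), seq, int(end)


def column_stats(query_seq, subject_seq):
    """Count total columns, aligned columns, and identical aligned columns."""
    if len(query_seq) != len(subject_seq):
        raise ValueError("rows must have the same number of columns")
    aligned = identical = 0
    for q, s in zip(query_seq, subject_seq):
        # Assumption: a column is aligned only when both rows show an
        # uppercase residue; lowercase letters and '-' are skipped.
        if q.isupper() and s.isupper():
            aligned += 1
            if q == s:
                identical += 1
    return {"columns": len(query_seq), "aligned": aligned, "identical": identical}


# Rows copied from the WD40 alignment block of this report.
query = ("gi 18410222  581 SGDVACLDIAPVpegrkrSRFLAVGSYDNTVRILSLDPDDCLQ-----"
         "ILSVQSVSSAPESLLFLevqasIGGDDG 651")
subject = ("Cdd:cd00200    9 TGGVTCVAFSPD------GKLLATGSGDGTIKVWDLETGELLRtlkgh"
           "TGPVRDVAASADGTYLA-----SGSSDK 73")

q_label, q_start, q_seq, q_end = parse_row(query)
s_label, s_start, s_seq, s_end = parse_row(subject)

# Sanity check: the number of residues must match the stated interval.
assert sum(c.isalpha() for c in q_seq) == q_end - q_start + 1
assert sum(c.isalpha() for c in s_seq) == s_end - s_start + 1

stats = column_stats(q_seq, s_seq)
print(q_label, "vs", s_label, stats)
```

The interval check is a cheap way to catch rows damaged by copy and paste, since each row states its own start and end coordinates.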
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options: Database: CDSEARCH/cdd; Low complexity filter: no; Composition Based Adjustment: yes; E-value threshold: 0.01
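As an illustration of how the E-value threshold relates to the hits listed in this report, here is a small Python sketch. The `DomainHit` class and the `parse_hit` and `overlap` helpers are hypothetical names introduced for this example, and treating the threshold as inclusive (`<=`) is an assumption, not documented CD-Search behavior. The hit names, accessions, intervals, and E-values are taken from the domain hit list.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DomainHit:
    name: str
    accession: str
    start: int
    end: int
    evalue: float

    @property
    def length(self):
        # Intervals are inclusive residue ranges on the query protein.
        return self.end - self.start + 1


def parse_hit(header, interval_line):
    """Build a hit from a 'Name Accession' row and an 'Interval E-value' row."""
    name, accession = header.rsplit(maxsplit=1)
    interval, evalue = interval_line.split()
    start, end = (int(x) for x in interval.split("-"))
    return DomainHit(name, accession, start, end, float(evalue))


def overlap(a, b):
    """Number of query residues shared by two hit intervals."""
    return max(0, min(a.end, b.end) - max(a.start, b.start) + 1)


hits = [
    parse_hit("MMS1_N pfam10433", "75-589 0e+00"),
    parse_hit("CPSF_A pfam03178", "861-1180 4.45e-103"),
    parse_hit("WD40 super family cl29593", "581-651 7.54e-03"),
]

E_VALUE_THRESHOLD = 0.01  # from the preset options; inclusive comparison assumed

reported = [h for h in hits if h.evalue <= E_VALUE_THRESHOLD]
for h in reported:
    print(f"{h.name:<20} {h.accession:<10} {h.start}-{h.end} "
          f"({h.length} residues) E={h.evalue:g}")

# The weak WD40 hit shares a few residues with the end of the MMS1_N interval.
print("MMS1_N / WD40 overlap:", overlap(hits[0], hits[2]), "residues")
```

All three hits pass the 0.01 threshold, though the WD40 hit (7.54e-03) does so only narrowly compared with the two Pfam hits.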
