Conserved domains on [gi|2796586558|ref|WP_373252187|]

Tat pathway signal sequence [Bacteroides thetaiotaomicron]

Protein Classification

PHP domain-containing protein (domain architecture ID 581140)

A PHP (Polymerase and Histidinol Phosphatase) domain-containing protein has an invariant histidine that is involved in metal ion coordination. It is similar to Streptococcus tyrosine-protein phosphatase CpsB, which dephosphorylates CpsD and is involved in the regulation of capsular polysaccharide biosynthesis.

Gene Ontology:  GO:0046872
PubMed:  9685491

List of domain hits

Name              Accession  Interval  E-value
PHP super family  cl23724    47-298    4.68e-09

Polymerase and Histidinol Phosphatase domain; The PHP (also called histidinol phosphatase-2/HIS2) domain is associated with several types of DNA polymerases, such as PolIIIA and family X DNA polymerases, stand-alone histidinol phosphate phosphatases (HisPPases), and a number of uncharacterized protein families. The PHP domain has four conserved sequence motifs and contains an invariant histidine that is involved in metal ion coordination. PHP in polymerases has trinuclear zinc/magnesium-dependent proofreading activity. The PHP domain has also been shown to function in DNA repair. The PHP structures have a distorted (beta/alpha)7 barrel fold with a trinuclear metal site on the C-terminal side of the barrel.


The actual alignment was detected with superfamily member pfam12228:

Pssm-ID: 451507  Cd Length: 591  Bit Score: 58.81  E-value: 4.68e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2796586558  47 YWGDLHNHCNIT---YGHG---DMRDAFEAAKGQ---------------LDFVSVTPHAMW---------PDIPGADDPR 96
Cdd:pfam12228   7 YFGDLHLHTTLSfdaFAFGtrlTPDDAYRFAKGEpvthptgqpvqlrrpLDFLAVTDHAEYlglmraladPNPPLLKWPT 86
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2796586558  97 LK-WVIDYH-------TGAFKRL------------REGGYE-----------KYVAMTNEYNKEGEFLTFVGYEAHSMEH 145
Cdd:pfam12228  87 GKaWHELLDagdpqepALAFRELifaaaggdvppeLLDGYEdgedayksawqRTIEAAEAYNDPGKFTTFIGYEWTSAPG 166
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2796586558 146 GDHVALN--YDLDAPL---------VECTSIED-WK-----QKAKGHKVFITPHHMGYQGG--YRGYNWkcftEGDQT-- 204
Cdd:pfam12228 167 GNNLHRNviFRDGAVParqvlpfssFESPNPEDlWDwmdayEEQTGGEVLAIPHNSNLSNGlmFPPTDW----DGEPIda 242
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2796586558 205 ----------PFVEMYSRHG-------LAESDQ-GDYPYLHDMGPRQWEGTIQ------Y-------GLEL-----GN-- 246
Cdd:pfam12228 243 dyarlrarwePLVEITQIKGdsethplLSPNDEfADFETWDFGNPCGTAEKTPsmlpgsYvrsalkrGLSLeqklgVNpy 322
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2796586558 247 KFGIMASTDQHSG--------YPGSYGD----------------------------GRIGVLAPSLTRDAIWEALRTRHV 290
Cdd:pfam12228 323 KFGFIGSTDSHTGlasaeednFFGKFGGsepssaprrtrianagylyqsgwkmgasGLAAVWAEENTREAIFDAMRRKET 402

                  ....*...
gi 2796586558 291 CAATGDKI 298
Cdd:pfam12228 403 YATSGPRI 410
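
The alignment rows above can be checked mechanically. The following is a minimal sketch, not part of the CD-Search output: it assumes each row ends with the tokens start coordinate, aligned string, end coordinate, treats `-` as a gap, and compares residues case-insensitively. The function names `parse_row` and `compare_block` are hypothetical. It is shown on the final block of the alignment (query 291-298 against pfam12228 403-410).

```python
def parse_row(line):
    """Split one alignment row into (start, aligned_string, end).

    Assumes the last three whitespace-separated tokens are
    start coordinate, aligned string, end coordinate.
    """
    tokens = line.split()
    return int(tokens[-3]), tokens[-2], int(tokens[-1])


def residue_count(aligned):
    """Number of residues in an aligned string, ignoring gap characters."""
    return sum(1 for ch in aligned if ch != "-")


def compare_block(query_line, subject_line):
    """Return (columns, ungapped_columns, identities) for one alignment block."""
    q_start, q_seq, q_end = parse_row(query_line)
    s_start, s_seq, s_end = parse_row(subject_line)
    if len(q_seq) != len(s_seq):
        raise ValueError("aligned strings differ in length")
    # Each row's coordinates should account for exactly its non-gap residues.
    if residue_count(q_seq) != q_end - q_start + 1:
        raise ValueError("query coordinates do not match residue count")
    if residue_count(s_seq) != s_end - s_start + 1:
        raise ValueError("subject coordinates do not match residue count")
    pairs = [(q, s) for q, s in zip(q_seq, s_seq) if q != "-" and s != "-"]
    identities = sum(1 for q, s in pairs if q.upper() == s.upper())
    return len(q_seq), len(pairs), identities


query_row = "gi 2796586558 291 CAATGDKI 298"
subject_row = "Cdd:pfam12228 403 YATSGPRI 410"
columns, ungapped, identities = compare_block(query_row, subject_row)
print(f"{identities} identical residues in {ungapped} ungapped columns")
```

Applying the same function to every block and summing the counts would give a whole-alignment identity figure; that total is not stated on this page, so it is left uncomputed here.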
 
Name     Accession  Interval  E-value
DUF3604  pfam12228  47-298    4.68e-09

Protein of unknown function (DUF3604); This family of proteins is found in bacteria. Proteins in this family are typically between 621 and 693 amino acids in length.


Pssm-ID: 371976  Cd Length: 591  Bit Score: 58.81  E-value: 4.68e-09
                          10        20        30        40        50        60        70        80
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2796586558  47 YWGDLHNHCNIT---YGHG---DMRDAFEAAKGQ---------------LDFVSVTPHAMW---------PDIPGADDPR 96
Cdd:pfam12228   7 YFGDLHLHTTLSfdaFAFGtrlTPDDAYRFAKGEpvthptgqpvqlrrpLDFLAVTDHAEYlglmraladPNPPLLKWPT 86
                          90       100       110       120       130       140       150       160
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2796586558  97 LK-WVIDYH-------TGAFKRL------------REGGYE-----------KYVAMTNEYNKEGEFLTFVGYEAHSMEH 145
Cdd:pfam12228  87 GKaWHELLDagdpqepALAFRELifaaaggdvppeLLDGYEdgedayksawqRTIEAAEAYNDPGKFTTFIGYEWTSAPG 166
                         170       180       190       200       210       220       230       240
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2796586558 146 GDHVALN--YDLDAPL---------VECTSIED-WK-----QKAKGHKVFITPHHMGYQGG--YRGYNWkcftEGDQT-- 204
Cdd:pfam12228 167 GNNLHRNviFRDGAVParqvlpfssFESPNPEDlWDwmdayEEQTGGEVLAIPHNSNLSNGlmFPPTDW----DGEPIda 242
                         250       260       270       280       290       300       310       320
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2796586558 205 ----------PFVEMYSRHG-------LAESDQ-GDYPYLHDMGPRQWEGTIQ------Y-------GLEL-----GN-- 246
Cdd:pfam12228 243 dyarlrarwePLVEITQIKGdsethplLSPNDEfADFETWDFGNPCGTAEKTPsmlpgsYvrsalkrGLSLeqklgVNpy 322
                         330       340       350       360       370       380       390       400
                  ....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 2796586558 247 KFGIMASTDQHSG--------YPGSYGD----------------------------GRIGVLAPSLTRDAIWEALRTRHV 290
Cdd:pfam12228 323 KFGFIGSTDSHTGlasaeednFFGKFGGsepssaprrtrianagylyqsgwkmgasGLAAVWAEENTREAIFDAMRRKET 402

                  ....*...
gi 2796586558 291 CAATGDKI 298
Cdd:pfam12228 403 YATSGPRI 410
 
Blast search parameters
Data Source: Precalculated data, version = cdd.v.3.21
Preset Options:
  Database: CDSEARCH/cdd
  Low complexity filter: no
  Composition Based Adjustment: yes
  E-value threshold: 0.01
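
To relate the reported hit to these parameters, the statistics line of a hit can be parsed and compared with the E-value threshold. This is a minimal sketch under stated assumptions: the regular expression matches the `Pssm-ID: ... E-value: ...` line format as it appears on this page, the comparison `e_value <= threshold` is an assumed reading of how the threshold is applied, and `parse_stats` and `model_coverage` are hypothetical helper names. The model interval 7-410 is taken from the `Cdd:pfam12228` coordinates of the alignment.

```python
import re

# Pattern for the per-hit statistics line as printed on this page.
STATS_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s+"
    r"Cd Length:\s*(?P<cd_length>\d+)\s+"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s+"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_stats(line):
    """Extract Pssm-ID, model length, bit score and E-value from one line."""
    match = STATS_RE.search(line)
    if match is None:
        raise ValueError(f"unrecognized statistics line: {line!r}")
    return {
        "pssm_id": int(match["pssm_id"]),
        "cd_length": int(match["cd_length"]),
        "bit_score": float(match["bit_score"]),
        "e_value": float(match["e_value"]),
    }


def model_coverage(first, last, cd_length):
    """Fraction of the domain model spanned by positions first..last (inclusive)."""
    return (last - first + 1) / cd_length


E_VALUE_THRESHOLD = 0.01  # from the preset options above

stats = parse_stats(
    "Pssm-ID: 371976  Cd Length: 591  Bit Score: 58.81  E-value: 4.68e-09"
)
passes = stats["e_value"] <= E_VALUE_THRESHOLD  # assumed comparison
coverage = model_coverage(7, 410, stats["cd_length"])
print(f"passes threshold: {passes}, model coverage: {coverage:.1%}")
```

The span 7-410 covers 404 of the 591 model positions, so the hit aligns to roughly two thirds of the DUF3604 model even though many columns inside that span are gapped in the query.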

References:

  • Wang J et al. (2023), "The conserved domain database in 2023", Nucleic Acids Res.51(D)384-8.
  • Lu S et al. (2020), "The conserved domain database in 2020", Nucleic Acids Res.48(D)265-8.
  • Marchler-Bauer A et al. (2017), "CDD/SPARCLE: functional classification of proteins via subfamily domain architectures.", Nucleic Acids Res.45(D)200-3.