| Name | Accession | Description | Interval | E-value |
| COG5263 | COG5263 | Glucan-binding domain (YG repeat) [Carbohydrate transport and metabolism]; | 1938-2052 | 1.28e-16 |
|
Glucan-binding domain (YG repeat) [Carbohydrate transport and metabolism];
Pssm-ID: 444077 [Multi-domain] Cd Length: 486 Bit Score: 85.31 E-value: 1.28e-16
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1938 ATAQGWAQNDAGQYLYYKDGKALTGTQTIDGVKYFFSTDGTLKTGWVKDGDNW-RFYSGKTMLVGFwdlgANGNNKTYYF 2016
Cdd:COG5263 343 AMATGWVTDDGKWYYLGSDGAMATGWQKIDGKWYYFDSNGAMATGWVKVDGKWyYFDSSGAMATGW----LKIDGKWYYF 418
|
90 100 110
....*....|....*....|....*....|....*..
gi 503037449 2017 TKDGLMVSGkWLEIDGKCYYFYTDGSLARS-TKIDGY 2052
Cdd:COG5263 419 DSDGAMATG-WQKIGGKWYYFDSNGAMATGwVKVDGK 454
|
|
| PspC_subgroup_1 | NF033838 | pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, ... | 1928-2058 | 9.55e-14 |
|
pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A. The other form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site.
Pssm-ID: 468201 [Multi-domain] Cd Length: 684 Bit Score: 76.98 E-value: 9.55e-14
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1928 GRYIKLTINPATAQGWAQNDAGQYLYYKDGKALTGTQTIDGVKYFFSTDGTLKTGWVKDGDNWrfysgktmlvgfwdlga 2007
Cdd:NF033838 573 GSWYYLNANGDMATGWLQYNGSWYYLNANGDMATGWLQYNGSWYYLNANGSMATGWVKDGDTW----------------- 635
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|.
gi 503037449 2008 ngnnktYYFTKDGLMVSGKWLEIDGKCYYFYTDGSLARSTKIDGYEVDEKG 2058
Cdd:NF033838 636 ------YYLEASGAMKASQWFKVSDKWYYVNGSGALAVNTTVDGYGVNANG 680
|
|
| PspC_relate_1 | NF033840 | PspC-related protein choline-binding protein 1; Members of this family share C-terminal ... | 1936-2058 | 7.46e-13 |
|
PspC-related protein choline-binding protein 1; Members of this family share C-terminal homology to the choline-binding form of the pneumococcal surface antigen PspC, but not to its allelic LPXTG-anchored forms because they lack the choline-binding repeat region. Members of this family should not be confused with PspC itself, whose identity and function reflect regions N-terminal to the choline-binding region. See Iannelli, et al. (PMID: 11891047) for information about the different allelic forms of PspC.
Pssm-ID: 411409 [Multi-domain] Cd Length: 648 Bit Score: 73.96 E-value: 7.46e-13
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1936 NPATAQGWAQNDAGQYLYYKDGKALTGTQTIDGVKYFFSTDGTLKTGWVKDGDNWRFYSGKTMLVGFWdlgANGNNKTYY 2015
Cdd:NF033840 525 DGSMATGWVQVNGSWYYLNSNGSMATGWVQVNGSWYYLNSNGSMATGWVQVDGSWYYLNDNGSMETGW---LQNNGSWYY 601
|
90 100 110 120
....*....|....*....|....*....|....*....|...
gi 503037449 2016 FTKDGLMVSGKWLEIDGKCYYFYTDGSLARSTKIDGYEVDEKG 2058
Cdd:NF033840 602 LNSNGSMKANQWFQVGSKWYYVNASGELAVNTSIDGYRVNDNG 644
|
|
| SLH | pfam00395 | S-layer homology domain; | 1816-1856 | 1.93e-10 |
|
S-layer homology domain;
Pssm-ID: 459798 [Multi-domain] Cd Length: 42 Bit Score: 57.60 E-value: 1.93e-10
10 20 30 40
....*....|....*....|....*....|....*....|.
gi 503037449 1816 FTDVKADSAYRPYIEWAYSKGIIQGIGNSQFAPDRAITREE 1856
Cdd:pfam00395 1 FKDVKSVAAWAEAVAALAELGIISGYPDGTFRPNEPITRAE 41
|
|
| PspC_subgroup_1 | NF033838 | pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, ... | 1937-2044 | 2.26e-09 |
|
pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A. The other form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site.
Pssm-ID: 468201 [Multi-domain] Cd Length: 684 Bit Score: 62.72 E-value: 2.26e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1937 PATAQ-GWAQNDAGQYLYYKDGKALTGTQTIDGVKYFFSTDGTLKTGWVKDGDNWRFYSGKTMLVGFWdLGANGNnkTYY 2015
Cdd:NF033838 481 PSTPKtGWKQENGMWYFYNTDGSMATGWLQNNGSWYYLNANGAMATGWLQNNGSWYYLNANGSMATGW-LQNNGS--WYY 557
|
90 100
....*....|....*....|....*....
gi 503037449 2016 FTKDGLMVSGkWLEIDGKCYYFYTDGSLA 2044
Cdd:NF033838 558 LNANGAMATG-WLQYNGSWYYLNANGDMA 585
|
|
| glucan_65_rpt | TIGR04035 | glucan-binding repeat; This model describes a region of about 63 amino acids that is composed ... | 1951-1997 | 6.63e-09 |
|
glucan-binding repeat; This model describes a region of about 63 amino acids that is composed of three repeats of a more broadly distributed family of shorter repeats modeled by pfam01473. While the shorter repeats are often associated with choline binding (and therefore with cell wall binding), the longer repeat described here represents a subgroup of repeat sequences associated with glucan binding, as found in a number of glycosyl hydrolases. Shah, et al. describe a repeat consensus, WYYFDANGKAVTGAQTINGQTLYFDQDGKQVKG, that corresponds to half of the repeat as modeled here and one and a half copies of the repeat as modeled by pfam01473.
Pssm-ID: 274933 [Multi-domain] Cd Length: 62 Bit Score: 53.68 E-value: 6.63e-09
10 20 30 40
....*....|....*....|....*....|....*....|....*...
gi 503037449 1951 YLYYKDGKALTGTQTIDGVKYFFSTDGT-LKTGWVKDGDNWRFYSGKT 1997
Cdd:TIGR04035 1 YYFDADGKAVTGAQTIDGVTYYFDENGKqVKGDFVTNGGGTYYYDKDS 48
|
|
| FhaB | COG3210 | Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, ... | 447-1761 | 1.48e-07 |
|
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, secretion, and vesicular transport];
Pssm-ID: 442443 [Multi-domain] Cd Length: 1698 Bit Score: 57.08 E-value: 1.48e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 447 TAVYTPSENDPGKNYLTATSAAVSKTIAKANLTDFAITEVTGKKYGDAVFKLQATAAGAPVDASFSVPADNKVLSIIGDT 526
Cdd:COG3210 47 TAGGIASNAGTTASTSGGSGTAGGVGNTSASTGGIGAAAANTAGTLETGLTSNIGGGSVNGSNSTGNGTLTTTAASATTG 126
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 527 AAIIGAGTVRVTAVKAADNNFNEATATLDIAIGKATPTVTINEVSYYIYNGKPVTNPTAEQVTVTGADYADVEFLYSTVM 606
Cdd:COG3210 127 NNTGGTTTSSTNTVTTLGGTTTGNTVLSTSGAGNNTNTNNSSSGTNIGNSIPTTGGSLNVVAANPTGVTGVGGALINATA 206
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 607 SGPYTAQAPKDAGINYYVKARILETENTKAYYSQPKTFSIKEKGISIAGGTAASKKYDGGADAIVTDLIFSGLQNGETLE 686
Cdd:COG3210 207 GVLANAGGGTAGGVASANSTLTGGVVAAGTGAGVISTGGTDISSLSVAAGAGTGGAGGTGNAGNTTIGTTVTGTNATGSN 286
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 687 LGRDYLVGSPAYDNANVGTEKTVSGTASLILNEKTKNYNLISGNYIIRNGVITRGDGPDAPTGGIVDDERNIFIFDSVVG 766
Cdd:COG3210 287 TAGASSGDTTTNGTSSVTGAGGTGVLGGGTAAGITTTNTVGGNGDGNNTTANSGAGLVSGGTGGNNGTTGTGAGSGLTGT 366
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 767 TTYEYKIGSENWTNIFATGTETVITVGNKALAIGELQVRIKETANYQAGEVLSNKTAFTASLEGSVDITGTTVYGETLTA 846
Cdd:COG3210 367 GNGGGLTTAGAGTVASTVGTATASTGNASSTTVLGSGSLATGNTGTTIAGNGGSANAGGFTTTGGVLGITGNGTVTGGTI 446
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 847 EVEGQQANAVLIYKWKANGEEIQNGSQNTLTISGSLVGKIINVEITAANYSGKLISTTTPSVRKKQVTATAGSITKVYDG 926
Cdd:COG3210 447 GGLTGSGTTNGAGLSGNTDVSGTGTVTNSAGNTTSATTLAGGGIGTVTTNATISNNAGGDANGIATGLTGITAGGGGGGN 526
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 927 NKWV--NIPLSITKVNASDDITVVAEGIYNSADAGTGKTVSISGATVTGDAKDWYEIALPKGLTGTITPKSMPSATVTVN 1004
Cdd:COG3210 527 ATSGgtGGDGTTLSGSGLTTTVSGGASGTTAASGSNTANTLGVLAATGGTSNATTAGNSTSATGGTGTNSGGTVLSIGTG 606
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1005 GSYIYTGSAIVPTDENVVVKDGTTTLTKGTDYSFTASNNINVGTASVEVTLKGNYSGNASGSFTIAPKAVTPTIEAIGNR 1084
Cdd:COG3210 607 SAGATGTITLGAGTSGAGANATGGGAGLTGSAVGAALSGTGSGTTGTASANGSNTTGVNTAGGTGGGTTGTVTSGATGGT 686
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1085 TFTGTQLTPNVTVENDGTTL-IKDTDYTVSYGANTNVGTGSVTVSLKGNYSGTatanfAITKAASPTVSAPDNIAMVKGQ 1163
Cdd:COG3210 687 TGTTLNAATGGTLNNAGNTLtISTGSITVTGQIGALANANGDTVTFGNLGTGA-----TLTLNAGVTITSGNAGTLSIGL 761
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1164 ERAYTFDLSTLTMPQNAGVKTYVVASPSNADLFAVNPTVDGSMLKFTSKSVSTAGGTATVAVTIKSDNYRDVAVTLTFEI 1243
Cdd:COG3210 762 TANTTASGTTLTLANANGNTSAGATLDNAGAEISIDITADGTITAAGTTAINVTGSGGTITINTATTGLTGTGDTTSGAG 841
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1244 VDKVPVSISGIAAASKTYNGTPAVYTGTPIAKDGEH--KVTVSGYNYTWSKADGTPLLEAPKSVGNYKLSVSVKADDPNY 1321
Cdd:COG3210 842 GSNTTDTTTGTTSDGASGGGTAGANSGSLAATAASItvGSGGVATSTGTANAGTLTNLGTTTNAASGNGAVLATVTATGT 921
|
890 900 910 920 930 940 950 960
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1322 IGSINVPFEIVKANLIIKADDKHILIGGAKPDYTATVTGLVNGETISGISFTDNAPNTNTKGSFTITPSNGTITGGGNGN 1401
Cdd:COG3210 922 GGGGLTGGNAAAGGTGAGNGTTALSGTQGNAGLSAASASDGAGDTGASSAAGSSAVGTSANSAGSTGGVIAATGILVAGN 1001
|
970 980 990 1000 1010 1020 1030 1040
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1402 YNITYETGTLTIGIDVSVIDSAIAAANTAKSGVSVDDRAANQVSSGTRFVTTAEMNALTAAIQTATEAKVMVNTAGEAQA 1481
Cdd:COG3210 1002 SGTTASTTGGSGAIVAGGNGVTGTTGTASATGTGTAATAGGQNGVGVNASGISGGNAAALTASGTAGTTGGTAASNGGGG 1081
|
1050 1060 1070 1080 1090 1100 1110 1120
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1482 AAKTLEDAVVSFKAAIKTGSYTAPSSGGGSSGGSTSSGGGTTTTPAPVATPEKMPNQPVTATAPVTATAGTNGAASASIP 1561
Cdd:COG3210 1082 TAQASGAGTTHTLGGITNGGATGTSGGTTTSTGGVTASKVGGTTTVGATGTSTASTEAAGAGTLTGLVAVSAVAGGASSA 1161
|
1130 1140 1150 1160 1170 1180 1190 1200
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1562 DKAVTDAISKAQADATSQGKTANGISVELNVTMPKGTASLTATLTRSSLDSLVSAGVSSLEIGCSLVQVSFDKKALAEIQ 1641
Cdd:COG3210 1162 SAGDTTAVAAATTTTTGSAINGGADSAATEGTAGTDLKGGDSTGGSTTTIGTTNVTTTTTLTASDTGNTTATGGSSAGQT 1241
|
1210 1220 1230 1240 1250 1260 1270 1280
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1642 KQSSGNISIAIAPKTNLSDAAKKIIGTRPVYDITVGYGSGKTVSSFGGGIATVSIPYTLGKNEAVGGLYAVYVDAKGNAN 1721
Cdd:COG3210 1242 GSFVAAGSASGTGDATTGATAGAVSNGATSTVAGNAGATATGSTVDIGSTSATSAGGSLDTTGNTAGANGATVGTGIGGT 1321
|
1290 1300 1310 1320
....*....|....*....|....*....|....*....|
gi 503037449 1722 RIAGSAYDANSGCVIFTTTHFSQYGIGYTAPTAKFTDTST 1761
Cdd:COG3210 1322 TATGTAVAAVNSGGVNAGGGTINTTAANTGLNGGNGATDS 1361
|
|
| MBG_2 | pfam18676 | MBG domain (YGX type); This domain is found in a variety of bacterial extracellular proteins. ... | 1336-1412 | 3.77e-07 |
|
MBG domain (YGX type); This domain is found in a variety of bacterial extracellular proteins. It is related to the MBG domain (pfam17883), but it replaces the characteristic YDG motif close to the N-terminus with a YGX motif.
Pssm-ID: 436658 [Multi-domain] Cd Length: 73 Bit Score: 49.15 E-value: 3.77e-07
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 503037449 1336 LIIKADDKHILIGGAKPDYTATVTGLVNGETISGIS-FTDNAPNTNTKGSFTITPSngtitGGGNGNYNITYETGTLT 1412
Cdd:pfam18676 1 LTVTADDKSKVYGDADPALTYTYSGLVNGDTLTVLSgGSLSATAGSNVGTYAITAS-----GLSASNYTITYVPGTLT 73
|
|
| Big_3_5 | pfam16640 | Bacterial Ig-like domain (group 3); This family consists of bacterial domains with an Ig-like ... | 383-477 | 1.58e-06 |
|
Bacterial Ig-like domain (group 3); This family consists of bacterial domains with an Ig-like fold.
Pssm-ID: 406933 [Multi-domain] Cd Length: 90 Bit Score: 48.01 E-value: 1.58e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 383 SEPFEYGKEVFLEATVTGVNGEMPDGTVTFKKGSDTLETANYDFGTEKYTLSSgKIFDAGNHSFTAVYTPSEndpgkNYL 462
Cdd:pfam16640 1 PTSVTYGQSVTLTATVTPASGGTPTGTVTFTDGGTVLGTAVLVSGNGVATLTT-TALAAGTHTITATYSGDA-----NYA 74
|
90
....*....|....*
gi 503037449 463 TATSAAVSKTIAKAN 477
Cdd:pfam16640 75 ASTSSAVTVTVTKAA 89
|
|
| SLH | pfam00395 | S-layer homology domain; | 1881-1923 | 2.81e-06 |
|
S-layer homology domain;
Pssm-ID: 459798 [Multi-domain] Cd Length: 42 Bit Score: 45.66 E-value: 2.81e-06
10 20 30 40
....*....|....*....|....*....|....*....|...
gi 503037449 1881 YADASGIgSAYKDAVKAMQQAGIMMGGSDNKFNPKGNATRAEV 1923
Cdd:pfam00395 1 FKDVKSV-AAWAEAVAALAELGIISGYPDGTFRPNEPITRAEA 42
|
|
| Choline_bind_3 | pfam19127 | Choline-binding repeat; Pair of presumed choline-binding repeats often found adjacent to ... | 2010-2047 | 8.07e-06 |
|
Choline-binding repeat; Pair of presumed choline-binding repeats often found adjacent to pfam01473.
Pssm-ID: 465978 [Multi-domain] Cd Length: 47 Bit Score: 44.45 E-value: 8.07e-06
10 20 30
....*....|....*....|....*....|....*....
gi 503037449 2010 NNKTYYFTKDGLMVSGKWLEIDGKCYYFYTD-GSLARST 2047
Cdd:pfam19127 8 NGQTLYFDSDGKQVKGWVVTIDGKWYYFDADsGEMVTNR 46
|
|
| SLH | pfam00395 | S-layer homology domain; | 1756-1794 | 8.88e-04 |
|
S-layer homology domain;
Pssm-ID: 459798 [Multi-domain] Cd Length: 42 Bit Score: 38.73 E-value: 8.88e-04
10 20 30 40
....*....|....*....|....*....|....*....|
gi 503037449 1756 FTDTSTHWA-KESIDYVVGRGLMSGSSETAFTPNSAMTRG 1794
Cdd:pfam00395 1 FKDVKSVAAwAEAVAALAELGIISGYPDGTFRPNEPITRA 40
|
|
| PHA03255 | PHA03255 | BDLF3; Provisional | 1028-1155 | 2.93e-03 |
|
BDLF3; Provisional
Pssm-ID: 165513 [Multi-domain] Cd Length: 234 Bit Score: 41.43 E-value: 2.93e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1028 TTLTKGTDYSFTASNNINVGTASVEVTLKGNYSGNASGSFTIAPKAVTPTIEAIGNR-----TFTGTQLTPNVTVENdGT 1102
Cdd:PHA03255 20 TSLIWTSSGSSTASAGNVTGTTAVTTPSPSASGPSTNQSTTLTTTSAPITTTAILSTntttvTSTGTTVTPVPTTSN-AS 98
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|...
gi 503037449 1103 TLIKDTDYTVSYGANTNVGTGSVTVSLKGNYSGTATANFAITKAASPTVSAPD 1155
Cdd:PHA03255 99 TINVTTKVTAQNITATEAGTGTSTGVTSNVTTRSSSTTSATTRITNATTLAPT 151
|
|
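Each hit in this listing carries a summary line of the form "Pssm-ID: ... [Multi-domain] Cd Length: ... Bit Score: ... E-value: ...", plus an interval on the query such as 1938-2052. The sketch below shows one way those fields could be pulled into a record for sorting or filtering. It is not part of the CDD output or of any NCBI tool: the names `SUMMARY_RE`, `parse_summary`, and `interval_length` are hypothetical, and the pattern assumes the summary line looks exactly like the ones shown in this listing.

```python
import re

# Hypothetical pattern: field labels are copied from the summary lines in this
# listing; the group names are illustrative, not an official schema.
SUMMARY_RE = re.compile(
    r"Pssm-ID:\s*(?P<pssm_id>\d+)\s*"
    r"\[(?P<hit_type>[^\]]+)\]\s*"
    r"Cd Length:\s*(?P<cd_length>\d+)\s*"
    r"Bit Score:\s*(?P<bit_score>[\d.]+)\s*"
    r"E-value:\s*(?P<e_value>[\d.eE+-]+)"
)


def parse_summary(line):
    """Return the summary fields as a dict, or None if the line does not match."""
    match = SUMMARY_RE.search(line)
    if match is None:
        return None
    return {
        "pssm_id": int(match.group("pssm_id")),
        "hit_type": match.group("hit_type"),
        "cd_length": int(match.group("cd_length")),
        "bit_score": float(match.group("bit_score")),
        "e_value": float(match.group("e_value")),
    }


def interval_length(interval):
    """Number of query residues covered by an inclusive 'start-end' interval."""
    start, end = interval.split("-")
    return int(end) - int(start) + 1


example = ("Pssm-ID: 444077 [Multi-domain] Cd Length: 486 "
           "Bit Score: 85.31 E-value: 1.28e-16")
hit = parse_summary(example)
print(hit)
print(interval_length("1938-2052"))
```

Parsing the E-value into a float makes it possible to rank hits numerically rather than by their text form; lines that are not summary lines simply return None.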
|
| Name | Accession | Description | Interval | E-value |
| COG5263 | COG5263 | Glucan-binding domain (YG repeat) [Carbohydrate transport and metabolism]; | 1938-2052 | 1.28e-16 |
|
Glucan-binding domain (YG repeat) [Carbohydrate transport and metabolism];
Pssm-ID: 444077 [Multi-domain] Cd Length: 486 Bit Score: 85.31 E-value: 1.28e-16
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1938 ATAQGWAQNDAGQYLYYKDGKALTGTQTIDGVKYFFSTDGTLKTGWVKDGDNW-RFYSGKTMLVGFwdlgANGNNKTYYF 2016
Cdd:COG5263 343 AMATGWVTDDGKWYYLGSDGAMATGWQKIDGKWYYFDSNGAMATGWVKVDGKWyYFDSSGAMATGW----LKIDGKWYYF 418
|
90 100 110
....*....|....*....|....*....|....*..
gi 503037449 2017 TKDGLMVSGkWLEIDGKCYYFYTDGSLARS-TKIDGY 2052
Cdd:COG5263 419 DSDGAMATG-WQKIGGKWYYFDSNGAMATGwVKVDGK 454
|
|
| PspC_subgroup_1 | NF033838 | pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, ... | 1928-2058 | 9.55e-14 |
|
pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A. The other form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site.
Pssm-ID: 468201 [Multi-domain] Cd Length: 684 Bit Score: 76.98 E-value: 9.55e-14
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1928 GRYIKLTINPATAQGWAQNDAGQYLYYKDGKALTGTQTIDGVKYFFSTDGTLKTGWVKDGDNWrfysgktmlvgfwdlga 2007
Cdd:NF033838 573 GSWYYLNANGDMATGWLQYNGSWYYLNANGDMATGWLQYNGSWYYLNANGSMATGWVKDGDTW----------------- 635
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|.
gi 503037449 2008 ngnnktYYFTKDGLMVSGKWLEIDGKCYYFYTDGSLARSTKIDGYEVDEKG 2058
Cdd:NF033838 636 ------YYLEASGAMKASQWFKVSDKWYYVNGSGALAVNTTVDGYGVNANG 680
|
|
| PspC_relate_1 | NF033840 | PspC-related protein choline-binding protein 1; Members of this family share C-terminal ... | 1936-2058 | 7.46e-13 |
|
PspC-related protein choline-binding protein 1; Members of this family share C-terminal homology to the choline-binding form of the pneumococcal surface antigen PspC, but not to its allelic LPXTG-anchored forms because they lack the choline-binding repeat region. Members of this family should not be confused with PspC itself, whose identity and function reflect regions N-terminal to the choline-binding region. See Iannelli, et al. (PMID: 11891047) for information about the different allelic forms of PspC.
Pssm-ID: 411409 [Multi-domain] Cd Length: 648 Bit Score: 73.96 E-value: 7.46e-13
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1936 NPATAQGWAQNDAGQYLYYKDGKALTGTQTIDGVKYFFSTDGTLKTGWVKDGDNWRFYSGKTMLVGFWdlgANGNNKTYY 2015
Cdd:NF033840 525 DGSMATGWVQVNGSWYYLNSNGSMATGWVQVNGSWYYLNSNGSMATGWVQVDGSWYYLNDNGSMETGW---LQNNGSWYY 601
|
90 100 110 120
....*....|....*....|....*....|....*....|...
gi 503037449 2016 FTKDGLMVSGKWLEIDGKCYYFYTDGSLARSTKIDGYEVDEKG 2058
Cdd:NF033840 602 LNSNGSMKANQWFQVGSKWYYVNASGELAVNTSIDGYRVNDNG 644
|
|
| COG5263 | COG5263 | Glucan-binding domain (YG repeat) [Carbohydrate transport and metabolism]; | 1942-2043 | 2.73e-11 |
|
Glucan-binding domain (YG repeat) [Carbohydrate transport and metabolism];
Pssm-ID: 444077 [Multi-domain] Cd Length: 486 Bit Score: 68.36 E-value: 2.73e-11
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1942 GWAQNDAGQYLYYKDGKALTGTQTIDGVKYFFSTDGTLKTGWVKDGDNWrfysgktmlvgfwdlgangnnktYYFTKDGL 2021
Cdd:COG5263 407 GWLKIDGKWYYFDSDGAMATGWQKIGGKWYYFDSNGAMATGWVKVDGKW-----------------------YYFDSDGA 463
|
90 100
....*....|....*....|..
gi 503037449 2022 MVSGkWLEIDGKCYYFYTDGSL 2043
Cdd:COG5263 464 MATG-WQTIDGKTYYFDSNGAW 484
|
|
| SLH | pfam00395 | S-layer homology domain; | 1816-1856 | 1.93e-10 |
|
S-layer homology domain;
Pssm-ID: 459798 [Multi-domain] Cd Length: 42 Bit Score: 57.60 E-value: 1.93e-10
10 20 30 40
....*....|....*....|....*....|....*....|.
gi 503037449 1816 FTDVKADSAYRPYIEWAYSKGIIQGIGNSQFAPDRAITREE 1856
Cdd:pfam00395 1 FKDVKSVAAWAEAVAALAELGIISGYPDGTFRPNEPITRAE 41
|
|
| PspC_subgroup_1 | NF033838 | pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, ... | 1937-2044 | 2.26e-09 |
|
pneumococcal surface protein PspC, choline-binding form; The pneumococcal surface protein PspC, as described in Streptococcus pneumoniae, is a repetitive and highly variable protein, recognized by a conserved N-terminal domain and also by genomic location. This form, subgroup 1, has variable numbers of a choline-binding repeat in the C-terminal region, and is also known as choline-binding protein A. The other form, subgroup 2, is anchored covalently after cleavage by sortase at a C-terminal LPXTG site.
Pssm-ID: 468201 [Multi-domain] Cd Length: 684 Bit Score: 62.72 E-value: 2.26e-09
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1937 PATAQ-GWAQNDAGQYLYYKDGKALTGTQTIDGVKYFFSTDGTLKTGWVKDGDNWRFYSGKTMLVGFWdLGANGNnkTYY 2015
Cdd:NF033838 481 PSTPKtGWKQENGMWYFYNTDGSMATGWLQNNGSWYYLNANGAMATGWLQNNGSWYYLNANGSMATGW-LQNNGS--WYY 557
|
90 100
....*....|....*....|....*....
gi 503037449 2016 FTKDGLMVSGkWLEIDGKCYYFYTDGSLA 2044
Cdd:NF033838 558 LNANGAMATG-WLQYNGSWYYLNANGDMA 585
|
|
| glucan_65_rpt | TIGR04035 | glucan-binding repeat; This model describes a region of about 63 amino acids that is composed ... | 1951-1997 | 6.63e-09 |
|
glucan-binding repeat; This model describes a region of about 63 amino acids that is composed of three repeats of a more broadly distributed family of shorter repeats modeled by pfam01473. While the shorter repeats are often associated with choline binding (and therefore with cell wall binding), the longer repeat described here represents a subgroup of repeat sequences associated with glucan binding, as found in a number of glycosyl hydrolases. Shah, et al. describe a repeat consensus, WYYFDANGKAVTGAQTINGQTLYFDQDGKQVKG, that corresponds to half of the repeat as modeled here and one and a half copies of the repeat as modeled by pfam01473.
Pssm-ID: 274933 [Multi-domain] Cd Length: 62 Bit Score: 53.68 E-value: 6.63e-09
10 20 30 40
....*....|....*....|....*....|....*....|....*...
gi 503037449 1951 YLYYKDGKALTGTQTIDGVKYFFSTDGT-LKTGWVKDGDNWRFYSGKT 1997
Cdd:TIGR04035 1 YYFDADGKAVTGAQTIDGVTYYFDENGKqVKGDFVTNGGGTYYYDKDS 48
|
|
| FhaB | COG3210 | Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, ... | 447-1761 | 1.48e-07 |
|
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, secretion, and vesicular transport];
Pssm-ID: 442443 [Multi-domain] Cd Length: 1698 Bit Score: 57.08 E-value: 1.48e-07
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 447 TAVYTPSENDPGKNYLTATSAAVSKTIAKANLTDFAITEVTGKKYGDAVFKLQATAAGAPVDASFSVPADNKVLSIIGDT 526
Cdd:COG3210 47 TAGGIASNAGTTASTSGGSGTAGGVGNTSASTGGIGAAAANTAGTLETGLTSNIGGGSVNGSNSTGNGTLTTTAASATTG 126
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 527 AAIIGAGTVRVTAVKAADNNFNEATATLDIAIGKATPTVTINEVSYYIYNGKPVTNPTAEQVTVTGADYADVEFLYSTVM 606
Cdd:COG3210 127 NNTGGTTTSSTNTVTTLGGTTTGNTVLSTSGAGNNTNTNNSSSGTNIGNSIPTTGGSLNVVAANPTGVTGVGGALINATA 206
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 607 SGPYTAQAPKDAGINYYVKARILETENTKAYYSQPKTFSIKEKGISIAGGTAASKKYDGGADAIVTDLIFSGLQNGETLE 686
Cdd:COG3210 207 GVLANAGGGTAGGVASANSTLTGGVVAAGTGAGVISTGGTDISSLSVAAGAGTGGAGGTGNAGNTTIGTTVTGTNATGSN 286
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 687 LGRDYLVGSPAYDNANVGTEKTVSGTASLILNEKTKNYNLISGNYIIRNGVITRGDGPDAPTGGIVDDERNIFIFDSVVG 766
Cdd:COG3210 287 TAGASSGDTTTNGTSSVTGAGGTGVLGGGTAAGITTTNTVGGNGDGNNTTANSGAGLVSGGTGGNNGTTGTGAGSGLTGT 366
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 767 TTYEYKIGSENWTNIFATGTETVITVGNKALAIGELQVRIKETANYQAGEVLSNKTAFTASLEGSVDITGTTVYGETLTA 846
Cdd:COG3210 367 GNGGGLTTAGAGTVASTVGTATASTGNASSTTVLGSGSLATGNTGTTIAGNGGSANAGGFTTTGGVLGITGNGTVTGGTI 446
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 847 EVEGQQANAVLIYKWKANGEEIQNGSQNTLTISGSLVGKIINVEITAANYSGKLISTTTPSVRKKQVTATAGSITKVYDG 926
Cdd:COG3210 447 GGLTGSGTTNGAGLSGNTDVSGTGTVTNSAGNTTSATTLAGGGIGTVTTNATISNNAGGDANGIATGLTGITAGGGGGGN 526
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 927 NKWV--NIPLSITKVNASDDITVVAEGIYNSADAGTGKTVSISGATVTGDAKDWYEIALPKGLTGTITPKSMPSATVTVN 1004
Cdd:COG3210 527 ATSGgtGGDGTTLSGSGLTTTVSGGASGTTAASGSNTANTLGVLAATGGTSNATTAGNSTSATGGTGTNSGGTVLSIGTG 606
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1005 GSYIYTGSAIVPTDENVVVKDGTTTLTKGTDYSFTASNNINVGTASVEVTLKGNYSGNASGSFTIAPKAVTPTIEAIGNR 1084
Cdd:COG3210 607 SAGATGTITLGAGTSGAGANATGGGAGLTGSAVGAALSGTGSGTTGTASANGSNTTGVNTAGGTGGGTTGTVTSGATGGT 686
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1085 TFTGTQLTPNVTVENDGTTL-IKDTDYTVSYGANTNVGTGSVTVSLKGNYSGTatanfAITKAASPTVSAPDNIAMVKGQ 1163
Cdd:COG3210 687 TGTTLNAATGGTLNNAGNTLtISTGSITVTGQIGALANANGDTVTFGNLGTGA-----TLTLNAGVTITSGNAGTLSIGL 761
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1164 ERAYTFDLSTLTMPQNAGVKTYVVASPSNADLFAVNPTVDGSMLKFTSKSVSTAGGTATVAVTIKSDNYRDVAVTLTFEI 1243
Cdd:COG3210 762 TANTTASGTTLTLANANGNTSAGATLDNAGAEISIDITADGTITAAGTTAINVTGSGGTITINTATTGLTGTGDTTSGAG 841
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1244 VDKVPVSISGIAAASKTYNGTPAVYTGTPIAKDGEH--KVTVSGYNYTWSKADGTPLLEAPKSVGNYKLSVSVKADDPNY 1321
Cdd:COG3210 842 GSNTTDTTTGTTSDGASGGGTAGANSGSLAATAASItvGSGGVATSTGTANAGTLTNLGTTTNAASGNGAVLATVTATGT 921
|
890 900 910 920 930 940 950 960
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1322 IGSINVPFEIVKANLIIKADDKHILIGGAKPDYTATVTGLVNGETISGISFTDNAPNTNTKGSFTITPSNGTITGGGNGN 1401
Cdd:COG3210 922 GGGGLTGGNAAAGGTGAGNGTTALSGTQGNAGLSAASASDGAGDTGASSAAGSSAVGTSANSAGSTGGVIAATGILVAGN 1001
|
970 980 990 1000 1010 1020 1030 1040
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1402 YNITYETGTLTIGIDVSVIDSAIAAANTAKSGVSVDDRAANQVSSGTRFVTTAEMNALTAAIQTATEAKVMVNTAGEAQA 1481
Cdd:COG3210 1002 SGTTASTTGGSGAIVAGGNGVTGTTGTASATGTGTAATAGGQNGVGVNASGISGGNAAALTASGTAGTTGGTAASNGGGG 1081
|
1050 1060 1070 1080 1090 1100 1110 1120
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1482 AAKTLEDAVVSFKAAIKTGSYTAPSSGGGSSGGSTSSGGGTTTTPAPVATPEKMPNQPVTATAPVTATAGTNGAASASIP 1561
Cdd:COG3210 1082 TAQASGAGTTHTLGGITNGGATGTSGGTTTSTGGVTASKVGGTTTVGATGTSTASTEAAGAGTLTGLVAVSAVAGGASSA 1161
|
1130 1140 1150 1160 1170 1180 1190 1200
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1562 DKAVTDAISKAQADATSQGKTANGISVELNVTMPKGTASLTATLTRSSLDSLVSAGVSSLEIGCSLVQVSFDKKALAEIQ 1641
Cdd:COG3210 1162 SAGDTTAVAAATTTTTGSAINGGADSAATEGTAGTDLKGGDSTGGSTTTIGTTNVTTTTTLTASDTGNTTATGGSSAGQT 1241
|
1210 1220 1230 1240 1250 1260 1270 1280
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1642 KQSSGNISIAIAPKTNLSDAAKKIIGTRPVYDITVGYGSGKTVSSFGGGIATVSIPYTLGKNEAVGGLYAVYVDAKGNAN 1721
Cdd:COG3210 1242 GSFVAAGSASGTGDATTGATAGAVSNGATSTVAGNAGATATGSTVDIGSTSATSAGGSLDTTGNTAGANGATVGTGIGGT 1321
|
1290 1300 1310 1320
....*....|....*....|....*....|....*....|
gi 503037449 1722 RIAGSAYDANSGCVIFTTTHFSQYGIGYTAPTAKFTDTST 1761
Cdd:COG3210 1322 TATGTAVAAVNSGGVNAGGGTINTTAANTGLNGGNGATDS 1361
|
|
| MBG_2 | pfam18676 | MBG domain (YGX type); This domain is found in a variety of bacterial extracellular proteins. ... | 1336-1412 | 3.77e-07 |
|
MBG domain (YGX type); This domain is found in a variety of bacterial extracellular proteins. It is related to the MBG domain (pfam17883), but it replaces the characteristic YDG motif close to the N-terminus with a YGX motif.
Pssm-ID: 436658 [Multi-domain] Cd Length: 73 Bit Score: 49.15 E-value: 3.77e-07
10 20 30 40 50 60 70
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 503037449 1336 LIIKADDKHILIGGAKPDYTATVTGLVNGETISGIS-FTDNAPNTNTKGSFTITPSngtitGGGNGNYNITYETGTLT 1412
Cdd:pfam18676 1 LTVTADDKSKVYGDADPALTYTYSGLVNGDTLTVLSgGSLSATAGSNVGTYAITAS-----GLSASNYTITYVPGTLT 73
|
|
| Big_3_5 | pfam16640 | Bacterial Ig-like domain (group 3); This family consists of bacterial domains with an Ig-like ... | 383-477 | 1.58e-06 |
|
Bacterial Ig-like domain (group 3); This family consists of bacterial domains with an Ig-like fold.
Pssm-ID: 406933 [Multi-domain] Cd Length: 90 Bit Score: 48.01 E-value: 1.58e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 383 SEPFEYGKEVFLEATVTGVNGEMPDGTVTFKKGSDTLETANYDFGTEKYTLSSgKIFDAGNHSFTAVYTPSEndpgkNYL 462
Cdd:pfam16640 1 PTSVTYGQSVTLTATVTPASGGTPTGTVTFTDGGTVLGTAVLVSGNGVATLTT-TALAAGTHTITATYSGDA-----NYA 74
|
90
....*....|....*
gi 503037449 463 TATSAAVSKTIAKAN 477
Cdd:pfam16640 75 ASTSSAVTVTVTKAA 89
|
|
| SLH | pfam00395 | S-layer homology domain; | 1881-1923 | 2.81e-06 |
|
S-layer homology domain;
Pssm-ID: 459798 [Multi-domain] Cd Length: 42 Bit Score: 45.66 E-value: 2.81e-06
10 20 30 40
....*....|....*....|....*....|....*....|...
gi 503037449 1881 YADASGIgSAYKDAVKAMQQAGIMMGGSDNKFNPKGNATRAEV 1923
Cdd:pfam00395 1 FKDVKSV-AAWAEAVAALAELGIISGYPDGTFRPNEPITRAEA 42
|
|
| YDG | pfam18657 | YDG domain; This presumed domain is found in a wide variety of bacterial cell surface proteins. ... | 649-726 | 4.56e-06 |
|
YDG domain; This presumed domain is found in a wide variety of bacterial cell surface proteins. This domain has a highly conserved YDG motif near its N-terminus. This domain is likely related to the pfam17883 domain.
Pssm-ID: 436651 [Multi-domain] Cd Length: 85 Bit Score: 46.64 E-value: 4.56e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 649 KGISIAGGTAASKKYDGGADAIVTDLIFSGLQNG----ETLELGRdylvGSPAYDNANVGTEKTVSGTASLILNEKTKNY 724
Cdd:pfam18657 3 KPLTVTGVTAVTKVYDGTTTAAVAGVSVTGVLSGvvagDDVTLTT----GTATFDDKNVGTGKTVTVSGLTLTGADAGNY 78
|
..
gi 503037449 725 NL 726
Cdd:pfam18657 79 TL 80
|
|
| YDG | pfam18657 | YDG domain; This presumed domain is found in a wide variety of bacterial cell surface proteins. ... | 910-982 | 7.81e-06 |
|
YDG domain; This presumed domain is found in a wide variety of bacterial cell surface proteins. This domain has a highly conserved YDG motif near its N-terminus. This domain is likely related to the pfam17883 domain.
Pssm-ID: 436651 [Multi-domain] Cd Length: 85 Bit Score: 45.87 E-value: 7.81e-06
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 910 KKQVT-ATAGSITKVYDGNKWVNIPLS-----ITKVNASDDITVV-AEGIYNSADAGTGKTVSISGATVTGDAKDWYEIA 982
Cdd:pfam18657 2 PKPLTvTGVTAVTKVYDGTTTAAVAGVsvtgvLSGVVAGDDVTLTtGTATFDDKNVGTGKTVTVSGLTLTGADAGNYTLA 81
|
|
| Choline_bind_3 | pfam19127 | Choline-binding repeat; Pair of presumed choline-binding repeats often found adjacent to ... | 2010-2047 | 8.07e-06 |
|
Choline-binding repeat; Pair of presumed choline-binding repeats often found adjacent to pfam01473.
Pssm-ID: 465978 [Multi-domain] Cd Length: 47 Bit Score: 44.45 E-value: 8.07e-06
10 20 30
....*....|....*....|....*....|....*....
gi 503037449 2010 NNKTYYFTKDGLMVSGKWLEIDGKCYYFYTD-GSLARST 2047
Cdd:pfam19127 8 NGQTLYFDSDGKQVKGWVVTIDGKWYYFDADsGEMVTNR 46
|
|
| FhaB | COG3210 | Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, ... | 826-1721 | 4.32e-05 |
|
Large exoprotein involved in heme utilization or adhesion [Intracellular trafficking, secretion, and vesicular transport];
Pssm-ID: 442443 [Multi-domain] Cd Length: 1698 Bit Score: 48.99 E-value: 4.32e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 826 ASLEGSVDITGTTVYGETLTAEVEGQQANAVLIYKWKANGEEIQNGSQNTLTISGSLVGKIINVEITAANYSGKLISTTT 905
Cdd:COG3210 792 AEISIDITADGTITAAGTTAINVTGSGGTITINTATTGLTGTGDTTSGAGGSNTTDTTTGTTSDGASGGGTAGANSGSLA 871
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 906 PSVRKKQVTATAGSITKVYDGNKWVNIPLSITKVNASDDITVVAEGIYNSADAGTGKTVSISGATVTGDAKDWYEIALPK 985
Cdd:COG3210 872 ATAASITVGSGGVATSTGTANAGTLTNLGTTTNAASGNGAVLATVTATGTGGGGLTGGNAAAGGTGAGNGTTALSGTQGN 951
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 986 GLTGTITPKSMPSATVTVNGSYIYTGSAIVPTDENVVVKDGTTTLTKGTDYSFTASNNINVGTASV--------EVTLKG 1057
Cdd:COG3210 952 AGLSAASASDGAGDTGASSAAGSSAVGTSANSAGSTGGVIAATGILVAGNSGTTASTTGGSGAIVAggngvtgtTGTASA 1031
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1058 NYSGNASGSFTIAPKAVTPTIEAIGNRTFTGTQLTPNVTVENDGTTLIKDTDYTVSYGANTNVGTGSVTVSLKGNYSGTA 1137
Cdd:COG3210 1032 TGTGTAATAGGQNGVGVNASGISGGNAAALTASGTAGTTGGTAASNGGGGTAQASGAGTTHTLGGITNGGATGTSGGTTT 1111
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1138 TANFAITKAASPTVSAPDNIAMVKGQERAYTFDLSTLTMPQNAGVKTYVVASPSNADLFAVNPTVDGSMLKFTSKSVSTA 1217
Cdd:COG3210 1112 STGGVTASKVGGTTTVGATGTSTASTEAAGAGTLTGLVAVSAVAGGASSASAGDTTAVAAATTTTTGSAINGGADSAATE 1191
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1218 GGTATVAVTIKSDNYRDVAVTLTFEIVDKVPVSISGIAAASKTYNGTPAVYTGTPIAKDGEHKVTVSGYNYTWSKADGTP 1297
Cdd:COG3210 1192 GTAGTDLKGGDSTGGSTTTIGTTNVTTTTTLTASDTGNTTATGGSSAGQTGSFVAAGSASGTGDATTGATAGAVSNGATS 1271
|
490 500 510 520 530 540 550 560
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1298 LLEAPKSVGNYKLSVSVKADDPNYIGSINVPFEIVKANLIIKADDKHILIGGAKPDYTATVTGLVNGETISGISFTDNAP 1377
Cdd:COG3210 1272 TVAGNAGATATGSTVDIGSTSATSAGGSLDTTGNTAGANGATVGTGIGGTTATGTAVAAVNSGGVNAGGGTINTTAANTG 1351
|
570 580 590 600 610 620 630 640
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1378 NTNTKGSFTITPSNGTITGGGNGNYNITYETGTLTIGIDVSVIDSAIAAANTAKSGVSVDDRAANQVSSGTRFVTTAEMN 1457
Cdd:COG3210 1352 LNGGNGATDSAAGAGSGGAAGSLAATAGAGTVLTGAGNNTGAEGTNAGRDGGVTTSGTGVGNNGGVSGTTVAGTTGSSAT 1431
|
650 660 670 680 690 700 710 720
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1458 ALTAAIQTATEAKVMVNTAGEAQAAAKTLEDAVVSFKAAIKTGSYTAPSSGGGSSGGSTSSGGGTTTTPAPVATPEKMPN 1537
Cdd:COG3210 1432 TGTGGTGNTTGTSVAGAGGGNADASAINTGNASSLGAGGSTAGNAVGGAVIGGTTTGGNGAGVAGATASNGGTSTGAGGT 1511
|
730 740 750 760 770 780 790 800
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1538 QPVTATAPVTATAGTNGAASASIPDKAVTDAISKAQADATSQGKTANGISVELNVTMPKGTASLTATLTRSSLDSLVSAG 1617
Cdd:COG3210 1512 AGGTTAEVAKASLEGGEGTYGGSSVAEAGTGGGILGAVSGAGSEGGAAGGVTGSVGVGGTDGAGGDTGGADDTGAQAPTA 1591
|
810 820 830 840 850 860 870 880
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1618 VSSLEIGCSLVQVSFDKKALAEIQKQSSGNISIAIAPKTNLSDAAKKIIGTRPVYDITVGYGSGKTVSSFGGGIATVSIP 1697
Cdd:COG3210 1592 GNTATLTLSLAEGTNAEYGGTTNVTSGTAGNAGATGANSNTVVTTNGGEGVLALVAGGNTTNGTTLSGAVNGAGNGWAVD 1671
|
890 900
....*....|....*....|....
gi 503037449 1698 YTLGKNEAVGGLYAVYVDAKGNAN 1721
Cdd:COG3210 1672 LTDATLAGLGGATTAAAGNVATGD 1695
|
|
| COG5263 |
COG5263 |
Glucan-binding domain (YG repeat) [Carbohydrate transport and metabolism]; |
1940-2024 |
8.07e-05 |
|
Glucan-binding domain (YG repeat) [Carbohydrate transport and metabolism];
Pssm-ID: 444077 [Multi-domain] Cd Length: 486 Bit Score: 47.56 E-value: 8.07e-05
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1940 AQGWAQNDAGQYLYYKDGKALTGTQTIDGVKYFFSTDGTLKTGWVKDgdnwrfysgktmlvgfwdlgangNNKTYYFTKD 2019
Cdd:COG5263 425 ATGWQKIGGKWYYFDSNGAMATGWVKVDGKWYYFDSDGAMATGWQTI-----------------------DGKTYYFDSN 481
|
....*
gi 503037449 2020 GLMVS 2024
Cdd:COG5263 482 GAWVG 486
|
|
| Choline_bind_3 |
pfam19127 |
Choline-binding repeat; Pair of presumed choline-binding repeats often found adjacent to ... |
1960-2027 |
9.60e-05 |
|
Choline-binding repeat; Pair of presumed choline-binding repeats often found adjacent to pfam01473.
Pssm-ID: 465978 [Multi-domain] Cd Length: 47 Bit Score: 41.76 E-value: 9.60e-05
10 20 30 40 50 60
....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 503037449 1960 LTGTQTIDGVKYFFSTDGTLKTGWVKDGDnwrfysgktmlvgfwdlgangNNKTYYFTKDGLMVSGKW 2027
Cdd:pfam19127 1 VTGWQTINGQTLYFDSDGKQVKGWVVTID---------------------GKWYYFDADSGEMVTNRF 47
|
|
| SLH |
pfam00395 |
S-layer homology domain; |
1756-1794 |
8.88e-04 |
|
S-layer homology domain;
Pssm-ID: 459798 [Multi-domain] Cd Length: 42 Bit Score: 38.73 E-value: 8.88e-04
10 20 30 40
....*....|....*....|....*....|....*....|
gi 503037449 1756 FTDTSTHWA-KESIDYVVGRGLMSGSSETAFTPNSAMTRG 1794
Cdd:pfam00395 1 FKDVKSVAAwAEAVAALAELGIISGYPDGTFRPNEPITRA 40
|
|
| AidA |
COG3468 |
Autotransporter adhesin AidA [Cell wall/membrane/envelope biogenesis, Intracellular ... |
650-1037 |
9.60e-04 |
|
Autotransporter adhesin AidA [Cell wall/membrane/envelope biogenesis, Intracellular trafficking, secretion, and vesicular transport];
Pssm-ID: 442691 [Multi-domain] Cd Length: 846 Bit Score: 44.17 E-value: 9.60e-04
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 650 GISIAGGTAASKKYDGGADAIVTDLIFSGLQNGETLELGRDYLVGSPAYDNANVGTEKTVSGTASLILNEKTKNYNLISG 729
Cdd:COG3468 45 SGSGAGGVAGNGGGGGGGAGGGGGGAGSGGGLAGAGSGGTGGNSTGGGGGNSGTGGTGGGGGGGGSGNGGGGGGGGGGGG 124
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 730 NYIIRNGVITRGDGPDAPTGGIVDDERNIFIFDSVVGTTYEYKIGSENWTNIFATGTETVITVGNKALAIGELQVRIKET 809
Cdd:COG3468 125 TGGGGGGGTGSAGGGGGGGGGGTGVGGTGAAAAGGGTGSGGGGSGGGGGAGGGGGGGAGGSGGAGSTGSGAGGGGGGSGG 204
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 810 ANYQAGEVLSNKTAFTASLEGSVDITGTTVYGetLTAEVEGQQANAVLIYKWKANGEEIQNGSQNTLTISGSLVGKIINV 889
Cdd:COG3468 205 GGGAAGTGGGGGGGGGAGGATGGAGSGGNTGG--GVGGGGGSAGGTGGGGLTGGGAAGTGGGGGGTGTGSGGGGGGGANG 282
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 890 EITAANYSGKLISTTTPSVRKKQVTATAGSITKVYDGNKWVNIPLSITKVNASDDITVVAEGIYNSADAGTGKTVSISGA 969
Cdd:COG3468 283 GGSGGGGGASGTGGGGTASTGGGGGGGGGNGGGGGGGSNAGGGSGGGGGGGGGGGGGGTTLNGAGSAGGGTGAALAGTGG 362
|
330 340 350 360 370 380
....*....|....*....|....*....|....*....|....*....|....*....|....*...
gi 503037449 970 TVTGDAKDWYEIALPKGLTGTITPKSMPSATVTVNGSYIYTGSAIVPTDENVVVKDGTTTLTKGTDYS 1037
Cdd:COG3468 363 SGSGGGGGGGSGGGGGAGGGGANTGSDGVGTGLTTGGTGNNGGGGVGGGGGGGLTLTGGTLTVNGNYT 430
|
|
| PHA03255 |
PHA03255 |
BDLF3; Provisional |
1028-1155 |
2.93e-03 |
|
BDLF3; Provisional
Pssm-ID: 165513 [Multi-domain] Cd Length: 234 Bit Score: 41.43 E-value: 2.93e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1028 TTLTKGTDYSFTASNNINVGTASVEVTLKGNYSGNASGSFTIAPKAVTPTIEAIGNR-----TFTGTQLTPNVTVENdGT 1102
Cdd:PHA03255 20 TSLIWTSSGSSTASAGNVTGTTAVTTPSPSASGPSTNQSTTLTTTSAPITTTAILSTntttvTSTGTTVTPVPTTSN-AS 98
|
90 100 110 120 130
....*....|....*....|....*....|....*....|....*....|...
gi 503037449 1103 TLIKDTDYTVSYGANTNVGTGSVTVSLKGNYSGTATANFAITKAASPTVSAPD 1155
Cdd:PHA03255 99 TINVTTKVTAQNITATEAGTGTSTGVTSNVTTRSSSTTSATTRITNATTLAPT 151
|
|
| 34 |
PHA02584 |
long tail fiber, proximal subunit; Provisional |
891-1132 |
3.17e-03 |
|
long tail fiber, proximal subunit; Provisional
Pssm-ID: 222890 [Multi-domain] Cd Length: 1229 Bit Score: 42.82 E-value: 3.17e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 891 ITAANYSGKLISTTTPSVRKKQVTATAGSITKVYDGNKWVNI--PLSITKVNASDDITVVAEGIYNSADAGTGKTVSIS- 967
Cdd:PHA02584 934 SVTANSTLTTQNTSNGTVVVVDETSIAFYSQNNTTGNIVFNIdgTVDPINVNANGTLNATGVATNGRAVYAEGGGIARTn 1013
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 968 --------GATVTGDAKDWYEIALPKGlTGTITPKSMPSATVTvngsyiyTGSAIVPTDENVVVKDGTTTLTKGTDYSFT 1039
Cdd:PHA02584 1014 naaraitgGFTIRNDGSTTVFLLTAAG-DQTGGFNGLKSLIIN-------NANGQVTINDNYIINAGGTIMSGGLTVNSR 1085
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1040 ASNNINVGTASVEVTLKG----NYSGNASGSFTIAPKAVTPTIEAIGNRT---FTGTQLTPNVTVENDGTTlikdTDYTV 1112
Cdd:PHA02584 1086 IRSQGTKASYTRAPTADTvgfwSVDINDSATYNQFPGYFQMVTKTKSPGTltqFGNTLDSLYQDWSPDGRT----TRYTR 1161
|
250 260
....*....|....*....|
gi 503037449 1113 SYGANTNVGTGSVTVSLKGN 1132
Cdd:PHA02584 1162 TWQKNKNAWTSFGEVYTKGN 1181
|
|
| glucan_65_rpt |
TIGR04035 |
glucan-binding repeat; This model describes a region of about 63 amino acids that is composed ... |
2014-2041 |
4.18e-03 |
|
glucan-binding repeat; This model describes a region of about 63 amino acids that is composed of three repeats of a more broadly distributed family of shorter repeats modeled by pfam01473. While the shorter repeats are often associated with choline binding (and therefore with cell wall binding), the longer repeat described here represents a subgroup of repeat sequences associated with glucan binding, as found in a number of glycosylhydrolases. Shah, et al. describe a repeat consensus, WYYFDANGKAVTGAQTINGQTLYFDQDGKQVKG, that corresponds to half of the repeat as modeled here and one and a half copies of the repeat as modeled by pfam01473.
Pssm-ID: 274933 [Multi-domain] Cd Length: 62 Bit Score: 37.50 E-value: 4.18e-03
10 20
....*....|....*....|....*...
gi 503037449 2014 YYFTKDGLMVSGkWLEIDGKCYYFYTDG 2041
Cdd:TIGR04035 1 YYFDADGKAVTG-AQTIDGVTYYFDENG 27
|
|
| COG4625 |
COG4625 |
Uncharacterized conserved protein, contains a C-terminal beta-barrel porin domain [Function ... |
650-1140 |
4.32e-03 |
|
Uncharacterized conserved protein, contains a C-terminal beta-barrel porin domain [Function unknown];
Pssm-ID: 443664 [Multi-domain] Cd Length: 900 Bit Score: 42.07 E-value: 4.32e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 650 GISIAGGTAASKKYDGGADAIVTDLIFSGLQNGETLELGRDYLVGSPAYDNANVGTEKTVSGTASLILNEKTKNYNLISG 729
Cdd:COG4625 5 GGGGGGGGGGGGTGGGGAGGGGGAGGGAGGGGAGGGGGGGGGGGGAGGGGGGGGTGGGGGGGGGGGGGGAGGGGGGGGGG 84
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 730 NYII-RNGVITRGDGPDAPTGGIVDDERNIFIFDSVVGTTYEYKIGSENWTNIFATGTETVITVGNKALAIGELQVRIKE 808
Cdd:COG4625 85 GGGGgTGGVGGGGGGGGGGGGGGGGGGGGGGGGSAGGGGGGAGGAGGGGGGGAGGGGGGGGGGGAGGGGGGGAGGAGGGG 164
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 809 TANYQAGEVLSNKTAFTASLEGSVDITGTTVYGETLTAEVEGQQANAVLIYKWKANGEEIQNGSQNTLTISGSLVGKIIN 888
Cdd:COG4625 165 GGGGGGGGGGGGGGGGGGGGGGGGGGGGNGGGGGGGGGGGGGGGGGGGGAGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG 244
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 889 VEITAANYSGKLISTTTPSVRKKQVTATAGSITKVYDGNKWVNIPLSITKVNASDDITVVAEGIYNSADAGTGKTVSISG 968
Cdd:COG4625 245 GGGAGGGGGGGGGNGGGGGAGGGGGGGGGGSGGGGGGGGGGGSGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGGG 324
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 969 ATVTGDAKDWYEIALPKGLTGTITpksmpSATVTVNGSYIYTGSAIVPTDENVVVKDGTTTLTKGTDYSFTASNNINVGT 1048
Cdd:COG4625 325 GGGGGGGGGAGGGGGSGGAGAGGG-----GAGGGGAGGGGGGGTGGGGGGGGGGGGGSGGGGAGGGGGSGGGGGGGAGGG 399
|
410 420 430 440 450 460 470 480
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1049 ASVEVTLKGNYSGNASGSFTIAPKAVTPTIEAIGNRTFTGTQLTPNVTVENDGTTLIKDTDYTVSYGANTNVGTGSVTVS 1128
Cdd:COG4625 400 GGGGGAGGTGGGGAGGGGGAAGGGGGGTGAGGGGGGGGTGAGGGGATGGGGGGGGGAGGSGGGAGAGGGSGSGAGTLTLT 479
|
490
....*....|..
gi 503037449 1129 LKGNYSGTATAN 1140
Cdd:COG4625 480 GNNTYTGTTTVN 491
|
|
| 5 |
PHA02596 |
baseplate hub subunit and tail lysozyme; Provisional |
1025-1176 |
5.74e-03 |
|
baseplate hub subunit and tail lysozyme; Provisional
Pssm-ID: 222900 [Multi-domain] Cd Length: 576 Bit Score: 41.66 E-value: 5.74e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1025 DGT-TTLTKGTDYSFTASN-NINVGtASVEVTLKGN----YSGNA----SGSFTIAPKA-VTPTIEAIGNRTFTGtqltp 1093
Cdd:PHA02596 428 DGTrVVKIVGDDYYIVKQDrNVNVK-GNLKVVVEGDaiyyNMGNVlqtiDGNVTIFVRGnVTKTVEGNGTLYVKG----- 501
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1094 NVTVENDGttlikDTDYTVSYGANTNVgTGSVTVSLKGNYSGTATANFAITKAASPTVSAPDNIAMVKGQeraYTFDLST 1173
Cdd:PHA02596 502 NVTVQVDG-----NLDATVKGNATTLV-EGNQTNTVNGNYKLKVEGNFDMTVGGNWSEQMAGMSSIASGT---YTIDGSR 572
|
...
gi 503037449 1174 LTM 1176
Cdd:PHA02596 573 IDI 575
|
|
| MBG_3 |
pfam18887 |
MBG domain; This entry corresponds to an MBG (mirror beta grasp) domain. It is found in a ... |
999-1073 |
6.76e-03 |
|
MBG domain; This entry corresponds to an MBG (mirror beta grasp) domain. It is found in a variety of bacterial cell surface proteins.
Pssm-ID: 436808 [Multi-domain] Cd Length: 72 Bit Score: 37.29 E-value: 6.76e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 999 ATVTVNG-SYIYTGSAIVPTdenvvvkdgTTTLTKGTDYSFT----ASNNINVGTASVEVTL-KGNYSGNASGSFTIAPK 1072
Cdd:pfam18887 1 ATVTLGDlSQTYNGSAKSAT---------ATTSPAGLSVTLTydgsATAPTNAGSYAVVATInDANYTGSASGTLVIAKA 71
|
.
gi 503037449 1073 A 1073
Cdd:pfam18887 72 S 72
|
|
| AidA |
COG3468 |
Autotransporter adhesin AidA [Cell wall/membrane/envelope biogenesis, Intracellular ... |
790-1224 |
9.35e-03 |
|
Autotransporter adhesin AidA [Cell wall/membrane/envelope biogenesis, Intracellular trafficking, secretion, and vesicular transport];
Pssm-ID: 442691 [Multi-domain] Cd Length: 846 Bit Score: 41.09 E-value: 9.35e-03
10 20 30 40 50 60 70 80
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 790 ITVGNKALAIGELQVRIKETANYQAGEVLSNKTAFTASLEGSVDITGTTVYGETLTAEVEGQQANAVLIYKWKANGEEIQ 869
Cdd:COG3468 1 TASGGGGGATGLGGGGTGGGGGLGGTGGGNAGLGIGNGGGGGAASGSGAGGVAGNGGGGGGGAGGGGGGAGSGGGLAGAG 80
|
90 100 110 120 130 140 150 160
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 870 NGSQNTLTISGSLVGKIINVEITAANYSGKLISTTTPSVRKKQVTATAGSITKVYDGNKWVNIPLSITKVNASDDITVVA 949
Cdd:COG3468 81 SGGTGGNSTGGGGGNSGTGGTGGGGGGGGSGNGGGGGGGGGGGGTGGGGGGGTGSAGGGGGGGGGGTGVGGTGAAAAGGG 160
|
170 180 190 200 210 220 230 240
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 950 EGIYNSADAGTGKTVSISGATVTGDAkdwyeialPKGLTGTITPKSMPSATVTVNGSYIYTGSAIVPTDENVVVKDGTTT 1029
Cdd:COG3468 161 TGSGGGGSGGGGGAGGGGGGGAGGSG--------GAGSTGSGAGGGGGGSGGGGGAAGTGGGGGGGGGAGGATGGAGSGG 232
|
250 260 270 280 290 300 310 320
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1030 LTKGTDYSFTASNNINVGTASVEVTLKGNYSGNASGSFTIAPKAVTPTIEAIGNRTFTGTQLTPNVTVENDGTTLIkDTD 1109
Cdd:COG3468 233 NTGGGVGGGGGSAGGTGGGGLTGGGAAGTGGGGGGTGTGSGGGGGGGANGGGSGGGGGASGTGGGGTASTGGGGGG-GGG 311
|
330 340 350 360 370 380 390 400
....*....|....*....|....*....|....*....|....*....|....*....|....*....|....*....|
gi 503037449 1110 YTVSYGANTNVGTGSVTVSLKGNYSGTATANFAITKAASPTVSAPDNIAMVKGQERAYTFDLSTLTMPQNAGVKTYVVAS 1189
Cdd:COG3468 312 NGGGGGGGSNAGGGSGGGGGGGGGGGGGGTTLNGAGSAGGGTGAALAGTGGSGSGGGGGGGSGGGGGAGGGGANTGSDGV 391
|
410 420 430
....*....|....*....|....*....|....*
gi 503037449 1190 PSNADLFAVNPTVDGSMLKFTSKSVSTAGGTATVA 1224
Cdd:COG3468 392 GTGLTTGGTGNNGGGGVGGGGGGGLTLTGGTLTVN 426
|
|
|