Surface layer (S-layer) proteins are among the most commonly observed cell envelope components in both Archaea and Bacteria. They have versatile functions and hold considerable application potential in biotechnology. Bifidobacteria are representative probiotics that confer health-promoting properties. However, S-layers in bifidobacteria have been little studied, and their distribution and characteristics are unknown. In this study, a search for S-layer proteins in the NCBI Identical Protein Groups database yielded 49 hits belonging to bifidobacteria. These proteins were annotated as either “S-layer (domain) protein” or “putative S-layer (y) domain protein” and were distributed among 26 species of the genus Bifidobacterium. Multiple alignments suggest that S-layer proteins are relatively conserved. Phylogenetic analysis of 24 S-layer (domain) protein sequences grouped them into three distinct clusters, with the majority of species in Cluster-2. S-layer (domain) proteins share a universal motif, DUF4381, whose function is unknown. In addition, two other motifs, CARDB and EphA2_TM, involved in cell adhesion and cell signaling, respectively, are present in most S-layer (domain) proteins in bifidobacteria. All S-layer proteins have a typical N-terminal Sec-dependent signal peptide and a C-terminal transmembrane region. Homology modeling of representative S-layer proteins from each cluster revealed a few unique structural features. All representative S-layer proteins contain plentiful β-meander motifs composed exclusively of β-barrel structural architectures linked together by hairpin loops.
S-layers are one of the most commonly observed prokaryotic cell surface structures. They are composed of two-dimensional arrays of proteinaceous subunits (S-layer proteins) and are present in almost every taxonomic group of walled Bacteria and almost universal in Archaea [
In Gram-positive bacteria, S-layers are attached to the rigid peptidoglycan-containing layer. S-layers completely cover the cell surface during all stages of cell growth and division. Chemical and genetic analyses of many S-layers have revealed a similar overall composition [
Bifidobacteria are generally recognized as safe (GRAS), exerting many beneficial health effects on their host, and have attracted strong interest in the health care and food industries [
The S-layer protein sequences of bifidobacteria were retrieved from NCBI Identical Protein Groups (IPG) with the keywords “S-layer domain protein AND Bifidobacterium” (https://www.ncbi.nlm.nih.gov/). The resulting 49 protein sequences, annotated as either “S-layer (domain) protein” or “putative S-layer (y) domain protein”, were used for primary analysis. Domain Enhanced Lookup Time Accelerated BLAST (DELTA-BLAST) was then used for a second search, with the longest consensus regions as queries and an expect threshold of 4.0. The queries for S-layer (domain) protein (P146, YVNFGKGD, 8 aa) and putative S-layer (y) domain protein (P277, QLVTWVESHDNYAN, 14 aa) were obtained from local ClustalW multiple alignments with the consensus threshold set at 100% [
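The idea of extracting the longest consensus region at a 100% threshold can be illustrated with a minimal sketch. It assumes the alignment is already available as equal-length gapped strings; the function names and the toy alignment are hypothetical and are not the authors' ClustalW workflow.

```python
from collections import Counter


def conserved_residue(column, threshold):
    """Return the dominant residue of an alignment column, or None if the
    column is gap-dominated or its top residue falls below the threshold."""
    residue, count = Counter(column).most_common(1)[0]
    if residue == "-" or count / len(column) < threshold:
        return None
    return residue


def longest_conserved_run(aligned, threshold=1.0):
    """Return (1-based start column, consensus string) of the longest run of
    consecutive columns that are conserved at the given threshold."""
    width = len(aligned[0])
    if any(len(row) != width for row in aligned):
        raise ValueError("all aligned sequences must have the same length")
    best_start, best = 0, ""
    start, current = 0, ""
    for col in range(width):
        residue = conserved_residue([row[col] for row in aligned], threshold)
        if residue is None:
            # A non-conserved column ends the current run.
            start, current = col + 1, ""
            continue
        current += residue
        if len(current) > len(best):
            best_start, best = start, current
    return best_start + 1, best


# Toy alignment (made up for illustration; not real bifidobacterial data).
toy_alignment = [
    "MKYVNFGKGDAT",
    "MRYVNFGKGDST",
    "-KYVNFGKGDAT",
]
print(longest_conserved_run(toy_alignment, threshold=1.0))
```

Lowering `threshold` below 1.0 relaxes the requirement from identical columns to majority-conserved columns, which lengthens the reported region.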
Protein sequences of S-layer (domain) proteins and putative S-layer (y) domain proteins were then aligned separately with a local installation of ClustalW version 2.0 using the progressive method [
Representative sequences, namely the S-layer proteins of B. thermophilum RBL67 (Accession: AGH41482.1), B. pseudocatenulatum LMG10505 (Accession: KFI75572.1), and B. longum DJO10A (Accession: ACD98337.1), belonging to Clusters 1, 2, and 3, respectively, were analyzed with the ProtParam tool at ExPASy (http://web.expasy.org/protparam/). The subcellular localization of S-layer proteins was predicted with PSORTb v3.0.2 (http://www.psort.org/psortb/index.html). All motifs in S-layer (domain) proteins were screened with the MOTIFS program (http://www.genome.jp/tools/motif/). The representative sequence from each cluster listed above was used as an example to illustrate conserved and/or unique structural motifs. The search used the Pfam library with profile hidden Markov models and an E-value cutoff of 1.0 [
Potential signal peptide (SP) sequences of all S-layer proteins were analyzed using SignalP version 4.1 and TATFIND [
The sequences of representative S-layer (domain) proteins from each cluster of the phylogenetic tree were used to search the Protein Data Bank (PDB) for the closest homologues using NCBI BLASTp with the DELTA-BLAST algorithm (https://blast.ncbi.nlm.nih.gov/Blast.cgi?PROGRAM=blastp). In addition, each of the three sequences was submitted separately for structure prediction. Five homology models were obtained for each sequence from the RaptorX server (http://raptorx.uchicago.edu/StructurePrediction) [
A search for S-layer proteins in the NCBI Identical Protein Groups database yielded 49 hits belonging to bifidobacteria. The sequences annotated as either S-layer (domain) protein or putative S-layer (y) domain protein were downloaded. Basic information on these proteins is summarized in
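Name | Species | Rep. strain | Accession No. | Size (aa) | Source |
---|---|---|---|---|---|
S-layer (domain) protein | B. adolescentis | 2789STDY5608824 | CUN56955.1 | 390 | INSDC |
S-layer (domain) protein | B. adolescentis | 2789STDY5608862 | CUN77683.1 | 390 | INSDC |
S-layer (domain) protein | B. longum | DJO10A | ACD98337.1 | 388 | INSDC |
S-layer (domain) protein | B. longum | BT1 | ALE08127.1 | 388 | INSDC |
S-layer (domain) protein | B. longum | LMG 21814 | KFI71892.1 | 388 | INSDC |
S-layer (domain) protein | B. longum | BG7 | ALE35760.1 | 391 | INSDC |
S-layer (domain) protein | B. longum | LMG 13197 | KFI65071.1 | 388 | INSDC |
S-layer (domain) protein | B. longum | AH1206 | AOP00589.1 | 388 | INSDC |
S-layer (domain) protein | B. longum | 35624 | AOL09953.1 | 391 | INSDC |
S-layer (domain) protein | B. longum | BBMN68 | ADQ01830.1 | 391 | INSDC |
S-layer (domain) protein | B. boum | LMG 10736 | KFI46217.1 | 405 | INSDC |
S-layer (domain) protein | B. breve | BR3 | ALE13696.1 | 431 | INSDC |
S-layer (domain) protein | B. catenulatum | LMG 11043 | KFI55811.1 | 378 | INSDC |
S-layer (domain) protein | B. dentium | Bd1 | ADB09373.1 | 407 | INSDC |
S-layer (domain) protein | B. gallinarum | LMG 11586 | KFI60864.1 | 400 | INSDC |
S-layer (domain) protein | B. gallicum | LMG 11593 | WP006295244.1 | | RefSeq |
S-layer (domain) protein | B. minimum | LMG 11592 | KFI72335.1 | | INSDC |
S-layer (domain) protein | B. moukalabense | DSM 27321 | ETY71086.1 | 416 | INSDC |
S-layer (domain) protein | B. myosotis | DSM 100196 | OZG56886.1 | | INSDC |
S-layer (domain) protein | B. pseudocatenulatum | LMG 10505 | KFI75572.1 | 373 | INSDC |
S-layer (domain) protein | B. reuteri | DSM 23975 | KFI87931.1 | 480 | INSDC |
S-layer (domain) protein | B. scardovii | LMG 21589 | KFI91418.1 | 454 | INSDC |
S-layer (domain) protein | B. scardovii | LMG 21589 | KFI92534.1 | | INSDC |
S-layer (domain) protein | B. stellenboschense | DSM 23968 | KFI98781.1 | 387 | INSDC |
S-layer (domain) protein | B. stercoris | DSM 24849 | KFI96690.1 | 390 | INSDC |
S-layer (domain) protein | B. subtile | LMG 11597 | KFI98912.1 | | INSDC |
S-layer (domain) protein | B. thermophilum | JCM 1207 | KFJ07718.1 | 407 | INSDC |
S-layer (domain) protein | B. thermophilum | RBL67 | AGH41482.1 | 405 | INSDC |
S-layer (domain) protein | B. sp. | 12_1_47BFAA | EFV36416.1 | 391 | INSDC |
putative S-layer (y) domain protein | B. angulatum | LMG 11039 | KFI40943.1 | | INSDC |
putative S-layer (y) domain protein | B. animalis | Bl12 | AGO52940.1 | | INSDC |
putative S-layer (y) domain protein | B. choerinum | LMG 10510 | KFI56970.1 | | INSDC |
putative S-layer (y) domain protein | B. choerinum | LMG 10510 | KFI54244.1 | | INSDC |
putative S-layer (y) domain protein | B. gallicum | LMG 11596 | KFI59372.1 | | INSDC |
putative S-layer (y) domain protein | B. gallicum | LMG 11596 | KFI59699.1 | | INSDC |
putative S-layer (y) domain protein | B. gallicum | LMG 11596 | KFI59086.1 | | INSDC |
putative S-layer (y) domain protein | B. gallicum | LMG 11596 | KFI59089.1 | | INSDC |
putative S-layer (y) domain protein | B. gallicum | LMG 11596 | WP006295430.1 | | RefSeq |
putative S-layer (y) domain protein | B. gallicum | LMG 11596 | WP006295735.1 | | RefSeq |
putative S-layer (y) domain protein | B. magnum | LMG 11591 | KFI68108.1 | | INSDC |
putative S-layer (y) domain protein | B. merycicum | LMG 11341 | KFI69438.1 | | INSDC |
putative S-layer (y) domain protein | B. minimum | LMG 11592 | KFI73654.1 | | INSDC |
putative S-layer (y) domain protein | B. pseudolongum | LMG 11569 | KFI75530.1 | | INSDC |
putative S-layer (y) domain protein | B. pseudolongum | LMG 11569 | KFI75529.1 | | INSDC |
putative S-layer (y) domain protein | B. pseudolongum | LMG 11569 | KFI75058.1 | | INSDC |
putative S-layer (y) domain protein | B. pseudolongum | LMG 11571 | KFI77915.1 | | INSDC |
putative S-layer (y) domain protein | B. pseudocatenulatum | LMG 10505 | KFI74947.1 | | INSDC |
putative S-layer (y) domain protein | B. stercoris | DSM 24849 | KFI95044.1 | | INSDC |
putative S-layer (y) domain protein | B. tsurumiense | JCM 13495 | KFJ06122.1 | | INSDC |

Multiple alignments of 24 S-layer (domain) proteins yielded several consensus regions. The longest consensus region YVNFGKGD was marked as P146. As shown in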
ProtParam computation for the representative S-layer proteins from each cluster indicates that they are stable proteins with high aliphatic index values and similar pI values (for detailed values see
Parameters/Species | B. thermophilum RBL67 | B. pseudocatenulatum LMG10505 | B. longum DJO10A |
---|---|---|---|
No. of amino acids | 405 | 373 | 388 |
Molecular weight (Da) | 42,180.03 | 38,803.91 | 40,777.55 |
Theoretical pI | 4.39 | 4.13 | 4.41 |
No. of negatively charged residues | 51 | 49 | 47 |
No. of positively charged residues | 30 | 21 | 25 |
Instability index | 35.84 | 24.03 | 28.21 |
Aliphatic index | 80.94 | 81.07 | 82.42 |
GRAVY | −0.17 | −0.103 | −0.091 |
No., number; GRAVY, Grand average of hydropathicity.
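As an illustration of how several of the parameters in the table above are defined, the following minimal sketch computes GRAVY (mean Kyte-Doolittle hydropathy), the aliphatic index (Ikai's formula), and the charged-residue counts for a plain amino acid string. The function names and the toy sequence are hypothetical; the instability index is not reproduced because it requires a dipeptide weight table. The values reported in the table came from the ProtParam server, not from this code.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(seq):
    """Grand average of hydropathicity: mean hydropathy over all residues."""
    return sum(KYTE_DOOLITTLE[aa] for aa in seq) / len(seq)


def aliphatic_index(seq):
    """Aliphatic index: X(Ala) + 2.9*X(Val) + 3.9*(X(Ile) + X(Leu)),
    where X is the mole percent of each residue."""
    def mole_percent(aa):
        return 100.0 * seq.count(aa) / len(seq)
    return (mole_percent("A")
            + 2.9 * mole_percent("V")
            + 3.9 * (mole_percent("I") + mole_percent("L")))


def charged_counts(seq):
    """Return (negatively charged, positively charged) residue counts,
    counting Asp + Glu as negative and Arg + Lys as positive."""
    negative = seq.count("D") + seq.count("E")
    positive = seq.count("R") + seq.count("K")
    return negative, positive


# Toy sequence for illustration only (not one of the S-layer proteins).
toy_sequence = "AVILDEKR"
print(gravy(toy_sequence), aliphatic_index(toy_sequence), charged_counts(toy_sequence))
```

A negative GRAVY value, as reported for all three representative proteins, indicates a slightly hydrophilic sequence on average.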
A MOTIFS search of the S-layer (domain) sequences extracted from NCBI-IPG identified numerous motifs in each sequence. For simplicity, motifs in the representative sequences were compared at an E-value of 0.01 (
these motifs indicates some important properties of S-layer proteins. For example, the structural motif corresponding to the first α-helix of the S-layer protein is conserved in all clusters.
A signal peptide (SP) is responsible for directing protein secretion across the cell membrane. S-layer (domain) proteins need such a structural element to direct their subcellular localization. SignalP-TM prediction with the Gram-positive bacteria model indicates that most S-layer (domain) proteins, 23 of the 24 analyzed, have a potential Sec-dependent SP (as represented in
A BLASTp search of the PDB database with the three representative sequences yielded the same result for each. All sequences matched Chain A of Vibrio nigripulchritudo nigritoxin (PDB ID: 5M41), with 33% identity over 45 aa. However, this model represents only a small partial structure, as only 141 aa were included in the model. Therefore, we next generated homology models of the representative S-layer (domain) protein sequences from each cluster. Five models were generated for each sequence. The best predicted model was selected and is shown in
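For reference, an identity figure such as "33% identity over 45 aa" is a count of identical aligned positions divided by the alignment length. The sketch below shows that arithmetic for two gapped, equal-length aligned strings; the function name and the toy strings are hypothetical, and the denominator follows the BLAST convention of using the full alignment length, gap columns included.

```python
def percent_identity(aligned_query, aligned_subject):
    """Return (matches, alignment length, percent identity) for two aligned
    sequences of equal length; gap columns count toward the length but
    never toward the matches."""
    if len(aligned_query) != len(aligned_subject):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_query:
        raise ValueError("alignment must not be empty")
    matches = sum(
        1
        for q, s in zip(aligned_query, aligned_subject)
        if q == s and q != "-"
    )
    length = len(aligned_query)
    return matches, length, 100.0 * matches / length


# Toy aligned pair for illustration only (not the 5M41 alignment).
matches, length, identity = percent_identity("ACDEFG", "ACDQ-G")
print(f"{matches}/{length} identical positions ({identity:.0f}%)")
```

With 15 identical positions in a 45-column alignment, the same calculation gives 33%, which shows how a short local match can yield a moderate identity value while covering only a small part of a protein of about 400 aa.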
In this study, we investigated the distribution of S-layer domain proteins in Bifidobacterium from a phylogenetic and structural perspective. Phylogenetic analysis of all annotated S-layer protein sequences grouped them into three distinct clusters. (Putative) S-layer (y) domain proteins are distributed in fewer than half of the species of bifidobacteria, although they have several conserved regions, and their longest consensus sequences, P146/P227, are common in nearly all species of bifidobacteria. S-layer proteins have different motifs and domains that are either involved in cell envelope and outer membrane biogenesis or related to cell adhesion. Furthermore, all S-layer (domain) proteins have a typical signal peptide sequence and a C-terminal transmembrane region. Analysis of homology models of representative sequences revealed cluster-specific structural properties of S-layer proteins.
This study was supported by a cooperation grant (No. 172102410055) from the Henan Agency of Science and Technology and a grant of Key Scientific Research Project (No. 17A180017) from the Henan Province Department of Education. The funders had no role in the study design, data collection and analysis, or decision to publish.
Li, J., Shen, Y.H., Jiang, Y.T., He, L. and Sun, Z.K. (2018) Bioinformatic Survey of S-Layer Proteins in Bifidobacteria. Computational Molecular Bioscience, 8, 68-79. https://doi.org/10.4236/cmb.2018.82003