Publication Date



Frederick Cohan






The bacterial domain includes thousands of known species, and likely orders of magnitude more that have not yet been discovered. Little is known about the causes of diversification in bacteria and the environmental factors associated with recent divergences. Ecotypes are bacterial populations that are theorized to be ecologically distinct, representing the most recent products of speciation. Our study utilizes a unique environmental soil gradient with dimensions such as salinity, boron, and copper decreasing westward along the transect, and other dimensions remaining stable or in random flux along the gradient. We isolated strains of the Bacillus subtilis-licheniformis clade from this gradient and sequenced a protein-coding gene for 620 of them. These sequences were used to demarcate ecotypes from the sample set, and the ecotypes were tested for associations with the twelve environmental factors measured for the soil samples. We found thirty-one ecotypes in our sample set, twenty-three in the B. subtilis subclade and nine in the B. licheniformis subclade, including nine previously unidentified ecotypes. The ecotypes are significantly heterogeneous in their associations with iron (F(18, 365) = 1.6704, p = 0.04239) and boron (F(18, 244) = 1.6767, p = 0.04401) in the soil, as well as with soil pH (F(18, 365) = 1.6466, p = 0.04699) and the proportion of clay present in the soil (F(18, 99) = 1.7226, p = 0.04753). We further explored the ecological heterogeneity of ecotypes and of strains within ecotypes by testing their growth tolerance to high levels of boron or copper. Ecotypes were marginally significantly different in their growth in high-boron media (F(7, 61) = 0.0566, p = 0.0566), and strains nested within ecotypes were significantly different (F(13, 61), p = 0.0087). In high-copper media, ecotypes did not show differences in growth (F(24, 131), p = 0.21975), but strains did (F(73, 82), p = 0.00001).
This suggests that strains are quickly evolving with respect to their associations with copper and boron, and might gain and lose tolerances within the lifetime of an ecotype.

Rapid diversification, even within ecotypes, is supported by a separate project in which we compared the genome content of four strains of Bacillus subtilis subspecies spizizenii within a single ecotype. Gene annotations and genome comparisons with RAST showed that the strains differed in gene content, including genes for carbohydrate usage, specifically in the utilization of maltose, maltodextrin, and myo-inositol. Only one strain, G1A4, had genes that were non-paralogous to genes in the other strains; these were five genes for maltose and maltodextrin utilization. Strain G1A4 also had two paralogous genes for maltose and maltodextrin utilization in addition to the genes shared by all of the strains; strain G1A3 had three paralogous genes for the utilization of myo-inositol in addition to those in the core genome. To determine whether differences in gene content reflect differences in ecology, strains were tested in monoculture and in competition for their growth in media with maltose, maltodextrin, or myo-inositol as the sole energy source. Strain G1A4, predicted to perform best on maltose and maltodextrin, did outperform the other strains. G1A4 also performed better on glucose, indicating that the strain was superior for reasons besides the extra maltose/maltodextrin genes (due to either the five other unique genes with known functions or one of the dozens of unique genes with unknown functions). Strain G1A3, predicted to perform best on myo-inositol, did not perform the best, even when the data were corrected for the strains' growth differences in the glucose control. These findings demonstrate the value of combining genomic analyses with growth experiments: differences in the ecology of strains can be difficult to determine from sequence data alone. Taken together, however, the results indicate that carbohydrates are one factor associated with very recent speciation events in bacteria.



© Copyright is owned by author of this document