An Improved Algorithm for Demarcating Bacterial Species
For scientists to understand microbial ecosystems, it is imperative to identify the fundamental unit of bacterial species. Past attempts relied on characterizing metabolic capabilities or on percentage of genome sequence similarity; however, these approaches have proven ineffective. The Cohan lab developed a program called Ecotype Simulation (ES) that attempts to demarcate bacterial species based on the Stable Ecotype model of speciation. Our lab has previously demonstrated, through field experiments and sequence simulations, that ES is more accurate than other demarcators (AdaptML, BAPS, GMYC). However, ES lacks the efficiency of these other programs. Recently we have made several improvements to the algorithm behind ES that increase its efficiency by orders of magnitude. With these improvements, I aim to demonstrate the high accuracy of our second version of Ecotype Simulation (ES2) relative to ES on smaller inputs, and then to show ES2's superiority on larger inputs.