High-Throughput Prediction of Bacteriophage Structure and Lifestyle Implication

2017 SIAM Conference on Computational Science and Engineering

Abstract. Bacteriophages are superabundant in nearly all environments on Earth, from the deep sea to the human gut. Knowing how bacteriophage structures are distributed could show which phages are more successful in particular environmental niches and give a richer understanding of their impact. With metagenomics now producing large datasets of phage genomes, a high-throughput model that produces these distributions from genome size would be ideal. We present an approach to such a model here. The model was developed using attributes of icosahedral tailed phages, so that it addresses a majority of phages with one precise, repeatable method applicable to large datasets. Applied to these datasets, the model produced distributions of T numbers that closely match cited percentages for capsids expected to be icosahedral, and identified those groups with non-icosahedral capsid shapes. This result is promising and leads to further lines of inquiry: refining the model, creating related models for other phage groups, and investigating what we can learn about the variance in capsid structure distribution.
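For readers unfamiliar with T numbers, the following minimal sketch may help. It assumes the standard Caspar-Klug definition, in which the triangulation number of an icosahedral capsid is T = h^2 + hk + k^2 for non-negative integers h and k, so only certain integer values are allowed. The function name `triangulation_numbers` is hypothetical and chosen for illustration. The sketch does not reproduce the model described in the abstract, which maps genome size to T number and is not specified here.

```python
def triangulation_numbers(max_t):
    """Return the sorted Caspar-Klug T numbers that do not exceed max_t.

    Assumes the standard definition T = h^2 + h*k + k^2 with
    non-negative integers h and k (not both zero). This is an
    illustration only, not the genome-size model from the abstract.
    """
    found = set()
    h = 0
    while h * h <= max_t:
        k = 0
        while h * h + h * k + k * k <= max_t:
            t = h * h + h * k + k * k
            if t > 0:  # skip the degenerate case h = k = 0
                found.add(t)
            k += 1
        h += 1
    return sorted(found)


if __name__ == "__main__":
    print(triangulation_numbers(30))
```

Any model that assigns T numbers to genomes has to choose among these discrete values, which is why a predicted distribution of capsid structures is a histogram over allowed T numbers and not a continuous curve.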