This project aims to increase the biogeographical resolution of microbial communities. Here, biogeographical resolution refers to the precision with which researchers can identify and differentiate bacteria in order to classify them and demonstrate relationships among species. A microbial community is the set of microorganisms that inhabit a specific biological region and interact with one another, for example the bacteria present in the human mouth. Increasing the biogeographical resolution of microbial communities therefore means classifying and differentiating species and strains of bacteria more precisely within a sampled community.
The scope of this project covers the development of two primary pieces of software: a set of tools to better identify and distinguish bacteria, and a pipeline that links existing software together into an automated workflow. The pipeline will be developed as a bash shell command line interface that lets the user choose which genomic analysis tools to run, set any thresholds for those tools, and designate input and output locations, thus eliminating unnecessary breaks in the analysis workflow.
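As a rough illustration of how such an interface might look, the following is a minimal sketch rather than the project's actual design. The flag names (-t, -c, -i, -o), the function names, and the two stages are all hypothetical; the stages are plain-text stand-ins for real genomic analysis tools, included only to show how the output of one step can be passed automatically as the input of the next.

```bash
#!/usr/bin/env bash
# Hypothetical sketch: flags, function names, and stages are illustrative
# assumptions, not the project's real interface or tools.

# Parse the user's choices: tools to run, a threshold, input and output paths.
parse_args() {
    TOOLS="" THRESHOLD="" INPUT="" OUTPUT=""
    local OPTIND=1 opt
    while getopts "t:c:i:o:" opt; do
        case "$opt" in
            t) TOOLS="$OPTARG" ;;      # comma-separated stage names
            c) THRESHOLD="$OPTARG" ;;  # threshold handed to every stage
            i) INPUT="$OPTARG" ;;
            o) OUTPUT="$OPTARG" ;;
            *) return 1 ;;
        esac
    done
    # Input and output designations are required.
    [ -n "$INPUT" ] && [ -n "$OUTPUT" ] || return 1
}

# Stand-in stages. Each takes: input file, output file, threshold.
# minlen keeps lines at least THRESHOLD characters long.
stage_minlen() { awk -v n="$3" 'length($0) >= n + 0' "$1" > "$2"; }
# dedupe removes duplicate lines (the threshold is ignored).
stage_dedupe() { sort -u "$1" > "$2"; }

# Run the selected stages in order, feeding each output into the next stage.
run_pipeline() {
    local current="$INPUT" workdir tool next
    workdir="$(mktemp -d)" || return 1
    local IFS=','
    for tool in $TOOLS; do
        next="$workdir/$tool.out"
        "stage_$tool" "$current" "$next" "$THRESHOLD" || { rm -rf "$workdir"; return 1; }
        current="$next"
    done
    cp "$current" "$OUTPUT"
    rm -rf "$workdir"
}
```

After sourcing this file, a call such as parse_args -i reads.txt -o kept.txt -t minlen,dedupe -c 5 followed by run_pipeline would run both stand-in stages without any manual handling of the intermediate file. The file names here are likewise placeholders.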
The potential benefits to the metagenomics community are substantial. With a streamlined interface provided by the pipeline tool, researchers can perform genomic analysis more efficiently, without having to reorganize the output of one tool as the input to the next whenever no intermediate verification is needed. With software that increases microbial resolution, genomics researchers can use the more precise information to study microbial communities at the level of individual species and strains. This could contribute valuable information to disease research by allowing researchers to identify precisely which bacteria contribute to a disease or, conversely, which bacteria should be present but are not.