In Partial Fulfillment of the Requirements for the Degree of
Doctor of Philosophy
Will defend his dissertation
Many applications that use 16S ribosomal RNA (16S rRNA) probes to rapidly identify bacteria have been reported. To better understand how the existing methods work and how they might be improved, we conducted a statistical study of the properties of short subsequences (n-mers) in 16S rRNAs. Our study confirmed that 16S rRNAs contain a large number of characteristic n-mers. Exploiting this property, we compared three major n-mer-based alignment-free distance models and found that the Angle distance method with 6-mers was the best model for alignment-free comparison of 16S rRNA sequences. Using this model, we selected a set of representative 16S rRNA sequences.
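As an illustration only, an angle-based distance on 6-mers can be sketched as follows. The sketch assumes that the Angle distance is the angle between the n-mer count vectors of two sequences (the arccosine of their cosine similarity); the dissertation's exact definition and normalization may differ, and the function names `kmer_counts` and `angle_distance` are hypothetical.

```python
import math
from collections import Counter


def kmer_counts(seq, k=6):
    """Count the overlapping k-mers in a sequence (case-insensitive)."""
    seq = seq.upper()
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def angle_distance(seq_a, seq_b, k=6):
    """Angle (in radians) between the k-mer count vectors of two sequences.

    Assumption: the distance is arccos of the cosine similarity of raw
    k-mer counts. 0 means identical k-mer composition; pi/2 means the two
    sequences share no k-mer at all.
    """
    a = kmer_counts(seq_a, k)
    b = kmer_counts(seq_b, k)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("each sequence must contain at least one k-mer")
    # Only k-mers present in both sequences contribute to the dot product.
    dot = sum(a[m] * b[m] for m in a.keys() & b.keys())
    # Clamp to [-1, 1] to guard against floating-point rounding.
    cosine = max(-1.0, min(1.0, dot / (norm_a * norm_b)))
    return math.acos(cosine)
```

Because the angle depends only on the direction of the count vectors, it is insensitive to overall sequence length, which is one plausible reason such a measure suits sequences of slightly different lengths.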
Because traditional group-specific probe design methods cannot find target n-mers for some phylogenetic groupings, we developed a novel approach based on characterizing the hybridization patterns of all representative sequences. The approach optimizes a small set of target n-mers so that each representative sequence produces a unique hybridization pattern that serves as its signature. The optimization was performed with evolutionary programming on multiple processors. We obtained seven sets of 20-mers that could be used as universal target sets.
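The signature idea can be sketched in a few lines. This is a toy, single-process illustration under strong simplifying assumptions, not the dissertation's implementation: hybridization is modeled as exact substring presence (real probes bind complementary strands and tolerate mismatches), the fitness is simply the number of sequences with a unique pattern, and the search is a basic mutation-only evolutionary loop. All function names and parameters are hypothetical.

```python
import random
from collections import Counter


def hybridization_pattern(seq, targets):
    """Binary pattern: 1 if the target n-mer occurs in the sequence, else 0.

    Assumption: exact substring match stands in for hybridization.
    """
    return tuple(int(t in seq) for t in targets)


def count_unique_signatures(seqs, targets):
    """Number of sequences whose pattern is shared with no other sequence."""
    patterns = [hybridization_pattern(s, targets) for s in seqs]
    freq = Counter(patterns)
    return sum(1 for p in patterns if freq[p] == 1)


def candidate_nmers(seqs, n):
    """All distinct n-mers that occur in at least one sequence."""
    return sorted({s[i:i + n] for s in seqs for i in range(len(s) - n + 1)})


def evolve_targets(seqs, n, set_size, generations=200, pop_size=20, seed=0):
    """Mutation-only evolutionary search for a discriminating target set."""
    rng = random.Random(seed)
    pool = candidate_nmers(seqs, n)
    population = [rng.sample(pool, set_size) for _ in range(pop_size)]
    for _ in range(generations):
        offspring = []
        for parent in population:
            child = list(parent)
            # Mutation: replace one randomly chosen target n-mer.
            child[rng.randrange(set_size)] = rng.choice(pool)
            offspring.append(child)
        # Selection: keep the best pop_size of parents plus offspring.
        combined = population + offspring
        combined.sort(key=lambda t: count_unique_signatures(seqs, t),
                      reverse=True)
        population = combined[:pop_size]
    return population[0]
```

A call such as `evolve_targets(representatives, n=20, set_size=10)` would return one candidate target set, where `representatives` is a list of sequence strings; the set size of 10 is an arbitrary placeholder, not a value from the study.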