Gregory B Gloor
University of Western Ontario, Canada
Title: Analyzing 'omics data using compositional data analysis
Biography
Biography: Gregory B Gloor
Abstract
We will demonstrate that microbiome and transcriptome datasets should be analyzed with a combination of Bayesian estimation and compositional data approaches that examine the ratios between features, which gives robust insights into the structure of high throughput sequencing datasets. Traditional methods of analyzing microbiome or RNA-seq datasets can be misleading and do not use all the available information. As a result, many analyses are dominated by either the most abundant or the rarest features. Data collected using high throughput sequencing (HTS) methods are sequence reads mapped to genomic intervals, and are commonly analyzed as either 'normalized count data' or 'relative abundance data'. One reason for these normalizations is to compensate for the upper bound that the sequencing instrument imposes on the number of sequence reads. Positive data with an arbitrary bound are 'compositional data' and are subject to the problem of spurious correlation, so ordination, clustering and network analysis become unreliable. A second problem is that the data are sparse, i.e., they contain many 0 values. A third problem is that the largest measurement error is at the low count margins in these datasets. Our approach addresses all of these issues.
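As a rough illustration of how Bayesian estimation and a ratio-based transform can be combined, the sketch below draws Dirichlet samples of the feature proportions for one sample and expresses each draw as centered log-ratios (log of each feature relative to the geometric mean of the sample). The abstract does not specify these details: the function names, the Dirichlet model, the pseudo-count prior of 0.5, the number of draws, and the example counts are all assumptions made for illustration, not the speaker's stated method.

```python
import numpy as np


def clr(proportions):
    """Centered log-ratio: log of each part minus the mean log of its row."""
    log_p = np.log(proportions)
    return log_p - log_p.mean(axis=-1, keepdims=True)


def dirichlet_clr(counts, n_draws=128, prior=0.5, seed=0):
    """Return CLR values for Dirichlet draws of one sample's proportions.

    counts : 1-D sequence of read counts per feature (zeros allowed)
    prior  : pseudo-count added to every feature (assumed, illustrative value)
    """
    rng = np.random.default_rng(seed)
    alpha = np.asarray(counts, dtype=float) + prior
    # Each row is one plausible vector of proportions given the counts.
    draws = rng.dirichlet(alpha, size=n_draws)
    # Guard against a draw underflowing to exactly zero before taking logs.
    draws = np.clip(draws, np.finfo(float).tiny, None)
    return clr(draws)


# Hypothetical counts for four features, including one zero.
clr_draws = dirichlet_clr([1000, 100, 10, 0])

# Spread of the CLR value per feature across draws: the zero-count feature
# is far less certain than the abundant one.
spread = clr_draws.std(axis=0)
print(spread)
```

Working with the draws rather than a single point estimate keeps zero counts usable (the prior makes every proportion strictly positive) and makes the larger uncertainty at the low count margin explicit, while the log-ratio step means only ratios between features are compared.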