2442 Mathematical Models for Analyzing Genomic Data Sets: From Equations to Diagnosis

Sunday, February 21, 2010: 1:50 PM
Room 5A (San Diego Convention Center)
Knut Reinert, Free University of Berlin, Berlin, Germany
Rapid advances in genomics and proteomics have brought such methodologies into sharp focus within the hazard and risk assessment communities. As early as 2007, in its report “Toxicity Testing in the 21st Century: A Vision and a Strategy”, the US National Research Council stressed the growing importance of high-throughput methods such as functional genomics based on next-generation sequencing or proteomics techniques. Three years on, we can produce 10 gigabases of sequence in a couple of days for several thousand dollars, so “omics” techniques could clearly become a viable means for toxicity testing and risk assessment. With huge data sets and complicated bioinformatics analyses at hand, the focus now shifts to the question of how much we can trust the results of computational analyses of these massive data sets. In addition to modeling the uncertainty and measurement error in the input, we also have to deal with errors accumulating across the numerous processing steps. At present, reliable margins of uncertainty are rarely computed. In this talk we give a short introduction to some (simplified) analysis pipelines for high-throughput “omics” data in use today, and point out the effects that small changes in parameter settings can have on the outcome of an analysis, and potentially on the result of an associated risk assessment. The goal of the presentation is to make the audience aware of possible pitfalls in data analysis and to show the need for robust algorithms that are able to assess the accuracy of their results.
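The following sketch is not from the talk itself; it is a minimal, hypothetical illustration of the parameter-sensitivity point above. It builds a toy detection "pipeline" that flags simulated features (say, peptides or transcripts) as detected when the mean of their noisy replicate measurements exceeds a threshold parameter; all names, thresholds, and distributions are invented for illustration.

    import random

    random.seed(42)

    def simulate_feature(true_intensity, noise_sd=1.0, n_replicates=3):
        # Draw noisy replicate measurements for one feature.
        return [random.gauss(true_intensity, noise_sd) for _ in range(n_replicates)]

    def pipeline(true_intensities, threshold):
        # A naive detection step: call a feature "detected" when the
        # mean of its noisy replicates exceeds the threshold parameter.
        detected = 0
        for mu in true_intensities:
            replicates = simulate_feature(mu)
            if sum(replicates) / len(replicates) >= threshold:
                detected += 1
        return detected

    # 1000 features whose true intensities cluster near the decision
    # boundary, where real data sets tend to be densest and most fragile.
    true_intensities = [random.gauss(5.0, 0.5) for _ in range(1000)]

    # Rerun the identical pipeline with three nearby threshold settings.
    # Reseeding before each run keeps the measurement noise identical,
    # so any difference in counts is due to the parameter change alone.
    for threshold in (4.9, 5.0, 5.1):
        random.seed(0)
        n = pipeline(true_intensities, threshold)
        print("threshold = %.1f -> %d features detected" % (threshold, n))

Even a 0.1 shift in the threshold changes the detection count by several dozen features out of 1000, because so many features sit near the decision boundary. One way to attach the missing margins of uncertainty would be to bootstrap the replicates and report an interval for the count rather than a single number; the point of the sketch is simply that a point estimate hides how fragile the result is with respect to the parameter setting.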