Kevin Coombes is a Professor in the Department of
Bioinformatics and Computational Biology at the UT M.D. Anderson Cancer
Center. He received his Ph.D. in pure
mathematics in 1982 from the University of Chicago and worked for many years
in the areas of algebraic K-theory and arithmetic algebraic geometry (while
rising through the academic ranks at MIT, University of Michigan, and
University of Maryland in College Park). In the mid-1990s, he shifted his
research interests to bioinformatics. He
received awards for “Best Presentation” at the 2001 and 2002 CAMDA (Critical
Assessment of Microarray Data Analysis) conferences, and for “Best Abstract” at
the First Annual Proteomics Data Mining Conference. His current research focuses on statistical,
mathematical, and computational methods to process, analyze, and understand
highly multivariate biological data arising from high-throughput technologies.
He is particularly interested in (1) methods that incorporate existing
biological knowledge early in the analytical process and (2) methods that
integrate diverse types of biological data with a view toward predicting
clinically relevant patient outcomes. With his collaborator, Keith Baggerly, he
is known for his work in “forensic bioinformatics”, which is an effort to
understand and uncover the (often poorly described) methods that were actually
used to analyze large data sets.
“Cell Lines, Chemotherapy Response, and the Need for Reproducible Research”
In November 2006, researchers at Duke University published the first in a series
of papers that claimed that (1) microarray and drug response data from cancer
cell lines could be used to develop genomic signatures of response to specific
chemotherapies, and (2) these signatures successfully predicted patient
responses. Duke later began running clinical trials based on this work. We
attempted to reproduce their analyses on publicly available data, and were
unsuccessful. We identified a series of
errors in data provenance and analysis, which we published as letters to
editors and as an article in a statistical journal. As a result of our efforts, four clinical
trials have been terminated and at least eight papers have been retracted. In this talk, I will describe some of the
errors we found, some of the “forensic bioinformatics” methods used
to discover them, and the efforts that are underway nationally to develop new
tools that promote and enable “reproducible research”.