1 Ransohoff omics 11 9


Major challenges in clinical medicine: An overview for basic scientists November 9, 2004 “Frontiers of ‘-omics’ research” : 

Major challenges in clinical medicine: An overview for basic scientists November 9, 2004 “Frontiers of ‘-omics’ research” David F. Ransohoff, MD Depts. of Medicine and Epidemiology Director, K30 training program

What is “-omics”?: 

What is “-omics”? Definitions • pro·te·ome: The complete set of proteins that can be expressed by the genetic material of an organism. • ge·nome: The total ... content ... an organism's genetic material. Other terms bioinformatics, discovery-based, high-throughput... Comment ‘-omics’ is broad... ‘studying everything at once’... ... on the one hand, attractive; on the other...

Theme: 

Theme Opportunity a. knowledge about biology - molecular biology - biochemistry - etc. b. technology - PCR, microarrays (for DNA, RNA) - mass spec (for proteins, peptides) - etc.

Theme: 

Theme Opportunity Danger - doing a large number of measurements does not necessarily lead to reproducible results or useful knowledge

Theme: 

Theme Opportunity Danger Opportunity Research that considers and addresses ‘danger’ may be successful (i.e., reproducible results; useful knowledge). UNC-CH well-positioned to explore ‘opportunity’.

New York Times, 2.3.04: 

New York Times, 2.3.04

“New cancer test stirs hope and concern” Pollack A. New York Times, Feb. 3, 2004: 

“New cancer test stirs hope and concern” Pollack A. New York Times, Feb. 3, 2004 "I've been in cancer research for 40 years and I think it's the most important breakthrough in those years," said Dr. John S. Kovach, director of... Correlogic Systems... Quest Diagnostics and LabCorp... say they expect to begin offering the test in the next few months.

Are there many ‘-omics’ fields with promise... and (possible) disappointment?: 

Are there many ‘-omics’ fields with promise... and (possible) disappointment? genomics proteomics transcriptomics metabolomics epigenomics ribonomics etc. Point: a) opportunity... but danger b) need to ‘explore’ fields efficiently: who should do this, and how...

Organization: 

Organization Current results, expectations, challenges 1) RNA expression arrays: prognosis BrCa 2) serum proteomics: diagnosis, ovarian Ca 3) genomics and proteomics, CRC Lessons/opportunity about conducting ‘omics’ research 1) future challenges 2) role of basic scientists 3) why UNC is well-positioned

Editorialists and reviewers interpret results as “definitive”: 

Editorialists and reviewers interpret results as “definitive” for clinical practice “... gene-expression patterns of primary tumours are better than available clinicopathological methods for determining the prognosis of individual patients.6,10,11” Ramaswamy and Perou, Lancet 2003;361:1576-7 for biological research “... compelling evidence... genetic program of a cancer cell at diagnosis defines its biologic behavior many years later, refuting a competing hypothesis....” Wooster and Weber, NEJM 2003;348:2339-47

Validity (i.e., reproducibility) of results can be compromised by ‘overfitting’: 

Validity (i.e., reproducibility) of results can be compromised by ‘overfitting’ • Definition: In multivariable predictive models, overfitting occurs when a large number of predictor variables is fit to a small number (N) of subjects. A model may ‘fit’ well or perfectly, even if there is no real relationship. Simon, JNCI 2003 • Consequence: results are not reproducible in a new set of data
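The definition above can be illustrated with a small simulation. This is a minimal sketch, not a method from the talk: the sample sizes, the number of markers, and the "keep the best marker" rule are all assumptions chosen for illustration. Every marker is pure noise, yet the one selected on a small training set looks predictive there and performs at about chance in new subjects.

```python
import random


def accuracy(predictions, labels):
    """Fraction of subjects whose predicted class matches the true class."""
    return sum(p == y for p, y in zip(predictions, labels)) / len(labels)


def simulate(n_train=20, n_valid=200, n_features=2000, seed=0):
    """Pick the best of many pure-noise markers on a small training set.

    All sizes are hypothetical; they only mimic 'many predictors, few subjects'.
    Returns (training accuracy, validation accuracy) of the selected marker.
    """
    rng = random.Random(seed)
    n_total = n_train + n_valid
    # Outcome labels and marker values are independent coin flips,
    # so no marker has any real relationship with the outcome.
    labels = [rng.randint(0, 1) for _ in range(n_total)]
    markers = [[rng.randint(0, 1) for _ in range(n_total)]
               for _ in range(n_features)]

    train_labels, valid_labels = labels[:n_train], labels[n_train:]
    # "Model fitting": keep the marker that agrees best with the training labels.
    best = max(markers, key=lambda m: accuracy(m[:n_train], train_labels))
    return (accuracy(best[:n_train], train_labels),
            accuracy(best[n_train:], valid_labels))


if __name__ == "__main__":
    train_acc, valid_acc = simulate()
    print(f"training accuracy:   {train_acc:.2f}")
    print(f"validation accuracy: {valid_acc:.2f}")
```

The gap between the two numbers is the signature of overfitting: the apparent fit comes from searching many candidates, not from any real relationship.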

Enrico Fermi cautions Freeman Dyson about ‘overfitting’ in model-making: 

Enrico Fermi cautions Freeman Dyson about ‘overfitting’ in model-making Fermi: "How many arbitrary parameters did you use for your calculations?" Dyson: I thought for a moment about our cut-off procedures and said, "Four." Fermi: "I remember my friend Johnny von Neumann used to say, with four parameters I can fit an elephant, and with five I can make him wiggle his trunk." Dyson, F. A meeting with Enrico Fermi; how one intuitive physicist rescued a team from fruitless research. Nature 2004.

Validity (i.e., reproducibility) of results can be compromised by ‘overfitting’: 

Validity (i.e., reproducibility) of results can be compromised by ‘overfitting’ • Method to assess: test reproducibility of model in independent validation set (done in <10% of reports; Ntzani, Lancet 2003)

To check for overfitting, assess reproducibility in independent sample Ransohoff. Nat Rev Cancer 2004 : 

To check for overfitting, assess reproducibility in independent sample Ransohoff. Nat Rev Cancer 2004

Was independent validation done in NEJM 2002? : 

Was independent validation done in NEJM 2002? • 61 of 295 subjects in the validation set (NEJM 2002) came from the training set (Nature 2002). • Independent prediction was not demonstrated; therefore these results may not be strongly reproducible and should not be interpreted as ‘definitive.’
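A check of this kind can be sketched in a few lines. The subject IDs and the training-set size of 100 below are hypothetical, made up only so that the overlap reproduces the 61-of-295 figure on the slide; the function name `validation_overlap` is likewise an illustrative assumption.

```python
def validation_overlap(training_ids, validation_ids):
    """Report how many validation subjects were also used for training."""
    training = set(training_ids)
    validation = set(validation_ids)
    shared = training & validation
    return {
        "shared": len(shared),
        "independent": len(validation - training),
        "fraction_shared": len(shared) / len(validation),
    }


# Hypothetical IDs: 100 training subjects (made-up size) and 295 validation
# subjects, arranged so that IDs 39..99 (61 subjects) appear in both sets.
training_ids = range(0, 100)
validation_ids = range(39, 334)

report = validation_overlap(training_ids, validation_ids)
print(report["shared"], "of", len(validation_ids), "validation subjects were in the training set")
print(f"fraction shared: {report['fraction_shared']:.1%}")
```

With these numbers, about one in five validation subjects had already been seen during model building, so only the remaining 234 could provide an independent test.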

Not an isolated example; in another, independent validation was not done at all.: 

Not an isolated example; in another, independent validation was not done at all.

Was independent validation done? : 

Was independent validation done? What was written in abstract: “Initial external validation came from similarly accurate predictions of nodal status of a small sample in a distinct population.”

Was independent validation done? : 

Was independent validation done? What was written in methods/results: 13 patients were referred to; 3 lines of text about 6 Comment: If I had reviewed this article before publication, I’d say: ‘External validation has not been done; results may be due (are likely to be due, in my view) to overfitting; do not accept for publication.’ If chance likely explains results, a study should not be published. Publication is a problem in process: design, writing, review, editing.

So, are RNA-expression array results for BrCa prognosis reproducible?: 

So, are RNA-expression array results for BrCa prognosis reproducible? Reproducibility has not been demonstrated for BrCa. Reproducibility has been demonstrated for lymphoma and AML. But hematologic cancers may be different, with ‘robust’ RNA expression, because they are homogeneous, clonal. (Nature, 2000) A prediction: Current results about prognosis for BrCa will not be reproducible...

Organization: 

Organization Current results, expectations, challenges 1) RNA expression arrays: prognosis BrCa 2) serum proteomics: diagnosis, ovarian Ca 3) genomics and proteomics, CRC Lessons/opportunity about conducting ‘omics’ research 1) future challenges 2) role of basic scientists 3) why UNC is well-positioned

Serum proteomics Claims to detect cancer: extraordinary: 

Serum proteomics Claims to detect cancer: extraordinary • claims for multiple cancers (ovary, prostate, breast) -sensitivity: 95-100% -specificity: 95-100% • claims appear in Lancet, JNCI, WSJ, NBC, PBS, Redbook, NIH Director’s award, etc. • Correlogic - patent filing for ‘pattern recognition’ process • LabCorp - license to use patented process in lab test

Proteomics Petricoin, Lancet 2.02: 

Proteomics Petricoin, Lancet 2.02 “The discriminatory pattern correctly identified all 50 ovarian cancer cases in the masked set, including all 18 stage I cases. Of the 66 cases of non-malignant disease, 63 were recognised as not cancer. This result yielded a sensitivity of 100%… specificity of 95%…”
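The arithmetic behind the quoted figures can be written out as a short sketch. The function names are illustrative; the counts are the ones quoted above (50 of 50 cancers identified, 63 of 66 non-malignant cases recognised as not cancer).

```python
def sensitivity(true_positives, false_negatives):
    """Proportion of diseased subjects that the test calls positive."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Proportion of non-diseased subjects that the test calls negative."""
    return true_negatives / (true_negatives + false_positives)


# Counts from the masked set quoted on the slide.
sens = sensitivity(true_positives=50, false_negatives=0)
spec = specificity(true_negatives=63, false_positives=66 - 63)

print(f"sensitivity: {sens:.0%}")  # 100%
print(f"specificity: {spec:.0%}")  # 95%
```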

Big Picture: A non-invasive blood test is the ‘Holy Grail’ in cancer screening: 

Big Picture: A non-invasive blood test is the ‘Holy Grail’ in cancer screening The ovarian cancer study provides a major assertion that the Holy Grail has been located. Scientific community’s reaction: skeptical, in part because the technique is a ‘black box.’ So, can a black box work, and be shown to work? Yes, if results are reproducible in a totally independent group. Further, a black box has an advantage: it can provide novel insights. However, before strong claims can be made, chance (overfitting) and bias must be addressed.

Are some serum proteomics results for ovarian cancer explained by bias?: 

Are some serum proteomics results for ovarian cancer explained by bias? (from a news report (Nature 2004) on Keith Baggerly's analysis) ‘Discrimination’ that occurs is due to non-biological ‘signal’ • Small (non-biol.) m/z peaks account for discrimination. • ‘Signal’/discrimination is related to ‘run order’, i.e., cancers and non-cancers are run on different days/chips; the mass spec. machine ‘drifts’ over time, so a non-biologic ‘signal’ associated with Ca vs no-Ca is hard-wired into the results and is not removed by ‘splitting’ the sample into ‘training’ and ‘validation’ sets.
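Why splitting does not remove this kind of bias can be shown with a toy simulation. This is a sketch under stated assumptions, not a reconstruction of the analysis described above: a single peak, a fixed drift between two batches, Gaussian noise, and a simple threshold classifier are all hypothetical choices. Diagnosis has no effect on the measured intensity in either design.

```python
import random
from statistics import mean


def run_experiment(confounded, n_per_group=200, drift=4.0, seed=0):
    """Validation accuracy of a threshold rule when the peak has no biology.

    confounded=True:  all cancers run in batch 1, all controls in batch 0.
    confounded=False: batch assigned by coin flip, regardless of diagnosis.
    """
    rng = random.Random(seed)
    samples = []
    for is_cancer in (0, 1):
        for _ in range(n_per_group):
            batch = is_cancer if confounded else rng.randint(0, 1)
            # Intensity is noise plus instrument drift; diagnosis adds nothing.
            intensity = rng.gauss(0.0, 1.0) + drift * batch
            samples.append((intensity, is_cancer))

    # Random split into 'training' and 'validation' halves.
    rng.shuffle(samples)
    half = len(samples) // 2
    train, valid = samples[:half], samples[half:]

    # Fit: threshold midway between the two group means in the training half.
    cancer_mean = mean(x for x, y in train if y == 1)
    control_mean = mean(x for x, y in train if y == 0)
    threshold = (cancer_mean + control_mean) / 2
    cancer_is_higher = cancer_mean > control_mean

    # Evaluate on the validation half.
    correct = sum(((x > threshold) == cancer_is_higher) == (y == 1)
                  for x, y in valid)
    return correct / len(valid)


if __name__ == "__main__":
    print(f"run order confounded with diagnosis: {run_experiment(True):.2f}")
    print(f"run order randomized:                {run_experiment(False):.2f}")
```

In the confounded design the validation half inherits the same drift as the training half, so the split "confirms" a signal that is purely an artifact; randomizing run order brings accuracy back to about chance.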

Prediction: Discovery-based pattern-recognition serum proteomics to diagnose cancer, in 10 yrs, will...: 

Prediction: Discovery-based pattern-recognition serum proteomics to diagnose cancer, in 10 yrs, will... • show no discrimination (or minimal), for any cancer. i.e., nowhere near the 95% sensitivity, 95% specificity currently claimed

Why this prediction? : 

Why this prediction? 1. Biology looks implausible. 2. ‘Scientific process’ is weak, i.e., the direct assessment of whether it does work. Studies are designed, written, reviewed, and published with little attention to bias and overfitting in Methods, Results, Discussion. The single most important need of this field (discovery-based ‘-omics’): strengthen scientific process.

This prediction could be wrong.: 

This prediction could be wrong. I really really hope it’s wrong. I’m doing research that could show it’s wrong. Good evidence may exist that discovery-based pattern-recognition serum proteomics ‘works.’ But published evidence is based on weak ‘scientific process’ and cannot support claims and expectations. We almost seem to be assuming ‘it works.’

Organization: 

Organization Current results, expectations, challenges 1) RNA expression arrays: prognosis BrCa 2) serum proteomics: diagnosis, ovarian Ca 3) genomics and proteomics, CRC Lessons/opportunity about conducting ‘omics’ research 1) future challenges 2) role of basic scientists 3) why UNC is well-positioned

Avoiding chance and bias (I hope) in my own research: Serum proteomics to detect CRC : 

Avoiding chance and bias (I hope) in my own research: Serum proteomics to detect CRC Approach: assess discovery-based pattern-recognition serum proteomics to diagnose CRC Method • use uniformly-handled blinded specimen sets (avoid bias) • split into ‘training’ and ‘validation’ sets (avoid chance/overfitting) Results • in progress

Status report: 3 projects to assess: Serum proteomics to screen for CRC: 

Status report: 3 projects to assess: Serum proteomics to screen for CRC

Avoiding chance and bias (I hope) in my own research: Genomics of stool DNA to detect CRC: 

Avoiding chance and bias (I hope) in my own research: Genomics of stool DNA to detect CRC Approach: assess hypothesis-based stool DNA assay to diagnose CRC Method • use uniformly-handled blinded specimen sets (avoid bias) • (is hypothesis-based, so avoids chance/overfitting)

‘Genetic model’ provides rationale for using molecular markers to diagnose CRC: 

‘Genetic model’ provides rationale for using molecular markers to diagnose CRC [Figure: genetic model of colorectal tumorigenesis, modified from Fearon and Vogelstein, Cell 1990;61:759-767. Labels: Other genetic alterations? (e.g., TGF-β type II receptor); Altered DNA methylation]

Genomics - Stool DNA to detect CRC: 

Genomics - Stool DNA to detect CRC Methods • prospective, blinded design; pre-specified analysis • 5486 subjects • receive colonoscopy and stool DNA test (APC, p53, BAT, ‘long DNA’) Results: 52% sensitivity, 95% specificity (in press 2004)

Organization: 

Organization Current results, expectations, challenges 1) RNA expression arrays: prognosis BrCa 2) serum proteomics: diagnosis, ovarian Ca 3) genomics and proteomics, CRC Lessons/opportunity about conducting ‘omics’ research 1) future challenges 2) role of basic scientists 3) why UNC is well-positioned

“Validation”: 

“Validation” 1. Definitions are so “diverse” that they are confusing. 2. But main concepts • are simple • can be usefully applied, right now The concepts are from the field of observational epidemiology (where non-experimental methods are used) and are not well understood by basic scientists, who use experiments.

Nat Rev Cancer 2004;4:309-14: 

Nat Rev Cancer 2004;4:309-14

Validation: Main concepts are simple: 

Validation: Main concepts are simple a. Chance Does chance explain ‘discrimination’? b. Bias Does bias explain ‘discrimination’? c. Generalizability Does discrimination occur in clinically useful groups? (focus of Sullivan-Pepe, JNCI 2001) Every study should explicitly address (a) and (b), • or study is not worth publishing • and (c) is not worth asking.

Past: Validation of cancer markers is ‘disappointing’ (not reproducible): 

Past: Validation of cancer markers is ‘disappointing’ (not reproducible) 1. Non-invasive markers: Holy Grail of cancer diagnosis - CEA, CA19-9, CA125 - MRI of blood 2. CEA - initial report (PNAS): ~100% sensitivity, specificity for CRC - high expectations - disappointment when expensive ACS/CCS study did not reproduce initial results Disappointment could have been predicted and avoided if ‘rules of evidence’ had been available.

Past: Validation of cancer markers is ‘disappointing’ (not reproducible): 

Past: Validation of cancer markers is ‘disappointing’ (not reproducible) 3. Lessons from CEA, for field of clinical epi/HSR - led to methodology to evaluate diagnostic tests -- rules of evidence -- concepts of bias, validation, ‘spectrum’ (Ransohoff and Feinstein. NEJM 1978) Methodology is still underdeveloped in 2004. 4. Past problems due to ‘culture clash’ - fields of laboratory medicine, clinical epidemiology have different ways of thinking, methods, rules of evidence. ‘Culture clash’ continues to be a problem in 2004.

Present: Cancer markers are promising: 

Present: Cancer markers are promising Knowledge of molecular biology provides targets to measure - past: knew little about what to target - now: know DNA ‘path’ from normal.. adenoma.. CRC Assays to measure targets - past: ‘one dimensional’ assays, like CEA, FOBT, PSA - now: multi-dimensional assays (measure almost any target) -DNA - primers and probes; amplify signal -protein - mass spectrometry

Present: Cancer markers are promising: 

Present: Cancer markers are promising - But Mother Nature closely guards (her) secrets. - New reductionist methods mean more data, but not necessarily more knowledge. - Rules of evidence: not changed. - Our job: -- to efficiently explore new technologies/fields -- to avoid predictable mistakes, inflated expectations. - Exploration must be interdisciplinary, translational -- molecular biology -- clinical epidemiology -- biostatistics

Organization: 

Organization Current results, expectations, challenges 1) RNA expression arrays: prognosis BrCa 2) serum proteomics: diagnosis, ovarian Ca 3) genomics and proteomics, CRC Lessons/opportunity about conducting ‘omics’ research 1) future challenges 2) role of basic scientists 3) why UNC is well-positioned: - technology - specimen sources - epidemiology expertise

References: 

References
• Developing molecular biomarkers for cancer. Science 2003;299:1679-80.
• Discovery-based research and fishing. Gastroenterology 2003;125:290.
• Rules of evidence for cancer molecular marker discovery and validation. Nature Reviews Cancer 2004;4:309-14.
• Evaluating discovery-based research: When biologic reasoning cannot work. Gastroenterology 2004;127:1028.
• Bias as a threat to validity of molecular marker research. Nature Reviews Cancer (in revision).