Power and sample size estimation in high dimensional biology

Academic Article


  • Genomic scientists often test thousands of hypotheses in a single experiment. One example is a microarray experiment that seeks to determine differential gene expression among experimental groups. Planning such experiments involves determining a sample size that will allow meaningful interpretations. Traditional power analysis methods may not be well suited to this task when thousands of hypotheses are tested in discovery-oriented basic research. We introduce the concept of expected discovery rate (EDR) and an approach that combines parametric mixture modelling with parametric bootstrapping to estimate the sample size needed for a desired accuracy of results. While the examples included are derived from microarray studies, the methods herein are 'extraparadigmatic' in their approach to study design and are applicable to most high-dimensional biological situations. Pilot data from three different microarray experiments are used to extrapolate EDR, as well as the related false discovery rate, at different sample sizes and thresholds. © Arnold 2004.
  • Digital Object Identifier (doi)

  • PubMed Id

  • 3994317
  • Author List

  • Gadbury GL; Page GP; Edwards J; Kayo T; Prolla TA; Weindruch R; Permana PA; Mountz JD; Allison DB
  • Start Page

  • 325
  • End Page

  • 338
  • Volume

  • 13
  • Issue

  • 4
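
The abstract describes extrapolating EDR and the related false discovery rate across sample sizes and thresholds. The sketch below illustrates that general idea only and is not the authors' procedure: it replaces their mixture model fitted to pilot data with a hypothetical two-component setup in which the proportion of null genes (`prop_null`) and a common standardized effect size (`effect_size`) are simply assumed, and it uses a two-group z-test with known unit variance. It also takes EDR to mean the expected proportion of real effects that are declared significant, which is an assumption here because the abstract does not define the term. All function and parameter names are illustrative.

```python
import random
from statistics import NormalDist


def simulate_edr_fdr(n_per_group, threshold, n_genes=2000, prop_null=0.8,
                     effect_size=1.0, n_sims=20, seed=0):
    """Monte Carlo estimate of (EDR, FDR) for a simplified two-group design.

    Assumptions (hypothetical, not taken from the article):
      * a fraction `prop_null` of genes has no real effect;
      * every other gene has the same standardized effect `effect_size`;
      * each gene is tested with a two-sided z-test, unit variance known.
    """
    rng = random.Random(seed)
    # Two-sided critical value for the chosen p-value threshold.
    z_crit = NormalDist().inv_cdf(1 - threshold / 2)
    # Mean of the z statistic for a gene with a real effect.
    shift = effect_size * (n_per_group / 2) ** 0.5
    n_null = int(n_genes * prop_null)
    n_real_total = (n_genes - n_null) * n_sims

    true_hits = 0   # real effect, declared significant
    false_hits = 0  # no real effect, declared significant
    for _ in range(n_sims):
        for gene in range(n_genes):
            is_real = gene >= n_null
            z = rng.gauss(shift if is_real else 0.0, 1.0)
            if abs(z) > z_crit:
                if is_real:
                    true_hits += 1
                else:
                    false_hits += 1

    declared = true_hits + false_hits
    edr = true_hits / n_real_total
    fdr = false_hits / declared if declared else 0.0
    return edr, fdr


if __name__ == "__main__":
    # Tabulate EDR and FDR over a small grid of sample sizes and thresholds.
    for n in (5, 10, 20):
        for alpha in (0.05, 0.001):
            edr, fdr = simulate_edr_fdr(n, alpha)
            print(f"n={n:2d} alpha={alpha:<6} EDR={edr:.3f} FDR={fdr:.3f}")
```

Under these assumptions, a larger sample size raises EDR at a fixed threshold, while a stricter threshold lowers FDR at the cost of a lower EDR. A planner would pick the smallest sample size whose simulated EDR and FDR are both acceptable.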