Seminar announcement, Monday 03/03

Lecturers, students, and anyone interested are cordially invited to attend a presentation on

Multiple testing: False discovery rate control

by Dr. Nguyễn Văn Hạnh

  • Time: 9:30, Monday, 03/03/2014
  • Venue: Computer Room 6, 3rd floor, administration building, ĐHNN Hà Nội
  • Abstract: The problem of testing several null hypotheses simultaneously has a long history in the statistics literature. With the high-resolution techniques introduced in recent years, it has received renewed attention in many application fields where one aims to find significant features among several thousand (or several million) candidates. Classical examples are microarray analysis, neuroimaging analysis, and source detection. In a microarray analysis, the expression levels of a set of genes are measured under two different experimental conditions, and the aim is to find the genes that are differentially expressed between the two conditions. For example, when the genes come from tumor cells in the first experimental condition and from healthy cells in the second, the differentially expressed genes may be involved in the development of the tumor and are therefore of special interest. For gene i, one tests the hypothesis:

    H0(i): “gene i is not differentially expressed” against H1(i): “gene i is differentially expressed”

    The number of genes n can be large (several thousand). For instance, suppose we simultaneously test n = 10 000 null hypotheses, of which n0 = 8 000 are true nulls, each at level α = 0.05. This procedure makes on average n0 α = 400 false positives (type I errors), so it seems unsuitable: it is likely to select a lot of false positives. A multiple testing procedure aims to adjust the level of the individual tests a priori so that the “quantity” of false positives stays below a nominal level α. This “quantity” is measured by a global type I error rate, for instance the probability of making at least one type I error among all the hypotheses (family-wise error rate, FWER) or the expected proportion of false positives among all rejected hypotheses (false discovery rate, FDR).
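The numbers in the abstract can be explored with a small simulation. The sketch below is illustrative only and is not taken from the talk: it assumes that true nulls have uniform p-values and that non-null genes yield a one-sided z-test p-value with a hypothetical effect size of 3. It compares uncorrected testing with the Bonferroni correction (a standard way to control the FWER) and the Benjamini-Hochberg step-up procedure (a standard way to control the FDR for independent tests); the abstract names neither procedure, so both are stand-ins chosen for illustration. All function and parameter names are made up for this example.

```python
import random
from statistics import NormalDist


def bonferroni(pvalues, alpha):
    """Reject H0(i) when p_i <= alpha / n; this controls the FWER at level alpha."""
    n = len(pvalues)
    return [p <= alpha / n for p in pvalues]


def benjamini_hochberg(pvalues, alpha):
    """Step-up procedure: find the largest rank k with p_(k) <= k * alpha / n
    and reject the k hypotheses with the smallest p-values."""
    n = len(pvalues)
    order = sorted(range(n), key=lambda i: pvalues[i])
    k_max = 0
    for rank, i in enumerate(order, start=1):
        if pvalues[i] <= rank * alpha / n:
            k_max = rank
    rejected = [False] * n
    for i in order[:k_max]:
        rejected[i] = True
    return rejected


def simulate(n=10_000, n0=8_000, alpha=0.05, effect=3.0, seed=0):
    """Simulate one experiment; return {procedure: (rejections, false positives)}.

    Assumption (not from the abstract): true nulls have uniform p-values;
    non-null genes have a one-sided z-test p-value with z ~ N(effect, 1).
    """
    rng = random.Random(seed)
    std_normal = NormalDist()
    is_null = [i < n0 for i in range(n)]  # the first n0 hypotheses are true nulls
    pvalues = [
        rng.random() if null else 1.0 - std_normal.cdf(rng.gauss(effect, 1.0))
        for null in is_null
    ]
    procedures = {
        "uncorrected": [p <= alpha for p in pvalues],
        "bonferroni": bonferroni(pvalues, alpha),
        "benjamini_hochberg": benjamini_hochberg(pvalues, alpha),
    }
    summary = {}
    for name, rejected in procedures.items():
        total = sum(rejected)
        false_pos = sum(r and null for r, null in zip(rejected, is_null))
        summary[name] = (total, false_pos)
    return summary


if __name__ == "__main__":
    # n0 * alpha, the expected number of uncorrected false positives
    print("expected uncorrected false positives:", 8_000 * 0.05)
    for name, (total, false_pos) in simulate().items():
        fdp = false_pos / total if total else 0.0
        print(f"{name}: {total} rejections, {false_pos} false, FDP = {fdp:.3f}")
```

Running the script prints, for each procedure, the number of rejections, the number of false positives, and the realized false discovery proportion for one simulated data set. Changing `effect` or `seed` shows how the trade-off between false positives and detections depends on the assumed signal strength.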

Information Technology Seminar