High-speed Westfall-Young permutation procedure for genome-wide association studies

Authors:
Aika Terada (The Univ. of Tokyo, BRD, AIST), Hanyoung Kim (Tokyo Inst. of Technol.), Jun Sese (BRD, AIST)

BCB '15: Proceedings of the 6th ACM Conference on Bioinformatics, Computational Biology and Health Informatics, September 2015, Pages 17–26, https://doi.org/10.1145/2808719.2808721

Published: 09 September 2015

ABSTRACT

Genome-wide association studies (GWASs) are widely used to investigate statistically significant associations between diseases and single nucleotide polymorphisms (SNPs) in order to identify causal factors of diseases. Recent GWASs have assessed the statistical significance of more than one million SNPs, but in many cases no associations are found because conservative multiple testing corrections, such as the Bonferroni correction, are applied. More sensitive methods, such as the Westfall-Young permutation procedure (WY), would relate more SNPs to diseases, but the extremely long computation time of WY has prevented its application to GWASs. We introduce an algorithm that accelerates WY, named the High-speed Westfall-Young permutation procedure (HWY). HWY uses three techniques to make WY computationally practical. First, P-value calculations for SNPs that cannot affect the adjusted significance level are pruned. Second, a lookup table of P-values is used to avoid frequent duplicate calculations. Finally, computations are parallelized using a GPGPU. HWY was 619 times faster than WY and more than 122 times faster than PLINK, a widely used GWAS software package, and analyzed a dataset containing one million SNPs and one thousand individuals in approximately two hours. Re-analysis of existing GWAS datasets with HWY may uncover additional hidden SNP-trait associations.
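The abstract does not give implementation details, so the following is only a minimal sketch of the general idea behind the permutation procedure and the lookup table, not the authors' implementation. It assumes binary genotypes (carrier or non-carrier), a binary case/control phenotype, and a one-sided Fisher exact test; the function names `fisher_p`, `wy_min_p_distribution`, and `adjusted_p` are hypothetical. Under these assumptions a P-value depends only on the pair (number of carriers, number of carriers among cases), so the same values recur across permutations and can be cached. Pruning and GPU parallelization are not shown.

```python
import bisect
import random
from math import comb


def fisher_p(x, a, n_cases, n_total):
    """One-sided Fisher exact P-value P(A >= a) for a SNP with x carriers."""
    upper = min(x, n_cases)
    tail = sum(comb(n_cases, k) * comb(n_total - n_cases, x - k)
               for k in range(a, upper + 1))
    return tail / comb(n_total, x)


def wy_min_p_distribution(genotypes, labels, n_perm=1000, seed=0):
    """Return the sorted minimum P-values over label permutations.

    genotypes: list of SNPs, each a list of 0/1 carrier flags per individual.
    labels:    list of 0/1 phenotype flags (1 = case).
    Also returns the lookup table so its size can be inspected.
    """
    rng = random.Random(seed)
    n_total = len(labels)
    n_cases = sum(labels)
    supports = [sum(g) for g in genotypes]  # carriers per SNP, fixed
    table = {}  # lookup table (assumed keying): (x, a) -> P-value

    def p_value(x, a):
        key = (x, a)
        if key not in table:
            table[key] = fisher_p(x, a, n_cases, n_total)
        return table[key]

    perm = list(labels)
    min_ps = []
    for _ in range(n_perm):
        rng.shuffle(perm)  # permute phenotype labels; genotypes stay fixed
        min_p = 1.0
        for g, x in zip(genotypes, supports):
            a = sum(gi & yi for gi, yi in zip(g, perm))  # carriers among cases
            min_p = min(min_p, p_value(x, a))
        min_ps.append(min_p)
    min_ps.sort()
    return min_ps, table


def adjusted_p(p_obs, min_ps):
    """Single-step min-P adjusted P-value: share of permutations whose
    minimum P-value is at most the observed P-value."""
    return bisect.bisect_right(min_ps, p_obs) / len(min_ps)
```

In this sketch a SNP would be reported when `adjusted_p` of its observed P-value is at most the chosen family-wise error rate. The table grows with the number of distinct (carriers, case carriers) pairs rather than with the number of SNPs times the number of permutations, which is why caching can pay off.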

Published in

BCB '15: Proceedings of the 6th ACM Conference on Bioinformatics, Computational Biology and Health Informatics
September 2015
683 pages
ISBN: 9781450338530
DOI: 10.1145/2808719

Copyright © 2015 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
genome-wide association study
multiple testing procedure
westfall-young permutation procedure
Qualifiers
- research-article
Acceptance Rates
BCB '15 paper acceptance rate: 48 of 141 submissions, 34%
Overall acceptance rate: 254 of 885 submissions, 29%