A pruning strategy of reference panels for fast SNP genotype imputation

Erkhembayar Jadamba, Miyoung Shin, Myungguen Chung, Kiejung Park

Research output: Contribution to journal › Article › peer-review

Abstract

In recent genome-wide association studies, imputing genotypes for missing SNPs is a common procedure for increasing the power of the observed genetic markers. For genotype imputation, researchers usually employ publicly available resources, such as the International HapMap Project data or the 1000 Genomes Project data, as a reference panel. However, the volume of publicly available resources has been increasing rapidly with the maturation of high-throughput genotyping technology. Learning from large reference panels therefore often requires heavy computation, leading to long imputation times. To address this problem, we propose a pruning strategy for constructing imputation reference panels: the size of the reference panel is reduced by excluding (pruning) somewhat redundant samples, based on estimates of the kinship coefficients between samples. For evaluation, the approach was implemented within the Beagle framework and tested on two real datasets, Mao et al.'s prostate cancer data and KNIH's diabetes data. Our experimental results show that the proposed pruning strategy for reference panel construction can shorten imputation time without loss of imputation accuracy.
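The pruning idea described in the abstract can be illustrated with a minimal sketch. The abstract does not state which kinship estimator, threshold, or selection order the authors used, so everything below is an assumption made for illustration: genotypes are coded as 0/1/2 allele counts, kinship is estimated as half of a standardized genotype correlation, and samples are kept greedily if their estimated kinship with every previously kept sample falls below a hypothetical `threshold`. The function names are likewise hypothetical and are not part of Beagle.

```python
from typing import List, Sequence


def allele_frequencies(genotypes: Sequence[Sequence[int]]) -> List[float]:
    """Estimate the frequency of the counted allele at each SNP."""
    n_samples = len(genotypes)
    n_snps = len(genotypes[0])
    return [
        sum(sample[k] for sample in genotypes) / (2.0 * n_samples)
        for k in range(n_snps)
    ]


def kinship(a: Sequence[int], b: Sequence[int], freqs: Sequence[float]) -> float:
    """Assumed estimator: half the mean standardized genotype product.

    This is one common moment-based estimate, not necessarily the one
    used in the paper.
    """
    total = 0.0
    used = 0
    for x, y, p in zip(a, b, freqs):
        if p <= 0.0 or p >= 1.0:
            continue  # skip monomorphic SNPs, which carry no information
        total += (x - 2.0 * p) * (y - 2.0 * p) / (2.0 * p * (1.0 - p))
        used += 1
    return 0.5 * total / used if used else 0.0


def prune_panel(genotypes: Sequence[Sequence[int]], threshold: float = 0.2) -> List[int]:
    """Return indices of samples kept after greedy kinship-based pruning.

    The threshold value is a hypothetical default chosen for illustration.
    """
    freqs = allele_frequencies(genotypes)
    kept: List[int] = []
    for i, sample in enumerate(genotypes):
        # Keep the sample only if it is not closely related to any kept sample.
        if all(kinship(sample, genotypes[j], freqs) < threshold for j in kept):
            kept.append(i)
    return kept
```

Calling `prune_panel` on a small panel that contains a duplicated sample drops the duplicate and keeps the others. The reduced panel would then be passed to the imputation tool in place of the full one.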

Original language: English
Pages (from-to): 6-10
Number of pages: 5
Journal: Biochip Journal
Volume: 7
Issue number: 1
State: Published - Mar 2013

Keywords

  • Kinship coefficient
  • Reference panel pruning
  • SNP imputation
