Uppsala University Department of Information Technology
February 2002
Abstract: The advent of sophisticated and powerful methods for molecular genetics increases the need for efficient methods for data analysis. Advanced algorithms are necessary for extracting all possible information from laboriously obtained data sets. We present a general linear algebra framework for QTL mapping, applicable to many commonly used methods, using both linear regression and maximum likelihood estimation. The formulation simplifies future comparisons between, and analyses of, the methods. We show how the common structure of QTL analysis models can be used to improve the kernel algorithms, drastically reducing the computational effort while retaining the original analysis results. We have evaluated our new algorithms on data sets originating from two large F2 populations of domestic animals. Using an updating approach, we show that a reduction of 1-3 orders of magnitude in computational demand can be achieved for matrix factorizations. For interval mapping/composite interval mapping settings using a maximum likelihood model, we also show how to use the original EM algorithm instead of the ECM approximation, significantly improving the convergence and introducing an additional reduction in the computational time. The algorithmic improvements make it feasible to perform analyses previously deemed impractical or even impossible. For example, using the new algorithms it is reasonable to perform permutation testing using exhaustive search on populations of 200 individuals for fully epistatic two-QTL models with a large number of parameters.
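The idea of exploiting common structure across QTL models can be illustrated with a minimal sketch. It is not the report's algorithm, only one way such reuse might look for the linear regression case, assuming the design matrix at each tested position splits into a fixed block A (the same for all positions) and a position-dependent block B. The function names, the split into A and B, and the use of NumPy are all assumptions made for illustration. The fixed block is factorized once, and each position then only requires work on its own few columns.

```python
import numpy as np

def rss_scan_naive(A, B_list, y):
    """Residual sum of squares per position, refitting [A | B] from scratch."""
    out = []
    for B in B_list:
        X = np.hstack([A, B])
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)
        r = y - X @ beta
        out.append(float(r @ r))
    return out

def rss_scan_updated(A, B_list, y):
    """Same quantity, reusing one QR factorization of the fixed block A.

    Assumes [A | B] has full column rank at every position (illustrative only).
    """
    Q, _ = np.linalg.qr(A)                 # factorize the fixed columns once
    y_perp = y - Q @ (Q.T @ y)             # part of y orthogonal to span(A)
    base = float(y_perp @ y_perp)          # RSS of the model with A alone
    out = []
    for B in B_list:
        B_perp = B - Q @ (Q.T @ B)         # remove span(A) from the new columns
        Qb, _ = np.linalg.qr(B_perp)       # small factorization per position
        c = Qb.T @ y_perp
        out.append(base - float(c @ c))    # RSS drop explained by B given A
    return out
```

Both functions return the same residual sums of squares up to rounding; the second avoids refactorizing the fixed columns at every position, which is where the savings would come from when A is wide and many positions are scanned.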