About 30 years ago, I became a member of the department of biostatistics of the University of North Carolina at Chapel Hill. There I joined an interdepartmental programme in genetics and spent my time developing statistical methods to locate genes on chromosomes, a first step in what is now known as "positional cloning".
In 1971, John Stewart and I published what came to be called the Elston-Stewart algorithm for linkage analysis - a statistical technique to determine from pedigree data whether a disease gene that is segregating in a pedigree is close to a marker on a particular chromosome. To be useful, a marker must be polymorphic, with different forms from person to person, so that one can trace its inheritance through a pedigree.
In those days we had about 30 markers, many of which were blood groups. The beauty of the algorithm that John and I devised was that the amount of computation it requires to link a disease gene to a marker increases only linearly with the size of the pedigree. Since 1980 many more markers - identified now at the DNA level - have become known, and "multipoint" linkage analysis is performed, looking at several markers on a chromosome simultaneously to pinpoint the location of a disease gene.
Unfortunately, when multipoint linkage analysis is performed using the Elston-Stewart algorithm, the amount of computation increases exponentially with the number of markers. Because of this, the computation becomes unfeasible for more than about three markers at a time.
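A rough way to see why the marker count is the bottleneck is to count states. The sketch below is a toy cost model, not the published Elston-Stewart algorithm. It assumes three possible genotypes per marker and a cost per person proportional to the number of (father, mother, child) multilocus genotype combinations. The function names and the cost formula are illustrative assumptions.

```python
# Toy cost model for pedigree likelihood calculation.
# Assumptions (not from the article): 3 genotypes per marker, and a cost
# per person proportional to the number of (father, mother, child)
# multilocus genotype combinations.

def joint_genotypes(num_markers: int, genotypes_per_marker: int = 3) -> int:
    """Number of multilocus genotypes one person could carry."""
    return genotypes_per_marker ** num_markers


def toy_peeling_cost(num_people: int, num_markers: int,
                     genotypes_per_marker: int = 3) -> int:
    """Rough work estimate: linear in people, exponential in markers."""
    states = joint_genotypes(num_markers, genotypes_per_marker)
    return num_people * states ** 3


if __name__ == "__main__":
    # Doubling the pedigree doubles the work; adding markers multiplies it.
    for markers in range(1, 5):
        print(markers, toy_peeling_cost(20, markers))
```

Under these assumptions each additional marker multiplies the work by a constant factor (27 here), whereas each additional person only adds a fixed amount.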
The next stage was the publication in 1987 of the Lander-Green algorithm for linkage analysis. With this algorithm, the computing time required for multipoint linkage analysis increases only linearly with the number of markers in the analysis, but exponentially with the number of persons in the pedigree. Both algorithms have been extended and improved. But until very recently the only way to perform a multipoint linkage analysis on a pedigree of more than a dozen or so members has been by using a simulation technique. In this technique, the transmission of chromosomes through a particular pedigree is simulated many times and the average results of these stochastic simulations approximate what an exact calculation, were it feasible, would yield.
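The simulation idea can be illustrated on the smallest possible case. The sketch below is not one of the linkage programs described here. It assumes a single locus and a nuclear family of two parents and two children, labels the four parental alleles 0 to 3, simulates their transmission many times, and averages the proportion of alleles the two siblings share by descent. The function name and the seed are illustrative assumptions. For this toy case the exact answer is known to be 0.5, so the average can be checked against it.

```python
import random


def simulate_sib_sharing(num_trials: int, seed: int = 0) -> float:
    """Monte Carlo estimate of the mean proportion of alleles that two
    full siblings share identical by descent at one locus."""
    rng = random.Random(seed)  # fixed seed so runs are reproducible
    total = 0.0
    for _ in range(num_trials):
        # Father carries alleles 0 and 1, mother carries alleles 2 and 3.
        # Each child receives one allele at random from each parent.
        child_a = (rng.choice((0, 1)), rng.choice((2, 3)))
        child_b = (rng.choice((0, 1)), rng.choice((2, 3)))
        # Count how many of the two parental transmissions match.
        shared = (child_a[0] == child_b[0]) + (child_a[1] == child_b[1])
        total += shared / 2
    return total / num_trials


if __name__ == "__main__":
    # The estimate should approach the exact value 0.5 as trials increase.
    print(simulate_sib_sharing(20000))
```

In a real multipoint analysis the quantity being averaged is far more complicated and no exact value is available for comparison, which is why simulation is used in the first place.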
In 1987 I received funding from the US National Center for Research Resources to develop computer programs for the genetic analysis of pedigree data, and since 1995 I have continued my research with this funding at Case Western Reserve University. On my arrival in Cleveland, I hired a new team of computer programmers. The lead programmer of this group, Geoff Wedig, has devised a hybrid algorithm, part Elston-Stewart and part Lander-Green, for which the computation required is linear in both the number of markers and the number of pedigree members. Our research will help others, more molecularly inclined, to find and clone genes underlying such complex diseases as cancer and diabetes. Knowledge of such genes and how they function promises to usher in exciting new therapies in the 21st century.
Robert Elston is director of the division of genetic and molecular epidemiology in the department of epidemiology and biostatistics, Case Western Reserve University, Cleveland, Ohio.
The Human Genetic Analysis Resource (http://darwin.cwru.edu) is funded by an NIH grant from the National Center for Research Resources.