Vol.2, No.6, 576-589 (2010) Natural Science
http://dx.doi.org/10.4236/ns.2010.26073
Copyright © 2010 SciRes. OPEN ACCESS
Building reliable genetic maps: different mapping
strategies may result in different maps
Yefim Ronin, David Mester, Dina Minkov, Abraham Korol*
Institute of Evolution and Department of Evolutionary and Environmental Biology University of Haifa, Mount Carmel, Haifa, Israel;
*Corresponding Author: korol@research.haifa.ac.il
Received 24 November 2009; revised 4 January 2010; accepted 6 April 2010.
ABSTRACT
New high throughput DNA technologies resul-
ted in a disproportion between the high number
of scored markers for the mapping populations
and relatively small sizes of the genotyped pop-
ulations. Correspondingly, the number of mark-
ers may, by orders of magnitude, exceed the
threshold of recombination resolution achieva-
ble for a given population size. Hence, only a
small part of markers can be genuinely ordered
in the map. The question is how to choose the
most informative markers for building such a
reliable “skeleton” map. We believe that our ap-
proach provides a solution to this difficult pro-
blem due to: 1) powerful tools of discrete opti-
mization for multilocus ordering; 2) a verifica-
tion procedure, which is impossible without fast
and high-quality optimization, to control the map
quality based on re-sampling techniques; 3) an
interactive algorithm of marker clustering in co-
mplicated situations caused by significant de-
viation of recombination rates between markers
of non-homologous chromosomes from the ex-
pected 50% (referred to as quasi-linkage or pse-
udo-linkage); and 4) an algorithm for detection
and removing excessive markers to increase
the stability of multilocus ordering.
Keywords: Pseudo-Linkage; Skeleton Map; Map
Verification; Map Stability; Traveling Salesperson
Problem; Guided Evolution Strategy
1. INTRODUCTION
Genetic maps are an important tool in genomics and in
numerous practical applications such as breeding, medi-
cal genetics, and gene cloning. Unfortunately, the avail-
able algorithms and software tools become less suitable
with an increasing number of available markers. The
objective of our study is to develop efficient methodol-
ogy for building multilocus genetic maps, providing the
control of the quality of maps by detecting and removing
the sources of map instability. Two major problems
should be addressed in multilocus genetic mapping: 1)
markers that belong to non-homologous chromosomes
should not be assigned to the same linkage group; and 2)
markers from the same chromosome should be placed on
the genetic map in the same order as the corresponding
DNA sequences that reside in the chromosome.
In situations with significant deviations of the recom-
bination rates between non-synthetic markers from the
expected level (50%), the problem of correct clustering
cannot be solved by an arbitrary choice of a certain
(constant) threshold value of recombination or LOD,
albeit this is exactly how this problem is treated in many
mapping packages [1,2]. Indeed, in experiments with the
foregoing characteristics, the recombination values be-
tween groups of markers from different chromosomes
may sometimes be smaller than the values between ad-
jacent markers within a chromosome. This phenomenon,
referred to as “quasi-linkage” (or “pseudo-linkage”) can
result from a combination of statistical and biological
reasons and scoring errors. The statistical reasonsof
pseudo-linkage are mainly caused by the sample size and
number of chromosomes (n) in the genome: increased n
is associated with higher chances to detect “significant”
deviations from independent segregation. But the major
source of pseudo-linkage is biology. Literature on this
phenomenon in many species can be found in: [3-5] and
references therein.
Mapping algorithms tend to ignore pseudo-linkage.
Consequently, some non-syntenic loci may appear in the
same “linkage group”, which could result in contradic-
tions between mapping results for different mapping
populations and between genetic and physical maps.
Thus, up to 12% of cattle markers were assigned to
wrong chromosomes and contradict the physical maps
(H. Lewin, personal communication). In fish genetics,
Y. Ronin et al. / Natural Science 2 (2010) 576-589
Copyright © 2010 SciRes. OPEN ACCESS
577
577
pseudo-linkage is also a known phenomenon (see for
review: [6]). To address this problem, we suggest a mod-
ified approach of clustering markers into linkage groups.
In our scheme, clustering is conducted concurrently with
multilocus ordering that also includes verification of the
order and removing unreliable markers. Instead of using
one threshold recombination rate (or LOD value), we
employ a series of increasing recombination thresholds
(or decreasing LODs). At the first, most stringent thre-
shold, we have a minimum of danger of mixing markers
from different chromosomes into one linkage group, but
the result is a high number of linkage groups. By relax-
ing the stringency at the next steps, we allow end-to-end
merging of the ordered linkage groups, excluding those
that display the strongest affinity to each other by their
interior parts.
In many practical cases, high-density mapping is as-
sociated with another difficult problem: a disproportion
between a high number of scored markers and a rela-
tively small population size. The number of markers may,
by orders of magnitude, exceed the resolution of recom-
bination for the given population size, so that only a mi-
nority of markers can be actually ordered. The question
is how to choose the most informative markers to build a
reliable “skeleton” map. If we consider a situation with,
for example, k ~ 1000 markers, then for a sample size of
N ~ 100, the minimum distance between markers that
can be resolved in the map should be 1 cM; hence, the
map length for a chromosome should be 1000 cM,
which is unrealistic for the vast majority of organisms.
How can the appearance of such 1000 cM maps be ex-
plained? We believe that the root is in the wrong as-
sumption that all markers are different (resolvable by
recombination). In fact, for small sample sizes, many
markers comprise groups of absolutely linked markers
and should be replaced by their “delegates”. But even
with this simplification, the number of resulting markers
that differ may remain quite large, with the map length
by far exceeding the expectations based on the estimates
of chiasma frequencies at meiosis [7,8]. Clearly, marker
scoring errors generate “false recombinants”: with per-
fect scoring most of these recombinants would not have
appeared, but after excluding absolutely linked excessive
markers (replacing them by delegates), it would be pos-
sible to build an “ideal” skeletonap. Another possible
complicating factor is negative interference [4,5,9,10]
violating the simple principle that “the entire entity is
supposed to be larger than its parts” [11].
Besides close linkage combined with a limited (small)
sample size, the necessity for the selection of representa-
tive markers for the skeleton map derives from the vary-
ing information content of markers (co-dominant versus
dominant, missing data, distorted segregation, and scor-
ing errors), “absolute” linkage between repulsion-phase
dominant markers, and negative interference [5]. These
(or some of these) criteria are employed by other authors
also. Thus, before the analysis, the authors also chose
bin markers, whereas after the analysis a decision is
made about excluding double recombinants and recov-
ering missing data (usually, by assuming no recombina-
tion). The problem with the last correction aimed to re-
duce the map length, is that it does not affect the order of
the markers. This after-ordering correction deals with
maps that might have been affected by errors of marker
scoring. This could cause erroneous ordering or, even
worse, bringing together markers from non-homologous
chromosomes.
Our objective is to get an approximation as close as
possible to the true multilocus order despite the forego-
ing complications. A specific feature of our approach is
that the choice of candidates for the skeleton map is a
part of the core ordering-verification procedure focused
on detecting and removing markers causing local map
instability and non-monotonic changes of recombination
(i.e., deviation from the expected increase of rf between
a marker and its subsequent neighbors). The verification
process is based on multiple re-sampling runs from the
scored mapping population using the so-called jackknife
approach [12,13], namely, from the initial set of N
genotypes, we sampled a subset of αN genotypes (e.g.,
with α = 0.8-0.9) scored for the same markers. The ob-
tained sub-sample is employed to order the map. This
process is applied repeatedly (e.g., 100 times), resulting
in corresponding map orders. The neighborhoods that do
not change upon these jackknife runs can be referred to
as stable.
Clearly, such a formulation calls for massive repeated
application of multilocus ordering procedures that may
be computationally very challenging in case of moderate
to high-density maps. Several genomic problems, in-
cluding multilocus genetic mapping, in building physical
maps (contig assembling for overlapping clones and
radiation hybrid mappings), assembling ESTs, and others,
can be formulated as multipoint one-dimensional order-
ing. Despite variation among possible optimization cri-
teria, the one-dimensional genetic or genomic ordering
problems are quite similar to the well known challenging
Traveling Salesperson Problem (TSP). A powerful algo-
rithm developed for a wide class of TSP-like Vehicle
Routing Problems and referred to as the Guided Evolu-
tion Strategy (GES) [14] was successfully adapted to
genetic mapping [15]. High performance and high preci-
sion of GES algorithms make them very suitable to ad-
dress computation challenging multipoint ordering prob-
lems, especially in the context of our methodology re-
quiring a verification analysis to ensure stability of the
Y. Ronin et al. / Natural Science 2 (2010) 576-589
Copyright © 2010 SciRes. OPEN ACCESS
578
constructed maps. The mapping of 3000 loci of a dataset
generated in a maize project at the Center for Plant Ge-
nomics, Iowa State University, can be used as an exam-
ple of practical efficiency of this approach [16].
2. METHODS, ALGORITHMS, AND
EMPLOYED DATASETS
2.1. General Scheme
The core procedure of our approach includes the fol-
lowing stages (Figure 1):
Using marker orders rather than marker cM positions
as the main map characteristic, with minimum total map
length as an optimization criterion. Although map length
is employed by many other authors, the stability of or-
dering rather than the confidence intervals of the marker
positions or posterior marker positions [17,18] as a crite-
rion of the map quality, is central to our method. To eva-
luate the stability of ordering, we employ re-sampling
procedures [11,19].
The previously mentioned formulation was possible to
implement as a mapping algorithm for many markers
(i.e., hundreds per chromosome) because of a novel, hig-
hly efficient method of discrete optimization developed
in our lab [14]. A procedure for detecting and removing
problematic markers causing local “map expansion” us-
ing the instability of local neighborhoods across re-sam-
pling runs and the deviation from the expected mono-
tonic change of recombination rates as criteria. A step-
wise procedure of merging clusters of linked markers
based on the end-to-end principle in order to reduce the
danger of combining markers of non-homologous chro-
mosomes in one map. This danger may derive from
sampling deviations of corresponding recombination
rates from the expected 50% or from the pseudo-linkage
phenomenon [4,5,20].
2.2. Clustering
The first step is calculating pairwise recombination frac-
tions (rf ) for all pairs of markers using the maximum
likelihood estimation procedure. Then, the number of
clusters (linkage groups, LG) can be evaluated as a func-
tion of the threshold (maximal) value rf0, allowing the
preliminary assignment of a marker to a certain LG,
namely, marker mi may be assigned to an LGj if recom-
bination between mi and at least one marker from LGj is
lower than the threshold rf0 and is the lowest compared
to the distances to any other LG. Based on the obtained
information, it is necessary to choose a sufficiently small
value of rf0 to exclude the possibility of getting in one
LG markers from non-homologous chromosomes due to
pseudo-linkage (see [4,5,21,22]). But choosing an rf0 that
is too small will result in a large number of clusters
(linkage groups) that will considerably exceed the hap-
loid number of the species. Therefore, the next steps
should include controlled merging of some of the clus-
ters by a gradual relaxing of the conditions on pseudo-
linkage (by increasing rf ). The specific feature of our
approach is that the building and ordering of the LGs are
considered as interacting procedures (see Figure 1). If
some markers of two LGs appeared closer than the re-
laxed rf, it would be reasonable to permit merging if the
closest markers of the two candidate LGs are terminal or
sub-terminal (“end-to-end” merging). Merging should be
forbidden if the closest markers reside in the interior part
of one or both candidates.
To illustrate how this scheme works, we simulated an
example with two chromosomes (A and B) with pseudo-
linkage. The maximum deviation from independent seg-
regations of markers ai (chromosome A) and bj (chro-
mosome B) was for markers with i = 5 and j = 13 (the
simulated value was rfa5-b13 = 0.2, whereas the value that
“occurred” was 0.19). The recombination values for
consecutive adjacent markers were 0.1, excluding r11-12 =
0.285 on chromosome A and r8-9 = 0.33 on chromosome
B. What happens when two different threshold values of
recombination are chosen, e.g., rf0 = 0.3 (a usual choice
in many publications) and rf0 = 0.15? With rf0 = 0.3, all
markers of chromosome A and the 12 last markers of
Figure 1. Stepwise clustering of markers in linkage groups coordinated with multilocus ordering.
Y. Ronin et al. / Natural Science 2 (2010) 576-589
Copyright © 2010 SciRes. OPEN ACCESS
579
579
chromosome B should be combined in one linkage group;
the remainder of the markers of chromosome B com-
prised the second linkage group. The results obtained
after ordering these groups followed by detecting and
removing markers violating monotonic change of rf
along the map are shown in Figure 2(a). Now, we start
with a more stringent threshold, rf0=0.15. This choice
resulted in four linkage groups (Figure 2(b)); upon re-
laxation of the threshold (0.15 0.30) linkage groups
of the non-homologous chromosome would tend to
merge, but not in the end-to-end manner (their internal
parts proved to be the closest, rfa5-b13 = 0.19). Thus, this
merging was not allowed. The next step of relaxation (rf0
= 0.35) with the previously mentioned rule (allowing
only end-to-end merging) resulted in the correct recov-
ery of the simulated chromosomes. The presented cycle
can be repeated several times until further merging will
cause an appearance of LGs with large internal gaps.
Clearly, this procedure can be simplified if anchor
markers are available. However, the choice and usage of
anchors based on literature should be cautious because
of the possibility of a relatively high level of errors in
some published maps.
(a) (b)
Chr 1 Chr 2 Chr 1Chr 2
Figure 2. Reducing the chances of wrong clustering by stepwise relaxation of the threshold recombination:
(a) Wrong clustering of markers, caused by a high threshold value of recopmbination in situations of pseudo-
linkage; (b) Preventing wrong clustering by using stringent threshold recombination.
Y. Ronin et al. / Natural Science 2 (2010) 576-589
Copyright © 2010 SciRes. OPEN ACCESS
580
2.3. Multilocus Ordering
As noted above, the number of scored markers may, by
orders of magnitude, exceed the number of those practi-
cally resolvable by recombination markers for the given
population size. Only a small portion of markers (here
referred to as delegate markers) can be included in the
skeleton map, with the remaining markers being at-
tached to the delegates. Besides the non-resolvable
linkage caused by small sample size, the necessity for
selection of representative markers for the skeleton map
derives from non-random (clustered) recombination dis-
tribution in the genome [4], varying information content
of the markers (co-dominant versus dominant, missing
data, distorted segregation, and scoring errors), biased
recombination estimates between repulsion-phase
dominant markers [19], and negative interference [5,10].
Using our approach (implemented in MultiPoint package,
http://www.multiqtl.com), one may start from a linkage
group with hundreds of markers and conduct several
analytical steps in order to build a reliable map:
multilocus ordering;
binding together closely linked markers followed
by selection of delegates (bin markers) with the
highest information content and replacing the gro-
ups of tightly linked markers by their “delegates”;
repeated ordering and re-sampling verification of
the reduced LG to detect regions of map instabil-
ity;
removing the markers causing unstable neighbor-
hoods and violating monotonic change of recom-
bination, followed by repeated ordering to obtain
the skeleton map;
attaching the removed markers to their best inter-
vals on the skeleton map.
Our mapping algorithm is based on the reduction of
the multilocus ordering problem to TSP or TSP-like for-
mulation. Its main features were described earlier [11,
19]. Here we present only recent modifications and ex-
tensions of our ordering algorithm. One of the possibili-
ties in addressing the mapping problem is to recover the
marker order from a known matrix dij of pair-wise
marker distances based on estimates of the recombina-
tion rate. An important fact is that in genetic ordering
problems the distances between the markers cannot be
measured directly. For this reason, even an exact TSP
solution does not guarantee that the obtained map will be
robust to a small variation of the data, hence the impor-
tance of stability testing. Special formulations of the
problem may include various restrictions (e.g., on a pre-
defined order of anchor markers), implying a reduction
of genetic mapping to a more complex constrained dis-
crete optimization problem.
In order to improve the efficiency of our multilocus
ordering algorithms, we developed a new metaheuristic
approach, referred to as the Guided Evolution Strategy
(GES) that combines the strengths of the Guided Local
Search (GLS – [33]) and Evolution Strategies [19] in the
framework of one iterative two-stage procedure. GES
combines the ES and GLS metaheuristics, and these two
stages are iteratively repeated until no more improve-
ments can be found in the local search. Our experiments
on 302 large-scale benchmark vehicle routing problems
with constraints demonstrated that the proposed algo-
rithm is fast, cost-effective, and highly competitive, pro-
ducing the best known solutions to 82% of the con-
strained benchmark problems [14]. We adapted this GES
approach to TSP-like problems of genetic mapping, in-
cluding the Fast2Opt local search procedure and new
variable (adaptive) neighborhood size. The new map-
ping-oriented algorithm works with small neighborhoods
(25-50 neighbor markers) that allows significantly ac-
celerate the performance on large-scale problems. The
algorithm was tested on standard TSPlib problems with
50-2392 points. All known best solutions were achieved
for these problems.
2.4. Map Verification
The objective of the verification procedure is detecting
regions with unstable neighborhoods relative to the initial
ordering (in the following example, explaining the method,
we used 10 markers numbered from 1 to 10). This can be
achieved by repeated re-sampling of the initial dataset
(jackknife, bootstrap) followed by multilocus ordering for
each such derivative sample (Figure 3(a)). Then, the
identification of unstable regions can be conducted based
on the frequency distribution of the right-side and left-side
neighbors (Figures 3(b)-3(d)). The identification of such
regions can be conducted by summing up corresponding
neighborhood matrices (Figures 3(b)-3(c)) and calcu-
lating the neighborhood frequencies (Figure 3(d)). The
higher the deviation from 1 (i.e., from the “diagonal”
pattern) the less certain is local order. In the example, we
show only two re-sampling runs. Based on this small-
size re-sampling, we can indicate certain local neighbor-
hoods, i.e., for marker pairs 1-2, 4-5, 7-8, and 9-10. In
actual analysis the number of runs should be at least a
few dozen or hundreds.
Clearly, the unstable neighborhoods result from fluc-
tuations in the estimates of recombination rates across
the repeated samples; the range of fluctuations depends
on the sample size and the proportion of individuals
taken at each jackknife run. In our framework, jackknife
analysis is a modeling tool to quantify the diversity of
map versions for the treated chromosome representing
the sampling (stochastic) nature of the map. The results
of such an evaluation can facilitate further decision
Y. Ronin et al. / Natural Science 2 (2010) 576-589
Copyright © 2010 SciRes. OPEN ACCESS
581
581
Figure 3. Graphical display of the verification process: (a) multilocus orders for each jackknife (the
example includes only two jackknives); (b) and (c) neighborhood matrices for each jackknife run; (d)
matrix of neighbourhood frequencies averaged over all runs.
making about problematic markers. These markers can
be removed from the map and then the map must be re-
built (Figure 3).
2.4.1. Improving Map Stability and Building
Skeleton Map
After revealing the regions of map instability, we need to
make a decision about the marker (or markers) responsi-
ble for local instability. For each region, there could be
more than one candidate for removal from the dataset
with the objective to stabilize the order. Our choice sho-
uld depend on quality of the markers (anchor markers or
genes vs. other markers; co-dominant vs. dominant; con-
cordantly c-segregating with the neighbors or displaying
unique segregation; fully scored or with many missing
data, etc.). Taking these criteria into account, we can
check the effect of removing any of the candidates using
the trial-and-error approach, namely, after the removal
of a candidate marker, we can re-build the map and
again test its stability based on jackknife re-sampling.
This computing of intensive methodology is affordable
within the framework of our approach due to the very
high performance of our multilocus ordering heuristic
algorithm. As a result, we will come up with a stabilized
(skeleton) map.
Clearly, the skeleton map will include the most reli-
able (informative) markers. Likewise, any group of tig-
htly linked markers un-resolved by recombination, due
to tight linkage and small sample size, will be repre-
sented by one delegate marker, which is also selected on
the basis of scoring quality or biological priority (say,
gene vs. anonymous marker). The three main reasons for
map instability can be mentioned here: 1) tightly linked
markers (with one-two recombinants in the sample) that
may produce varying local orders upon jackknife runs; 2)
islands of moderately linked well ordered markers sepa-
rated from other neighbors by relatively large gaps that
will appear in opposite orientation of the entire island
(propeller effect); and 3) regions with non-mon- otonic
change in recombination due to an excess of double re-
combinants caused by scoring errors, negative interfer-
ence, or gene conversion) [5,10,20,23,24].
2.4.2. Criteria for Comparing Different Map
Versions
To characterize the efficiency and advantages of the pro-
posed methodology compared to other methods, we need
to define some criteria used in comparisons. These in-
clude:
Map length and number of markers presented in the
resulting stable map: Even with correctly unraveled mar-
ker order, the evaluated map length may be higher than
the actual one, due to a certain amount of scoring errors.
But deviation from the correct order will result in map
length inflation even without marker scoring errors.
Controlling monotony: Some of the scoring errors
may generate situations in violation of the principle that
“the entire entity is supposed to be larger than its parts”
[11]. Normally, for three markers ordered as a-b-c, one
would expect: rac > r
ab & rac > r
bc. A violation of this
condition indicates that something may be wrong with
the local ordering. Alternatively, a violation may be
caused by negative interference or gene conversion.
Stability of local neighborhoods: With increased pro-
portions of scoring errors, the ordering may become very
sensitive even to a small sampling variation upon jack-
knife re-sampling. To quantify such instability, we em-
ploy a simple measure:
i
i
n2
)/1(

,
where 2
i is variance of ith marker neighborhood:
j
iji jip 22 )(5.0
,
Y. Ronin et al. / Natural Science 2 (2010) 576-589
Copyright © 2010 SciRes. OPEN ACCESS
582
and pij is the proportion of jackknife runs where markers
i and j were adjacent neighbors. The enumeration of
markers is according to the multilocus order obtained on
the entire sample or based on an external order (e.g.,
from the literature). Stable order will give = 1.
Concordance of segregation distortion within local
neighborhoods: Segregation distortion is a known phen-
omenon caused by various factors [4]. Upon correct or-
dering, one would expect a certain correspondence be-
tween segregation ratios of neighbor markers (segi and
segi+1), namely, a correct order should give smaller val-
ues of the following criterion Si compared to a wrong
order. If the frequency of one of the two classes at a
marker locus in an RIL (dihaploid, or backcross) popula-
tion or one of the homozygotes in F2 is p1, then the
normalized change of segregation ratio from marker i to
i + 1 can be calculated as:
1,
/100
jiii DdsegS
with dsegi = |p2i - p2i+1|, where p2i and p2i+1 are the fre-
quencies of the second marker class for loci i and i + 1
normalized by p1i and p1i+1, and Di,i+1 is the distance (in
cM) between the markers; for F2:
},max{ 133122  iiiii ppppdseg
where p2i and p3i are the normalized frequencies of the
heterozygote and the second homozygote.
2.4.3. Using Real Mapping Data for Illustration
To illustrate the efficiency of the proposed mapping stra-
tegy on real data we selected a few examples of diverse
organisms: wheat, barley, oat, maize, Arabidopsis, mo-
use, rat, and trout using the data available on the web.
The published results for the same datasets were em-
ployed for comparisons with our map versions.
3. RESULTS
An empirical assessment of the proposed analytical fra-
mework, in comparison with other procedures, can be
conducted using both Monte-Carlo simulations and real
data. The validation of the basic properties of our algo-
rithms was provided in our earlier publications using
simulated data [11,19]. Here we demonstrate the advan-
tages of the extended analytical scheme using genome
mapping data on several eukaryotic species: wheat, bar-
ley, oat, maize, Arabidopsis, mouse, rat, and trout. Ob-
viously, our intention is just to illustrate the proposed
approach rather than to revise the published maps. We
believe that a revision of a map, if needed, should be
conducted by the research groups generating the data.
Among the illustrations provided below, the example
on wheat is presented in more detail. The results on the
other examples are summarized in Table 1, using the
case on maize to explain how we compare the published
and de novo constructed maps. One more example, for
the mouse, is summarized in a separate table (Table 2)
due to the fact that besides chromosome 16, our results
for all other chromosomes corresponded well with the
published map. Like with the mouse, our map version
corresponded well with the published version using a
dataset on Arabidopsis. A slight difference was detected
for chromosome 5 only due to the presence of two
markers with high levels of missing data, which caused
problems in the published map and were excluded from
our version (see Table 1).
3.1. Wheat Chromosome 1B
Wheat data from the GrainGenes website (http://wheat.
pw.usda.gov/GG2/quickquery.shtml) on chromosome
1B for the RIL mapping population Synthetic Opata
with 81 markers were employed (72 markers remained
after deleting absolutely linked markers). The first step
was to check the marker ordering presented on the web.
Using the re-calculated pair-wise rf values transformed
“back” to F1 level, the length of the map (with Kosambi
mapping function) was estimated as L = 444 cM. Based
on re-sampling analysis, the neighborhood stability of
this map was tested and found to be relatively low (Fig-
ure 4(a)-4(c)).
A more stable ordering, at least for a sub-set of mark-
ers (comprising a skeleton map), can be achieved by re-
moving the markers causing the observed instability and
deviation from the (expected) monotonic growth of re-
combination rates along the map around each of the
markers [11]. After such “cleaning”, a skeleton map with
26 markers was obtained (Figure 5(a)). The first and the
last markers proved to be the same, as they were in the
initial map (Figure 5(b)); the map length was reduced
from 444 cM to 138 cM (!), and this was achieved
without deleting “double recombinants” and replacing
missing scores by those that yield non-recombinants.
The improved quality of our map was accomplished by
detecting and removing problematic markers, mainly
with high missing levels. Historically, these markers
were placed onto the map as “second wave” and “third
wave” markers that were characterized much later than
the first groups of markers and for a much smaller sub-
sample of the initial mapping population. Unlike the
total set with 72 markers, the order of our skeleton map
with a subset of 26 markers (see Figure 5(a)) corre-
sponded well with the published map: the relative or-
dering of the markers in Figure 5(b) has only one minor
“inversion” (between markers Xbcd442 (#66) and
Xbcd441 (#64).
3.2. Maize Chromosome 1(IRIL5)
We employed data on IBM302 intercross recombinant
Y. Ronin et al. / Natural Science 2 (2010) 576-589
Copyright © 2010 SciRes. OPEN ACCESS
583
583
Figure 4. Stability testing of wheat Synthetic Opata standard map of 1B chromosome by using jack-
knife re-sampling. The published order was chosen as a reference (diagonal). The stable order should be
the one where for each marker its left-side and right-side neighbors do not vary across the repeated runs
(1-1 pair along the diagonal). Notes: the numbers in brackets near the marker names indicate the marker
order in the map presented on the website. (a)-(c) represent the three (overlapping) parts of the map;
analysis was conducted using MultiPoint software package (http://www.multiqtl.com).
Y. Ronin et al. / Natural Science 2 (2010) 576-589
Copyright © 2010 SciRes. OPEN ACCESS
584
Figure 5. Stabilizing the map by deriving the skeleton map. The graph represents our trial to stabilize
the wheat 1B map by detecting and removing the markers that caused the local instability of the
neighborhoods. (a) and (b) represent the new (stabilized) and the original (web) versions, respectively.
The order of skeleton markers (which proved to be the “first wave” markers) in our version of the map
displays a remarkable similarity to the order on the website map, excluding one locality.
inbred line (IRIL) mapping population (http://www.
maizegdb.org/cgi-bin/displaymaprecord.cgi?id=870745).
To demonstrate the differences between the web version
and our version of the maps the results on maize chro-
mosome 1 are presented (Table 1). The length of our
version of skeleton map (LMP = 357 cM) is half of that
built with MapMaker (LMM = 696 cM), probably reflect-
ing a better ordering and less discrepancy from the cy-
Y. Ronin et al. / Natural Science 2 (2010) 576-589
Copyright © 2010 SciRes. OPEN ACCESS
585
585
togenetic map (see [7,8]). Note that the distal markers in
the two maps coincide. The average interval size be-
tween the neighbor markers was 696:341 = 2.05 for MM
and 357:182 = 1.96 for MP.
Several examples demonstrating the problems that are
typical of many published maps and are easily resolvable,
via detecting and removing unreliable markers, are pre-
sented in the examples in Table 1 for the selected or-
ganisms, including maize. Thus, disconcordant segrega-
tions of maize markers 202-203-204 in the MM map
(reflected in large values of 100dseg/D) indicate that
marker 203 was erroneously placed between 202 and
204. Indeed, segregation ratios for the consequent mark-
ers 202, 203, 204 in MM map are 0.78, 0.2, and 0.75,
Table 1. Examples of comparing the quality characteristics of the revised multilocus maps with those of the original maps.
D(ac)/D(ab or bc) 100dseg/D
Map L, cM
Number of markers
Chr or LG Species
MM
D(6,8)/D(6,7) = 0.65
D(6,8)/D(7,8) = 0.89
D(8,10)/D(9,10) = 0.5
MM MP
( 8,9)3.4 (8,10)1.2
(9,10)1.6
MM MP
83.3 52
10 7
MM MP
1.75 1.01
MN2 Oat
JM
D(17,19)/D(17,18) = 0.67
JM MP
(17,18)5.0 (17,19)0.4
(18,19)8.0
JM MP
317 241
33 28
JM MP
2.51 1.05 Chr1H Barley
JM
D(15,18)/D(15,16) = 0.55
D(15,18)/D(16,17) = 0.71
JM MP
(15,16)20 (15,18)0.01
(16,18)5.9
JM MP
445 138
72 26
JM MP
4.05 1.04
Chr1B Wheat
MM
D(202,204)/D(202,203) = 0.45
D(202,204)/D(203,204) = 0.65
D(304,308)/D(304,305) = 0.28
D(304,308)/D(305,306) = 0.43
D(304,308)/D(306,307) = 0.52
D(304,308)/D(307,308) = 0.66
MM MP
(202,203)9.5 (202,204)1.2
(203,204)12.7
(306,307)33 (304,308)5.5
(307,308)43
MM MP
696 357
341 182
MM MP
1.87 1.00
Lg1 Maize
JM
D(9,11)/D(10,11) = 0.8
D(23,25)/D(23,24) = 0.2
D(23,25)/D(24,25) = 0.4
JM MP
(9,10)37 (9,11)1.2
(10,11)6
JM MP
108 85
22(29) 17
JM MP
1.46 1.01
Chr10 Rat
MM
D(8,12)/D(9,10) = 0.36
MM MP
(10,11)9.4 (8,12)0.5
MM MP
136 88
25 16
MM MP
1.95 1.06 ОА-1 Trout
JM
D(3,5)/D(4,5) = 0.52
D(20,22)/D(20,21) = 0.28
JM MP
(3,4) 46 (3.5) 5.8
(4,5) 25.5
(20,21) 2.4 (20,22)0.7
(21,22) 10.4
JM MP
127.6 112.3
49 45
JM MP
1.0 1.0
Chr5
(Ler/Cvi)
Arabidopsis
Notes: - score for neighbourhood instability (averaged for the chromosome); Map L - length of the linkage map; 100 dseg/D - relative score
for concordance of segregation ratios (analogue of derivative for change of segregation ratios along the map); D(ac)/D(ab or bc) - ratio of the
recombination distance (cM) between flanking markers of a segment to the length of one of its parts (ratio < 1 indicates wrong marker scoring,
or wrong mapping of the internal marker, or strong negative interference); ММ, JM, MP denote that the map was constructed with Mapmaker,
Joinmap, or MultiPoint, respectively.
Coding marker names (bold denotes markers included to the skeleton map):
Oat: 6 – p40m51n6, 7 – bcd1230, 8 – p48m58n4, 9 – bcd1414, 10 – p48m88n4;
http://grain.jouy.inra.fr/cgi-bin/graingenes/report.cgi?class=mapdata;name=Oat%2C%20MxN%2C%20genetic%202005;show=locus;show=ma
p;print=
Barley: 17-E35M58-468, 18-E37M32-209, 19-E39M48-281;
http://wheat.pw.usda.gov/cgi-bin/cmap/map_set_info?map_set_aid=Barley_Cebada_Capa_x_SusPtrit
Wheat: 15-XksuE19, 16-Xbcd340, 17-Xrz166, 18-Xbcd1124;
http://wheat.pw.usda.gov/GG2/quickquery.shtml
Maize: 202-umc2237, 203-umc2239, 204-umc2238, 303-phi265454, 304-AY110426, 305-ufg14, 306-mmp195g, 307-npi238;
http://www.maizegdb.org/cgi-bin/displaymaprecord.cgi?id=870745
Rat: 9-D10Wox26, 10-D10Mgh11, 11-D10Wox9, 23-D10Wox19, 24-D10Wox22, 25-D10Mit7;
http://www.well.ox.ac.uk/~bihoreau/woxtable.html#Chromosome%201
Trout: 8 – Eacaacg176o, 9 – Eaacctg201o, 10 – Eaagacc531o, 11 – Eacgatc250c, 12 – Eaccagt121a;
http://www.wsu.edu/%7Ethorglab/OAmap/OA2002update.xls
Arabidopsis: 3-A.292L, 4-nga158, 5-BH.144L, 20-CH121L_Col, 21-nga139, 22 – DF.154.C;
ftp://ftp.arabidopsis.org/home/tair/Maps/Ler_Cvi_RIdata
Y. Ronin et al. / Natural Science 2 (2010) 576-589
Copyright © 2010 SciRes. OPEN ACCESS
586
respectively. In МР map, marker 203 is removed. Re-
moving a marker in a correct map usually results in an
increased distance between the remaining markers. That
was not the case in this example: D(202,203) = 6.13,
D(203,204) = 4.31, and D(202,204) = 2.80. Thus, the
presence of 203 expands the local region more than
3-fold. An even higher discrepancy in segregation ratios
was detected in the neighborhood flanked by markers
304 and 308. Here three markers are problematic: 305,
306, and 307. If all five markers, from 304 till 308, are
retained, the total length of the segment is equal to the
sum of distances D(304,305) = 8.2, D(305,306) = 5.4,
D(306,307) = 4.4, and D(307,308) = 3.5, which is 21.5.
In our version of the map markers 305, 306, and 307 are
deleted, and the length of the segment becomes D(304,
308) = 2.3, i.e., only ~1/10 of the original size of the
interval flanked by 304 and 308. An additional argument
in favor of our analysis is that unlike the removed mark-
ers, markers included on our skeleton map appear in the
same order as in the original web map, excluding a few
local discrepancies, namely, the revised local orders 2-1-
4, 28-32-31-33, 47-49-48-50, 101-109-105-111, 289-295
-293-292-296, and 324-331-329-334 (the numbers refl-
ect the consequent relative positions of marker loci from
1 to 341 in the original map).
3.3. Mouse
We employed data on the mapping population (C57BL/
6JxM.spretus)F1xC57BL/6J http://www.informatics.jax.
org/searches/crossdata_form.shtml). The skeleton maps
that we constructed for chromosomes 1-19 and X corres-
ponded completely with the maps presented on the web-
site. The only difference was for chromosome 16: the
web map seems to have a serious local mistake, unless
the authors used some additional information. Still, kee-
ping in mind the high quality of the data, it may be in-
structive to compare our results with the original maps.
For comparison, we excluded absolutely linked markers
from the original maps. The results are shown in Table 2.
For chromosome 5, the difference between our map and
the web version is small and was caused by two markers
(Zp3 and Ccnb1-rs1) that violated the rule that the entity
cannot be smaller than its parts [11]. The two versions of
the map proved identical until the marker Gusb. In the
web version, the interval Gusb-Zp3 was larger than the
flanking interval Gusb-Gnb2; thus, deleting Zp3 seems a
reasonable suggestion. Similarly, interval D5Fcr8-Brca2
is shorter than its part Ccnb1-rs1-Brca2; thus Ccnb1-rs1
was also deleted. For chromosome 16, the considerable
difference between the two versions, is caused by the
erroneous placement of marker D16Fcr1 at the upper di-
stal point of the map, despite its tight linkage with Csta.
4. DISCUSSION
Building correct multilocus maps is usually considered a
pre-requisite for diverse genomic/genetic applications,
e.g., positional cloning, anchoring contigs in physical
mapping, and marker-assisted selection. Dodds and co-
authors [25] surprisingly found that map errors do not
seem to have too much influence on QTL mapping re-
sults. Although this may be the case in some situations,
in many other situations, map errors may lead to dra-
matic negative impacts. Purportedly, if the objective of
QTL mapping was map-based cloning, then a lot of ef-
fort might be made with no results if the map order in
the region of residence of the targeted QTL was wrong.
This may happen even if the assignment of the QTL to
the chromosome was correct. Moreover, with some typ-
es of erroneous ordering, one could detect two QTLs in a
chromosome that carried only one QTL for the consid-
ered trait, whereas under the correct ordering of markers
only one QTL will be detected. Obviously, wrong or-
dering, even on a local scale, may also reduce the effi-
ciency of using marker positions on the map to facilitate
physical mapping.
Table 2. Comparing two versions of mouse maps.
Chromosome
16 9 8 7 6 5 4 3 2 1
Method Parameter
24
20
51
41
41
104
38
35
100
41
37
123
40
34
110
40
38
102
48
42
106
42
40
89
57
54
176
57
48
110
MMQTX
MP
nmar
Nmar
2.165
1.003
1.020
1.020
1.044
1.023
1.050
1.058
1.056
1.018
1.168
1.073
1.110
1.044
1.038
1.009
1.088
1.032
1.071
1.024
MMQTX
MP
67.3
46.3
72.7
72.7
66.7
63.3
63.8
61.3
60.5
55.7
80.0
74.2
77.6
71.7
85.9
84.2
88.4
84.8
91.4
83.4
MMQTX
MP
Lchr, cM
Nmar – numebr of markers in the intial map version, nmar – number of delegate markers
MMQTX – Map Manager QTX
Y. Ronin et al. / Natural Science 2 (2010) 576-589
Copyright © 2010 SciRes. OPEN ACCESS
587
587
Here we propose an analytical scheme for building
multilocus genetic maps that allows reasonable map qua-
lity even in complicated situations with very large num-
ber of markers, disproportionately small sample sizes,
and a high level of reading errors. A high ratio of scored
markers to population size has become typical in recent
years due to the ever-increasing availability of high thr-
oughput genomic technologies. This new situation has
encountered a psychological barrier inherited from the
previous generation of the mapping community, when
the number of markers was very small, justifying an
intention to put each marker onto the map, even if its
position is poorly resolved from the neighbor(s) or if its
quality is problematic. Even now, when the number of
potential markers becomes rather high (and sometimes
huge), there is still a tendency to follow this approach.
Some authors suggest selecting a part of the markers as
map bins and order the vast majority of the remaining
markers relative to the bins. This is a more realistic task,
but still the number of markers may by far exceed the
map resolution caused by small sample sizes (see Intro-
duction section). Consequently, the resulting maps, carr-
ying large number of markers, may be locally unreliable.
Our approach of testing map stability includes a veri-
fication procedure based on jackknife re-sampling [11].
Its major difference from the usual way of addressing
this problem is that, as a measure of map stability, we
consider the marker local orders [11,19] rather than the
length of confidence intervals of marker map positions
[17]. This approach gives flexibility in detecting and
removing markers causing map instability that would be
less natural to implement if the cM position on the map
is the measure. Moreover, it is a well known fact that
recombination rates may display very high variability
between different mapping populations of the same org-
anism, due to the effects of genotype, age, and environ-
ment [4,20]. This variability may cause serious problems
in combining data from different mapping populations to
build consensus maps [9,26]. If marker order is the basis
of map comparisons, this problem just does not exist.
At each of the re-sampling iterations, the multilocus
mapping problem is solved using TSP-like formulation.
Several well known heuristic algorithms can be applied:
Tabu Search, Simulated Annealing, Guided Local Sear-
ch, Genetic Algorithm (with EAX crossover), Evolution
Strategy, Guided Evolution Strategy, Ant Colony Be-
havior (ACB), and Artificial Neural Networks (ANN)
(for detailed references see [15]). Currently, the most
advanced software for solving unconstrained TSP is the
Concorde package (http://www.tsp.gatech.edu/index.ht-
ml). Concorde TSP software was applied for solving
genomic problems (e.g., [27,28]. The Concorde solver
uses the cutting-plane algorithm [29,30], which is an
alternative to branch and bound to solve integer pro-
gramming using the specific (one-dimensional) structure
of the problem. This allows generating very good cuts
helping to accelerate the optimization process, but only
if the problem does not contain additional restrictions.
There are examples of applying the cutting plane algo-
rithm to constrained discrete optimization problems [31,
32]. In both cited studies, the authors employed a
multi-processor system and parallel C++ language. In
particular, using the 188 processors system, a Concorde
team obtained the exact solution for a 120-point problem
for 10 days. Our heuristic GES algorithm found exactly
the same solution in just ten seconds using a standard PC
Pentium IV (2.0 Ghz). This fact illustrates that heuristic
approaches for constrained optimization problems are
preferable, and GES manages with this challenge effec-
tively. We note that our mapping-oriented GES is based
on hybrid technology and employs powerful properties
of both Guided Local Search [33,34] and Evolution Stra-
tegy algorithms [11,19].
Returning to the discussion about the importance of
re-sampling for reliable mapping, we should stress again
that one of the most frequent factors of instability in
small sample sizes and in a large number of markers is
errors in marker reading. As a rule, the number of such
errors is rather small, just a few per marker per popula-
tion. But the result is that marker pairs that had to be
non-resolvable (no true recombinants) or poorly resolv-
able (one recombinant) upon the small sample size, be-
come “resolvable” and are somehow ordered. Thus, if,
on some position of the chromosome, there are several
absolutely linked markers, the small rate of scoring er-
rors, approximately equal for all markers, should dis-
perse the markers in some multi-dimensional sphere
(where they are equally distant from each other). In fact,
during the mapping of these markers, they will be or-
dered in a one-dimensional space (as part of the map),
resulting in map length inflation proportional to the
number of such markers and the rate of errors. Local
marker order in such a situation should be very unstable
(non-reproducible) upon jackknife re-sampling, hence
the proposed method of detecting such a region by our
verification procedure. Removing a considerable propor-
tion of such markers should significantly improve the
map reliability and reduce the map length. As was sho-
wn in the previously mentioned example on wheat,
missing data also may be an important contributor to the
instability of marker neighborhoods, hence, markers
with high missing levels should be considered among the
first candidates for removal during the building of ske-
leton maps.
The discomfort that a researcher gets from a map that
is too long, explains the intention of reducing the map
Y. Ronin et al. / Natural Science 2 (2010) 576-589
Copyright © 2010 SciRes. OPEN ACCESS
588
length, sometimes by rather artificial approaches. For
example, such trials may include removing double re-
combinants and/or recovering the missing data by scores
yielding non-recombinant genotypes after the marker
ordering was finished. We found a good example for
such a situation in re-analyzing the data on wheat chro-
mosome 1B (see the previously mentioned analysis on
wheat chromosome 1B). In a trial to display the map len-
gth for the multilocus order of 1B presented on the web-
site, we used “per meiosis” recombination rates con-
verted to Kosambi map distances. This resulted in a 1B
map with L = 432 cM length, in contrast to the one pub-
lished on the website with L = 104 cM. How could this
huge discrepancy (432 vs. 104 cM) be explained? We
managed to “reproduce” the underlying procedure with
very high precision. As with many other published maps,
major efforts have been invested by several teams to enr-
ich this population with molecular markers from other
populations, thereby bridging these mapping resources
together. Unfortunately, new waves of markers were
scored only partially, so that the level of missing data for
this population was often very high, up to 50-70%.
It appeared that the problem of missing data was
treated by the map constructors (or by the software they
have employed) as follows. After ordering the markers,
missing data were recovered by replacing the missing
scores with those that resulted exclusively in non-re-
combinants. Clearly, the higher the level of missing data
the stronger the effect of such correction will be, i.e., a
reduction of the map length. Before transforming rf val-
ues for RIL to rf values for F1, double recombinants for
any pair of adjacent intervals were also replaced by
non-recombinants. This method of removing erroneous
double recombinants seems reasonable for F2 or double
haploids, but it is inappropriate given that in RIL map-
ping population, “double recombinants” are not neces-
sarily a result of scoring errors or real double recombi-
nation events. Indeed, a considerable portion of “double
recombinants” have likely resulted from recombination
in adjacent intervals that occurred in meiosis IN DIF-
FERENT generations of genotypes that remained het-
erozygous for those regions (in F2, F3, etc.). We consid-
ered several intervals, where our ordering was exactly
the same as in the published map, but the distances were
different (namely, our distances were higher than those
published on the website). After conducting the previ-
ously mentioned “correction” steps, we obtained exac-
tly the same distances as reported on the website map.
The conducted revision analysis indicates that these dis-
tances may be irrelevant to the actual situation. Thus, the
relatively small lengths of those maps are almost cer-
tainly an artifact introduced during the merging of dif-
ferent marker data sources, some of which contained
high frequencies of missing data and inappropriate “er-
ror correction”.
In genetic mapping, multilocus ordering is usually co-
nsidered as a much more complicated problem compared
to subdivision of the marker set into linkage groups.
However, many examples indicate that markers from
non-homologous chromosomes were assigned to the
wrong chromosomes, presumably, due to pseudo-linkage.
We have encountered these types of errors in analyzing
cereal species (see [5]). Similarly, in cattle, such wrong
assignments were found for 12% of markers/contigs (H.
Lewin, personal communication). Thus, the pseudo-lin-
kage phenomenon should be of special concern for
large-scale genetic mapping. As indicated above, incor-
rect assignments can be caused by biological and statis-
tical reasons. The probability of sampling deviation from
the random segregation of markers from non-homolog-
ous chromosomes should grow with increased numbers
of chromosomes, length of chromosomes, number of
markers, and with small sample size. We propose in this
paper that a considerable part of “wrong assignment”
errors can be reduced by the algorithm of stepwise clus-
tering markers into linkage groups alternated by multi-
locus ordering.
The possibility of re-sampling based on testing map
stability by detecting and removing the markers that
cause low map quality is the second major component of
the proposed approach. Detection and removal of the
markers responsible for local map instabilities and
non-monotonic change in recombination rates allows
building stable skeleton maps with minimal total length.
Clearly, there is some degree of uncertainty in such
choices; hence, there might be different versions of the
skeleton map. In such situations, the high performance
of our algorithms is an important advantage allowing
further fast correction (complementing) of the skeleton
map by using additional markers and/or some of the de-
leted markers. Further improvement of the mapping
quality is achievable by joint analysis of mapping data
from different mapping populations that can be referred
to as consensus mapping [9,26].
5. ACKNOWLEDGEMENTS
The study was supported by the Israeli Ministry of Absorption and the
United States-Israel Binational Agricultural Research and Develop-
ment Foundation (grant # 3873-06).
REFERENCES
[1] Lander, E.S. and Green, P. (1987) Construction of multi-
locus linkage maps in human. Proceedings of the Na-
tional Academy of Science, USA, 84(8), 2363-2367.
[2] Stam, P. (1993) Construction of integrated genetic link-
Y. Ronin et al. / Natural Science 2 (2010) 576-589
Copyright © 2010 SciRes. OPEN ACCESS
589
589
age maps by means of a new computer package: JoinMap.
Plant Journal, 3(5), 739-744.
[3] Sapre, A.B. and Deshpande, D.S. (1987) Spontaneous
emergence of parents from the F1 interspecific hybrids of
Coix L. Journal of Heredity, 78(6), 357-360
[4] Korol, A.B., Preygel, I.A. and Preygel, S.I. (1994) Re-
combination variability and evolution. Chapman & Hall,
London.
[5] Peng, J., Korol, A.B., Fahima, T., Roder M.S., Ronin, Y.
I., Li, Y.C. and Nevo, E. (2000) Molecular genetic maps
in wild emmer wheat, Triticum dicoccoides: Genome-
wide coverage, massive negative interference, and puta-
tive quasi-linkage. Genome Research. 10(10), 1509-1531.
[6] Korol, A.B., Shirak, A., Cnaani, A. and Hallerman, E.M.
(2007) Detection and analysis of QTLs for economic
traits in aquatic species. In: Liu, Z.J., Ed., Aquaculture
Genome Technologies. Blackwell, Oxford, 169-197.
[7] Nilsson, N.O., Säll, T. and Bengtsson, B.O. (1993) Chi-
asma and recombination data in plants: Are they com-
patible? Trends in Genetics, 9(10), 344-348.
[8] Anderson, L.K., Doyle, G.G., Brigham, B., Carter, J.,
Hooker, K.D., Lai, A., Rice, M. and Stack, S.M. (2003)
High-resolution crossover maps for each bivalent of Zea
mays using recombination nodules. Genetics, 165(2),
849-865.
[9] Korol, A.B., Mester, D., Frenkel, Z. and Ronin, Y.I.
(2009) Methods for genetic analysis in the Triticeae.
Chapter 6, In: Feuillet, C. and Muehlbauer, G.J., Eds.,
Genetics and Genomics of the Triticeae. Springer, Berlin,
pp. 163-199.
[10] Esch, E., and Weber, W.E. (2002) Investigation of cross-
over interference in barley (Hordeum vulgare L.) using
the coefficient of coincidence. Theoretical and Applied
Genetics, 104(5), 786-796.
[11] Mester, D., Ronin, Y., Minkov, D., Nevo, E. and Korol, A.
(2003b) Constructing large scale genetic maps using an
evolutionary strategy algorithm. Genetics, 165(4), 2269-
2282.
[12] Efron, B. (1979) Bootstrap method: Another look at the
jackknife. The Annals of Statistics, 7(1), 1-26.
[13] Efron, B. and Tibshirani, R.J. (1993) An introduction to
the bootstrap. Chapman and Hall, New York.
[14] Mester, D. and Braysy, O. (2005) Active guided evolu-
tion strategies for large-scale vehicle routing problems
with time windows. Computers & Operation Research
32(6), 1593-1614.
[15] Mester, D.I., Ronin, Y.I., Nevo, E. and Korol, A.B. (2004)
Fast and high precision algorithms for optimization in
large scale genomic problems. Computational Biology
and Chemistry, 28(4), 281-290.
[16] Fu, Y., Wen, T.J., Ronin, Y.I., Chen, H.D., Guo, L.,
Mester, D.I., Yang, Y.J., Lee, M., Korol, A.B., Ashlock D.
A., et al. (2006) Genetic dissection of maize intermated
recombinant inbred lines. Genetics, 174(3), 1671-1683.
[17] Liu, B.H. (1998) Statistical genomics: Linkage, mapping,
and QTL analysis. CRC Press, New York.
[18] Jansen, J., de Jong, A.G. and Ooijen, J.W. (2001) Constr-
ucting dense genetic linkage maps. Theoretical and Ap-
plied Genetics, 102(6-7), 1113-1122.
[19] Mester, D.I., Ronin, Y.I., Hu, Y., Peng, J., Nevo, E. and
Korol, A.B. (2003a) Efficient multipoint mapping: Mak-
ing use of dominant repulsion-phase markers. Theoretical
and Applied Genetics, 107(6), 1002-1112.
[20] Korol, A.B. (2001) Recombination. In: Encyclopedia of
Biodiversity. Academic Press, San Diego, 5, 53-71.
[21] Sakamoto, T., Danzmann, R.G., Gharbi, K., Howard, P.,
Ozaki, A., Khoo, S.K., Woram, R.A., Okamoto, N., Fer-
guson, M.M., Holm, L.E., et al. (2000) A microsatellite
linkage map of rainbow trout (Oncorhynchus mykiss)
characterized by large sex-specific differences in recom-
bination rates. Genetics, 155(3), 1331-1345.
[22] Sivagnanasundaram, S., Broman, K.W., Liu, M. and
Petronis, A. (2004) Quasi-linkage: A confounding factor
in linkage analysis of complex diseases? Hum Genet,
114(6), 588-593.
[23] Morrell, P.L., Toleno, D.M., Lundy, K.E., and Clegg, M.
T. (2006) Estimating the contribution of mutation, re-
combination and gene conversion in the generation of
haplotypic diversity. Genetics, 173(3), 1705-1723.
[24] Plagnol, V., Padhukasahasram, B., Wall, J.D., Marjoram,
P. and Nordborg, M. (2006) Relative influences of cross-
ing over and gene conversion on the pattern of linkage
disequilibrium in Arabidopsis thaliana. Genetics, 172(4),
2441-2448.
[25] Dodds, K.G., Ball, R., Djorovic, N. and Carson, S.D.
(2004) The effect of an imprecise map on interval map-
ping QTLs. Genetical Research, 84(1), 47-55.
[26] Mester, D.I., Ronin, Y.I., Korostishevsky, M.A., Pikus, V.
L., Glazman, A.E. and Korol, A.B. (2006) Multilocus
consensus genetic maps (MCGM): Formulation, algo-
rithms, and results. Computational Biology and Chemis-
try, 30(1), 12-20.
[27] Hitte, C., Lorentzen, T.D., Guyon, R., Kim, L., Cadieu,
E., Parker, H.G., Quignon, P., Lowe, J.K., Gelfenbeyn, B.,
Andre, C., et al. (2003) Comparison of multimap and
TSP/CONCORDE for constructing radiation hybrid
maps. Journal of Heredity, 94(1), 9-13.
[28] Menotti-Raymond, M., David, V.A., Chen, Z.Q., Menotti,
K.A., Sun, S., Schaffer, A.A., Agarwala, R., Tomlin, J.F.,
O’Brien, S.J. and Murphy, W.J. (2003) Second-genera-
tion integrated genetic linkage/radiation hybrid maps of
the domestic cat (Felis catus). Journal of Heredity, 94(1),
95-106.
[29] Dantzig, G., Fulkerson, R. and Johnson, S. (1954) Solu-
tion of a large-scale traveling salesman problem. Opera-
tions Research, 2, 393-410.
[30] Appligate, D., Bixby, R., Chvatal V. and Cook, W. (1998)
On the solution of traveling salesman problem. Docu-
menta Mathematica, Extra Volume International Con-
gress of Mathematics III, 645-656.
[31] Cook, W. and Rich, J.L. (1999) A parallel cutting-plane
algorithm for vehicle routing problem with time window.
Grant Working, Princeton University, USA.
[32] Appligate, D., Cook, W., Dash, S. and Rohe, A. (2002)
Solution of min-max vehicle routing problem. INFO-
RMS Journal on Computing, 14(2), 132-143.
[33] Voudouris, C. (1997) Guided local search for combinato-
rial problems. Ph.D. thesis, Department of Computer Sci-
ence, University of Essex, Colchester.
[34] Tsang, E. and Voudouris, C. (1997) Fast local search and
guided local search and their application to British tele-
com’s workforce scheduling problem. Operations Re-
search Letters, 20(3) , 119-127.