Abstract
We estimate the minimum core size necessary to maximally represent a portion of the U.S. Department of Agriculture's National Plant Germplasm System apple (Malus) collection. We have identified a subset of Malus sieversii individuals that complements the previously published core subsets for two collection sites within Kazakhstan. We compared the size and composition of this complementary subset with a core set composed without restrictions. Because the genetic structure of this species has been previously determined, we were able to identify the origin of individuals within this core set with respect to their geographic location and genetic lineage. In addition, this core set is structured in a way that samples all of the major genetic lineages identified in this collection. The resulting panel of genotypes captures a broad range of phenotypic and molecular variation throughout Kazakhstan. These samples will provide a manageable entry point into the larger collection and will be critical in developing a long-term strategy for ex situ wild Malus conservation.
A central tenet of gene bank management is to make a collection useful. The applied value of these collections for crop improvement or gene discovery often depends upon fostering efficient utilization. The concept of core collections (or core sets) was initially proposed as a way to define a representative subset of the genetic diversity of a crop species (Brown, 1989a; Frankel, 1984). Thus, core collections give researchers, breeders, and trait specialists an efficient entry point to the whole collection. As a management tool, core collections have been proposed to capture the common and rare alleles within a fraction (5–10%) of the original collection (Brown, 1989b). Traditionally, core collections have been determined based on geographical and phenotypic characteristics (Crossa et al., 1993), but increasingly, genetic data have also been used to make selections (Liu et al., 2003; Marita et al., 2000; Ronfort et al., 2006). In many cases, targeted subsamples of a large collection have been developed that focus on specific traits or localities of interest (e.g., Ma et al., 2006).
It has been suggested that core sets are particularly desirable for vegetatively propagated clonal collections (Hodgkin et al., 1995). This germplasm is more expensive to maintain than orthodox seed collections because individuals are kept under field, greenhouse, or in vitro conditions rather than in long-term reduced temperature storage. Clones maintained in limited field plantings are subject to attrition through disease or bad weather. Such core sets may be more frequently requested, and gene bank managers can plan on having appropriate propagules available when needed. Increased distribution of core sets can result in additional characterization data, thus increasing the utility of the larger collection the core set represents (Rubenstein et al., 2006; van Hintum, 1999). Seed-based core sets may be particularly useful in wild relative collections of clonally propagated crops. In these accessions, the objective is frequently to capture the allelic variation within the accession, but not necessarily any particular genotypes. These core sets can be used in a crossing design to preserve allelic variation segregating within populations of seeds, which can then be stored for longer periods of time at far lower cost (Volk et al., 2005).
U.S. Department of Agriculture (USDA) plant exploration teams collected Malus sieversii seeds and clones from Kazakhstan between 1989 and 1996 (Dzhangaliev, 2003; Forsline et al., 2003; Hokanson et al., 1997; Luby et al., 2001). Over one thousand trees derived from seeds collected during these trips have been planted and characterized at the USDA-Agricultural Research Service Plant Genetic Resources Unit (PGRU) in Geneva, NY. Samples for these trees representing eight collection sites in Kazakhstan and one collection site in Kyrgyzstan have been genotyped to determine the population structure of wild M. sieversii using seven highly variable microsatellite loci (Richards et al., 2009). Results from this analysis using standard population genetic approaches and Bayesian assignment methods identified four genetically distinct, stable clusters of individuals (Richards et al., 2009). Importantly, these clusters revealed a pattern of variation that was not primarily defined among collection sites but rather among broad geographic regions. This regional pattern of differentiation revealed ongoing admixture that obscured site-specific differentiation.
Remarkable progress has been made in the generation of genotypic data in many agricultural taxa and the development of algorithms and bioinformatic tools used to guide the construction of core subsets. Many methods rely on initial stratification of the samples into groups that reflect some ecogeographic attribute or quantitative trait value (Brown, 1989a, 1995; Franco et al., 2005, 2006; Li et al., 2004). Stratification ensures that sampling is distributed among relevant groups defined beforehand. Alternatively, maximization strategies attempt to reduce redundancy in a core set without a priori stratification (Schoen and Brown, 1995). Maximization strategies have been developed that can be used to efficiently assemble core subsets based on character states such as alleles at molecular loci or values of quantitative traits (Brown, 1989b; Gouesnard et al., 2001; McKhann et al., 2004; Schoen and Brown, 1993, 1995). A key feature of this approach is that redundancy in the collection can be empirically assessed and the size of an appropriate core set can be established (Gouesnard et al., 2001).
While the collection itself represents an important source of variation for breeding improvement in Malus, the living orchard collection is at risk of continued seasonal mortality. Core subsets of individuals representing the genotypic and phenotypic diversity of two of the largest collection sites in Kazakhstan have been proposed in part to stem these losses by developing a long-term seed-based backup and to increase utility of this material by breeders and researchers worldwide (Volk et al., 2005). In addition, vegetative buds from the individuals in the core subset will be cryopreserved to ensure long-term availability. The two collection sites were considered separately to maintain putative site-specific environmental adaptations to drought and cold temperatures. Establishing a complementary core set among the other seven collection sites is the next step in developing a comprehensive conservation strategy for the entire Kazakhstan collection at PGRU.
A key feature of determining the composition of the proposed core set in this study is the availability of an estimate of the genetic structure of this collection (Richards et al., 2009). That study used genetic diversity and linkage disequilibrium between alleles at the sampled loci to partition the genotypes into groups that share common ancestry. One advantage of this method is that admixture among geographic regions can be detected and quantified.
In this article, we estimated the minimum core size necessary to maximally represent the diversity of the M. sieversii collection. Our objective was to identify a subset of M. sieversii individuals that complements the existing core subsets for two collection sites within Kazakhstan. This study investigated the size and composition of this complementary subset. We were particularly interested in the origin of individuals in these selected sets: which sites and which genetic lineages are represented? Inclusion of diverse sites and genetic lineages in core collections is key to ensuring that full representation is achieved.
Materials and Methods
Collection materials and DNA extraction.
Malus sieversii seeds were collected from wild trees during plant explorations to Kazakhstan in 1989, 1993, 1995, and 1996. Clones of individuals collected in Kazakhstan and classified as elite due to unusual or desirable characteristics were not included in these analyses. DNA was extracted from duplicate leaf samples from 961 seedling accessions available in the field collections in 2003. The individual composition of the core subsets identified by Volk et al. (2005) was modified slightly. Specifically, seven accessions were added to core sets 6 and 9. These additions were necessary to offset variation in vigor among the original selections and they brought the total number of accessions within the two site-specific core collections to 77. Seven amplified microsatellite loci yielding 103 alleles were separated on a gel-based system (LI-COR, Lincoln, NE) as previously described (Richards et al., 2009; Volk et al., 2005). The simple sequence repeats (SSR) were amplified using unlinked primers GD12, GD15, GD96, GD100, GD142, GD147, and GD162 (Hemmat et al., 2003; Hokanson et al., 2001). Phenotypic data from 21 continuous traits were categorically classified according to standards described in the publicly available USDA Germplasm Resources Information Network database (USDA, 2004).
Data analysis.
In 2007, 797 of the 961 seedling trees were alive and available in the field collection. Genotypic and phenotypic data for these 797 accessions were considered for the construction of core collections. All analyses used the maximization algorithm in the software package MSTRAT (Gouesnard et al., 2001), based upon the maximization strategy proposed by Schoen and Brown (1995). Briefly, this method treats each allele and each quantitative trait category as a unique character state. The objective is to identify the smallest subset of individuals that contains all the character states, that is, a set that is maximized for character state variation (Gouesnard et al., 2001). For some analyses, MSTRAT was made to include the previously identified genotypes used in core sets developed for two of the nine collection sites genotyped (Volk et al., 2005). In these cases, maximization focused on complementing variation not included in the original cores by identifying novel variation in the other seven sites.
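The idea of maximizing character states while locking in previously selected genotypes can be illustrated with a minimal sketch. The code below is not the MSTRAT algorithm; it substitutes a simple greedy heuristic for illustration only. The function name `greedy_core`, the individual labels, and the character-state labels are hypothetical toy values and are not drawn from the collection data.

```python
from typing import Dict, Iterable, List, Set


def greedy_core(states: Dict[str, Set[str]],
                mandatory: Iterable[str] = (),
                target_fraction: float = 1.0) -> List[str]:
    """Greedy stand-in for maximization (an assumption, not MSTRAT itself).

    `states` maps each individual to its character states (alleles and
    trait categories). Individuals in `mandatory` are locked in first, and
    individuals are then added until the core holds `target_fraction` of
    all character states.
    """
    all_states = set().union(*states.values())
    core = list(dict.fromkeys(mandatory))            # keep order, drop repeats
    covered = set().union(*(states[i] for i in core))
    needed = target_fraction * len(all_states)
    while len(covered) < needed:
        candidates = [i for i in sorted(states) if i not in core]
        if not candidates:
            break
        # Pick the individual adding the most states not yet in the core;
        # ties go to the first label in sorted order.
        best = max(candidates, key=lambda i: len(states[i] - covered))
        if not states[best] - covered:
            break
        core.append(best)
        covered |= states[best]
    return core


# Toy data: four individuals, two loci, one categorical trait (hypothetical).
states = {
    "A": {"GD12:150", "GD15:200", "scab:resistant"},
    "B": {"GD12:150", "GD15:204"},
    "C": {"GD12:154", "GD15:200", "scab:susceptible"},
    "D": {"GD12:154", "GD15:204", "scab:resistant"},
}

unrestricted = greedy_core(states)                   # ['A', 'C', 'B']
complemented = greedy_core(states, mandatory=["B"])  # ['B', 'C', 'A']
```

In this toy case, locking in individual B changes the order in which the remaining individuals are chosen, because the added individuals only need to supply the character states that B lacks.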
As a benchmark, we estimated the size of a collection needed to capture about 90% to 95% of the total variation by using a feature of MSTRAT that measures the fraction of total diversity obtained in core subsets of varying size. If each genotype in the larger collection contributed some unique character, there would be a linear relationship between variation captured and sample size. However, variation is commonly structured, especially in natural populations where dispersal limits panmixia (Hamrick and Godt, 1997). In these cases, the fraction of total diversity (measured in character states) of a sample plateaus at a certain size, similar to a saturation curve where there is a diminishing return on diversity after a certain sample size is reached. These “redundancy” curves were developed using the mean fraction of diversity captured in five independent sampling runs. The inflection point of the resulting curvilinear plot can be used to find the optimal core sample size (Gouesnard et al., 2001). We estimated the optimal core size in the original set of 961 genotypes and the sample of 797 genotypes that were healthy and flowering in 2007. In addition, we compared the optimal core size among these datasets when using molecular data (7 loci, 103 alleles), phenotypic data (21 traits, 114 total trait states), or both. The quantitative metric used for this comparison was the fraction of total character states retained in the core set. The resulting diversity of these cores assembled using maximization was compared with core sets assembled at random. The difference between the two sampling curves illustrates the net gain in diversity realized through maximization, and provides a relative measure of core collection success in capturing representative variation. Once an appropriate core size was identified, the composition of this subset was examined. In many instances, there were several equally diverse core sets. 
To develop a consensus set of genotypes, we examined 10 possible core sets for each dataset. We chose the set that contained the most commonly found genotypes among the 10 replicate core sets.
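One possible reading of this consensus step is sketched below. The scoring rule is an assumption: each replicate core is scored by how often its members appear across all replicates, and the highest-scoring replicate is kept. The function name `consensus_core` and the toy labels are hypothetical.

```python
from collections import Counter
from typing import List, Sequence


def consensus_core(replicates: Sequence[Sequence[str]]) -> List[str]:
    """Return the replicate whose members recur most often across replicates.

    Assumes the replicate cores are of equal size, so that summed counts
    are comparable between them.
    """
    # Number of replicate cores in which each genotype appears.
    counts = Counter(g for rep in replicates for g in set(rep))
    # Score a core by summing the appearance counts of its members.
    best = max(replicates, key=lambda rep: sum(counts[g] for g in set(rep)))
    return list(best)


# Three toy replicate cores (hypothetical genotype labels).
replicates = [
    ["A", "B", "C"],
    ["A", "B", "D"],
    ["A", "C", "E"],
]
chosen = consensus_core(replicates)  # ['A', 'B', 'C']
```

Here the first replicate is chosen because its members A, B, and C appear in three, two, and two replicates, respectively, giving the highest summed count.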
We examined the distribution of collection sites and genetic clusters represented within a core subset from the complete set of 961 genotypes (core-SSR), or a core subset using the genotypic and phenotypic data for the 797 individuals living in 2007.
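The random-sampling baseline of the redundancy curves described in this section can be sketched as follows. This is a minimal illustration under stated assumptions, not the MSTRAT implementation: the function names `fraction_captured` and `random_curve`, the fixed seed, and the toy character states are hypothetical. The default of five runs mirrors the five independent sampling runs mentioned above.

```python
import random
from statistics import mean
from typing import Dict, List, Sequence, Set


def fraction_captured(core: Sequence[str], states: Dict[str, Set[str]]) -> float:
    """Fraction of all character states present in at least one core member."""
    total = set().union(*states.values())
    held = set().union(*(states[i] for i in core))
    return len(held) / len(total)


def random_curve(states: Dict[str, Set[str]],
                 sizes: Sequence[int],
                 runs: int = 5,
                 seed: int = 0) -> List[float]:
    """Mean fraction captured by randomly assembled cores of each size."""
    rng = random.Random(seed)
    names = sorted(states)
    return [
        mean(fraction_captured(rng.sample(names, k), states) for _ in range(runs))
        for k in sizes
    ]


# Toy data (hypothetical labels): four individuals, six character states.
states = {
    "A": {"GD12:150", "GD15:200", "scab:resistant"},
    "B": {"GD12:150", "GD15:204"},
    "C": {"GD12:154", "GD15:200", "scab:susceptible"},
    "D": {"GD12:154", "GD15:204", "scab:resistant"},
}

curve = random_curve(states, sizes=[0, 1, 2, 3, 4])
```

Plotting such a curve against the fraction captured by maximized cores of the same sizes gives the gap that the text describes as the net gain in diversity realized through maximization.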
Results
Redundancy curves in the full set of 961 genotypes show that regardless of the source of the data (molecular or phenotypic), cores maximized for character diversity always capture more diversity than a similarly sized, randomly assembled core (Fig. 1). However, data types capture diversity at different levels of efficiency. Phenotypic data saturated earlier—it took as few as 27 individuals to capture 95% of the phenotypic diversity. While the number of states was high for these quantitative traits, saturation required few individuals. This is most likely because many of the agricultural traits showed high covariance. In contrast, it took 84 individuals to capture 91% of the genotypic diversity [subsequently referred to as core-SSR, n = 84 (Fig. 1)].

Fig. 1. Diversity redundancy curves for all 961 Malus sieversii individuals in the complete data set. Plots compare the amount of diversity retained in cores maximized for trait diversity based on available phenotypic, genotypic, or phenotypic and genotypic data (top three curves) and similarly sized, randomly assembled cores (bottom curve). To capture phenotypic trait variation, fewer individuals could be used than were necessary to capture an equal percentage of the genotypic variation.
Citation: Journal of the American Society for Horticultural Science J. Amer. Soc. Hort. Sci. 134, 2; 10.21273/JASHS.134.2.228

Redundancy curves were also developed for a set of genotypes to complement the diversity in the established cores for sites 6 and 9 (Fig. 2). For these data, we considered only the 797 healthy individuals. The 77 individuals representing the site 6 and 9 core collections (Volk et al., 2005) were indexed in a way that they became a mandatory part of the resulting core. A complementary third core of 35 individuals captured 94% of the measured genetic and phenotypic diversity of the entire 797 seedling dataset (Table 1). In contrast, when individuals were randomly selected from the population of 797, 445 individuals were required to capture a comparable level of diversity (Fig. 1). These selected individuals (complementary core, n = 35) exhibit desirable characteristics such as disease resistance and fruit quality traits. For example, 54% of the individuals in the complementary core are resistant to fire blight (Erwinia amylovora) and 34% are resistant to apple scab (Venturia inaequalis).
Table 1. Malus sieversii phenotypic characterization data for the new complementary core of 35 individuals.



Fig. 2. Fractional composition of Malus sieversii core collections among the nine collection sites in Kazakhstan and Kyrgyzstan. Each core set was developed with different source data and objectives. The total (n = 797) represents the distribution of samples that were healthy and flowering in 2007. Core-SSR (n = 84) represents the subset of individuals selected using SSR data from the total 961 individuals genotyped. Set of three cores (n = 112) represents a collection where the previous site 6 and 9 core sets were locked in (77 genotypes) and 35 additional genotypes were selected to capture the most diversity in the total collection. Complementary new core (n = 35) represents the genotypes identified in this study that complement the previous two site-specific core sets.
The contribution of each collection site to each core set is shown in Fig. 2. The histogram shows the proportion of genotypes in a core set drawn from each collection site. The data confirm that when core subsets from sites 6 and 9 (totaling 77 individuals) are forced into a core of the entire collection, the additional 35 individuals (complementary core, n = 35) that are needed to capture the remaining variation are chosen primarily from sites 3, 5, 7, 11, and 12 (Fig. 2). The core-SSR (n = 84) developed from all the available 961 genotypes (Table 2) included individuals drawn roughly in proportion to the number of samples collected at each site. The one exception was site 5, which contributed disproportionately to the core set, most likely due to the presence of rare private alleles.
Table 2. Core set of 84 Malus sieversii individuals (core-SSR) identified using genotypic data. Individuals are classified according to collection site, family (arbitrary identification number), and cluster.


The contribution of each genetic cluster to each core set is shown in Fig. 3. The histogram shows the proportion of genotypes in each core set that were selected from each cluster. Comparison of the core sets shows some slight differences in composition, especially where the core set composition differs from the total (n = 797). These discrepancies may be due to properties of the core set criteria (such as forcing the inclusion of 77 genotypes) or to the diversity of the genetic cluster. The set of three cores (n = 112) is composed of genotypes in proportion to the size of each of the clusters, whereas core-SSR (n = 84) draws more heavily on clusters 3 and 4. The new proposed complementary core (n = 35) has a higher representation of individuals selected from clusters 3 and 4 (20% and 14%, respectively) than would be predicted by the size of the cluster (total n = 797) (Fig. 3). Thus, the proposed new core is heavily represented by individuals drawn from the smaller genetic clusters 3 and 4.

Fig. 3. Fractional composition of Malus sieversii core collections among four genetic clusters within the collection of viable genotypes from Kazakhstan and Kyrgyzstan. The histogram compares the numerical distribution of samples in core sets among four genetic lineages (clusters) found in the 961 M. sieversii individuals sampled. The total (n = 797) represents the distribution of samples that were healthy and flowering in 2007. Core-SSR (n = 84) represents the subset of individuals selected using SSR data from the total 961 individuals genotyped. Set of three cores (n = 112) represents a collection where the previous site 6 and 9 core sets were locked in (77 genotypes) and 35 additional genotypes were selected to capture the most diversity in the total collection. Complementary new core (n = 35) represents the genotypes identified in this study that complement the previous two site-specific core sets.
Discussion
The M. sieversii seedling collection maintained in the field in Geneva, NY, has over 1000 inventories that represent 108 mother trees. This collection displays high levels of diversity at the phenotypic and genotypic level, but its size makes it unwieldy for many research and breeding efforts. The M. sieversii collection becomes more manageable when core sets of individuals are available that capture this diversity at the phenotypic and genotypic level. The phenotypic traits included in these analyses include disease resistance, quality, and yield characteristics, all of which are important considerations in breeding programs. The inclusion of data collected from unlinked genotypic markers makes the proposed core sets potentially more diverse than they would have been if only phenotypic data were considered.
The complementary core (n = 35) increases the representation of genotypes among collection sites and genetic lineages that were not represented in the core sets proposed for sites 6 and 9 (Volk et al., 2005). The development of three independent complementary cores for M. sieversii provides researchers with tools that allow them to select the group of individuals that are most relevant to their research goals. Spatial genetic patterns specific to sites 6 and 9 can be evaluated in site-specific cores, and the complementary core of 35 serves to capture the diversity that was not available at those locations. Researchers who are interested in trees that may be particularly drought tolerant or cold hardy can select the core collections targeted to sites 6 and 9, respectively. Alternatively, those interested in evaluating a representative subset of the USDA M. sieversii collection can choose to use the combined core set of 112 individuals.
The use of maximization algorithms to identify core subsets has resulted in cores that capture allelic, geographic, and phenotypic diversity (Balfourier et al., 2007). New algorithms continue to be proposed that may also capture diversity by measuring distances between accessions within defined groups (Jansen and van Hintum, 2007), by least distance stepwise sampling (Wang et al., 2007), or by sampling a single individual from sets of clusters (Franco et al., 2005). The quantitative metrics used to assess the efficiency of these algorithms often describe how the mean and variance of a trait value within the core compares with the larger collection (e.g., Upadhyaya and Ortiz, 2001). We selected the maximization method because we knew that the geographic boundaries of genetic variation were more diffuse in this species due to ongoing admixture.
Implicit in these core collections is the assumption that core sets maximized for diversity using a set of specific attributes (molecular or phenotypic) are in fact representative of diversity elsewhere in the genomes of the selected individuals (Bataillon et al., 1996). Validation of this assumption comes from assessing the retention of variation at independent loci in the core (Le Cunff et al., 2008; McKhann et al., 2004; Ronfort et al., 2006). Evidence from simulation analysis suggests that these validation approaches will support core set selection more often in inbreeding species (Bataillon et al., 1996). While we do not use independent loci to validate these selections, we show that the core set composition reflects not only geographic diversity but also, and most importantly, the genetic diversity at the level of lineages. In studies of natural systems, a priori designations of the units that comprise populations or clusters are often based upon geographical criteria such as the collection site where ecological and environmental conditions can be assessed. Increasingly, studies of structure rely on novel model-based clustering methods that use a Bayesian analytical procedure to simultaneously reveal cryptic population structure and assign individuals to clusters (Huelsenbeck and Andolfatto, 2007; Pritchard et al., 2000). The proposed core set specifically targets defined genetic clusters that represent different ancestry within the natural populations. Collections based on diverse genotypic data may be more representative than those based on phenotypic data (Hu et al., 2000). This may be especially true in wild germplasm collections where phenotypic similarity may mask substantial genotypic diversity (Tanksley and McCouch, 1997).
The three distinct core collections of 40 (site 6), 37 (site 9), and 35 (complementary core) individuals capture over 90% of the total diversity in the larger collection. The core sets include ≈14% of the individuals in the PGRU M. sieversii field collection. The trees included in this set of 112 will be repropagated and maintained indefinitely as clones in the main Malus collection. Additional data will be collected for accessions that have not yet been thoroughly phenotyped. The trees in the site 6 and site 9 core sets have been included in a large-scale hand-pollination crossing effort to generate sets of seeds that represent the genotypes of each core set. Genotyping efforts are underway to confirm that these sets are indeed representative of the core diversity. In Spring 2008, the trees in the new core of 35 M. sieversii individuals will be crossed in a similar manner to produce seed lots that represent the diversity of this core set. The seed lots for each of the three M. sieversii core sets will be made available for distribution for research purposes.
Literature Cited
Balfourier, F. , Roussel, V. , Strelchenko, P. , Exbrayat-Vinson, F. , Sourdille, P. , Boutet, G. , Koenig, J. , Ravel, C. , Mitrofanova, O. , Michel Beckert, M. & Charmet, G. 2007 A worldwide bread wheat core collection arrayed in a 384-well plate Theor. Appl. Genet. 114 1265 1275
Bataillon, T.M. , David, J.L. & Schoen, D.J. 1996 Neutral genetic markers and conservation genetics: Simulated germplasm collections Genetics 14 409 417
Brown, A.H.D. 1989a The case for core collections 136 156 Brown A.H.D. , Frankel O. , Marshall D.R. & Williams J.T. The use of plant genetic resources Cambridge University Press Cambridge, UK
Brown, A.H.D. 1989b Core collections: A practical approach to genetic resources management Genome 31 818 824
Brown, A.H.D. 1995 The core collection at the crossroads 3 19 Hodgkin T. , Brown A.H.D. , van Hintum T.J.L. & Morales E.A.V. Core collections of plant genetic resources Wiley Chichester, UK
Crossa, J. , Hernandez, C.M. , Bretting, P. , Eberhart, S.A. & Taba, S. 1993 Statistical genetic considerations for maintaining germplasm collections Theor. Appl. Genet. 86 673 678
Dzhangaliev, A.D. 2003 The wild apple tree of Kazakhstan Hort. Rev. (Amer. Soc. Hort. Sci.) 29 63 303
Forsline, P.L. , Aldwinckle, H.S. , Dickson, E.E. & Hokanson, S.C. 2003 Collection, maintenance, characterization, and utilization of wild apples from central Asia Hort. Rev. (Amer. Soc. Hort. Sci.) 29 1 61
Franco, J. , Crossa, J. , Warburton, M. & Taba, S. 2006 Sampling strategies for conserving maize diversity when forming core subsets using genetic markers Crop Sci. 46 854 864
Franco, J. , Crossa, J. , Taba, S. & Shands, H. 2005 A sampling strategy for conserving genetic diversity when forming core subsets Crop Sci. 45 1035 1044
Frankel, O.H. 1984 Genetic perspectives on germplasm conservation 161 170 Arber W. , Llimensee K. , Peacock W.L. & Starlinger P. Genetic manipulation: Impact on man and society Cambridge University Press Cambridge, UK
Gouesnard, B. , Bataillon, T.M. , Decoux, G. , Rozale, C. , Schoen, D.J. & David, J.L. 2001 MSTRAT: An algorithm for building germplasm core collections by maximizing allelic or phenotypic richness J. Hered. 92 93 94
Hamrick, J.L. & Godt, M.J.W. 1997 Allozyme diversity in cultivated crops Crop Sci. 37 26 30
Hemmat, M. , Weeden, N.F. & Brown, S.K. 2003 Mapping and evaluation of Malus ×domestica microsatellites in apple and pear J. Amer. Soc. Hort. Sci. 128 515 520
Hodgkin, T. , Brown, A.H.D. , van Hintum, T.J.L. & Morales, E.A.V. 1995 Future directions 253 259 Hodgkin T. , Brown A.H.D. , van Hintum T.J.L. & Morales E.A.V. Core collections of plant genetic resources Intl. Plant Genet. Resources Inst./Wiley-Sayce Rome
Hokanson, S.C. , McFerson, J.R. , Forsline, P.L. , Lamboy, W.F. , Djangaliev, A.D. & Aldwinckle, H.S. 1997 Collecting and managing wild Malus germplasm in its center of diversity HortScience 32 173 176
Hokanson, S.C. , Lamboy, W.F. , Szewc-McFadden, A.K. & McFerson, J.R. 2001 Microsatellite (SSR) variation in a collection of Malus (apple) species and hybrids Euphytica 118 281 294
Hu, J. , Zhu, J. & Xu, H.M. 2000 Methods of constructing core collections by stepwise clustering with three sampling strategies based on the genotypic values of crops Theor. Appl. Genet. 101 264 268
Huelsenbeck, J.P. & Andolfatto, P. 2007 Inference of population structure under a Dirichlet process model Genetics 175 1787 1802
Jansen, J. & van Hintum, T. 2007 Genetic distance sampling: A novel sampling method for obtaining core collections using genetic distances with an application to cultivated lettuce Theor. Appl. Genet. 114 421 428
Le Cunff, L. , Fournier-Level, A. , Laucou, V. , Vezzulli, S. , Lacombe, T. , Adam-Blondon, A.F. , Boursiquot, J.M. & This, P. 2008 Construction of nested genetic core collections to optimize the exploitation of natural diversity in Vitis vinifera L. subsp. sativa BMC Plant Biol. 8 31
Li, C.T. , Shi, C.H. , Wu, J.G. , Xu, H.M. , Zhang, H.Z. & Ren, Y.L. 2004 Methods of developing core collections based on the predicted genotypic value of rice (Oryza sativa L.) Theor. Appl. Genet. 108 1172 1176
Liu, K. , Goodman, M. , Muse, S. , Smith, J.S. , Buckler, E. & Doebley, J. 2003 Genetic structure and diversity among maize inbred lines as inferred from DNA microsatellites Genetics 165 2117 2128
Luby, J. , Forsline, P. , Aldwinckle, H. , Bus, V. & Giebel, M. 2001 Silk road apples: Collection, evaluation, and utilization of Malus sieversii L. from central Asia HortScience 36 225 231
Ma, Y.S. , Wang, W.H. , Wang, L.X. , Ma, F.M. , Chang, R.Z. & Qiu, L.J. 2006 Genetic diversity of soybean and establishment of a core collection focused on resistance to soybean cyst nematode J. Integr. Plant Biol. 48 722 731
Marita, J.M. , Rodriguez, J.M. & Nienhuis, J. 2000 Development of an algorithm identifying maximally diverse core collections Genet. Resources Crop Evol. 47 515 526
McKhann, H.I. , Camilleri, C. , Berard, A. , Bataillon, T. , David, J.L. , Reboud, X. , Le Corre, V. , Caloustian, C. , Gut, I.G. & Brunel, D. 2004 Nested core collections maximizing genetic diversity in Arabidopsis thaliana Plant J. 38 193 202
Pritchard, J.K. , Stephens, M. & Donnelly, P. 2000 Inference of population structure using multilocus genotype data Genetics 155 945 959
Richards, C.M. , Volk, G.M. , Reilley, A.A. , Henk, A.D. , Lockwood, D.R. , Reeves, P.A. & Forsline, P.L. 2009 Genetic diversity and population structure in Malus sieversii, a wild progenitor species of domesticated apple Tree Genet. Genomes
Ronfort, J. , Bataillon, T. , Santoni, S. , Delalande, M. , David, J.L. & Prosperi, J.M. 2006 Microsatellite diversity and broad scale geographic structure in a model legume: Building a set of nested core collection for studying naturally occurring variation in Medicago truncatula BMC Plant Biol. 6 28
Rubenstein, K.D. , Smale, M. & Widrlechner, M.P. 2006 Demand for genetic resources and the U.S. National Plant Germplasm System Crop Sci. 46 1021 1031
Schoen, D.J. & Brown, A.H.D. 1993 Conservation of allelic richness in wild crop relatives is aided by assessment of genetic markers Proc. Natl. Acad. Sci. USA 90 10623 10627
Schoen, D.J. & Brown, A.H.D. 1995 Maximising genetic diversity in core collections of wild relatives of crop species 55 76 Hodgkin T. , Brown A.H.D. , van Hintum T.J.L. & Morales E.A.V. Core collections of plant genetic resources Intl. Plant Genet. Resources Inst./Wiley-Sayce Rome
Tanksley, S.D. & McCouch, S.R. 1997 Seed banks and molecular maps: Unlocking genetic potential from the wild Science 277 1063 1066
Upadhyaya, H.D. & Ortiz, R. 2001 A mini core subset for capturing diversity and promoting utilization of chickpea genetic resources in crop improvement Theor. Appl. Genet. 102 1292 1298
U.S. Department of Agriculture 2004 National genetic resources program. Germplasm resources information network (GRIN) 23 Sept. 2008 <http://www.ars-grin.gov/cgi-bin/npgs/html/desclist.pl?115>.
van Hintum, T.J.L. 1999 The core selector, a system to generate representative selections of germplasm collections Plant Genet. Resour. Newsl. 118 64 67
Volk, G.M. , Richards, C.M. , Reilley, A.A. , Henk, A.D. , Forsline, P.L. & Aldwinckle, H.S. 2005 Ex situ conservation of vegetatively propagated species: Development of a seed-based core collection for Malus sieversii J. Amer. Soc. Hort. Sci. 130 203 210
Wang, J.C. , Hu, J. , Xu, H.M. & Zhang, S. 2007 A strategy on constructing core collections by least distance stepwise sampling Theor. Appl. Genet. 115 1 8