Chinese cymbidiums (Cymbidium sp.) are important ornamental plants because of their foliage, flower shape, and fragrance. Well-known Chinese cymbidiums mainly include Cymbidium goeringii, Cymbidium faberi, Cymbidium ensifolium, Cymbidium kanran, and Cymbidium sinense. The population genetics of Chinese cymbidiums can be efficiently analyzed using small-scale marker panels with high discriminatory power. In this study, we tested several genic simple sequence repeats (SSRs) and built six genic SSR panels. The panels included several robust markers, which can rapidly assign Chinese cymbidium accessions to their source species. Fifty-three accessions of Chinese cymbidiums were analyzed using 25 markers, which exhibited polymorphism among five species. These markers were ranked according to their discriminatory scores (D scores). The WHICHLOCI program selected six markers to build an “overall” panel for all Cymbidium classifications, which yielded 95.16% population assignment accuracy. Considering one species as the “critical” population and the four other species as one population, we built five genic SSR panels: C. ensifolium panel (four markers, 98.05% accuracy), C. faberi panel (six markers, 95.90% accuracy), C. goeringii panel (six markers, 95.15% accuracy), C. sinense panel (six markers, 96.35% accuracy), and C. kanran panel (five markers, 96.10% accuracy). Genetic distance matrices calculated using the “overall” panel and those derived with all 25 markers were compared. The results showed a high correlation (R = 0.807) with statistical significance (P = 0.042). Moreover, the panels revealed higher genetic variation among populations than did all 25 markers. Hence, the developed panels are suitable for efficient population classification of Chinese cymbidiums.
Chinese cymbidiums are the species and hybrids derived from Cymbidium goeringii, C. faberi, C. ensifolium, C. kanran, and C. sinense. These species are terrestrial, and most of their cultivars present superior flower shape, foliage, and fragrance (Li et al., 2014c; Schneitz, 1999). Chinese cymbidiums have been cultivated for several centuries in Asia, especially in China, Korea, and Japan, because of their high ornamental and economic value (Moea and Parka, 2012). To conserve the genetic resources of Chinese cymbidiums, scholars have focused on conserving the genetic diversity of these species. Genetic diversity and population structure must be characterized before germplasm conservation. The genetic diversity of Cymbidium species can be analyzed using DNA markers, including restriction enzyme polymorphism markers (Obara-Okeyo and Kako, 1998); random amplified polymorphic DNA markers (Choi et al., 2006; Obara-Okeyo and Kako, 1998); amplified fragment length polymorphism markers (Wang et al., 2004); polymorphisms of internal transcribed spacers of nuclear ribosomal DNA and plastids, as well as inter-SSR (Lu et al., 2011); and SSR (Capesius, 1976; Moe et al., 2010; Moea and Parka, 2012). Genic SSRs, which are developed from genic regions, have recently been reported for Cymbidium and can be applied for genetic analysis (Li et al., 2013, 2014a). SSRs are ubiquitous, locus specific, codominant, highly polymorphic, and easy to use, and they cost less than other molecular markers (Honma and Goto, 2001; Li et al., 2006; Schwarz-Sommer et al., 1990; Tsai et al., 2004; Venkateswarlu et al., 2006; Xu et al., 2006). In addition, genic SSRs exhibit high transferability across species. These features make genic SSRs suitable for discriminating closely related species.
Genotyping large germplasm collections with numerous marker loci is expensive and inefficient. Researchers have therefore developed informative panels that use a minimum number of marker loci. Such a panel can be considered a DNA bar code and can be used for research on taxonomy and population genetics. The minimum combination of highly ranked loci for population assignment can be determined through two basic procedures. First, simulated test data are generated from the observed allele frequencies at each locus, the number of times each test genotype is correctly assigned to its source population is counted, and the markers are ranked by their individual scores. Second, further rounds of simulation add the ranked loci one at a time until the assignment score reaches or exceeds the accuracy criterion. This strategy has been applied for locus selection during population assignment in numerous species, including chinook salmon [Oncorhynchus tshawytscha (Greig et al., 2003)], Asian seabass [Lates calcarifer (Yue et al., 2012)], trematomids [Trematomus sp. (Van de Putte et al., 2009)], humans [Homo sapiens (Rosenberg et al., 2003)], switchgrass [Panicum virgatum (Okada et al., 2011)], and rice [Oryza sativa (Agrama et al., 2012)].
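The two procedures described above can be sketched in a few lines of Python. This is a minimal illustration and not the WHICHLOCI implementation: the function names, the Hardy-Weinberg likelihood, the `floor` value for unseen alleles, the number of simulations, and the toy allele frequencies are all assumptions made for the example.

```python
import math
import random


def simulate_genotype(freqs, rng):
    """Draw one diploid genotype (two alleles) from an allele-frequency dict."""
    alleles = list(freqs)
    weights = [freqs[a] for a in alleles]
    return tuple(rng.choices(alleles, weights=weights, k=2))


def log_likelihood(genotype, freqs, floor=1e-3):
    """Log-likelihood of a genotype under Hardy-Weinberg proportions.

    `floor` (an assumption) replaces the frequency of alleles that were
    never observed in a population, so that log(0) is avoided.
    """
    a, b = genotype
    pa = max(freqs.get(a, 0.0), floor)
    pb = max(freqs.get(b, 0.0), floor)
    return math.log(pa * pb * (1 if a == b else 2))


def assignment_accuracy(pop_freqs, loci, n_sim=500, seed=0):
    """Fraction of simulated individuals assigned back to their source population."""
    rng = random.Random(seed)
    correct = total = 0
    for source in pop_freqs:
        for _ in range(n_sim):
            geno = {l: simulate_genotype(pop_freqs[source][l], rng) for l in loci}
            best = max(
                pop_freqs,
                key=lambda p: sum(log_likelihood(geno[l], pop_freqs[p][l]) for l in loci),
            )
            correct += best == source
            total += 1
    return correct / total


def build_panel(pop_freqs, threshold=0.95, n_sim=500):
    """Step 1: rank loci by single-locus accuracy.
    Step 2: add ranked loci one at a time until the threshold is met."""
    all_loci = list(next(iter(pop_freqs.values())))
    ranked = sorted(
        all_loci,
        key=lambda l: assignment_accuracy(pop_freqs, [l], n_sim),
        reverse=True,
    )
    panel, accuracy = [], 0.0
    for locus in ranked:
        panel.append(locus)
        accuracy = assignment_accuracy(pop_freqs, panel, n_sim)
        if accuracy >= threshold:
            break
    return panel, accuracy


# Hypothetical allele frequencies: L1 is fully diagnostic, L2 is
# uninformative, and L3 is partially informative.
toy = {
    "pop1": {"L1": {"a": 1.0}, "L2": {"a": 0.5, "b": 0.5}, "L3": {"a": 0.8, "b": 0.2}},
    "pop2": {"L1": {"b": 1.0}, "L2": {"a": 0.5, "b": 0.5}, "L3": {"a": 0.2, "b": 0.8}},
}
panel, accuracy = build_panel(toy)
print(panel, accuracy)
```

In this toy case the diagnostic locus alone satisfies the 95% criterion, so the panel stops growing after one marker; with real data the loop would keep adding loci in rank order.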
In the present study, we aim to develop informative panels by using a minimum number of genic SSRs. The panels can be used to assign Chinese cymbidiums to five populations with high accuracy. The proposed strategy could be an efficient tool for population classification.
Materials and Methods
Fifty-three accessions from C. goeringii, C. faberi, C. ensifolium, C. kanran, and C. sinense were used for testing genetic markers (Table 1). These accessions were selected from 105 accessions (Supplemental Fig. 1) on the basis of their ancestries inferred with STRUCTURE (Pritchard et al., 2000); each selected accession had a maximum ancestry index higher than 0.6. The plants were grown and maintained in a greenhouse under natural light. Fresh leaves were collected from two or three seedlings of each accession for genomic DNA extraction.
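The selection rule above (keep an accession only if its largest ancestry index exceeds 0.6) can be expressed as a short filter. The sketch below assumes the ancestry estimates are available as one list of coefficients per accession; the accession names and values are hypothetical and are not taken from the study.

```python
def select_accessions(ancestry, threshold=0.6):
    """Keep accessions whose largest ancestry coefficient exceeds `threshold`.

    Returns {accession: index of the cluster with the largest coefficient}.
    """
    selected = {}
    for name, q in ancestry.items():
        top = max(q)
        if top > threshold:
            selected[name] = q.index(top)
    return selected


# Hypothetical ancestry coefficients for three accessions and five clusters.
q_matrix = {
    "acc_01": [0.92, 0.02, 0.02, 0.02, 0.02],  # clear ancestry, kept
    "acc_02": [0.35, 0.30, 0.15, 0.10, 0.10],  # mixed ancestry, excluded
    "acc_03": [0.05, 0.10, 0.70, 0.10, 0.05],  # clear ancestry, kept
}
kept = select_accessions(q_matrix)
print(kept)
```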
Fifty-three Chinese cymbidium accessions for genetic analysis. Cymbidium ensifolium, Cymbidium faberi, Cymbidium sinense, Cymbidium goeringii, and Cymbidium kanran had 10, 10, 10, 12, and 11 accessions, respectively.
Genomic DNA was extracted from leaf samples as previously described (Li et al., 2007). Twenty-five genic SSRs were selected from the genic SSR collection (Li et al., 2014a) for panel development. Polymerase chain reaction (PCR) primers were synthesized by Life Technologies (ABI and Invitrogen, Shanghai, China). PCR experiments were conducted, and the products were separated by polyacrylamide gel electrophoresis (Li et al., 2007). The gel was silver stained according to the procedure reported by Li et al. (2001), and the results were recorded using a scanner. The genotype was determined by analyzing band patterns (Li et al., 2014a). Accessions were scored as 1, 2, 3, and so on, according to the number and position of bands, and the data were then converted into a binary matrix using the “compute frequency” procedure in PowerMarker version 2.7 (Liu and Muse, 2005).
Polymorphism of markers.
Genetic distances were calculated using Nei’s distance (Nei and Takezaki, 1983). Phylogenetic reconstruction was performed based on neighbor-joining (NJ) method (with 1000 bootstrap replicates) and implemented in PowerMarker version 2.7. The phylogenetic tree was visualized in MEGA version 4 (Tamura et al., 2007). PowerMarker was also used to calculate the average number of marker alleles and polymorphism information content (PIC).
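For readers who want to see the two statistics spelled out, the sketch below computes PIC and a Nei distance from allele frequencies. It assumes the commonly used PIC definition, PIC = 1 - sum(p_i^2) - sum over i<j of 2 p_i^2 p_j^2, and assumes that the distance cited here is the D_A distance, D_A = 1 - (1/r) * sum over loci of sum over alleles of sqrt(x * y); neither formula is stated in the text, and the function names are hypothetical.

```python
import math
from itertools import combinations


def pic(freqs):
    """Polymorphism information content for one locus.

    `freqs` is an iterable of allele frequencies that sum to 1.
    """
    p = list(freqs)
    homozygosity = sum(x * x for x in p)
    cross = sum(2 * (a * a) * (b * b) for a, b in combinations(p, 2))
    return 1 - homozygosity - cross


def nei_da(pop_x, pop_y):
    """D_A distance between two populations.

    Each population is {locus: {allele: frequency}}; both must share loci.
    """
    total = 0.0
    for locus in pop_x:
        alleles = set(pop_x[locus]) | set(pop_y[locus])
        total += sum(
            math.sqrt(pop_x[locus].get(a, 0.0) * pop_y[locus].get(a, 0.0))
            for a in alleles
        )
    return 1 - total / len(pop_x)


# Toy frequencies (hypothetical): two equally frequent alleles.
print(pic([0.5, 0.5]))
same = {"L1": {"a": 0.5, "b": 0.5}}
print(nei_da(same, same))
print(nei_da({"L1": {"a": 1.0}}, {"L1": {"b": 1.0}}))
```

Identical populations give a distance of 0 and populations sharing no alleles give 1, which is a quick sanity check when adapting the functions to real frequency tables.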
Analysis of population structure.
Population structure was initially analyzed using the software STRUCTURE as described in previous studies (Li et al., 2010, 2011, 2012, 2014b). Genetic variations within and among populations were then calculated by analysis of molecular variance (AMOVA) in Arlequin V2.000 (Schneider and Excoffier, 1999). Genetic distances among species were calculated using Nei’s distance (Nei and Takezaki, 1983). A phylogenetic tree was constructed by NJ method. Both calculations were performed in PowerMarker version 2.7. The phylogenetic tree was visualized using MEGA version 4 (Tamura et al., 2007). The correlation of genetic distances based on total markers or panels was assessed using Mantel test in PowerMarker.
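The idea behind the Mantel comparison of two distance matrices can be illustrated with a small permutation test. The sketch below is not PowerMarker's implementation: it assumes a one-sided test on the Pearson correlation of the upper triangles and enumerates every permutation of the taxa, which is practical for small matrices (five species give only 5! = 120 permutations). The function names and the toy matrix are hypothetical.

```python
import math
from itertools import permutations


def upper(matrix, order):
    """Upper-triangle entries after relabelling rows and columns by `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def mantel_exact(d1, d2):
    """Observed correlation and exact one-sided permutation P value."""
    n = len(d1)
    identity = tuple(range(n))
    fixed = upper(d1, identity)
    observed = pearson(fixed, upper(d2, identity))
    count = total = 0
    for perm in permutations(range(n)):
        r = pearson(fixed, upper(d2, perm))
        count += r >= observed - 1e-12  # tolerance for floating-point ties
        total += 1
    return observed, count / total


# Toy symmetric distance matrix (hypothetical) with six distinct distances.
d = [
    [0, 1, 2, 3],
    [1, 0, 4, 5],
    [2, 4, 0, 6],
    [3, 5, 6, 0],
]
r, p = mantel_exact(d, d)
print(r, p)
```

Comparing a matrix with itself gives a correlation of 1 and the smallest attainable P value, 1/24 for four taxa, which shows how the number of taxa bounds the resolution of the test.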
Discriminatory power of markers.
Twenty-five genic SSRs were employed for WHICHLOCI analysis. Accessions with clear origins were used to test discriminatory powers. Using the WHICHLOCI v.1.0 software package (Banks et al., 2003), we ranked each of the 25 markers according to their D scores, which indicate discriminatory powers in population analysis. Discriminatory power was assessed by conducting 10,000 simulations (population size N = 2000) based on the allele frequency of each population. The threshold of the assignment accuracy was set to 95% and the stringency of limit of detection (LOD) was designated as 3.0 for all simulations. For the “critical” population, we determined which locus is necessary for identification of a specific species. These methods were applied in each species, where all species that do not belong to the “critical” population were treated as one population. Finally, reliability examination was performed on panel markers by comparing the results derived from panels with those obtained using all markers.
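The “critical” population setup, in which one species is kept separate and the remaining species are treated as a single population, can be sketched as follows. How WHICHLOCI pools the other populations internally is not described here, so the sample-size-weighted averaging of allele frequencies below is an assumption, as are the function name and the toy data.

```python
def pool_noncritical(pop_freqs, pop_sizes, critical):
    """Collapse all non-critical populations into one 'other' population.

    Allele frequencies are averaged with sample-size weights (an assumption).
    `pop_freqs` is {population: {locus: {allele: frequency}}}.
    """
    others = [p for p in pop_freqs if p != critical]
    n_other = sum(pop_sizes[p] for p in others)
    pooled = {}
    for locus in pop_freqs[critical]:
        freqs = {}
        for p in others:
            for allele, f in pop_freqs[p][locus].items():
                freqs[allele] = freqs.get(allele, 0.0) + f * pop_sizes[p] / n_other
        pooled[locus] = freqs
    return {critical: pop_freqs[critical], "other": pooled}


# Hypothetical frequencies for three populations at one locus.
toy_freqs = {
    "A": {"L1": {"x": 0.5, "y": 0.5}},
    "B": {"L1": {"x": 1.0}},
    "C": {"L1": {"y": 1.0}},
}
toy_sizes = {"A": 10, "B": 10, "C": 30}
two_pops = pool_noncritical(toy_freqs, toy_sizes, critical="A")
print(two_pops["other"]["L1"])
```

The resulting two-population table can then be passed to any assignment routine in the same way as the full multi-species table.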
Population structure and genic SSR profile.
Fifty-three of 105 accessions were selected based on their ancestries derived from STRUCTURE (Table 1). Accessions with severely mixed ancestries (maximum ancestry index lower than 0.6) were excluded. A total of 12, 11, 10, 10, and 10 accessions from C. goeringii, C. kanran, C. faberi, C. ensifolium, and C. sinense, respectively, were used for further analysis. The AMOVA results revealed 19.98% genetic variation among species and 80.02% genetic variation within species. Phylogenetic analysis also indicated that the largest genetic distance was found between C. ensifolium and C. sinense (0.2734), and the smallest distance was detected between C. goeringii and C. kanran (0.2008) (Fig. 1; Table 2).
Comparison of pairwise genetic distances among five Chinese Cymbidium species based on the “overall” panel comprising six markers (above the diagonal) and on all 25 markers (below the diagonal).
In the genic SSR collection, 25 genic SSRs exhibited polymorphism within the five species. These SSRs were selected to test assignment ability because equal numbers of effective markers (polymorphic markers) are required within each population for WHICHLOCI. The polymorphisms of the markers are shown in Table 3. The PIC values varied from 0.204 for SSR68 to 0.855 for SSR73, with an average content of 0.528. The allele number of these markers ranged from two for SSR79 to 14 for SSR08, with an average value of 6.88.
Twenty-five polymorphic genic simple sequence repeats (SSRs) applied for analysis of 53 Chinese cymbidium accessions.
Evaluation of marker discriminatory power and construction of panels.
D scores were used to measure how well each marker assigned accessions to their original species. Among the 25 markers, SSR73 showed the highest assignment score (4.941), whereas SSR68 displayed the lowest score (0.941) (Fig. 2; Table 4). The correlation coefficients between D scores and allelic numbers and those between D scores and PICs were 0.795 and 0.924, respectively. As more markers were employed, a higher proportion of simulated individuals was accurately assigned (Fig. 2). The percentage of correct assignments increased markedly over the first six markers and then increased gradually upon successive addition of the seventh to the 10th markers. The percentage then reached 99.31%. The “overall” panel was generated by WHICHLOCI without defining a critical population. The panel included six markers: SSR73, SSR21, SSR08, SSR64, SSR53, and SSR03. In this panel, the D scores of the markers varied from 3.57 for SSR03 to 4.94 for SSR73, with an average value of 4.15 (Table 4). About 95.16% of the simulated accessions were accurately assigned to their original species by using the most effective markers, which were selected based on D scores at an LOD threshold of 3.0 (Table 4).
Estimation of correct assignments of the five Cymbidium species and “critical” species by using “overall” or “critical” panels.
In the critical population method, the other populations were treated as one population. When C. ensifolium was selected as the critical population, the resulting panel generated 98.05% correct assignments. The panel included four markers with D scores ranging from 2.07 to 2.67 (Table 4). The C. faberi panel consisted of six markers, with D scores ranging from 1.71 to 2.77, and resulted in 95.90% correct assignments. The C. goeringii panel comprised six markers, with D scores ranging from 1.52 to 2.67, and resulted in 95.15% correct assignments. The C. sinense panel was composed of six markers, with D scores ranging from 1.79 to 2.57, and resulted in 96.35% correct assignments. The C. kanran panel included five markers, with D scores ranging from 1.94 to 2.79, and resulted in 96.10% correct assignments.
Application of marker panels in population structure analysis.
The discriminatory powers of marker panels in the population assignment were assessed. Genetic variation among the five populations derived using the “overall” panel was 29.92%, whereas that obtained using all markers was 19.98%. In critical population analysis for each Chinese cymbidium species, genetic variation between C. ensifolium and the four other populations derived using the C. ensifolium panel was 30.46%, which was higher than the 14.00% variation obtained using all markers. Similar results were observed in the analysis of the four other species (Table 5). In addition, pairwise genetic distances between species derived from the panel and from all markers were compared through Mantel analysis. The results showed a significant correlation, with a coefficient of 0.807 (P = 0.042), between the two marker sets (Table 2). This high correlation suggested that the panels developed can be effectively used to reveal population differentiation patterns.
Genetic variation among Cymbidium species explained by panels and all 25 markers (genic simple sequence repeats).
Genic SSRs as markers for analysis of population differentiation.
Genic SSRs show higher transferability across related species than anonymous SSR markers (Scott et al., 2000). Genic SSRs can be used to directly compare different taxa and prevent the risk of locus-specific differences (genetic diversity among accessions), which may mask true species-level differences (Ellis and Burke, 2007). As such, genic SSRs are suitable for discriminating species. In the present study, genic SSRs were used to create highly efficient panels for Chinese cymbidiums. As the complex relationships between accessions can influence the discriminating power of DNA markers (Hayes et al., 2005), accessions with ambiguous ancestries were not included in marker selection. Finally, 53 accessions with clear sources were analyzed in WHICHLOCI. To avoid bias associated with the number of markers used within each species, we selected 25 markers that showed polymorphisms within the five species for further analysis.
Factors influencing the discriminatory power of markers.
The informative value of markers is attributed to their allelic number and PIC (Agrama et al., 2012; Yue et al., 2012). The discriminatory power of markers is determined by their efficiency in correct population assignments and is inversely correlated with their propensity for causing false assignments (Banks et al., 2003). Successive assignment trials were performed in simulated populations based on the allele frequencies of markers (Banks et al., 2003). Thus, the D score of a marker is logically and positively associated with the corresponding genetic diversity, PIC, and number of alleles. In the “overall” panel analysis, six markers with high D scores presented high PICs and high allele numbers. Similarly, SSR68, SSR54, and SSR16, which showed low D scores, corresponded to low polymorphism and diversity. The correlation coefficient between D score and PIC and that between D score and number of alleles are 0.795 and 0.924, respectively. These results are consistent with those reported in a previous study (Agrama et al., 2012).
The order sorted by D scores and PICs was not exactly the same. This difference may be due to allelic distribution among and within populations. In critical population analysis for each species, specific markers containing a minimum of four or five alleles would be ranked as high for particular species, such as in the case of SSR01 in C. goeringii panel, which amplified C. goeringii-specific alleles. Similar results were also reported previously (Banks et al., 2003). C. goeringii was the least differentiated species in this study (Table 5) and required six markers for accurate assignment. This high requirement could be due to the polyphyletic quality of C. goeringii accessions. For example, numerous C. goeringii accessions are clustered with C. faberi or C. sinense, as reported in previous studies (Li et al., 2014a, 2014b).
Evaluation and improvement of panels for population classification.
To evaluate the capability of panels for population identification, we compared the results obtained using panels with those derived from total markers. Both results showed similar population structure patterns, as indicated by the high correlation coefficient (R = 0.8068) among genetic distance matrices. Nevertheless, the panels are considered more appropriate for population assignment because of their potential to present clear genetic differentiation with few markers. This ability could be attributed to the allelic distribution of panel markers, which is highly consistent with the population structure. Among the “critical” panels, the C. ensifolium panel was found to be the most efficient because it explained the highest variation among populations with the fewest markers (four). The C. goeringii panel explained relatively low variation and used the most markers (six). Considering the possibility of misassignment and uncertainty of the genetic background, we propose including several reference accessions when the panels are used for population analysis, so that the inferred population structure can be corrected.
Genic SSR panels with the strongest discriminatory power were used to characterize Chinese cymbidium germplasms and explore their population structure. In practice, these panels can facilitate rapid screening of large germplasm banks and species classification at a relatively low cost.
Agrama, H.A., McClung, A.M., Yan, W.G. 2012. Using minimum DNA marker loci for accurate population classification in rice (Oryza sativa L.). Mol. Breed. 29:413–425.
Choi, H., Kim, M.J., Lee, J.S., Ryu, K.H. 2006. Genetic diversity and phylogenetic relationships among and within species of oriental cymbidiums based on RAPD analysis. Sci. Hort. 108:79–85.
Greig, C., Jacobson, D.P., Banks, M.A. 2003. New tetranucleotide microsatellites for fine-scale discrimination among endangered chinook salmon (Oncorhynchus tshawytscha). Mol. Ecol. Notes 3:376–379.
Hayes, B., Sonesson, A.K., Gjerde, B. 2005. Evaluation of three strategies using DNA markers for traceability in aquaculture species. Aquaculture 250:70–81.
Li, X., Cui, H., Zhang, M. 2006. Molecular markers derived from EST: Their development and applications in comparative genomics. Biodiversity Sci. 14:541–547.
Li, X., Jin, F., Jin, L., Jackson, A., Huang, C., Li, K., Shu, X. 2014a. Development of Cymbidium ensifolium genic-SSR markers and their utility in genetic diversity and population structure analysis in cymbidiums. BMC Genet. 15:124.
Li, X., Luo, J., Yan, T., Xiang, L., Jin, F., Qin, D., Sun, C., Xie, M. 2013. Deep sequencing-based analysis of the Cymbidium ensifolium floral transcriptome. PLoS One 8:e85480.
Li, X., Xiang, L., Wang, Y., Luo, J., Wu, C., Sun, C., Xie, M. 2014b. Genetic diversity, population structure, pollen morphology and cross-compatibility among Chinese cymbidiums. Plant Breed. 133:145–152.
Li, X., Xu, W., Chowdhury, M., Jin, F. 2014c. Comparative proteomic analysis of labellum and inner lateral petals in Cymbidium ensifolium flowers. Intl. J. Mol. Sci. 15:19877–19897.
Li, X., Yan, W., Agrama, H., Hu, B., Jia, L., Jia, M., Jackson, A., Moldenhauer, K., Mcclung, A., Wu, D. 2010. Genotypic and phenotypic characterization of genetic differentiation and diversity in the USDA rice mini-core collection. Genetica 138:1221–1230.
Li, X., Yan, W., Agrama, H., Jia, L., Jackson, A., Moldenhauer, K., Yeater, K., Mcclung, A., Wu, D. 2012. Unraveling the complex trait of harvest index with association mapping in rice (Oryza sativa L.). PLoS One 7:e29350.
Li, X., Yan, W., Agrama, H., Jia, L., Shen, X., Jackson, A., Moldenhauer, K., Yeater, K., Mcclung, A., Wu, D. 2011. Mapping QTLs for improving grain yield using the USDA rice mini-core collection. Planta 234:347–361.
Li, Z.L., Jakkula, R.S., Hussey, J.P., Boerma, H.R. 2001. SSR mapping and confirmation of the QTL from PI96354 conditioning soybean resistance to southern root-knot nematode. Theor. Appl. Genet. 103:1167–1173.
Moe, K.T., Zhao, W., Song, H.S., Kim, Y.H., Chung, J.W., Cho, Y.I., Park, P., Park, H.S., Chae, S.C., Park, Y.J. 2010. Development of SSR markers to study diversity in the genus Cymbidium. Biochem. Syst. Ecol. 38:585–594.
Moea, K.T., Parka, Y. 2012. Analysis of population structure revealed apparent genetic disturbance in Korea Cymbidium collection. Sci. Hort. 134:157–162.
Nei, M., Takezaki, N. 1983. Estimation of genetic distances and phylogenetic trees from DNA analysis. Proc. 5th World Congr. Genet. Appl. Livestock Prod. p. 405–412.
Obara-Okeyo, P., Kako, S. 1998. Genetic diversity and identification of Cymbidium cultivars as measured by random amplified polymorphic DNA (RAPD) markers. Euphytica 99:95–101.
Okada, M., Lanzatella, C., Tobias, C.M. 2011. Single-locus EST-SSR markers for characterization of population genetic diversity and structure across ploidy levels in switchgrass (Panicum virgatum L.). Genet. Resources Crop Evol. 58:919–931.
Schneider, S., Excoffier, L. 1999. Estimation of past demographic parameters from the distribution of pairwise differences when the mutation rates vary among sites: Application to human mitochondrial DNA. Genetics 152:1079–1089.
Schwarz-Sommer, Z., Huijser, P., Nacken, W., Saedler, H., Sommer, H. 1990. Genetic control of flower development by homeotic genes in Antirrhinum majus. Science 250:931–936.
Tsai, W.C., Kuoh, C.S., Chuang, M.H., Chen, W.H., Chen, H.H. 2004. Four DEF-like MADS box genes displayed distinct floral morphogenetic roles in Phalaenopsis orchid. Plant Cell Physiol. 45:831–844.
Van de Putte, A.P., Van Houdt, J.K.J., Maes, G.E., Janko, K., Koubbi, P., Rock, J., Volckaert, F.A.M. 2009. Species identification in the trematomid family using nuclear genetic markers. Polar Biol. 32:1731–1741.
Venkateswarlu, M., Raje Urs, S., Surendra Nath, B., Shashidhar, H.E., Maheswaran, M., Veeraiah, T.M., Sabitha, M.G. 2006. A first genetic linkage map of mulberry (Morus spp.) using RAPD, ISSR, and SSR markers and pseudotestcross mapping strategy. Tree Genet. Genomes 3:15–24.
Wang, H.Z., Wang, Y.D., Zhou, X.Y., Ying, Q.C., Zheng, K.L. 2004. Analysis of genetic diversity of 14 species of Cymbidium based on RAPDs and AFLPs. Acta Biol. Expt. Sinica 37:482–486 (in Chinese).
Yue, G.H., Xia, J.H., Liu, P., Liu, F., Sun, F., Lin, G. 2012. Tracing Asian seabass individuals to single fish farms using microsatellites. PLoS One 7:e52721.