Abstract
Oakleaf hydrangea (Hydrangea quercifolia) is an understory shrub native to the southeastern United States. The species occupies a relatively small native range, and little is known about its demography, genetic diversity, or needs for conservation. Samples were collected from 188 plants in 73 locations throughout the species range and were genotyped using genotyping by sequencing. A structure analysis identified six genetic clusters that are geographically defined. Although these clusters are weakly differentiated, each has unique alleles. An environmental association analysis revealed that environmental variables explain 11.3% of genetic diversity, and population structure explains 13.5%. Further, 231 putative adaptive alleles were identified, most of which are correlated with precipitation-related variables, indicating that precipitation has an impact on genetic diversity in H. quercifolia. Many historically documented populations were found to be either extirpated or at risk of extirpation. The genetic clusters on the southern extent of the species range are relatively small and contain putative adaptive alleles at relatively high frequencies. These results highlight the importance of preserving representative germplasm from throughout the species range.
Genetic diversity is crucial for species to adapt to changing environments. Understanding how the diversity of a species is structured can help to guide conservation efforts by identifying populations with unique or at-risk diversity (Allendorf et al. 2010). Generalizations about the structure of genetic diversity have been made based on the life history of a species. Woody plants, for example, are expected to have higher diversity within than between populations, whereas insect pollination is expected to increase the differentiation between populations as compared with wind or self-pollination (Hamrick et al. 1992). Similar patterns based on life history have been found in response to processes such as habitat fragmentation (Vranckx et al. 2012). However, species-specific patterns have been identified that indicate a need to assess each species individually. Hahn et al. (2017) identified different responses in genetic diversity and differentiation to forest succession among 11 species of subtropical trees and shrubs, with weakly competing species decreasing in diversity and increasing in differentiation more quickly.
Genetic diversity has been studied in several insect-pollinated woody plants with the goal of addressing conservation concerns. Several examples can be taken from Rhododendron, where many species have been studied that are native to Asia (Li et al. 2015; Wu et al. 2015; Zhao et al. 2012) and North America (Chappell et al. 2008). These studies determined that many of the Rhododendron species are endangered and need conservation strategies. Another genus in which genetic diversity analysis uncovered a conservation need is Forsythia, where it was found that populations of Forsythia ovata and Forsythia suspensa have critically low genetic diversity and need management intervention (Chung et al. 2013; Fu et al. 2016). Pulsatilla patens was found to have extremely low genetic diversity due to small population sizes, which is contributing to severe genetic erosion (Szczecińska et al. 2016).
H. quercifolia is a native understory shrub or small tree found in a six-state region in the southeastern United States (Fig. 1). Since its first botanical description by William Bartram in 1791, it has largely been considered as an ornamental plant but has been overlooked in comparison with some more commercially exploited species (Mallet et al. 1992). H. quercifolia is an insect-pollinated, obligate outcrossing species (Reed 2000, 2004), and grows in the shady understory of hardwood forests. It is most often found in well-drained soil on steep slopes such as bluffs, cliffs, and riverbanks. H. quercifolia has been observed growing directly out of rocks, suggesting that rich soil is not a requirement for growth. Like other Hydrangea species, it can naturally propagate by branch layering and occasionally forms dense clonal stands (Kanno and Seiwa 2004; Pilatowski 1982).
Map of Hydrangea quercifolia DNA sample collection locations superimposed on the US Department of Agriculture (USDA) Plants Database range map for the species. Each black point represents one of the 188 samples collected from 73 US locations in the six-state native range of H. quercifolia. Counties shaded in green are those in which H. quercifolia has been previously documented (USDA, Natural Resources Conservation Service 2022).
Citation: Journal of the American Society for Horticultural Science 148, 1; 10.21273/JASHS05255-22
Herbarium records indicate that H. quercifolia was once common in the southeastern United States; however, our preliminary exploration suggested that many of the historic populations either no longer exist or are small and may not be regenerating. Many of these extirpated populations have been affected by land use change (clear cutting, housing developments, etc.) or habitat degradation (e.g., invasive species). In addition, many of the remaining populations are very small, often having fewer than 10 individuals and in some cases fewer than five individuals. This brings into question the need for conservation action for H. quercifolia, which is best guided by genetic diversity data. Information regarding the structure of the genetic diversity throughout the species range as well as determining which factors contribute to the structure are vital to developing a conservation strategy for the species. This information can be used to prioritize conservation by identifying unique or especially at-risk diversity.
The objectives of this study were to 1) locate populations of H. quercifolia across its natural range, 2) characterize the genetic diversity of the species and how the diversity is structured, 3) determine the impact that geographic and climatic factors have on this genetic structure, and 4) use this information to identify conservation priorities for the species.
Methods and Materials
Identifying populations.
The range of occurrence for H. quercifolia was reported by McClintock (1957) and the US Department of Agriculture (USDA) Plants Database (USDA, Natural Resources Conservation Service 2022) to be within the US states of Alabama, Florida, Georgia, Louisiana, Mississippi, and Tennessee (Fig. 1). As a first step in locating populations of wild H. quercifolia, herbarium databases in each state were searched for previously documented locations of the species (Table 1). Collection sites of herbarium specimens and public lands in the immediately surrounding areas were searched. In addition, state and private agencies were contacted to assist in locating populations and to secure permissions and permits for collection of leaf and seed samples.
Herbaria in which Hydrangea quercifolia specimens were surveyed for population identification.
Leaf collections.
Samples were collected from 188 individuals from 73 collection sites (populations) in May 2017 and between Apr and Jul 2018 (Fig. 1, Table 2). For the purposes of this study, a population was defined as a semicontinuous group of hydrangeas at a sampling location with populations being more than 2 km apart. Three to five young, expanding leaves per plant were collected from one to seven individuals per population (mean = 2.6 individuals per population). Sampled plants were a minimum of 3 m apart to avoid resampling an individual clone, based on the observation that the species often forms clonal colonies. Geographic coordinates (latitude and longitude) were recorded for each sampled plant with a smartphone app [EpiCollect5; ver. 3.0.3 (Aanensen et al. 2009)], which is accurate to 10 m. Notes were taken on the habitat in which the plant was growing along with any exceptional characteristics of the plant. Leaves were placed into coin envelopes with silica (Dry & Dry, Brea, CA, USA) in the field immediately after collection and lyophilized on return to the laboratory. Dried leaf samples were then frozen at −20 °C until DNA extraction.
Locations and number of samples of Hydrangea quercifolia populations with corresponding National Plant Germplasm System (NPGS) accession number.
DNA extraction.
DNA was extracted using a plant DNA extraction kit (DNeasy 96; Qiagen, Hilden, Germany). The manufacturer’s instructions were modified to include 2.5% beta-mercaptoethanol in the lysis solution, and elution incubations were extended to 10 min with elution buffer starting at 80 °C. Quality and quantity of extracted DNA were assessed using agarose gel electrophoresis and spectrophotometry (Nanodrop; Thermo Fisher Scientific Inc., Waltham, MA, USA) on a subset of samples. Fluorimetry analysis (PicoGreen; Invitrogen, Carlsbad, CA, USA) was conducted on all samples.
Genotyping.
DNA samples were genotyped using a genotyping by sequencing (Elshire et al. 2011) protocol at the University of Minnesota Genomics Center (Minneapolis, MN, USA). Samples were first digested with the restriction enzyme BamHI, barcoded, then sequenced on a sequencing instrument (NextSeq 550; Illumina, San Diego, CA, USA) with 150-base pair (bp) single-end reads in a single multiplexed pool. Approximately 300 million sequence reads were generated.
Single nucleotide polymorphism (SNP) variants were called using the Stacks de novo pipeline [version 2.3b (Catchen et al. 2011)]. Parameters for SNP calling were determined by first running the pipeline on a subset of eight random samples with varying parameter values to identify the parameter set that maximized the number of polymorphic loci identified. The final parameters used on the full data set were distance allowed between stacks (M) = 2, minimum stack depth (m) = 4, and distance allowed between catalog loci (n) = 2.
Loci with biallelic SNPs were filtered to include those with a read depth between 15 and 100, a minor allele frequency ≥0.05, mean genotyping quality ≥39, allele balance between 0.4 and 0.6 for heterozygotes, and allele balance ≥0.9 for homozygotes using vcftools [version 0.1.13 (Danecek et al. 2011)] and custom Python scripts. Many of the polymorphic loci (read length = 150 bp) contained multiple SNPs. Therefore, instead of each SNP being analyzed independently, each locus was considered a multiallelic haplotype marker made up of multiple SNPs (Helyar et al. 2011). Having a higher allele number per locus allowed for a higher information content per marker.
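The filtering thresholds above can be illustrated with a minimal Python sketch. It is not the authors' vcftools commands or custom scripts, which are not given in the text. The `GenotypeCall` container, the function names, and the choice to apply the depth and quality thresholds to each individual call (rather than as per-locus means) are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class GenotypeCall:
    """Hypothetical container for one genotype call at a biallelic SNP."""
    ref_reads: int          # reads supporting the reference allele
    alt_reads: int          # reads supporting the alternate allele
    quality: float          # genotype quality score
    is_heterozygote: bool   # True if the call is heterozygous


def passes_call_filters(call, min_depth=15, max_depth=100, min_quality=39):
    """Apply the depth, quality, and allele-balance thresholds to one call."""
    depth = call.ref_reads + call.alt_reads
    if not min_depth <= depth <= max_depth:
        return False
    if call.quality < min_quality:
        return False
    if call.is_heterozygote:
        # Heterozygotes should show roughly equal support for both alleles.
        return 0.4 <= call.alt_reads / depth <= 0.6
    # Homozygotes should be dominated by a single allele.
    return max(call.ref_reads, call.alt_reads) / depth >= 0.9


def minor_allele_frequency(alt_allele_counts):
    """Minor allele frequency from diploid alternate-allele counts (0, 1, 2).

    Missing calls are given as None and are ignored.
    """
    called = [g for g in alt_allele_counts if g is not None]
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq)
```

A locus would then be retained only if its minor allele frequency is at least 0.05 and its calls pass the per-call thresholds.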
Genetic diversity analyses.
To identify the number of genetic clusters in H. quercifolia, Structure [version 2.3.4 (Pritchard et al. 2000)] was used with 50,000 Markov chain Monte Carlo replications after a 5000-replication burn-in period. Values of the number of clusters (K) from K = 1 to K = 10 were tested, with 10 Structure runs for each value of K. Structure runs were combined using Structure Harvester [version 0.6.94 (Earl and vonHoldt 2012)] and the most likely number of genetic clusters was determined using the delta-K method, which relies on determining the rate at which the probability of the observed data changes when increasing proposed values of K (Evanno et al. 2005).
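As a rough illustration of the delta-K idea, the sketch below computes the statistic from replicate log-probabilities for each K. It is not the Structure Harvester code. It assumes one common reading of the Evanno et al. (2005) formula, in which the absolute second difference of the mean log-probability is divided by the standard deviation across runs at that K. The function name and input layout are hypothetical.

```python
from statistics import mean, stdev


def delta_k(ln_prob_by_k):
    """Return {K: delta K} for every K that has both neighbours K-1 and K+1.

    ln_prob_by_k maps each tested K to a list of ln P(D) values,
    one per replicate run (at least two runs per K).
    """
    average = {k: mean(runs) for k, runs in ln_prob_by_k.items()}
    result = {}
    for k in sorted(ln_prob_by_k):
        if k - 1 not in average or k + 1 not in average:
            continue  # delta K is undefined at the ends of the tested range
        spread = stdev(ln_prob_by_k[k])
        if spread == 0:
            continue  # avoid dividing by zero when all runs agree exactly
        second_difference = abs(average[k + 1] - 2 * average[k] + average[k - 1])
        result[k] = second_difference / spread
    return result
```

The K with the largest value in the returned dictionary is the one the method favours. Because the statistic needs both neighbours, it cannot be evaluated for the smallest or largest K tested.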
Principal component analysis (PCA) on allele frequencies at all loci, implemented in the R package ade4 [version 1.7–15 (Thioulouse and Dray 2007)], was used as an additional method of detecting and visualizing population structure (Jombart et al. 2009). Because PCA does not handle missing data, missing genotype calls were replaced with the mean allele frequency for the allele and therefore this analysis slightly underestimates the true amount of genetic diversity.
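The effect of mean imputation on apparent diversity can be seen in a small Python sketch. The function name and the toy frequencies are hypothetical, and the sketch covers only the imputation step for a single allele, not the ade4 PCA itself.

```python
from statistics import mean, pvariance


def impute_with_mean(frequencies):
    """Replace missing values (None) with the mean of the observed values."""
    observed = [f for f in frequencies if f is not None]
    if not observed:
        raise ValueError("cannot impute an allele with no observed genotypes")
    fill = mean(observed)
    return [fill if f is None else f for f in frequencies]


# Toy example: allele frequencies for one allele across four individuals,
# with the last genotype call missing.
raw = [0.0, 0.5, 1.0, None]
filled = impute_with_mean(raw)

# The imputed value sits exactly on the mean, so it adds an observation
# without adding any spread and the variance shrinks.
variance_observed = pvariance([f for f in raw if f is not None])
variance_imputed = pvariance(filled)
```

In this toy case the variance drops from about 0.167 to 0.125, which is the sense in which imputed data slightly underestimate diversity.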
A Procrustes analysis was used to test the congruency between PCA and geographic location (Wang et al. 2010), using the R package MCMCpack [version 1.4–6 (Martin et al. 2011)]. A Procrustes analysis minimizes the sum of squared differences between two matrices of coordinates by translating, rotating, and uniformly scaling one matrix to fit the other without distorting it. In this case, the matrix of PCA values was fitted to the matrix of latitude and longitude coordinates and differences between them were visualized by connecting the two points [Procrustes transformed principal component (PC) values and geographic location] with a line on a map.
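For two-dimensional data, the Procrustes fit has a closed-form solution, sketched below in plain Python. This is an illustration of the technique, not the MCMCpack routine used in the study. The function name is hypothetical, and the sketch assumes that a reflection is allowed, because the sign of a principal component is arbitrary.

```python
import math


def procrustes_2d(source, target):
    """Fit 2-D source points (e.g., PC1/PC2) onto target points (e.g., lon/lat).

    Uses translation, uniform scaling, rotation, and an optional reflection,
    chosen to minimize the sum of squared distances between matched points.
    Returns the transformed source points in the target coordinate system.
    """
    n = len(source)
    sx, sy = sum(p[0] for p in source) / n, sum(p[1] for p in source) / n
    tx, ty = sum(p[0] for p in target) / n, sum(p[1] for p in target) / n
    src = [(x - sx, y - sy) for x, y in source]   # centered source
    tgt = [(u - tx, v - ty) for u, v in target]   # centered target
    norm = sum(x * x + y * y for x, y in src)

    best = None
    for flip in (1.0, -1.0):  # try the source as-is and mirrored
        a = sum(u * x + v * flip * y for (x, y), (u, v) in zip(src, tgt))
        b = sum(v * x - u * flip * y for (x, y), (u, v) in zip(src, tgt))
        fit = math.hypot(a, b)  # larger fit means smaller residual
        if best is None or fit > best[0]:
            best = (fit, flip, math.atan2(b, a))

    fit, flip, theta = best
    scale = fit / norm
    cos_t, sin_t = math.cos(theta), math.sin(theta)
    fitted = []
    for x, y in src:
        y *= flip
        fitted.append((scale * (cos_t * x - sin_t * y) + tx,
                       scale * (sin_t * x + cos_t * y) + ty))
    return fitted
```

The length of the line drawn for each sample on the map corresponds to the distance between a fitted point and its matching target point.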
Pairwise fixation index (Fst) was used to quantify the genetic differentiation among the detected genetic clusters. Isolation by distance was tested using a Mantel test on Nei’s genetic distance (Nei 1972) and geographic distance using the R package ade4 [version 1.7–15 (Dray and Dufour 2007)]. Genetic diversity within each cluster was also quantified as expected heterozygosity and average number of alleles per locus, calculated in the R package poppr [version 2.8.5 (Kamvar et al. 2014)].
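The logic of a Mantel test can be sketched in Python as a permutation test on two distance matrices. This is a simplified, one-sided illustration rather than the ade4 implementation. The function names, the number of permutations, and the random seed are assumptions, and the matrices are taken as already computed (e.g., Nei's genetic distance and geographic distance).

```python
import random


def pearson(x, y):
    """Pearson correlation between two equal-length lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def upper_triangle(matrix, order):
    """Pairwise values above the diagonal, with rows/columns taken in `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def mantel_test(genetic, geographic, permutations=999, seed=1):
    """Return (observed correlation, one-sided permutation P value)."""
    rng = random.Random(seed)
    order = list(range(len(genetic)))
    geo = upper_triangle(geographic, order)
    observed = pearson(upper_triangle(genetic, order), geo)
    at_least_as_large = 1  # the observed arrangement counts as one permutation
    for _ in range(permutations):
        rng.shuffle(order)  # relabel populations in the genetic matrix only
        if pearson(upper_triangle(genetic, order), geo) >= observed:
            at_least_as_large += 1
    return observed, at_least_as_large / (permutations + 1)
```

Rows and columns are permuted together because the pairwise distances within a matrix are not independent of one another. A large P value, as reported in the Results, means that shuffled labels produce correlations at least as strong as the observed one most of the time.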
Genome-environmental association analysis.
An environmental association analysis was used to determine the effects of environmental factors on population structure. This was implemented with a redundancy analysis (RDA) in the R package vegan [version 2.5–6 (Dixon 2003)]. An RDA is a multidimensional extension of linear regression; in this case, a PCA ordination of allele frequencies that is constrained by environmental variables (Forester et al. 2018). Environmental variables used were described in the BioClim database for the location of each sample (Fick and Hijmans 2017; Supplemental Table 1). To reduce collinearity among the variables, a PCA was used on the environmental variables, and the variables that explained >50% of the variation on the first two PCs were included in the RDA model (Rellstab et al. 2015). Including only these variables captures a large amount of environmental variation without overfitting the model. Population structure was controlled for by conditioning the RDA with the individual Q-matrix from Structure. Loci that were outliers (±4 standard deviations from the mean) on RDA1 or RDA2 were considered to be candidate loci, which are putatively under selection by environmental variables. Correlations were then tested between each candidate locus and the environmental variables.
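The outlier rule for candidate loci can be expressed in a few lines of Python. The sketch assumes that the loadings of each locus on one RDA axis have already been extracted from the fitted model. The function name and the locus labels are hypothetical.

```python
from statistics import mean, stdev


def outlier_loci(loadings, threshold=4.0):
    """Return names of loci whose loading lies more than `threshold`
    standard deviations from the mean loading on one RDA axis.

    loadings maps a locus name to its loading on that axis.
    """
    center = mean(loadings.values())
    spread = stdev(loadings.values())
    return sorted(name for name, value in loadings.items()
                  if abs(value - center) > threshold * spread)


# Toy example: 200 loci with small loadings and one strongly loaded locus.
toy_loadings = {f"locus{i}": (0.01 if i % 2 else -0.01) for i in range(200)}
toy_loadings["locus_candidate"] = 1.0
candidates = outlier_loci(toy_loadings)
```

Running the same function on the RDA1 and RDA2 loadings separately and taking the union would give the candidate set. A cutoff of four standard deviations is conservative, so only loci in the extreme tails of the loading distribution are flagged.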
Candidate SNP loci that are associated with environmental factors were mapped back to the fully annotated oakleaf hydrangea genome, H. quercifolia v1.1 (US Department of Energy, Joint Genome Institute 2022). The top hit of each SNP locus was then used to identify potential genes within the surrounding 10-kbp region of the genome.
Results
Identifying populations.
H. quercifolia populations were found in all six states for which it had been previously reported to occur (Fig. 1, Table 2). Small population sizes were frequently encountered during sampling, with a few having only a single individual (three populations) and many having fewer than 10 individuals (27 populations). Most of these small populations were on or near the edges of the species range, especially the populations west of the Mississippi River in Louisiana. Many of the populations were very fragmented with little to no connection to other sites with suitable habitat. An exception to this occurred in northern Alabama, where H. quercifolia grows as nearly continuous populations, often containing hundreds of individuals. Many populations in the southern portion of the species range were growing in calcareous soils; however, a wide range of soil types supporting H. quercifolia populations were observed.
Of the locations where H. quercifolia had previously been documented, 23% (16 of 69 searched) were found to no longer contain populations when searched in this study. These sites had either undergone land use change (housing developments, clear cutting, roadside vegetation control) or habitat degradation due to logging or invasive species. The primary invasive species noted was Ligustrum sinense. Observations during sampling indicate that H. quercifolia does not tolerate high competition. Many of these extirpated populations were located on the edges of the species range.
Genotyping.
After all filtering steps, 6006 polymorphic loci were selected for analysis with between 2 and 25 alleles (haplotypes) per locus (mean = 3.5). On average, each locus had 22x sequencing coverage. The loci had a mean of 1.9 SNPs per locus, which contributed to the high allelic diversity relative to most studies using genotyping by sequencing. Sequence data and genotype files are available for populations included in the USDA Germplasm Resources Information Network database (USDA, Agricultural Research Service 2022).
Genetic diversity analysis.
The Structure analysis indicated that the most likely number of genetic clusters in H. quercifolia is K = 6. These genetic clusters are geographically structured with substantial admixture in certain parts of the species range (Fig. 2). The clusters were numbered from north to south and east to west as a way of unambiguously referring to each one. A large cluster exists throughout Mississippi, eastern Louisiana, and western Tennessee (cluster five). The small, disjunct populations in Louisiana west of the Mississippi River belong to a distinct genetic cluster (cluster six). In the eastern half of the species range, the genetic clusters are structured latitudinally with one cluster existing in northern Alabama, northern Georgia, and eastern Tennessee (cluster one). The populations in southeastern Alabama and southern Georgia belong to one genetic cluster (cluster two). The populations in south-central Alabama (cluster four) and the populations in the panhandle of Florida (cluster three) are each in their own genetic cluster.
Geographic structure of genetic diversity in Hydrangea quercifolia. Location of each pie chart on the map indicates location of sample collection. Proportion of each pie chart indicates assignment probability to each respective genetic cluster. Genetic clusters were determined using Structure (Pritchard et al. 2000) on 188 samples from 73 US populations throughout the species native range.
The PCA provided results congruent with the Structure analysis (Figs. 3 and 4A). PC1 spreads out cluster five and separates cluster four from the others with cluster four appearing as an island of extremely high PC1 values (Figs. 3A and 4A). PC2 separates the remaining clusters in the eastern half of the species range latitudinally, with northern populations having higher PC2 values (Figs. 3B and 4A). PC2 also separates cluster three from cluster two. Cluster one is in the center of the PCA biplot, with varying degrees of overlap from the adjacent genetic clusters, indicating substantial admixture.
Maps of principal component (PC) analysis of allelic diversity in Hydrangea quercifolia throughout its native US range: (A) map of PC1 values (point color represents PC1 value); (B) map of PC2 values (point color represents PC2 value).
Unconstrained and constrained ordination biplots of allele frequency in Hydrangea quercifolia throughout its US native range. (A) Principal component (PC) analysis (unconstrained ordination) with point color representing genetic cluster with highest assignment probability. (B) Redundancy analysis (RDA) (constrained ordination) constrained by environmental factors and conditioned by population structure. This displays the genetic variation that is unaccounted for by population structure and can be explained by environmental variation.
Procrustes analysis revealed that geography corresponds to genetic diversity (Fig. 5). Cluster four is most affected by the Procrustes transformation, with the populations occurring in south-central Alabama, but the transformed PCA values moving to the top of the plot and correspondingly having the longest lines. Cluster five experiences substantial shrinkage in the Procrustes transformation, with the geographic extent of the cluster being the largest, but the range of PCA values not being correspondingly large. After the Procrustes transformation, cluster six maintained its relationship to cluster five being at the extremity of both the geographic range and PCA values. Cluster three is the least affected by the Procrustes transformation, staying very close to the geographical location of the populations, and therefore having the shortest lines. As expected by the correlation between PC2 and latitude in the eastern half of the species range, clusters one and two have similar shifts to the west but to a slightly larger degree.
Map showing Procrustes transformed principal component analysis (PCA) values of allelic diversity of Hydrangea quercifolia throughout its US native range. Black points indicate transformed PCA values. Sampling locations are indicated by points colored by their respective genetic cluster. Points representing the same sample are connected with a line.
Pairwise Fst values among the clusters are shown in Table 3; overall, Fst values were quite low (mean = 0.027). Cluster three had the highest pairwise Fst values with the other genetic clusters, which indicates its genetic uniqueness. The two highest pairwise Fst values were between cluster three and cluster one (0.056) and between cluster three and cluster five (0.052).
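To give a sense of scale for these values, the sketch below computes a simple heterozygosity-based Fst for one locus in two equally weighted populations. The text does not state which Fst estimator was used, so this textbook form, (Ht − Hs)/Ht, is an assumption for illustration only, and the function names are hypothetical.

```python
def expected_heterozygosity(frequencies):
    """Expected heterozygosity, 1 minus the sum of squared allele frequencies."""
    return 1.0 - sum(p * p for p in frequencies)


def pairwise_fst(freqs_a, freqs_b):
    """Fst for one locus from allele frequencies in two populations.

    Both lists give frequencies for the same alleles in the same order.
    Populations are weighted equally (an illustrative simplification).
    """
    within = (expected_heterozygosity(freqs_a)
              + expected_heterozygosity(freqs_b)) / 2
    pooled = [(a + b) / 2 for a, b in zip(freqs_a, freqs_b)]
    total = expected_heterozygosity(pooled)
    if total == 0:
        return 0.0  # locus is monomorphic in both populations
    return (total - within) / total
```

Under this form, two populations with allele frequencies of 0.6 and 0.4 at a biallelic locus give Fst = 0.04, which is of the same order as the pairwise values reported here.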
Pairwise fixation index (Fst) between genetic clusters of Hydrangea quercifolia. Genetic clusters were determined using Structure (Pritchard et al. 2000) on 188 samples from 73 populations throughout the species native range.
The results of the Mantel test do not support isolation by distance in H. quercifolia (P = 0.998). Genetic diversity statistics are shown in Table 4. Mean expected heterozygosity at the species level was 0.358, and observed values are comparable within each genetic cluster except for cluster six (0.294). Mean number of alleles per locus within each cluster was substantially lower than at the species level, with the lowest being cluster six (1.47). However, when using rarefaction to account for differing numbers of individuals in each cluster, all values decrease to a similar level (Table 4) due to each cluster effectively being a subsample of the entire species.
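Rarefaction of allele counts can be sketched as follows. The text does not specify the rarefaction procedure, so the sketch assumes the standard hypergeometric form, in which the expected number of distinct alleles is calculated for a random subsample of g gene copies. The function name is hypothetical.

```python
from math import comb


def rarefied_allele_count(allele_counts, subsample_size):
    """Expected number of distinct alleles in a random subsample of gene copies.

    allele_counts lists how many copies of each allele were observed at a locus.
    subsample_size is the number of gene copies drawn without replacement.
    """
    total_copies = sum(allele_counts)
    if subsample_size > total_copies:
        raise ValueError("subsample cannot exceed the number of sampled copies")
    all_draws = comb(total_copies, subsample_size)
    expected = 0.0
    for count in allele_counts:
        # Probability that this allele is absent from the subsample.
        # comb() returns 0 when fewer than subsample_size other copies exist.
        absent = comb(total_copies - count, subsample_size) / all_draws
        expected += 1.0 - absent
    return expected
```

For example, a locus with allele counts of 9 and 1 has two alleles in the full sample but an expected 1.2 alleles in a subsample of two gene copies. Rarefying every cluster to the size of the smallest one therefore pulls all values toward a similar, lower level.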
Sample sizes, expected heterozygosity (He), and mean alleles per locus (A) for each genetic cluster and Hydrangea quercifolia as a whole. Genetic clusters were determined using Structure (Pritchard et al. 2000) on 188 samples from 73 populations throughout the species native range.
Genome-environmental association analysis.
The environmental association analysis indicates that environmental factors have a significant impact on genetic structure in H. quercifolia (Fig. 4B). The RDA indicates that when conditioned by population structure, all 19 BioClim variables explain 11.25% of the variation with population structure explaining 13.5% (P = 0.001). Even when controlling for both geographic locations and population structure, environmental variation still explains 10.9% of the diversity observed. Performing PCA on the environmental variables revealed that nine of the variables account for more than 50% of the variation on each of the first two PCs (Table 5). An RDA restricted to these nine variables explained 5.9% of the genetic variation (P = 0.001).
Environmental variables included in the redundancy analysis (RDA) of the environmental association analysis of Hydrangea quercifolia.
Outlier analysis revealed 231 alleles at 170 loci that were significant outliers on RDA1 (143 alleles) and RDA2 (89 alleles). One of these loci was a significant outlier on both RDA axes. The nine environmental variables included in the analysis are presented in Table 5, along with the number of loci that had their highest correlation with each variable. Figure 6B–D depicts examples of three outlying alleles and the environmental variable with which they are most highly correlated. All nine environmental variables had at least one outlying locus that was most correlated with it. The variable with the highest number of outlying alleles associated is precipitation in the driest quarter (bio17) with 61 alleles. Most of the outlying alleles [165 (71.4%)] were correlated with a precipitation-related variable. Of the 66 temperature-associated alleles, 47 were most correlated with a variable related to annual variation in temperature (annual temperature range and temperature seasonality), rather than either high or low temperature. The environmental variable with the highest correlation coefficient with an outlying locus is minimum temperature in the coldest month (bio6), which is most correlated with Hq_locus29211 [r = 0.49 (Table 5, Fig. 6D)].
Environmental variation within the natural range of H. quercifolia and allele frequencies of putative environment-associated loci. (A) Principal component (PC) analysis of 19 environmental variables from the BioClim database (Fick and Hijmans 2017) with 80% ellipses surrounding the genetic clusters. (B) Map of precipitation in the driest quarter (bio17) and allele frequency of Hq_locus526520 (mean allele frequency = 0.144). (C) Map of precipitation in the warmest quarter (bio18) and allele frequency of Hq_locus34460 (mean allele frequency = 0.118). (D) Map of minimum temperature in the coldest month (bio6) and allele frequency of Hq_locus29211 (mean allele frequency = 0.337).
Of the 170 environment-associated SNP loci, 34 were mapped to the genome (Table 6). A total of 11 genes were found within the 10-kbp regions surrounding these loci. Three genes were found within the defined genomic region of Hq_locus701876 on chromosome 01. These genes encode leucine-rich repeat proteins and stress-responsive proteins, which are putatively involved in plant responses to environmental stresses (Sharma and Pandey 2016).
Candidate single nucleotide polymorphism (SNP) loci and potential genes associated with environmental variables in Hydrangea quercifolia.
Discussion
Based on our analyses, the genetic diversity of H. quercifolia is geographically structured as six genetic clusters. The low Fst among clusters as well as the admixture detected with Structure and PCA indicate that there is substantial gene flow among the clusters. However, the presence of structuring in the genetic diversity indicates that effective gene flow is higher among populations within a cluster than among populations between genetic clusters. Low genetic differentiation with significant geographic structure has previously been observed in other insect-pollinated species, including several Rhododendron species (Li et al. 2015; Zhao et al. 2012) and Prunus sibirica (Wang et al. 2014), as well as in wind-pollinated species, including Abies cilicia (Awad et al. 2014). In general, populations of woody plants are expected to have lower levels of differentiation because of their relatively long generation time that allows them to resist genetic drift (Hamrick et al. 1992).
The lack of significant isolation by distance supports a hypothesis of long-distance gene flow. Even within cluster five, which covers the largest geographic area, support for isolation by distance is not significant (P = 0.151). This suggests that long-distance migration plays a role in limiting genetic differentiation among the populations and clusters. Seeds of H. quercifolia are very small (3.2 mg on average; unpublished data) and have small appendages that may aid in dispersal. The obligate outcrossing nature of H. quercifolia (Reed 2000, 2004) and pollination by generalist pollinators (e.g., solitary bees, flies, butterflies, and beetles; personal observation) could additionally contribute to long-distance dispersal. Because insect-mediated long-distance pollen dispersal up to 6 km has been observed among fragmented habitats in other woody species (Ismail et al. 2012; Jha and Dick 2010; Lander et al. 2010), it is likely that both long-distance seed and pollen dispersal contribute to the low genetic differentiation among populations and clusters in H. quercifolia.
The Procrustes analysis disclosed that although clusters three and four are geographically very close to each other, they are genetically quite different. This is further supported by the relatively high Fst between these two clusters, as well as the almost complete lack of admixture between them. There is no obvious physical barrier to gene flow in that region, and the differentiation is likely not due to a phenological reproductive barrier, as individuals in both clusters were observed to be flowering during the same week of sample collection. Instead, the PCA on environmental variables (Fig. 6A) suggests that cluster three has a considerably different climate compared with the rest of the species range, which affects the genetic variation in that cluster (Fig. 4B). Therefore, the most likely explanation is selection due to precipitation and temperature variables.
Cluster six has the lowest genetic diversity as measured by expected heterozygosity and mean alleles per locus. This is largely explained by the small mean population size as well as the small number of populations in the genetic cluster (Table 4). Small populations are expected to have lower diversity because of the lower number of total alleles present in the population (i.e., genetic drift) (Ellegren and Galtier 2016). The small sample sizes for cluster six reflect actual differences in the wild rather than lower sampling effort. The total number of plants per population in this cluster was five or fewer for all three populations. This low diversity on the extremity of the species range indicates a low potential for those populations to adapt to changing conditions and a high potential for genetic drift (Ellstrand and Elam 1993). Therefore, the populations in western Louisiana (cluster six) are at risk of local extirpation, which is equivalent to a complete loss of the genetic diversity contained in those populations. Although cluster six is not strongly differentiated from the nearby cluster five (Fst = 0.026), the minimal admixture between the two clusters suggests that the Mississippi River may be serving as a barrier to gene flow. Fragmentation of populations has been shown to have a negative effect on genetic diversity of woody species, with fragmented populations experiencing a decrease of expected heterozygosity and number of alleles per locus compared with intact populations (Vranckx et al. 2012). These effects were strongest in those species that are insect pollinated, such as H. quercifolia.
Germplasm preservation needs to focus on maintaining as many populations as possible from all genetic clusters to preserve the greatest amount of diversity possible. Because each genetic cluster has unique diversity, losing populations in any genetic cluster will lead to the loss of genetic diversity in the species. Furthermore, each genetic cluster can be considered as a unit with unique conservation needs (van Zonneveld et al. 2014). For example, each population faces different threats and therefore will require individual strategies to preserve the maximum amount of diversity. The observation that land use change and invasive species may be driving population size reductions and local extirpation suggests that in situ conservation management needs to be prioritized to protect the existing populations. As a complementary approach, ex situ conservation can act as a buffer to genetic diversity loss if populations are extirpated or experience substantial genetic drift.
The substantial amount of genetic diversity that was explained by precipitation and temperature variables for H. quercifolia indicates that environmental factors can have nearly as large an effect on genetic diversity as neutral population structure. Figure 4B shows that when population structure is accounted for, the genetic clusters contain similar genetic variation that is explainable by climatic factors. Therefore, this diversity is not associated with neutral structure, but is uniquely affected by the environmental factors included in the analysis. Cluster three is a clear exception to this: the diversity in that cluster accounts for most of the variation on RDA1 but exists in two groups with opposite RDA1 values. The six samples with high RDA1 values are from the two outlier populations that are genetically distinct within cluster three (Supplemental Fig. 1). Two environmental variables contribute to the separation of cluster three in the RDA: precipitation in the coldest quarter and precipitation in the wettest quarter. Therefore, the high precipitation in Florida is selecting for unique genetics in cluster three. Cluster three is the only one without any overlap and therefore represents a completely unique ecotype for the species. Furthermore, these populations from Florida have been previously determined to have an increased tolerance to leafspot [Xanthomonas campestris (Sherwood et al. 2021)], which is likely due to the unique climate experienced by these populations. The adaptation to the unique climatic conditions in Florida makes cluster three a conservation priority. Future research could include this information to develop a habitat suitability model (Hirzel and Le Lay 2008) for H. quercifolia to further guide conservation action.
The overrepresentation of precipitation-related variables among the correlations with outlying loci indicates that precipitation has a stronger impact on genetic diversity in H. quercifolia than temperature. This is not surprising, given that H. quercifolia is never found in dry locations but always in moist, well-drained forest understories (e.g., on steep slopes along riverbanks). In addition, recommendations for growing H. quercifolia include providing adequate moisture (Dirr 2004), as the species is generally intolerant of hot, dry conditions. Precipitation has been found to have a significant impact on genetic diversity in other species, such as Cotinus coggygria (Miao et al. 2017), Larix decidua, Pinus mugo (Mosca et al. 2012), Arabis alpina (Manel et al. 2010), and Pinus taeda (Eckert et al. 2010). In the portion of the P. taeda native range where H. quercifolia co-occurs, high spring precipitation with fall aridity in Florida was also found to be a major factor contributing to the distribution of genetic diversity (Eckert et al. 2010). In other words, diversity is affected not simply by the presence or absence of drought, but by seasonal variability in precipitation.
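One simple way to summarize which kind of climatic variable dominates among candidate loci is to assign each locus to the predictor with which it is most strongly correlated and then count the assignments by category. The Python sketch below illustrates that bookkeeping under stated assumptions: the function names, the allele frequencies, and the site values are all hypothetical, and only two WorldClim variables are used (BIO16, precipitation of the wettest quarter, and BIO7, temperature annual range). It is an illustration of the tallying logic, not a reproduction of the analysis reported here.

```python
from collections import Counter
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (neither may be constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def strongest_predictor(allele_freqs, predictors):
    """Return (name, r) for the predictor with the largest absolute correlation."""
    scores = {name: pearson(allele_freqs, values) for name, values in predictors.items()}
    best = max(scores, key=lambda name: abs(scores[name]))
    return best, scores[best]


def tally_by_category(loci, predictors, categories):
    """Count candidate loci by the category of their strongest predictor."""
    counts = Counter()
    for freqs in loci.values():
        name, _ = strongest_predictor(freqs, predictors)
        counts[categories[name]] += 1
    return counts


# Hypothetical values at five sampling sites.
predictors = {
    "bio16": [300, 350, 400, 450, 500],  # precipitation of wettest quarter (mm)
    "bio7": [30, 26, 31, 27, 29],        # temperature annual range (deg C)
}
categories = {"bio16": "precipitation", "bio7": "temperature"}

# Hypothetical allele frequencies for three candidate loci at the same sites.
loci = {
    "locus_a": [0.1, 0.2, 0.3, 0.4, 0.5],
    "locus_b": [0.9, 0.5, 0.95, 0.6, 0.8],
    "locus_c": [0.8, 0.7, 0.5, 0.4, 0.1],
}

print(tally_by_category(loci, predictors, categories))
```

Because bioclimatic variables are often correlated with one another, the strongest predictor for a locus is best read as a label for a group of related variables rather than as the single causal factor.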
The finding that most temperature-associated loci are correlated with variation in annual temperature rather than either high or low temperature alone suggests that temperature extremes (high and low temperatures within a year) are shaping genetic diversity in H. quercifolia. Plants in a location with a wide annual temperature range need to withstand both extreme heat and cold, which is a more complex selection pressure than either high or low temperature alone. Consistent with the effect of temperature on genetic diversity, tolerance to low winter temperatures has been found to vary throughout the native range of H. quercifolia: cluster three (Florida panhandle) has the least tolerance, and the populations with the greatest tolerance are from the northern portions of the species range [cluster one and the northern extent of cluster five (Sherwood et al. 2021)]. Future abiotic stress tolerance research could further the understanding of the effects of climatic conditions on H. quercifolia growth, reproduction, and native range expansion/contraction.
Conclusions
These analyses indicate that H. quercifolia exists in six weakly differentiated genetic clusters that are geographically structured. Based on the observations of extirpated populations occurring primarily on the edges of the species range, it appears the range for H. quercifolia is currently shrinking. Therefore, US populations on the edges of the species range should be of conservation concern, especially those in Louisiana west of the Mississippi River and in Florida. Environmental association analysis suggests that several climatic variables significantly affect genetic diversity, with precipitation being the primary environmental factor affecting genetic variation in H. quercifolia. Furthermore, the analysis identified candidate loci for environmental adaptation for both precipitation and temperature variables.
References Cited
Aanensen, D.M., Huntley, D.M., Feil, E.J., al-Own, F. & Spratt, B.G. 2009 EpiCollect: Linking smartphones to web applications for epidemiology, ecology and community data collection PLoS One 9 1 7 https://doi.org/10.1371/journal.pone.0006968
Allendorf, F.W., Hohenlohe, P.A. & Luikart, G. 2010 Genomics and the future of conservation genetics Nat. Rev. Genet. 11 697 710 https://doi.org/10.1038/nrg2844
Awad, L., Fady, B., Khater, C., Roig, A. & Cheddadi, R. 2014 Genetic structure and diversity of the endangered fir tree of Lebanon (Abies cilicica Carr.): Implications for conservation PLoS One 9 1 12 https://doi.org/10.1371/journal.pone.0090086
Catchen, J.M., Amores, A., Hohenlohe, P., Cresko, W. & Postlethwait, J.H. 2011 Stacks: Building and genotyping loci de novo from short-read sequences G3 (Bethesda) 1 171 182 https://doi.org/10.1534/g3.111.000240
Chappell, M., Robacker, C. & Jenkins, T.M. 2008 Genetic diversity of seven deciduous azalea species (Rhododendron spp. section Pentanthera) native to the eastern United States J. Amer. Soc. Hort. Sci. 133 374 382 https://doi.org/10.21273/JASHS.133.3.374
Chung, M.Y., Chung, J.M., López-Pujol, J., Park, S.J. & Chung, M.G. 2013 Genetic diversity in three species of Forsythia (Oleaceae) endemic to Korea: Implications for population history, taxonomy, and conservation Biochem. Syst. Ecol. 47 80 92 https://doi.org/10.1016/j.bse.2012.11.005
Danecek, P., Auton, A., Abecasis, G., Albers, C.A., Banks, E., DePristo, M.A., Handsaker, R.E., Lunter, G., Marth, G.T., Sherry, S.T., McVean, G. & Durbin, R. 1000 Genomes Project Analysis Group 2011 The variant call format and VCFtools Bioinformatics 27 2156 2158 https://doi.org/10.1093/bioinformatics/btr330
Dirr, M.A. 2004 Hydrangeas for American gardens Timber Press Portland, OR, USA
Dixon, P. 2003 VEGAN, a package of R functions for community ecology J. Veg. Sci. 14 927 930 https://doi.org/10.1111/j.1654-1103.2003.tb02228.x
Dray, S. & Dufour, A. 2007 The ade4 package: Implementing the duality diagram for ecologists J. Stat. Softw. 22 1 20 https://doi.org/10.18637/jss.v022.i04
Earl, D.A. & vonHoldt, B.M. 2012 STRUCTURE HARVESTER: A website and program for visualizing STRUCTURE output and implementing the Evanno method Conserv. Genet. Resour. 4 359 361 https://doi.org/10.1007/s12686-011-9548-7
Eckert, A.J., Bower, A.D., González-Martínez, S.C., Wegrzyn, J.L., Coop, G. & Neale, D.B. 2010 Back to nature: Ecological genomics of loblolly pine (Pinus taeda, Pinaceae) Mol. Ecol. 19 3789 3805 https://doi.org/10.1111/j.1365-294X.2010.04698.x
Ellegren, H. & Galtier, N. 2016 Determinants of genetic diversity Nat. Rev. Genet. 17 422 433 https://doi.org/10.1038/nrg.2016.58
Ellstrand, N.C. & Elam, D.R. 1993 Population genetic consequences of small population size: Implications for plant conservation Annu. Rev. Ecol. Syst. 24 217 242 https://doi.org/10.1146/annurev.es.24.110193.001245
Elshire, R.J., Glaubitz, J.C., Sun, Q., Poland, J.A., Kawamoto, K., Buckler, E.S. & Mitchell, S.E. 2011 A robust, simple genotyping-by-sequencing (GBS) approach for high diversity species PLoS One 6 1 10 https://doi.org/10.1371/journal.pone.0019379
Evanno, G., Regnaut, S. & Goudet, J. 2005 Detecting the number of clusters of individuals using the software STRUCTURE: A simulation study Mol. Ecol. 14 2611 2620 https://doi.org/10.1111/j.1365-294X.2005.02553.x
Fick, S.E. & Hijmans, R.J. 2017 WorldClim 2: New 1km spatial resolution climate surfaces for global land areas Int. J. Climatol. 37 4302 4315 https://doi.org/10.1002/joc.5086
Forester, B.R., Lasky, J.R., Wagner, H.H. & Urban, D.L. 2018 Comparing methods for detecting multilocus adaptation with multivariate genotype–environment associations Mol. Ecol. 27 2215 2233 https://doi.org/10.1111/mec.14584
Fu, Z.-Z., Lei, Y.-K., Peng, D.-D. & Li, Y. 2016 Population genetics of the widespread shrub Forsythia suspensa (Oleaceae) in warm-temperate China using microsatellite loci: Implication for conservation Plant Syst. Evol. 302 1 9 https://doi.org/10.1007/s00606-015-1241-y
Hahn, C.Z., Michalski, S.G., Fischer, M. & Durka, W. 2017 Genetic diversity and differentiation follow secondary succession in a multi-species study on woody plants from subtropical China J. Plant Ecol. 10 213 221 https://doi.org/10.1093/jpe/rtw054
Hamrick, J.L., Godt, M.J.W. & Sherman-Broyles, S.L. 1992 Factors influencing levels of genetic diversity in woody plant species New For. 6 95 124 https://doi.org/10.1007/978-94-011-2815-5
Helyar, S.J., Hemmer-Hansen, J., Bekkevold, D., Taylor, M.I., Ogden, R., Limborg, M.T., Cariani, A., Maes, G.E., Diopere, E., Carvalho, G.R. & Nielsen, E.E. 2011 Application of SNPs for population genetics of nonmodel organisms: New opportunities and challenges Mol. Ecol. Resour. 11 123 136 https://doi.org/10.1111/j.1755-0998.2010.02943.x
Hirzel, A.H. & Le Lay, G. 2008 Habitat suitability modelling and niche theory J. Appl. Ecol. 45 1372 1381 https://doi.org/10.1111/j.1365-2664.2008.01524.x
Ismail, S.A., Ghazoul, J., Ravikanth, G., Shaanker, R.U., Kushalappa, C.G. & Kettle, C.J. 2012 Does long-distance pollen dispersal preclude inbreeding in tropical trees? Fragmentation genetics of Dysoxylum malabaricum in an agro-forest landscape Mol. Ecol. 21 5484 5496 https://doi.org/10.1111/mec.12054
Jha, S. & Dick, C.W. 2010 Native bees mediate long-distance pollen dispersal in a shade coffee landscape mosaic Proc. Natl. Acad. Sci. USA 107 13760 13764 https://doi.org/10.1073/pnas.1002490107
Jombart, T., Pontier, D. & Dufour, A.-B. 2009 Genetic markers in the playground of multivariate analysis Heredity 102 330 341 https://doi.org/10.1038/hdy.2008.130
Kamvar, Z.N., Tabima, J.F. & Grünwald, N.J. 2014 Poppr: An R package for genetic analysis of populations with clonal, partially clonal, and/or sexual reproduction PeerJ 2 e281 https://doi.org/10.7717/peerj.281
Kanno, H. & Seiwa, K. 2004 Sexual vs. vegetative reproduction in relation to forest dynamics in the understory shrub, Hydrangea paniculata (Saxifragaceae) Plant Ecol. 170 43 53 https://doi.org/10.1023/B:VEGE.0000019027.88318.54
Lander, T.A., Boshier, D.H. & Harris, S.A. 2010 Fragmented but not isolated: Contribution of single trees, small patches and long-distance pollen flow to genetic connectivity for Gomortega keule, an endangered Chilean tree Biol. Conserv. 143 2583 2590 https://doi.org/10.1016/j.biocon.2010.06.028
Li, M., Chen, S., Shi, S., Zhang, Z., Liao, W., Wu, W., Zhou, R. & Fan, Q. 2015 High genetic diversity and weak population structure of Rhododendron jinggangshanicum, a threatened endemic species in Mount Jinggangshan of China Biochem. Syst. Ecol. 58 178 186 https://doi.org/10.1016/j.bse.2014.12.008
Mallet, C., Mallet, R. & van Trier, H. 1992 Hydrangeas: Species and cultivars Center d’Art Floral Varengeville, France
Manel, S., Poncet, B.N., Legendre, P., Gugerli, F. & Holderegger, R. 2010 Common factors drive adaptive genetic variation at different spatial scales in Arabis alpina Mol. Ecol. 19 3824 3835 https://doi.org/10.1111/j.1365-294X.2010.04716.x
Martin, A.D., Quinn, K.M. & Park, J.H. 2011 MCMCpack: Markov chain Monte Carlo in R J. Stat. Softw. 42 1 21 https://doi.org/10.18637/JSS.V042.I09
McClintock, E. 1957 A monograph of the genus Hydrangea Proc. Calif. Acad. Sci. XXIX 147 256 https://doi.org/10.1016/S0079-8169(08)61510-X
Miao, C.Y., Li, Y., Yang, J. & Mao, R.L. 2017 Landscape genomics reveal that ecological character determines adaptation: A case study in smoke tree (Cotinus coggygria Scop.) BMC Evol. Biol. 17 1 11 https://doi.org/10.1186/s12862-017-1055-3
Mosca, E., Eckert, A.J., Di Pierro, E.A., Rocchini, D., La Porta, N., Belletti, P. & Neale, D.B. 2012 The geographical and environmental determinants of genetic diversity for four alpine conifers of the European Alps Mol. Ecol. 21 5530 5545 https://doi.org/10.1111/mec.12043
Nei, M. 1972 Genetic distance between populations Am. Nat. 106 283 292 https://doi.org/10.1086/282187
Pilatowski, R.E. 1982 A taxonomic study of the Hydrangea arborescens complex Castanea 47 84 98
Pritchard, J.K., Stephens, M. & Donnelly, P. 2000 Inference of population structure using multilocus genotype data Genetics 155 945 959 https://doi.org/10.1111/j.1471-8286.2007.01758.x
Reed, S.M. 2000 Compatibility studies in Hydrangea J. Environ. Hortic. 18 29 33 https://doi.org/10.24266/0738-2898-18.1.29
Reed, S.M. 2004 Self-incompatibility and time of stigma receptivity in two species of Hydrangea HortScience 39 312 315 https://doi.org/10.21273/HORTSCI.39.2.312
Rellstab, C., Gugerli, F., Eckert, A.J., Hancock, A.M. & Holderegger, R. 2015 A practical guide to environmental association analysis in landscape genomics Mol. Ecol. 24 4348 4370 https://doi.org/10.1111/mec.13322
Sharma, M. & Pandey, G.K. 2016 Expansion and function of repeat domain proteins during stress and development in plants Front. Plant Sci. 6 1218 https://doi.org/10.3389/fpls.2015.01218
Sherwood, A., McNamara, S., Alexander, L., Clark, M. & Hokanson, S.C. 2021 Horticultural characterization of wild Hydrangea quercifolia seedlings collected throughout the species native range HortScience 56 1023 1033 https://doi.org/10.21273/HORTSCI15889-21
Szczecińska, M., Sramko, G., Wołosz, K. & Sawicki, J. 2016 Genetic diversity and population structure of the rare and endangered plant species Pulsatilla patens (L.) Mill in east central Europe PLoS One 11 1 24 https://doi.org/10.1371/journal.pone.0151730
Thioulouse, J. & Dray, S. 2007 Interactive multivariate data analysis in R with the ade4 and ade4TkGUI packages J. Stat. Softw. 22 1 14 https://doi.org/10.18637/jss.v022.i05
US Department of Agriculture, Agricultural Research Service 2022 Germplasm Resources Information Network (GRIN) https://www.ars-grin.gov/ [accessed 21 Feb 2022]
US Department of Agriculture, Natural Resources Conservation Service 2021 The PLANTS database https://plants.usda.gov/home [accessed 27 Sep 2022]
US Department of Energy, Joint Genome Institute 2022 Phytozome https://phytozome-next.jgi.doe.gov/info/Hquercifolia_v1_1. [accessed 7 Oct 2022]
van Zonneveld, M., Dawson, I., Thomas, E., Scheldeman, X., van Etten, J., Loo, J. & Hormaza, J.I. 2014 Application of molecular markers in spatial analysis to optimize in situ conservation of plant genetic resources 67 91 Tuberosa, R., Graner, A. & Frison, E. Genomics of plant genetic resources. Springer Dordrecht https://doi.org/10.1007/978-94-007-7572-5
Vranckx, G., Jacquemyn, H., Muys, B. & Honnay, O. 2012 Meta-analysis of susceptibility of woody plants to loss of genetic diversity through habitat fragmentation Conserv. Biol. 26 228 237 https://doi.org/10.1111/j.1523-1739.2011.01778.x
Wang, C., Szpiech, Z.A., Degnan, J.H., Jakobsson, M., Pemberton, T.J., Hardy, J.A., Singleton, A.B. & Rosenberg, N.A. 2010 Comparing spatial maps of human population-genetic variation using procrustes analysis Stat. Appl. Genet. Mol. Biol. 9 Article 13 https://doi.org/10.2202/1544-6115.1493
Wang, Z., Kang, M., Liu, H., Gao, J., Zhang, Z., Li, Y. & Wu, R. 2014 High-level genetic diversity and complex population structure of Siberian apricot (Prunus sibirica L.) in China as revealed by nuclear SSR markers PLoS One 9 1 13 https://doi.org/10.1371/journal.pone.0087381
Wu, F.Q., Shen, S.K., Zhang, X.J., Wang, Y.H. & Sun, W.B. 2015 Genetic diversity and population structure of an extremely endangered species: The world’s largest Rhododendron AoB Plants 7 1 9 https://doi.org/10.1093/aobpla/plu082
Zhao, B., Yin, Z., Xu, M. & Wang, Q. 2012 AFLP analysis of genetic variation in wild populations of five Rhododendron species in Qinling Mountain in China Biochem. Syst. Ecol. 45 198 205 https://doi.org/10.1016/j.bse.2012.07.033
Principal component analysis (PCA) of within-population allelic diversity in Hydrangea quercifolia throughout its native range. Point color and number indicate the populations from which the samples were collected. (A) PCA of samples with highest assignment probability to cluster one. (B) PCA of samples with highest assignment probability to cluster two. (C) PCA of samples with highest assignment probability to cluster three. (D) PCA of samples with highest assignment probability to cluster four. (E) PCA of samples with highest assignment probability to cluster five. (F) PCA of samples with highest assignment probability to cluster six.
Citation: Journal of the American Society for Horticultural Science 148, 1; 10.21273/JASHS05255-22
List of 19 bioclimatic variables used in environmental association analysis to determine the effects of environmental factors on the population structure of Hydrangea quercifolia across its native US range. Names and descriptions are based on WorldClim (Fick and Hijmans 2017).