Trends in Statistical Analysis Software Use for Horticulture Research between 2005 and 2020

in HortTechnology
Authors:
Marina L. Curtis, Horticultural Sciences Department, University of Florida, PO Box 110690, Gainesville, FL 32611

and
Gerardo H. Nunez, Horticultural Sciences Department, University of Florida, PO Box 110690, Gainesville, FL 32611


Abstract

Courses are the main source of data analysis training for students. The statistical software training taught in those courses can affect student career readiness. However, lack of information about statistical software use in horticulture leads students and mentors to select statistics courses based on course availability and/or anecdotal evaluations. This research aimed to describe statistical software use trends in horticulture research to inform student course selection. We surveyed ≈50% of all articles published in HortScience, HortTechnology, and the Journal of the American Society for Horticultural Science (JASHS) between 2005 and 2020. We found that SAS, SPSS, and R were the most frequently used software packages in this period. SAS use frequency decreased in HortScience and JASHS, but not in HortTechnology. SPSS use increased in JASHS and R use increased in all journals. Results from this retrospective survey suggest that training in SAS, SPSS, and R can help align students with horticulture research practices.

Given the size and complexity of horticulture data sets, researchers require statistical analysis software to organize, analyze, and illustrate results. As with other research tools, students learn to use statistical software through formal instruction (courses) or informal coaching. Universities offer a plethora of statistics courses covering several statistical software packages (Lazar et al., 2011; Mazouchova et al., 2021). Additionally, data analysis and interpretation modules are becoming popular in discipline-specific courses (Schwab-McCoy, 2019). Thus, when selecting which courses to enroll in, students are implicitly selecting which statistical software to learn.

Statistical analysis is a common expectation placed on graduates of plant science and horticulture programs (Richter et al., 2018). Courses are the primary source of training in data analysis for students (Davidson et al., 2019). Thus, course selection can influence student career readiness. Graduate students frequently select their courses with input from their advisors and/or other senior laboratory members, whereas undergraduate students frequently follow predetermined course requirements. When they have multiple courses to choose from or elective credits, undergraduate students have less information than graduate students because they are advised by university personnel who are not aware of horticulture research practices. Hence, there is a need for empirical data that can inform student and advisor choices. In this article, we provide the first empirical report on statistical software use in horticulture research articles with the goal of informing students and educators who might need to select courses or train in statistics.

Materials and methods

This retrospective survey focused on articles published between 2005 and 2020 in the academic journals of the American Society for Horticultural Science (ASHS). Only primary research articles, reviews that included statistical analysis of metadata, and genetics/genomics articles where a hypothesis was tested were considered. Comprehensive reviews, conference proceedings, teaching plans, cultivar announcements, and genetic maps were excluded. In HortScience, we surveyed 10 articles per issue from each of the six issues published per year before 2010. Starting in 2010, we surveyed five articles per issue from each of the 12 issues published per year. In HortTechnology, we surveyed 10 articles per issue from each of the six issues published per year. In JASHS, we aimed to survey six articles per issue from each of the six issues published per year. However, fewer than six articles per issue passed the inclusion filters in 2011, 2012, 2013, 2019, and 2020. In all journals, we selected articles at random if more than the desired number of articles passed the inclusion filters.
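The sampling scheme described above can be sketched in a few lines. This is a minimal Python illustration, not the authors' code (the article reports that the analysis was done in R). The function names `sampling_plan`, `target_per_year`, and `sample_issue`, the article identifiers, and the fixed random seed are all hypothetical.

```python
import random

def sampling_plan(journal, year):
    """Return (articles per issue, issues per year) as described in the text."""
    if journal == "HortScience":
        return (10, 6) if year < 2010 else (5, 12)
    if journal == "HortTechnology":
        return (10, 6)
    if journal == "JASHS":
        return (6, 6)
    raise ValueError(f"unknown journal: {journal}")

def target_per_year(journal, year):
    """Target number of surveyed articles per journal and year."""
    per_issue, issues = sampling_plan(journal, year)
    return per_issue * issues

def sample_issue(eligible_articles, per_issue, rng):
    """Keep all eligible articles if there are too few, else draw a random subset."""
    if len(eligible_articles) <= per_issue:
        return list(eligible_articles)
    return rng.sample(eligible_articles, per_issue)

rng = random.Random(0)  # fixed seed only so the sketch is reproducible
issue = [f"article_{i}" for i in range(14)]  # hypothetical eligible articles in one issue
per_issue, _ = sampling_plan("HortScience", 2015)
picked = sample_issue(issue, per_issue, rng)
print(target_per_year("HortScience", 2005), target_per_year("HortScience", 2015), len(picked))
```

Under this reading, both HortScience schemes give the same yearly target (10 × 6 and 5 × 12), and the JASHS target of 36 articles per year is an upper bound because some years had too few eligible articles.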

For each article, the statistical analysis software package used was recorded, where available. Data were tabulated and imported into RStudio (ver. 1.4 Juliet Rose; RStudio Team, Boston, MA). Then, the most used software packages were identified for comparison. Software use in 2005 and 2020 was compared using chi-square (χ2) tests (null hypothesis: frequency of use in 2005 = frequency of use in 2020). Data analysis and illustration were conducted in R (ver. 4.1.0; R Core Team, Vienna, Austria).
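The article does not state the exact form of the χ2 test, so the following is only one plausible formulation: a Pearson test of independence on a 2 × 2 table of year (2005, 2020) by use (used, not used). The sketch is in Python rather than the R used by the authors, the function name `chi_square_2x2` is hypothetical, and the counts are invented for illustration rather than taken from the article.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2x2 table.

    table = [[a, b], [c, d]]; no continuity correction is applied.
    Returns (statistic, p_value) for 1 degree of freedom.
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    stat = 0.0
    for i, r in enumerate(row_totals):
        for j, k in enumerate(col_totals):
            expected = r * k / n
            stat += (table[i][j] - expected) ** 2 / expected
    # With 1 degree of freedom, the chi-square upper tail equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value

# Hypothetical counts, NOT from the article:
# rows = year (2005, 2020); columns = (articles using the package, articles not using it).
counts = [[45, 15], [30, 30]]
stat, p = chi_square_2x2(counts)
print(f"chi-square = {stat:.2f}, P = {p:.4f}")
```

A small P value would lead to rejecting the null hypothesis that the frequency of use was the same in both years. Note that R's `chisq.test` applies Yates' continuity correction to 2 × 2 tables by default, so its output can differ from this uncorrected version.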

Results and discussion

Approximately 50% of the articles published in the ASHS journals were included in this survey. The most used software packages were SAS (SAS Institute Inc., Cary, NC), IBM SPSS Statistics [SPSS (IBM Corp., Armonk, NY)], and R (R Core Team) (Table 1). Together, these software packages were referenced in 72% to 80% of the surveyed articles in each journal. Less than 2% of the articles surveyed used more than one software package, and 4% to 11% of the articles surveyed did not specify which statistical analysis software was used or whether one was used at all. Several software packages were each used in less than 3% of the surveyed articles in each journal. These minor-use software packages were grouped in a category labeled “other” for the year-by-year analysis. When illustrated by year, there were observable software use trends (Fig. 1). Between 2005 and 2020, SAS use declined in HortScience (χ2 = 0.04) and JASHS (χ2 < 0.001), but not in HortTechnology (χ2 = 0.50). SPSS use increased only in JASHS (χ2 = 0.01). R use increased in all journals in this period, but χ2 testing was not used for R, to avoid the test’s low-frequency bias.
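The grouping of minor-use packages into an “other” category can be illustrated as follows. This is a hedged Python sketch, not the authors' R code. The function name `group_minor_packages`, the placeholder package names `pkg_a` and `pkg_b`, and the counts are hypothetical, and keeping “None” as its own category regardless of frequency is an assumption based on how the figure legend lists it.

```python
from collections import Counter

def group_minor_packages(records, threshold=0.03):
    """Relabel packages used in less than `threshold` of the articles as 'Other'.

    `records` holds one software label per surveyed article in a single journal.
    'None' (no software reported) is kept as its own category (assumption).
    """
    counts = Counter(records)
    total = len(records)
    minor = {pkg for pkg, n in counts.items()
             if pkg != "None" and n / total < threshold}
    return ["Other" if pkg in minor else pkg for pkg in records]

# Hypothetical records for one journal; these are not the article's data.
records = (["SAS"] * 60 + ["SPSS"] * 20 + ["R"] * 12
           + ["None"] * 5 + ["pkg_a"] * 2 + ["pkg_b"] * 1)
grouped = Counter(group_minor_packages(records))
print(grouped)
```

Applying the threshold within each journal, as the text describes, means a package could fall under “other” in one journal and remain a named category in another.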

Fig. 1.

Software use frequency in articles published in (A) HortScience (n = 60 per year), (B) HortTechnology (n = 60 per year), and (C) the Journal of the American Society for Horticultural Science (n ≥ 30 per year). None = articles did not explicitly mention a software package; Other = less popular software packages; R (R Core Team, Vienna, Austria); SAS (SAS Institute Inc., Cary, NC); SPSS (IBM SPSS Statistics; IBM Corp., Armonk, NY).

Citation: HortTechnology 32, 4; 10.21273/HORTTECH05051-22

Table 1.

Statistical software used in a subsample of articles published in HortTechnology, HortScience, and the Journal of the American Society for Horticultural Science (JASHS) in the period 2005–20.

Results from this survey suggest that statistical software use in horticulture research is diversifying. Two decades ago, most articles published in the ASHS journals used SAS. Today, 40% to 80% of the manuscripts use other statistical software. SPSS and R are gaining popularity. On the basis of these results, courses that train students in SAS, SPSS, and R are likely to be career assets because they will help align students with research practices in the discipline. Training in multiple software packages is also likely to be welcomed by employers who do not favor a specific software package (Richter et al., 2018).

It is notable that two of the three most popular software packages used in horticulture research (SAS and R) are coding-based. These software packages are popular in disciplines such as statistics and data science, where coding competency is a core skill (Fox and Leanage, 2016; Lazar et al., 2011). In disciplines where coding competency is less prevalent, such as psychology and business, point-and-click software, such as SPSS and Excel (Microsoft Corp., Redmond, WA), is more popular (Davidson et al., 2019; Schwab-McCoy, 2019). The use of coding-based software might reflect widespread coding competency among horticulture researchers or interdisciplinary collaboration. This survey could not distinguish between these scenarios.

Statistical software adoption appears to be heterogeneous among horticulture research niches. The shift away from SAS prevalence seems to be occurring at a faster pace in JASHS than in other journals. It is possible that software adoption trends differ in other journals used by horticulture researchers. Knowing which statistical software package is used in a research niche can help inform student course selection. This information might be particularly useful for university personnel who are not exposed to horticulture research and for faculty advisers who finished their formal training several years ago.

This retrospective study described statistical software use in the scientific journals published by ASHS to inform adviser and student course selection. Horticulture researchers used a more diverse set of statistical software in 2020 than in 2005. SAS was the most used software package in the period 2005–20, but SPSS and R exhibit increasing popularity. Training in these software packages can help align students with horticulture research practices.

Literature cited

  • Davidson, H., Jabbari, Y., Patton, H., O’Hagan, F., Peters, K. & Cribbie, R. 2019. Statistical software use in Canadian university courses: Current trends and future directions. Teach. Psychol. 46(3):246–250. https://doi.org/10.1177/0098628319853940
  • Fox, J. & Leanage, A. 2016. R and the Journal of Statistical Software. J. Stat. Softw. 73(2):1–13. https://doi.org/10.18637/jss.v073.i02
  • Lazar, N.A., Reeves, J. & Franklin, C. 2011. A capstone course for undergraduate statistics majors. Am. Stat. 65(3):183–189. https://doi.org/10.1198/tast.2011.10240
  • Mazouchova, A., Jedlickova, T. & Hlavacova, L. 2021. Statistics teaching practice at Czech universities with emphasis on statistical software. J. Effic. Responsib. Educ. Sci. 14(4):258–269. https://doi.org/10.7160/eriesj.2021.140405
  • Richter, B.S., Poleatewich, A., Hayslett, M. & Stofer, K. 2018. Finding the gaps: An assessment of concepts, skills, and employer expectations for plant pathology foundational courses. Plant Dis. 102(10):1883–1898. https://doi.org/10.1094/PDIS-11-17-1845-FE
  • Schwab-McCoy, A. 2019. The state of statistics education research in client disciplines: Themes and trends across the university. J. Stat. Educ. 27(3):253–264. https://doi.org/10.1080/10691898.2019.1687369
  • Tang, Q.Y. & Zhang, C.X. 2013. Data Processing System (DPS) software with experimental design, statistical analysis and data mining developed for use in entomological research. Insect Sci. 20(2):254–260. https://doi.org/10.1111/j.1744-7917.2012.01519.x

Contributor Notes

G.H.N. is the corresponding author. E-mail: g.nunez@ufl.edu.
