Breeders screen germplasm with molecular markers to
identify and select individuals that have desirable alleles. In the SunGrains
collaborative breeding group in the southern United States, genotyping-by-sequencing
(GBS) is conducted annually in the F5:7 generation to identify
single nucleotide polymorphisms (SNPs) for use in genomic selection. Subsequently, a reduced number of F5:9
generation lines are screened with markers for 60 QTL via Kompetitive allele-specific
PCR (KASP). The objective of
this research was to determine whether major-effect QTL can be accurately called
in F5:7 generation breeding lines using the SNPs derived from GBS. In
2020 and 2021, 2376 and 3423 SunGrains lines submitted for GBS, respectively, were genotyped
via KASP for four Fusarium head blight QTL: Fhb1 from ‘Sumai 3’,
Qfhb.vt-1B from ‘Jamestown’, and Qfhb.nc-1A and Qfhb.nc-4A
from ‘NC-Neuse’. In parallel, data were compiled from the 2011 to 2020 Southern
Uniform Winter Wheat Scab Nursery (UFHBN), which had been screened for the same
QTL via KASP, sequenced via GBS, and phenotyped for severity (SEV), percent Fusarium
damaged kernels (FDK), deoxynivalenol content (DON), plant height, and heading
date. Three machine learning models were evaluated: random forest, k-nearest
neighbors, and gradient boosting machine. The SunGrains data were randomly partitioned
into training and testing partitions. The QTL call and the 100 GBS SNPs most correlated with it on
the chromosome containing the QTL were used for training and k-fold cross-validation
tuning of each model. The cross-validated machine learning models
were used to predict QTL calls in the testing partition of the SunGrains lines
and the UFHBN. Phenotypic data and observed QTL calls were compared with
predicted QTL calls in the UFHBN. Random subsetting of training and testing partitions
in the SunGrains material, prediction of QTL calls in the SunGrains testing partitions
and UFHBN, and estimation of QTL call effects were repeated 20 times and
results were averaged. The average predictive accuracies for Fhb1 calls
in the 2020 SunGrains testing partitions ranged from 97.2 to 98.9%. In the UFHBN, the
effects of Fhb1 on SEV, FDK, DON, plant height, and heading
date estimated from observed calls were not significantly different from those estimated
from predicted calls. Similar results were observed in the 2021 SunGrains and UFHBN
populations. These results indicate that machine learning models trained on a population
genotyped with both GBS and KASP may be used in breeding programs to accurately
predict QTL calls in earlier-generation germplasm.
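
The prediction workflow described above (select the SNPs most correlated with the KASP call, train a classifier, and average testing accuracy over repeated random partitions) can be illustrated with a minimal sketch. It is not the authors' pipeline: the data are simulated toy genotypes, all function names and parameters are hypothetical, only a plain k-nearest neighbors classifier is shown, the number of neighbors is fixed rather than tuned by k-fold cross-validation, and the choice to select SNPs within each training partition is an assumption.

```python
import random
from statistics import mean


def pearson(x, y):
    """Pearson correlation; returns 0.0 when either vector is constant."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / (sxx * syy) ** 0.5


def top_correlated_snps(genotypes, calls, n_snps):
    """Column indices of the n_snps SNPs with the largest |r| with the QTL call."""
    n_cols = len(genotypes[0])
    scores = [abs(pearson([row[j] for row in genotypes], calls)) for j in range(n_cols)]
    return sorted(range(n_cols), key=lambda j: scores[j], reverse=True)[:n_snps]


def knn_predict(train_x, train_y, query, k):
    """Majority vote among the k training lines closest in Manhattan distance."""
    def dist(i):
        return sum(abs(a - b) for a, b in zip(train_x[i], query))
    nearest = sorted(range(len(train_x)), key=dist)[:k]
    votes = [train_y[i] for i in nearest]
    return max(set(votes), key=votes.count)


def repeated_split_accuracy(genotypes, calls, n_snps=100, k=5,
                            test_fraction=0.2, n_repeats=20, seed=0):
    """Mean testing accuracy over repeated random training/testing partitions."""
    rng = random.Random(seed)
    n = len(calls)
    n_test = max(1, int(n * test_fraction))
    accuracies = []
    for _ in range(n_repeats):
        order = list(range(n))
        rng.shuffle(order)
        test_idx, train_idx = order[:n_test], order[n_test:]
        train_g = [genotypes[i] for i in train_idx]
        train_y = [calls[i] for i in train_idx]
        # Assumption: SNPs are chosen using the training partition only.
        cols = top_correlated_snps(train_g, train_y, n_snps)
        train_x = [[row[j] for j in cols] for row in train_g]
        correct = 0
        for i in test_idx:
            query = [genotypes[i][j] for j in cols]
            correct += knn_predict(train_x, train_y, query, k) == calls[i]
        accuracies.append(correct / n_test)
    return mean(accuracies)


def simulate_lines(n_lines=200, n_linked=10, n_noise=40, seed=1):
    """Toy inbred lines (hypothetical data): SNPs coded 0/2, QTL call coded 0/1."""
    rng = random.Random(seed)
    genotypes, calls = [], []
    for _ in range(n_lines):
        call = rng.choice([0, 1])
        # Linked SNPs agree with the QTL call 95% of the time.
        linked = [2 * call if rng.random() < 0.95 else 2 * (1 - call)
                  for _ in range(n_linked)]
        noise = [rng.choice([0, 2]) for _ in range(n_noise)]
        genotypes.append(linked + noise)
        calls.append(call)
    return genotypes, calls


genotypes, calls = simulate_lines()
accuracy = repeated_split_accuracy(genotypes, calls, n_snps=10)
print(f"mean testing accuracy over 20 random partitions: {accuracy:.3f}")
```

With real data, `genotypes` would hold the GBS SNP calls on the chromosome carrying the QTL and `calls` the KASP-derived QTL calls. Random forest or gradient boosting models, and cross-validated tuning, would replace the fixed-k classifier used here.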