QTL mapping for resistance to Cercospora sojina in Essex × Forrest soybean (Glycine max L.) lines

Frogeye leaf spot (FLS), caused by Cercospora sojina, appears as red-brown lesions on leaves that can coalesce and decrease the photosynthetic ability of soybeans. The average yield loss due to FLS is estimated at approximately 40% in established fields, and 100% incidence has previously been recorded. QoI inhibitor fungicides were considered an effective control method, but the pathogen quickly evolved the ability to thrive after application, and this trait spread rapidly across North America. Therefore, genetic host resistance is likely the most effective method to prevent the disease. To this end, we screened 91 recombinant inbred lines (RILs) of ‘Essex’ × ‘Forrest’ under greenhouse conditions for FLS resistance and used single nucleotide polymorphism (SNP) markers to identify associated quantitative trait loci (QTL). Two QTL were mapped in this study. The QTL on Chr. 13 coincides with a previously reported QTL, whereas the QTL on Chr. 19 is novel. Overall, this study will help to better understand the underlying mechanisms of soybean resistance to C. sojina and to develop soybean varieties with resistance to FLS using marker-assisted selection.
 
Key words: Cercospora sojina, quantitative trait loci, frogeye leaf spot, Essex × Forrest, disease resistance, genotypic and phenotypic traits.


INTRODUCTION
Frogeye leaf spot (FLS), caused by the pathogen Cercospora sojina, is a foliar disease indicated by water-soaked lesions on the leaves of soybeans. The lesions begin as small brown spots and develop a dark, red-brown border; in severe cases, they can also form on the stems, pods, and seeds. When lesions appear on seeds, the fungus spreads to new seedlings the following year (Malvick, 2018). Yearly soybean losses to FLS in the United States have been measured at 106.3 thousand metric tons, with the greatest losses in the southern states (Wrather et al., 2001). In heavily infected fields, FLS can reduce soybean yield by 40% under conducive environmental conditions (Byamukama et al., 2019). Together, these characteristics create a cycle of reduced yield and reduced profits for infected fields.
The first verified case of FLS in the United States of America was recorded in 1925 (Lehman, 1928). The disease was particularly problematic in the southern states for many years, with cases first recorded in the Midwest in the late 1940s (Philips and Boerma, 1981).
For many years, chemical control, mostly using QoI inhibitor fungicides (also known as FRAC Group 11), was the most effective method for disease management. Resistance of C. sojina to QoI inhibitors was detected in North America by 2010 (Zhang, 2012), making genetic host resistance to FLS more crucial to high-yielding soybean production.
Single nucleotide polymorphisms (SNPs) associated with disease resistance in soybean are usually concentrated on chromosomes (Chr.) 7, 13, and 18. Chr. 13 (linkage group F), in particular, is known to be rich in disease resistance loci, as it harbors the resistance-gene-rich region around the Satt114 marker and the Rps8 gene. This region is associated with resistance to two races of Phytophthora sojae, the causal agent of Phytophthora root rot (Gordon et al., 2006). Satt114 is also commonly used as a flag marker in other disease resistance studies (Pham et al., 2015). However, resistance genes are not restricted to these regions and can be scattered across the genome. For example, SNPs significantly associated with soybean cyst nematode resistance can be found on Chr. 3, 4, 7, 9, 10, 11, 13, 14, 15, 18, 19, and 20 (Chang et al., 2016).
Currently, there are 12 known races of C. sojina and three main genes conferring resistance. These genes are Rcs1, which codes for resistance to race 1; Rcs2, which provides resistance to race 2; and Rcs3, which confers resistance to all other known races of C. sojina (Mian et al., 2008). In 2012, two additional dominant resistance alleles were identified as Rcs (PI 594891) and Rcs (PI 594774) (Pham et al., 2015). More research is needed in this area to understand specific QTLs that are associated with each resistance gene to make their implementation more feasible for breeders.
The Essex × Forrest (E × F) cross was made in 1983 at Southern Illinois University Carbondale (Lightfoot et al., 2005). Essex was chosen for its partial resistance to FLS and Forrest for its partial susceptibility (Sharma and Lightfoot, 2017). Forrest has been extensively studied and mapped alongside Williams 82, making it an ideal candidate line for QTL identification. Essex and Forrest share a common germplasm pool that accounts for 25% of their genomes (Lightfoot, 2008). From the initial cross, approximately 4,500 F2 plants were advanced to F5 using single-pod descent. After harvest, 150 F5 plants were randomly selected and planted into progeny rows. Of these, 100 recombinant inbred lines (RILs) were kept for various phenotypic assays. In total, 94 RILs were used to construct a mapping population for quantitative trait loci (QTL) discovery and were also released for research purposes (Lightfoot et al., 2005). The plant material used in this study consisted of 91 selected F5:8 RILs.
Markers closely linked to QTL can be used to screen hundreds of lines at once for the genes of interest. For developing resistant cultivars, marker-assisted selection is a more efficient and accurate way to identify resistant lines than large phenotypic surveys (Yousef and Juvik, 2001). Phenotypic assays require more labor, take longer to complete, and are less precise than genotypic methods. Two major QTL for resistance to C. sojina race 2 were detected in the Essex × Forrest (E × F) population, on Chr. 7 near Satt319 and on Chr. 8 near Satt632, along with 13 minor QTL across various chromosomes (Sharma and Lightfoot, 2017). However, that study used simple sequence repeat (SSR) markers to find regions of interest. SNP markers are more precise than SSR markers and are the preferred method in genetic diversity studies (Singh et al., 2013). For this reason, SNP markers were used in this study. Having a precise location in the genome for FLS resistance allows for simpler implementation in commercial lines. The objectives of this study were to analyze the phenotypic variation of FLS resistance in E × F in a greenhouse setting, create a genetic linkage map for the population, and identify candidate QTL that confer resistance to C. sojina race 15 using SNPs.

MATERIALS AND METHODS

Greenhouse assay
Greenhouse assays were conducted by planting the 91 E × F RILs and their parental lines in six-inch plastic nursery pots filled with Berger BM1 growing medium. Plants were watered according to environmental needs, generally twice a week. No fertilization was used in this experiment. Pots were arranged in a randomized complete block design with two blocks per replication. Each block contained one pot of each line, with the lines "Blackhawk" and "Lincoln" placed in each block as checks. This design was replicated twice in time, once in March 2019 and once in October 2019, comprising the EF_1 experiment. The EF_2 experiment also consisted of two blocks per replication, with one replication in March 2018 and one in October 2018. Seven seeds were planted in each pot, and shortly after emergence the seedlings were thinned to one plant per pot. One treatment, the application of C. sojina spores, was applied to all blocks. Plants were first inoculated with the C. sojina spore suspension at the V2-V5 growth stages. Plants were then inoculated a second and third time, with a week between inoculations.
Race 15 of C. sojina was cultured in Petri dishes filled with clarified V8 solid medium (Salas et al., 2007). After two weeks in a growth chamber at 25°C, the Petri dishes were flooded with a 0.1% Tween 20 solution and spores were dislodged into the solution using a sterilized metal spatula. Approximately eight Petri dishes of seven colonies each were used to make 300 ml of suspension. The suspension was mixed thoroughly on a stirring plate for 5 min and then filtered through cheesecloth to remove mycelium. The final spore concentration was approximately 6 × 10^4 conidia/ml. This final product was poured into a spray bottle and used immediately for inoculation.
All lines were sprayed to dripping with the spore suspension and covered with a gallon-sized plastic bag to create a highly humid microenvironment. The bags were left on for 72 h. For the rest of the experiment, the plants were kept under a humidity tent made of plastic sheeting and supplied by a humidifier. Relative humidity was maintained at 80-90% and temperature at 28-30°C until the end of the experimental period. Two weeks after the first inoculation, plants were rated for disease severity using the Newman scale. Plants were rated on a scale of 1-10; a rating of 1 indicates 0-10% of the leaf surface showing disease symptoms, whereas a rating of 10 indicates 90-100% of the leaf surface showing symptoms. Defoliation due to disease was also counted as a 10 (Sinclair, 1982). In total, six ratings were taken within 2 weeks, which allowed disease development to be characterized over time.
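The rating scale described above can be illustrated with a short sketch. Only the two endpoints (rating 1 for 0-10% and rating 10 for 90-100%) and the defoliation rule come from the text; the evenly spaced 10% classes in between, the handling of boundary values, and the function name `severity_to_rating` are assumptions made for illustration.

```python
import math


def severity_to_rating(percent_leaf_area, defoliated=False):
    """Convert percent diseased leaf area to a 1-10 severity rating.

    Assumption: classes are evenly spaced 10% bins, and a value that falls
    exactly on a boundary (10, 20, ...) is assigned to the lower class.
    """
    if defoliated:
        return 10  # defoliation due to disease is scored as 10
    if not 0 <= percent_leaf_area <= 100:
        raise ValueError("percent_leaf_area must be between 0 and 100")
    return max(1, math.ceil(percent_leaf_area / 10))


for pct in (0, 5, 35, 95):
    print(pct, "->", severity_to_rating(pct))
```

Under these assumptions, a difference of one rating unit corresponds to roughly ten percentage points of affected leaf area.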

DNA isolation
For DNA isolation, all lines screened in the greenhouse were planted in six-pack trays and allowed to grow in a dark room to minimize cuticle growth and chloroplastic DNA expression. When plants reached the V1 stage (first trifoliate emergence), 50 mg of tissue from the first trifoliate was collected and stored in a -20°C freezer until isolation. Once all tissues had been collected, samples were thawed, flash frozen with liquid nitrogen, and crushed. DNA isolation was performed using the DNeasy 96 Plant Kit (Qiagen, Hilden, Germany), following the manufacturer's instructions. DNA purity was checked by electrophoresis on a 1% agarose gel stained with EtBr, and DNA quantification was carried out with a NanoDrop 2000 (Thermo Scientific, Waltham, MA, USA). SNP genotyping was conducted at the Soybean Genomics and Improvement Laboratory, USDA-ARS, Beltsville, MD, using the BARCSoySNP6K BeadChip array.

Phenotypic variation
To compare FLS resistance across the population, the sixth and final greenhouse rating for each line was used to run a distribution analysis. Lines with a higher FLS score than the susceptible parent were labelled "susceptible lines" and lines with a lower FLS score than the resistant parent were labelled "resistant lines."
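As a minimal sketch of this labelling rule, the snippet below compares line scores with the two parental scores, with higher scores meaning more disease. The parental and line scores are the EF_1 averages reported in this paper; the function name `classify_line`, the "intermediate" label, and the extra example line are hypothetical and only for illustration.

```python
def classify_line(score, resistant_parent, susceptible_parent):
    """Label a line relative to the parents (higher score = more disease)."""
    if score < resistant_parent:
        return "resistant"
    if score > susceptible_parent:
        return "susceptible"
    return "intermediate"  # hypothetical label for lines between the parents


ESSEX, FORREST = 1.50, 5.75  # EF_1 parental average scores

scores = {
    "E x F 54": 1.0,
    "E x F 29": 7.25,
    "E x F 63": 6.0,
    "example line": 3.2,  # hypothetical score, not from the study
}
labels = {name: classify_line(s, ESSEX, FORREST) for name, s in scores.items()}
for name, label in labels.items():
    print(name, "->", label)
```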

Genetic map and QTL analysis
The genetic map construction and QTL analysis were performed with the R/qtl package (Broman et al., 2003; Broman and Sen, 2009). The final rating for each line was used to measure the overall FLS resistance. Frogeye leaf spot scores were used to find phenotypic and genotypic differences between the parental lines and the RILs. Single marker analysis and interval mapping were used to identify the chromosomes of interest (data not shown); the cim() function was subsequently used for composite interval mapping (CIM). The fitqtl() function was used to estimate the phenotypic variance explained by the QTL of interest, and a 1,000-permutation test was run to determine approximate logarithm of odds (LOD) significance thresholds using operm.ag. LOD thresholds of 4.44 and 4.38 were used at the 95% confidence level.
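The idea behind a permutation-based LOD threshold can be sketched in a few lines of Python. This is not the R/qtl implementation and not the CIM model used in the study: it is a simplified single-marker regression on simulated data, and the function names, the simulated genotypes and phenotypes, and the reduced number of permutations are all assumptions made for illustration.

```python
import math
import random


def marker_lod(genotypes, phenotypes):
    """LOD for one marker: genotype-class means versus a single overall mean."""
    n = len(phenotypes)
    grand_mean = sum(phenotypes) / n
    rss_null = sum((y - grand_mean) ** 2 for y in phenotypes)
    rss_marker = 0.0
    for allele in set(genotypes):
        group = [y for g, y in zip(genotypes, phenotypes) if g == allele]
        group_mean = sum(group) / len(group)
        rss_marker += sum((y - group_mean) ** 2 for y in group)
    if rss_null == 0:
        return 0.0
    if rss_marker == 0:
        return math.inf
    return (n / 2) * math.log10(rss_null / rss_marker)


def permutation_threshold(markers, phenotypes, n_perm=1000, alpha=0.05, seed=1):
    """Genome-wide threshold: (1 - alpha) quantile of the maximum LOD obtained
    after shuffling phenotypes, which breaks any marker-trait association."""
    rng = random.Random(seed)
    shuffled = list(phenotypes)
    max_lods = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        max_lods.append(max(marker_lod(m, shuffled) for m in markers))
    max_lods.sort()
    index = min(n_perm - 1, math.ceil((1 - alpha) * n_perm) - 1)
    return max_lods[index]


# Simulated population: 91 lines, 20 unlinked markers, QTL at the first marker.
rng = random.Random(42)
n_lines, n_markers = 91, 20
markers = [[rng.choice("EF") for _ in range(n_lines)] for _ in range(n_markers)]
phenotypes = [rng.gauss(3.0, 1.0) + (2.0 if g == "E" else 0.0) for g in markers[0]]

threshold = permutation_threshold(markers, phenotypes, n_perm=200)
observed = [marker_lod(m, phenotypes) for m in markers]
print(f"threshold = {threshold:.2f}, LOD at simulated QTL = {observed[0]:.2f}")
```

A marker is declared significant when its observed LOD exceeds the permutation threshold; taking the maximum LOD across all markers in each permutation is what controls the genome-wide error rate.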

Gene ontology and Kyoto Encyclopedia of Genes and Genomes pathways
The SoyBase database (Wm.82 version 2) was used to analyze the gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways of the candidate QTL and to identify which proteins are coded for in the CIM interval (Grant et al., 2010). The UniProt Consortium database was then used to determine what these proteins do within the plant so that overall gene function could be understood.

RESULTS

Phenotypic variation
The distribution of FLS scores in the first experiment (EF_1) was normal (P = 0.158), with a kurtosis of 0.004 and a skewness of 0.31. Overall, the average FLS score was 3.23 ± 1.32, and the scores ranged from 1 to 7.25. Five lines were identified as more resistant than Essex (average score, 1.50 ± 0.50), whereas two lines were more susceptible than Forrest (average score, 5.75 ± 2.49) (Figure 1). The lines more resistant than Essex were E × F 2, E × F 9, E × F 10, E × F 11, and E × F 54 (average score, 1.0 ± 0). The lines more susceptible than Forrest were E × F 29 (average score, 7.25 ± 1.79) and E × F 63 (average score, 6.0 ± 2.0). The distribution of scores in the second experiment (EF_2) was also normal (P = 0.644), with a kurtosis of -0.460 and a skewness of 0.385. The average FLS score was 3.05 ± 1.13, and the scores ranged from 1 to 5.75 (Figure 2). The FLS scores for the parental lines "Essex" and "Forrest" were 2 and 4.5, respectively. A total of 13 lines were more resistant than "Essex" (average score, 1.48 ± 0.18) and 9 lines were more susceptible than "Forrest" (average score, 5.17 ± 0.22).

Construction of genetic linkage map
A genetic map was created with a total of 1,959 markers across 20 chromosomes (Figure 3). The total map length was 2121.01 cM, with an average distance between markers of 1.08 cM (Table 1). The average chromosome length was 105.05 cM, with an average of 97.95 markers per chromosome. The longest chromosome was Chr. 19, with a length of 133.66 cM and 95 markers, whereas the shortest was Chr. 16, with a length of 84.27 cM and 55 markers. The most marker-dense chromosome was Chr. 3, with 1.17 markers/cM. Gaps of < 5 cM occurred at a rate of 99.97%.
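The summary figures above follow from simple ratios, which can be checked with a short sketch. All input values are taken from this section; the variable names are illustrative, and the average spacing is assumed here to be total map length divided by the number of markers, which reproduces the reported 1.08 cM.

```python
total_markers = 1959
total_length_cm = 2121.01
n_chromosomes = 20

# Average spacing, assumed to be total length divided by marker count.
mean_spacing_cm = total_length_cm / total_markers
mean_markers_per_chr = total_markers / n_chromosomes

# Marker density (markers per cM) for the longest and shortest chromosomes.
chr19_density = 95 / 133.66
chr16_density = 55 / 84.27

print(f"average spacing: {mean_spacing_cm:.2f} cM")
print(f"markers per chromosome: {mean_markers_per_chr:.2f}")
print(f"Chr. 19 density: {chr19_density:.2f} markers/cM")
print(f"Chr. 16 density: {chr16_density:.2f} markers/cM")
```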

Identification of QTL
In EF_1, the ss715614578-ss715615158 interval (Position: 61.81-69.27 cM) was identified to underlie FLS resistance on chromosome 13 (LG F). A single peak was observed at the ss715614724 marker (Position: 64.04 cM) with a LOD score of 6.36; the phenotypic variation explained by the QTL was 14.33%. The beneficial allele was derived from Forrest. In EF_2, the ss715634685-ss715634842 interval (Position: 86.71-90.21 cM) was identified to underlie FLS resistance on chromosome 19 (LG L). A single peak was observed at the ss715634723 marker (Position: 87.50 cM) with a LOD score of 6.64; the phenotypic variation explained by this QTL was 14.72%. The beneficial allele was derived from Essex (Table 2).

Resistance
The RILs that were more resistant than Essex were found to have a Forrest-like genotype at ss715614724 (Table 3), whereas those that were more susceptible than Forrest were found to have Essex-like alleles at the same location. These results suggested that Forrest was the parent contributing the resistance allele at this QTL. To confirm this hypothesis, a one-way ANOVA was conducted comparing FLS scores of all RILs (n=81). This test compared lines with Forrest-like alleles, Essex-like alleles, and recombinant genotypes (Figure 4). The ANOVA was statistically significant at the 95% confidence level (F(2,80) = 7.64, P < 0.0009). Lines with Forrest-like alleles had mean FLS ratings 1.15 points lower than lines with Essex-like alleles, which equates to approximately 11.5% less foliar damage. Heterozygous lines were not statistically different from either Forrest-like or Essex-like lines.
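For readers unfamiliar with the test, the F statistic of a one-way ANOVA can be sketched as follows. The three score lists are hypothetical values invented for illustration and are not the study's data; the function names are also assumptions. The last helper reflects the rating scale, in which one rating unit spans about 10 percentage points of leaf area, so a difference of 1.15 rating units corresponds to roughly 11.5%.

```python
def one_way_anova(groups):
    """Return (F, df_between, df_within) for a list of score lists."""
    values = [v for group in groups for v in group]
    n, k = len(values), len(groups)
    grand_mean = sum(values) / n
    means = [sum(group) / len(group) for group in groups]
    ss_between = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_between, df_within = k - 1, n - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


def rating_difference_to_percent(rating_difference):
    """One rating unit spans about 10 percentage points of leaf area."""
    return rating_difference * 10


# Hypothetical FLS scores grouped by genotype at the peak marker.
forrest_like = [2, 3, 2, 3]
essex_like = [4, 3, 4, 5]
recombinant = [3, 4, 3, 2]

f_stat, df_between, df_within = one_way_anova([forrest_like, essex_like, recombinant])
print(f"F({df_between},{df_within}) = {f_stat:.2f}")
print(f"{rating_difference_to_percent(1.15):.1f}% difference in affected leaf area")
```

The P value would then be read from the F distribution with the returned degrees of freedom, a step omitted here to keep the sketch free of external libraries.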

GO and KEGG pathways
Within the ss715614578-ss715615158 interval (Chr. 13), a wide variety of genes have been identified and published (Table 4) (Nelson et al., 2010). The genes nearest to the peak at ss715614724 are BT089187.1 and M31024.1, both of which code for ribosomal protein S11. This protein resides within the cytosolic small ribosomal subunit and plays a major role in rRNA binding and overall ribosomal structure. A total of 9 genes have been identified and published within the ss715634685-ss715634842 interval, including one coding for an NBS-LRR disease resistance protein (Table 5) (UniProt Consortium, 2020).

DISCUSSION
The parents of the E × F population were scored for FLS resistance. Forrest received an FLS score 3.8-fold higher than Essex in EF_1 (5.75 vs. 1.50) and 2.3-fold higher than Essex in EF_2 (4.5 vs. 2), confirming that Forrest is more susceptible to C. sojina race 15. These results align with those presented in prior studies on resistance to race 2 (Sharma and Lightfoot, 2017). Because the score distributions fit a normal distribution with skewness near zero, segregation appears to have contributed equally to high and low FLS scores. A single QTL associated with FLS resistance was identified on Chr. 13 at the ss715614578-ss715615158 interval, which coincides with the region of SNP41647 that is known for Rcs (PI594891) in linkage group F (Pham et al., 2015). PI594891 is a Chinese plant introduction, and its resistance pathway is not yet well documented (Hoskins, 2011). Our QTL could be allelic to Rcs (PI594891). It is believed that this resistance gene is conditioned by Rcs3, but it likely carries different resistance alleles from one or two other genes (Pham et al., 2015). Another QTL associated with FLS resistance was identified on Chr. 19 at the ss715634685-ss715634842 interval; this QTL has not been previously reported.
In the present study, Forrest contributed the resistance allele in EF_1, whereas Essex contributed the resistance allele in EF_2. The results in EF_1 contradict prior studies on race 2, in which Essex donated the resistance allele (Sharma and Lightfoot, 2017). Since Rcs2 generally confers resistance to race 2, we assumed the existence of a different resistance mechanism for race 15. Although it seems counterintuitive for Forrest to donate the resistance allele, it is plausible because Forrest was only partially susceptible. The use of only race 15 of C. sojina may also have played a role in this finding. More research should be conducted on which specific races Forrest is susceptible to; race 15 may be one to which Forrest holds resistance. Many earlier resistance tests used mixed races, so their results can differ from those obtained with individual races. In this study, the QTL detected were minor, explaining 14.33% and 14.72% of the variance, probably because of the low disease pressure across the experiments. Therefore, differences among genes of small effect might not have been identified. Future research is needed under field conditions with relatively high disease pressure to confirm the presence of the QTL and identify any interaction with the environment. In addition, the use of mixed races or other individual races of C. sojina would help to better understand the underlying mechanism of resistance and the role of the QTL. Marker ss715614724 could be used in future breeding projects to fine-tune marker-assisted selection for resistance to FLS.
The QTL in EF_1 was found to be associated with ribosomal protein S11. In soybean, ribosomal protein S11 was significantly elevated when immature plants were treated with 2,4-D (Gantt and Key, 1985). Since that study, the presence of S11 has been associated with cellular proliferation. It is abundant in meristematic tissue and allows the plant to produce new cells efficiently (Lenvik et al., 1994). We therefore hypothesize that the identified SNP alters the amount of S11 produced in the plant and allows it to overcome damage from C. sojina. An NBS-LRR disease resistance protein was identified within the ss715634685-ss715634842 interval on Chr. 19; these proteins serve as a protein interaction platform and may lead to cell death (Belkhadir et al., 2004). This protein may contribute to FLS resistance in soybean.
According to SoyBase, the published genes nearest to the ss715634723 marker identified in EF_2 are CYP98A2 and AK287176.1, both of which code for cytochrome P450-98A2. This protein functions in metal binding and has oxidoreductase activity (The UniProt Consortium, 2020). Cytochrome P450 enzymes are a large class of monooxygenases that aid in various plant functions, from biosynthesis of pigments to plant hormone production. Most famously, cytochrome P450 enzymes degrade herbicides, insecticides, and pollutants introduced to the plant (Guttikonda et al., 2010). Further research is needed to determine how this gene could function to provide protection from C. sojina.

Conclusions
In summary, we report a QTL on Chr. 13 that is related to Rcs (PI594891) and to production of the S11 ribosomal protein, which aids in cell proliferation, and a novel QTL on Chr. 19 associated with cytochrome P450-98A2. The associated markers ss715614724 and ss715634723 could be used in future projects to stack resistance genes for FLS. Environment played a large part in our experiments, and future studies should be conducted with higher and more consistent disease pressure to determine whether the identified QTL could confer a higher percentage of resistance. Overall, Forrest and its derivatives are a good source for the advancement of FLS resistance in soybean.