Genomic prediction of leaf rust resistance to Arabica coffee using machine learning algorithms

Ithalo Coelho de Sousa; Moysés Nascimento; Gabi Nunes Silva; Ana Carolina Campana Nascimento; Cosme Damião Cruz; Fabyano Fonseca e Silva; Dênia Pires de Almeida; Kátia Nogueira Pestana; Camila Ferreira Azevedo; Laércio Zambolim; Eveline Teixeira Caixeta

doi:10.1590/1678-992X-2020-0021

Authors

Ithalo Coelho de Sousa Universidade Federal de Viçosa – Depto. de Estatística https://orcid.org/0000-0001-8456-9349
Moysés Nascimento Universidade Federal de Viçosa – Depto. de Estatística https://orcid.org/0000-0001-5886-9540
Gabi Nunes Silva Universidade Federal de Rondônia – Depto. de Matemática e Estatística https://orcid.org/0000-0003-4161-9267
Ana Carolina Campana Nascimento Universidade Federal de Viçosa – Depto. de Estatística https://orcid.org/0000-0002-6985-1490
Cosme Damião Cruz Universidade Federal de Viçosa – Depto. de Biologia Geral https://orcid.org/0000-0003-3513-3391
Fabyano Fonseca e Silva Universidade Federal de Viçosa – Depto. de Zootecnia https://orcid.org/0000-0001-9536-1113
Dênia Pires de Almeida Universidade Federal de Viçosa/Instituto de Biotecnologia Aplicada à Agropecuária https://orcid.org/0000-0003-2871-9618
Kátia Nogueira Pestana Embrapa Mandioca e Fruticultura https://orcid.org/0000-0002-3049-9119
Camila Ferreira Azevedo Universidade Federal de Viçosa – Depto. de Estatística https://orcid.org/0000-0003-0438-5123
Laércio Zambolim Universidade Federal de Viçosa – Depto. de Fitopatologia https://orcid.org/0000-0001-5703-5069
Eveline Teixeira Caixeta Embrapa Café https://orcid.org/0000-0001-8850-6273

DOI:

https://doi.org/10.1590/1678-992X-2020-0021

Keywords:

Hemileia vastatrix, statistical learning, plant breeding, artificial intelligence

Abstract

Genomic selection (GS) emphasizes the simultaneous prediction of the genetic effects of thousands of markers scattered across the genome. Several statistical methodologies have been used in GS to predict genetic merit. In general, these methodologies make assumptions about the data, such as normally distributed phenotypic values. To circumvent non-normality of the phenotypic values, the literature suggests Bayesian Generalized Linear Regression (GBLASSO). An alternative is machine learning models, such as Artificial Neural Networks (ANN), Decision Trees (DT), and DT refinements such as Bagging, Random Forest, and Boosting. This study aimed to use DT and its refinements to predict resistance to orange rust in Arabica coffee. DT and its refinements were also used to assess the importance of markers related to the trait of interest. The results were compared with those from GBLASSO and ANN. The data comprised coffee rust resistance records for 245 Arabica coffee plants genotyped for 137 markers. The DT refinements yielded Apparent Error Rate values equal to or lower than those obtained with DT, GBLASSO, and ANN. Moreover, the DT refinements identified important markers for the trait of interest: of the 14 most important markers identified by each methodology, 9.3 on average were located in regions of quantitative trait loci (QTLs) for disease resistance reported in the literature.
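
To make the quantities named in the abstract concrete, the following is a minimal sketch of bagged depth-1 decision trees (stumps), the Apparent Error Rate, and a split-count marker importance. It is not the authors' implementation: the data are synthetic, the 0/1 marker coding, the binary resistant/susceptible phenotype, and every function name are assumptions made for illustration, and the error rate here is computed by resubstitution on the training data, which may differ from the validation scheme used in the study.

```python
import random
from collections import Counter

def apparent_error_rate(observed, predicted):
    """Fraction of individuals whose predicted class differs from the observed class."""
    wrong = sum(o != p for o, p in zip(observed, predicted))
    return wrong / len(observed)

def fit_stump(X, y):
    """Depth-1 decision tree: choose the marker and label orientation with the fewest errors."""
    best = None
    for j in range(len(X[0])):
        for flip in (0, 1):
            pred = [row[j] ^ flip for row in X]
            err = apparent_error_rate(y, pred)
            if best is None or err < best[0]:
                best = (err, j, flip)
    return best[1], best[2]  # (marker index, orientation)

def predict_stump(stump, row):
    j, flip = stump
    return row[j] ^ flip

def fit_bagging(X, y, n_trees=25, seed=0):
    """Bagging: fit one stump per bootstrap resample of the plants."""
    rng = random.Random(seed)
    n = len(X)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        stumps.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return stumps

def predict_bagging(stumps, row):
    """Majority vote over the bagged stumps."""
    votes = Counter(predict_stump(s, row) for s in stumps)
    return votes.most_common(1)[0][0]

def marker_importance(stumps):
    """Hypothetical importance measure: how often each marker was chosen as the split."""
    return Counter(j for j, _ in stumps)

# Synthetic example (NOT the study's data): 60 plants, 5 binary markers.
# The phenotype follows marker 2, with every 10th plant's label flipped as noise.
rng = random.Random(42)
X = [[rng.randint(0, 1) for _ in range(5)] for _ in range(60)]
y = [row[2] ^ (1 if i % 10 == 0 else 0) for i, row in enumerate(X)]

stumps = fit_bagging(X, y)
predicted = [predict_bagging(stumps, row) for row in X]
aer = apparent_error_rate(y, predicted)
importance = marker_importance(stumps)

print("Apparent Error Rate:", aer)
print("Split counts per marker:", dict(importance))
```

Counting how often a marker is selected as a split is only one simple way to rank markers; Random Forest and Boosting implementations typically report impurity-based or permutation-based importance instead.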
