Data science for pattern recognition in agricultural large time series data: A case study on sugarcane sucrose yield



Abstract

Data science (DS) is one of the most versatile disciplines, with applications in almost any field of knowledge. It supports the optimization of everyday processes and the analysis of massive amounts of data. DS combines computer programming with mathematical and statistical tools in environments such as Python, Julia, and R, among others. Here, we propose a protocol that applies DS tools to the organization, visualization, and analysis of historical data from sugarcane production systems in the tropics, as a basis for identifying patterns associated with sucrose. The protocol consists of four phases: (i) data collection and organization; (ii) data management, cleaning, and incorporation of new variables; (iii) visualization; and (iv) analysis and modeling based on a multi-approach strategy that uses a frequentist model (generalized linear model), a regularized regression model (Lasso), and machine learning models (AutoML). Each phase was implemented with multiple algorithms and techniques to automate processes such as queries, numerical calculations, sorting, grouping, dividing, pivoting, totalizing, concatenation, cleaning, visualization, and model fitting, using the free Python language and libraries including Pandas, NumPy, Plotly, Matplotlib, SciPy, PySpark, Scikit-learn, and Statsmodels, among others. Each phase allowed the elimination of variables that obscured the analysis, based on criteria such as Pearson correlation, exploratory analysis, and modeling. The variables that added value to the analysis were identified: climatic variables were the most informative, whereas soil-related variables contributed the least. Our results offer an alternative to traditional analyses in the agricultural sector, based on a step-by-step protocol for the responsible use of DS to understand the behavior and historical temporal patterns of sucrose in sugarcane.
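To make the variable-elimination idea concrete, the following is a minimal sketch of two of the steps named in the abstract: screening by Pearson correlation, then a Lasso fit. It is not the authors' implementation. The data are synthetic, and the column names, the 0.95 threshold, and the helper `drop_correlated` are all assumptions introduced for illustration only.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LassoCV
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
n = 300

# Synthetic stand-in data (NOT from the study); all column names are hypothetical.
df = pd.DataFrame({
    "temp_mean": rng.normal(24, 2, n),
    "rainfall": rng.normal(100, 30, n),
    "radiation": rng.normal(18, 3, n),
    "soil_ph": rng.normal(6.5, 0.4, n),
})
# A near-duplicate of temp_mean, to give the correlation screen something to remove.
df["temp_max"] = df["temp_mean"] + 5 + rng.normal(0, 0.1, n)
# By construction, the response depends only on two climatic variables plus noise.
df["sucrose"] = (12 + 0.5 * df["temp_mean"] - 0.02 * df["rainfall"]
                 + rng.normal(0, 0.5, n))


def drop_correlated(features, threshold=0.95):
    """Drop the later column of each pair whose |Pearson r| exceeds threshold."""
    corr = features.corr(method="pearson").abs()
    # Keep only the strict upper triangle so each pair is examined once.
    upper = corr.where(np.triu(np.ones(corr.shape, dtype=bool), k=1))
    to_drop = [col for col in upper.columns if (upper[col] > threshold).any()]
    return features.drop(columns=to_drop), to_drop


# Step 1: remove redundant predictors by Pearson correlation.
X, dropped = drop_correlated(df.drop(columns="sucrose"))

# Step 2: standardize, then fit a Lasso with a cross-validated penalty.
X_scaled = StandardScaler().fit_transform(X)
model = LassoCV(cv=5, random_state=0).fit(X_scaled, df["sucrose"])

# Predictors with a non-zero coefficient are the ones the Lasso retained.
coefs = pd.Series(model.coef_, index=X.columns)
selected = coefs[coefs != 0].index.tolist()

print("Dropped by correlation screen:", dropped)
print("Retained by Lasso:", selected)
```

Because the synthetic response is built from `temp_mean` and `rainfall` alone, this only shows the mechanics of the two steps. It is not evidence about which variables matter in real sugarcane data.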
