Mixing process-based and data-driven approaches in yield prediction
Yield prediction models can be divided into data-driven models and process-based models (crop growth models). The first category contains many different types of models whose parameters are learned from the data themselves, and in which domain knowledge is used only to select the predictors and engineer features. The second category comprises models built on biophysical principles, with structure and parameters derived primarily from domain knowledge.
Here we investigate whether integrating the two approaches can be beneficial, since it may overcome the limitations of each approach taken individually: the lack of sufficiently large, reliable and orthogonal datasets for data-driven approaches, and the need for many inputs for process-based models. We review the applications of both categories of models, paying special attention to cases where the two approaches have been mixed.
By analysing the literature, we identified three major cases of integration between the two approaches:
- using crop growth models to engineer features and expand the predictor space,
- using data-driven approaches to estimate missing inputs for process-based models,
- using data-driven approaches to produce meta-models that reduce the computational burden.
Finally, we propose a methodology based on meta-models and transfer learning to integrate data-driven and process-based approaches.
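To make the idea of combining a meta-model with transfer learning more concrete, here is a minimal sketch in Python. It is not the methodology of the paper, which the abstract does not specify. It assumes a toy stand-in for a crop growth model (`toy_crop_model`), a quadratic ridge regression as the meta-model, and a simple form of transfer in which the weights fitted on simulations serve as the prior when fitting a handful of observations. All function names, parameter values and the synthetic "observations" are hypothetical.

```python
import numpy as np


def toy_crop_model(temp, rain):
    """Hypothetical stand-in for a process-based crop growth model (t/ha)."""
    temp_effect = np.exp(-(((temp - 22.0) / 6.0) ** 2))  # optimum near 22 degrees C
    water_effect = rain / (rain + 300.0)                  # saturating water response
    return 12.0 * temp_effect * water_effect


def features(temp, rain):
    """Quadratic polynomial features of roughly standardised inputs."""
    t = (temp - 22.0) / 6.0
    r = (rain - 500.0) / 250.0
    return np.column_stack([np.ones_like(t), t, r, t * r, t**2, r**2])


def fit_ridge(X, y, alpha, prior=None):
    """Ridge regression that shrinks the weights towards `prior` (zero if None)."""
    if prior is None:
        prior = np.zeros(X.shape[1])
    A = X.T @ X + alpha * np.eye(X.shape[1])
    return np.linalg.solve(A, X.T @ y + alpha * prior)


rng = np.random.default_rng(0)

# Step 1: train the meta-model on many cheap simulations of the process model.
temp_sim = rng.uniform(12.0, 32.0, 2000)
rain_sim = rng.uniform(100.0, 900.0, 2000)
X_sim = features(temp_sim, rain_sim)
y_sim = toy_crop_model(temp_sim, rain_sim)
w_meta = fit_ridge(X_sim, y_sim, alpha=1e-3)

# Step 2: a small set of synthetic "observed" yields that deviate
# systematically from the simulator (assumed bias plus noise).
temp_obs = rng.uniform(12.0, 32.0, 15)
rain_obs = rng.uniform(100.0, 900.0, 15)
X_obs = features(temp_obs, rain_obs)
y_obs = 0.8 * toy_crop_model(temp_obs, rain_obs) + 0.5 + rng.normal(0.0, 0.3, 15)

# Step 3: transfer. Refit on the observations while shrinking towards the
# meta-model weights instead of towards zero.
w_transfer = fit_ridge(X_obs, y_obs, alpha=5.0, prior=w_meta)

# Training error of the meta-model relative to the simulator.
rmse_meta = float(np.sqrt(np.mean((X_sim @ w_meta - y_sim) ** 2)))
print("meta-model RMSE against the simulator:", round(rmse_meta, 3))
print("transferred weights:", np.round(w_transfer, 3))
```

The `alpha` argument in step 3 controls how much the observations are trusted: a large value keeps the weights close to the simulation-derived meta-model, while a small value lets the few observations dominate.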
B. Maestrini, G. Mimic, P.A.J. van Oort, K. Jindo, S. Brdar, I. N. Athanasiadis, F. K. van Evert, Mixing process-based and data-driven approaches in yield prediction, European Journal of Agronomy, 139:126569, 2022, doi:10.1016/j.eja.2022.126569.