[PhD student] Romain Gautron: Reinforcement learning for developing country agriculture: a virtual personal adviser for risk mitigation and multi-objectives optimization

Thesis topic cofunded by #DigitAg

Starting date: January 2019
University: Montpellier MUSE / Institut Agro
PhD school: GAIA
Scientific field: Agronomy, IT
Thesis management: Marc Corbeels, Cirad AIDA / Philippe Preux, Inria Lille, Sequel
Thesis management: Eric Malezieux, Cirad HortSys / Odalric Ambrym Maillard, Inria Lille, Sequel
Funding: Cirad – CGIAR
#DigitAg: Labeled PhD – Axis 1: Impact of information and communication technologies on the rural world, Challenge 5: Agricultural advisory services

Keywords: Artificial intelligence; reinforcement learning; small farmers; southern countries; developing countries; recommendation system; crop itineraries; interactive voice server; smartphone; crowdsourcing

Abstract: Reinforcement learning (RL) has made spectacular progress in recent years, especially when combined with deep learning for game playing, as with DeepMind’s AlphaGo (https://deepmind.com/blog/alphago-zero-learning-scratch/). Although RL has been applied in marketing and pharmaceutical contexts, agriculture remains largely unexplored ground for RL applications. Agriculture is fundamentally a matter of sequential decision making under uncertainty, which makes RL techniques particularly appropriate. With growing data collection at different scales through technologies such as remote sensing, ground-based sensors and real-time customer feedback via smartphone applications or interactive voice response (IVR) systems, a new framework of active and online learning is emerging in which RL appears increasingly relevant. Current decision support systems for agriculture use static if-then-else rules for decision making, based on statistical models complemented with agronomic and farm data, remote sensing, and limited use of machine learning outputs. The rules tend to be defined for specific management tasks (e.g. irrigation or fertilization), yet they tend not to include the whole sequence of decisions, the profile of the farmer making the decision, optimization for multiple objectives (e.g. economic, environmental, social), or stochasticity (risk quantification). All of these are important for achieving relevant, secure and personalized agronomic recommendations. These objectives can be met thanks to recent RL advances. RL techniques make it possible to account for the uncertain impact of choices (such as crop management) and for stochastic events (such as pests or weather). With model-free methods, problems that are too complex to be explicitly defined (in our case whole-farm planning) can be learned directly by interacting with the environment (i.e. by trying in the real world the action estimated to be the best).
This is a trial-and-error learning framework: a recommendation is given, its impact (called the ‘reward’) is assessed, and future recommendations are adjusted according to past experience. These techniques can also learn from historical data (e.g. soil, crop and climate data), linking them to what is called a ‘context’, which makes learning quicker and richer while quantifying the uncertainty attached to a recommendation. First, this PhD aims to design RL algorithms tailored to agriculture, using state-of-the-art techniques, to give continuously improving advice to farmers, especially in a developing country context with limited data. By combining RL expertise from Inria (SEQUEL research team) with agronomic knowledge from CIAT, CIRAD and other agricultural research-for-development institutes, we aim to build a novel multi-objective (economic, social and environmental) RL system that delivers personalized, real-time recommendations for agronomic practices, with a risk quantification component. This system will be operationalized as a personal virtual adviser that can be queried via IVR systems or smartphone applications. The virtual adviser learns by itself from earlier experience (past recommendations followed by feedback from farmers) while also relying on existing knowledge (historical data). Recommendations are provided according to space, time and the individual farmer’s context. For instance, a recommendation would take the form “The choice expected to maximize your objectives is planting maize variety x in the yth week of August with a density of z plants per hectare”. Second, once the algorithms have been tested in silico and are ready for use, use cases will be conducted with smallholder farmers in Malawi.
Available datasets consist of multi-dimensional data integrating location, normalized difference vegetation index, precipitation, temperature, soils, land use, and historical crop productivity, as well as Digital Globe archives of sub-meter resolution satellite images, used to delineate favorable areas and key limiting factors for maize production across the whole national territory of Malawi. Testing will be conducted through the AirTel/Viamo M’Chikumbe interactive voice response-based advisory service, which already has 726,000 registered farmers in Malawi. It offers a powerful way to interact directly with individual farmers via their mobile phones and to provide tailored guidance across the entire country. If the experiments are successful, the methods and techniques developed in this PhD are expected to be extendable to other project sites, e.g. in Colombia, Nigeria, Mexico and India, that is to say to millions of smallholder farmers.
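The trial-and-error loop described in the abstract (give a recommendation, observe its reward, adjust future recommendations according to the context) can be illustrated with a toy contextual bandit. This is only a minimal sketch under stated assumptions, not the algorithm developed in the thesis: the epsilon-greedy rule, the action names (`week_1` to `week_3`), the two contexts (`dry_zone`, `wet_zone`) and the simulated reward function are all hypothetical stand-ins for real recommendations and farmer feedback.

```python
import random
from collections import defaultdict

# Hypothetical action set: candidate planting weeks (not taken from the thesis).
ACTIONS = ["week_1", "week_2", "week_3"]


class EpsilonGreedyAdviser:
    """Toy contextual bandit: one running mean reward per (context, action)."""

    def __init__(self, actions, epsilon=0.1, seed=0):
        self.actions = list(actions)
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.counts = defaultdict(int)   # (context, action) -> number of trials
        self.means = defaultdict(float)  # (context, action) -> mean observed reward

    def recommend(self, context):
        # Explore with probability epsilon, otherwise exploit the best estimate.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.means[(context, a)])

    def update(self, context, action, reward):
        # Incremental mean update from the observed feedback (the "reward").
        key = (context, action)
        self.counts[key] += 1
        self.means[key] += (reward - self.means[key]) / self.counts[key]


def simulated_yield(context, action, rng):
    """Made-up noisy reward that stands in for real farmer feedback."""
    best = {"dry_zone": "week_3", "wet_zone": "week_1"}[context]
    base = 1.0 if action == best else 0.4
    return base + rng.gauss(0.0, 0.1)


def run(n_rounds=2000, seed=0):
    """Recommend, observe the reward, update: the trial-and-error loop."""
    rng = random.Random(seed)
    adviser = EpsilonGreedyAdviser(ACTIONS, epsilon=0.1, seed=seed)
    for _ in range(n_rounds):
        context = rng.choice(["dry_zone", "wet_zone"])
        action = adviser.recommend(context)
        adviser.update(context, action, simulated_yield(context, action, rng))
    return adviser


if __name__ == "__main__":
    adviser = run()
    for ctx in ("dry_zone", "wet_zone"):
        best = max(ACTIONS, key=lambda a: adviser.means[(ctx, a)])
        print(ctx, "->", best)
```

A real system along the lines of the abstract would additionally need multiple objectives and an explicit measure of risk, which this sketch leaves out.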

Contact: r.gautron [AT] cgiar.org

Social network: LinkedIn

Personal website: https://romaingautron.fr/

Modification date: 23 August 2024 | Publication date: 22 August 2022 | By: ZM