#DigitAg Master2 Internship Offers

Every year, #DigitAg funds Master 2 internships for French and international students.

 

In 2021, internship topics are offered across several disciplines: Human and Economic Sciences, Environment and Life Sciences, Engineering, and Mathematics.

  • To apply, please send your CV and cover letter to the relevant contact from the list below.
  • Allowance: #DigitAg awards internship grants to the research units; ask the contact person for the amount of your allowance.

Maths and its applications

Environment and life sciences

Human and Economic sciences

Engineering sciences

Maths and its applications

Using image analysis to estimate plant leaf area in crop mixtures

  • Keywords: image analysis, phenotyping, intercrops, water stress
  • Duration: 6 months
  • Desired start date: 1/02/2021
  • Research Unit: AGIR, Inrae
  • Contact: Casadebaig Pierre – casadebaig@inrae.fr
  • #DigitAg: Axis 5 – Challenge 1

Increasing plant diversity in agriculture is suggested as one of the main mechanisms to move towards more sustainable production systems. Plant diversity enhances and stabilizes primary production through complementarity between plants. Currently, the challenge is to determine which species mixtures improve the overall performance of the agroecosystem through better use of available environmental resources in the crop system considered. However, crop physiology in species mixture conditions is much less studied than in monoculture, which is an important prerequisite to understand and leverage ecological processes in cropping systems.

In this study, we will focus on growth and water consumption processes in species mixtures, using automated plant measurements and image analysis.

Our aim for crop physiology is to assess whether mixture conditions affect crop water consumption compared with sole-crop conditions. This work requires a key methodological development: building a model that estimates plant leaf area as a function of features extracted from images acquired automatically several times a week, for both pure and mixed plant stands.

The data required to develop this method will come from an experiment under controlled conditions. We will use the high-throughput phenotyping platform (Heliaphen, INRAE TPMP), where a robot manages the cultivation of plants grown in pots and automatically acquires image data and water consumption data.

We will focus on two species (wheat and pea) grown alone or in mixture (three combinations) under two water scenarios (well-irrigated and water-limited).

Four steps are proposed: 1) extract plant functional characteristics from images acquired daily with a robot; 2) complement these data with manual measurements of plant leaf area in order to validate the modeling step; 3) develop a model to predict plant leaf area from features extracted from images; and 4) analyze the data in order to compare water consumption between species growing in mixture or alone.
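As an illustration of step 3, the sketch below fits a straight line relating one image feature to manually measured leaf area. It is only a minimal example under stated assumptions: the feature (a green pixel count per image), the calibration values and the function names are hypothetical and do not come from the Heliaphen platform or the project.

```python
from statistics import mean


def fit_line(x, y):
    """Ordinary least squares fit of y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


# Hypothetical calibration data: green pixel count extracted from each image
# and leaf area (cm^2) measured by hand on the same plants.
pixels = [1000, 2000, 3000, 4000]
leaf_area_cm2 = [52.0, 101.0, 149.0, 202.0]

slope, intercept = fit_line(pixels, leaf_area_cm2)


def predict_leaf_area(pixel_count):
    """Predict leaf area (cm^2) from a pixel count using the fitted line."""
    return slope * pixel_count + intercept


print(round(predict_leaf_area(2500), 1))
```

In practice the model would likely use several features and be validated against the manual measurements of step 2; a single-feature line is only the simplest starting point.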

Building the decision to optimise weed control

  • Keywords: Data mining, Machine Learning, weeds, tropical, agroecology, expert system, decision support
  • Duration: 6 months
  • Desired start date: 3/02/2021
  • Research Unit: Aida, Cirad
  • Contact: Auzoux Sandrine – auzoux@cirad.fr
  • #DigitAg: Axis 5 – Challenge 1, Challenge 3, Challenge 8

Weeds are a major constraint on tropical agricultural production, leading to crop yield losses of 30-80%. Optimizing weed control practices in crop management requires a good knowledge of weed behaviour. This internship is part of a collaborative research initiative that combines data science approaches with field knowledge from experimental trials in agroecology. The aim of the internship is to build a decision support tool for farmers, enabling them to optimize weed management practices according to pedoclimatic factors, technical itineraries and knowledge of the life cycle of species on Reunion Island.

This work will be based on a large volume of data (floristic surveys, environmental factors, taxonomy, and phenology of weed species) that has been published on the CIRAD Dataverse in the form of files that are standardized from a syntactic, semantic and structural point of view.

The construction of the decision-support tool will be based on three steps:

  1. Predict the influence of agronomic and ecological factors on the presence and abundance of species.

Based on the 1,341 surveys carried out on Reunion Island and the 300 taxa observed, a machine learning method will be developed to predict the probability of presence and the abundance of the various weed species. The abundance of species in a plot will be predicted from factors that farmers can easily provide, such as field location, soil type, altitude, season of intervention and the technical itinerary of the crop.

  2. Predict the aggressiveness of weeds by taking into account their phenology.

Based on the expertise of CIRAD weed scientists, the abundance predictions from the first step will be combined with aggressiveness indices for each species to estimate the potential harmfulness of weeds in the plot under consideration. In addition, by integrating the observations made on the phenology of the species (emergence, flowering), which are themselves linked to the pedoclimatic factors mentioned above, an optimal intervention window will be defined and technical options will be recommended for implementing weed management practices.

  3. Recommend the most appropriate means of control.

Taking into account the predicted abundance of species, the range of intervention and the technical itinerary, as well as the expertise of local researchers, an expert decision system will be built to propose the most appropriate control methods (alternative methods to herbicides or optimized application of herbicides) to the agro-ecological context and according to the technical possibilities of the farmer.

This decision-support tool will be developed in the Reunionese context and for sugarcane, but in a sufficiently generic way to allow reconfiguration for other cropping systems, other territories and other climates. It will thus be possible to propose suitable weeding practices based on a predicted type of weed infestation, in a given location, at a given date, for a given cropping system.

Environment and life sciences

Monitoring the evolution of body reserves in dairy heifers: a tool to prevent early culling?

  • Keywords: body reserves; 3D imaging; culling and welfare; real time measurements and analysis.
  • Duration: 6 months + 2 months short-term contract
  • Desired start date: 15/02/2021
  • Research Unit: Pegase, Inrae
  • Contact: Le Cozler Yannick – lecozler@agrocampus-ouest.fr
  • #DigitAg: Axis 2, Axis 3 – Challenge 1, Challenge 2, Challenge 4

Because they do not produce milk and are often reared away from the farm, replacement dairy heifers receive less attention than dairy cows. However, they represent half of the animals on a farm, and it is during the rearing period that their productive career is prepared. Longevity, i.e. age at culling, will only be optimal if the rearing period provides good conditions for growth, health and well-being. This can be estimated indirectly through the evolution of indicators such as body weight (BW) or the level of body reserves. In France, nearly 1/4 of heifers are culled after first calving; despite this high figure, the causes are not known and should be the subject of future PhD work. However, the importance of body reserves in adult cows suggests that this criterion, especially its evolution during gestation, could strongly influence the risk of early culling in primiparous cows. Unlike for adult cows, there is no body condition score (BCS) grid for estimating reserve levels in heifers. Growth models based on BW gain have therefore been developed to estimate the deposition of lean and/or fat tissue. However, weighing is rarely done on farm, as it is costly and sometimes dangerous. Finally, because the heifer is a growing animal that does not produce milk, the kinetics of BW and BCS gains differ from those of the cow: a heifer can gain BW while losing BCS, a phenomenon rarely observed in cows. Automated analysis of 3D images allows real-time monitoring of BCS, estimated at the rump level in cows. In heifers, because of the growth dynamics described above, this learning would probably need to be reconsidered. For future widespread use, the capture tools must also be inexpensive and easy to use, based on sensors dedicated to 3D environmental recognition coupled, in the long term, with automated and rapid processing of the information.

Such devices are being tested on adult cows in the UMR PEGASE experimental facilities. The student will be in charge of developing a method for real-time BCS measurement in pregnant heifers, based on the expertise being developed in adult cows. BCS measurements will be performed 3 times per week, with animals receiving two different diets to obtain very different BCSs at calving. In addition to the analysis of animal performance, the objective will be to validate a method that takes into account possible missing and/or aberrant data, which are common in commercial herds.

Construction of an event detection tool (reproductive, sanitary, malfunction) using data from automatic water troughs in a herd of dairy cows and several groups of sows

  • Keywords: precision farming, event detection, water consumption, multi-species
  • Duration: 6 months
  • Desired start date: 1/03/2021
  • Research Unit: Pegase, Inrae
  • Contact: Boudon Anne – boudon@inrae.fr
  • #DigitAg: Axis 5 – Challenge 2

Precision farming tools now allow individual and automated tracking of dairy cows and sows. This follow-up, combined with the breeder’s observations, can lead to an earlier detection of a large number of events in which a breeder’s response is required (calving, health problems, malfunction of equipment). The UMR PEGASE validated, both on dairy cows and on pregnant sows, an individualized monitoring device for drinking behavior, consisting of connected water troughs able to record individual water consumption and drinking behavior.

The main hypothesis of this project is that large variations in water consumption can indicate health, reproduction or equipment malfunction events. Two databases collected at UMR PEGASE will allow this hypothesis to be tested. The dairy cow database consists of 2 years of data collection (water consumption, milk production, breeding and sanitary events) on a herd of 162 cows. The gestating sow database contains data on water consumption, feed intake and health events collected on 60 sows during various induced technical malfunction events (i.e. closure of one of the two feeders in a room, variation of the room temperature).

This internship will be carried out at UMR PEGASE of INRAE, under the joint supervision of a nutrition researcher (Anne Boudon) and a modelling researcher (Charlotte Gaillard). This internship will also be an opportunity for exchanges with researchers from UMR MoSAR. The internship monitoring committee will thus bring together researchers with expertise in the physiological regulation of water consumption and in the processing of dynamic livestock data for event detection.

At the start of the internship, the databases described above will be available, along with a first draft of an event detection program written in R. The first step of the internship will be to verify and structure the data, as well as to explore and describe their variability. The second step will consist of building an event detection process. The planned approach is to use a differential smoothing method, already implemented by UMR MoSAR, to detect disturbances in the dynamics of drinking signals. These disturbances will then be compared to the dynamics of events induced in sows and to recorded events in dairy cows. This disturbance detection method has the a priori advantage of being adaptable to the two very different contexts studied (lactating cows and pregnant sows). This internship should also be an opportunity to explore other possible approaches, in particular through an exchange with the Locadam unit of IRISA.
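To make the idea of detecting disturbances in a drinking signal concrete, here is a deliberately simple sketch. It is not the differential smoothing method of UMR MoSAR: it swaps in a rolling-median baseline and a relative deviation threshold, and the function name, window, threshold and data are all hypothetical.

```python
from statistics import median


def detect_disturbances(values, window=5, threshold=0.3):
    """Return indices whose value deviates from the median of the
    preceding `window` values by more than `threshold` (relative)."""
    flagged = []
    for i in range(window, len(values)):
        baseline = median(values[i - window:i])
        if baseline > 0 and abs(values[i] - baseline) / baseline > threshold:
            flagged.append(i)
    return flagged


# Hypothetical daily water consumption (litres) for one animal,
# with a sharp drop on day index 6.
daily_water_l = [80, 82, 79, 81, 80, 83, 50, 81, 80, 82]

print(detect_disturbances(daily_water_l))  # [6]
```

Flagged days could then be compared with the recorded or induced events to estimate how many disturbances correspond to real events.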

Human and Economic sciences

Digital Technologies and Circular Economy: Progress and Opportunities

  • Keywords: Digital technologies, Circular Economy, Data value chain
  • Duration: 6 months
  • Desired start date: 1/03/2021
  • Research Unit: Moisa, Inrae
  • Contact: Piot-Lepetit Isabelle – piot-lepetit@inrae.fr
  • #DigitAg: Axis 2 – Challenge 7

The use of digital technology on farms is supposed to improve productivity, traceability, working and living conditions of farmers, but also to limit negative impacts on the environment and fight against climate change (FAO, 2013). In addition, digital technologies are being considered to support and promote a transition to the circular economy; the latter being defined as an economic system of exchange and production which, at all stages of the life cycle of products (goods and services), aims to increase the efficiency of resource use and reduce the impact on the environment, while developing well-being (Ademe, 2020).

Indeed, digital technologies through the provision of information on the availability, location, and condition of products can support the development of production and consumption processes that minimize waste, increase the lifespan of products, and reduce induced transaction costs, either directly (monetary costs) or indirectly (time and effort required). In particular, digital technologies are seen as being able to help close the loop of the circular economy or at least reduce it in order to decrease the impact on the environment.

The purpose of the internship is to take stock of digital technologies supporting the development of the circular economy, in particular by (i) questioning their integration into the data value chain (collection – integration – analysis) and (ii) characterizing their contribution to the circular economy. In addition to the description of what already exists, special attention will also be paid to the challenges encountered by the actors involved and research needed to link digital technologies and the circular economy. The expected deliverable is a review of the literature exploring what is being done in different industries.

In order to characterize  elements specific to the agricultural sector, interviews will be carried out with actors developing and using digital technologies for the purposes of reducing the resource use or implementing activities based on a circularity principle. The actors to be interviewed will be selected based on the positioning of their digital technologies in the data value chain. The expected deliverable will offer examples of the use of digital technologies for circularity purposes in the agricultural sector and a very first synthesis of the implications in terms of changes in the economic system and the business models of the actors involved. Particular attention will be paid to elements concerning the creation and sharing of data, the collection and integration of these data, data analysis implemented and the use of the knowledge created, but also the needs of infrastructure, collaborations, and skills.

Engineering sciences

Participatory simulation of renewable resource management in hybrid environments

  • Keywords: Participatory simulation, Computer Vision, Smalltalk, Cormas
  • Duration: 8 months
  • Desired start date: 1/02/2021
  • Research Unit: Green, Cirad
  • Contact: Bommel Pierre – bommel@cirad.fr
  • #DigitAg: Axis 2, Axis 6 – Challenge 1, Challenge 2, Challenge 6, Challenge 7, Challenge 8

The ambition to remove the physical interfaces used to interact with computers is an old idea in the IT world that has taken several directions. The most emblematic one for the general public emerged in 2005 under the name “Project Natal”, and was taken up and popularized by Microsoft, whose latest version is called “Azure Kinect” (2019). This sensor detects movements and objects in the user’s field of vision and passes them to software as input. Since its consumer debut, it has been widely repurposed and extended to enable new ways of interacting with computers.

In addition, the group of researchers behind the development of an agent-based modeling platform called Cormas [1] is currently facing increased needs for interactivity in participatory simulations. Based on the Smalltalk language, Cormas has been developed since the late 1990s by a highly interdisciplinary team. Beyond the traditional uses of agent-based modeling, Cormas differs from other platforms through its innovative modeling processes: the ComMod approach (Companion Modeling [2]) aims to provide tools that let participants take an active part in the design of a model, and also immerse themselves in interactive game situations. The game (board or computer game) is then used as a support for discussing a real-life situation.

In 2012, for the first time, the team used an ultra-short focal projector to project a game board on a table. This simple change induced radically different behaviors of the actors allowing them to interact more naturally with computer simulations. In addition, the spatial configuration of the actors favors social interactions.  But until today, manual input by the facilitator of the decisions taken by the players remains necessary, which reduces the fluidity and spontaneity of interactions.

The team in charge of Cormas’ development wishes to investigate the new opportunities offered by the detection of objects and movements, by building a participatory simulation prototype in the field of hybrid simulation, combining a computer model and a physical game. This proof-of-concept work, applied to the ReHab model (ComMod’s teaching model), will require the development of reliable detection methods for physical objects, to be integrated into the calculation of the new simulated state.

Today, Cormas relies on Pharo (an open-source Smalltalk). This solution is developed by INRIA and a consortium of researchers and industrial partners. Its dynamic community offers a set of packages that pool the development of functionalities, which could be used to build a prototype for participatory simulation and object detection.

Collection and analysis of data from beekeepers’ connected hives

  • Keywords: time series, connected scales, honey production, beekeeping, database management
  • Duration: 6 months
  • Desired start date: 15/01/2021
  • Research Unit: Itsap, Acta
  • Contact: Fabrice Allier – allier@itsap.asso.fr
  • #DigitAg: Axis 2, Axis 5, Axis 6 – Challenge 1, Challenge 4, Challenge 5

Beekeeping has faced many challenges in recent decades, the causes and consequences of which have left beekeepers and the economy of the sector in a situation that is still uncertain. In response to the decline of bees, the drop in French honey production and strong environmental constraints, beekeepers are increasingly expressing the need for technical support, and for references and decision-support tools on which to base their strategies.

One proposal is to study the behavior of honey bee colonies, their dynamics over a given period and territory, and their performance in terms of honey production. To this end, beekeepers have been using electronic connected scales for the past fifteen years, measuring the weight of the hives in real time. With a few thousand scales now operated by beekeepers and reporting hourly information, this constitutes a mass of data that remains under-exploited to this day. In addition, the ability of a colony to exploit nectar resources is influenced by biotic factors (colony development status, floral resources) and abiotic factors (meteorology), which may be considered depending on the availability of these data. Cross-referencing these different data sources and applying data science methods offers great potential for building the prediction services of tomorrow.

A thesis led by ITSAP in collaboration with INRIA on this topic was accepted by DigitAg in 2018, but its implementation was thwarted by the onset of a financial crisis at ITSAP. Now that the financial stability of the Institute is secure and the subject remains just as important for the beekeeping sector, we wish to relaunch this scientific effort in 2021 through this internship. A new thesis proposal continuing this internship will also be submitted.

In addition, this internship will build on the momentum of the Occitanum Apiculture open lab, which also plans actions on these connected scale technologies to develop decision support tools for beekeepers.

The intern’s work will aim to:

– Approach beekeepers in order to obtain their consent to collect digital data from the scales (a minimum of one hundred) and complete the contextual data essential for interpreting the weight curves generated by the scales (i.e. geolocation, changes of location and their dates during hive transhumance, activity / honey targeted by the beekeeper). This task will be carried out directly by the intern, or the intern will support advisers from beekeeping development associations in charge of this data collection.

– Validate the quality of the data collected according to the specifications of the format required for the “MIELLEES Computer System” database, already operational and developed by INRAE Genphyse and ITSAP-Institut de l’Abeille in 2019-2020.

– Apply a “cleaning” routine to the numerical weight data, using an R program developed and tested in 2020 on a restricted experimental dataset (identification of the data type, validation of the time step, handling of missing data, validation of weight variations, correction of weights, and possibly segmentation).
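A few of the cleaning checks listed above (time step validation, missing data, weight variation) can be sketched as follows. This is an illustrative Python example, not the R program mentioned in the offer; the function name, the one-hour step, the 5 kg jump threshold and the sample records are all assumptions.

```python
from datetime import datetime, timedelta


def check_weights(records, step=timedelta(hours=1), max_jump_kg=5.0):
    """records: list of (timestamp, weight_kg or None), sorted by time.

    Returns (kept, issues): `kept` holds the non-missing records and
    `issues` lists (timestamp, label) pairs describing what was flagged.
    """
    kept, issues = [], []
    for ts, weight in records:
        if weight is None:
            issues.append((ts, "missing"))
            continue
        if kept:
            prev_ts, prev_weight = kept[-1]
            if ts - prev_ts != step:
                issues.append((ts, "irregular_step"))
            if abs(weight - prev_weight) > max_jump_kg:
                issues.append((ts, "jump"))
        kept.append((ts, weight))
    return kept, issues


t0 = datetime(2020, 6, 1, 8, 0)  # hypothetical start of a recording
records = [
    (t0, 40.0),
    (t0 + timedelta(hours=1), 40.2),
    (t0 + timedelta(hours=2), None),   # sensor dropout
    (t0 + timedelta(hours=3), 40.5),
    (t0 + timedelta(hours=4), 52.0),   # e.g. a super added by the beekeeper
]

kept, issues = check_weights(records)
print([label for _, label in issues])
```

Jumps are only flagged here, not removed, because a sudden change can be a real beekeeper intervention; deciding how to correct or segment the series is a separate step.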

– Identify subsets of consistent data from apidological criteria (honey production basin, period, target honey flow, density of scales identified per unit area, etc.).

– Model and test the temporal dynamics of hive weight (the variable of interest) to predict the colony’s ability to exploit the nectar resources available within a given foraging radius, in relation to climatic factors. Different machine learning methods will be tested.

– Provide beekeepers with simple, operational and educational outputs based on the results obtained from the consistent data subsets: visualization and mapping of production, average weight curves, positioning of a colony’s activity relative to other colonies, and identification of favorable foraging days according to the weather.

Phenotyping mixed crops by proxidetection sensors

  • Keywords: Phenotyping; mixed crops; deep learning
  • Duration: 6 months
  • Desired start date: 1/03/2021
  • Research Unit: Arvalis, Acta
  • Contact: Benoit De Solan – desolan@arvalis.fr
  • #DigitAg: Axis 3 – Challenge 1, Challenge 2

Mixed crops have many agronomic advantages: better use of resources, better tolerance to stress. However, to be beneficial, they require adequate cultivation, adapted to local pedo-climatic conditions.

To better help producers choose species and manage their crops, measurements are taken during the season to monitor the development of the main crop and the associated cover. These data are classically obtained from visual field observations and destructive measurements. However, these methods are complex, tedious and often poorly repeatable because the stands are spatially very heterogeneous. Automating them with sensors is therefore a major challenge for accelerating acquisitions and improving their repeatability.

In 2021, a field trial comparing different modalities of associated crops will be set up in Oraison (04). It will be monitored by classical methods (visual, destructive) and by a set of largely automated phenotyping techniques (drones, portable system). Support activity with the technical team will be necessary to ensure the smooth running of the season. Once acquired and structured, this dataset will be used for the development of methods for characterizing the mixed covers as well as for their validation, in connection with experts from UMT CAPTE. Several approaches will be implemented and evaluated: quantification of coverage fractions by deep learning (semantic segmentation), characterization of the crops’ structure by depth image analysis (stereovision). Finally, a comparison of the performance of the different methods and tools will be established.

Integrated within the INRAE – ARVALIS team in Avignon, the intern will oversee a complete data project, from data acquisition to analysis in the experiments set up. The main steps are as follows:

  • Participate in the field acquisition of sensor measurements and reference data (March to June) with sensors developed by CAPTE
  • Develop and compare different algorithms, in conjunction with the CAPTE team. The robustness of the methods is an essential point.
  • Evaluate the quality of the estimates by the system, according to the conditions of use and the methods
  • Write the internship report and prepare for the defense.

ARVALIS – Institut du végétal is a leading applied research organization in the analysis of sensor data for agriculture. Within the UMT CAPTE, in cooperation with INRAE of Avignon and the company HIPHEN, we develop tools and methods using sensors (lidars, cameras, spectrometers, etc.) integrated on different systems (robots, drones, wireless sensors, etc.).

DeepBeesAlert: towards a system of sustainable management and protection of pollination resources

  • Keywords: Asian hornet, invasive species, bee decline, cost of predation, digital agriculture, deep learning, insect tracking
  • Duration: 6 months
  • Desired start date: 1/02/2021
  • Research Unit: Amap, Cirad
  • Contact: Borianne Philippe – borianne@cirad.fr
  • #DigitAg: Axis 3, Axis 4, Axis 5 – Challenge 5, Challenge 6

The honey bee, Apis mellifera, an essential link in biodiversity and the main pollinator of many food crops in Europe, is experiencing an unprecedented decline. Pesticides, parasites and climate change are contributing to this decline, as is the Asian hornet, whose main prey is the honey bee. Accidentally introduced in France in 2004, this hornet is gradually invading Europe, ultimately threatening agriculture and our food security. While research into controlling this predator takes many forms and the resulting methods are often not very environmentally friendly, the hornet’s behavior in front of hives has never been studied in detail and its impact on bee populations has never been precisely quantified. Yet only the automatic processing of behavioral data acquired en masse on a national scale will make it possible to evaluate the economic cost of this predator, to identify the types of apiaries that are most affected and to warn of the moments or conditions favourable for intervention or trapping.

This project aims to develop a reliable and automatable methodology for counting predation events at the entrance of hives, in order to quantify the hornet’s impact on apiaries and analyze its evolution over time and the factors likely to influence it. This will involve studying the behavior of hornets and using video surveillance data from the hives to compute the ratio of bee captures to attacks over time, in order to highlight possible correlations between the predation rate and the seasons, the hours of the day and the weather conditions. From 2016 to 2020, hives coupled with weather stations were equipped with cameras of different resolutions filming continuously one day per week between July and November (about 450 hours of acquisition).
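As an example of how the catch/attack ratio might be tabulated once predation events have been counted, the sketch below groups events by hour of day. The event format (an hour and a flag telling whether the attack ended in a capture), the function name and the sample values are hypothetical; the project does not specify them.

```python
from collections import defaultdict


def capture_rate_by_hour(events):
    """events: iterable of (hour_of_day, captured) pairs, one per attack.

    Returns {hour: captures / attacks} for every hour with at least one attack.
    """
    attacks = defaultdict(int)
    captures = defaultdict(int)
    for hour, captured in events:
        attacks[hour] += 1
        if captured:
            captures[hour] += 1
    return {hour: captures[hour] / attacks[hour] for hour in attacks}


# Hypothetical events extracted from one day of video.
events = [(10, True), (10, False), (10, False), (15, True), (15, True)]

for hour, rate in sorted(capture_rate_by_hour(events).items()):
    print(hour, round(rate, 2))
```

The same grouping could be applied to season or weather variables to look for the correlations mentioned above.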

The internship consists of setting up and validating automatic processing solutions based on deep neural networks to detect and track objects in videos with variable acquisition frequency (from 25 to 240 fps), where the targets are fast-moving and erratic.

The student will carry out a literature review covering recent significant advances in video processing (road traffic, bird flight, etc.), then deploy and evaluate a convolutional neural network adapted to the specific detection of hornets and captures. Object tracking scripts will complete the solution so that the objects of interest are correctly counted in each video. Finally, the tracking trajectories of each object will be analyzed to address the behavioral study (visits, attacks and captures) of hornets. The student will evaluate the results produced by the selected solution and analyze the measured counting errors in order to improve the proposed system, making it possible to launch large-scale statistical studies aimed at establishing correlations between predation rates and environmental factors.

Design and development of a recommendation Web service of relevant semantic resources in agriculture

  • Keywords: Machine Learning, Linked Open Data (LOD), semantic Web, Web service, and agriculture
  • Duration: 6 months
  • Desired start date: 1/02/2021
  • Research Unit: LIRMM, Université de Montpellier
  • Contact: Emna Amdouni – amdouni@lirmm.fr
  • #DigitAg: Axis 5 – Challenge 8

The master internship is part of the D2KAB project (www.d2kab.org), a research project started in 2019, which aims to help developers and users of agricultural data transform their data into actionable knowledge. The D2KAB project develops and maintains AgroPortal, a public platform for semantic resources (thesauri, terminologies, vocabularies, and ontologies) specialized in the agriculture and agronomy domain. AgroPortal was founded and is maintained by the LIRMM laboratory in collaboration with INRAE. It hosts around 126 semantic resources and offers its users a variety of semantic web services (search, annotation, alignment, etc.).

Currently, we are implementing a new Web service that makes AgroPortal semantic resources easily interoperable and reusable. More concretely, we are adopting open science principles, mainly the FAIR principles (Findable, Accessible, Interoperable, and Reusable), through the use of automatic techniques. In this context, we are offering an internship that aims to design and implement a recommendation system for semantic resources (described in OWL, OBO, and SKOS) within our platform. We hope that the recommendation system developed by the future intern will support and enhance, on an international scale, AgroPortal’s FAIR vision.

The main objective of the master internship is to use artificial intelligence techniques for the realization of a recommendation Web service of semantic resources. More precisely, the main missions of the internship concern:

  1. The implementation of an identification method for similar semantic resources based on their description (e.g. authors, metrics, etc.) and content (e.g. key concepts, alignments, reused data, etc.).
  2. Recommendation of relevant semantic resources using machine-learning techniques aiming to improve the integration and reuse of existing AgroPortal’s semantic resources.
  3. The development of a human-machine interface for displaying recommendations.
  4. The deployment of the recommendation Web service on our portal and testing of the results.
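The first two missions can be illustrated with a very small content-based sketch: resources are compared through the overlap of their key concepts, and the most similar ones are recommended. Jaccard similarity is only one possible choice and is an assumption here, as are the resource names, the concept sets and the function names; none of them come from AgroPortal.

```python
def jaccard(a, b):
    """Jaccard similarity between two collections of concepts."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend(target, catalogue, top_n=2):
    """Rank the other resources by similarity to `target`, best first."""
    scores = [
        (name, jaccard(catalogue[target], concepts))
        for name, concepts in catalogue.items()
        if name != target
    ]
    return sorted(scores, key=lambda item: item[1], reverse=True)[:top_n]


# Hypothetical catalogue: resource name -> key concepts it covers.
catalogue = {
    "ONTO_A": {"wheat", "pea", "soil", "yield"},
    "ONTO_B": {"wheat", "pea", "irrigation"},
    "ONTO_C": {"cow", "milk", "heifer"},
}

print(recommend("ONTO_A", catalogue))
```

A machine-learning recommender would typically combine several such signals (description metadata, alignments, reuse) rather than rely on concept overlap alone.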

Completing the internship assignments will require motivation to learn semantic Web technologies, a good knowledge of existing artificial intelligence techniques, and technical background of at least one scripting language such as R or Python.

Profile of the desired candidate:

– Background in computer science, data science or equivalent

– Strong skills in Web development and in C or Java language

– Motivated, autonomous and rigorous

Deep Learning for detection, delineation, and differentiation of mango trees and mango-based orchards from very high spatial resolution images

  • Keywords: Remote sensing, orchards, very high spatial resolution, satellite time series, deep learning, neural networks, classification
  • Duration: 6 months
  • Desired start date: 1/03/2021
  • Research Unit: Tetis, Cirad
  • Contact: Camille Lelong – lelong@cirad.fr
  • #DigitAg: Axis 3, Axis 5 – Challenge 1, Challenge 5, Challenge 6, Challenge 8

In West Africa, information acquisition in the fruit sector is hampered by the lack of suitable methods and tools for characterizing fruit-tree-based systems, which are often complex (e.g. agroforestry systems). In this context, the PixFruit project (UPR HortSys) aims at acquiring data on mango production at both the tree and orchard scales, in order to calibrate regional production models that provide accurate and reliable analytical statistics to the actors of the sector. To extrapolate mango production to the scale of a region from data collected in the field with PixFruit tools (PixFruit smartapp), it is necessary to delimit and classify the trees and orchards in order to provide additional input data (cultivated area, planting density, cultivar composition, etc.) to regional models.

We thus propose to explore and evaluate the potential of deep learning, through neural network classification and segmentation methods, for producing this information from multispectral satellite imagery at very high spatial resolution (Pleiades).

Our first objective is to develop, on the basis of this innovative methodology, tools that make it possible to detect, identify and delineate orchards in a production region, and then to discriminate those containing mango trees. Our second objective is to identify and delineate the mango trees themselves, as individual trees, whether isolated or in orchards. It will therefore be necessary to obtain better results in segmentation (orchard delineation, individual tree extraction) and classification (orchard recognition according to its majority species, orchard classification according to structure and cultivar composition, tree-by-tree species identification) than with the tools more conventionally used in remote sensing (SVM, RF). In this work, we will evaluate the potential of the two most widely used types of neural networks, on several data architectures, to delimit and classify orchards: convolutional networks (CNN) on two Pleiades images acquired in March and July 2017, and recurrent networks (RNN) on the combination of these Pleiades images and a Sentinel-2 time series. Finally, we will analyze the performance of Mask R-CNN (Region-based Convolutional Neural Network) in correctly identifying and segmenting trees.

The study area will be the Niayes in Senegal (503 km²), relevant for the diversity of its cropping systems, which span different levels of complexity and density (fruit monocultures, extensive systems and agroforestry systems), and for the presence of many cultivated species (e.g. mango, citrus, cashew, neem, etc.). In addition, this area benefits from a large field database (11,300 mango trees and 12,211 orchards, resulting from Julien Sarron’s #DigitAg thesis) and from the agronomic expertise gained within the framework of the PixFruit project, which will make this study technically feasible.

Characterising architecture and production on an apple tree core-collection based on airborne Lidar scans

  • Keywords: Airborne LiDAR, deep learning, points analyses, apple core-collection, high-throughput phenotyping
  • Duration: 6 months
  • Desired start date: 1/02/2021
  • Research Unit: Agap, Inrae
  • Contact: Costes Evelyne – costes@inrae.fr
  • #DigitAg: Axis 3 – Challenge 2

In the current context of climate change and input reduction (fertilizers, water, pesticides, etc.), selecting new high-performing fruit tree cultivars is becoming essential. Architectural traits (3D tree structure and leaf distribution) have to be considered to evaluate the intrinsic genotypic potential for fruit production, interactions with the environment (light, rain, insects, etc.) and the ease of tree training. To evaluate such traits precisely and at high throughput, we are developing an approach based on airborne LiDAR scans, which allows rapid 3D data acquisition of a whole orchard. This choice was made after a first set of experiments in which trees were scanned during the vegetation period (with leaves and fruits) by terrestrial LiDAR. Although this acquisition procedure was tedious and time-consuming, it allowed us to deliver a first architectural characterisation of the trees of the collection (Coupel et al., 2019) and a first estimation of their production (Artzet et al., 2020). Part of the data analysis methods are based on machine and deep learning, especially networks for point cloud processing, such as RandLA-Net (Hu et al., 2019). Recently, we decided to revisit the data acquisition procedure and performed new, faster scans with airborne LiDAR. However, both resolution and points of view differ between terrestrial and airborne LiDAR, making it necessary to adapt the methods for the characterisation and identification of organs (stems, leaves and fruits). On whole-orchard 3D scans, identifying individual trees (each of which corresponds to a genotype) and the organs that belong to each of them is a real challenge and is key to automating our methodology.

This Master internship will aim at formalizing a point cloud analysis pipeline that will allow the identification of each tree instance and the characterisation of its shape, foliar density and production, by adapting and testing the estimation methods as well as the indicators associated with these new data. To do this, it will be necessary to revisit the various stages of the pipelines implemented previously, including the creation of a training database for which simulated data can be used (MappleT model for generating virtual 3D architectural models of apple trees (Costes et al., 2008)). Virtual scanning methods simulating drone acquisition will have to be implemented, as well as transfer learning methods to reproduce the noise observed in the real data. For production estimation, points corresponding to apples could be identified with methods such as RandLA-Net. Nevertheless, the subsequent clustering used to identify individual apple instances will have to be robust to low point sampling density.

Objective indicators on tree development and production that could be used for genetic analyses, such as GWAS, are expected as the outcome of this work.

Analysis of simplified architectural 3D representations for improving crop model simulations – application to grapevine

  • Keywords: architecture, image analyses, 3D models, crop model, photosynthesis, transpiration
  • Duration: 6 months
  • Desired start date: 1/03/2021
  • Research Unit: Lepse, Inrae
  • Contact: Pallas Benoît – pallas@inrae.fr
  • #DigitAg: Axis 6 – Challenge 2

Aerial architecture management is of major importance for optimizing crop functioning, since architecture directly determines the amount of intercepted light and the microclimate within the canopy, thus modifying many physiological plant functions (photosynthesis, transpiration, water use efficiency). The challenge of optimizing canopy structure is critical for grapevine, as shown by the many training practices used to modify it (winter and summer pruning, trellising…). Functional structural plant models (FSPM) have been developed with the aim of simulating the spatial variability of ecophysiological processes on these 3D architectures. Nevertheless, their use is made difficult by the large amount of data needed to build an exhaustive description of the input architectures. Meanwhile, new high-throughput phenotyping methods (lidar, airborne imaging) are developing rapidly and can be used to obtain new indicators of the architectural characteristics of the canopy. Moreover, crop models can simulate crop yield under different environments, but they do not integrate detailed descriptions of plant architecture or of the resulting within-plant variability in functional traits. The objective of this project on grapevine is to determine to what extent simplified representations of aerial architectures could be coupled to crop models in order to better simulate variables that present large within-plant spatial heterogeneity (light interception, photosynthesis, transpiration). The final aim is to suggest and test new high-throughput phenotyping protocols for the architectural features needed to reconstruct these simplified 3D mockups. The research activities will be carried out at UMR LEPSE in Montpellier using existing plant models available under the open-source OpenAlea platform (TopVine, Louarn et al. 2008; RATP, Sinoquet et al. 2001; Hydroshoot, Albasha et al. 2019).

The first step will be to simulate, from existing 3D data, simplified representations similar to what could be acquired under field conditions (lidar, airborne imaging). Simulations of ecophysiological processes will be performed on these simplified 3D structures using different levels and methods for aggregating foliage characteristics (entire plant, voxels…). Results will be compared to simulations performed on detailed 3D representations (considered as references) in order to determine the best trade-off between (1) global error at the plant scale, (2) ability of the method to account for spatial variability, (3) experimental time needed to obtain the input data, and (4) computation time. This simplified architecture representation and the associated computational methods could then be used as inputs to crop models in order to test the impact of contrasting training systems on integrative plant variables.
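One way to picture the voxel-level aggregation mentioned above is the following minimal sketch, which bins individual leaves into cubic voxels and derives a leaf area density per voxel. The leaf coordinates, the voxel size and the helper names (`voxelize_leaf_area`, `leaf_area_density`) are assumptions made for illustration; they do not come from TopVine, RATP or Hydroshoot.

```python
from collections import defaultdict


def voxelize_leaf_area(leaves, voxel_size):
    """Sum leaf area per voxel.

    `leaves` is an iterable of (x, y, z, area) tuples, coordinates in metres
    and area in square metres (hypothetical input format).
    """
    grid = defaultdict(float)
    for x, y, z, area in leaves:
        key = (int(x // voxel_size), int(y // voxel_size), int(z // voxel_size))
        grid[key] += area
    return dict(grid)


def leaf_area_density(grid, voxel_size):
    """Convert summed leaf area per voxel into leaf area density (m2 per m3)."""
    volume = voxel_size ** 3
    return {key: area / volume for key, area in grid.items()}


# Three illustrative leaves; the first two fall into the same 0.5 m voxel.
leaves = [(0.1, 0.1, 0.1, 0.02), (0.2, 0.3, 0.4, 0.03), (0.6, 0.1, 0.1, 0.01)]
grid = voxelize_leaf_area(leaves, voxel_size=0.5)
density = leaf_area_density(grid, voxel_size=0.5)
```

Aggregation of this kind conserves total leaf area at the plant scale while discarding leaf orientation and exact position, which is one reason the comparison against detailed 3D references is needed.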

Characterizing grazed natural environments by remote sensing to assess the sustainability of pastoral livestock farming in the PACA region

  • Keywords: Pastoral areas, Land cover, Remote sensing
  • Duration: 6 months
  • Desired start date: 1/04/2021
  • Research Unit: Selmet, Inrae
  • Contact: Shaqura Imad – shaqura@inrae.fr
  • #DigitAg: Axis 2, Axis 3 – Challenge 1, Challenge 4, Challenge 6

Interactions between pastoral livestock farming and “natural environments” are an important component of the resilience of socio-ecological systems in the hinterland of the Provence-Alpes-Côte d’Azur region. These spaces occupy more than 30% of the land area, and livestock farming contributes to their ecological and social dynamics. This activity is strongly intertwined with the other uses of these spaces, which jointly define the identity of the territories. However, unlike other agricultural practices (ploughing, mowing), the spatial footprint of grazing and its relationship to land use dynamics are difficult to determine with precision.

Yet the intersection between land use and grazing is a central concern in terms of the capacity to adapt to and mitigate climate change, the control of ecological dynamics with regard to biodiversity preservation, and the prevention of forest fires. The sustainability of livestock farming itself is directly called into question by the “closure” of pastoral environments, which grazing practices are struggling to curb. The medium-term renewal of the fodder resource for grazing herds is threatened, as is the continued eligibility of these areas for support under the first pillar of the CAP. The characterization of the land use of these environments and of their ecological dynamics is thus at the heart of the analysis of the sustainability of the region’s pastoral livestock activity.

The available land-use maps (such as CORINE Land Cover) do not allow these data to be cross-referenced satisfactorily. We have therefore undertaken work on:

– On the one hand, a detailed characterization of the uses of pastoral areas based on the data from the pastoral survey carried out by the actors of livestock farming on the scale of the Alps and on farmers’ declarations to the CAP’s graphical parcel register.

– On the other hand, the characterization of land use in these areas using a supervised classification (Bayes) into 5 environmental closure classes, based on SPOT 6 images and multi-resolution segmentation (Baatz and Schäpe).

The proposed internship aims at adapting this land use characterization methodology to the whole region. This work will produce an easily updatable map that identifies particularly exposed areas and supports the decisions of livestock industry actors within the framework of public actions aimed at securing the future of the activity.

Development of an optical sensor for the monitoring of leaf biochemical content: contribution to decision making for the monitoring of the physiological state & early detection of plant pathologies

  • Keywords: Foliar chemistry; Vegetation monitoring; Leaf spectroscopy; optical sensor; Physical modeling
  • Duration: 6 months
  • Desired start date: 1/03/2021
  • Research Unit: Tetis, Inrae
  • Contact: Feret Jean-Baptiste – feret@teledetection.fr
  • #DigitAg: Axis 3 – Challenge 3, Challenge 7

Climate change has a direct influence on crops, due to its impact on temperatures and water cycle. Monitoring of foliar chemical composition is relevant to study plant physiology, phenology and to identify pathogens in the context of climate change.

Recent advances in both physical modeling of vegetation at the leaf scale and optical instrumentation now make it possible to accurately measure leaf chemical content using portable, low-cost sensors. The PROSPECT model allows precise estimation of the main leaf chemical constituents, including chlorophyll, carotenoids, anthocyanins, water and dry matter content. Monitoring this biochemical signature then makes it possible to diagnose the state of crops and to detect stress and the possible presence of certain pathogens at an early stage. Mapping leaf chlorophyll content in the field using image analysis of remote sensing data (drone, airplane, satellite) has been used operationally for several years to support decision-making tools that help farmers plan their crop management sequence and forecast crop production.

Recent research results show that a fine quantification of foliar chemistry can be achieved using modeling tools and a parsimonious set of foliar optical properties acquired in the optical domain. This opens perspectives for the development of a portable leaf-clip sensor using multispectral light emission based on laser diodes and multi-carrier synchronous detection (technology patented by INRAE).

The objective of this internship is to participate in the development of a prototype to quantify foliar chemistry. During the first stage of the internship, the student will have to understand the different optical phenomena involved in the interactions between light and leaf material; he/she will also learn to use the modeling tools. He/she will then determine the combination of wavelengths that allows optimal estimation of foliar chemistry, combining simulated and already available experimental databases. Finally, based on these results, he/she will work on the development of a laboratory prototype. This system will be characterized and validated with experimental data from different crop types. This work will add value to the research results of the TETIS and ITAP joint research units, and may lead to a patent application.

Visualisation and Navigation in agro-environmental spatio-temporal data classified by relational concept analysis

  • Keywords: Data mining, Relational Concept Analysis, Agroecology, crop protection, knowledge base
  • Duration: 6 months
  • Desired start date: 15/02/2021
  • Research Unit: LIRMM, Université de Montpellier
  • Contact: Huchard Marianne – huchard@lirmm.fr
  • #DigitAg: Axis 1, Axis 5 – Challenge 1, Challenge 3, Challenge 8

With the rise of digital technology, agricultural research has produced numerous datasets on agriculture and the environment that can be mobilized to develop decision-making tools for populations from the North and the South. Among these datasets are a spatio-temporal one on the watercourses of two French watersheds, developed by the Fresqueau project (http://dataqual.engees.unistra.fr/fresqueau_presentation_gb), and one on the uses of plants with pesticidal and antibiotic effects for animal, plant, human and public health, developed by the Knomana project (https://agris.fao.org/agris-search/search.do?recordID=FR2019109314), whose data model has a ternary structure.

To develop their decision support tools, these projects use Relational Concept Analysis (RCA) as the classification method to model temporality and ternary relationships. Using logical quantifiers, RCA groups and classifies sets of entities sharing common properties and relationships, supporting for example reasoning by exploring properties and similarities, reasoning by abduction to create hypotheses, and the search for alternative solutions in the neighborhood of known solutions. To avoid computing the complete classification when navigating and exploring the dataset step by step, an on-demand computation method has been developed. The problem faced by the team carrying out these projects, i.e. LIRMM, UPR AIDA, UMR IPME and ENGEES, is the lack of a tool for visualizing and navigating through the data classified by RCA.
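To make the idea of grouping entities by shared properties more concrete, here is a minimal sketch of the closure step from Formal Concept Analysis (FCA), the single-table setting that RCA extends with relations between tables. It does not implement RCA itself, its logical quantifiers or the on-demand method; the context, the entity names and the helper functions are hypothetical.

```python
def common_attributes(objects, context):
    """Attributes shared by every object in `objects`."""
    sets = [context[o] for o in objects]
    if not sets:
        # By convention, the empty set of objects shares all attributes.
        return set().union(*context.values())
    return set.intersection(*sets)


def objects_having(attributes, context):
    """Objects that carry every attribute in `attributes`."""
    return {o for o, attrs in context.items() if attributes <= attrs}


def concept_of(objects, context):
    """Smallest formal concept (extent, intent) containing `objects`."""
    intent = common_attributes(objects, context)
    extent = objects_having(intent, context)
    return extent, intent


# Hypothetical binary context: entity -> set of properties (illustrative only).
context = {
    "plant_a": {"insecticidal", "tree"},
    "plant_b": {"insecticidal", "antibiotic"},
    "plant_c": {"antibiotic"},
}

extent, intent = concept_of({"plant_c"}, context)
```

Starting from `plant_c` alone, the closure also pulls in `plant_b`, because both share the `antibiotic` property. Computing one concept at a time in this way is the kind of step an on-demand exploration could trigger from a user interaction, rather than building the whole classification up front.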

Furthermore, the LIRMM conducts research in visual analytics (Keim et al. 2008). This field focuses on the study of interactive visual interfaces that enable the exploration of complex and heterogeneous datasets in order to facilitate analytical reasoning on the data and thus derive knowledge from them (see for example (Accorsi et al. 2014), developed within the Fresqueau project).

The objective of this internship is to develop a software prototype for the visualization of datasets, including spatial and/or temporal data, classified by RCA. More precisely, the trainee will build an interactive visualization that lets the user drive the on-demand RCA computations and displays the results incrementally. Several visual approaches will be combined in order to give the user an overview of the extracted knowledge space and, at the user's request, a detailed view of subsets of the classification computed on the fly. Different interaction methods (Munzner 2014, chapters 11-14) and different graph visualization techniques (Tamassia 2013) will be used. The trainee will follow the design steps described by Sedlmair et al. 2012, i.e. literature review, definition of the need expressed as a visual problem, proposal of a model, development, deployment, and validation.

Distributed plant simulations: Application to agroecology

  • Keywords: Functional Structural Plant Model, Distributed computation, clusters, Stands, Agroforestry.
  • Duration: 6 months
  • Desired start date: 1/03/2021
  • Research Unit: Agap, Cirad
  • Contact: Frédéric Boudon – boudon@cirad.fr
  • #DigitAg: Axis 6 – Challenge 1, Challenge 8

To meet societal demands for a more sustainable and ecological agriculture, functional-structural plant models (FSPM), which simulate plant growth and functioning, are being developed by the scientific community. Within the framework of the OpenAlea modeling platform, we have been developing different simulation formalisms for several years (Pradal et al., 2008; Boudon et al., 2012). In particular, formal grammars such as L-systems, which allow the efficient rewriting of trees, and methods for rewriting multi-scale graphs (MTG) are available and have been used to model a wide variety of plants (apple, mango, palm, maize, sorghum, etc.).

FSPM models allow the study and analysis of plant-plant interactions in complex mixed canopies (Gaudio et al., 2019; Braghiere et al., 2020). They make it possible to simulate aerial and root phenotypic plasticity by taking into account the competition for light and resource acquisition in a mechanistic way. This requires, however, simulating at the organ scale and in 3D the development and functioning of a large number of interacting plants within the same canopy. To do this in a reasonable time, the simulation computations would need to be distributed over large computing infrastructures (cluster, cloud). Currently, however, there is neither a formalism nor a technology to automatically distribute the 3D simulation of interacting heterogeneous plants.

The challenge we are trying to address is therefore to efficiently simulate a set of plants interacting in space (competition for resource acquisition) and in time (feedback between structure and function). The objective of this project is to analyze different parallelization strategies for simulating in 3D the growth and functioning of plants and stands on shared-memory architectures and in distributed environments (Pradal et al., 2017; Heidsieck et al., 2020). One of the challenges is to define design patterns for distributed computations at different granularities (parallel simulation of an isolated plant, distributed computation of a large number of interacting plants) using current technologies (OpenMP, Spark, Dask). An important issue is to take into account the dependencies between the computations made on the structures when they are rewritten, according to the strategy used (in place or by copy).

One application of this work will be the simulation of an agroforestry system mixing palm trees and rice for which pre-existing models (VPalm and Cereals projects) will be reused.

The student’s work will consist of:

– Definition of a protocol for spatial information exchange and synchronization between simulators.

– Formalization of a strategy for distributing simulations on several machines or clusters.

– Application to the creation of a Palm-Rice agroforestry system model with characterization of the dynamics of light distribution during a growth cycle.
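The synchronization issue raised above can be sketched with a toy example: at each time step, every plant is updated in parallel from the same read-only snapshot of the canopy (a "by copy" strategy), and the results are gathered before the next step begins. This is only an illustration under strong assumptions: the growth and shading rule is invented, the `grow` and `simulate` names are hypothetical, and a local thread pool stands in for a cluster or for tools such as Dask or Spark.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import partial


def grow(index, heights, rate=0.1):
    """Return the new height of plant `index` from a read-only snapshot."""
    tallest_neighbour = max(
        (h for i, h in enumerate(heights) if i != index), default=0.0
    )
    # Invented light rule: a plant shaded by a taller neighbour grows half as fast.
    light = 0.5 if tallest_neighbour > heights[index] else 1.0
    return heights[index] + rate * light


def simulate(heights, steps, workers=4):
    """Advance all plants for `steps` time steps, synchronizing after each one."""
    state = tuple(heights)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for _ in range(steps):
            # All tasks read the same immutable snapshot, so the order in which
            # plants are processed cannot change the result.
            step = partial(grow, heights=state)
            # Consuming the iterator acts as the barrier between time steps.
            state = tuple(pool.map(step, range(len(state))))
    return list(state)


if __name__ == "__main__":
    print(simulate([1.0, 2.0, 1.5], steps=3))
```

An in-place strategy would avoid copying the state but would make each update depend on which neighbours had already been rewritten, which is the dependency problem the project sets out to formalize.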