[PhD student] Oussama Mecchour: Integration and normalization of experimental databases in agroecology: text mining approaches guided by semantic information

Thesis topic cofunded by #DigitAg

I am doing my thesis at UMR Aida, Tetis and Mistea on the integration and standardization of experimental databases in the field of agroecology. I will spend half of my thesis in La Réunion and the other half in Montpellier.
I specialize in data science and enjoy problem solving in research.
This thesis is an extension of my internship at CIRAD. During those four months, I had the opportunity to immerse myself in the field of agroecology as a data scientist. I gained an in-depth understanding of the field, and I am convinced that our work will make a significant contribution to solving the problem of data interoperability in agroecology.
The internship yielded very encouraging findings, showing that combining approaches and integrating AI methods (language models) improves results.

  • Starting date: 12 October 2023
  • University: University of Montpellier
  • PhD school: I2S
  • Scientific field: IT
  • Thesis director: Krishna Naudin, UPR Aida, Cirad
  • Thesis supervisors: Sandrine Auzoux, UPR Aida, Cirad - Mathieu Roche, UMR Tetis, Cirad
  • Funding: #DigitAg – InterCropValues (Horizon)
  • #DigitAg: Cofunded thesis – Axis 5: Data mining, data analysis, knowledge extraction; Axis 2: Innovations in digital agriculture; Axis 3: Sensors, data acquisition and management; Axis 4: Information systems, data storage and transfer; Axis 6: Modelling and simulation (agricultural production systems); Challenge 1: The agroecological challenge; Challenge 3: Crop protection; Challenge 8: Agricultural development in the South

Keywords: Agroecology, Data science, Knowledge extraction

Abstract: One of the challenges in improving modelling methods in agroecology, and their reproducibility, is to link and standardize expert variables and model variables by integrating the semantic information they carry. In this context, the textual data associated with the variables (scientific publications, technical reports, blogs, etc.) and semantic resources such as ontologies can provide crucial semantic information to contextualize the variables. The objective of this PhD is to propose a generic approach that automates the labeling of expert-created variables and their linkage with model variables, using text mining methods driven by semantic information.

Modification date: 04 April 2024 | Publication date: 27 November 2023 | By: GL