Crop2ML & CyML: A crop modelling metalanguage and translator shared between different simulation platforms for the international community

In a context of climate change and food security, accelerating the sharing and reuse of research tools and findings on cultivated plants is a key challenge. The international community of crop modellers, which uses several programming languages and simulation platforms, had an obstacle to overcome: how to take a crop growth model component written in a given language for a given crop simulation platform and use it on another platform without having to rewrite it in another language. Following the definition of Crop2ML, a metalanguage common to the whole community, a translator, CyML, now meets this requirement: this programming language avoids the need for multiple conversions and facilitates sharing. The CyML models produced from source components are translated into different languages and used on target platforms thanks to transpilers. The web application, which will be available in 2020, will enable non-programmers to understand shared models and to reuse them. This translator was developed by Cyrille Midingoyi, a #DigitAg-INRAE doctoral student in the LEPSE joint research unit in Montpellier. Pierre Martre, his supervisor, explains the Crop2ML + CyML language/transpiler combination and its advantages. The team’s findings were presented during the international symposium iCROPM2020 supported by #DigitAg, which was held in Montpellier in February.

Process-based crop simulation models are widely used tools for analysing and predicting the response of agricultural production systems to climatic, agronomic and genetic factors. These models are often built on crop simulation platforms, which support their development and couple different crop models with a soil model and a crop management event scheduler. A crop simulation model thus translates the dynamics of crop development under different abiotic environmental conditions (temperature, water and solar radiation), for different soils and climates. However, these models are written in different programming languages, on different platforms that use different formal systems. Even when the platforms propose modular approaches and techniques for reuse, there is little exchange of model components between platforms, despite the advantages.

The Agricultural Model Exchange Initiative (AMEI) and its metalanguage Crop2ML

The international Agricultural Model Exchange Initiative (AMEI) arose out of the need to facilitate comparisons between crop models and the exchange of model components, as well as the desire to establish a tool to be shared by the whole community. In 2015, the heads of the leading platforms in the field met, since the involvement of all was essential to the success of the project. The AMEI initiative resulting from their discussions includes French teams from INRAE and CIRAD, members of #DigitAg (UMR LEPSE, MIAT and AGAP), associated with researchers from Germany (University of Bonn), Italy (CREA Bologna), the USA (University of Florida), Australia (CSIRO), the European Commission’s JRC (Ispra, Italy), and the Netherlands (Wageningen University & Research – WUR), who are more specialised in data and ontologies.

Models, whether generic or crop-specific, are increasingly developed in the form of components, which model leaf area, photosynthesis, biomass production, root growth, etc. Until now, there was no system to exchange model components between modelling platforms. This was a problem, as users needed to know on which platform a component had been developed in order to extract it and re-encode it for reuse. This open source tool can be used either to write a model from scratch or to edit, visualise and modify existing models, and to export a model encoded in Crop2ML (a combination of CyML and XML) to a given platform, explains Pierre Martre.

The crop modelling communities, whether European (CropM) or international (AgMIP), have conducted many inter-comparisons of crop models. These gave them a review of the models' strengths and weaknesses, but they noted that it was difficult to improve them. The models are often poorly documented, and when modellers wish to test whether a formalism from one model, such as a crop simulation hypothesis, works in another model, this is difficult and requires re-encoding.

We arrived at the idea of sharing a single formal model from which all imports and exports between models can be translated. Such a system interconnects models to make them intelligible for analyses, but also documents them in a standardised manner, so that we know the assumptions made by the models, can identify how processes are modelled within them, can inform other models, and can publish our models via web repositories where they can be searched for and found.

It was in this context that the metalanguage Crop2ML was developed. Financing the thesis by Cyrille Midingoyi accelerated the development of the CyML translator and motivated the crop modelling community.

The description and assembly of crop model components are independent of the formal model of simulation platforms and the exchange of components between simulation platforms is facilitated. This metalanguage proposes a centralised framework for the exchange and reuse of models. Their reuse is taken into account from the design stage (meta-information, explicit specification of model inputs/outputs) and not during their implementation.

So far, we have defined a format to express a model on all modelling platforms, with all the information needed to then translate it using CyML according to the modelling platform considered and its specific programming language.

We define the model input variables (for example, temperature, atmospheric CO2 concentration, date and quantity of inputs used, etc.), then all algorithms and parameters associated with the equations, and we conduct tests to validate the model. If we take the example of high-throughput phenotyping of plants, it is often necessary to rewrite the models according to the input/output variables measured in the field, and this can be very time-consuming, so the gain is considerable.

From XML and Cython to the CyML language and its translator

The model metadata are described in a declarative language (XML), and we decided to express all equations in Cython (Python with C variable types). From there, we defined a language, CyML, a subset of Cython. Once the language had been fully developed, Cyrille developed a translator (transpiler) to automatically translate CyML into all the programming languages used on the different platforms. In practice, a model is represented in XML and its equations in CyML, and the CyML is translated into the programming languages of the simulation platforms (C++, R, Python, etc.).
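To make the split between declarative metadata and typed equations more concrete, here is a minimal sketch in plain Python. It is an illustration only: the XML element and attribute names, the `thermal_time` equation and the helper functions are hypothetical and do not reproduce the actual Crop2ML schema or the CyML syntax. Python type hints stand in for the C-style variable types that Cython provides.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified metadata for one unit model.
# This is NOT the real Crop2ML schema, only an illustration of the idea.
MODEL_XML = """
<ModelUnit name="thermal_time">
  <Inputs>
    <Input name="t_mean" unit="degC" default="15.0"/>
    <Input name="t_base" unit="degC" default="0.0"/>
  </Inputs>
  <Outputs>
    <Output name="daily_tt" unit="degC d"/>
  </Outputs>
  <Test t_mean="12.5" t_base="2.5" expected="10.0"/>
</ModelUnit>
""".strip()


def thermal_time(t_mean: float, t_base: float) -> float:
    """Illustrative equation: daily thermal time above a base temperature."""
    return max(0.0, t_mean - t_base)


def read_inputs(xml_text: str) -> dict:
    """Collect the declared inputs and their default values from the metadata."""
    root = ET.fromstring(xml_text)
    return {node.get("name"): float(node.get("default")) for node in root.iter("Input")}


def run_declared_test(xml_text: str) -> bool:
    """Run the test declared in the metadata against the equation."""
    test = ET.fromstring(xml_text).find("Test")
    result = thermal_time(float(test.get("t_mean")), float(test.get("t_base")))
    return abs(result - float(test.get("expected"))) < 1e-9
```

Keeping inputs, units and test values outside the function body is what would let a translator regenerate the same function, with the same interface and the same tests, in each target language.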

The structuring is complete and we have translators that work automatically on several platforms. We are now consolidating the system and including new models. After OpenAlea, Cyrille began the inclusion of the RECORD platform and the Stics model. We are also working on the EU JRC's BioMA, the German SIMPLACE, the Australian APSIM and the American DSSAT platforms.

A web application to open up models and to aid comprehension by non-programmers

The next stage for Crop2ML/CyML is the ongoing development of its web application by Vincent Armant (#DigitAg computer support). This will enable non-programmers to understand the content of shared models and to reuse them. A demo will be available in 2020.

A prototype interface and a software programme (PyCrop2ML) have already been developed by Cyrille with the help of a student (internship also financed by #DigitAg). The tool will now be given a web interface to facilitate its use by biologists, without any special programming skills required to use it. This will open up these relatively complex models, making them more accessible, and will enable everyone to understand how they work. It is important to understand the outputs of models, and to avoid using them as “black boxes”.

Next, the mode of integration of a model component found in a crop model will depend on the platforms. For some it is semi-automated, while for others it may be totally automated, and there will no longer be any need for programming. This will also depend on the investment of each platform in developing the necessary interfaces.

Beyond the research community, the tool is of interest to service companies. They could find in model libraries those missing from their own applications. This would accelerate development and transfer.

Using ontologies

What’s next? Building on ontologies, and linking to data in order to better document them. “So far, we have focused on the functioning of the system, but it provides for the use of ontologies, for both units and variables, but also to ensure consistency between models”, says Pierre Martre.

If we take the case of phenotyping, when we estimate leaf area, for example, based on Sentinel 2 or drone imagery, or destructive measurements of plants, we do not use the same method to measure the variable; and we do not therefore obtain exactly the same measurement result because we are not measuring exactly the same thing, and it is important to know exactly what we are measuring. Another case is plant height: do we obtain the plant measurement by stretching it or not, for a cereal? Do we measure from the base of the ear or from the top of the plant cover, etc.? Ontologies will provide answers to these questions and create linkages between variables.

Eventually, there should also be strong linkages with another thesis supervised by Llorenç Cabrera-Bosquet (INRAE LEPSE) and Pascal Neveu (INRAE MISTEA) to develop ontologies based on equations and models (Luis Felipe Vargas Rojas: “Combining Semantic Web and modelling approaches for organizing phenotyping data collected at different scales of plant organization”). With Llorenç, we are also working on aspects of data-model linkages. He has developed ontologies with CGIAR in Montpellier and a team from the University of Florida.

We are also again calling on any modellers wishing to contribute to the AMEI and to the Crop2ML framework: let’s work together to develop the next generation of crop modelling methods and tools!

IN DETAIL

Crop2ML is based on a declarative architecture to represent modular models with an intermediate modelling language to describe biophysical processes and their transformation in existing platforms.

Figure: Crop2ML. (a) From a combinatorial model-to-model transformation (top) to a centralized framework of model exchange and reuse (bottom). (b) Crop2ML meta-specifications for model unit (left) and composite (right) design. +, one or more elements; *, zero or more elements; ?, zero or one element. (Cyrille Midingoyi, 2020)

In Crop2ML, the unit models and their composition in executable components are represented by clearly defined concepts, which separate the description of a model and of its algorithms. These structured elements and their attributes are the grammar of the declarative language and are used to determine whether a model is consistent with Crop2ML. The model components are represented by a graph of models to manage complexity and to link the outputs of one model to the inputs of another. These concepts and the associated attributes were developed in collaboration, with a view to exchanging and reusing models. Given that the model algorithms are encoded on different platforms using different programming languages, a meta-level of abstraction of these languages is needed in order to enable a fully automated translation into the target languages and to provide minimal meta-specifications to encode the algorithms. To do so, we designed and implemented CyML, a language system integrated in Cython (Midingoyi et al., 2020). CyML is based on minimal and sufficient rules to express model algorithms close to mathematical expressions and without taking account of artefacts of simulation platforms.
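The idea of a graph of models, in which the outputs of one unit model feed the inputs of another, can be sketched as follows. This is a hedged illustration in plain Python, not the Crop2ML composite specification: the two unit models, their equations, the `UNITS` and `LINKS` tables and the `run_composite` helper are all hypothetical. The execution order is derived from the links with the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter


# Two hypothetical unit models; each returns its outputs as a dict.
def thermal_time(t_mean: float, t_base: float) -> dict:
    return {"daily_tt": max(0.0, t_mean - t_base)}


def leaf_growth(daily_tt: float, rate: float) -> dict:
    return {"delta_lai": rate * daily_tt}


# Each unit model is declared with the names of the inputs it needs.
UNITS = {
    "thermal_time": (thermal_time, ["t_mean", "t_base"]),
    "leaf_growth": (leaf_growth, ["daily_tt", "rate"]),
}

# Links of the composite: (source unit, shared variable, target unit).
LINKS = [("thermal_time", "daily_tt", "leaf_growth")]


def run_composite(external_inputs: dict) -> dict:
    """Run the unit models in dependency order, passing outputs along the links."""
    predecessors = {name: set() for name in UNITS}
    for source, _variable, target in LINKS:
        predecessors[target].add(source)

    values = dict(external_inputs)
    for name in TopologicalSorter(predecessors).static_order():
        func, input_names = UNITS[name]
        values.update(func(**{key: values[key] for key in input_names}))
    return values
```

For example, `run_composite({"t_mean": 12.0, "t_base": 2.0, "rate": 0.5})` runs `thermal_time` first and then passes its `daily_tt` output to `leaf_growth`.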

The CyML translator was tested with two components of biophysical processes from growth models on the BioMA platform for their reuse on the OpenAlea, Stics and RECORD platforms, etc.

A biophysical crop simulation model translates the dynamics of plant development under different abiotic environmental conditions (temperature, water and solar radiation), and for different soils and climates. CyML was tested with two autonomous components of the BioMA (Biophysical Model Applications) tool: a crop energy balance and a wheat phenology model. Encoded in CyML, these two components were converted into target languages (Python, Java, R, C++, C#, Fortran), to make them reproducible whatever their implementation platform. They were also exported to several simulation platforms (OpenAlea, RECORD, SIMPLACE, DSSAT, etc.) in order to reuse them through the transpiler workflows.

Based on the Cython language, CyML encodes dynamic models (in discrete time) to implement them in different languages, whether procedural (Fortran), object-oriented (Java, C#, C++), functional or scripting (R, Python). A model component may be represented by a set of functions (unit models) with inputs/outputs. Each function implemented in CyML can be automatically translated into different languages using a transpiler. The web application developed to transpile a CyML model into a target language can be used to interactively test the model produced. The user can edit and execute the model obtained, modify its input values and visualise the results generated using interactive web widgets.
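A discrete-time dynamic model of the kind described here can be pictured as a function that maps the state at one time step, plus that step's inputs, to the state at the next step. The sketch below is written in plain Python rather than CyML, and the `State` class, the `step` function, the flowering threshold and the daily temperatures are hypothetical values chosen for illustration; it does not reproduce the BioMA phenology component.

```python
from dataclasses import dataclass


@dataclass
class State:
    """State carried from one daily time step to the next."""
    cumulative_tt: float = 0.0
    stage: str = "vegetative"


def step(state: State, t_mean: float, t_base: float = 0.0,
         tt_flowering: float = 100.0) -> State:
    """One daily update: accumulate thermal time, then update the stage."""
    cumulative_tt = state.cumulative_tt + max(0.0, t_mean - t_base)
    stage = "flowering" if cumulative_tt >= tt_flowering else "vegetative"
    return State(cumulative_tt, stage)


def simulate(daily_temperatures, **parameters):
    """Apply the step function once per day and keep the state history."""
    state = State()
    history = []
    for t_mean in daily_temperatures:
        state = step(state, t_mean, **parameters)
        history.append(state)
    return history
```

Because `step` only reads its arguments and returns a new state, it has no dependency on a particular platform's time loop, which is the property that makes such a function a plausible candidate for automatic translation.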

Learn more

Contacts: pierre.martre [AT] inrae.fr – cyrille.midingoyi [AT] inrae.fr

References

AMEI: Agricultural Model Exchange Initiative

  • GitHub Crop2ML / PyCrop2ML : https://github.com/AgriculturalModelExchangeInitiative
  • Martre Pierre, Donatelli Marcello, Pradal Christophe, Enders Andreas, Ahmed Midingoyi Cyrille, Athanasiadis Ioannis, Fumagalli Davide, Holzworth Dean, Hoogenboom Gerrit, Porter Cheryl, Raynal Hélène, Rizzoli Andrea E., Thorburn P. 2018. The Agricultural Model Exchange Initiative. In: Abstracts of the 7th AgMIP Global Workshop. San José: IICA, abstract, pp. 17-18. 7th AgMIP Global Workshop, San José, Costa Rica, 24-26 April 2018.