The Challenge
Predicting Nano-Enabled Agriculture Impacts Computationally
Nanoparticles (NPs) are increasingly used in agriculture as fertilisers, biostimulants, and pesticides to support sustainable food production. However, the interactions among NP properties, soil systems, and plant species are highly complex, and conventional assessment of NP–plant interactions requires long, resource-intensive experiments.
Existing datasets suffer from heterogeneous formats, missing metadata, non-systematic data coding, and a lack of direct links to original publications, which makes it difficult to build generalisable machine learning (ML) models for nano-agriculture applications.
Our Approach
From fragmented literature data to a validated, cloud-deployed prediction model
Curate and quality-control literature data
Performed extensive data curation on the publicly available NP–plant interactions dataset, cross-checking original publications, supplementing missing metadata such as NP core composition and crystal phase information, and standardising attribute encoding.
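As an illustration of what standardising attribute encoding can involve, the following minimal sketch maps free-text NP core labels to one canonical form and keeps missing values explicit. The synonym table, the function name standardise_core, and the sample records are hypothetical and are not taken from the curated dataset.

```python
# Hypothetical synonym table: these labels are illustrative, not from the real dataset.
CANONICAL_CORE = {
    "zno": "ZnO",
    "zinc oxide": "ZnO",
    "tio2": "TiO2",
    "titanium dioxide": "TiO2",
    "titania": "TiO2",
}


def standardise_core(raw):
    """Return the canonical core label, or None when the value is missing or unknown."""
    if raw is None:
        return None
    # Trim, lower-case, and collapse repeated whitespace before the lookup.
    key = " ".join(raw.strip().lower().split())
    return CANONICAL_CORE.get(key)


# Toy records showing inconsistent spelling and a missing entry.
records = [
    {"core": " Zinc  Oxide "},
    {"core": "TiO2"},
    {"core": None},
]
cleaned = [standardise_core(r["core"]) for r in records]
```

Returning None for unknown labels, rather than guessing, leaves those entries flagged for a manual check against the original publication.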
Enrich with atomistic descriptors
Calculated computationally derived atomistic descriptors based on the elemental composition and crystal phase of each nanoparticle, adding structural features that experimental characterisation alone cannot provide.
Address class imbalance with synthetic data
Applied synthetic data generation techniques to balance the dataset classes, combined with rigorous data filtering and variable selection through an automated ML framework evaluating seven different algorithms.
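The text does not name the synthetic data generation technique that was used. As one common possibility, the sketch below shows a simplified SMOTE-style oversampler that creates new minority-class points by interpolating between a sample and one of its nearest neighbours. The function name smote_like, the parameter k, and the toy feature vectors are assumptions made for illustration only.

```python
import math
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating towards near neighbours.

    A simplified, assumed stand-in for SMOTE-style oversampling; not the
    method confirmed by the source.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # The k nearest minority samples to the base point, excluding the base itself.
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        t = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + t * (o - b) for b, o in zip(base, other)])
    return synthetic


# Toy minority class with two numeric features (values are made up).
minority = [[1.0, 2.0], [1.5, 2.5], [2.0, 2.0], [1.2, 1.8]]
majority_count = 10  # hypothetical size of the majority class

new_points = smote_like(minority, majority_count - len(minority))
balanced_minority = minority + new_points
```

Because each synthetic point lies on a segment between two real samples, it stays inside the range already covered by the minority class. Synthetic points should be generated from training data only, so that the external validation set contains real observations.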
Optimise and validate with AutoML
The AutoML workflow selected XGBoost as the best-performing model, achieving 85% accuracy and 83% balanced accuracy on external validation. The model was validated following OECD guidelines with a defined applicability domain.
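Accuracy and balanced accuracy can differ when classes are imbalanced, which is why both are reported. The sketch below computes the two metrics on toy labels; the label vectors are invented for illustration and do not reproduce the reported 85% and 83% figures.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally regardless of size."""
    recalls = []
    for c in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == c]
        recalls.append(sum(y_pred[i] == c for i in idx) / len(idx))
    return sum(recalls) / len(recalls)


# Toy example: 8 samples of class 1, 2 samples of class 0 (made-up labels).
y_true = [1, 1, 1, 1, 1, 1, 1, 1, 0, 0]
y_pred = [1, 1, 1, 1, 1, 1, 1, 1, 1, 0]

acc = accuracy(y_true, y_pred)            # 9 of 10 correct
bal_acc = balanced_accuracy(y_true, y_pred)  # mean of recall 1.0 and recall 0.5
```

In this toy case accuracy is 0.9 while balanced accuracy is 0.75, because one of the two minority-class samples is misclassified. A small gap between the two metrics, as in the reported results, suggests the model is not simply favouring the larger class.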
Deploy as CeresAI-nano and FAIRify data
Deployed the validated model as the CeresAI-nano web application on the Enalos Cloud Platform. Published the curated dataset through nanoPharos and documented the model in the QSAR Model Reporting Format (QMRF) for regulatory transparency.