The Challenge
Validating Computational Models Across Research Groups
A major barrier to regulatory acceptance of in silico nanoinformatics models is the lack of standardised methods for comparing and validating modelling approaches across independent groups. In experimental science, Round Robin tests, in which several laboratories run the same protocol and compare their results, are the established method for demonstrating reproducibility.
No equivalent standardised approach existed for computational nanoinformatics. This study introduces the concept of consensus modelling as a "modelling equivalent" of the experimental Round Robin, where independent teams build models on the same dataset and integrate their results.
Our Approach
A multi-laboratory consensus modelling framework for nanoinformatics validation
Share a common dataset and protocol
A publicly available dataset of metal and metal oxide nanomaterials with measured zeta potential in aqueous medium was distributed to all four participating groups: NovaMechanics (Cyprus), NTUA (Greece), QSAR Lab (Poland), and DTC Lab (India).
Build independent ML models
Each group independently developed ML models using their own methodologies and algorithms, including Random Forest, AdaBoost, k-Nearest Neighbours, and read-across approaches, resulting in five distinct predictive models.
Evaluate individual model performance
All models were compared on common test data to assess their predictive accuracy, biases, and applicability domains, revealing the strengths and limitations of each individual approach.
Generate consensus predictions
The individual model predictions were integrated into a consensus modelling scheme that combines the outputs of multiple models, enhancing predictive accuracy and reducing the biases inherent in any single approach.
Demonstrate consensus superiority
The consensus models consistently outperformed the individual models, providing more reliable predictions and establishing a new standard for increasing confidence in nanoinformatics model validity.
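The evaluation and consensus steps above can be illustrated with a minimal sketch. It assumes the simplest possible integration scheme, an unweighted mean of the individual predictions for each sample, which may differ from the scheme used in the study. The model names, the zeta potential values, and the helper functions rmse and consensus are all hypothetical and chosen only for illustration; they are not data or results from the study.

```python
from math import sqrt
from statistics import mean


def rmse(y_true, y_pred):
    """Root-mean-square error between measured and predicted values."""
    return sqrt(mean((t - p) ** 2 for t, p in zip(y_true, y_pred)))


def consensus(predictions):
    """Unweighted mean of the per-sample predictions across all models.

    `predictions` maps a model name to a list of predictions, with every
    list ordered identically (same test samples in the same order).
    """
    return [mean(sample) for sample in zip(*predictions.values())]


# Hypothetical measured zeta potentials (mV) for a shared test set.
y_true = [-30.0, -12.0, 5.0, 22.0]

# Hypothetical predictions from three independently built models.
model_predictions = {
    "model_a": [-26.0, -15.0, 9.0, 18.0],
    "model_b": [-34.0, -10.0, 2.0, 27.0],
    "model_c": [-31.0, -8.0, 4.0, 20.0],
}

# Consensus prediction for each test sample.
y_consensus = consensus(model_predictions)

# Score every individual model and the consensus on the same test data.
scores = {name: rmse(y_true, preds) for name, preds in model_predictions.items()}
scores["consensus"] = rmse(y_true, y_consensus)

# Report models from lowest to highest error.
for name, score in sorted(scores.items(), key=lambda item: item[1]):
    print(f"{name}: RMSE = {score:.2f} mV")
```

In this toy data the errors of the individual models partly cancel, so the averaged prediction has the lowest error. Simple averaging is not guaranteed to beat the best individual model in general, which is why the comparison on common test data matters.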