
The relationships between the concepts that OpenML uses to define ML experiments. Credit: Patterns (2025). DOI: 10.1016/j.patter.2025.101317
What began as a doctoral project has grown into a website with 120,000 unique visitors each year. With the OpenML platform, researcher Jan Van Rijn contributes to open science, with the goal of making machine learning more transparent, accessible and fair.
From climate research to behavioral science, machine learning (ML) plays an increasingly important role in science. Researchers use it to discover patterns in large datasets, make predictions, and simulate complex processes. However, despite this growth, ML results can still be difficult to assess or replicate.
“There is no standard way to share data, models, or results,” says Jan Van Rijn. “That’s a shame, because if we want to be taken seriously as a field, we need to make sure our work is verifiable and reproducible.”
What is machine learning?
Machine learning is a way for computers to learn from examples. An email program, for instance, can learn to recognize spam based on thousands of earlier messages. The system learns to find patterns on its own, without every rule being programmed manually. In a way, it works like human learning, but on a much larger scale. Applications are everywhere: from facial recognition and medical diagnosis to Netflix recommendations.
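The spam example can be made concrete with a small sketch. The code below is illustrative only and is not taken from OpenML or from the researchers quoted here. The function names `train` and `classify`, the six toy messages, and the choice of a simple word-counting method (naive Bayes with add-one smoothing) are all assumptions made for the sake of the example.

```python
import math
from collections import Counter

# Hypothetical toy data: each example is (message text, label).
EXAMPLES = [
    ("win a free prize now", "spam"),
    ("free money click now", "spam"),
    ("claim your free prize", "spam"),
    ("meeting agenda for monday", "ham"),
    ("lunch on monday", "ham"),
    ("project report attached", "ham"),
]


def train(examples):
    """Count how often each word occurs in spam and in normal mail ('ham')."""
    word_counts = {"spam": Counter(), "ham": Counter()}
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Pick the label whose word statistics best match the new message."""
    vocabulary = set(word_counts["spam"]) | set(word_counts["ham"])
    total_messages = sum(label_counts.values())
    scores = {}
    for label in ("spam", "ham"):
        total_words = sum(word_counts[label].values())
        # Start from how common the label is among the examples.
        score = math.log(label_counts[label] / total_messages)
        for word in text.lower().split():
            # Add-one smoothing so unseen words do not give a zero probability.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocabulary)))
        scores[label] = score
    return max(scores, key=scores.get)


word_counts, label_counts = train(EXAMPLES)
print(classify("free prize now", word_counts, label_counts))         # spam
print(classify("monday meeting report", word_counts, label_counts))  # ham
```

No rule such as "messages containing 'free' are spam" is written anywhere in the code. The behavior comes entirely from the counted examples, which is the point the paragraph above makes. Real spam filters train on far more data and use more robust methods.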
Shared workspace for machine learning
To make machine learning more transparent, Van Rijn founded OpenML over a decade ago: a shared digital workspace where researchers and students can upload datasets, algorithms and experiments. Anyone can view them, contribute, and learn from other people’s approaches. The platform fits the principles of open science well: science that is accessible, verifiable and reusable.
And there is clearly a need for that. OpenML is now used worldwide and has already contributed to around 1,500 scientific publications. Van Rijn and his fellow researchers recently looked back on 10 years of OpenML in a publication in the journal Patterns. They identified three main ways researchers use the platform: for improving algorithms, for gaining high-level insights through so-called meta-learning, and for education.
“OpenML is often used in courses on machine learning and reproducible research,” he says.
“It’s not that researchers don’t want to share code.”
Open practices are still far from standard. “Science has a wide range of research cultures,” explains Van Rijn. “That brings valuable perspectives, but it also means there is a lack of shared standards. Creating and applying common standards takes a lot of time and effort. It’s not that researchers don’t want to share code.”
Still, Van Rijn is sticking to his mission. “The goal is something like a Wikipedia for machine learning, with not only text but also data, models and experiments.”
OpenML is more than just a platform
He believes that open science will gradually become more established. “Our publications are cited more frequently, which helps. However, open science also requires structural support from universities and funders, for example by having code and data shared openly.”
So OpenML is more than just a platform. It is a step towards a scientific culture built on collaboration, transparency and reuse. “There are other platforms like ours,” says Van Rijn. “Our aim is to break down these silos and connect them, so that sharing research becomes even easier.”
More information: Bernd Bischl et al, OpenML: 10 years and over 1,000 papers, Patterns (2025). DOI: 10.1016/j.patter.2025.101317
Provided by Leiden University
Citation: Platform can make machine learning more transparent and accessible (2025, July 21), retrieved from https://techxplore.com/news/2025-07-platform-machine-transparent-acesible.html