
A new machine learning approach tunes the power output of the Holos-Quad microreactor design by Holosgen LLC. The multi-agent reinforcement learning approach trains more efficiently than previous approaches, advancing toward more autonomous nuclear microreactors for remote operation. Credit: Holosgen LLC.
Exploiting the symmetry of nuclear microreactors can reduce the training time needed to model power output adjustments, according to a study led by researchers at the University of Michigan and published in the journal Energy Conversion and Management: X.
Improved training efficiency helps researchers model reactors faster, a step toward real-time automated nuclear microreactor control for remote or, ultimately, space operations.
These compact reactors can produce up to 20 megawatts of thermal energy that can be used directly as heat or converted to electricity. They can be easily transported and could potentially power freighters that need to travel very long distances without refueling. When incorporated into an electrical grid, nuclear microreactors can provide stable, carbon-free energy when renewable sources such as solar and wind are not abundantly available.
Smaller reactors avoid the enormous capital costs associated with large-scale reactors, and partially automating microreactor power control would further reduce costs. In potential space applications, such as propelling a spacecraft directly or powering its systems, nuclear microreactors must operate completely autonomously.
As a first step toward automation, the researchers simulated load following, in which a power plant raises or lowers its power output to match grid demand. This process is relatively simple to model compared with reactor startup, which involves rapid changes under difficult-to-predict conditions.
The Holos-Quad microreactor design modeled in this study regulates power through the position of eight control drums arranged around the central core of the reactor, where neutrons split uranium atoms to generate energy. One side of each cylindrical control drum is lined with boron carbide, a neutron-absorbing material.
When the boron carbide side is rotated inward, the drum absorbs neutrons from the core, reducing the neutron population and the power output. Rotating it outward keeps more neutrons in the core and increases the output.
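The relationship between drum rotation and inserted reactivity is often approximated with an S-shaped worth curve. The sketch below is illustrative only: the `rho_max` magnitude and the cosine shape are assumptions for demonstration, not parameters from the study.

```python
import math

def drum_worth(theta_rad, rho_max=0.002):
    """Illustrative integral reactivity worth of one control drum.

    theta_rad = 0:  absorber face rotated fully toward the core (max absorption)
    theta_rad = pi: absorber face rotated fully away (max reactivity inserted)
    Uses a common (1 - cos) S-curve shape; rho_max is an assumed magnitude.
    """
    theta_rad = min(max(theta_rad, 0.0), math.pi)  # clamp to the valid range
    return rho_max * (1.0 - math.cos(theta_rad)) / 2.0

# Rotating the absorber away from the core inserts more reactivity,
# which raises the neutron population and the power output.
low = drum_worth(0.25 * math.pi)
high = drum_worth(0.75 * math.pi)
```

The S-curve reflects that a drum's worth changes slowly near its end positions and fastest mid-rotation, which is why fine drum positioning matters for power control.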
“Deep reinforcement learning builds models of system dynamics and enables real-time control that traditional methods like model predictive control often struggle to achieve due to their repeated optimization needs.”
The researchers simulated load following, rotating the control drums based on reactor feedback, with reinforcement learning, a machine learning paradigm in which agents learn to make decisions through repeated trial-and-error interactions with their environment. Deep reinforcement learning is very effective but requires extensive training, which increases computational time and cost.
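The trial-and-error loop at the heart of reinforcement learning can be sketched in a few lines. This toy example is not the study's deep RL setup: it uses tabular Q-learning on a made-up discretized power level to show how an agent learns to steer a state toward a setpoint from reward feedback alone.

```python
import random

random.seed(0)
TARGET, LEVELS, ACTIONS = 5, 11, (-1, 0, +1)  # power bins 0..10, target bin 5
Q = {(s, a): 0.0 for s in range(LEVELS) for a in ACTIONS}

def step(state, action):
    """Toy environment: the action shifts power; reward is higher near target."""
    nxt = min(max(state + action, 0), LEVELS - 1)
    return nxt, -abs(nxt - TARGET)  # negative distance to the setpoint

for episode in range(500):
    s = random.randrange(LEVELS)
    for _ in range(20):
        # Epsilon-greedy: mostly exploit the current estimate, sometimes explore.
        a = random.choice(ACTIONS) if random.random() < 0.1 \
            else max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r = step(s, a)
        best_next = max(Q[(s2, act)] for act in ACTIONS)
        Q[(s, a)] += 0.1 * (r + 0.9 * best_next - Q[(s, a)])  # TD update
        s = s2

# The learned greedy policy pushes power toward the target bin.
policy = {s: max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(LEVELS)}
```

Deep RL replaces the Q table with a neural network so the same loop scales to continuous reactor states and drum angles, at the price of the long training the article describes.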
The researchers tested, for the first time, a multi-agent reinforcement learning approach in which eight independent agents each train to control a particular drum while sharing information about the entire core. This framework leverages the symmetry of microreactors to reduce training time by multiplying the learning experiences gathered from each simulation.
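Structurally, the multi-agent setup replaces one policy that outputs eight drum commands with eight smaller per-drum policies that each observe the shared core state. A minimal sketch of that decomposition (class and method names are hypothetical, not from the paper's code, and the policy is a stand-in):

```python
class DrumAgent:
    """One agent per control drum; all agents observe the shared core state."""
    def __init__(self, drum_id):
        self.drum_id = drum_id

    def act(self, core_state):
        # Placeholder policy: rotate toward whichever direction reduces
        # the power tracking error (a trained network would go here).
        error = core_state["power_demand"] - core_state["power"]
        return 1.0 if error > 0 else -1.0  # rotate absorber out (+) or in (-)

# Eight independent agents share one global observation of the core.
agents = [DrumAgent(i) for i in range(8)]
core_state = {"power": 18.0, "power_demand": 20.0}  # MW, illustrative values
drum_commands = [agent.act(core_state) for agent in agents]
```

Because each agent solves the same smaller per-drum problem, one simulated transient yields eight learning experiences instead of one, which is how the symmetry cuts training time.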
The study evaluated multi-agent reinforcement learning against two other methods: a single-agent approach, in which one agent observes the core state and controls all eight drums, and industry-standard proportional-integral-derivative (PID) control, a feedback-based control loop.
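The PID baseline computes its command directly from the tracking error rather than from a learned model. A minimal discrete-time sketch (the gains and the first-order "plant" are illustrative assumptions, not tuned for a real reactor):

```python
class PID:
    """Discrete proportional-integral-derivative controller."""
    def __init__(self, kp, ki, kd, dt):
        self.kp, self.ki, self.kd, self.dt = kp, ki, kd, dt
        self.integral = 0.0
        self.prev_error = 0.0

    def update(self, setpoint, measured):
        error = setpoint - measured
        self.integral += error * self.dt          # accumulates steady-state error
        derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        return self.kp * error + self.ki * self.integral + self.kd * derivative

# Drive a toy first-order "power" model toward a 20 MW setpoint.
pid = PID(kp=0.5, ki=0.1, kd=0.05, dt=0.1)
power = 15.0
for _ in range(500):
    u = pid.update(20.0, power)
    power += 0.1 * u  # toy plant: power responds proportionally to the input

# power settles near the 20 MW setpoint
```

PID needs no training, which is why it is the industry standard, but its fixed gains are what degrade under the incomplete-sensor conditions the study examined.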
The reinforcement learning approaches achieved load following similar or superior to PID control. In scenarios where the sensors provided incomplete measurements, reinforcement learning maintained a lower error than PID, at up to 150% lower control cost, as reactor conditions varied.
The multi-agent approach trained at least twice as fast as the single-agent approach, with only a slightly higher error rate.
Although the technique requires extensive validation under more complex and realistic conditions before practical application, the findings establish a more efficient pathway toward reinforcement learning for autonomous nuclear microreactors.
“This research is a step toward a forward digital twin, where reinforcement learning drives system actions. Next, we aim to close the loop with inverse calibration and high-fidelity simulations to improve control accuracy,” Radaideh said.
More information: Leo Tunkle et al, Deep reinforcement learning control of nuclear microreactors during transients and load following, Energy Conversion and Management: X (2025). DOI: 10.1016/j.ecmx.2025.101090
Provided by the University of Michigan College of Engineering
Citation: Reinforcement Learning for Nuclear Microreactor Control (June 30, 2025) Retrieved from https://techxplore.com/news/2025-06-nuclue-microreactor.html
This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for information purposes only.