
Credit: Unsplash/CC0 Public Domain
The language capabilities of today’s artificial intelligence systems are astonishing. We can now hold natural conversations with systems such as ChatGPT, Gemini and many others. Yet little is known about the internal processes in these networks that lead to such remarkable results.
A new study published in the Journal of Statistical Mechanics: Theory and Experiment reveals part of this mystery. The study, titled “A phase transition between positional and semantic learning in a solvable model of dot-product attention,” shows that when a small amount of data is used for training, a neural network initially relies on the position of words in a sentence. Once the system is exposed to enough data, however, it shifts to a new strategy based on the meaning of the words.
The study finds that this transition occurs abruptly once a critical data threshold is crossed, much like a phase transition in a physical system. The findings provide valuable insights for understanding the behavior of these models.
Just like a child learning to read, a neural network starts by understanding sentences based on the positions of words: depending on where a word sits in a sentence, the network can infer its relationships (is it the subject, the verb, the object?). However, as training continues, the network “keeps going to school” and a shift occurs: word meaning becomes the primary source of information.
This, the new study explains, is what happens in a simplified model of the self-attention mechanism, a core building block of the transformer language models, such as ChatGPT, Gemini and Claude, that we use every day.
A transformer is a neural network architecture designed to process sequences of data, such as text, and it forms the backbone of many modern language models. Transformers specialize in understanding relationships within a sequence and use the self-attention mechanism to assess the importance of each word relative to the others.
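The paper itself analyzes a mathematically solvable model of dot-product attention rather than a full transformer, but the core operation is the same. As a rough illustration only, here is a minimal single-head scaled dot-product attention sketch in Python/NumPy; the names, dimensions and random inputs below are ours, not the paper’s:

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def dot_product_attention(X, W_q, W_k, W_v):
    """Single-head scaled dot-product attention.

    X:              (seq_len, d_model) token representations
    W_q, W_k, W_v:  (d_model, d_head) learned projection matrices
    """
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[-1])  # similarity of each token to every other
    weights = softmax(scores, axis=-1)       # each row is a distribution over tokens
    return weights @ V, weights

# Toy usage: 3 tokens ("Mary eats apple"), 8-dim embeddings, 4-dim head.
rng = np.random.default_rng(0)
X = rng.normal(size=(3, 8))
W_q, W_k, W_v = (rng.normal(size=(8, 4)) for _ in range(3))
out, attn = dot_product_attention(X, W_q, W_k, W_v)
print(attn.round(2))  # attention weights: each row shows where a token "looks"
```

Each row of the attention matrix shows how strongly one token attends to the others; training shapes the projection matrices so that these weights come to pick up either positional or semantic cues.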
“To assess relationships between words,” explains Hugo Cui, a postdoctoral researcher at Harvard University and first author of the study, “the network can use two strategies. One is to exploit the positions of words: in a language like English, for example, the subject usually precedes the verb, which in turn precedes the object. ‘Mary eats an apple’ is a simple example of this sequence.”
“This is the first strategy that spontaneously emerges when the network is trained,” Cui explains. “However, in our study, we observed that if training continues and the network receives enough data, at a certain point, once a threshold is crossed, the strategy shifts abruptly: the network begins to rely on meaning instead.”
“When we designed this work, we simply wanted to study which strategies, or mix of strategies, the networks would adopt. But what we found was somewhat surprising: below a certain threshold, the network relied exclusively on position, while above it, only on meaning.”
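In standard transformers, both signals are available at once: the vector the network sees for each token is typically the sum of a word (semantic) embedding and a positional encoding, so attention is free to key on either. Below is a minimal sketch of that setup, with random embeddings and invented token IDs purely for illustration; the paper’s solvable model is more specific than this:

```python
import numpy as np

rng = np.random.default_rng(1)
vocab_size, seq_len, d_model = 5, 3, 8

# "Semantic" signal: one embedding per word, the same wherever the word appears.
word_emb = rng.normal(size=(vocab_size, d_model))
# "Positional" signal: one encoding per slot, the same whatever word fills it.
pos_enc = rng.normal(size=(seq_len, d_model))

tokens = np.array([0, 3, 2])        # hypothetical IDs for "Mary", "eats", "apple"
X = word_emb[tokens] + pos_enc      # attention sees both signals superposed

# Swapping the words changes only the semantic part of each slot;
# a purely position-based strategy would be blind to this change.
swapped = np.array([2, 3, 0])       # "apple eats Mary"
X_swapped = word_emb[swapped] + pos_enc
print(np.allclose(X, X_swapped))    # False: the semantic content moved
```

Which component the trained attention ends up exploiting, the positional part or the semantic part, is exactly the choice of strategy the study characterizes.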
Cui describes this shift as a phase transition, borrowing a concept from physics. Statistical physics studies systems composed of enormous numbers of particles (atoms, molecules, etc.) by describing their collective behavior statistically.
Similarly, the neural networks at the basis of these AI systems are made up of huge numbers of “nodes,” or neurons (named by analogy to the human brain), each connected to many others and performing simple operations. The intelligence of the system emerges from the interaction of these neurons, a phenomenon that can be described with statistical methods.
This is why we can speak of the sudden change in the network’s behavior as a phase transition, just as water, at certain temperatures and pressures, abruptly changes its state.
“Understanding from a theoretical viewpoint that the strategy shift happens in this way is important,” Cui emphasizes.
“Our networks are simplified compared to the complex models people interact with every day, but they can give us hints about the conditions that cause a model to settle on one strategy or the other. We hope this theoretical knowledge can be used in the future to make the use of neural networks more efficient and safer.”
More information: A phase transition between positional and semantic learning in a solvable model of dot-product attention, Journal of Statistical Mechanics: Theory and Experiment (2025).
Provided by SISSA Medialab
Citation: From position to meaning: How AI learns to read (2025, July 7) retrieved 7 July 2025 from https://techxplore.com/news/2025-07-position-ai.html
This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for information purposes only.
