Test-time training can lead to LLMs that excel at complex reasoning

By Thefuturedatainsights | July 9, 2025 | 5 min read

Examples of ARC and BBH tasks that the model solves successfully only after applying test-time training. Credit: arXiv (2024). DOI: 10.48550/arxiv.2411.07279

For all their impressive abilities, large language models (LLMs) often fall short when given challenging new tasks that require complex reasoning skills.

An accounting firm's LLM might excel at summarizing financial reports, but that same model could fail unexpectedly if tasked with predicting market trends or identifying fraudulent transactions.

To make LLMs more adaptable, MIT researchers investigated how a particular training technique can be strategically deployed to boost a model's performance on unfamiliar, difficult problems.

They show that test-time training, a method that involves temporarily updating some of a model's inner workings during deployment, can lead to a sixfold improvement in accuracy. The researchers developed a framework for implementing a test-time training strategy that uses examples of the new task to maximize these gains.

Their work could improve a model's flexibility, enabling an off-the-shelf LLM to adapt to complex tasks that require planning and abstraction. This could lead to LLMs that are more accurate in many applications requiring logical deduction, from medical diagnostics to supply-chain management.

“Genuine learning, what we did here with test-time training, is something these models can’t do on their own once they are shipped. They can’t gain new skills or get better at a task. But we have shown that if you push the model a little bit to do actual learning, huge improvements in performance can happen,” says Ekin Akyürek PhD ’25, the lead author of the study.

Akyürek is joined on the paper by graduate students Mehul Damani, Linlu Qiu, Han Guo and Jyothish Pari; Adam Zweiger; senior author Yoon Kim, an assistant professor of Electrical Engineering and Computer Science (EECS) and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL); and Jacob Andreas, an associate professor in EECS and a member of CSAIL.

The study will be presented at the International Conference on Machine Learning (ICML 2025), held in Vancouver July 13-19. The paper is currently available on the arXiv preprint server.

Tackling hard domains

LLM users often try to improve their model's performance on a new task with a technique called in-context learning, in which they feed the model a few examples of the new task as text prompts that guide its outputs.
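
As a rough illustration of that idea (a sketch, not the researchers’ code, with made-up example strings), an in-context prompt can be assembled by simply concatenating the demonstrations with the new query:

```python
# Minimal sketch of in-context learning: the task examples go directly into the
# prompt text, and none of the model's weights change. Strings are illustrative.
examples = [
    ("Input: 2 4 6 -> Output:", "8"),
    ("Input: 1 3 5 -> Output:", "7"),
]
query = "Input: 10 12 14 -> Output:"

# Few-shot prompt: worked demonstrations followed by the new problem.
prompt = "\n".join(f"{p} {a}" for p, a in examples) + "\n" + query
print(prompt)  # this text is what gets fed to the LLM
```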

However, in-context learning doesn't always work for problems that require logic and reasoning.

The MIT researchers investigated how test-time training could be used in combination with in-context learning to boost performance on these challenging tasks. Test-time training involves updating some of the model's parameters, the internal variables it uses to make predictions, using a small amount of new data specific to the task at hand.
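
As a minimal sketch of what test-time training looks like in practice (illustrative only; the model name, data and hyperparameters are stand-ins, and the paper’s actual recipe differs), a model can be given a few gradient steps on the task examples immediately before it answers:

```python
# Minimal sketch of test-time training (TTT): briefly fine-tune a model on a few
# task-specific examples right before answering, instead of only prompting it.
# For simplicity this updates all parameters; the article notes below that only
# a small subset is updated in practice.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "gpt2"  # stand-in for an off-the-shelf LLM
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# A handful of (problem, solution) pairs for the new task.
task_examples = [
    ("Input: 2 4 6 -> Output:", " 8"),
    ("Input: 1 3 5 -> Output:", " 7"),
]

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)
model.train()
for _ in range(3):  # a few gradient steps per task instance
    for prompt, answer in task_examples:
        batch = tokenizer(prompt + answer, return_tensors="pt")
        loss = model(**batch, labels=batch["input_ids"]).loss
        loss.backward()
        optimizer.step()
        optimizer.zero_grad()

# The temporarily adapted model then answers the actual query.
model.eval()
query = tokenizer("Input: 10 12 14 -> Output:", return_tensors="pt")
print(tokenizer.decode(model.generate(**query, max_new_tokens=4)[0]))
```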

They examined how test-time training interacts with in-context learning and studied design choices that maximize the performance improvements one can coax out of a general-purpose LLM.

“We find that test-time training is a much stronger form of learning. While simply providing examples can modestly boost accuracy, actually updating the model with those examples can lead to significantly better performance, particularly in challenging domains,” says Damani.

In-context learning requires a small set of task examples, including problems and their solutions. The researchers use these examples to create the task-specific dataset needed for test-time training.

To expand the size of this dataset, they create new inputs by slightly altering the problems and solutions in the examples, such as by flipping some of the input data horizontally. They find that training the model on the outputs of this new dataset leads to the best performance.
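
A simple sketch of that kind of augmentation (the grids and the choice of transform are illustrative, not the paper’s exact procedure) might look like this:

```python
# Expand a handful of task examples by applying simple, label-preserving
# transforms such as a horizontal flip of the input and output grids.
def hflip(grid):
    """Flip a 2D grid (list of rows) left-to-right."""
    return [row[::-1] for row in grid]

def augment(examples):
    augmented = list(examples)
    for problem, solution in examples:
        # Flip both the problem and its solution so the pair stays consistent.
        augmented.append((hflip(problem), hflip(solution)))
    return augmented

examples = [
    ([[1, 0], [0, 2]], [[2, 0], [0, 1]]),  # one tiny (input grid, output grid) pair
]
ttt_dataset = augment(examples)  # twice as many examples for test-time training
print(len(ttt_dataset))  # -> 2
```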

In addition, the researchers update only a small number of model parameters using a technique called low-rank adaptation, which improves the efficiency of the test-time training process.
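
The following is a from-scratch sketch of the low-rank adaptation idea (not the paper’s or any library’s implementation): the original weights stay frozen, and only a small pair of low-rank matrices is trained.

```python
# Minimal sketch of low-rank adaptation (LoRA): freeze a layer's original weight
# and learn only a small low-rank update A @ B during test-time training.
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    def __init__(self, base: nn.Linear, rank: int = 4, alpha: float = 8.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # original weights stay frozen
        self.A = nn.Parameter(torch.randn(base.in_features, rank) * 0.01)
        self.B = nn.Parameter(torch.zeros(rank, base.out_features))
        self.scale = alpha / rank

    def forward(self, x):
        # Frozen base output plus the learned low-rank correction.
        return self.base(x) + (x @ self.A @ self.B) * self.scale

layer = LoRALinear(nn.Linear(768, 768))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
total = sum(p.numel() for p in layer.parameters())
print(f"trainable params: {trainable} of {total}")  # only the tiny A and B matrices
```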

“This is important because the method needs to be efficient if it is going to be deployed in the real world. We find that you can get huge improvements in accuracy with a very small amount of parameter training,” Akyürek says.

Developing new skills

Streamlining the process is important because test-time training is employed on a per-instance basis, meaning a user would need to do this for each individual task. The updates to the model are only temporary, and the model reverts to its original form after making a prediction.
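
One way to picture that temporary update (a hypothetical sketch; the ttt_step callback and function name are invented for illustration) is to snapshot the model’s parameters, adapt and answer, then restore the snapshot:

```python
# Snapshot the model's parameters, run test-time training and prediction, then
# restore the snapshot so the model returns to its original form.
import copy
import torch

def answer_with_ttt(model, ttt_step, query, num_steps=3):
    original_state = copy.deepcopy(model.state_dict())  # snapshot before adapting
    try:
        model.train()
        for _ in range(num_steps):
            ttt_step(model)          # one gradient step on the task-specific examples
        model.eval()
        with torch.no_grad():
            return model(query)      # prediction from the temporarily adapted model
    finally:
        model.load_state_dict(original_state)  # revert: no permanent change
```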

A model that usually takes less than a minute to answer a query might take five or 10 minutes to provide an answer with test-time training, Akyürek adds.

“We wouldn’t want to do this for all user queries, but it is useful if you have a very hard task that you want the model to solve well. There also might be tasks that are too challenging for an LLM to solve without this method,” he says.

The researchers tested their approach on two benchmark datasets of extremely complex problems, such as IQ puzzles. It boosted accuracy as much as sixfold over techniques that use only in-context learning.

Tasks that involved structured patterns, or that used completely unfamiliar types of data, showed the largest performance improvements.

“For simpler tasks, in-context learning might be fine, but updating the parameters themselves might develop a new skill in the model,” says Damani.

In the future, the researchers want to use these insights to develop models that learn continually.

The long-term goal is an LLM that, given a query, can automatically determine whether it needs to update its parameters with test-time training or whether it can solve the task with in-context learning, and then implement the best test-time training strategy without the need for human intervention.
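
A purely hypothetical sketch of that long-term goal (predict_with_examples, test_time_train and the confidence threshold are invented for illustration) might route each query between the two methods like this:

```python
# Hypothetical router: try cheap in-context learning first, and fall back to
# test-time training only when the model is not confident enough.
def answer(model, query, task_examples, confidence_threshold=0.8):
    # First try plain in-context learning: prompt with the examples and the query.
    prediction, confidence = model.predict_with_examples(query, task_examples)
    if confidence >= confidence_threshold:
        return prediction                           # cheap path: no parameter updates
    # Otherwise, spend extra compute on test-time training before answering.
    adapted = model.test_time_train(task_examples)  # temporary, per-task update
    return adapted.predict(query)
```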

More information: Ekin Akyürek et al, The Surprising Effectiveness of Test-Time Training for Few-Shot Learning, arXiv (2024). DOI: 10.48550/arxiv.2411.07279

Journal information: arXiv

Provided by Massachusetts Institute of Technology

This story has been republished courtesy of MIT News (web.mit.edu/newsoffice/), a popular site that covers news about MIT research, innovation and education.

Citation: Test-time training can lead to LLMs that excel at complex reasoning (July 8, 2025), retrieved from https://techxplore.com/news/2025-07-llms-complex.html.

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no part may be reproduced without written permission. The content is provided for information purposes only.


