USA Business Watch – Insightful News on Economy, Finance, Politics & Industry
Electronics & Semiconductor

AI generates data to help embodied agents ground language to 3D world

By Bussiness Insights | June 18, 2025 | 5 min read
A new 3D-text dataset, 3D-GRAND, leverages generative AI to create synthetic rooms that are automatically annotated with 3D structures. The dataset’s 40,087 household scenes can help train embodied AI, like household robots, connect language to 3D spaces. Credit: Joyce Chai

A new, densely annotated 3D-text dataset called 3D-GRAND can help train embodied AI, such as household robots, to connect language to 3D spaces. The study, led by University of Michigan researchers, was presented at the Computer Vision and Pattern Recognition (CVPR) Conference in Nashville, Tennessee, on June 15, and published on the arXiv preprint server.

When tested against models trained on previous 3D datasets, the model trained on 3D-GRAND reached 38% grounding accuracy, surpassing the previous best model by 7.7%. 3D-GRAND also drastically reduced hallucinations, to only 6.67% from the previous state-of-the-art rate of 48%.

The dataset contributes to the next generation of household robots that will far exceed the robotic vacuums that currently populate homes. Before we can command a robot to “pick up the book next to the lamp on the nightstand and bring it to me,” the robot must be trained to understand what language refers to in space.

“Large multimodal language models are mostly trained on text with 2D images, but we live in a 3D world. If we want a robot to interact with us, it must understand spatial terms and perspectives, interpret object orientations in space, and ground language in the rich 3D environment,” said Joyce Chai, a professor of computer science and engineering at U-M and senior author of the study.

While text or image-based AI models can pull an enormous amount of information from the internet, 3D data is scarce. It’s even harder to find 3D data with grounded text data—meaning specific words like “sofa” are linked to 3D coordinates bounding the actual sofa.
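Grounded 3D-text data of this kind can be pictured as annotations that tie each noun phrase in a description to a bounding box in scene coordinates. The sketch below is a hypothetical schema for illustration only; field names such as `bbox_center` are invented and do not reflect 3D-GRAND's actual format:

```python
# Hypothetical grounded annotation: each noun phrase in the scene
# description is linked to an object id and a 3D bounding box.
# All field names and values here are invented for illustration.
annotation = {
    "scene_id": "room_0001",
    "description": "A blue sofa sits next to the wooden coffee table.",
    "groundings": [
        {"phrase": "blue sofa", "object_id": 3,
         "bbox_center": [1.2, 0.4, 2.0], "bbox_size": [2.0, 0.9, 0.8]},
        {"phrase": "wooden coffee table", "object_id": 7,
         "bbox_center": [1.1, 0.3, 1.0], "bbox_size": [1.0, 0.4, 0.6]},
    ],
}

def grounded_phrases(ann):
    """Return the noun phrases that are linked to 3D boxes."""
    return [g["phrase"] for g in ann["groundings"]]
```

The point of "dense" grounding is that every object-referring phrase carries such a link, not just one or two salient objects.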

Like all LLMs, 3D-LLMs perform best when trained on large datasets. However, building a large dataset by imaging rooms with cameras would be time-intensive and expensive, as annotators must manually specify objects and their spatial relationships and link words to their corresponding objects.

The research team took a new approach, leveraging generative AI to create synthetic rooms that are automatically annotated with 3D structures. The resulting 3D-GRAND dataset includes 40,087 household scenes paired with 6.2 million densely-grounded descriptions of the room.

“A big advantage of synthetic data is that labels come for free because you already know where the sofa is, which makes the curation process easier,” said Jianing Jed Yang, a doctoral student of computer science and engineering at U-M and lead author of the study.

After generating the synthetic 3D data, an AI pipeline first used vision models to describe each object’s color, shape and material. From here, a text-only model generated descriptions of entire scenes while using scene graphs—structured maps of how objects relate to each other—to ensure each noun phrase is grounded to specific 3D objects.
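A scene graph of the kind described above can be sketched as a small map of objects (with attributes) plus relation edges, from which grounded sentence fragments follow mechanically. The labels and helper below are illustrative, not the paper's pipeline:

```python
# Minimal scene-graph sketch: nodes are objects with attributes,
# edges encode spatial relations. Labels are illustrative only.
scene_graph = {
    "objects": {
        1: {"label": "lamp", "color": "white", "material": "metal"},
        2: {"label": "nightstand", "color": "brown", "material": "wood"},
        3: {"label": "book", "color": "red", "material": "paper"},
    },
    "relations": [
        (1, "on top of", 2),   # lamp on top of nightstand
        (3, "next to", 1),     # book next to lamp
    ],
}

def describe(graph):
    """Turn each relation edge into a sentence fragment, recording
    which object id each noun phrase refers to (the grounding)."""
    objs = graph["objects"]
    out = []
    for subj, rel, obj in graph["relations"]:
        text = f"the {objs[subj]['label']} is {rel} the {objs[obj]['label']}"
        links = {objs[subj]["label"]: subj, objs[obj]["label"]: obj}
        out.append((text, links))
    return out
```

Because every sentence is derived from graph edges, the link between each noun phrase and its 3D object comes for free, which is the advantage the authors highlight for synthetic data.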

A final quality control step used a hallucination filter to ensure each object generated in the text actually has an associated object in the 3D scene.
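The core of such a filter can be sketched in a few lines: keep only the noun-phrase groundings whose object id actually exists in the scene, and flag the rest. This is an illustrative reconstruction under assumed field names, not the paper's implementation:

```python
def split_hallucinations(groundings, scene_object_ids):
    """Separate groundings into those whose object id exists in the
    3D scene and those that refer to nothing (hallucinations).
    Illustrative sketch; field names are invented, not 3D-GRAND's schema."""
    kept = [g for g in groundings if g["object_id"] in scene_object_ids]
    hallucinated = [g for g in groundings if g["object_id"] not in scene_object_ids]
    return kept, hallucinated
```

In practice the real filter must also match phrases to objects semantically, but the existence check above captures the basic quality gate.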

Human evaluators spot-checked 10,200 room-annotation pairs, assessing whether the AI-generated sentences or objects contained any inaccuracies. The synthetic annotations had a low error rate of about 5% to 8%, which is comparable to professional human annotations.

“Given the size of the dataset, the LLM-based annotation reduces both the cost and time by an order of magnitude compared to human annotation, creating 6.2 million annotations in just two days. It is widely recognized that collecting high-quality data at scale is essential for building effective AI models,” said Yang.

To put the new dataset to the test, the research team trained a model on 3D-GRAND and compared it with three baseline models (3D-LLM, LEO and 3D-VISTA). The ScanRefer benchmark evaluated grounding accuracy—how much the predicted bounding box overlaps with the true object boundary—while a newly introduced benchmark called 3D-POPE evaluated object hallucinations.
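Bounding-box overlap of this kind is conventionally measured as 3D intersection-over-union (IoU). A minimal sketch, assuming axis-aligned boxes given as (min corner, max corner) pairs:

```python
def iou_3d(box_a, box_b):
    """Intersection-over-union of two axis-aligned 3D boxes,
    each given as a (min_corner, max_corner) pair of xyz triples."""
    (a_min, a_max), (b_min, b_max) = box_a, box_b
    inter = 1.0
    for lo_a, hi_a, lo_b, hi_b in zip(a_min, a_max, b_min, b_max):
        overlap = min(hi_a, hi_b) - max(lo_a, lo_b)
        if overlap <= 0:          # boxes are disjoint along this axis
            return 0.0
        inter *= overlap          # accumulate intersection volume

    def vol(mn, mx):
        return (mx[0] - mn[0]) * (mx[1] - mn[1]) * (mx[2] - mn[2])

    return inter / (vol(a_min, a_max) + vol(b_min, b_max) - inter)
```

ScanRefer-style evaluation typically counts a prediction as correct when its IoU with the ground-truth box clears a threshold such as 0.25 or 0.5.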

The model trained on 3D-GRAND reached a 38% grounding accuracy with only a 6.67% hallucination rate, far exceeding the competing generative models. While 3D-GRAND contributes to the 3D-LLM modeling community, testing on robots will be the next step.

“It will be exciting to see how 3D-GRAND helps robots better understand space and take on different spatial perspectives, potentially improving how they communicate and collaborate with humans,” said Chai.

More information:
Jianing Yang et al, 3D-GRAND: A Million-Scale Dataset for 3D-LLMs with Better Grounding and Less Hallucination, arXiv (2024). DOI: 10.48550/arxiv.2406.05132

Journal information:
arXiv

Provided by
University of Michigan College of Engineering

Citation:
AI generates data to help embodied agents ground language to 3D world (2025, June 16)
retrieved 18 June 2025
from https://techxplore.com/news/2025-06-ai-generates-embodied-agents-ground.html

This document is subject to copyright. Apart from any fair dealing for the purpose of private study or research, no
part may be reproduced without the written permission. The content is provided for information purposes only.


