Google DeepMind has released a new version of its robotics model, Gemini Robotics On-Device, designed to run directly on robots without needing an internet connection.
This update builds on the earlier Gemini Robotics model launched in March. The new version allows robots to perform physical tasks using local computing power. Developers can control and fine-tune robot behavior using simple natural language commands.
According to Google, the on-device model performs nearly as well as the cloud-based version. The company also claims it outperforms other on-device models, although it did not name which ones it compared against.

In demonstrations, robots using the model could perform tasks like unzipping bags and folding clothes. Initially trained on ALOHA robots, the model was later adapted to other platforms, such as the Apollo humanoid robot from Apptronik and the bi-arm Franka FR3.
Google says the FR3 setup even handled previously unseen tasks, such as assembling parts on an industrial belt, without additional training.
To help developers train robots more easily, Google is also releasing a Gemini Robotics SDK. With the toolkit, developers can adapt the model to new tasks using just 50 to 100 demonstrations, and they can test behavior in the MuJoCo physics simulator before deploying it to hardware, making development more accessible and efficient.
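The article does not describe the Gemini Robotics SDK's API, so the sketch below only illustrates the kind of simulation loop MuJoCo provides, using the open-source `mujoco` Python bindings. The inline single-joint arm model, the control value, and the one-second rollout are illustrative assumptions, not part of Google's toolkit.

```python
# Minimal MuJoCo sketch: load a toy model and step the physics simulation.
# Assumes `pip install mujoco`; this is not the Gemini Robotics SDK API.
import mujoco

# A tiny, hypothetical single-joint arm defined inline for illustration.
XML = """
<mujoco>
  <worldbody>
    <body name="arm" pos="0 0 1">
      <joint name="shoulder" type="hinge" axis="0 1 0"/>
      <geom type="capsule" fromto="0 0 0  0.3 0 0" size="0.03"/>
    </body>
  </worldbody>
  <actuator>
    <motor joint="shoulder" gear="10"/>
  </actuator>
</mujoco>
"""

model = mujoco.MjModel.from_xml_string(XML)
data = mujoco.MjData(model)

# Apply a constant control signal and advance the physics for one simulated second.
data.ctrl[:] = 0.5
steps = int(1.0 / model.opt.timestep)
for _ in range(steps):
    mujoco.mj_step(model, data)

print("joint angle (rad):", data.qpos[0])
```

In practice, a policy trained from demonstrations would replace the constant control signal, writing its predicted actions into `data.ctrl` at each step.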
Other tech players are exploring similar paths in robotics. Nvidia is building a platform for foundation models tailored to humanoid robots. Hugging Face is developing open-source models and datasets for robotics, and even its own robot projects. Meanwhile, Korean startup RLWRLD, backed by Mirae Asset, is also focused on building foundation AI models for robotics.