Google found a new way to demonstrate what its Gemini AI model can do, with help from a robot.
The robot came from Google’s Everyday Robots division, which was shut down last year. But apparently the robots are still around, so Google put a yellow bowtie on one of them, then used Gemini to teach it how to respond to commands and navigate the DeepMind office space.
To accomplish this, Google is using vision language models (VLMs), which are trained on images and videos along with text, allowing them to answer questions and perform tasks that require perception.
For example, in one video, a Google employee asks the robot to take him somewhere to draw things. The robot says it needs a minute to think, then takes the employee to a whiteboard. In another video, the robot is told to follow the directions on the whiteboard, where a map shows the way to what’s called the Blue Area. The robot follows the directions to a robotics testing area, then announces, “I’ve successfully followed the directions on the whiteboard.”
Hit play to see the robot in action, then let us know what you think in the comments!