Google announces Gemini Robotics for building general-purpose robots
Gemini Robotics is a vision-language-action (VLA) model from Google DeepMind that aims to bring Gemini's AI capabilities into the physical world. It was created to make general-purpose robots that can perform a wider range of real-world tasks than ever before. The VLA is built on Gemini 2.0 and adds physical actions as an output modality so that it can control robots directly. Google designed the model to understand and respond quickly to instructions or changes in its environment, to handle the kinds of tasks people generally do with their hands and fingers, and to deal with new objects, diverse instructions, and new environments. A video showing some of the model's capabilities is available in the article.