xAI’s Grok can answer questions using your phone camera


A smartphone is held in the user's hands, its screen showing a chatbot interface that uses the camera to analyze real-world objects. The phone is pointed at a product, a foreign-language sign, or a handwritten note. The screen displays visual highlights and live chatbot responses for each object. The background subtly suggests comparison with AI tools from Google and OpenAI through icons and faint UI elements. The overall scene is modern and intuitive, reflecting a cutting-edge multimodal AI experience.

xAI’s Grok chatbot has taken a big leap into visual intelligence. The company announced Tuesday that Grok can now interpret the world through a smartphone camera, thanks to a new feature called Grok Vision.

Grok Vision lets users point their iPhones at objects such as signs, documents, and everyday products and ask the chatbot questions about what they are seeing. The feature is currently available only in the iOS version of the Grok app, but xAI says it will come to Android in the future.

Multilingual audio and voice search have also been added

The Grok update also adds multilingual audio support and real-time search integration to the chatbot’s voice mode. These new tools are available to Android users, but only to subscribers on xAI’s $30-per-month SuperGrok plan.

These updates continue a pattern of rapid feature growth for Grok. Earlier this month, xAI introduced a memory feature that allows the chatbot to recall details from past conversations. The bot has also gained tools such as a canvas for creating documents and simple applications.

Grok’s new vision capabilities mark a key step toward making chatbots more context-aware and more physically integrated into everyday life. By letting users simply point and ask, xAI creates a more seamless connection between digital assistance and the real world. As multimodal capabilities become standard across AI platforms, tools like Grok Vision could help define the next era of mobile interaction.

Google’s Gemini and OpenAI’s ChatGPT have already rolled out similar real-time visual features, which puts Grok squarely in the race to define the next generation of multimodal AI assistants.

The future of AI may not live behind a screen. It may pass through the lens in your hand.

Editor’s Note: This article was created by Alicia Shapiro, CMO of AiNews.com, with writing, image, and idea-generation support from ChatGPT, an AI assistant. However, the final perspective and editorial choices are solely Alicia Shapiro’s. Thank you to ChatGPT for its research and editorial support in writing this article.


