OpenAI Unveils Video Features in ChatGPT’s Enhanced Voice Mode

ChatGPT’s Enhanced Voice Mode Now Includes Video and Screensharing

ChatGPT’s Enhanced Voice Mode (EVM) now supports video and screenshare features.

Initially introduced last year alongside the launch of GPT-4o, EVM was previously limited to audio interactions. Now, users can communicate with ChatGPT through their mobile device’s camera, enabling the model to see and interpret their surroundings in real time.

In a livestream showcase, OpenAI’s Chief Product Officer Kevin Weil and other team members demonstrated the new capabilities of EVM. In one scenario, ChatGPT helped prepare pour-over coffee: with the camera pointed at the coffee-making setup, the model showed that it recognized the tools involved and guided the team through the brewing process step by step. The team also highlighted ChatGPT’s ability to analyze and interpret uploaded screenshots.

Effective today, the video and screenshare features are available to ChatGPT Plus and Pro subscribers. OpenAI intends to extend access to Enterprise and Edu users in January.

*This story is still developing…*