OpenAI announced GPT-4o, a new generative AI model that handles text, speech, and video. Set to roll out iteratively, GPT-4o offers GPT-4-level intelligence with enhanced multimodal capabilities, improving on the previous GPT-4 Turbo model by adding speech to its analysis of images and text. The model boosts ChatGPT’s functionality with real-time responsiveness, nuanced voice interaction, and stronger visual analysis: it can answer questions about photos or desktop screens and translate menus written in other languages. GPT-4o is also more multilingual than its predecessors, as well as faster and more cost-effective. Because of misuse concerns, its audio capabilities will initially be available only to trusted partners. GPT-4o is accessible to both free and paid ChatGPT users, with additional voice features arriving for Plus subscribers soon. OpenAI is also updating the ChatGPT UI and releasing a macOS desktop version, and the GPT Store, along with other previously paywalled features, is now available to free-tier users.
- Introduction of GPT-4o, handling text, speech, and video.
- Enhanced capabilities over GPT-4 Turbo, including speech integration.
- Improvements to ChatGPT’s real-time responsiveness and visual analysis.
- Improved multilingual support, speed, and cost-effectiveness of GPT-4o.
- Updates to ChatGPT UI, macOS desktop version, and availability of the GPT Store to free tier users.
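
For developers, the photo-question capability described above maps naturally onto a multimodal API call. The snippet below is a minimal sketch, assuming the official `openai` Python SDK (v1+), an `OPENAI_API_KEY` set in the environment, and a placeholder image URL; it is an illustration of the pattern, not OpenAI's reference example.

```python
# Minimal sketch: asking GPT-4o a question about a photo (e.g., a menu)
# Assumes the `openai` Python SDK v1+ and OPENAI_API_KEY in the environment.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {
            "role": "user",
            "content": [
                # Text part of the prompt
                {"type": "text", "text": "Which dishes on this menu are vegetarian?"},
                # Image part of the prompt (placeholder URL)
                {
                    "type": "image_url",
                    "image_url": {"url": "https://example.com/menu-photo.jpg"},
                },
            ],
        }
    ],
)

print(response.choices[0].message.content)
```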