OpenAI has introduced a new artificial intelligence model, GPT-4o, with an extended voice mode.
The letter "o" in the word GPT-4o stands for "omni", which indicates its broad capabilities. The updated model can handle speech, text and video. According to the company, the GPT-4o processes audio in an average of 320 milliseconds, which is comparable to the reaction time of a person in a conversation.
GPT-4o matches the performance of OpenAI's previous top-of-the-line model, GPT-4 Turbo, but the company says it outperforms GPT-4 Turbo in image and audio understanding.
As TechCrunch notes, while GPT models have offered a voice mode for some time, GPT-4o greatly expands that feature, letting users interact with ChatGPT much as they would with a voice assistant. The model responds to voice input in real time, picks up on vocal nuances, and can generate responses in a range of emotional styles, including singing. It speaks 50 languages, according to OpenAI.
GPT-4o became available to users on May 13. Initially, the voice features are limited to a select group of trusted partners, with broader access for paid subscribers expected in June.
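For developers, GPT-4o is also exposed through the OpenAI API under the model name "gpt-4o". Below is a minimal sketch of a text request, assuming the official openai Python SDK and an OPENAI_API_KEY set in the environment; the prompt text is illustrative only.

```python
# Minimal sketch: sending a text prompt to GPT-4o via the OpenAI API.
# Assumes the official `openai` Python SDK (pip install openai) and an
# OPENAI_API_KEY environment variable; "gpt-4o" is the identifier OpenAI
# uses for this model in its API.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Summarize what makes GPT-4o different from GPT-4 Turbo."},
    ],
)

# The assistant's reply is in the first choice of the response.
print(response.choices[0].message.content)
```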
Free GPT-4 for everyone
ChatGPT-4 is also now available to everyone for free. OpenAI has not explained the reason for this decision, but it is most likely trying to retain existing users and attract new ones, as competition in the AI field keeps intensifying and rivals release new products almost every week.
GPT-5, which was supposed to be released at the end of last year, has still not become available, and OpenAI may also be trying to draw attention away from that delay. Another benefit is that newer models are trained on data from users' interactions with older ones, so OpenAI has an incentive to collect as much usage data as possible.