OpenAI has released a set of updates meant to improve the voice and vision capabilities of its artificial intelligence models. With these updates, developers can build AI-powered apps that are more interactive, supporting real-time spoken conversation and more accurate image recognition.
One standout feature is the Realtime API, which lets developers build voice-based AI apps with a single API call. The new API enables low-latency experiences by streaming audio inputs and outputs in real time, unlike older approaches that required several models to work together for audio interactions. As with ChatGPT’s advanced voice features, this makes conversations feel more natural.
Previously, real-time apps such as speech-to-speech chat were slow because audio input had to be fully uploaded and processed before a response could be returned. With the Realtime API, which runs on GPT-4o, the model OpenAI released in May 2024, developers can now build fast voice exchanges that sound like talking with a real person. The model can handle text, vision, and audio data together.
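The latency difference between uploading everything first and streaming can be illustrated with simple arithmetic. The sketch below is a toy model, not a measurement of any OpenAI service: the function name, the chunk counts, and the timings are all hypothetical values chosen for illustration.

```python
def time_to_first_response_ms(num_chunks, chunk_ms, process_ms_per_chunk, streaming):
    """Return the delay (from the start of speech) before any output can be produced.

    All parameters are hypothetical illustration values.
    """
    if streaming:
        # Streaming: the first chunk can be processed as soon as it arrives.
        return chunk_ms + process_ms_per_chunk
    # Upload-then-process: every chunk must arrive before processing begins,
    # and the whole recording is processed before anything is returned.
    return num_chunks * chunk_ms + num_chunks * process_ms_per_chunk


# A 5-second utterance split into 50 chunks of 100 ms, 20 ms of work per chunk.
batch = time_to_first_response_ms(50, 100, 20, streaming=False)
stream = time_to_first_response_ms(50, 100, 20, streaming=True)
print(batch, stream)  # 6000 120
```

Under these assumed numbers the streaming path can begin responding after 120 ms, while the upload-first path waits 6 seconds. Real figures depend on network conditions and model speed.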
In addition to the voice updates, OpenAI has added tools that improve its models’ handling of images alongside text. These changes help the AI understand pictures better, which improves visual search and object recognition. Developers can also fine-tune models with human input, improving the system’s ability to recognize objects and interpret images.
Other notable additions are “model distillation” and “prompt caching.” Model distillation lets smaller AI models learn from larger ones, making the smaller models more capable while reducing processing time and cost. Prompt caching, meanwhile, stores recently seen prompt content so that repeated input does not have to be processed again, which can save developers a significant amount of money over time.
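The idea behind prompt caching can be sketched with a small in-memory cache. This is a minimal illustration of the general technique, assuming the shared part of each prompt is a fixed-length prefix. It is not OpenAI’s implementation, and the names `PrefixCache` and `expensive_prefix_work` are hypothetical.

```python
import hashlib


def expensive_prefix_work(prefix):
    # Hypothetical stand-in for the costly processing of a prompt prefix.
    return len(prefix.split())


class PrefixCache:
    """Toy cache keyed by a hash of the prompt prefix (illustrative only)."""

    def __init__(self, prefix_len):
        self.prefix_len = prefix_len  # characters treated as the shared prefix
        self.store = {}
        self.hits = 0
        self.misses = 0

    def _key(self, prompt):
        prefix = prompt[: self.prefix_len]
        return hashlib.sha256(prefix.encode("utf-8")).hexdigest()

    def process(self, prompt):
        key = self._key(prompt)
        if key in self.store:
            # Prefix seen before: reuse the stored result.
            self.hits += 1
        else:
            # First time: do the costly work once and remember it.
            self.misses += 1
            self.store[key] = expensive_prefix_work(prompt[: self.prefix_len])
        # Only the suffix still needs fresh processing.
        return self.store[key], prompt[self.prefix_len:]


system = "You are a helpful support agent for ExampleCo. "
cache = PrefixCache(prefix_len=len(system))
cache.process(system + "Where is my order?")
cache.process(system + "How do I reset my password?")
print(cache.hits, cache.misses)  # 1 1
```

The second request shares its opening instructions with the first, so the prefix work is done once and reused. The savings grow with the length of the shared prefix and the number of requests that repeat it.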
These features matter to OpenAI’s business plan, since a large part of its revenue comes from companies that build AI-powered apps on top of its technology. The new features are expected to help OpenAI increase its revenue substantially. It hopes to make $11.6 billion next year, up from $3.7 billion in 2024.