OpenAI releases major updates that improve AI’s ability to see and hear

OpenAI has released major updates that substantially improve voice interaction and image recognition, making it easier for developers to build advanced AI apps. The updates also lower the cost of repetitive tasks and improve the performance of smaller models.

Satpal S

Published

October 2, 2024

OpenAI has released a series of updates aimed at improving the voice and vision capabilities of its artificial intelligence models. With these changes, developers can build AI-powered apps that combine those capabilities more smoothly, enabling real-time spoken conversations with AI and more accurate image recognition.

One standout feature is the Realtime API, which lets developers build voice-based AI apps with a single API call. The new API enables low-latency experiences by streaming audio input and output in real time, unlike older approaches that required multiple models working together to handle audio. As with ChatGPT’s advanced voice features, this makes conversations feel more natural.

Previously, real-time apps such as speech-to-speech chat were slow because audio input had to be fully uploaded and processed before a response could be returned. The Realtime API runs on OpenAI’s newest GPT-4o model, released in May 2024, so developers can now build fast voice exchanges that sound like a conversation with a real person. The model can handle text, vision, and audio data at the same time.
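The difference between the two approaches can be illustrated with a toy sketch. It does not call the Realtime API or any OpenAI library; the function names `batch_respond`, `streaming_respond`, and `chunks_before_first_output` are hypothetical and exist only for this illustration. It simply counts how many audio chunks must arrive before the first piece of output can be produced.

```python
from typing import Callable, Iterable, Iterator, List


def batch_respond(chunks: Iterable[str]) -> List[str]:
    """Older pattern: wait for the whole input, then answer once."""
    received = list(chunks)  # consumes every chunk before replying
    return [f"reply to {len(received)} chunks"]


def streaming_respond(chunks: Iterable[str]) -> Iterator[str]:
    """Streaming pattern: emit partial output as each chunk arrives."""
    for chunk in chunks:
        yield f"partial reply to {chunk!r}"


def chunks_before_first_output(
    respond: Callable[[Iterable[str]], Iterable[str]], chunks: List[str]
) -> int:
    """Count how many input chunks are consumed before the first output."""
    consumed = 0

    def counting() -> Iterator[str]:
        nonlocal consumed
        for chunk in chunks:
            consumed += 1
            yield chunk

    output = iter(respond(counting()))
    next(output)  # pull only the first piece of output
    return consumed


audio = ["chunk-1", "chunk-2", "chunk-3"]
print(chunks_before_first_output(batch_respond, audio))      # 3
print(chunks_before_first_output(streaming_respond, audio))  # 1
```

In the batch version the first reply depends on the full upload, while the streaming version can start responding after the first chunk, which is the source of the lower latency described above.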

Beyond voice, OpenAI has added tools that improve how its models handle images alongside text. These changes help the AI understand pictures better, which improves visual search and object recognition. Developers can also fine-tune models with human-provided examples, improving the system’s ability to recognize objects and interpret images.

Other notable additions are “model distillation” and “prompt caching.” With model distillation, smaller AI models learn from the outputs of larger ones, which makes the smaller models more capable while keeping processing time and costs low. Prompt caching, meanwhile, reuses previously processed prompt content so that repeated input does not have to be processed again, which can save developers a significant amount of money over time.
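To make the caching idea concrete, here is a minimal sketch, assuming a simplified scheme in which a shared prompt prefix (for example, a long system prompt) is remembered after its first use. This is not OpenAI’s implementation; the `PrefixCache` class and its token counting are hypothetical and only show why repeated prompt content becomes cheaper.

```python
import hashlib
from typing import List, Set, Tuple


class PrefixCache:
    """Toy cache that remembers which prompt prefixes were already processed."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()  # hashes of prefixes processed so far

    @staticmethod
    def _key(tokens: List[str]) -> str:
        joined = "\x1f".join(tokens)  # unit separator avoids ambiguous joins
        return hashlib.sha256(joined.encode("utf-8")).hexdigest()

    def process(self, prefix: List[str], suffix: List[str]) -> Tuple[int, int]:
        """Return (tokens_computed, tokens_reused) for one request."""
        key = self._key(prefix)
        if key in self._seen:
            # Prefix was handled before: only the new suffix needs work.
            return len(suffix), len(prefix)
        self._seen.add(key)
        return len(prefix) + len(suffix), 0


cache = PrefixCache()
system_prompt = "You are a helpful assistant".split()  # 5 toy "tokens"

first = cache.process(system_prompt, "What is AI ?".split())
second = cache.process(system_prompt, "Summarize this".split())
print(first)   # (9, 0)
print(second)  # (2, 5)
```

The second request reuses the five prefix tokens and computes only the two new ones, which is the kind of saving the article attributes to prompt caching.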

These features matter to OpenAI’s business plan, since a large share of its revenue comes from companies that build AI-powered apps on top of its technology, and they are expected to help the company grow that revenue significantly. OpenAI projects revenue of $11.6 billion next year, up from $3.7 billion in 2024.

Satpal S

Satpal is an Editor and Author at 4C Media Co, specializing in all stories and news related to crypto and finance.
