OpenAI releases major updates that improve AI’s ability to see and hear

OpenAI has released major updates that significantly improve its models' voice interaction and image recognition, making it easier for developers to build advanced AI applications. The changes also cut the cost of repetitive workloads and boost the performance of smaller models.

OpenAI has announced a series of changes aimed at strengthening the voice and vision capabilities of its artificial intelligence models. The updates let developers build more tightly integrated AI-powered applications, enabling natural real-time spoken conversation and more accurate image recognition.

One standout feature is the Realtime API, which lets developers build voice-based AI applications with a single API call. The new API streams audio inputs and outputs directly, enabling low-latency experiences, unlike older approaches that required chaining multiple models together to handle audio interactions. Much like ChatGPT's Advanced Voice mode, this makes conversations feel far more natural.

Until now, real-time applications such as speech-to-speech conversations were slow because audio had to be fully uploaded and processed before a response could be generated. The Realtime API runs on GPT-4o, OpenAI's latest model released in May 2024, which handles text, vision, and audio within a single model, so developers can now build fast voice exchanges that sound much closer to talking with a real person.
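To give a feel for how the Realtime API is used, the sketch below opens a WebSocket session and asks for a spoken response in a single request. It is a minimal illustration only: the endpoint, headers, and event names reflect the beta interface as announced and may change, and the audio playback step is left as a stub.

```python
# Minimal sketch of a Realtime API session (assumed beta WebSocket interface).
# Requires: pip install websockets; OPENAI_API_KEY set in the environment.
import asyncio
import json
import os

import websockets

URL = "wss://api.openai.com/v1/realtime?model=gpt-4o-realtime-preview"
HEADERS = {
    "Authorization": f"Bearer {os.environ['OPENAI_API_KEY']}",
    "OpenAI-Beta": "realtime=v1",
}

async def main() -> None:
    # Note: `extra_headers` is named `additional_headers` in newer websockets releases.
    async with websockets.connect(URL, extra_headers=HEADERS) as ws:
        # A single request asks for both audio and text output.
        await ws.send(json.dumps({
            "type": "response.create",
            "response": {
                "modalities": ["audio", "text"],
                "instructions": "Greet the listener in one short sentence.",
            },
        }))
        # Audio streams back incrementally as base64-encoded chunks.
        async for raw in ws:
            event = json.loads(raw)
            if event.get("type") == "response.audio.delta":
                pass  # decode event["delta"] and feed it to an audio player
            elif event.get("type") == "response.done":
                break

asyncio.run(main())
```

Because the audio arrives as a stream of small deltas rather than one finished file, playback can begin almost immediately, which is where the low-latency feel comes from.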

Alongside the voice changes, OpenAI has added tools that improve how its models handle images and text. Developers can now fine-tune models on their own labeled image data, sharpening the system's ability to recognize objects and understand visual content and, in turn, improving tasks such as visual search and object detection.
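As a rough illustration of that workflow, the sketch below prepares a single image-labeled training example and starts a fine-tuning job with the official openai Python SDK. The file name, image URL, label, and target model snapshot are placeholder assumptions, and a real training set would need many more examples.

```python
# Illustrative sketch of vision fine-tuning with the openai Python SDK.
# Requires: pip install openai; OPENAI_API_KEY set in the environment.
import json

from openai import OpenAI

client = OpenAI()

# One training example in chat format: an image plus the answer the
# fine-tuned model should learn to produce for it.
example = {
    "messages": [
        {"role": "system", "content": "Identify the product shown in the image."},
        {
            "role": "user",
            "content": [
                {"type": "image_url",
                 "image_url": {"url": "https://example.com/sample-product.jpg"}},
            ],
        },
        {"role": "assistant", "content": "Wireless over-ear headphones"},
    ]
}

# Training data is uploaded as a JSONL file, one example per line.
with open("vision_train.jsonl", "w") as f:
    f.write(json.dumps(example) + "\n")

training_file = client.files.create(
    file=open("vision_train.jsonl", "rb"), purpose="fine-tune"
)
job = client.fine_tuning.jobs.create(
    training_file=training_file.id,
    model="gpt-4o-2024-08-06",  # assumed vision-capable snapshot to fine-tune
)
print(job.id)
```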

Two other notable additions are model distillation and prompt caching. Model distillation lets smaller AI models learn from larger ones, making the smaller models more capable while cutting processing time and cost. Prompt caching, meanwhile, reuses the already-processed portion of repeated prompts so it does not have to be computed again, which can save developers significant money over time.
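A rough sketch of the distillation side of that workflow is shown below: completions from a larger "teacher" model are saved via the chat completions `store` flag so they can later seed a fine-tuning run for a smaller "student" model, while a long system prompt repeated across calls is where prompt caching pays off. The prompt text, metadata, and model names are placeholders.

```python
# Sketch of the distillation capture step, assuming the `store` and `metadata`
# parameters on chat completions. Requires: pip install openai.
from openai import OpenAI

client = OpenAI()

# Prompt caching reuses a repeated prefix once it is long enough; a real
# system prompt here would be far longer than this placeholder.
SYSTEM_PROMPT = "You are a concise customer-support classifier."

def teacher_answer(ticket_text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-4o",                    # larger "teacher" model
        store=True,                        # keep the completion for distillation
        metadata={"task": "ticket-triage"},
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": ticket_text},
        ],
    )
    return response.choices[0].message.content

# The stored completions can later back a fine-tuning job for a smaller model
# such as gpt-4o-mini, which learns to imitate the teacher at lower cost.
print(teacher_answer("My order arrived damaged, what should I do?"))
```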

These innovations are central to OpenAI's business strategy, since a large share of its revenue comes from companies building AI-powered applications on top of its technology. The new features are expected to drive further growth: the company reportedly aims to reach $11.6 billion in revenue next year, up from $3.7 billion in 2024.
