OpenAI Expands API With Advanced Voice and Translation Features
Artificial intelligence company OpenAI has introduced a suite of new voice intelligence tools within its API platform, aimed at helping developers build applications capable of speaking, translating, and transcribing conversations in real time.
Among the newly launched tools is GPT-Realtime-2, a voice model designed to generate more natural and realistic conversations with users.
OpenAI explained that the model incorporates GPT-5-class reasoning capabilities, allowing it to understand and respond to more complex user requests than its predecessor could.
The company also unveiled GPT-Realtime-Translate, a real-time translation system capable of keeping pace with live conversations.
The feature currently supports more than 70 input languages and can provide translated responses in 13 output languages, broadening accessibility for multilingual communication.
Another addition is GPT-Realtime-Whisper, a speech-to-text capability that enables live transcription during conversations.
OpenAI stated that the new technologies collectively move voice systems beyond simple question-and-answer interactions toward more advanced tools capable of listening, reasoning, translating, and taking actions during discussions.
The company noted that the technology could benefit sectors such as customer service, education, media, events, and content creation.
At the same time, OpenAI acknowledged concerns about possible misuse and said it has built safeguards into the system to prevent spam, fraud, and other forms of abuse, including automatically halting conversations that violate its guidelines on harmful content.
According to OpenAI, all the new voice models are now available through its Realtime API platform. Pricing for translation and transcription services will be based on usage time, while GPT-Realtime-2 will operate on token-based billing.
Source: TechCrunch

