OpenAI Testing AI-Generated Voice Mimicry in Limited Private Preview

OpenAI is testing a new AI-based voice technology in an effort to explore its capabilities while keeping it out of the hands of potential bad actors.

Voice Engine "uses text input and a single 15-second audio sample to generate natural-sounding speech that closely resembles the original speaker," OpenAI said in a blog post last week. It can create new audio from text using a single person's voice for reference, translate existing audio into another language while retaining the original speaker's tone and accent, or create new audio in a language that's different from the original speaker's.

Voice Engine has been around for a few years; its technology underlies OpenAI's text-to-speech API and ChatGPT's voice querying capability that was introduced last fall. At the end of 2023, however, OpenAI decided to start priming Voice Engine for eventual public consumption, beginning with just a "small group of trusted partners."
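
Voice Engine itself has no public API, but the text-to-speech endpoint it underpins is already documented and usable with OpenAI's preset voices. Below is a minimal sketch of that existing API via the official Python SDK; the model name, voice, prompt text, and output path are illustrative, and none of Voice Engine's voice-cloning features are exposed here.

```python
# Minimal sketch of OpenAI's public text-to-speech API (the endpoint the
# article says Voice Engine underpins). Only preset voices are available;
# Voice Engine's voice cloning remains in limited preview. Assumes the
# OPENAI_API_KEY environment variable is set.
from pathlib import Path

from openai import OpenAI

client = OpenAI()

response = client.audio.speech.create(
    model="tts-1",    # documented TTS model; "tts-1-hd" is the higher-quality variant
    voice="alloy",    # one of the preset voices; custom or cloned voices are not offered
    input="Voice Engine generates natural-sounding speech from a short text prompt.",
)

# Save the returned MP3 audio to disk.
response.stream_to_file(Path("speech.mp3"))
```

Note that the 15-second reference sample described in OpenAI's blog post has no equivalent parameter in this public API; voice cloning is limited to the trusted-partner preview.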

OpenAI did not indicate how long this limited private preview will last, or when it expects to make Voice Engine generally available. It is purposely taking a slow and measured approach "due to the potential for synthetic voice misuse," it said. OpenAI took a similar tack when launching its Sora text-to-video capability in February, making it available only to select testers.

Nevertheless, the Voice Engine testing group has already begun applying the model to real-world use cases across several industries. For instance, it's being used to help patients with speech conditions communicate in their own voices, using old video recordings of them as the reference audio. Content creators are also using it to translate their assets into other languages, reaching broader audiences. Other examples, with real audio snippets of Voice Engine in action, are available in OpenAI's blog post.

For all its capabilities, however, this technology, like Sora, is ripe for abuse. OpenAI said it is developing guardrails for Voice Engine to limit how much it can contribute to the spread of misinformation. For instance, Voice Engine does not let individual users create their own voices from scratch. In addition, audio created by Voice Engine carries watermarks that let OpenAI trace each snippet's provenance and monitor how it's being used.

OpenAI said it gave its testing group access to Voice Engine under several stipulations meant to discourage abuse. For instance, testers are not allowed to use Voice Engine to impersonate another person or organization without that party's explicit consent. They must also disclose to their audiences when the audio they're presenting is AI-generated.

To curb the broader harms of AI-generated audio, OpenAI also offers several recommendations for policymakers and developers:

  • Enforce a "no-go voice list" to prevent the impersonation of well-known figures.
  • Avoid voice-based authentication for critical systems.
  • Develop ways for individuals to protect the ownership of their voice.
  • Raise broad public awareness of the potential for AI misuse.
  • Fast-track the development of technology that can trace the provenance of audio and visual media.

"We recognize that generating speech that resembles people's voices has serious risks .... We are engaging with U.S. and international partners from across government, media, entertainment, education, civil society, and beyond to ensure we are incorporating their feedback as we build," OpenAI said.

About the Author

Gladys Rama (@GladysRama3) is the editorial director of Converge360.
