Microsoft Set to Launch AI-Driven Voice Cloning for Teams Calls

**Microsoft Teams Set to Launch AI-Driven Voice Cloning and Instant Translation**

Microsoft Teams users will soon be able to create digital replicas of their voices for real-time translation during meetings. The new capability, called “Interpreter,” was announced at the annual [Microsoft Ignite](https://ignite.microsoft.com/en-US/home) conference and reported by [TechCrunch](https://techcrunch.com/2024/11/19/soon-microsoft-will-let-teams-meeting-attendees-clone-their-voices/). The AI-powered tool translates a speaker’s words into other languages while preserving the sound of their own voice.

“Envision being able to express yourself just as you do in your native tongue, but in another language,” wrote Microsoft CMO Jared Spataro in a blog post. “The Interpreter function in Teams offers live speech-to-speech translation during meetings, and users can choose to have it mimic their voice for a more personalized and engaging interaction.”

The feature will launch for Microsoft 365 subscribers, with initial support for English, French, German, Italian, Japanese, Korean, Portuguese, Mandarin Chinese, and Spanish.

### Enhancing Accessibility, Yet Raising Issues

Interpreter could improve remote work and virtual communication for non-English speakers, though it does not yet offer the flexibility of a live human translator. The tool also raises concerns about security and about bias in the underlying technology.

A recent [study](https://www.pbs.org/newshour/politics/a-neurological-disorder-took-rep-jennifer-wextons-voice-ai-helped-her-bring-it-back-to-the-house-floor) found that AI transcription tools such as Whisper (which is used in Microsoft’s cloud services) are prone to errors, including fabricating information when transcribing medical content. The problem was especially pronounced for people with language disorders such as aphasia. Similarly, the much-publicized [Humane AI Pin](https://www.theverge.com/2024/4/18/24134180/humane-ai-pin-translation-wearables), which promised live translation, proved to be an unreliable substitute for human translators.

Asked about accuracy, Microsoft told *TechCrunch*: “Interpreter is designed to reproduce the speaker’s message as accurately as possible without making assumptions or adding unnecessary content. Voice simulation can only be activated when users grant permission through a notification during the meeting or by enabling ‘Voice simulation consent’ in settings.”

### Ethical Considerations and Possible Misuse

The technology could significantly improve accessibility, particularly for people with atypical speech. U.S. Representative Jennifer Wexton, for example, has highlighted the benefits of personalized voice cloning for people with speech impairments. But the rise of AI voice cloning has also raised alarms about unauthorized deepfakes and their potential use in fraud.

Microsoft’s voice cloning technology is reportedly [strikingly human-like](https://www.livescience.com/technology/artificial-intelligence/ai-speech-generator-reaches-human-parity-but-its-too-dangerous-to-release-scientists-say), which raises ethical concerns. Even Microsoft CEO Satya Nadella has called for stronger regulation and governance to counter the growing threat of AI-generated deepfakes, especially those involving celebrities.

### Growing Industry Interest in AI Voice Tools

Despite these concerns, demand for AI voice cloning is growing, driven by the broader AI boom. Last year, Apple unveiled its [Personal Voice](https://mashable.com/article/how-to-use-personal-voice-ios-17-apple-iphone) feature, which uses machine learning to generate a synthesized version of a user’s voice for real-time text-to-speech in apps such as FaceTime. Microsoft has also rolled out its own [Personal Voice](https://learn.microsoft.com/en-us/azure/ai-services/speech-service/personal-voice-overview) feature, powered by Azure AI, which supports 90 languages.

As the technology matures, it could transform communication and accessibility, but its ethical and security implications will need careful scrutiny.