TechCrunch AI • 104 days ago

DeepL, Known for Text Translation, Enters the Real-Time Voice Translation Market

IMP

8/10

Key Summary

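Text translation company DeepL has officially launched a suite of real-time voice-to-voice translation products, plus an API, covering meetings, mobile conversations, and group conversations for frontline workers. With add-ons for Zoom and Microsoft Teams and the ability to learn industry-specific vocabulary, the suite targets global business and customer support settings; the meeting add-ons are currently in early access. With this launch, DeepL builds on its text translation expertise to position itself as a strong contender in voice AI, and it has signaled plans for an end-to-end voice translation model that skips the text conversion step.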

Translated Article

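DeepL, a translation company best known for its text translation tools, today launched a voice-to-voice translation suite covering use cases such as meetings, mobile and web conversations, and group conversations for frontline workers through custom apps. It also released an API so that outside developers and businesses can build on DeepL's technology for customized use cases such as call centers.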

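"After spending so many years in text translation, voice was a natural step for us," DeepL CEO Jarek Kutylowski told TechCrunch in an interview. "We have come a long way when it comes to text translation and document translation. But we thought there wasn't a great product for real-time voice translation." Kutylowski explained that the biggest challenge in building a real-time translation product is balancing lower latency, the delay between someone speaking and the translated audio playing back, against keeping the translation accurate.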

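DeepL has released add-ons for platforms such as Zoom and Microsoft Teams that let listeners hear real-time translated audio, or follow real-time translated text on screen, while participants speak in their native languages. The program is currently in early access, and organizations can join a waitlist. The company also introduced a mobile and web-based conversation product that works both in person and remotely. It includes a feature that lets participants join a group conversation via QR code in settings such as training sessions or workshops.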

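DeepL's voice-to-voice technology can also learn and adapt to custom vocabulary such as industry-specific terms, company names, and personal names. Kutylowski said that AI will redefine what customer service looks like in the coming years. He added that a translation layer helps companies offer customer support even in languages for which qualified staff are hard to find and expensive to hire.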
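
The article does not say how DeepL's custom vocabulary works internally. As a purely illustrative sketch, one simple way to enforce preferred spellings of company and personal names is a glossary pass over recognized text. The function name `apply_glossary` and the sample glossary entries are hypothetical and are not part of any DeepL product or API.

```python
import re


def apply_glossary(text: str, glossary: dict[str, str]) -> str:
    """Replace whole-word, case-insensitive matches of glossary keys
    with their preferred renderings. Hypothetical helper, not a DeepL API."""
    if not glossary:
        return text
    # Try longer keys first so multi-word terms win over shorter overlaps.
    keys = sorted(glossary, key=len, reverse=True)
    pattern = re.compile(
        r"\b(" + "|".join(re.escape(k) for k in keys) + r")\b",
        re.IGNORECASE,
    )
    lookup = {k.lower(): v for k, v in glossary.items()}
    return pattern.sub(lambda m: lookup[m.group(0).lower()], text)


# Assumed example entries: likely misrecognitions mapped to correct names.
glossary = {"deep l": "DeepL", "kutilowski": "Kutylowski"}
print(apply_glossary("deep l ceo kutilowski spoke today", glossary))
```

A production system would more plausibly bias the speech recognizer and the translation model directly rather than patch their output, so this shows only the idea of a fixed term list.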

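DeepL said it controls the entire voice-to-voice technology stack. The current system, however, converts speech to text, translates that text, and then converts the result back to speech. DeepL believes its years of refining text translation give it a strong edge in translation quality. Going forward, the company plans to develop an end-to-end voice translation model that skips the text conversion step entirely.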
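
The three-stage flow described here (speech to text, text translation, text back to speech) is often called a cascaded pipeline. The sketch below shows only its shape, with toy stand-ins for each stage so it runs without any speech models. The class name `CascadedTranslator` and the `fake_*` functions are assumptions made for illustration and do not reflect DeepL's actual implementation.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class CascadedTranslator:
    """Speech -> text -> translated text -> speech, as three swappable stages."""
    transcribe: Callable[[bytes], str]   # speech-to-text stage
    translate: Callable[[str], str]      # text translation stage
    synthesize: Callable[[str], bytes]   # text-to-speech stage

    def run(self, audio: bytes) -> bytes:
        text = self.transcribe(audio)
        translated = self.translate(text)
        return self.synthesize(translated)


# Toy stand-ins: "audio" is just UTF-8 bytes of the spoken words.
def fake_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")


PHRASES = {"good morning": "guten Morgen"}


def fake_translate(text: str) -> str:
    # Unknown phrases pass through unchanged.
    return PHRASES.get(text.lower(), text)


def fake_synthesize(text: str) -> bytes:
    return text.encode("utf-8")


pipeline = CascadedTranslator(fake_transcribe, fake_translate, fake_synthesize)
print(pipeline.run(b"Good morning").decode("utf-8"))
```

Because each stage must finish before the next begins, their delays add up, which is one reason an end-to-end model that skips the text step could reduce latency.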

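DeepL competes with several well-funded startups in adjacent areas. Sanas, which raised $65 million last year from Quadrille Capital and Teleperformance, uses AI to modify a speaker's accent in real time, a tool aimed primarily at call center agents. Dubai-based Camb.AI focuses on speech synthesis and translation for media and entertainment companies, helping them dub and localize video content at scale. Palabra, backed by Seven Seven Six, the investment firm of Reddit co-founder Alexis Ohanian, is building a real-time speech translation engine that preserves not only the meaning but also the speaker's original voice, putting it in more direct competition with what DeepL is now building.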

Original Article (English)

DeepL, a translation company best known for its text tools, released a voice-to-voice translation suite today that covers use cases like meetings, mobile and web conversations, and group conversations for frontline workers through custom apps. The company is also releasing an API that lets outside developers and businesses build on top of DeepL's tech for customized use cases, such as call centers.

"After spending so many years in text translation, voice was a natural step for us," DeepL CEO Jarek Kutylowski told TechCrunch in an interview. "We have come a long way when it comes to text translation and document translation. But we thought there wasn't a great product for real-time voice translation."

Kutylowski said that the challenges in creating a real-time translation product center on striking a balance between reducing latency (the delay between someone speaking and the translated audio playing back) and maintaining accurate results.

DeepL is releasing add-ons for platforms like Zoom and Microsoft Teams, where listeners can either hear real-time translation while others are speaking in their native languages or follow real-time translated text on screen. This program is currently under early access, and the company is inviting organizations to join a waitlist. The company also has a product for mobile and web-based conversations that can take place in person or remotely. DeepL also lets users participate in group conversations in settings like training sessions or workshops, with participants joining through a QR code.

DeepL said that its voice-to-voice tech can also learn and adapt to custom vocabulary, such as industry-specific terms and company and personal names. Kutylowski said that AI is reimagining what customer service will look like in the coming years. He noted that a translation layer helps companies provide support in languages where qualified staff are scarce and expensive to hire.

The company said that it controls the entire voice-to-voice stack. However, the current system converts the speech to text, applies translation, then converts that back to speech. DeepL believes that since it has worked on text translation for years, it has an edge in translation quality. Going forward, the company wants to develop an end-to-end voice translation model that skips the text step entirely.

DeepL faces competition from several well-funded startups working in adjacent corners of the space. Sanas, which last year raised $65 million from Quadrille Capital and Teleperformance, uses AI to modify a speaker's accent in real time, a tool aimed primarily at call center agents. Dubai-based Camb.AI focuses on speech synthesis and translation for media and entertainment companies, helping them dub and localize video content at scale. Palabra, backed by Reddit co-founder Alexis Ohanian's firm Seven Seven Six, is building a real-time speech translation engine designed to preserve both the meaning and the speaker's original voice, putting it in more direct competition with what DeepL is now building.
Topics: AI, AI translation, DeepL, voice AI

Author: Ivan Mehta. Ivan covers global consumer tech developments at TechCrunch. He is based out of India and has previously worked at publications including Huffington Post and The Next Web.

DeepL, Voice AI, Real-Time Translation, Business