In a world where AI is making groundbreaking strides, Respeecher stands at the forefront of synthetic voice technology, helping media giants like Lucasfilm and Disney push the boundaries of storytelling. At the heart of this innovation is Margarita Grubina, Vice President of Business Growth at Respeecher, who has been instrumental in scaling the company and ensuring its ethical and technological excellence.
In our latest episode of AllAboutAI.com, I talked with Margarita to explore Respeecher’s evolution, how its AI-powered voice synthesis is transforming industries from entertainment to healthcare, and what the future holds for the ethical use of AI in media.
Join us as we dive into Margarita’s journey with Respeecher, its strategic goals, and its commitment to delivering the highest quality in Hollywood-grade AI voice technology.
Respeecher: Elevating AI Voice Technology for the Media Industry
From de-aging iconic characters like Mark Hamill’s Luke Skywalker to bringing historical figures back to life, Respeecher’s speech-to-speech technology is helping Hollywood and other industries push the boundaries of storytelling. Respeecher is revolutionizing the media industry with its cutting-edge AI voice technology, enabling creators to produce synthetic voices indistinguishable from the real thing.
Who is Margarita Grubina?
Margarita Grubina, Vice President of Business Growth at Respeecher, has been instrumental in scaling the company’s operations and building strong relationships with major clients like Disney and Lucasfilm. With over six years of experience in business development and sales, she has played a key role in driving Respeecher’s growth, ensuring the company remains at the forefront of AI voice technology while adhering to ethical standards.
Margarita also regularly participates in panels, podcasts, and international events such as the Cannes Film Festival and Google for Startups. Her leadership and expertise have helped shape Respeecher’s long-term strategy, positioning it as a leader in synthetic voice technology for the media and entertainment industries.
What is Respeecher?
Respeecher is a cutting-edge AI voice technology company specializing in speech-to-speech synthesis, enabling creators to replicate and transform voices with remarkable accuracy. Known for its Hollywood-grade quality, Respeecher’s technology has been used in major productions, including collaborations with Disney and Lucasfilm.
The platform allows users to recreate voices for creative projects, such as de-aging actors’ voices or bringing historical figures back to life while maintaining a solid ethical framework. Respeecher is committed to ensuring that all voice cloning is done with permission from voice owners, making it a trusted and responsible player in the AI voice technology industry. Beyond entertainment, Respeecher’s technology is being applied in sectors like healthcare, e-learning, and cybersecurity.
What inspired you to join Respeecher?
I have been working in business development for more than six years, and during that time, I worked with various companies providing outsourcing services. I started as a business development manager in a marketing agency. I also worked on an app that helped businesses offer parking slots to their employees in New York City.
Then, I joined a startup that set up beehives on skyscrapers for ecological benefits. This experience allowed me to work across different industries, gaining insights into how various sales processes are structured.
When I joined Respeecher, I had the experience to establish our sales operations and processes. Before I arrived, our CEO, Alex, handled all the sales alone. We received many incoming requests but did not have any formal sales structure. My role involved working closely with Alex to hire a sales team and build a sales process from scratch.
What key challenge shaped Respeecher’s growth strategy during your time there?
A pivotal moment for us was releasing our product, Voice Marketplace. This platform provides a library of voices for users. While it’s now easy to use, it was challenging back then because our models required high-quality data from both the cloned and input voices. This meant users had to send us a 60-minute high-quality recording, which was often hard to obtain from home recordings.
This was a key challenge, but overcoming it has been crucial to our growth. Voice Marketplace has expanded, and we’re adding voices from voice donors and celebrity voices. Today, our platform serves individual users, indie game developers, producers, large production studios, podcasters, and content creators. This shift has been a fundamental part of our development and long-term strategy.
What are Respeecher’s long-term strategic goals?
From the beginning, our primary goal was to make synthetic voice technology accepted by Hollywood. This was not a reality back then, but we achieved this goal in 2019 when we worked on our first project with Lucasfilm, and we’ve continued working with them since then.
Beyond Hollywood quality, one of our key goals has always been to operate ethically. We recognize that this technology can be dangerous if misused. It could create deceptive content that might manipulate people. That’s why we are committed to obtaining permission from voice owners before cloning their voices. Over the six years we’ve been operating, we have had zero cases of misuse. For us, it’s essential to maintain this ethical standard in everything we do.
What key insights or market research are crucial for growing and building an AI company?
It is essential to understand that Artificial Intelligence is already here, and it’s not going anywhere. It’s impossible to stop the progress of AI, but it will be regulated as we move forward. There are already regulations in place, and more will be introduced. It’s critical to consider these regulations when building an AI company. You want to ensure you remain compliant so that your business can continue to operate without being shut down due to regulatory changes.
Another important insight is to focus on building a sustainable company rather than just trying to get as many users as possible at any cost.
For example, at Respeecher, we could allow any user to clone any voice right now, but we always ask ourselves what the consequences will be five or ten years from now. Will regulations allow it? How will it impact society? This long-term vision is essential for creating sustainable growth and ensuring your company’s longevity.
How is Respeecher planning to shape the media industry’s future with AI voice technology?
Our mission is not necessarily to transform the industry but to help make things easier and achieve the impossible. For example, actors often dislike ADR (automated dialogue replacement) sessions or might be unavailable due to scheduling conflicts. Our technology can help eliminate the logistical challenges of ADR by allowing actors to complete these tasks without being physically present.
With Respeecher, it’s not about replacing actors but about helping them accomplish things they might not want to do or cannot do.
Our technology enables impossible feats, like de-aging Mark Hamill for The Mandalorian or bringing Elvis back on stage during America’s Got Talent. These kinds of projects boost creativity and allow for storytelling that wouldn’t have been possible without AI.
For example, we helped create a hologram of Vince Lombardi for the Super Bowl opening, which generated over 2 billion impressions online. These applications show how our technology makes things easier and pushes creative boundaries.
How does Respeecher maintain its brand as a leader in providing Hollywood-quality voice technology?
At Respeecher, we pride ourselves on being the most ethical, Hollywood-quality voice technology provider. We take every measure to ensure the ethical use of our AI. We always get permission from voice owners and thoroughly evaluate every project.
Even when a client has permission from the voice actor, a budget, and all the necessary resources, we sometimes decline projects if we believe using AI voice technology is inappropriate or could be harmful, such as in pranks or deepfakes.
Quality is another pillar of our brand. We have a dedicated team of sound engineers who have worked on major Hollywood and AAA gaming projects. They understand what high-quality production entails and ensure that every project we take on meets those standards. We are also very client-focused, aiming for significant, impactful projects rather than chasing large numbers of users.
How does Respeecher leverage events like Cannes and partnerships with Google to educate the industry on AI voice technology?
We are grateful for the opportunities provided by events like the Cannes Festival and partnerships with Google, primarily through Google’s support for Ukrainian startups. We are based in Ukraine, and Google for Startups has helped us maintain our operations and make connections within the industry.
Our primary goal in attending events like Cannes is to educate people about AI voice technology. It’s not about advertising or simply being visible; it’s about showing people what is possible with this technology and what they need to know before using it. We aim to raise awareness and share our insights with industry professionals so they can make informed decisions about AI.
How does Respeecher handle requests to recreate the voices of the deceased?
We regularly receive requests to recreate the voices of deceased individuals, but we approach these requests cautiously.
- First, it’s not always technically possible because we need high-quality audio recordings of the person’s voice to make it work.
- Second, we always consider the ethical implications. Will it help or harm the people involved, particularly regarding grieving?
There are many aspects to consider, and we have not yet delivered any projects like this.
However, we have had projects where recreating a voice from the past had a positive impact. For example, we recreated the voice of Manolo, a famous basketball commentator in Puerto Rico. When the Puerto Rican basketball team made it to the Olympics, his family was deeply moved to hear his voice again. While this project was successful, we remain mindful that this kind of technology can have a profound emotional impact and handle such requests carefully.
What distinguishes Respeecher from competitors like Sonantic and Descript in the AI voice space?
One key difference between us and competitors like Sonantic and Descript is that our core technology is speech-to-speech. Many companies offer text-to-speech, but our focus on speech-to-speech allows us to achieve the most natural and human-like performance.
Additionally, the quality of our output speaks for itself. Our audio files are used by top-tier companies like Disney, Sony, and Netflix, all of which demand the best quality available. Another differentiator is our strict ethical approach. We do not allow people to clone voices for deepfakes or other unethical purposes, such as the deepfakes of Joe Biden, Kamala Harris, Taylor Swift, and many others.
Every project is handled case by case, and we offer no automated system that lets users clone any voice they want.
How does Respeecher ensure content moderation on its Voice Marketplace platform?
On our Voice Marketplace platform, we have implemented automated moderation to flag any potentially controversial content. Our terms of use prohibit political, racist, sexual, or violent content. If something is flagged, it is either automatically banned or reviewed by a human moderator. Users who violate our terms are notified, and repeat offenders are banned from the platform.
How has Respeecher maintained a stable market position and built long-term trust with clients and voice owners?
Respeecher’s market position has remained stable because we know our strengths and have not changed them frequently. We have established trust with our clients and voice owners by doing things ethically and consistently. Clients and IP owners trust us because we have a clear and unwavering approach to business.
How has Respeecher expanded its voice technology solutions across various industries, from entertainment to healthcare and beyond?
We started with speech-to-speech but, over time, have expanded our technology to include text-to-speech, cross-lingual voice cloning, real-time voice changing, and accent conversion. We’ve grown from working primarily with feature films to serving industries like TV, podcasts, audiobooks, gaming, e-learning, and healthcare.
In healthcare, for instance, we help individuals who have undergone laryngectomy to recover their voices. Many electrolarynx devices provide the same voice to everyone, making it difficult to distinguish between users in the same room. Our technology can give these individuals personalized voices, which makes a huge difference in their daily lives.
We also work in cybersecurity, call centers, and other sectors, with opportunities for AI voice technology continuing to expand in fields like automotive and electronics manufacturing. The potential applications for AI voices are virtually limitless.
How does Respeecher ensure the quality and authenticity of the voices it synthesizes?
From the start, quality has been our primary focus. Our team includes sound professionals and machine learning engineers collaborating closely on model development. Our client-facing team understands our clients’ needs and communicates what is possible with our technology. If clients send us unsuitable materials, we explain the risks and limitations of working with that data, ensuring they have realistic expectations.
Once the model is developed, our sound engineers rigorously test and retrain it to guarantee perfect voice quality. Clients also have a final approval step to ensure satisfaction with the result. After the client approves the voice, we provide further support, including training them on using our tool to achieve the best possible outcomes for their project.
How did Respeecher use technology to make Aloe Blacc sing in five languages?
We worked on an exciting project with Aloe Blacc, where we made him sing in five different languages, even though he only speaks English. Singing is more complex than regular speech, but we ensured that our sound engineers tuned the output carefully to maintain quality.
Many of our sound engineers are musicians, so they fully understand the intricacies of working with music and vocals. This helped us successfully execute the project, ensuring the results were accurate and high-quality.
What are the key features of Respeecher?
Respeecher offers cutting-edge features, including speech-to-speech conversion, text-to-speech, cross-lingual voice cloning, and real-time voice transformation. For example, we can take input audio in one language and make it sound like the same person is speaking in another language, such as Ukrainian. Our real-time solutions allow users to change their voice in live scenarios—whether switching to a different gender or altering their accent instantly.
This is incredibly impactful for communities such as transgender individuals, who may feel uncomfortable with their voices. They can now use our real-time voice change technology to find a voice that suits their identity and personality better. We have also developed new real-time text-to-speech capabilities, which are increasingly popular for chatbots and other automated systems.
How does Respeecher ensure the privacy and protection of clients?
At Respeecher, we take data privacy very seriously. We never use a client’s data to train other models; no internal models are built using client data. When you provide us with data for a project, that data is exclusively used to create your model. Additionally, we always require permission from the voice owner to proceed with cloning. Every project involves three key parties:
- The IP owner: provides permission for the voice to be cloned.
- The client: produces the content and owns all deliverables related to their project.
- Respeecher: provides the technology.
Once the project is complete, if the client no longer needs the AI model, we delete both the data and the model. In some cases, if a project, such as a TV series, has future seasons in development, clients may request that we retain the model until they are ready to proceed.
What data sources does Respeecher use to train its AI models?
We receive data from our clients for each project. For example, if Lucasfilm needs to recreate Mark Hamill’s voice, they provide us with recordings of his voice, which we use to train our model. If another studio wants Mark Hamill’s voice, they must provide their own data. We cannot use the same voice model for different projects without fresh input from the new client.
In our Voice Marketplace, we use voices from real people who have donated their voice data to Respeecher. These voice donors are fully informed about how their voices will be used. Many came into our office to record their voices in our studio, where they also learned more about how our platform works.
Can you share an example of a significant feature update driven by user feedback at Respeecher?
We gather a lot of feedback from our users, especially during the development of the Voice Marketplace. We regularly set up one-on-one calls with beta testers or existing users to learn about their experiences.
Additionally, we maintain close contact with our B2B clients, like sound producers, post-production managers, and coordinators, to understand how they use our technology and what improvements they would find helpful.
One feature we implemented based on user feedback is the pitch correction tool. Sometimes, the input recording’s pitch doesn’t match the final output, especially if it was recorded at a higher pitch. This feedback led us to develop a pitch correction tool that resolves those issues.
We are always open to custom development for our long-term B2B clients. For example, we recently worked with a client to set up our models on their servers for data security and faster processing.
Another example is when a client needed our technology to work in Slovak, so we trained a Slovak model specifically for their project. Based on client feedback, we also added time control features for text-to-speech.
What are the key industry trends you’re seeing in AI voice technology?
Three major trends are shaping the AI voice technology industry right now.
- The first is awareness—people finally realize that AI tools exist, which was not the case when we started six years ago.
- The second is adoption—AI tools are integrated into work and personal life. Just a few years ago, people would ask, “What is AI voice cloning?” Now, the question is, “How can I use it in production? How do I use it ethically?” This shift in narrative shows that people are more comfortable using AI now and want to know how to apply it effectively in their workflows.
- The third major trend is regulation. Some regulations are already in place, but we expect more to be introduced in the coming years. These will shape how AI is used, especially to avoid potential misuse, such as deepfakes of Biden.
We hope these regulations come sooner rather than later, as it is essential to maintain trust in the technology and avoid harmful applications.
What advice would you give someone looking to enter the AI voice technology industry?
I advise anyone entering the AI voice technology industry to talk to people and understand their needs. If you have an idea for a solution, find a few people in the industry and ask them if your solution would help them. Learn about their pain points and adjust your offering accordingly. It’s crucial to know whether your idea addresses a real need before you invest time and effort into building it.
Can AI voice synthesis technology replace traditional actors, singers, and artists?
No. I don’t believe AI can fully replace traditional actors, singers, or artists, at least not yet. Our core technology, speech-to-speech, still requires human input—whether from an actor, a singer, or someone else. Human performance is necessary to achieve the most natural and authentic-sounding results.
AI also struggles with understanding emotional nuances. While text-to-speech systems might let you choose between “happy,” “sad,” or “angry,” real human emotions are much more complex. For instance, someone can simultaneously be happy and angry, which is difficult for AI to replicate. AI models evolve based on the data we feed them, so AI won’t learn these subtleties if we do not produce new human-driven content. There is still a long way to go before AI fully understands human emotion.
What would you say to those hesitant about AI and synthetic voice technology?
Don’t be afraid of AI—it’s already here. The best way to stop fearing it is to learn about it. Understand what the technology can and cannot do and how it fits your needs. It’s also important to recognize the potential risks to protect yourself and your family from negative consequences. Educating yourself about the technology is the first step toward maximizing its benefits while avoiding potential pitfalls.
FAQs
- Does Respeecher offer real-time voice synthesis for live applications? Yes.
- Is Respeecher scalable to support large-scale production needs? Yes, we recently completed a large-scale project, though I can’t share details due to NDAs.
- Does Respeecher integrate with popular audio production software like Pro Tools, Logic Pro, and others? Yes, we recently released a plugin for Pro Tools and are working on plugins for Reaper and other audio tools.
- Are there tutorials, webinars, or community forums to help users? Yes, we have a YouTube channel with demos and tutorials. Additionally, our tool has help videos, and our support team is available to assist users. We also have an FAQ section for common questions.
- Can voice synthesis be customized to meet the specific needs of different industries? Yes.
- Is there 24/7 customer support? No. We aim to provide round-the-clock support, but it’s not always possible, especially given the situation in Ukraine. We do our best, but occasionally, there may be delays due to circumstances like wartime conditions. For example, one of our sound engineers delivered Darth Vader’s voice for Obi-Wan Kenobi from a bomb shelter during the early days of the Russian invasion.
- Is the platform accessible on mobile devices? Yes, but I recommend using a laptop with a good microphone or headphones for better output quality.
- Can Respeecher’s AI technology detect and address discrepancies in synthesized voices to maintain consistency? No. We’re working on automating that process. For now, our B2B clients benefit from our team’s manual quality checks, but automation is in development.
- Is Respeecher compliant with international data security standards, including GDPR? Yes, we are fully GDPR-compliant and adhere to international data security standards.
Key Takeaways
Margarita Grubina’s insights into Respeecher highlight the company’s mission to revolutionize voice technology with AI while maintaining ethical practices. Respeecher offers cutting-edge speech-to-speech solutions that allow creators to replicate voices with Hollywood-grade quality, which major studios like Disney and Netflix use.
Margarita shared how Respeecher’s growth has been fueled by its commitment to technical excellence and responsible AI usage. By ensuring every voice cloned is done with full permission and focusing on projects that push creative boundaries, Respeecher has become a leader in AI-powered voice synthesis.
For more insights and ground-breaking AI stories, stay tuned!