How to Create a Professional AI Voice from Text | How to AI #5
Use AI to generate clear, professional voiceovers from text. This guide covers defining a voice persona, generating a detailed persona prompt with Gemini, and turning that prompt and your script into audio with OpenAI FM for social, product, and explainer videos.
Video Tutorial
Watch the tutorial above to see the complete process in action.
Step-by-Step Guide
Step 1: Access Gemini AI
- Go to Google's Gemini platform.
Step 2: Define Your Voice Persona Goal
- Clearly identify the kind of voice you need (e.g., professional, enthusiastic, deep, sharp).
- The goal is to generate a detailed "Voice Persona" that will act as a high-quality prompt for an AI audio generation tool.
- Gemini will fill in these details on its own, but for a better prompt, specify each element of the voice in detail:
  - Voice
  - Tone
  - Character
  - Features
Step 3: Generate the Voice Prompt with Gemini
- Based on the persona you defined, paste the following prompt into Gemini.
- You can adjust the persona details (tone, features) as needed.
- This is the prompt that we used:
Create a voice persona for an audio generation tool. The person should have the following elements: the type of voice, the tone of it, the delivery, the dialect, the features. I want the voice of a very enthusiastic broadcaster who is very, very happy introducing someone on the panel. and I want it to be an American accent and this is needed for an intro to a YouTube video. Give the response in plain text.
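If you plan to reuse this prompt for different voices, it can help to treat the persona details as fill-in fields. The sketch below is only an illustration: the `VoicePersona` class, its field names, and `build_persona_prompt` are hypothetical helpers, and the template is modeled on the prompt above rather than copied from it.

```python
from dataclasses import dataclass


@dataclass
class VoicePersona:
    """Hypothetical container for the persona details from Step 2."""
    character: str  # who is speaking, e.g. "a very enthusiastic broadcaster"
    tone: str       # how they sound or what they are doing
    accent: str     # dialect or accent
    purpose: str    # where the audio will be used


def build_persona_prompt(p: VoicePersona) -> str:
    """Assemble a Gemini prompt modeled on the one used in this guide."""
    return (
        "Create a voice persona for an audio generation tool. "
        "The persona should have the following elements: the type of voice, "
        "the tone of it, the delivery, the dialect, the features. "
        f"I want the voice of {p.character} who is {p.tone}. "
        f"I want it to be {p.accent} and this is needed for {p.purpose}. "
        "Give the response in plain text."
    )


# Example values taken from the prompt used in the video.
persona = VoicePersona(
    character="a very enthusiastic broadcaster",
    tone="very, very happy introducing someone on the panel",
    accent="an American accent",
    purpose="an intro to a YouTube video",
)
print(build_persona_prompt(persona))
```

Paste the printed text into Gemini as in the step above, then change the field values to request a different voice.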
Step 4: Generate the AI Voiceover
- Copy the detailed "Voice Persona" that Gemini generates.
- Go to OpenAI FM, a free AI audio generation tool from OpenAI.
- In the AI audio tool, paste the Gemini-generated persona prompt and then paste your script content. We used the following script in the video:
Ladies and gentlemen, welcome to the monthly, State of AI video, by This Week in AI.
- Generate the audio and review the results.
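If you would rather script this step than use the web page, the same persona-plus-script idea can be sketched against OpenAI's text-to-speech API. This is an assumption-heavy sketch and not part of the video: it assumes the `openai` Python package is installed, an `OPENAI_API_KEY` environment variable is set, and that the `gpt-4o-mini-tts` model and `coral` voice are available on your account. The helper names `build_speech_request` and `synthesize` are hypothetical.

```python
from pathlib import Path


def build_speech_request(persona: str, script: str) -> dict:
    """Pair the Gemini-generated persona with the script to be spoken."""
    return {
        "model": "gpt-4o-mini-tts",  # assumed model name; check your account
        "voice": "coral",            # assumed base voice; others may fit better
        "instructions": persona,     # the voice persona steers tone and delivery
        "input": script,             # the text that is actually read aloud
    }


def synthesize(persona: str, script: str, out_path: str = "voiceover.mp3") -> Path:
    """Send the request to the API and save the returned audio to a file."""
    from openai import OpenAI  # imported here so the helper above works without it

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    request = build_speech_request(persona, script)
    with client.audio.speech.with_streaming_response.create(**request) as response:
        response.stream_to_file(out_path)
    return Path(out_path)


# Example call (needs an API key and network access, so it is left commented out):
# synthesize("<persona text from Gemini>", "Ladies and gentlemen, welcome ...")
```

Keeping the persona in `instructions` and the script in `input` mirrors the two separate boxes you fill in on the web tool, so you can revise one without touching the other.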
Troubleshooting
Issue #1: The generated voice sounds robotic or lacks the required emotion (e.g., enthusiasm).
Solution: Return to Gemini and refine your original prompt. Add more specific, emotive adjectives for the tone, such as "high energy," "dramatic pacing," or "powerful projection," to force a more dynamic response.
Issue #2: The AI audio tool mentioned in the video (OpenAI FM) cannot be found.
Solution: Search for alternative, popular AI voice generation tools like ElevenLabs, Microsoft Azure Text-to-Speech, or others, and follow a similar process of applying the detailed voice persona prompt.
Need Help?
If you face any issues with this micro AI hack, email us at hello@thisweekinai.club
This guide will help you create high-quality audio without needing a professional voice actor.