I recently posted my first AI podcast on the channel and the response has been incredible, with lots of great questions about how it all came together. So today I'm pulling back the curtain and showing you exactly how to create your own, step by step. Creating an AI podcast starts with some form of documentation or a piece you have written that you want to convert into a podcast. In my case, I wanted to convert this free course I taught five months ago, on how to start a faceless YouTube channel for free, into a podcast. The first thing I did was to click on the link, hit share, and then copy this link.
We'll be using this link in the subsequent step. By the way, if you're new here, I am Gini and I make videos on how to use AI to create different types of faceless YouTube channels, as well as online businesses, to help you generate additional income. If this type of topic interests you, make sure to subscribe to get more videos just like this one. The first AI tool we'll be using for this process is NotebookLM. It's a personalized AI research assistant that uses Google's Gemini 1.5 Pro model.
It can do lots of things. It can become an expert in your documentation. Of course, it could create podcasts and discuss specific topics that you have given to it.
Once you come to notebooklm.google, you just sign up, and it brings you to the homepage. Click on create new.
It opens up this interface and asks you to give it a source. You could upload a file, such as a PDF, text, Markdown, or audio file, or pull from different sources. In this case, I'm going to click on link and choose YouTube. Go ahead and paste that link and click on insert.
It then gives me this pop-up window. There are a couple of things here. You can use it to create a study guide, a timeline, or a table of contents.
But this is what we are interested in. We want to do a deep dive conversation based on the link that we pasted. Go ahead and click on generate, and it's going to generate a conversation from what I had provided earlier. You can see that from a 49-minute video it generated a podcast of about 16 minutes. Let's play the audio so you can hear how it sounds.
"Wondered if you could be a YouTube star without ever showing your face? Exactly. Yeah, turns out you totally can. It's wild, right? It is, it really is. So we're doing a deep dive today."
As you can hear, this sounds very conversational. It's interesting how this just turned my almost one-hour video into something very engaging. The next step is to download this audio, because we're going to do a couple of things to it. Click on the three-dot button and click on download.
The next thing is to make a couple of edits to it, because we need to separate the conversation into two speakers. By the way, NotebookLM is free to use as of the time of recording this video. It might be monetized in the future, I don't know, but as of now it is completely free.
The next tool we'll be using is called Descript. It's a tool I use to make edits to my videos before I do the final edit as a whole. This particular tool is not free. Descript does have a free plan, but it doesn't give you much, and it has a couple of paid plans, just so you are aware. Once you come into Descript, click on new project, go to audio project, and it opens up this dashboard.
It'll ask you to upload the file. Upload the podcast we just downloaded and click on open. Now, what it's going to do is transcribe the audio file for me and give the speakers in this conversation different speaker labels.
Once it's done transcribing, you can see that it has labeled the different speakers it identified in the file. Go ahead and listen to the audio, make changes wherever you need to, and remove some things depending on what you want. Some people would like to use the exact voices from NotebookLM, but I would suggest not to, because if you're using this for a faceless YouTube channel, you want to know where your AI voice is coming from, just to be sure that you get monetized.
The next thing is to create what we call a sequence in Descript so we can separate the voices and export them separately. Let me quickly show you how that works. Come to projects, click on the ellipsis, and select create a sequence.
Go in and then separate the speakers into different tracks. You can create a new track here and then, for example, this is the second speaker, so you could move or copy the second speaker here and then delete this, and do the same thing for all the speakers. This method is easy if you want to use the voices as they are, but like I recommended, you would have to do something else, which I'll show you in the next step. Once you have this all separated, if you really want to export this out, you would have to click on solo track to mute one track, export it out, come back, mute the second track, and export it out. Then you have two separate audios, one for the first speaker and the other for the second speaker. Once you have that audio, you can proceed with that.
But for our workflow, we'll just click on done and leave it as it is. What we are going to do is export this as a transcript, because we're using it in the next AI tool. Go to publish and then, under export, choose transcript. I'm going to make sure that all of this is checked. We want to be sure that it includes speaker labels and includes timecodes.
You will see how useful this is down the road. Here it has multiple formats which you could export to, but for the sake of the tool we're going to use, we're going to export it as plain text.
Select plain text and click on export. We are done with Descript.
Before we dive deeper, I want to share something that's been a game changer for leveling up my content creation skills on projects like this one. Skillshare, our sponsor for today's video. If you're exploring content creation, whether for podcasts, YouTube videos, or more, Skillshare offers a vast selection of classes that can help bring your ideas to life.
It's the largest online learning community for creatives, with thousands of classes on everything from video editing to storytelling to productivity hacks, perfect for streamlining your creative process. I've been especially impressed by courses like audio editing and productivity for content creators by Ali Abdaal. These classes have given me powerful tools and insights that I've integrated into my own workflow. Plus, Skillshare's unique learning paths make it easy to go from beginner to pro with sequential classes that build on each other at your own pace.
And here's the best part. The first 500 people to use my link in the description will get one month free on Skillshare. So if you're ready to take your creative skills to the next level, grab your spot now. These go pretty fast.
The next AI tool we'll be using is ElevenLabs. ElevenLabs is the AI tool I love using because it gives me realistic AI voices and I can use these voices commercially. ElevenLabs has a free option as well, which you could try, and it also has paid tiers. Once you come to ElevenLabs, come in here under workflow and click on the voiceover studio.
It's a feature I discovered recently, and it's mind-blowing. You click on create new voice and name the project.
Go ahead and upload that audio file and click on create voiceover. What it's going to do is open a studio. This gives you the ability to bring in the transcript we had already created, and we could go ahead and make some changes to it.
Once the voiceover studio opens, you can see the background. We don't need this at this point, so we'll just remove it. Next, bring in the text we exported from Descript. Click on the gear button here, then click on import script.
Once you click on import script, it will tell you to import the script as a CSV file. Now, remember the file that we exported was in a text format. So we need to take this to ChatGPT, which is another AI tool, to convert it from a text file to a CSV file. It's important to note that when you're using the voiceover studio by ElevenLabs, the script needs to be in a specific format.
You can see it needs to be in the form of speaker, line, start time, and end time, or it could be just speaker and line. Just to show you what I did in ChatGPT: I fed it the transcript and asked it to convert this text file into this particular format.
ChatGPT went ahead and converted it for me into a podcast script CSV. Let me show you what the new format looks like. This is the text file that we fed ChatGPT, and this is what it looks like. And now this is what the generated CSV file looks like. As you can see, it has organized everything into columns: there's a speaker, a line, a start time, and an end time.
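The text-to-CSV conversion described here could also be scripted instead of done in ChatGPT. The following is a minimal Python sketch under stated assumptions, not the method used in the video. It assumes each transcript line looks like `[HH:MM:SS] Speaker: text`, which is a hypothetical layout, so the regular expression would need adjusting to match the real Descript export. The column names `speaker`, `line`, `start_time`, and `end_time` follow the columns named above, but their exact spelling is an assumption, and the timestamps in `SAMPLE` are made up for illustration.

```python
import csv
import io
import re

# Assumed (hypothetical) line shape: "[HH:MM:SS] Speaker name: spoken text"
LINE_RE = re.compile(r"^\[(\d{2}):(\d{2}):(\d{2})\]\s*([^:]+):\s*(.+)$")

# Column names are an assumption based on the fields named in the video.
FIELDS = ["speaker", "line", "start_time", "end_time"]

# Made-up sample; the timestamps are illustrative only.
SAMPLE = """[00:00:00] Speaker 1: Wondered if you could be a YouTube star without ever showing your face?
[00:00:04] Speaker 2: Exactly.
[00:00:05] Speaker 1: Yeah, turns out you totally can.
"""


def parse_transcript(text):
    """Turn labeled, timecoded lines into row dicts (times in seconds)."""
    rows = []
    for raw in text.splitlines():
        match = LINE_RE.match(raw.strip())
        if not match:
            continue  # skip blank lines or lines in an unexpected shape
        hours, minutes, seconds, speaker, line = match.groups()
        start = int(hours) * 3600 + int(minutes) * 60 + int(seconds)
        rows.append({"speaker": speaker.strip(), "line": line.strip(),
                     "start_time": start, "end_time": None})
    # Use the next line's start as this line's end; the last end stays unknown.
    for current, following in zip(rows, rows[1:]):
        current["end_time"] = following["start_time"]
    return rows


def to_csv(rows):
    """Serialize rows to CSV text, leaving an unknown end time blank."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        end = "" if row["end_time"] is None else row["end_time"]
        writer.writerow({**row, "end_time": end})
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_csv(parse_transcript(SAMPLE)))
```

Using the `csv` module rather than joining strings by hand means spoken lines that contain commas are quoted automatically, which is an easy thing to get wrong when building the file manually.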
This is what we'll be importing into ElevenLabs. We'll go ahead and click on this, and then I'll import the podcast transcript and click on open. It's going to load this transcript and automatically separate the voices into their own tracks, as you can see.
It has given the first speaker one track and the second speaker another track. Now, to generate our voices: like I said, you want to have control over your voices. I clicked on this gear and, for the male voice, I chose Mark, which is a natural conversational voice in ElevenLabs. You choose it and then click on close. Then click on the second gear and pick the second voice, which is the one I used.
Click on close. The next step is to generate the audio, because right now there's no audio generated. Go ahead and click on generate stale audio and it will start processing all these audios for you.
Sometimes it's going to clip the audio if the clips are too close together, so you're going to go through each clip to make sure that everything looks right. Once you have all your voices generated, here I could just change this to Mark, just to know which track I'm working on, and change this to Zara. You'd have to make sure that these lines are perfectly synchronized, one after the other.
This is what you're going to do. For example, if you want to change what Mark has said, which is something you cannot do if you are using Descript, you can remove anything you want, make changes, and regenerate, and it does that perfectly for you. Once you have everything rearranged and lined up one after the other, the next step is to export this out. We want to export it in such a way that Mark has his own audio and Zara has her own audio. Just go to export at the bottom here, and for the file format choose audio tracks zip file. Go ahead and click on export and it's going to download it as a zip file.
Once it downloads the zip file, you will see the voice of Mark and the voice of Zara in that zip folder. Now that we have those voices done, the next thing is to create our avatars for the podcast. I used Midjourney, which is one of my favorite generative AI tools ever. I'm not going to go in depth, because I'll be doing an in-depth tutorial on how I create images in Midjourney in the next tutorial. I'll just click on create to show you how the images were generated and some of the prompts that I used. This is the prompt I used for Zara. I asked for a realistic image of a black girl, front facing the camera, wearing round glasses.
And I went ahead to describe what I wanted. One good thing about this particular prompt, which I'll leave in the description box, is that I was able to tell Midjourney the specific type of lens and the kind of resolution. That's why she looks very captivating and almost realistic.
Once you have the image created, just go ahead and upscale it and download it. I did that for Zara as well as Mark, as you can see over here. Once you have those images, the next step is to animate them. For that I'll be using the next AI tool, Hedra.
It has a free plan, which you could try out, but it gives you about five videos per day with slow generation, and the video length you can create with it is only about 30 seconds. Right now I use the professional plan, which lets me create a single video of up to 12 minutes at once and gives about two hours of video per month. To show you how I created Zara's video, just click on image and select the image of Zara.
The good thing about Hedra is that it gives you different options depending on the type of video you're creating. Since we are creating an AI podcast, which is typically 16 by 9, we'll select that, and then I'm going to play around with this. It's important to make sure that the face is not overly close but sized just right, so it animates properly. Then, for the audio, click on upload and drop in the audio for Zara. Once you do this, you will see it say that the file has reached its limit. That's because when you export from ElevenLabs, the files in the zip folder are saved as WAV. You need to convert them from WAV to MP3 specifically.
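The video does not say which tool was used for the WAV to MP3 conversion. One common option is the `ffmpeg` command-line tool, and the sketch below shows how that could be scripted in Python. It assumes `ffmpeg` is installed and on the PATH, that it was built with the `libmp3lame` encoder, and that the zip has already been extracted to a folder. The 192k bitrate and the helper names `build_ffmpeg_command` and `convert_folder` are illustrative choices, not anything from the source.

```python
import shutil
import subprocess
from pathlib import Path


def build_ffmpeg_command(wav_path, bitrate="192k"):
    """Build the ffmpeg argument list that encodes one WAV file as MP3.

    The output is written next to the input with an .mp3 extension.
    """
    wav_path = Path(wav_path)
    mp3_path = wav_path.with_suffix(".mp3")
    return [
        "ffmpeg",
        "-y",                       # overwrite the output if it already exists
        "-i", str(wav_path),        # input file
        "-codec:a", "libmp3lame",   # MP3 encoder (assumed to be available)
        "-b:a", bitrate,            # audio bitrate; 192k is an arbitrary pick
        str(mp3_path),
    ]


def convert_folder(folder):
    """Convert every .wav file in a folder and return the new .mp3 paths."""
    if shutil.which("ffmpeg") is None:
        raise RuntimeError("ffmpeg was not found on PATH")
    converted = []
    for wav in sorted(Path(folder).glob("*.wav")):
        command = build_ffmpeg_command(wav)
        subprocess.run(command, check=True)  # raises if ffmpeg reports an error
        converted.append(Path(command[-1]))
    return converted
```

Calling `convert_folder("tracks")` on the extracted folder (the folder name here is hypothetical) would leave an MP3 beside each speaker's WAV. A lower bitrate gives a smaller file if the upload limit is still a problem.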
I'll use the one I had converted and click on open. It's going to load the 10-minute audio. I'm not going to generate this because it takes two to three hours to be generated, but you could just go ahead and click on generate, and it will start generating here.
Let me show you the ones I generated earlier. This is Zara and this is Mark. Once a video is generated, you go ahead and click on it.
You will notice that Zara is speaking only in places where she spoke during the podcast, and so is Mark. We'll be merging these two pieces of footage together in subsequent steps.
The next thing is to download the clip. Click on this and click on download, and it downloads to your computer. Once the video is on your computer, you will notice when you open it that the quality is not the best. Let's look at it. When you put it on a larger screen and play it, you can see how blurred it is and how flickery it looks. This is where upscaling comes into play.
I'm going to mention the tool I use to upscale the videos that you saw on the channel. It's pretty expensive, but I will also give you an alternative. For upscaling the video, I use Topaz, their Video AI. It costs about 299 USD to buy for a year, and after a year it renews. It's pretty expensive, I know, but there's also a free alternative, which I will show you. First, let me show you what I do using Topaz. After opening it, you add the video file by clicking on plus. Once you place it, you can see that the original came in at 896 by 512 pixels, which is pretty low. What you could do is change the output resolution here to 4K, or any other one that you feel like using.
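To get a feel for how large that jump is, here is a small Python sketch of the arithmetic. It assumes "4K" means 3840 by 2160 (UHD), which the video does not specify, and the function names are illustrative. It computes the largest uniform scale factor that keeps the 896 by 512 source inside the target frame without cropping.

```python
def scale_factor(source, target):
    """Largest uniform factor that fits source (w, h) inside target (w, h)."""
    source_w, source_h = source
    target_w, target_h = target
    return min(target_w / source_w, target_h / source_h)


def scaled_size(source, factor):
    """Pixel dimensions of source (w, h) after scaling by factor."""
    source_w, source_h = source
    return (round(source_w * factor), round(source_h * factor))


SOURCE = (896, 512)    # resolution reported for the downloaded clip
UHD_4K = (3840, 2160)  # assumed meaning of "4K"

factor = scale_factor(SOURCE, UHD_4K)
print(factor)                       # 4.21875
print(scaled_size(SOURCE, factor))  # (3780, 2160)
print(scaled_size(SOURCE, 2))       # (1792, 1024)
```

The source is slightly narrower than 16 by 9 (896/512 is 1.75), so a full-height 4K upscale comes out 3780 pixels wide rather than 3840. A plain 2x upscale of the same clip lands at 1792 by 1024, which is still short of 1920 by 1080.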
I have tested all these AI models, and for the podcast specifically I used Proteus. For recover details, I played around with it a little bit, but I like to increase it so it can recover the details of the face of the character in the video. Then basically I don't do anything else. I just go ahead and click on export. Once you click on export, it will ask you to choose the name and where you want to save it.
Once you click on save, it starts exporting. Now, because this is a 10-minute video, it is going to take a bit of time, roughly an hour, to upscale. Once this is done, double click on it.
You can see how clear this is now compared to what we had before. This is how I upscaled both videos. Now, the free alternative you could use is the CapCut video upscaler. I've talked about the CapCut video upscaler before. It doesn't give you the same amount of finesse when it comes to quality, but it is an option. You can click on create new, go to video upscaler, and then upload the video right here. It can only upscale up to 2x.
All right. Now that we have created our audio, created our transcript, used it to separate the voices and make them sound better than what NotebookLM originally generated, generated our characters, animated them, and upscaled them, the next thing is to put everything together. The final step to make sure that everything comes together is to edit our clips and put some graphics in our podcast. And I'll be using CapCut.
CapCut has a free plan which you could use to get started as well as a pro plan depending on what you're using it for. But the free plan is good enough to get started. I've gone ahead to import all the clips that I'm going to use with the graphics that I want to show within the podcast.
Now in the next couple of tutorials, I'll be showing you as well how these graphics were created with AI. The first thing you need to do is to drag Mark into the timeline. Mark is the person that started the conversation, right? And then we'll go ahead and drag in Zara on top of Mark.
If you drag Zara in at the very beginning, you will notice that Zara isn't saying anything while Mark is speaking. What you need to do is listen through the clip and notice when she starts speaking. I'll go ahead and bring that here, because this is where she comes in, right after Mark says the line about being a YouTube star.
All right, so you would go ahead and layer it this way. Next, I normally make sure that this sits properly within the 16 by 9 frame, and this one as well. But Zara already sits within the 16 by 9 frame.
So you would go ahead and listen to it and start making cuts. But let me show you something that is very important. If you noticed, in the podcast I placed Mark and Zara side by side. We're going to click on Mark and move him to the side.
Then we click on Zara and move her to the other side. Then click on Zara, click on horizontal, and go ahead and switch this.
Then you move it around till you get the direction that you want. I like this. As you can see, when I scrub through it, you can see both of them at the same time, side by side.
Here at the beginning, when Mark is speaking, you can press B on your keyboard. It brings up the cut tool, and you can split the clip.
What this does, as you can see here, is leave a black space. Just for this specific section, you restore Mark to his original position. Let's go through it.
You scrub and you notice that it splits. This is fantastic the way it is, without you doing much more.
But if you want to add graphics in between, you can now drag the graphics into the timeline where you want them. Once you are done, you can select these clips and make them a compound clip. Select them, right click, and choose create compound clip, and it compounds everything into one single clip. Then you can export it. To do that, click on export and it opens up this window. You can name the file, choose the location where you want it, and choose the resolution. Obviously you want it to be 4K, or you can choose whichever resolution you deem fit. Go ahead and click on export, and it exports to the location of your choosing.
And then you could go ahead to upload it on your YouTube channel or any platform that you want to use it for. In upcoming videos, we'll dive even deeper into all the stunning graphics I use in the AI podcast. You definitely don't want to miss those tutorials. So be sure to subscribe and turn on the bell notification to be the first to know when they go live.
Thanks so much for watching. I'll see you in the next one.