In this video, I'm going to show you the single AI automation skill that can help you generate an extra $200,000 in your AI agency in 2025. And it's building a RAG system, otherwise known as a Retrieval Augmented Generation system. These systems combine large language models like ChatGPT and Claude with an external knowledge base that is stored in your own database. The system takes in normal prompts, just like any LLM, but it first searches your private database for relevant expert knowledge. Then it combines that knowledge with the LLM, whether that's ChatGPT, Claude, or something else, and finally sends the answer back to the user. This allows you to build your own AI tools and provide answers that are grounded in up-to-date facts and the domain-specific knowledge your company possesses. Building these systems will be the number one skill you need in 2025 to stand out from the competition. In this video, I'll show you the big problem you can solve for businesses and how to profit from it.
And I will also break down exactly how this works with a live demo using an Airtable database and an AI vector database platform, pinecone.io. All right, so first, let's talk about the big problem and how we can profit from it. A lot of companies have specialized data that they want to use with AI so that they can build other tools. For example, I've got all these YouTube videos, all these videos from my courses, and all these posts that I've made in my community.
And it would be great if I could take all of that content and repurpose it into other formats. Or a company might have a lot of technical documentation that they'd like to index so it can be used with customer service and automated voice agents. There is a lot of untapped value and profit to be made by unlocking that data for companies. But not many AI automation experts are proficient at building these slightly more complex systems, like this RAG system here. So if you can learn how to build them, it's a wide-open field without much competition. It's also a great opportunity because the types of companies that have this kind of data are already well funded, and they can pay for true expertise.
Now, consultants are helping companies build out these solutions right now, but they do have limitations, which gives you an opportunity. So let's discuss the more common solutions and their limitations. One way people are trying to solve this problem right now is that they simply take an existing knowledge base, maybe a PDF or a txt file, and use the contents of that knowledge base directly in their prompts when they are building out their automations. So I'm using Claude here. I could add a message, we'll create a prompt, we'll use text.
I could do something like: use the knowledge base below to write an article about X, Y, and Z. And then, of course, you either cut and paste the entire contents of your knowledge base directly into the automation, or maybe you load the information dynamically and pass in a variable. Either way, this can work up to a certain size. But once your knowledge base starts to grow, like this example here, once you have hundreds of videos, Skool posts, and Skool courses and modules, you're not going to be able to cut and paste or embed that much content into these modules every single time. It's just far too much data for Claude to process at any given time.
So you can directly insert your knowledge base as a PDF, but it's just not always practical or efficient. Now, you do have GPTs and assistants, which are good in some cases. I even have a GPT for the No-Code Architects Toolkit API, but GPTs can't be used in automations directly. GPTs only work well in this particular type of interface, where your users just talk with them.
It's only in this situation that they work well. If you want to integrate your knowledge database into your automations, GPTs just aren't going to work. Now, GPT assistants do work with automations, but in my experience, they don't work as well as GPTs. You can build out your own custom assistants in the OpenAI Playground.
You can upload your own knowledge base, and you can define how that assistant should behave using the system instructions. But in my experience, these assistants just don't perform as well as GPTs.
And with GPTs or assistants, you can't really control how the data is structured, and that will limit the performance and flexibility of your solution. So when I upload data into the assistant, I can't really control what type of database is being used under the hood or how everything is indexed.
So I really have to rely on however they decided to do it. Now again, it works okay, and it works even better inside of a GPT. Here you can see I have a knowledge base where I've uploaded information about the No-Code Architects Toolkit. But again, I'm limited in how this is indexed into the system and how I can actually use it. So the secret here is learning how to build out your own RAG systems.
How do you use your own database and various APIs to connect to other data sources, sync that data back into your database, and then sync it with a vector database like Pinecone? From there, you can use that data with the questions that come into your automated tool: we retrieve our own expert knowledge from our own database, work with ChatGPT or Claude to shape the answer the way we want, and then finally respond back to the user. Now, one key thing I want to articulate about mastering these RAG systems and how to profit from them: the true skill in 2025 is going to come from those who can manage data, connect to various API endpoints, pull that data in a consistent manner, maintain their own database they can query against, and then use that to build smarter AI solutions.
And so while other AI automation companies focus on flashy new AI tools, you're going to focus on how to manage data, so you can really set yourself apart with a high-income skill: integrating these private knowledge bases with the AI tools that companies are already using internally on a day-to-day basis. All right, so now let's break down this diagram here, which lays out a RAG system that I have built. The tools I used to build it were Airtable, Make, and a vector database platform called Pinecone.
And I'm going to show you all the important features of this entire system, but I'm going to start on this side, with our Airtable database, and then talk about data collectors and how we get that data from our Airtable database into a vector database. Again, Pinecone is that vector database, and I'll explain a bit more about what that means and why it's cool. Then we'll talk about how we actually interface with that vector database and an LLM like ChatGPT or Claude to produce highly refined answers to the questions that come into our system. And to make this real, what I have here is a database that has been populated with all of my YouTube content and all of the course material from my Skool community. Plus, I've also indexed all of the actual written posts that I've made in my community as well.
Now, how did I actually sync all of this data, including the URLs, the thumbnails, the titles, the descriptions, and the transcriptions, either in doc format, where if we open this up we can see the transcription, or written directly to a database field like this? The process for syncing all of this data and keeping it organized is handled here, on this side of our diagram.
And so we'll come back and cover that in a minute. What we're going to talk about first is how we get all of that data in our relational database, our Airtable database, where we can see all of these columns and rows of data, and how we actually sync it with the vector database, pinecone.io, which will allow us to integrate more seamlessly with our LLMs. Now, the most important process for syncing our data in Airtable with pinecone.io is this here: the data collector.
Now you're going to see here I have data collectors scattered throughout this diagram and I'll explain them in more detail. But at the core, a data collector is responsible for syncing data between two different sources and also making sure that that data is in good condition. We have to be able to trust that the data that we have here is the same data we have here.
We want to make sure that when new rows come into our Airtable database, they show up in our vector database. We want to make sure that when there are updates to our Airtable database, those are also updated in our vector database. We want to make sure that the data collector is also able to find duplicates and remove them if need be.
The data collector also needs to make sure that there are no data gaps, that we aren't missing information in our vector database that exists in our Airtable database. We also want to make sure that the process of getting this data from here to there is cost efficient. If you want this vector database to be updated on a regular basis and the data is changing all the time, it's important to think about the efficiency of that process so you're not spending a lot of money just to maintain the database.
And we also want it to be fault tolerant. We don't want something to happen here, or in this process, that leaves these two systems with different data. So the data collector is something that you build, and it's going to be different for every situation.
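To make that concept a bit more concrete, here is a minimal sketch of what a data collector's core loop could look like in Python. This is not taken from the actual Make automations; the class name, the helper methods, and the choice of a URL as the dedupe key are illustrative assumptions that simply mirror the criteria above.

```python
# Hypothetical sketch of a "data collector": sync changed rows from a source
# into a destination while satisfying the criteria above (new records, updates,
# no duplicates, no gaps, cheap to run, fault tolerant).
import logging

class DataCollector:
    def __init__(self, source, destination, key_field="url"):
        self.source = source            # e.g. a wrapper around an Airtable view
        self.destination = destination  # e.g. a wrapper around a Pinecone index
        self.key_field = key_field      # unique key used to detect duplicates

    def run(self):
        # Only pull rows that are new or modified since the last sync,
        # so the process stays cost efficient.
        for row in self.source.fetch_new_or_modified():
            try:
                key = row[self.key_field]
                if self.destination.exists(key):
                    self.destination.update(key, row)   # handles updates
                else:
                    self.destination.insert(key, row)   # handles new records
            except Exception:
                # Fault tolerance: one bad row should not corrupt the whole sync.
                logging.exception("Failed to sync row %s", row.get(self.key_field))
```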
There is not one single format that you will use in every situation; it's more a concept, so that when you go to build out this integration, you make sure you satisfy all of these criteria and can actually trust your solution. So if we look at this data collector here, the one that syncs our Airtable database with our vector database, we can take a look at this Make automation. Now, again, I want to point out that you really only need two simple automations to keep these two items in sync and run this entire process. This part of the system is actually the easiest part to build out. Even though something like a vector database and Pinecone might be a new concept, putting this together is pretty straightforward. It's keeping all of the data in sync, relevant, and updated in Airtable that is the hard part, and we'll talk about that in a bit. So now let's take a look at the data collector that connects Airtable and the vector database. Again, we have our Airtable database. It's full of YouTube videos and Skool courses, as you can see here in the source. We can also have Skool pages and Loom trainings. You could really have anything you want in here; you just have to add it and then develop the necessary data collector to connect up the API that you want to pull data from. And then we have this automation here, which is searching records from a specific view in Airtable. So we can see that this search is always looking at this view in Airtable, Sync Pinecone, and we have that view right here.
And basically, the way this filter is constructed, only new and modified rows will show up in this particular view. When those records come into the view, they trigger this automation, which ultimately syncs them into Pinecone. And so, coming back to our definition of a data collector: in this specific scenario, where we are syncing data from Airtable to the vector database, we are controlling the sync of new records and the updates to existing records using this particular view and the filter that powers it.
And so this technical design choice, the view and its filter, lets us satisfy the first two elements of the data collector. Now I'll continue on here. The first thing that happens is, of course, we grab the new and modified rows that we want to sync from this view, and then we loop through all of those rows here.
The next thing that happens in this particular automation is that we download the transcription document, and I'll jump back to Airtable. For every one of these YouTube videos, you can see that we have a transcription document. So this automation downloads that document and then parses it into data that we can use within the automation.
So you can see here we had 12 operations. That means there were 12 rows that we pulled into this particular search. And now we have the data here, and we have that plain-text transcription that we can use and ultimately store in our Pinecone database for retrieval later, when we are processing requests coming into our own AI tool. Now, these modules here are just helping us clean up some data for the steps coming up.
And then this module here is helping us break up that transcription text. Again, we'll open this up and take a look at the data. We have all of this transcription text, but sometimes you need to break it up into smaller chunks for databases like Pinecone. So this module here is actually using the No-Code Architects Toolkit.
It's using this small bit of code to automatically chunk the data into smaller pieces. You can see we're sending in the transcript right here. But if we look at the output for each of these operations and look at the data in the response, there is an array, and each element is just a chunk of the larger transcript. The transcript we have here was broken up into five smaller parts.
It's also worth noting that each one of these chunks slightly overlaps with the previous one. So right here in section five, you can see it starts with "your part of the no-code architects," and in the previous chunk you can see that same bit of text, so there's a small amount of overlap. That overlap preserves context across the chunk boundaries, so related segments still connect down the line even though you've broken them up into smaller pieces. And then this part of the automation simply loops through all of the chunks that come out of this module here.
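Under the hood, chunking with overlap is a simple idea. Here's a rough Python sketch of the concept, not the actual Toolkit code; the chunk size and overlap values are arbitrary placeholders.

```python
# Illustrative sketch of chunking a transcript into overlapping pieces.
# The real No-Code Architects Toolkit endpoint may do this differently;
# chunk_size and overlap here are example values only.

def chunk_text(text: str, chunk_size: int = 1500, overlap: int = 200) -> list[str]:
    chunks = []
    start = 0
    while start < len(text):
        end = min(start + chunk_size, len(text))
        chunks.append(text[start:end])
        if end == len(text):
            break
        # Step back by `overlap` characters so neighboring chunks share context.
        start = end - overlap
    return chunks

# Example: a long transcript becomes a handful of overlapping chunks.
parts = chunk_text("some very long transcript text... " * 200)
print(len(parts), "chunks")
```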
And the first thing we need to do is take the text from each of the individual chunks we've broken out of the larger transcript. Then, for each one of those chunks, we need to create something called an embedding, which is a numeric representation of the text data. So you can see here we're using OpenAI, the same API behind ChatGPT.
And again, this is an API call: we're calling the embeddings API endpoint and sending it the text value. What it responds with is a series of numbers. Continuing to open this up.
Here's the collection here. Here are the embeddings. It provides us these numbers, which are a numeric representation of the original text data that we sent it.
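If you were making the same call in code instead of Make, it might look roughly like the sketch below, using OpenAI's embeddings endpoint. The model name is just one common choice, not necessarily what this automation uses.

```python
# Rough sketch: turn each text chunk into an embedding vector via OpenAI's API.
# Assumes OPENAI_API_KEY is set in the environment; the model is an example choice.
from openai import OpenAI

client = OpenAI()

def embed(texts: list[str]) -> list[list[float]]:
    resp = client.embeddings.create(
        model="text-embedding-3-small",  # example model
        input=texts,
    )
    # One vector (a list of floats) per input text, in the same order.
    return [item.embedding for item in resp.data]

vectors = embed(["first transcript chunk...", "second transcript chunk..."])
print(len(vectors[0]), "numbers per chunk")
```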
And I can't say I know exactly how the math works, but these numbers are what let you quickly find relevant data when you're searching your big database for something specific. So for every one of those chunks that we're looping through right here, we create one of those embeddings, and then we check the Pinecone database to see if that vector already exists.
In this case, we're building the vector's ID from the URL, because that's a unique value. Remember, if we jump back to our data collector, we need to ensure there are no duplicates, and we do that by using the URL as a key. That way, if there is a duplicate, meaning we have the same URL, we can find it and remove it.
And so this is helping us check whether there is already a vector, so we know whether we need to create a new vector or update an existing one. And if we take a look inside our Pinecone database, we can see that we already have some records, and there is some text here, which is a portion, or chunk, of a transcript for a particular YouTube video. In this case, the YouTube video you can see in the pop-up.
So you can see every one of these entries in the Pinecone database has these vector values, but it's those values we use to find the data, not the text itself. The text stored alongside them in Pinecone is what we eventually send to the LLM to generate the responses; the vector values are what we use to actually find the right records in Pinecone, so we can send the proper text to the LLM.
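Put together, the store side of this automation could be sketched in code something like this. It's not the actual Make scenario, just an illustration; the index name, the metadata fields, and the URL-plus-chunk-number ID scheme are assumptions based on what's described above.

```python
# Illustrative sketch: store each transcript chunk in Pinecone with a
# deterministic ID derived from the video URL, so re-running the sync
# updates existing vectors instead of creating duplicates.
import hashlib
from pinecone import Pinecone

pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")     # placeholder key
index = pc.Index("youtube-knowledge-base")         # hypothetical index name

def upsert_chunks(url: str, title: str, chunks: list[str], vectors: list[list[float]]):
    records = []
    for i, (chunk, vector) in enumerate(zip(chunks, vectors)):
        # Same URL + same chunk number -> same ID, so updates overwrite
        # rather than duplicate (the "no duplicates" rule of the data collector).
        vector_id = hashlib.md5(f"{url}-{i}".encode()).hexdigest()
        records.append({
            "id": vector_id,
            "values": vector,
            "metadata": {"url": url, "title": title, "text": chunk},
        })
    index.upsert(vectors=records)
```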
So these two modules here keep the Airtable database in sync, and we do a little bit of cleanup here just to maintain the Airtable database, but there's nothing there critical to the overall process. So now that shows us how we're getting data into the vector database. This data collector is making sure that all the new records and all the updated records come across, that there are no duplicates, that there are no data gaps, that it's cost efficient, and that it's fault tolerant. The data collector is trying to ensure all of those things are true, so that when we use this vector database in the solution over here, we have data that we can trust. And I didn't specifically cover this yet, but the way this data collector keeps things cost efficient is that we use Airtable automations as much as possible to process the data coming back from the API, rather than Make automations, which can obviously add up if you're processing a lot of data. So when we're processing all of this data on this side of the process, we use the Airtable automations to handle as much of the data sync as we can, to avoid racking up a ton of operations here in Make, which are a lot more expensive. So now, moving along to the part of the process where we actually take in a question or request from the user: we query the vector database for expert information that is relevant to the question, and then we send it to an LLM like ChatGPT or Claude for the final, revised answer, which is then passed back to the user. To represent this entire section, I have this automation, and it's quite simple. This here is just representing a question from a user.
So this is just used for testing, but it could easily be connected up to something like Slack, where you could ask an assistant a question. That would be like inserting the question from Slack directly into this module here, and then, when we finally get the answer on the back end, we respond back to the user. But now, to cover the guts of this automation, which again, I like to point out, is fairly simple given all of the power here: in this section, we create a prompt to generate search terms for our Pinecone database. Remember, we need to search our Pinecone database for information relevant to the query that comes in at the beginning of the automation. In this scenario, the question is: I want to create an article on how to start an AI automation Skool community.
So the first thing we need to do is ask ChatGPT: hey, can you create some search terms that we can use in Pinecone, so we get the data we want, which we can then send to ChatGPT, or Claude in this case, to actually write the article? I'm using both ChatGPT and Claude just to see how the output differs. So here we are coming up with the search terms.
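In code, that query-expansion step might look something like the sketch below; the model name and the exact prompt wording are assumptions for illustration, not what the Make module actually sends.

```python
# Rough sketch: ask an LLM to turn the user's request into a handful of
# short search terms we can embed and use to query Pinecone.
from openai import OpenAI

client = OpenAI()

def make_search_terms(question: str) -> list[str]:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # example model
        messages=[
            {"role": "system",
             "content": "Return 4-6 short search phrases, one per line, "
                        "for querying a knowledge base about the topic."},
            {"role": "user", "content": question},
        ],
    )
    # Split the response into one clean search phrase per line.
    return [line.strip("-• ").strip()
            for line in resp.choices[0].message.content.splitlines()
            if line.strip()]

terms = make_search_terms("How do I start an AI automation Skool community?")
```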
So if I open this up and look at the result, the search terms are: AI automation community building, starting an AI school, creating an online AI community, AI automation education platforms. Again, these are the search terms that ChatGPT came up with to help us query our own database, based on the topic we want to write about. Now, in this next module here, we are creating embeddings again.
Again, we're making an API call because there isn't a built-in module, just like we built embeddings for what we store in the database. So we send it this text, which creates the vector number values that we can use to search the database.
To actually search the database, we also need these vectors. So coming back to this module here: we take the search terms, and you can see we're putting them right here, and it outputs a set of values here in the body. Open up the data, and here are those embeddings. So here we have those numbers again, except this time they're going to be used to search rather than to store.
Then we simply go to the Pinecone database and pass in those vectors. So we're passing the vectors from this step here, and this is our search term. We search the Pinecone database with these values in a similar way to how we used them to store the data. So even without understanding exactly how vectors work, you can understand that we're saving to and searching the Pinecone database for the data we actually want, which is this transcription text.
But we're doing it through these embeddings, the vector numbers that let us store and search our data in a way that quickly pulls up the most relevant information for any given query. So in this situation, since we are searching for how to start an AI automation Skool community, it's going to search all of our values here for relevant content and only return these rows. And now you can see that from our massive knowledge base, we only have the rows and the data we need to actually write the article, and nothing more.
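The retrieval step itself could be sketched like this; again, the index name and the top_k value are illustrative assumptions, not the exact settings of this automation.

```python
# Rough sketch: embed the search terms and pull the most relevant chunks
# back out of Pinecone, including the stored text in the metadata.
from openai import OpenAI
from pinecone import Pinecone

client = OpenAI()
index = Pinecone(api_key="YOUR_PINECONE_API_KEY").Index("youtube-knowledge-base")  # hypothetical

def retrieve_chunks(search_terms: list[str], top_k: int = 5) -> list[str]:
    query_text = ", ".join(search_terms)
    vector = client.embeddings.create(
        model="text-embedding-3-small", input=query_text
    ).data[0].embedding

    result = index.query(vector=vector, top_k=top_k, include_metadata=True)
    # Each match carries the original chunk text we stored as metadata.
    return [match.metadata["text"] for match in result.matches]
```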
So instead of trying to send our entire database to ChatGPT to write the article, which is actually impossible, you couldn't really do it, we just get the most relevant information from Pinecone and send that into ChatGPT or Claude to actually write the article. And you can see here, if we open this up, we've got quite a few different responses that relate to our query. And if we open this up to the metadata, we can see the content, and there's the chunk.
So we use the vector numbers to search and store, but what ultimately comes back is the relevant text, which we can group together and finally pass to ChatGPT to actually write that article. And so here we're generating that article with ChatGPT, and here we're generating it with Claude, just to see the difference. And then ultimately, we can take a look here and see an article that's actually written from our knowledge base. This isn't taken from just the internet; it's drawn from actual expert knowledge that exists on my YouTube channel, in my posts in Skool, or even in the classroom and all of the videos I post there.
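The last step, handing the retrieved chunks to the LLM, might look roughly like this; the prompt wording and the model are placeholders, and you could just as easily call Claude's API here.

```python
# Rough sketch: stuff only the retrieved chunks (not the whole database)
# into the prompt and ask the LLM to write the article from them.
from openai import OpenAI

client = OpenAI()

def write_article(question: str, chunks: list[str]) -> str:
    context = "\n\n---\n\n".join(chunks)
    resp = client.chat.completions.create(
        model="gpt-4o",  # example model; a Claude call would work the same way
        messages=[
            {"role": "system",
             "content": "Write an article answering the user's request using ONLY "
                        "the provided knowledge base excerpts."},
            {"role": "user",
             "content": f"Request: {question}\n\nKnowledge base excerpts:\n{context}"},
        ],
    )
    return resp.choices[0].message.content
```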
And so again, if we look at this system, you could easily have people asking questions on a platform like Slack. Here's an example I had set up before. You've got a channel here, and when somebody asks a question, it prompts the assistant, which triggers this automation. We create the search terms, we create the embeddings, we do the actual search, bring all of that data together for a final query to one of our favorite LLMs, and then the output of this process, whether it's from ChatGPT or Claude or both, can finally be mapped back into a Slack response, just like you can see from this No-Code Architects assistant that is responding to my question with this specific answer.
So hopefully you can see that for this entire section, while there might be a few new things to learn, it's really two simple automations that take care of this whole part of the RAG system. Just as a recap: this automation here is the data collector for this part of the system, and this automation here is what allows us to take the data from the vector database, send it to an LLM like ChatGPT or Claude for final processing, and return it back to the user. And again, the hardest part of this entire process, and also of the process I'll cover shortly, is the data collector. It's the logical understanding, and the system you build around it, that lets you keep the data between one source and another intact and trusted. All right, so now that we've covered the section of the RAG system that deals with the databases and working with the vector database, let's talk about what is sometimes the more complicated issue: going out to all the various systems where our content might be housed and developing the data collectors that allow us to sync it back to Airtable.
And again, in the same way that we want to maintain the integrity of our data from Airtable into the vector database, we also want to do that with our external data sources. We want to make sure we grab all of our YouTube data. We want the new records and the updates. And we want to make sure there are no duplicates.
And we want to make sure there are no gaps in the data. We don't want to be missing a YouTube video. And we want this to be cost efficient.
If there's a scenario where we're syncing data quite often, say every minute or every five minutes, it needs to be efficient; otherwise, we're going to pay a lot of money just to sync data back and forth. So the automations and the way we sync the data with our data collector need to be efficient. And again, it also needs to be fault tolerant.
So if there is an error, our database doesn't get corrupted. Now, when it comes to getting data into our database, there are really four main ways that's going to happen. The first is reaching out to an official API.
We develop our data collector, which, again, is not necessarily the same thing in every situation. It's more a concept: you develop the automations, or the code, or whatever it needs to be, so that all of these criteria hold. You want to isolate all of that logic into one component that can work with the API to extract the data and keep things in sync. But again, there are a few different ways we tend to interface through the data collector. Official APIs are usually going to be the easiest, because an official API is documented and designed for connecting to that specific service.
And so building a data collector on top of one is usually pretty straightforward. Then we have hidden APIs, which are still APIs, they're just not publicly documented. For instance, Skool has an API. There is a way to download all of the course content, get access to those Loom videos, and then, from the Looms, get the transcriptions. So here you can see I have some of my Skool courses and the transcript from that video. It's possible to do this, but it's not documented.
So you have to figure out how to do it. And often, because it's not documented, the data collector might need to be a little more robust, so you can handle situations you wouldn't have to deal with when working with an official API. So there's the official API, the hidden API, and then there are also situations where you simply have to scrape the data.
This is a case where you might use a tool like Axiom, which can log into a browser and extract things from the page when there's no API, not even a hidden one. It just goes to the page, looks at it, and simply grabs the text. And in a situation like that, there's some website here, www, and some data collector that's helping you make sure that process is done properly.
And in a lot of cases, when you are scraping data, the data collector has to be even more robust to handle all of the different situations. For instance, if you build an automation that scrapes data from a website and they change the location of some text from here to over here, that might cause your automation to break. And in that collector, you would have to come up with some way to recover from that issue.
And again, it'll be case by case, but you'll still have to make sure you can get new records and updated records, remove duplicates, leave no data gaps, and keep it cost efficient and fault tolerant.
So obviously, working with an official API, or even a hidden API, is much better than working with an automation that uses a scraping method. And then, of course, there's the good old manual process, where somebody just types the data into the database directly.
So now, obviously, there are major drawbacks to those last two approaches. So as best you can, make sure you use official APIs or hidden APIs. Finding hidden APIs is easier than you think. You simply use the developer tools in your browser, and with those developer tools, you can monitor the network traffic going back and forth between your browser and the remote server.
And then, based on that research, you can find the API endpoints the website is using to communicate with your browser, and you can use those for your data collector. Now, if you're enjoying this content, make sure to like and subscribe to the channel. It tells me what type of content you want more of.
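To make the hidden-API idea concrete, here's a generic sketch of replaying a request you found in the browser's Network tab. The URL, headers, and cookie below are made-up placeholders, not a real Skool endpoint.

```python
# Rough sketch: replay a "hidden" API call discovered in the browser's
# dev tools Network tab. Everything below (URL, headers, cookie) is a
# made-up placeholder, not a documented endpoint.
import requests

HIDDEN_ENDPOINT = "https://example.com/internal/api/courses"  # placeholder URL
headers = {
    # Copy these from the request you see in the Network tab.
    "Cookie": "session=PASTE_YOUR_SESSION_COOKIE_HERE",
    "User-Agent": "Mozilla/5.0",
}

resp = requests.get(HIDDEN_ENDPOINT, headers=headers, timeout=30)
resp.raise_for_status()          # fail loudly instead of silently corrupting the sync
courses = resp.json()            # undocumented shape: inspect it before trusting it
print(len(courses), "items returned")
```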
So now let's talk about this process in more detail, where we're grabbing all of the YouTube videos from my YouTube channel. We're using Apify as the API, and we're using Make to build our data collector and ensure we adhere to all of these standards. That ultimately comes back and syncs to our Airtable database. The automation that actually powers this is very simple.
We have this one automation here, which triggers an API call to Apify. We're using one of their actors. That kicks off the run, and when it's done, it triggers this second automation. And if we look back at our requirements, the way we track new and updated videos from YouTube in our collector is simply to always fetch all of the videos. In this case, because there are only 160 videos, or however many there are, getting every single video every single time is not an issue, and it obviously includes all of the new and updated videos. So this process responds with all of the videos, and then the rest of the automation works by downloading the data that we received from this module here.
We do that so we can process as much of the data as possible inside Airtable automations. By downloading all of the data that came from Apify into a file stored in Airtable, just like this, instead of processing all of this data in Make, which would be hundreds of operations, we can do it with just one automation inside Airtable. And that's how we keep this cost efficient.
We find duplicates by using the URL and making sure there is only one row per unique YouTube URL. So we always check whether that URL is already here before we add a new row; if it is, we just update it, otherwise we add a new row. And we take care of any gaps in the data by always requesting all of the videos. In other scenarios you would need to modify that approach; you might need to use some sort of timestamp, like last updated. But depending on the situation, you'll have to develop a data collector strategy that makes sure all of these things are true. The reason I really point this out, and isolate it this way, is that it helps you logically separate your system from these very specific data collectors, and it gives you a checklist of things to think through when you're connecting to other APIs, so that your data ends up in good shape.
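As an illustration of that upsert-by-URL idea, here's a rough Python sketch using the pyairtable library. The token, base ID, table name, and field names are hypothetical, and the real build does this inside Airtable automations rather than in code.

```python
# Rough sketch: upsert scraped video records into Airtable, using the video
# URL as the unique key so re-running the sync updates rows instead of
# duplicating them. Token, base, table, and field names are made up.
from pyairtable import Api

api = Api("YOUR_AIRTABLE_TOKEN")                      # placeholder token
table = api.table("appXXXXXXXXXXXXXX", "Videos")      # placeholder base/table

def upsert_videos(videos: list[dict]):
    # Map existing rows by URL once, so we don't refetch per video.
    existing = {rec["fields"].get("URL"): rec["id"] for rec in table.all()}
    for video in videos:
        fields = {"URL": video["url"], "Title": video["title"]}
        if video["url"] in existing:
            table.update(existing[video["url"]], fields)   # update, don't duplicate
        else:
            table.create(fields)                           # brand-new video
```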
So then, once we get all of that data from Apify, again, it could be hundreds of rows, instead of trying to loop through hundreds of rows here, we spend three operations: we drop the file here with all the data, and then, in our Airtable automations, we simply process that data and only spend one Airtable automation run.
Now, if you want access to this diagram, the Airtable database, and the four Make automations, the ones that save data to your Pinecone database, retrieve and process data from your Pinecone database, and sync data between Apify, YouTube, and your Airtable database, make sure to jump into the No-Code Architects community.
It's an engaged group. You can ask any question and get tech support. On the calendar, there are calls almost every single day.
You can get access to a Make and Airtable course, a bunch of other cool automations you can build, and a whole lot more. I hope to see you there. But either way, I hope you enjoyed this video and seeing what these systems can do for your business in 2025, and I'll see you in the next one.