Transcript for:
Building a RAG Application: Overview

Hello all, my name is Krish Naik, and welcome to my YouTube channel. In this video we are going to create an end-to-end RAG application with the help of DeepSeek, and the DeepSeek model itself will be installed locally with the help of Ollama. I'm also going to use Ollama embeddings and show you, completely end to end, how the vectors are created and how you can store those vectors locally as well. The best part is the accuracy when you chat with this RAG application; it's quite amazing. So let's go ahead and implement it completely end to end, step by step. I will show you everything with respect to the coding, and performance-wise and accuracy-wise this is really, really good; we are able to get great results out of it.

Let me quickly share my screen. If you remember, in my previous video I discussed how you can run this kind of application with the help of ChatOllama; if you have not seen that video, I would suggest watching it first. Now I'm going to create my rag_deep.py file, and let me first hide my face so that your focus is completely on the page itself.

Quickly, in requirements.txt, these are the libraries I'm going to use. There is something called pdfplumber, because I need to read from a PDF and make that PDF my data source. So let's start writing the code. First of all, I'm going to import streamlit as st. Then, since I have already installed pdfplumber, we definitely require the PDFPlumberLoader, which will load all the content from a specific PDF.
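As a reference, here is a minimal sketch of the setup described so far. The full requirements.txt isn't read out in the video, so the package list below is an assumption based on the imports discussed in this walkthrough:

```python
# requirements.txt (assumed, based on the imports used in this walkthrough):
#   streamlit
#   pdfplumber
#   langchain-core
#   langchain-community
#   langchain-ollama
#   langchain-text-splitters

import streamlit as st
from langchain_community.document_loaders import PDFPlumberLoader
```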
Okay, then as soon as we load the content, I have to perform a recursive character text split. Once I take those chunks of data and convert them into embeddings, I will store them in a vector store called the in-memory vector store, so I don't have to depend on any third-party vector store or any cloud. Yes, you can also store the vectors in a cloud service, but here I'm just trying to show you an example, so I'm going to use the in-memory vector store.

Along with this, I will write from langchain_ollama import OllamaEmbeddings, because before we store anything inside the in-memory vector store, I need to perform Ollama embeddings. What the Ollama embedding does is convert text into vectors. This embedding model runs on my local machine, where I have already installed Ollama, and we are going to use that.

Then, from langchain_core.prompts, I'm going to import one more thing, which is called ChatPromptTemplate. I will not use ChatMessagePromptTemplate, but I'll use ChatPromptTemplate so that I can set up my prompts. One recent thing I've explored in langchain_ollama is something called OllamaLLM. Before, we used to use ChatOllama; OllamaLLM is the new thing I've explored recently, so I'll write from langchain_ollama.llms import OllamaLLM, and we are going to use this.

The next thing I'm going to copy in is some styling; since I've used DeepSeek here, I've created some custom styling so that the background looks good for this end-to-end application. After that, I will create my prompt template. Inside this prompt template, this is what I'm saying: you are an expert research assistant; use the provided context to answer the query; if unsure, state that you don't know; be concise and factual, with a maximum of two to three sentences. Here you will have the user query, and here you will have the document context, since in any RAG application you create, you will have some kind of document context.

Now, the next thing is that I will create a folder called document_store, because whenever I upload any PDF, it should be available there. You could also upload it to a cloud platform or an S3 bucket; that is up to you, but just to make sure everything works fine, I will keep it local. So now, for my PDF storage path, I'm going to define some constants.
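Continuing the same script, the imports and prompt template described above look roughly like this. Note that the template wording is paraphrased from the narration rather than copied from the screen, so treat the exact string as an approximation:

```python
from langchain_text_splitters import RecursiveCharacterTextSplitter
from langchain_core.vectorstores import InMemoryVectorStore
from langchain_core.prompts import ChatPromptTemplate
from langchain_ollama import OllamaEmbeddings
from langchain_ollama.llms import OllamaLLM

# Prompt template as described in the narration (wording approximated)
PROMPT_TEMPLATE = """
You are an expert research assistant. Use the provided context to answer the query.
If unsure, state that you don't know. Be concise and factual (maximum 3 sentences).

Query: {user_query}
Context: {document_context}
Answer:
"""
```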
The PDF storage path is nothing but document_store/pdfs/, which means whenever we upload any PDF, it will be saved there. Then you have the embedding model, where we are going to use OllamaEmbeddings, and the model will be DeepSeek-R1 1.5B. If you remember, in Ollama we can install this directly: all you have to do is go to the command prompt and write ollama run with this specific model name, and it will automatically get downloaded. Then, as I said, for the document vector store I'm going to initialize an InMemoryVectorStore that uses the embedding model, and the language model will be an OllamaLLM where I just specify the same model name.

Now, let's discuss the first part: uploading the file. Whenever a file is uploaded, we should be able to return that file's path. So I will create a function called save_uploaded_file. As soon as I upload the file, this builds the file path; I open that file path, write the uploaded bytes into it, and return the file path. That is the first thing.

Then we will load the PDF documents. As soon as we upload a file, the next step is to load the PDF's contents. If you are following my generative AI playlist, you should be very familiar with this: here we have the PDFPlumberLoader, and we just call document_loader.load().

After that, I'm going to add chunk_documents. In this function, the raw documents go into a RecursiveCharacterTextSplitter with a chunk size, a chunk overlap, and add_start_index=True, and we return text_processor.split_documents(raw_documents) on whatever raw documents are given. This function handles our chunking process.

Once we do the chunking, the next step is to convert the chunks into embeddings and store them in our in-memory document vector DB. For that, I will create another function, index_documents, where I pass in the document chunks coming from the splitter, and they get added to the document vector store.

Now, whenever I query my document vector DB, it should be able to do a similarity search and retrieve the context. For that, I will create another function to link these steps together. (If you're following my generative AI playlist, I've already explained all of these things, which is why I'm copying and pasting the functions here.) This function is called find_related_documents, and it simply returns document_vector_db.similarity_search on the query, which uses cosine similarity under the hood. Finally, once I get this context, the next thing is to do the chaining and generate the answer, which brings us to a function called generate_answer.
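Putting the constants and helper functions just described together, a sketch could look like the following. The chunk size and overlap values are placeholders, since the exact numbers aren't stated in the narration:

```python
import os

PDF_STORAGE_PATH = "document_store/pdfs/"
EMBEDDING_MODEL = OllamaEmbeddings(model="deepseek-r1:1.5b")
DOCUMENT_VECTOR_DB = InMemoryVectorStore(EMBEDDING_MODEL)
LANGUAGE_MODEL = OllamaLLM(model="deepseek-r1:1.5b")

def save_uploaded_file(uploaded_file):
    # Persist the uploaded PDF into the local document store and return its path
    os.makedirs(PDF_STORAGE_PATH, exist_ok=True)
    file_path = os.path.join(PDF_STORAGE_PATH, uploaded_file.name)
    with open(file_path, "wb") as f:
        f.write(uploaded_file.getbuffer())
    return file_path

def load_pdf_documents(file_path):
    # Load every page of the PDF as LangChain documents
    document_loader = PDFPlumberLoader(file_path)
    return document_loader.load()

def chunk_documents(raw_documents):
    # Split documents into overlapping chunks (sizes are placeholder values)
    text_processor = RecursiveCharacterTextSplitter(
        chunk_size=1000, chunk_overlap=200, add_start_index=True
    )
    return text_processor.split_documents(raw_documents)

def index_documents(document_chunks):
    # Embed the chunks with Ollama and add them to the in-memory vector store
    DOCUMENT_VECTOR_DB.add_documents(document_chunks)

def find_related_documents(query):
    # Similarity search (cosine) over the stored embeddings to fetch context
    return DOCUMENT_VECTOR_DB.similarity_search(query)
```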
In generate_answer, we pass the user query and the context documents. We join all the context documents into a single context string, and those context documents come from the find_related_documents function. Then we create a ChatPromptTemplate from the prompt template, and this is the chaining step, where we pipe the conversational prompt into the language model. Finally, we call response_chain.invoke with the user query and the context text. That's it; that's the beautiful thing about this.

Now, let me quickly add some UI configuration, and you'll see how everything gets linked. Here I have my file uploader: "Select a PDF document for analysis." You can set accept_multiple_files to True if you want to allow multiple PDFs, but just for this example I have kept it False.

Finally, as soon as we upload any PDF, all the functions we created should get executed. First, if a PDF is uploaded, we save the uploaded file, which means it gets saved inside that document_store folder. The next step is load_pdf_documents, then chunk_documents, then index_documents; that takes us all the way to the vector database. Then, when the user enters a message, we display it, show that the document is being analyzed, find the related documents to get the context, and generate the answer, which interacts with the LLM; finally, the answer is displayed to the user. That is the entire flow, and the LLM we are using here is Ollama with the DeepSeek-R1 1.5B model.

Now let's execute it. This will be interesting: the whole thing is completely open source, which means everything runs on my local machine and no data is being shared anywhere. First I'll write conda activate venv, then streamlit run rag_deep.py, and let it execute. While it loads, I'll show my face again; the best part is that I don't have to do anything else here.

Now let's go to the app and click on "Browse files". Here I have uploaded one PDF: the syllabus for the recent batch we are coming up with, the Ultimate Data Science and GenAI Bootcamp, where we are teaching data science and more. See, as soon as it uploaded, the Ollama embedding started working, and now I can ask any question. I'll write, "What are the prerequisites?" In that syllabus we have written some prerequisites, so we should get the answer. You can see "Analyzing documents" running over here, and then it gives me the answer. Let me ask one more question: "Provide me the detailed syllabus." This PDF has become my data source, and the app is able to provide any answer I need from it.
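A sketch of the generate_answer function and the Streamlit flow described above, continuing the same script. The widget labels and app title ("DocuMind AI" is mentioned later in the video) are approximations of what appears on screen:

```python
def generate_answer(user_query, context_documents):
    # Join retrieved chunks into one context string and run the prompt | model chain
    context_text = "\n\n".join(doc.page_content for doc in context_documents)
    conversation_prompt = ChatPromptTemplate.from_template(PROMPT_TEMPLATE)
    response_chain = conversation_prompt | LANGUAGE_MODEL
    return response_chain.invoke(
        {"user_query": user_query, "document_context": context_text}
    )

st.title("DocuMind AI")

uploaded_pdf = st.file_uploader(
    "Select a PDF document for analysis", type="pdf", accept_multiple_files=False
)

if uploaded_pdf:
    # Save -> load -> chunk -> index: everything up to the vector database
    saved_path = save_uploaded_file(uploaded_pdf)
    raw_docs = load_pdf_documents(saved_path)
    processed_chunks = chunk_documents(raw_docs)
    index_documents(processed_chunks)

    user_input = st.chat_input("Ask a question about the document...")
    if user_input:
        with st.chat_message("user"):
            st.write(user_input)
        # Retrieve context and query the LLM, then display the answer
        with st.spinner("Analyzing document..."):
            relevant_docs = find_related_documents(user_input)
            ai_response = generate_answer(user_input, relevant_docs)
        with st.chat_message("assistant"):
            st.write(ai_response)
```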
So here you have the entire thing; you can also add chat history if you want, that is up to you. In the response you can see the object-oriented programming overview and all those details. All right, "Can you summarize the entire curriculum?" This is really good; I don't just mean performance-wise but accuracy-wise, because whatever question I write, it is able to provide the full details from the document, and that is the best thing. I've given this app the name DocuMind AI. Here it says the foundation modules of the Ultimate Data Science bootcamp include projects on deep learning with neural networks, vector databases, and more; everything is there. So I hope you liked this video. That was it from my side; I will see you all in the next video. All the code will be given in the description of this video. Thank you. Have a great day. Bye-bye. Take care.