Transcript for:
LangChain: Generative AI Application and Ecosystem Overview

Hello all, my name is Krish Naik and welcome to my YouTube channel. Here is one amazing one-shot video on LangChain for learning generative AI. If you are interested in creating amazing LLM applications or GenAI-powered applications, this video is definitely for you. If you don't know about LangChain, it is a complete framework that will help you create Q&A chatbots, RAG applications, and much more. The most interesting thing is that I will be covering paid LLM models along with open-source LLM models, including ones hosted on Hugging Face, so we will get to know each and everything about them and how to use them in LangChain. I hope you enjoy this video, and please make sure you watch it till the end. Thank you, let's go ahead and enjoy the series. Before I go ahead, guys, since there should be some motivation for me as well, I will keep the like target at 1,000; please hit like and share with all your friends, and we'll also keep a target of 200 comments, which I know you can reach. So let's keep those targets and understand what we are going to learn about LangChain; the second topic we'll cover in this video is the LangChain ecosystem and how its pieces relate to each other. Right now, if you look at the recent updates in the LangChain documentation, most of the modules revolve around these topics. Here you'll see LangSmith, and here you'll see LangServe. If I talk about LangSmith, I recently made a simple video on it, but I will create more videos on it as we go through this series. To give you some examples, LangSmith helps you monitor your application and debug your application; in short, whatever MLOps activities are required, monitoring, debugging, and, as a third point, testing, you can use this amazing module in LangChain called LangSmith. The best thing is that all the reports and analytics are very easy to see within the LangChain ecosystem itself: there is a dashboard where you'll be able to see everything. How we'll use this entire technique in projects, we'll see completely end to end. So if I talk about LangSmith, it is mostly about LLMOps, and that part is required in many, many companies right now, so we'll cover it too. That is the reason I like LangChain: it provides you the entire ecosystem irrespective of the LLM model, any LLM model. Now, coming to the second thing, here you have LangServe. Let's say you have created your LLM application; you obviously want to expose the entire application in the form of APIs without writing much code. Yes, you can write the code from scratch with the help of Flask or some other library, but LangServe uses something called FastAPI, and because of FastAPI the creation of these APIs becomes very easy. So before deployment, once I have created my own LLM app, we'll see how I can expose all of its services in the form of APIs; that is what we'll do with LangServe.
Now coming to the next thing: there are some amazing and important concepts in LangChain, from data ingestion to data transformation, and the major topics there are chains. We'll try to understand chains, and in the next video, once I start the practical implementation, the first thing I'm going to cover is chains; I'm also going to discuss agents and retrieval. Not only that, you'll see there is a concept called LCEL, whose full form is LangChain Expression Language. There are a lot of concepts used in LCEL, and we'll see why it is important while you are building and what techniques it gives you when creating your own generative-AI-powered application. Along with that, there are three main topics I will cover while discussing all of this: model I/O, retrieval, and agent tooling. These are concepts you should really know. The main aim of this entire series is not just theoretical understanding; it's more about how you can create amazing generative AI applications irrespective of the LLM model. Now see, guys, one question I get from many people: "Hey Krish, I'm using this specific LLM model, I don't have OpenAI API access, tell me what should I do," or "Hey Krish, I don't have a credit card for OpenAI; can you show me some examples with models like Google Gemini or something similar? What about open-source models?" See, as I said, LangChain is an amazing framework for building the entire application, and creating the best LLM is already a rat race among tech giants: Google will be competing, Meta will be competing, Anthropic will be competing, OpenAI will be competing. You don't have to worry about that; whatever models come up in the future, the main thing is how you can use them in a generic way to build any kind of application. You just need to stay up to date on which model has the best accuracy; later on, the integration part will be completely generic. Now let's go ahead and understand the entire LangChain ecosystem and its most important pieces. This is a diagram I have taken from the LangChain documentation. As I already discussed with respect to LangSmith, the first module you see here is about observability: with the help of LangSmith you can do debugging, playground experiments, evaluation, annotation, and monitoring. When I say annotation, it's all about creating your own custom dataset, which you'll need for fine-tuning or for creating your GenAI-powered application. The next thing is deployment: LangServe has recently arrived, and it exposes your work in the form of APIs; here it is written "chains as REST API" for whatever services you are providing. LangChain will soon also come up with a one-click deployment mechanism, so once you create these APIs with the help of LangServe, the next thing is that you
just need to deploy them. I have already created a couple of videos on LangSmith and LangServe, but don't worry: in this series I'm going to start fresh, combine new topics, and create a project. The third thing you really need to understand is templates: there are different reference application templates for LangChain. With respect to LangChain you need to understand these three important things: chains, agents, and retrieval strategies, so we'll try to understand how they work. And understand, guys, the code is available for both Python and JavaScript, but my main focus will be Python, because some articles have already claimed that AGI applications are going to get created in Python. Then you also have the integration components; we are covering the ecosystem because all of these will be discussed in this playlist. Here you have model I/O, here you have retrieval, and here you have agent tooling. In retrieval you get features to read your dataset from different data sources and to create vector embeddings; in model I/O you have various techniques with respect to the model, chains, prompts, example selectors, and output parsers. Finally, we'll be focusing on this amazing thing called the protocol, which comes under langchain-core, and there we'll discuss the LangChain Expression Language. It covers important concepts like parallelization, fallbacks, tracing, batching, streaming, async, and composition (see the small sketch after this paragraph); we are going to cover all of these, and then we will focus on end-to-end projects and how to use all these concepts together. Understand one thing, guys: when we use LangServe and start creating REST APIs, we'll also write our client-side code so that we can access those APIs. So the whole ecosystem will get completely covered. Trust me, the reason I keep saying this is important is that tomorrow, whatever LLM models come, however advanced they may be, LangChain will be a generic framework that helps you build any kind of LLM application. In this video I will show you how to create chatbot applications with the help of paid LLM APIs, and we'll also see how to integrate open-source LLMs; you should definitely know both ways. One way to integrate any open-source LLM is through Hugging Face, but I've already uploaded a lot of videos on my channel about calling open-source LLMs that way, and since we are working with the LangChain ecosystem, we will try to use the components available in LangChain. As you all know, guys, this is a fresh playlist, and my plan is to focus entirely on LangChain this month: many more amazing videos will come up, along with end-to-end applications, fine-tuning, and more. So please make sure we keep a like target for every video; for this video the like target is 1,000 and at least 200 comments. And please watch this video till the end, because it is going to be completely practical-oriented.
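To make the LCEL concepts listed above (composition, batching, streaming, async) concrete before the hands-on part begins, here is a minimal sketch; it assumes the langchain-openai and langchain-core packages and an OPENAI_API_KEY in the environment, and the topic strings are just illustrative:

```python
# Minimal LCEL sketch: a chain composed with "|" gets batching,
# streaming, and async support through the shared runnable interface.
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

chain = (
    ChatPromptTemplate.from_template("Tell me one fact about {topic}")
    | ChatOpenAI()
    | StrOutputParser()
)

print(chain.invoke({"topic": "the Moon"}))                   # single call
print(chain.batch([{"topic": "Mars"}, {"topic": "Venus"}]))  # parallel batch
for token in chain.stream({"topic": "Saturn"}):              # token streaming
    print(token, end="", flush=True)
# Async variants (ainvoke / abatch / astream) exist as well.
```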
Okay, and if you really want to support, please make sure you subscribe to the channel and take up a membership plan on my YouTube channel; with the help of those benefits I will be able to create more videos. So let me quickly go ahead and share my screen. Here is my screen, and in the GitHub repo that you'll find in the description of this video, you'll see folders like this. Today is the third tutorial, no, the second tutorial; in the first ones we just covered what we are going to learn, but this is the real practical implementation. As usual, the first thing we do is create our venv environment. How to create it: conda create -p venv python==3.10, so you can take the 3.10 version; I have already shown how to create virtual environments in many videos. Then you'll use a .env file; this will be my environment variables file. In it I will put three important pieces of information: one is the LangChain API key, the second is the OpenAI API key, and the third is the LangChain project name. You might be thinking this OpenAI API key I've shown is live; no, it is not, I've changed some of the characters, so don't try it out, it'll be of no use. The third environment variable, my LangChain project name, I've written as tutorial1. The reason I set this is that whenever I go to LangSmith, I will be able to observe and monitor each and every call from the dashboard itself; how we use all this, I will discuss as we go. All three of these will be required and used as environment variables, and I have already created my .env file with these three parameters.
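For reference, the .env file described here would look roughly like this; the values below are placeholders, not real keys:

```
# .env (placeholder values)
LANGCHAIN_API_KEY="your-langsmith-api-key"
OPENAI_API_KEY="your-openai-api-key"
LANGCHAIN_PROJECT="tutorial1"
```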
So let's go ahead and start the coding, and make sure you code along with me, because this is the future of AI engineering. I'll start with a foundation model, and later the complexity will keep increasing. So let's begin our first code. What is our main aim in this first project? The first thing we will create is a normal ChatGPT-style application; I won't call it ChatGPT, but a normal chatbot. This chatbot will be important because it will help you create chatbots with both paid and open-source LLM models. One way is to use a paid LLM; one example I can show is the OpenAI API, and you could also use the Claude API, which is from a company called Anthropic. And one more I will show with the help of an open-source LLM. See, calling APIs is a very easy task, but the major thing is that LangChain has so many modules, and we need to learn how to use those modules for different calls, and also what dependencies we have whenever we develop any chatbot application. If you look at this diagram, you'll see there will be a model, a prompt, and an output parser. In this video I'm going to use some features of LangSmith, some features of chains and agents, and some features present in model I/O and output parsers; we're going to use this combination, and that is how I'm going to build all the projects. The videos coming up will be much more practical-oriented. So now let's start our first chatbot application. Here I will write: from langchain_openai import ChatOpenAI, since I'm going to use OpenAI; this is the first import, and this is where whatever chat model you're going to use comes in (how to call open-source models, I will discuss later; first we'll start with the OpenAI API itself). Next, from langchain_core.prompts I'm going to import ChatPromptTemplate; at any point of time, whenever you create a chatbot, this chat prompt template is super important, because here you give the initial prompt template that is required. The third import is from langchain_core.output_parsers: StrOutputParser. These three are very important. StrOutputParser is the default output parser for whenever your LLM model returns any kind of response; you can also create a custom output parser, which I will show in upcoming videos. With a custom output parser you can do anything with the output that comes back: split it, uppercase it, anything; you can write your own custom code for it. But by default, right now, I'm going to use just StrOutputParser. Along with this, I'm going to import streamlit as st, then import os, and since I'm also going to load my environment variables, from dotenv import load_dotenv,
so that we'll be able to load all our environment variables. Let's see whether everything is working fine. I'm running python app.py just so everything gets checked and all the libraries are imported. First I have to go to my chatbot folder with cd chatbot; now I'll clear my screen and run python app.py. Oh sorry, I wrote "from streamlit as st"; it has to be "import streamlit as st", which is why all those errors were coming. Let's see if everything works now: langchain_core, here you can see there was a spelling mistake, but I'm going to keep all the errors like this so you can see them. python app.py again, and if everything works fine... output_parsers, okay, capital P; I think my suggestion box isn't working well, and that is the reason. Now everything is working and I'm not getting any errors, so let's continue the coding. We have imported all these things. Now, as I mentioned, since we're going to use three environment variables, and to make sure tracing captures all the monitoring results, I will set these three: the OpenAI API key, LANGCHAIN_TRACING_V2, and the LangChain API key. The LangChain API key tells LangSmith where the monitoring results need to be stored, so you'll see all the monitoring results on that dashboard, and tracing is set to true so it automatically traces any code that I write; and this works not just with paid APIs, you'll be able to do it with open-source LLMs too. That's the second step done.
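As a sketch, that environment setup looks like this at the top of app.py, assuming the keys live in the .env file shown earlier:

```python
import os
from dotenv import load_dotenv

load_dotenv()  # pulls the keys from .env into the process environment

os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")
# LangSmith tracing: with these set, every chain run is logged to the dashboard
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = os.getenv("LANGCHAIN_API_KEY")
```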
Now let's define my prompt template. Here I write prompt = ChatPromptTemplate.from_messages(), and inside I define my prompt template in the form of a list. The first entry is the system message: ("system", "You are a helpful assistant. Please respond to the user queries."), a simple prompt for whatever queries I'm going to ask. If I'm giving a system prompt, I also have to give a user prompt, which carries whatever question I ask: ("user", "Question: {question}"). I could also pass a context if I wanted, but right now I'll just give a question; a simple chatbot application so you can start practicing building chatbots. The learning process will be such that I keep creating more projects and using more of the functionality as we go. Now the Streamlit framework: st.title("LangChain demo with OpenAI API") and input_text = st.text_input("Search the topic you want"). Now let's call the OpenAI LLM: llm = ChatOpenAI(model="gpt-3.5-turbo"). I'm using the turbo model because the cost is lower; I've put $5 in my OpenAI account just to teach you, so please support the channel so I can keep exploring these tools and making videos for all of you. And finally my output parser. Always remember, LangChain provides features you can attach in the form of a chain: the three main things we've created are the chat prompt template, the LLM, and the output parser. The prompt is the first thing we require, then we integrate it with our LLM, and finally we get our output; StrOutputParser is responsible for extracting that output. So chain = prompt | llm | output_parser; we just combine all of them (I'll show later how to customize the output parser). And finally: if input_text: st.write(chain.invoke({"question": input_text})), so whenever I type any input and press Enter, I should get the output. That's the simple chatbot application, and along with it we've enabled this feature specifically for LangSmith tracking, which is amazing to use and reflects the recent updates, so whatever code I'm writing will stay applicable going forward. Now let's run it: streamlit run app.py; oops, a typo there, app.py, and here I'll allow access. In LangSmith I currently see an older test project, but my project name was tutorial1; if I type "hey hi" and press Enter, you'll see we get the response in the app, and after reloading the dashboard, tutorial1 shows the first request that has been hit. You can expand the run sequence: the ChatPromptTemplate with its messages ("You are a helpful assistant. Please respond to the user queries."), then the ChatOpenAI call, where you can track exactly what the cost was, and finally the StrOutputParser, which just returns the clean response: "How can I assist you today?". Once I develop my own custom output parser, I'll be able to track that too. So you are able to monitor each and every request that comes in.
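Putting together the pieces walked through above, the whole app.py looks roughly like this sketch (not a verbatim copy of what's on screen):

```python
import os

import streamlit as st
from dotenv import load_dotenv
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

load_dotenv()
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")
os.environ["LANGCHAIN_TRACING_V2"] = "true"  # LangSmith tracking
os.environ["LANGCHAIN_API_KEY"] = os.getenv("LANGCHAIN_API_KEY")

# Prompt template: a system message plus the user's question
prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant. Please respond to the user queries."),
        ("user", "Question: {question}"),
    ]
)

# Streamlit UI
st.title("LangChain demo with OpenAI API")
input_text = st.text_input("Search the topic you want")

# Model, output parser, and the chain that ties everything together
llm = ChatOpenAI(model="gpt-3.5-turbo")
output_parser = StrOutputParser()
chain = prompt | llm | output_parser

if input_text:
    st.write(chain.invoke({"question": input_text}))
```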
Now let me ask: "provide me a Python code to swap two numbers". Once I execute it, you'll see I get the output with the answer, and for this request the cost is a little higher. If you don't believe me, look at tutorial1: the second request took 4.80 seconds, so yes, a bit more time, and the cost shown is higher too, because the cost is based on the token count; every token bears some cost. Perfect, that was the first part of this tutorial. Now let's go to the second part, which is about how you can call open-source LLMs locally and use them. For this, first of all, I will go ahead and download Ollama. Ollama is an amazing tool because you'll be able to run all the large language models locally; the best thing about Ollama is that it automatically handles the model compression (the models it serves are quantized), so you can run them on your own machine. Let's say you have 16 GB of RAM: you'll just have to wait a little longer for the response. You can use Llama 2 and Code Llama here, and it supports a lot of open-source LLM models; and yes, the integration is provided in the LangChain ecosystem. So first just go ahead and download it; it is available for macOS, Linux, and Windows. After downloading, install it: a simple installer, an exe file for Windows and corresponding packages for macOS and Linux, so you just double-click and install. Once installed, Ollama will start running in the background, down in the system tray. Now, once the Ollama installation is done, I will create another file inside my chatbot folder: local_llama.py. In local_llama.py the code will be almost the same: there too I'll be using the chat prompt template and the string output parser, so I'll copy the same imports and paste them here. Along with this, I have to import Ollama, because that is how we'll be able to call the locally downloaded models. See, whenever we need a third-party integration, it lives inside langchain_community; Ollama is a third-party integration, and if you're using some vector embeddings, those are third-party too, so everything like that is available there. So: from langchain_community.llms import Ollama.
Then we have the StrOutputParser and, from langchain_core.prompts, the ChatPromptTemplate; everything is there. Now let's write import streamlit as st, since I'm going to use Streamlit here too, along with import os; and we'll also do from dotenv import load_dotenv and then initialize it with load_dotenv(). Once the environment variables are initialized, as usual I set the same variables. Now see, in my previous code, when I was using the OpenAI API, we wrote the prompt template; the same prompt template we'll also write here, we just need to repeat it, because the main thing you need to understand is how, with the help of Ollama, I can call any open-source model. So here it is, and the Streamlit framework I'll also bring over; it's mostly copy-pasting what we already implemented. But where we previously called ChatOpenAI, I specifically don't want ChatOpenAI; instead I will call Ollama, the library we imported, and here I'm going to call Llama 2: llm = Ollama(model="llama2"). Now, before calling any model, which models are supported? If you go to the Ollama GitHub repo, you'll see the list of everything it supports, like Llama 2, Mistral, Dolphin Phi, Phi-2, Neural Chat, Code Llama; they're mostly open source, and Gemma is there as well. But before calling one, you need to go to your command prompt. Let's say I want to use the Gemma model, or the Llama model; I just have to write ollama run followed by the model name, because initially it needs to download the model weights from wherever they are hosted. So if I write ollama run gemma, it will pull the entire Gemma model; you can see the pulling happening, and this one is 5.2 GB. You only need to do this the first time. Since I'm writing the code with respect to Llama 2 and I've already downloaded that model, that's why I'm showing you Gemma as the pulling example; once the download completes, then only will I be able to use the Gemma model locally through Ollama. I hope you've got an idea about it. Now, here I've called the Ollama Llama 2 model, the output parser is the same, and I'm combining prompt, llm, and output parser; everything else is almost the same, and that is the most amazing thing about LangChain: the code stays generic, and you only need to swap the model, paid or open source, it's up to you. Again, guys, the system I'm currently working on has 64 GB of RAM and an Nvidia Titan RTX, which was gifted by Nvidia itself, so on this system it should run quite quickly, I feel.
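Before running it, here is what local_llama.py looks like as a sketch under the same assumptions (Llama 2 already pulled via ollama run llama2, and the LangSmith variables sitting in .env):

```python
import os

import streamlit as st
from dotenv import load_dotenv
from langchain_community.llms import Ollama
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser

load_dotenv()  # no paid API key needed; only the LangSmith variables below
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_API_KEY"] = os.getenv("LANGCHAIN_API_KEY")

prompt = ChatPromptTemplate.from_messages(
    [
        ("system", "You are a helpful assistant. Please respond to the user queries."),
        ("user", "Question: {question}"),
    ]
)

st.title("LangChain demo with Llama 2")
input_text = st.text_input("Search the topic you want")

llm = Ollama(model="llama2")  # the only line that changed from the OpenAI version
chain = prompt | llm | StrOutputParser()

if input_text:
    st.write(chain.invoke({"question": input_text}))
```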
So let's go ahead and run it: streamlit run local_llama.py. Once I execute it, okay: ModuleNotFoundError, no module named langchain_community. Let's see; I have to make sure that in my requirements.txt I add langchain-community and install it, and that's why I'm getting the error. So: pip install -r requirements.txt; oops, cd .. first, and now pip install -r requirements.txt, and you'll see the requirements get installed, including langchain-community. Once this is done I can go ahead and run my code; it will take some time, so if you are liking this video, please hit like, there are many more things coming up and it'll be quite amazing when you learn them all. Once it's done, you can use any model, it's up to you. And notice I don't need the OpenAI key here; only the two LangSmith pieces of information are needed, and I'll still be able to track everything; later I'll also show how to expose this in the form of APIs. While it installs, let me know what you think of these LangChain tutorials; I see a lot of purpose for this library, and the company is doing amazingly well in the open-source world, developing multiple things. Now I'll cd chatbot, go inside my chatbot folder, and run it; I typed python local_llama.py, but it should be streamlit run local_llama.py, not python. Now here the title still says OpenAI; let me change that too so it says Llama 2. I've edited it, saved, and rerun, and I'll type "hey hi". Once I execute it, it takes a little time on my system, even with 64 GB of RAM, but I get the output: the assistant says "Hello, how can I help you today?". Now if I go back to the dashboard, under tutorial1 you'll see the count increase: there is one more entry. I've reloaded the page, and you can see the new Ollama request: "hey hi", 4.89 seconds, 39 tokens, but there are no charges, because it is an open-source model. If I expand it, you see the ChatPromptTemplate and Ollama; this Ollama run is calling Llama 2 underneath. And for whatever open-source model you want, calling it is very simple: go to the GitHub page, download the model first by writing ollama run and that model name, and once it's downloaded you're good to go ahead and use it. Now I'll ask: "provide me a Python code to swap two numbers". If you want a more coding-focused chatbot, you can directly use Code Llama if you like. Here you can see all the examples are there, and this was quite fast; if you have the right kind of hardware, it takes about 4 seconds. Ollama is shown here with all the information:
the prompt, the completion, and everything. Hello guys, we are going to continue the LangChain series. In our previous video we saw how to create chatbots with both the OpenAI API and open-source LLM models like Llama 2; we also saw the use of Ollama and how you can run these open-source models locally on your system, and along with this we created multiple end-to-end projects using both of them. Now, in this video, we take one more step that is very important for production-grade deployment: creating APIs. For all these kinds of LLM models we will be able to create APIs, and through this you will also be able to do the deployment in a very efficient manner. How are we going to create these APIs? There is a very important component in LangChain called LangServe; we're going to use it along with FastAPI, and not only that, we'll also get a Swagger UI, which is already provided by the LangServe library we're going to use. This step is important, guys, because tomorrow, if you are developing any application, you will obviously want to deploy it, and creating the entire API layer for that application will be the first required task. So yes, let's continue: first I'm going to show you the theoretical intuition of what we'll develop, and then we will start the coding part. Let me quickly share my screen. Here you've seen that I've written "APIs for deployment". Consider this diagram, the most simplistic one I could draw. At the end of the day, in companies there will be different applications, obviously created by software engineers: it can be a mobile app, a desktop app, a web app, and so on. Now, if I want to integrate any foundation model or any fine-tuned foundation model, that is, LLMs, into such an app, what I really need to do is integrate it in the form of APIs. On this side are my LLM models: they can be fine-tuned LLMs or foundation models, and I want to use their functionality from my web app or mobile app, so we will create these APIs. Now, these APIs will have routes, and the routes will be responsible for whether we have to interact with OpenAI, or with another LLM model like Claude 3 or the open-source Llama 2. Any number of LLM models, whether open source or paid API models, we can definitely use. This is what we are going to do in this video: we'll create this piece separately, and at the end of the day I'll also give you the option, through routes, of integrating with multiple LLM models. I hope you've got an idea; however many LLM models may come, you can integrate them, and use whichever model is suitable for each piece of functionality.
And the reason I'm making this video: understand one thing, guys, LLMs also differ on performance metrics; some model is very good at certain benchmarks, like MMLU, and other metrics definitely exist, so this gives us an option to use multiple LLM models. Now, quickly, I will open my code. In my previous video you already saw what I had developed: we created the first folder, chatbot, and inside it app.py and local_llama.py; as I said, with every tutorial I'll keep creating folders and developing our own project. Now let's create my second folder, and this time it will be for the APIs, so I'll name it api. With respect to this api folder, as I said, in local_llama.py and app.py I used the OpenAI API key and open-source models respectively; we'll try to integrate both of them in the form of routes, so that we can create an API. Let me quickly create app.py; this one will be responsible for creating all the APIs. The second file will be client.py; in the diagram, client.py plays the role of the web app or mobile app, because we are going to integrate these APIs with such a client. So let's do this: first I will start writing the code in app.py, but before that we have to update requirements.txt. Almost all the libraries are installed, but I'm going to add three more: langserve, fastapi, and uvicorn, since I'm going to get the entire Swagger documentation of the API through FastAPI. All three of these libraries I'll be using, so first I'll install them. Let me open my terminal: I'll write cd .., clear the screen, and run pip install -r requirements.txt.
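So the additions to requirements.txt for this section are roughly:

```
# requirements.txt additions for the API section
langserve
fastapi
uvicorn
```

(A fourth package, sse-starlette, turns up as a missing runtime dependency the first time we run the server below.)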
Now here you'll see the entire installation start taking place, including the three extra packages I listed, langserve and the rest. While the installation is going on, let me write my code. I will write from fastapi import FastAPI; this is the first library I'm importing. Along with this, I also need my chat prompt template, since we are going to build the entire API around it: from langchain.prompts import ChatPromptTemplate. Then from langchain.chat_models import ChatOpenAI; this is next because I need to create a chat application, which is why I'm using the chat models. Along with this I'll also use langserve, which is responsible for creating my APIs: from langserve import add_routes. Through this I will be able to add all the routes; for example, one route to interact with the OpenAI API and one route to interact with Llama 2. Next, import uvicorn, which will be required here, and then import os. See, I could enable GitHub Copilot and have it write the code, but I don't think that's the better way here; I usually use an AI tool called Blackbox, which helps me write code faster and also explains the code, so I'll probably make another video about how to use it. One more thing I want to import is Ollama: from langchain_community.llms import Ollama. So that's done, all the libraries my code requires, and these are the things I need to call the OpenAI API. Now I'm going to write os.environ and initialize my OpenAI API key first: os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY"). That's the first thing we need to do; and before this, I'll go back to my earlier app.py and copy the initialization:
load_dotenv. Let me quickly copy that snippet, paste it here, and call load_dotenv(), which will initialize all my environment variables. Perfect; I've also loaded my OpenAI API key. Now let's set up FastAPI. To create the FastAPI app, I instantiate it with a title of "Langchain Server", version "1.0", and a description as the third piece of information. After this I can take the app and keep adding my routes with add_routes. The first time we add a route, we give all the information: the app, then, say, ChatOpenAI() as the runnable, and third the path; so this is one of my routes, which you can consider the OpenAI route, with the model I'll be using. That's just one way to add a route, but let me add some more, because, at the end of the day, when we created our first application we combined prompt, LLM, and output parser, so while creating routes we also want to integrate our prompt template with them. So here I write model = ChatOpenAI(), initializing that model, and then let me create my other model as well: with Ollama I'll just use Llama 2, so llm = Ollama(model="llama2"), because I want to use multiple models; this is my one model here and my other model there. Now let me quickly create prompt one: prompt1 = ChatPromptTemplate.from_template(...), where I give one chat prompt. For the first interaction I want to use the OpenAI API, and let's say I want an essay: "Write me an essay about {topic} with 100 words", where the topic is whatever I supply. That's my first prompt template. Then I'll create my second prompt template; this is important, just hear me out: this one will be responsible for interacting with my open-source model, so I'll say "Write me a poem about {topic} with 100 words". In short, prompt one will interact with the ChatOpenAI model, and prompt two will interact with the llm, i.e. whatever LLM I wrote there, which is Llama 2. Now let me add these routes: add_routes, with the app as usual, and here, see, I will combine the prompt with the model (prompt1 | model, the ChatOpenAI one, since prompt1 is specifically for it), and then I also give the route path; my path will be
something like this: "/essay", and this will be my API path. That means this route is responsible for interacting with the OpenAI API and is denoted by /essay, so the API URL you get will end with /essay. Then the other route, and I can keep adding as many routes as I want: add_routes with prompt2 | llm, and let me write this path as "/poem". So that's my other route, and in short I've created two APIs. Finally I write if __name__ == "__main__":, which is the starting point of the application, and there I say uvicorn.run(app, host="localhost", port=8000). I'm giving you localhost here, but you can run this application on any server you want; that is the reason I say this is the first step toward building that production-grade application. And the port number will be 8000. Perfect, this is done; it looks really amazing, we have written the complete code, and this is my entire app.py with all the APIs. That means, looking back at the diagram, I've created the API part: routes together with prompt templates, using two routes, one for OpenAI and one for Llama 2.
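For reference, the complete app.py walked through above might look like the following sketch; the description string is a placeholder, and the imports follow what was typed in the video (newer LangChain versions would use langchain_openai and langchain_core instead):

```python
import os

import uvicorn
from dotenv import load_dotenv
from fastapi import FastAPI
from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatOpenAI
from langchain_community.llms import Ollama
from langserve import add_routes

load_dotenv()
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")

app = FastAPI(
    title="Langchain Server",
    version="1.0",
    description="A simple API server",  # placeholder description
)

# A bare route that exposes the chat model directly
add_routes(app, ChatOpenAI(), path="/openai")

model = ChatOpenAI()
llm = Ollama(model="llama2")

prompt1 = ChatPromptTemplate.from_template("Write me an essay about {topic} with 100 words")
prompt2 = ChatPromptTemplate.from_template("Write me a poem about {topic} with 100 words")

# Each route serves a prompt | model chain as a REST API
add_routes(app, prompt1 | model, path="/essay")  # paid OpenAI model
add_routes(app, prompt2 | llm, path="/poem")     # local open-source Llama 2

if __name__ == "__main__":
    uvicorn.run(app, host="localhost", port=8000)
```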
Now my task is to create the web app side; here I'll just build a simple web app, because it will be able to interact with these APIs. But first, let's run this and see whether everything works. I'll go to my api folder: cd api, save everything, and run python app.py. Once I execute it, there's an import error on the prompts line; let me check the documentation whether it is correct; it should be langchain.prompts, a small typo, fixed. Let me execute again; now there's an import error for sse_starlette, one of the dependencies that comes up, so I'll quickly update my requirements.txt and run pip install sse_starlette. Once that installation happens, I think we won't get an error, and indeed python app.py is now running. Let's open localhost; here I'm getting "detail: Not Found", but if you want to see the entire API that was created, just append /docs. Now you can see the entire Langchain Server: the openai route with its input schema, essay is there, poem is there, all the APIs are created; for each one you can see the input schema, the output schema, everything that's required. This is what's called the Swagger UI documentation, and it's generated for us. Perfect, the API part is created; now, how do we interact with these APIs? That is the most important part. For that, let me keep this terminal hidden and create my client.py. With respect to client.py, let me first disable that extension so it doesn't disturb things; I'll show you this extension another time, it's very powerful. Now, this client.py is my app; it can stand in for the web app. First I'll import requests, and after that, import streamlit as st; two libraries, requests and streamlit, because I'm creating a web app front end that will interact with the API. Here I'll write the first of two important functions: get_openai_response(input_text). Inside, I'll use the requests object to create my response; to call the API I write requests.post, and first I need to give the URL. We already have everything running on localhost, and if you look at the Swagger UI, it provides the URL to use: say I want to call the essay route, which uses the OpenAI API; the URL is http://localhost:8000/essay/invoke. That is the full API URL responsible for calling this route, and we also know we need to give one input, so I pass json= with my input in the form {"input": {"topic": input_text}}. Why "topic"? Because, if you look at the prompt we defined, "topic" is the variable there, and that topic will be whatever input text I pass to this function.
That is the input I give, and it will hit that URL with this input via POST; then I finally get my response. I take the response, convert it to JSON, and it's a dictionary of key-value pairs where my output content sits under the key "output", and inside that I also need "content"; inside that key my entire response is available. The next function is similar, but for the Ollama response: get_ollama_response(input_text), again with requests.post, except instead of /essay/invoke we use /poem/invoke, because /poem is responsible for interacting with Llama 2; then I take the JSON and return response.json()["output"]. Those are my two functions for getting responses. Now let me create the Streamlit part; from here everything is simple. st.title is there, and I've created two text boxes, input_text and input_text1: for the first I say "write an essay on" whatever topic, and for the second "write a poem on" some other topic. The first one will interact with the OpenAI application and the second with Llama 2. Finally I call the two functions I created: if input_text is filled, it should call the OpenAI response function and display whatever response comes back; if input_text1 is filled, I call the Ollama response function, which talks to Llama 2 on its URL, and display that output. So this is my entire application, which interacts with the API that is already running. Now, with respect to client.py, I'll open another command prompt and run it: conda deactivate, then conda activate venv. python client.py, sorry, it should be streamlit run client.py; "file does not exist", okay, I need to cd into my api folder first; now streamlit run client.py. So this is my Swagger UI with the API running, and this is my front-end application that will now interact with those APIs. If I type in the first text box, it will hit the OpenAI API, so I'll say "write an essay on machine learning". Here I get "NameError: get_openai_response is not defined"; why this error? Let's see: oh, just a second, this function name doesn't match what I call, and the Ollama response name doesn't match either; now I think it should work, let's reload. Now you can see it: "machine learning is a powerful technology that enables computers..." and so on; the first text box is interacting with the OpenAI API. Let me give the same thing for the poem: "write a poem on machine learning"; when I execute it, this one interacts with Llama 2: "machine learning is a field of AI that allows computers to make decisions...", or is it still running?
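For reference, the client.py just demonstrated might look roughly like this sketch; the title and text-box labels are paraphrased from the video:

```python
import requests
import streamlit as st


def get_openai_response(input_text):
    # Hits the /essay route, which is backed by the OpenAI chat model;
    # chat-model output arrives as a message, so we dig into "content".
    response = requests.post(
        "http://localhost:8000/essay/invoke",
        json={"input": {"topic": input_text}},
    )
    return response.json()["output"]["content"]


def get_ollama_response(input_text):
    # Hits the /poem route, backed by local Llama 2; plain-LLM output is a string.
    response = requests.post(
        "http://localhost:8000/poem/invoke",
        json={"input": {"topic": input_text}},
    )
    return response.json()["output"]


st.title("Langchain demo with LLM APIs")
input_text = st.text_input("Write an essay on")
input_text1 = st.text_input("Write a poem on")

if input_text:
    st.write(get_openai_response(input_text))
if input_text1:
    st.write(get_ollama_response(input_text1))
```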
Sorry, that was the first response I got from them. Okay, it's running, I guess; let me reload it: "write a poem on machine learning". And I can also write my own custom prompt if I want; it responds with something like "everybody, I was thinking that we could make it a bit interesting by adding some extra challenges, so what do you say", and so on. If I look at the trace, the prompt for input_text1 is going through here with my input text; perfect, everything works, and here you can see "write a poem about this particular topic for a five-year-old child". Now let me change the prompt and see whether it still works; it has reloaded, so: "write a poem on unicorn", just writing something; it should be able to give the poem: "shimmering coat in various colors... here are 100 words that describe a unicorn", some such information, and this is interacting with the entire Llama 2. Hello guys, we are going to continue the LangChain series, and in this video and the series of upcoming videos we are going to discuss the RAG pipeline. RAG's full form is retrieval-augmented generation, one of the very important use cases you may be solving when you use LLM models, and most of the use cases in companies right now are demanding this kind of skill set, where you can actually develop RAG. Let's say you have a set of documents: you should be able to query those documents and get results quickly. If you have different kinds of files, be it a readme file, a text file, or some other data source file, you should be able to read them, convert them into vectors, and retrieve answers to any query from those data sources. As I said, we will implement this completely from scratch, from basics to advanced; in these videos we'll understand the entire architecture and then do a lot of practical intuition. In any RAG pipeline, these are the major components that exist. The first component is called "load data source". RAG is all about querying across different data sources altogether: we may have PDF files, MD (readme) files, Excel files, txt files, database files, all sorts of files. This first step, load source data, is also called data ingestion. Now, the most amazing thing in LangChain is that it definitely has a lot of different data ingestion tools that can load data in various ways. In our current video we are going to implement every one of these components, from data ingestion up to querying the vector store. After ingesting the data, you can load, transform, and embed; we'll discuss what exactly load, transform, and embed are. Loading is nothing but reading from a specific data source; then, if we want, we perform a kind of feature engineering in the transform stage, where the complete data is broken into smaller chunks. Now, why do we divide the data into smaller chunks?
It is very important to understand: every LLM model we use has a limited context size, so it is always good practice to take the entire data, which may be many PDFs with many pages each, and divide it into chunks; we'll see this practically too. Then finally we do embeddings, which simply means converting all those chunks into vectors. All these vectors are then stored in some kind of vector store database, and the main point of that database is that we can query it efficiently: whatever vectors are stored there, if I hit the database with a query, I should get results based on the context of the query. That is the entire RAG pipeline. In the upcoming series of videos we will use several different vector databases, on different clouds too, but right now we'll implement everything you can see in this architecture.

So let me quickly open VS Code. I will continue with the same setup I've been using; as I said, everything, including all the projects I've created, is linked in the description of this video via the GitHub repo. I have created a folder called rag, and inside it I'll create one notebook file; initially we'll work in a notebook, and later on we'll build an end-to-end project. So I'll name it simple_rag.ipynb.
Since we are doing this completely from basics, it's good to build it up from scratch. First of all, let me check in my terminal whether I have ipykernel installed: pip install ipykernel (ipykernel is what provides the Jupyter notebook kernels). The requirement is already satisfied, so we're good to start coding. Then I select my kernel, the Python 3.10.0 environment I created earlier. When I execute something it gives me some kind of error, and I can see a message from an extension, so I disable that extension (I had already uninstalled it) so it won't disturb us; I close the window, reopen VS Code on this folder, and execute once again. Perfect, that's sorted.

Now let's start our coding. Initially I'll show you multiple data ingestion steps: say I want to read from a text file, from a PDF file, or from a web page, how do I do it? First I import from langchain_community.document_loaders: the document loaders module has all the loading techniques, whether you want to load a PDF, an Excel file, or a txt file, they are all in there. I'll close the terminal since I won't need it here. The first loader to discuss is TextLoader itself: I create loader = TextLoader("speech.txt"), passing it a text file. Do we have a speech.txt file? No, so let me create one; for content I'll paste in one of the most famous speeches in history, which I googled and saved into speech.txt. Then, just by using this TextLoader, I can read speech.txt, and loader.load() converts the whole thing into text documents.
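In code, the whole text-file ingestion step is just a couple of lines (a sketch; speech.txt is whatever file you saved):

```python
from langchain_community.document_loaders import TextLoader

loader = TextLoader("speech.txt")   # point the loader at the local text file
text_documents = loader.load()      # returns a list of Document objects
text_documents
```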
Executing it, you can see it has read the speech and holds the entire text as documents; it becomes that easy to read a txt file. Next, since I'm also going to use OpenAI keys (or, alternatively, Ollama embeddings and models), I import os and call load_dotenv, so from dotenv import load_dotenv, which lets me read all my environment variables; one of those variables is my OpenAI API key. Then I write os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY"), which pulls the OpenAI API key out of the environment.
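That environment setup looks roughly like this (a sketch, assuming a .env file in the project root containing OPENAI_API_KEY):

```python
import os
from dotenv import load_dotenv

load_dotenv()  # read variables from the .env file

# re-export the key so any library reading the environment can find it
os.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY")
```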
Now a very important step: one more data ingestion technique, reading directly from the web with a web-based loader. Again I import from the same document_loaders library, but instead of TextLoader I use WebBaseLoader. Along with it I import bs4 (BeautifulSoup 4), and I add bs4 to my requirements.txt so that it installs whenever this project is set up. The goal is to load, chunk, and index the content of an HTML page. I create my loader with WebBaseLoader and two important parameters. The first is web_paths, where I give the URL; I'll take a github.io page that is also used in the LangChain documentation. Let me quickly open it in a browser to show you: yes, this is the page, and I want to read this entire page and use it in a RAG system. The second parameter is bs_kwargs, the BeautifulSoup arguments; following the documentation, I pass a dictionary whose first entry is parse_only, set to a bs4 SoupStrainer. The SoupStrainer is needed because, since we are using BeautifulSoup, we have to tell it which CSS classes to read from that particular page. Which classes are there? If I inspect the page, I can see there is a post-title class, a post-content class where all the content lives (let me take that one first), and a third one, post-header, which also carries some information. So the loader is created with these arguments, separated by a comma. Let's see whether it executes: bs4 is not found, as expected, so I open my command prompt and run pip install -r requirements.txt; bs4 gets installed, and once that completes, executing again works. Then I write loader.load() and collect the result as text_documents. One error: "perhaps you meant...", because the URL has to be given as a tuple, so I add a trailing comma, and now it works; text_documents contains all the information from that web page in the form of documents, the page content is all there, and everything looks fine. One more catch: the class names use a dash, not an underscore, so post-title rather than post_title; after fixing that you can see even more content. So these are some of the ways to pull all the content from a page. One more way is to read directly from a PDF itself, so let me quickly create and upload a PDF here: attention.pdf.
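Putting the web ingestion together, a sketch (the URL here is the blog-post example used in the LangChain docs, which is my assumption for the page shown in the video; the class names must match the page's actual CSS classes, dashes and all):

```python
import bs4
from langchain_community.document_loaders import WebBaseLoader

loader = WebBaseLoader(
    # note: web_paths expects a tuple/sequence of URLs
    web_paths=("https://lilianweng.github.io/posts/2023-06-23-agent/",),
    # bs_kwargs is forwarded to BeautifulSoup; parse only these classes
    bs_kwargs=dict(
        parse_only=bs4.SoupStrainer(
            class_=("post-title", "post-content", "post-header")
        )
    ),
)
text_documents = loader.load()
```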
This is nothing but the "Attention Is All You Need" paper. Using a document loader you can read directly from the PDF as well; we are still in the data ingestion phase. From the same document_loaders library I take PyPDFLoader (one more library dependency will be needed, which we'll install), so loader = PyPDFLoader("attention.pdf"), since the PDF sits in the same folder. I expect an error here, and yes: I need pip install pypdf, so I add pypdf to requirements.txt and run pip install -r requirements.txt. You really need to take care of these dependencies, and RAG is something that really needs to be built as a proper application, which is why I'm creating this completely from scratch. With pypdf installed, I execute again, then the same step as before: loader.load() gives me my docs, and you can see the entire PDF has been read.

Now the load-data-source part is done; we know how to load from a PDF, from the web, and from a text file, and if you check the LangChain documentation there are still more amazing loaders: for Excel files, README files, even entire directories containing many files. We'll see more examples in upcoming videos, but this gives you the idea, and loading is done. Next we move to the transform step, which is very important. You have the entire PDF as documents; now you need to convert those documents into chunks. There are multiple ways to do it, and they all fall under the text splitters in LangChain. So I write from langchain.text_splitter import RecursiveCharacterTextSplitter; we'll split recursively by characters. I create text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200): I want chunks of 1,000 characters with an overlap of 200. Once I do this, text_splitter.split_documents() will be responsible for splitting whatever documents I give it.
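As a sketch, the PDF load plus this transform step in one place:

```python
from langchain_community.document_loaders import PyPDFLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter

# requires `pip install pypdf`
loader = PyPDFLoader("attention.pdf")
docs = loader.load()                      # one Document per PDF page

# break the pages into 1,000-character chunks with 200 characters of overlap
text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
documents = text_splitter.split_documents(docs)
documents[:5]                             # inspect the first five chunks
```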
So I pass in my docs, and the result is my final documents. Let's display the top five: you can see "provided that proper attribution is provided, Google hereby grants permission...", the best-model text and so on, all from attention.pdf, so all the information is there; you can also view the full list of documents, with the Transformers content and everything. So the entire set of documents has been divided into proper smaller chunks. Now we can take these chunks and convert them into vectors, which is exactly what the diagram shows: we took the entire PDF, transformed it into chunks, and now it's time to convert those chunks into vectors. For this we'll use vector embedding techniques, and the one I'll show first is OpenAI's.

So now, vector embeddings and the vector store. Vector embedding is a technique by which we convert text into vectors. How do we do that? With OpenAI here: from langchain_community.embeddings import OpenAIEmbeddings. You can also use Ollama embeddings, it's up to you; if you don't have an OpenAI API key you can use Ollama embeddings directly, but the performance of OpenAI embeddings is far better than Ollama's. After creating the vectors we also need to store them somewhere: the vector store, which is like a database. Embedding converts the text into vectors, but those vectors then need to live in some kind of vector store, and for that we'll use Chroma DB. There are a couple of vector databases wrapped by LangChain itself, one is Chroma, one is FAISS, and as we go ahead I'll also show you how to create this kind of vector database in the cloud. So from langchain_community.vectorstores I import Chroma, and then create my DB: db = Chroma.from_documents(...), where I give my documents (not the entire set, just the first 20, because creating the embeddings takes time) and the embedding to use, OpenAIEmbeddings(). Once I execute this I expect an error saying Chroma isn't available, since I don't know whether I've installed it: pip install chromadb is needed, so I go back to my requirements.txt.
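Once chromadb is installed, the embed-and-store step is just this (a sketch; only the first 20 chunks are embedded to keep the demo cheap and fast):

```python
from langchain_community.embeddings import OpenAIEmbeddings
from langchain_community.vectorstores import Chroma

# requires `pip install chromadb`; reads OPENAI_API_KEY from the environment
db = Chroma.from_documents(documents[:20], OpenAIEmbeddings())
```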
The reason I'm showing you all of this completely from scratch is so that whatever error you face, you are able to fix your setup. So in requirements.txt I add chromadb, and one more library that I'll be using: faiss-cpu, because FAISS is another vector database like Chroma. Then pip install -r requirements.txt installs both; it takes some time depending on your internet speed and how fast your system is, but faiss-cpu and chromadb are the vector databases we'll be using. One assignment I'll give you: also try the LanceDB vector database, again by following the LangChain documentation; you can do it. At the end of the day, what are we doing here? We imported OpenAIEmbeddings, the vector database we're using is Chroma, and Chroma.from_documents takes the documents and, using the embeddings, stores the vectors inside the vector store. This DB we can keep locally or in the cloud, wherever you want. The installation has completed, so let's execute the Chroma part: the entire embedding happens (again using the OpenAI embeddings), and this db is now our vector database. Now all I have to do is query this vector database to retrieve any result I want. So let me write my query first: "Who are the authors of Attention Is All You Need research paper?" Let's see whether it understands this and gives us the result. I'll use db.similarity_search; there are multiple options, like similarity_search and similarity_search_by_vector, and if you want you can convert your query into vectors yourself using the OpenAI embeddings and query with those, but since we're querying with plain text we'll use similarity_search.
So result = db.similarity_search(query), and when I execute it I can see multiple pieces of information: four documents came back. Let's take the first document and see how good the result is: I index [0] and read the field page_content, and here you can see "provided that proper attribution is provided, Google hereby grants permission...", the Google Brain affiliation, all the email IDs, all the researchers' names. That means it is able to retrieve the names of all the scientists and researchers involved in creating this paper. I can also try one more: "what is Attention Is All You Need" (and understand, this is an entire RAG pipeline retrieving from the ingested documents), and a result comes back for that too. Let me query something written verbatim in the research paper: I open the paper, pick a passage, and search "what is an attention function". Executing it, I get "instead of performing a single attention function with the model dimension, values, keys..." and all the related information, a very good result, and it is coming directly from the vector database, which is the most amazing thing here.

Now you may be thinking: Krish, can we also use the FAISS database we installed? Let me show you that as well; this will also give you an idea of how to store the embeddings in a FAISS database. I write from langchain_community.vectorstores import FAISS, and then db1 = FAISS.from_documents(documents[:20], OpenAIEmbeddings()), the same pattern, again embedding just the first 20 documents, stored in a second database, db1, which is my FAISS database. I had a spelling mistake in from_documents; after fixing it, db1 is ready. Then I take the same query, paste it, and call db1.similarity_search, and you can see "instead of performing a single attention..." comes up again. So I've shown you examples of both FAISS and the Chroma vector database; as I said, one assignment for you is the LanceDB vector database, go search for it in the LangChain documentation.
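The FAISS variant, including the similarity search, is nearly identical (a sketch; requires faiss-cpu):

```python
from langchain_community.embeddings import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS

# same pattern as Chroma: embed the first 20 chunks and index them with FAISS
db1 = FAISS.from_documents(documents[:20], OpenAIEmbeddings())

query = "What is an attention function?"
result = db1.similarity_search(query)   # returns the most similar chunks
print(result[0].page_content)
```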
Hello guys, we are going to continue the LangChain series, and now we will develop an advanced RAG pipeline using the retriever and chain concepts available in LangChain. Before I go ahead, I want to share a funny incident that happened just today. Every morning I go and play badminton for an hour or an hour and a half, and a lot of my friends and neighbors come to play. Today, after I'd played three or four games, one of my neighbors said, "Hey, Krish, right? I only just recognized you; your look has completely changed." Let me know in the comments of this video if that's true, but I'm liking this look: much less maintenance, no beard, mustache, or hair to look after. It looks super cool.

Now let's work on the project. In our previous tutorial we created a simple RAG pipeline: we had data sources like a PDF and a website, we loaded that data using the different data ingestion techniques available in LangChain, we did the transformation where we broke our bigger PDFs into chunks, we converted those chunks into vectors and stored them in a vector store, and with a query we were able to retrieve data from the vector store. That was step one. The next step is this: plain vector-store queries are not that efficient at retrieving complete results, so here we will also use LLM models. What we'll do is use prompts, and combine them with the retrieved data using the concepts of chain and retriever. Understand that this topic is very important, because this is where your advanced RAG pipeline implementation starts. With chains and retrievers we plug in an LLM model, open source or paid, whatever model you want, and based on the prompt we try to get exactly the response we're looking for, with a lot of customization possible. In the first part we discussed the simple pipeline; in this second part we'll see how to use a chain, how to use a retriever, what exactly a chain is, how to integrate the LLM model, and what the concept called "stuff documents chain" is; we'll discuss all of it and do a practical implementation, so please watch this video till the end.

The first step we have already implemented in the previous tutorial: I read attention.pdf, which is present in a folder, call loader.load(), and get the documents, all the documents available in that PDF.
Then, from langchain.text_splitter, we use RecursiveCharacterTextSplitter to convert the entire document into chunks, with a chunk size of 1,000 and an overlap of 20; all of this was implemented in my previous videos. We split the whole document and save it into documents. Now we take all these documents and convert them into a vector store, and for that we're using FAISS. Here we can use Ollama embeddings or OpenAI embeddings; as I said, OpenAI embeddings are more advanced and will perform better than Ollama's, but if you don't have an OpenAI API key, just write Ollama embeddings instead. So I use FAISS, again a kind of vector store, developed by Meta: FAISS.from_documents with the first 20 documents (let's make it 30 so we have a decent amount of data) and the OpenAI embeddings. This db is my vector store; you can see it's of type FAISS. And any question I ask, like "an attention function can be described as mapping a query...", I can answer by taking the vector store and calling .similarity_search on the query to get the result. All of this we did in our previous video.

Now the most important part: how do I combine a prompt with chains and a retriever and get a response based on that prompt? Since many people only have access to open source LLM models, I'm going to use Ollama: from langchain_community.llms import Ollama, with model llama2. If you don't have Llama 2, go to the command prompt after downloading Ollama (I hope everybody following this series knows how to download it) and write ollama run llama2; the model gets downloaded, and if it's already downloaded you'll see a prompt like this. That's the first step. Then llm = Ollama(model="llama2"), so this llm is my open source model, served through Ollama.
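Loading the local model is a one-liner through Ollama (a sketch; assumes `ollama run llama2` has already pulled the model):

```python
from langchain_community.llms import Ollama

llm = Ollama(model="llama2")   # talks to the local Ollama server
llm.invoke("What is attention in transformers?")
```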
Now it's time to design our prompt template. To create the chat prompt template I use from langchain_core.prompts import ChatPromptTemplate, and then ChatPromptTemplate.from_template, where I write: "Answer the following question based only on the provided context." See, we are building a Q&A chatbot that should answer from the context. Previously, if you look at the earlier code, we queried the vector store directly using the similarity-search algorithm; here instead we define our own prompt saying "answer the following question based only on the provided context", and, just for fun, "I will tip you $1,000 if the user finds the answer helpful"; maybe with money on the table the AI performs better. Then we give our {context}, and the question comes in as {input}. Why write it this way? Because with the chain and retriever, this context will be auto-filled, and this input will also get auto-filled; I'll show you how. Let me execute this.
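The template dictated above looks roughly like this in code (a sketch; the exact wording and the <context> tags are my reconstruction of what's typed in the video):

```python
from langchain_core.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template("""
Answer the following question based only on the provided context.
I will tip you $1000 if the user finds the answer helpful.
<context>
{context}
</context>
Question: {input}
""")
```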
Now let me implement the chain. It's always a good idea to go to your browser and look up each topic I explain. What does "chain" refer to? Chains refer to a sequence of calls, whether to an LLM, a tool, or a data pre-processing step, and the primary supported way to build them is LCEL. There are multiple chain functions, and the one I am going to use is create_stuff_documents_chain. What exactly does it do? "This chain takes a list of documents and formats them all into a prompt, then passes that prompt to an LLM." That means: into the context slot go my documents, everything available in the vector store, and the input is the question I ask. The stuff documents chain takes all those documents, stuffs them into the prompt template, and sends it to the LLM; it passes all the documents, so you should make sure they fit within the context window of the LLM you are using. Then we finally get the response. There are other variants too, like create_sql_query_chain if you're working with a SQL database through natural language, a very important project I'll also do in the future; one way or another I'll use each of these functionalities with a proper use case to show you how they work.

So I write from langchain.chains.combine_documents import create_stuff_documents_chain; is it inside community? No, sorry, it's under langchain.chains. How do I know this? I didn't make it up; I've read the documentation, which is why I'm writing it this way. Then I create my document chain using create_stuff_documents_chain, which, as we've seen, requires two things: the LLM model and the prompt I created, because the list of documents coming from my vector store will be added into that prompt. Once I create this, it becomes my document_chain; very simple.

After this, we have to learn one more very important thing: retrievers. What exactly is a retriever in LangChain? It is an interface that returns documents given an unstructured query. It is more general than a vector store: a retriever does not need to be able to store documents, only to return (retrieve) them, and a vector store can be used as the backbone of a retriever. Think of it like this: there is a vector store holding vectors, and if I want to take data out of it I can do a similarity search, as we've already seen. But LangChain, in the usual programming style of classes and interfaces, created a separate interface called a retriever, and that interface has the vector store as its backend source; whenever a query is given, the vector store passes its information through this retriever. I've also written some notes here so this material is helpful whenever you go back and check it. So I just write db.as_retriever(), and this becomes my retriever: db is our vector store, and we've connected it to an interface held in this variable. If you display it, it is a vector store retriever, and internally you can see it wraps FAISS and the OpenAI embeddings. Now the retriever is done and the chain is also done; next I'll use the retriever and the document chain together, because only when we combine both of them do we get the response.
So now the retriever chain. Since I need to combine the two, the retriever and the document chain (the document chain is responsible for putting the information into the context), combining them gives a retrieval chain. What is the definition? This chain takes a user inquiry as input, which is then passed to the retriever to fetch the relevant documents; those documents are then passed to an LLM to generate the response, and that LLM comes from the document chain. Understand the flow; let me lay it out again so it becomes very easy for you. This is my user: whenever the user asks any inquiry, first it goes to the retriever, which is the interface to the vector store that has all the information. Once we get past the retriever, it goes to the LLM model along with some prompt, and that happens through the stuff documents chain, which already combines both of these, the LLM and the prompt. Then finally we get our response. I hope you're able to follow; that is exactly what we've implemented here.

Now, how do we create a retriever chain? Again there's a library function for it, under chains: from langchain.chains import create_retrieval_chain (there are lots of things in there, create_qa_with_sources_chain, retrieval chains, this chain, that chain; don't worry, I'll find the right use case for each and showcase all of them). Then retrieval_chain = create_retrieval_chain(retriever, document_chain): the first parameter is the retriever, the second is the document chain. Once I have this chain, I can invoke any query: retrieval_chain.invoke with my "input" parameter, and as the input I'll paste some text taken from attention.pdf. If I invoke this, I get the entire response.
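Here is the whole chain wired together, as a sketch (db, llm, and prompt are the FAISS store, Ollama model, and template from above; the query text is a stand-in for whatever you paste from the PDF):

```python
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain.chains import create_retrieval_chain

# stuffs the retrieved documents into {context} and calls the LLM
document_chain = create_stuff_documents_chain(llm, prompt)

# expose the vector store through the retriever interface
retriever = db.as_retriever()

# input -> retriever -> documents -> document_chain -> answer
retrieval_chain = create_retrieval_chain(retriever, document_chain)

response = retrieval_chain.invoke({"input": "Scaled Dot-Product Attention"})
print(response["answer"])
```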
Again, we are using the open source LLM, Llama 2 (though I've used the OpenAI embeddings for the vectors). Here you can see this is my input, this was the context, all the context information is there, and finally I get the answer. I'll save the output as response and then print response["answer"], which becomes my final output: "the answer to the question is..." with all the details retrieved. You can see how beautifully, with the help of the LLM, the chains, and the retriever, we've constructed this whole thing; this is the first step towards developing your advanced RAG pipeline. Let me show you some more examples: I open the paper, pick a passage about the decoder, change my input, and invoke the chain again; the response says the answer to this question is six, "the decoder is also composed of a stack of N = 6 identical layers", so it has picked that up from the paper. Let's take one more: "scaled dot-product attention". First an error on my side, and then, no, it wasn't a mistake: it takes the question and forms an answer out of it, "scaled dot-product attention is a type of...", and I'm getting all the answers here.

We are going to continue the LangChain series, and in this video we are going to develop an amazing multi-search agent RAG application. What is the project all about, and what new things are we going to learn? I think this is one of the most amazing capabilities recently added to the LangChain library: modules that can make your entire conversational chatbot far more powerful. Let me explain what we're going to build. Suppose I want a GenAI-powered LLM application, and it depends on other open source platforms and websites as data sources: like arXiv, where all the research papers are available, or Wikipedia, which also has a huge amount of content; and on top of that, maybe my own company's PDFs, from which I also need to build Q&A. So you have multiple data sources, and you really want to integrate all of them behind a wrapper so you can deliver this entire Q&A solution. Along the way you'll learn some terms that are becoming very popular right now: tools. What are tools in LangChain?
What are agents in LangChain? What are toolkits? We'll learn about toolkits too, and how to create a wrapper on top of them. To give a small introduction to tools: each platform I depend on can be used as a separate tool, so that I can ask questions of that platform directly; along with that, I may have my own customized PDFs, my own custom data, converted into vector embeddings. I'll wrap all of this up in the form of a toolkit, and then with the help of agents I'll be able to execute any Q&A search I want. Yes, I've thrown a lot of terms and topics at you, but let's see how we can implement it; there will be a lot of learning in it and you'll definitely get many things out of it.

Let me quickly open my code file; I'll continue in the same project I implemented earlier. One more important point: I've uploaded five videos so far, and each is a different application; that is how you should learn these topics, so you see an efficient way of how things actually get used. As usual, let me search for "LangChain tools". Tools are interfaces that an agent, a chain, or an LLM can use to interact with the world. If I want data from other data sources, or have other queries, you could even include a Google Search API; a lot of tools are provided by LangChain itself, we just create a wrapper and then we can have a conversation with them. You can see all the built-in tools LangChain ships: Alpha Vantage, ArXiv, AWS Lambda, Bing Search, Brave Search, ChatGPT plugins, the DALL-E image generator (you can even generate images), Eden AI, file system, Golden Query, Google Finance, Google Jobs, Google Lens, Google Places, Google Scholar, Google Search, Google Serper, and so on; you can use any of these tools to extract the data you want. Say I'm planning to create a chatbot that is research-oriented, where I'll ask questions related to different research topics: besides that, I'll create a wrapper on top of Wikipedia, and also bring in my own custom PDF files. You can use whichever of these fits your use case, but that's the application I'll go with. So let me quickly create a new folder named agents, and inside it a file called agents.ipynb.
Inside this file I'll go step by step so you can see how everything works and how I'm building the whole thing. The first thing, as I said, is that I'll need arxiv, so pip install arxiv (a typo with a double s at first, no worries) and it installs into my venv environment. While that's installing, let's create a wrapper on top of Wikipedia. I write from langchain_community.tools import WikipediaQueryRun; all the tools within LangChain live inside langchain_community, and let's see whether this works. Along with it, from langchain_community.utilities import WikipediaAPIWrapper, a wrapper class on top of it. So I'm using two things here: WikipediaQueryRun and WikipediaAPIWrapper. For Wikipedia you don't need a separate API key; LangChain already takes care of that. I configure WikipediaAPIWrapper with top_k_results=1, meaning give me just the top result, and doc_content_chars_max=200, so I get at most 200 characters from whatever search I do against Wikipedia (you can increase it). This api_wrapper is, in short, what interacts with Wikipedia to fetch that many results; it's just configuration. Next is my tool: I initialize the WikipediaQueryRun tool with api_wrapper set to the wrapper I defined above. I get an error, "could not import Wikipedia python package, please install it with pip install wikipedia", so in requirements.txt I add wikipedia, plus arxiv as I said, open my terminal, and run pip install -r requirements.txt so both libraries I want get installed. The underlying packages have to be available, and LangChain integrates with them; that's the amazing part. Now we shouldn't get an error, and it looks good: the tool shows up as a WikipediaQueryRun.
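As a sketch, the Wikipedia tool setup is:

```python
from langchain_community.tools import WikipediaQueryRun
from langchain_community.utilities import WikipediaAPIWrapper

# requires `pip install wikipedia`; no API key needed
api_wrapper = WikipediaAPIWrapper(top_k_results=1, doc_content_chars_max=200)
wiki = WikipediaQueryRun(api_wrapper=api_wrapper)
wiki.name   # -> "wikipedia"
```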
If I write tool.name, you'll see "wikipedia" as the tool name. So that's one tool. Similarly, I'll also take a website (or a PDF, whatever it is) and read it, treating it as another data source. Let me create another tool: I'll go to this website, whose URL I've just copied, the LangSmith documentation page, and I'll retrieve the content from it. To read the content from this website I'll use WebBaseLoader, which you already know about since we discussed it as one of the data ingestion techniques: from langchain_community.document_loaders import WebBaseLoader. Along with it, since I'm reading content, I'll use FAISS, OpenAIEmbeddings, and RecursiveCharacterTextSplitter, so let me copy those imports over; that way I can also divide everything into chunks. Why are we doing this? Because this is my own custom data; consider it my own custom data that I need to convert into vectors. So I create my loader with WebBaseLoader and give it the URL I copied, which will read that entire page.
Then loader.load() loads the entire content from that website into docs. After getting the docs, the next step is the RecursiveCharacterTextSplitter, with chunk_size=1000 and chunk_overlap=200, a default configuration I've taken here, and .split_documents(docs) splits all those documents, finally giving me my documents; we've done this many times, including in previous tutorials. Then I create my vector DB: vectordb = FAISS.from_documents(documents, OpenAIEmbeddings()). And later, if I want to turn this vector database into a retriever, all I have to write is vectordb.as_retriever(); if you don't know what a retriever is, it's the interface that retrieves results from the vector database. So that's my second important piece: I have one tool (for Wikipedia) and one retriever, and I also installed arxiv, which I'm going to use for the same purpose in a moment.

Now the vector store retriever is ready, and the next thing I'm going to do is take this retriever and pass it through create_retriever_tool, because if I really want to ask questions through it, I need to turn the retriever into a tool. If you search the LangChain docs for create_retriever_tool, you'll see: "create a tool to do retrieval of documents". That's what we'll use (and you can also check out all the agent helpers available there, which will help you implement things). So: from langchain.tools.retriever import create_retriever_tool. I initialize it with my retriever, then a name, "langsmith_search"; since it wraps the LangSmith page, that's how we can identify which tool is being hit or searched. The third parameter is like a prompt describing what I want this tool to do; I'll copy it from an earlier project: "Search for information about LangSmith. For any questions about LangSmith, you must use this tool."
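Gathered into one place, the retriever tool looks roughly like this (a sketch; the docs URL is my assumption, use whatever LangSmith documentation page you copied):

```python
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.embeddings import OpenAIEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.tools.retriever import create_retriever_tool

# load and index the LangSmith docs page as custom data
loader = WebBaseLoader("https://docs.smith.langchain.com/")
docs = loader.load()
documents = RecursiveCharacterTextSplitter(
    chunk_size=1000, chunk_overlap=200
).split_documents(docs)

vectordb = FAISS.from_documents(documents, OpenAIEmbeddings())
retriever = vectordb.as_retriever()

retriever_tool = create_retriever_tool(
    retriever,
    "langsmith_search",
    "Search for information about LangSmith. "
    "For any questions about LangSmith, you must use this tool!",
)
```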
So create_retriever_tool builds a tool that searches that particular page. I store it as retriever_tool and it initializes; now I have a second tool. One is Wikipedia, and this one, if you check retriever_tool.name, is langsmith_search. The third tool I'm going to create is for arXiv, the platform where all the research papers are uploaded. Again the same pattern: an API wrapper, where my maximum document content is again 200 characters, and a query run tool on top of it; arxiv.name shows the tool name. So this is an in-built tool: Wikipedia and arXiv are both provided by LangChain, and if you want to create your own custom tool, you do it the way I showed with the retriever. At the end of the day you combine all these tools, so let me do that: the first one I had just named "tool", so let me rename it wiki, and then tools = [wiki, arxiv, retriever_tool]; so tools is a list of all three.

My next aim is to query across these tools, and there are multiple ways to do that; this is where we'll use agents. What is the main purpose of an agent? Let me show you the documentation: the core idea of agents is to use a language model to choose a sequence of actions to take. In chains, the sequence of actions is hardcoded in code; in agents, a language model is used as a reasoning engine to determine which actions to take and in which order. So I've given the agent these three tools, and whenever I send any input to my LLM, the agent will reason about which of them, Wikipedia, arXiv, or the LangSmith retriever, should handle the query, and it will get the answer from that tool and give you the response.
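To recap before building the agent, the arXiv tool and the combined list look like this (a sketch):

```python
from langchain_community.tools import ArxivQueryRun
from langchain_community.utilities import ArxivAPIWrapper

# requires `pip install arxiv`
arxiv_wrapper = ArxivAPIWrapper(top_k_results=1, doc_content_chars_max=200)
arxiv = ArxivQueryRun(api_wrapper=arxiv_wrapper)

# the agent will choose between these three data sources
tools = [wiki, arxiv, retriever_tool]
```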
Now my next aim is to query from these tools, and there are multiple ways to do it — this is where agents come in. Let me show you the documentation for the main purpose of agents: the core idea of agents is to use a language model to choose a sequence of actions to take. In chains, a sequence of actions is hardcoded in code; in agents, a language model is used as a reasoning engine to determine which actions to take and in which order. I have registered three tools here, so whenever I send any input to my LLM, the agent will reason over Wikipedia, Arxiv, and the retriever tool, decide which one is relevant for the query, fetch the result from it, and give you the response.

So from langchain.agents I import create_openai_tools_agent, and I define my agent as create_openai_tools_agent(llm, tools, prompt). The prompt and the LLM are not yet defined, so let me set those up. First I quickly call my OpenAI API key and initialize my LLM — I have created a ChatOpenAI instance, and that is what I am going to use.

Next, how do I create the prompt? One way I have already shown is building it yourself with ChatPromptTemplate. But LangChain has a module called hub, where people — and LangChain itself — have uploaded some amazing generic, ready-made prompts. To call one from there, I write from langchain import hub, and then I pull the prompt by name: openai-functions-agent, under the hwchase17 username on LangChain Hub. When I first ran this I got an error — "please install langchainhub" — so that package is also required; the reason I show you all these things is that there are so many different options. The package name in my requirements.txt was slightly different, so I fixed it, ran pip install -r requirements.txt in the terminal, and the installation completed.

Now when I execute this, you can see the messages in the pulled prompt: a system message prompt template whose template says "You are a helpful assistant", along with a human message prompt template carrying the input variable — everything we used to write manually is already there. So now that the prompt, the LLM, and the tools are ready, I can create my OpenAI tools agent and use it to get responses. But to actually run this agent, we have to use something called an AgentExecutor — please understand the flow, because the flow is how you will understand all of this.
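A minimal sketch of wiring the agent together, assuming langchainhub is installed and an OPENAI_API_KEY is available; the gpt-3.5-turbo model name is my assumption for the ChatOpenAI instance used here:

```python
from langchain import hub
from langchain_openai import ChatOpenAI
from langchain.agents import create_openai_tools_agent

# LLM used as the agent's reasoning engine (model name is an assumption)
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

# Pull the ready-made agent prompt from LangChain Hub: a "You are a
# helpful assistant" system message plus the agent placeholders
prompt = hub.pull("hwchase17/openai-functions-agent")

agent = create_openai_tools_agent(llm, tools, prompt)
```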
From langchain.agents I import AgentExecutor — that is the next thing I am going to use. I create the executor with agent=agent, tools=tools (whatever tools I have given), and verbose=True so that I can see all the details whenever I get a response. This agent_executor is what is responsible for executing everything. If I inspect it, you can see all the components, and it has also added the runnable binding arguments.

Now, to execute anything, I write agent_executor.invoke() and give my input — let me say "Tell me about Langsmith". Think about where this answer should come from: in our tools we have the Arxiv platform, the langsmith_search retriever, and Wikipedia, so this agent should interact with the tool related to LangSmith. Once I execute it, you can see the response I get: "LangSmith is a platform for building production-grade LLM applications..." — and this is coming from the tool itself. At first you could not see which tool was hit, for a simple reason: I had not set the verbose parameter properly. Once verbose=True is in place, you get all the details: "Invoking langsmith_search" — it is hitting the LangSmith search tool. Let me execute once again with a different query, "Tell me about machine learning" — I don't know in advance whether it will go with Wikipedia or Arxiv, but the trace shows "Invoking Wikipedia with machine learning", and Wikipedia automatically gives you the entire result. Let me also try the Arxiv platform: I ask a question with a research paper number — what is this paper all about — and you can see it invoking Arxiv with that query and returning the response. That is why I say that agents, executors, and tools are probably the most important things whenever you develop a RAG application.
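A minimal sketch of the execution step; the three invoke calls mirror the queries tried in the video (the arXiv paper id is a placeholder):

```python
from langchain.agents import AgentExecutor

# verbose=True prints which tool the agent decides to invoke per query
agent_executor = AgentExecutor(agent=agent, tools=tools, verbose=True)

agent_executor.invoke({"input": "Tell me about Langsmith"})         # -> langsmith_search
agent_executor.invoke({"input": "Tell me about machine learning"})  # -> Wikipedia
agent_executor.invoke({"input": "What's the paper 1605.08386 about?"})  # -> Arxiv (placeholder id)
```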
Guys, we are continuing the LangChain series, and in this video we are going to create an amazing end-to-end RAG project with the help of open-source LLM models and the Groq inferencing engine — an amazing project altogether, where I will show you many more things. The best part is that many of you were specifically requesting open-source LLM models only, so here I will show everything from embeddings to using open-source LLMs, and we will also discuss what exactly the Groq inferencing engine is. If you don't know about Groq, it is on a mission to set the standard for GenAI inferencing speed, and it uses something called an LPU — a Language Processing Unit. It is a new type of end-to-end processing unit system that provides the fastest inference for computationally intensive applications with a sequential component to them. Why is it so much faster than a GPU? Because it solves two LLM bottlenecks: compute density and memory bandwidth. I have already created a detailed video about the LPU inferencing engine, but now it's time to go ahead and implement some amazing solutions.

So let me implement this end-to-end project. I am going to use Streamlit here so that you can see it in the form of a web application. First, as usual, look at requirements.txt: groq is there, and along with it langchain-groq — both are the important parts — plus bs4 (BeautifulSoup), so that we can scrape content from a website, which is what I am going to do in this project. Inside a groq folder I create one file called app.py, and the first thing I do is import the important libraries: import streamlit as st, and along with Streamlit, import os.

Now, the first thing you actually require is API access, so click on Groq's API access and you can use it completely for free. On that page you will see self-serve access through the playground on GroqCloud — Groq Cloud gives you a playground too. If you want to use it for commercial purposes, you can see the current numbers: Llama 2 70B (70 billion parameters) gives you a 4,096-token context length at around 300 tokens per second, which is quite fast compared with OpenAI's ChatGPT; similarly Llama 2 7B runs at about 750 tokens per second; Mixtral 8x7B, with a 32K context length, at about 480 tokens per second; and Gemma 7B, with an 8K context length, at about 820 tokens per second — and the price per 1 million tokens is very minimal.
Groq has also demonstrated 15x faster LLM inference performance on the ArtificialAnalysis.ai leaderboard compared with the top cloud-based providers — this is awesome, and you should definitely know it. To create the API key, I click on the GroqCloud playground; there you will see an API Keys button, so just click Create API Key, give it a name, and click Submit. You get a secret key, and that secret key is what I will be using in my environment variable. I have already created such a key, so please make sure you also update your environment variable — I will tell you exactly what name you need to use.

Since we are going to use Groq through ChatGroq, which is already present in LangChain, I write from langchain_groq import ChatGroq — careful with the spelling. ChatGroq works just like ChatOpenAI does. As you all know, I am going to take a website and read all the content from it, so from langchain_community.document_loaders I import WebBaseLoader — this is the first data-ingestion technique we will use. Along with this, since I said I will only use open-source models, the embedding technique I am going to use is Ollama embeddings, because it is completely free and open source, so from langchain_community.embeddings I import OllamaEmbeddings — you don't even have to worry about OpenAI here. Since we are going to read the entire document and convert it into smaller chunks, I will use the recursive character text splitter — again, I am checking the documentation to know where each library actually lives — so from langchain.text_splitter I import RecursiveCharacterTextSplitter. I have already shown you create_stuff_documents_chain, so from langchain.chains.combine_documents I import create_stuff_documents_chain. I will also create my own custom chat prompt template, so from langchain_core.prompts I import ChatPromptTemplate. And as you know, after creating the document chain I also have to create the retrieval chain, so from langchain.chains I import create_retrieval_chain — somehow, every time I type "langchain" there is a spelling mistake.
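A minimal sketch of the app.py header, assuming langchain, langchain-community, langchain-groq, beautifulsoup4, and python-dotenv are all installed per requirements.txt (FAISS and time are imported here because the later steps use them):

```python
import os
import time
import streamlit as st

from langchain_groq import ChatGroq
from langchain_community.document_loaders import WebBaseLoader
from langchain_community.embeddings import OllamaEmbeddings
from langchain_community.vectorstores import FAISS
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain.chains.combine_documents import create_stuff_documents_chain
from langchain_core.prompts import ChatPromptTemplate
from langchain.chains import create_retrieval_chain
```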
One more thing: I need to load all the environment variables, so from dotenv import load_dotenv (and I have already imported os). Calling load_dotenv() loads all my environment variables, and the one I am going to use is the Groq API key. So let me write groq_api_key = os.environ["GROQ_API_KEY"] — which means whatever Groq API key I got from that website, I have written it into my environment variables under this key name. You have to do the same thing: take the key you got, go into your environment variables, and assign it with that exact name — I have created a lot of keys over there because we are continuing this series inside the same project.

Once this is done, there is one more thing I am going to use in Streamlit: something called session state. I will create some session state entries so that you will be able to understand this too.
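A minimal sketch, assuming a .env file next to app.py containing GROQ_API_KEY=<your key>:

```python
from dotenv import load_dotenv

load_dotenv()  # read the .env file so os.environ can see the key
groq_api_key = os.environ["GROQ_API_KEY"]
```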
The first thing I create in the session state is my Ollama embeddings — along the way you will also get an idea about session state, Streamlit, and many other things. So I create st.session_state.embeddings and assign my embeddings to it; that is the first entry. Then a loader entry: st.session_state.loader, which will load information from my website. Which website? Let me pick one I used in an earlier example — the LangSmith docs at docs.smith.langchain.com; I want some information from there, so I pass that link to WebBaseLoader, and that becomes my loader. After getting the loader, we need to pull the entire set of documents from the website, so I create another session state entry, st.session_state.docs, and set it to st.session_state.loader.load(). Once loaded, the same as always: we need the chunked documents, so I set up st.session_state.text_splitter as a RecursiveCharacterTextSplitter with chunk_size=1000 and chunk_overlap=200 — that is also saved in the session. Then I write st.session_state.final_documents, and these final documents come from st.session_state.text_splitter.split_documents() applied to all the docs we loaded — initially I also passed the embeddings in here, which, as you will see in a moment, causes an error, since split_documents only needs the documents.
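A minimal sketch of the one-time indexing state, assuming Ollama is running locally with its default embedding model, using the LangSmith docs URL from the video, the corrected "vectors" session key, and the first-50-documents speed fix that gets applied later:

```python
if "vectors" not in st.session_state:
    st.session_state.embeddings = OllamaEmbeddings()
    st.session_state.loader = WebBaseLoader("https://docs.smith.langchain.com/")
    st.session_state.docs = st.session_state.loader.load()

    st.session_state.text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000, chunk_overlap=200
    )
    # split_documents takes only the documents (no embeddings argument);
    # keep just the first 50 documents so first-run indexing stays fast
    st.session_state.final_documents = st.session_state.text_splitter.split_documents(
        st.session_state.docs[:50]
    )
```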
embeddings — and with that, the final documents are ready; you can see everything being created across these session state entries. The next thing I need to do, after forming all the documents, is convert them into vectors and store them in a vector store. So from langchain_community.vectorstores — and again, every single time, c comes before g in "langchain"; how many of you make the same typo as me? — I import FAISS. I store the result inside the session state too, naming it vectors: st.session_state.vectors = FAISS.from_documents(...). Since we are using from_documents, I have to give two important things: first all my final documents, and second my embeddings. Done — this is perfect, and now I could ask any question against these vectors.

But since I want this as an end-to-end project, I will add a few more things. First, st.title("ChatGroq Demo"). Then I create my LLM: llm = ChatGroq(...), passing groq_api_key=groq_api_key — whatever key we loaded — plus the model. There are multiple models, as I already said, so let me check what is available: there is Gemma-7b-It and Mixtral-8x7b-32768. I will use Gemma, so I write model_name="Gemma-7b-It" — I am not sure about the capitalization of "It"; if it doesn't work I will change the model name, no worries, I have three other options. Once this is done, my model name and everything is created, and this is my LLM model.

Now let me create my prompt with ChatPromptTemplate.from_template — there are some predefined templates, or you can create one from scratch. I will say: answer the questions based on the provided context only, and please provide the most accurate response based on the question. Then inside the template I define my context block — the {context} placeholder — and my question, which will be the {input} parameter.
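A minimal sketch of the indexing, the model, and the prompt; "Gemma-7b-It" is the model id as typed in the video (current Groq model ids are lowercase, e.g. gemma-7b-it, and Mixtral-8x7b-32768 gets swapped in later for its larger context window):

```python
    # ...this assignment still sits inside the
    # `if "vectors" not in st.session_state:` block sketched above
    st.session_state.vectors = FAISS.from_documents(
        st.session_state.final_documents, st.session_state.embeddings
    )

st.title("ChatGroq Demo")

llm = ChatGroq(groq_api_key=groq_api_key, model_name="Gemma-7b-It")

prompt = ChatPromptTemplate.from_template(
    """
Answer the questions based on the provided context only.
Please provide the most accurate response based on the question.

<context>
{context}
</context>

Question: {input}
"""
)
```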
That is my chat prompt template — very good. Now, as you know, the next step is the document chain, so I create it with create_stuff_documents_chain, and then I expose all the vectors as a retriever — again, the retriever is just an interface for reading the relevant context out of those vectors. Since we are using the chat prompt template with context-based information, the document chain, and the retriever, create_retrieval_chain also comes into the picture — I explained create_retrieval_chain in my previous video; here it combines my document chain and the retriever. Then I take the user's question with st.text_input("Input your prompt here").

So now the LLM is ready, the prompt is ready, the retrieval chain is ready — everything is ready. All I have to do is write: if the prompt is there, get my response with retrieval_chain.invoke(), passing the input as a dictionary, {"input": prompt}. Done — I will be able to get the response. It is also nice to put a timer around the call so you can see how long it takes, so I import time, capture the start with time.process_time() as soon as I hit Enter on the prompt, and after the call I print "Response time:" followed by the current time.process_time() minus the start time.
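A minimal sketch of the chain wiring and the query path, with the timer and the context expander described just below (I have named the text input user_prompt here so it does not shadow the prompt template):

```python
document_chain = create_stuff_documents_chain(llm, prompt)
retriever = st.session_state.vectors.as_retriever()
retrieval_chain = create_retrieval_chain(retriever, document_chain)

user_prompt = st.text_input("Input your prompt here")

if user_prompt:
    start = time.process_time()
    response = retrieval_chain.invoke({"input": user_prompt})
    print("Response time:", time.process_time() - start)

    st.write(response["answer"])

    # Show the retrieved context chunks behind the answer
    with st.expander("Document Similarity Search"):
        for doc in response["context"]:
            st.write(doc.page_content)
            st.write("--------------------------------")
```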
I write out the answer with st.write(response["answer"]) — whatever answer I am getting. So yes, almost everything is done. It is also a good idea to create an expander where you show all the context that was retrieved, and that is the small four-to-five-line block with st.expander in the sketch above: I take the context from the response, pull out each page_content, and display it.

Now let's run this — cd groq, then streamlit run app.py — and we will see the speed, because the speed will be the most important thing; honestly, what I am most excited about in this application is how quickly I get answers. First run: an error — "split_documents takes 2 positional arguments but 3 were given". This kind of error usually comes up, but it's okay: I check the documentation for split_documents and see that I don't need to pass the embeddings there; only the documents are required. Fixed. Now I ask a question, and it sits loading for a while — there is a lot of content on that site, so for the first run the indexing takes a long time. So let me do one thing to improve the performance: instead of taking all the documents, I will take only the initial 50, because the maximum number of pages is what creates the problem. Back to the command prompt, cd groq, streamlit run app.py again — it will still take some time, because as soon as the page loads everything needs to be embedded and indexed, and docs.smith.langchain.com is a big site with easily 100 to 200 pages. Let's see how it goes.
One more error: st.session_state has no attribute "vector" — did you forget to initialize it? Let me look... I had checked for the key "vector" but created the entry as "vectors". Understood — fixed. Now I reload again, and it takes maybe 10 to 15 seconds, because it is indexing the entire page for the first time.

Guys, one thing I noticed: whenever I search, it gives me the document similarity results, but the summarization from the chain is not coming up properly. One reason I found is that the Gemma 7B model has a fairly small context length — if you look at the API console, among the models we have, Mixtral 8x7B has a 32K context length while Gemma 7B has only 8K. So let's do one thing: let's try the Mixtral model and see whether it improves the output quality. You should know all these things — I am not going to edit this part of the video, so you get an idea of how things actually work. So I switch the model, rerun, and reload completely.

At the end of the day, it is able to answer from the provided context whenever I ask a specific question. So let me ask: "How to set up an environment?" — and it works; the indexing is what takes time, the response time itself is small, and all the setup information comes back. This looks absolutely amazing — and just imagine what happens if you take an open-source LLM with an even higher context length; definitely try it out from your end. Let me ask one more question: "How to create an API key?" — and here you can see the response: to create an API key in LangSmith, follow these steps — all the content present on the website is shown, and you can also see the entire document similarity search in the expander.

So guys, I hope you are loving the LangChain series. We have covered almost every module of LangChain required to create end-to-end projects, and we have built multiple projects using both paid and open-source LLM models. Now, one of the most requested videos was: Krish, how can we work
with LangChain along with the Hugging Face libraries, making sure we use only open-source LLM models, and create some amazing end-to-end projects? So in this video I am going to use LangChain, Hugging Face, and open-source LLM models like Mistral that are hosted on Hugging Face, and we will develop a complete Q&A RAG app. I hope you like this video; please watch it till the end, because there will be a lot of learnings in this one too.

Let me quickly share my screen. I have created a folder called huggingface, and inside it there is some US census data that I downloaded from the internet in the form of PDFs. We will read these, use completely open-source LLM models like Mistral via Hugging Face, and create the entire Q&A application.

As you know, the first step is data ingestion — loading all the PDFs. For that I import PyPDFLoader from langchain_community.document_loaders, and from the same document_loaders module I also import PyPDFDirectoryLoader, so I can read a whole directory of PDFs. After reading, I need to split the documents into chunks, so I will use the recursive character text splitter. We will also use a vector store — FAISS again. And since everything comes from Hugging Face this time, the embedding technique also comes from Hugging Face: LangChain has HuggingFaceBgeEmbeddings, the BGE embeddings, which are completely open source and good enough, so we will use those. I am also going to import numpy as np — I may not strictly require it, but I will use it for a quick check. Two more amazing libraries from LangChain, since I also have to create a template and a chain:
from langchain.prompts I import PromptTemplate, and from langchain.chains I import RetrievalQA — we will specifically need the RetrievalQA chain. So all these libraries are imported, and we will use them for this project.

Now let's read the PDFs from the folder: loader = PyPDFDirectoryLoader("./us_census"), so it reads the whole us_census folder, and then documents = loader.load(). Once I have the documents, I create my text splitter — a RecursiveCharacterTextSplitter with chunk_size=1000 and chunk_overlap=200; as you all know, I have written this kind of code a lot. Then my final_documents come from text_splitter.split_documents(documents). Let me look at just the first final document — I don't have to display the entire thing. This reads all four PDF files inside the us_census folder (all the materials will be in the description of this video for you). There was one small error — I had mistyped chunk_size — and after fixing it, it works fine: the total length of final_documents is around 316 chunks, and you can see the page content about health insurance coverage status, along with the total number of documents I got.
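A minimal sketch of the ingestion step, assuming the census PDFs sit in a local us_census folder as shown on screen:

```python
import numpy as np
from langchain_community.document_loaders import PyPDFDirectoryLoader
from langchain.text_splitter import RecursiveCharacterTextSplitter
from langchain_community.vectorstores import FAISS
from langchain_community.embeddings import HuggingFaceBgeEmbeddings
from langchain.prompts import PromptTemplate
from langchain.chains import RetrievalQA

# Read every PDF in the folder, then split into 1000-character chunks
loader = PyPDFDirectoryLoader("./us_census")
documents = loader.load()

text_splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
final_documents = text_splitter.split_documents(documents)
print(len(final_documents))  # ~316 chunks in the video
```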
Next I need the embedding technique. As I said, I am doing this with Hugging Face embeddings, so I will use the Hugging Face components — these are all very good, completely open source, and freely available. So I write HuggingFaceBgeEmbeddings, and the first parameter I give is model_name — I have explored a couple of models; there is a BGE model, and you can also use the sentence-transformers all-MiniLM model, so I will leave the alternative as a comment. Either of them can be used; it is up to you. Then I pass model_kwargs — and I don't even need a GPU here; CPU is enough because I have very few documents — so model_kwargs={"device": "cpu"}. The next parameter is encode_kwargs, where I set normalize_embeddings. You may be thinking, Krish, where are you getting these parameters from? I am just checking the Hugging Face documentation itself — if you search for the model, you will see examples with exactly this code, including the sentence-transformers all-MiniLM model I mentioned; you can check many others if you want.

One more thing you really need: your Hugging Face access token, because accessing these models requires it. Go to Settings in your Hugging Face account, find the Access Tokens section, create a new token, copy it, and keep it — I will show you where we use it. Now, when I execute the embeddings, I get an error: "Could not import sentence_transformers — install sentence-transformers." So I need to install that too. Whenever an error comes, something like this always happens, but it's okay — we should not be afraid of errors; errors help us get where we want. So let me update my requirements.txt and run pip install -r requirements.txt.
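A minimal sketch of the embeddings setup; the exact BGE model id is my assumption (the video also points at sentence-transformers/all-MiniLM-L6-v2 as a free alternative — both produce 384-dimensional vectors):

```python
huggingface_embeddings = HuggingFaceBgeEmbeddings(
    model_name="BAAI/bge-small-en-v1.5",   # assumed id; all-MiniLM-L6-v2 also works
    model_kwargs={"device": "cpu"},        # CPU is enough for this small corpus
    encode_kwargs={"normalize_embeddings": True},
)
```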
Now you can see sentence-transformers getting installed along with everything else. While we wait for the installation: if you have not subscribed, please do, and write something in the comment section about how these videos are working for you — from the feedback I get about people's transitions, I feel the videos are doing something useful in your lives, and my main aim is to teach in a way that always matters. The installation takes some time, so let me pause the video until it finishes.

The installation is done, so now I can execute the cell again — no error this time, and the Hugging Face embeddings are imported. The first run takes a little while because this embedding model is quite big... 13.1 seconds — that is absolutely fine, really good. Now that I have the embeddings object, let me test it on one document: I call embed_query on final_documents[0].page_content — just the first chunk, just to check whether it is working — and let me convert the result into a NumPy array to inspect it.
It does not give an error — perfect. So whatever was in the first document's page content has been embedded into this vector. If I also print the shape of the array, you can see it is (384,) — each chunk becomes a 384-dimensional vector.

Since the embedding technique is working fine, next we convert the chunks into vectors with this embedding and store them in a vector store. I already imported FAISS, so I create my vector store with FAISS.from_documents, giving it my final documents — only the first 120 records out of roughly 316, because embedding all of them takes time; this is just for the demo, and you can increase the number later when you have time — along with the Hugging Face embeddings. Executing this takes at least 30 to 40 seconds because there are so many documents and each gets embedded into its 384-dimensional vector... 43.4 seconds, perfect. This is the vector store creation step we have done in every project.

Now we can query this vector store using similarity search. First I write my query — "What is the health insurance coverage?" — and I did not pick this question randomly: I went to the us_census folder, opened the first PDF, and took this exact question from it. Then I retrieve the relevant documents with a similarity search over the vector store.
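A minimal sketch of the sanity check, the vector store build, and the first similarity search:

```python
# Quick sanity check: embed the first chunk and inspect the vector
sample = np.array(
    huggingface_embeddings.embed_query(final_documents[0].page_content)
)
print(sample.shape)  # (384,)

# Index only the first 120 chunks to keep embedding time down
vectorstore = FAISS.from_documents(final_documents[:120], huggingface_embeddings)

query = "WHAT IS HEALTH INSURANCE COVERAGE?"
relevant_documents = vectorstore.similarity_search(query)
print(relevant_documents[0].page_content)
```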
I run the similarity search with the query I wrote and print all the relevant documents. In the output you can see page two, "What is health insurance coverage?", and the answer: "This brief presents state-level estimates of health insurance coverage using data from the American Community Survey" — exactly the US census content, everything working absolutely fine. To get just the text, I print page_content on the first result.

Sometimes you may require more results — say the two or three most relevant documents for the query instead of one. For that I call as_retriever on the vector store — I hope everybody remembers what as_retriever is, because we have discussed it; it is a kind of interface connected to the vector store so you can get results through it based on similarity search — with search_type="similarity" and search_kwargs={"k": 3}, meaning I want the top three most relevant documents. I save this as retriever, and if I print it you can see FAISS, the Hugging Face embeddings, and the search kwargs (the number of most relevant documents I require is three). I can now use this retriever to invoke anything I want — but before I invoke, understand that I also require an LLM model. For the LLM, as I said, I am going to use a Hugging Face model, and that is where the access token we created comes in: I need to set it up in my environment.
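A minimal sketch of the retriever interface over the store:

```python
retriever = vectorstore.as_retriever(
    search_type="similarity",
    search_kwargs={"k": 3},  # return the top-3 most similar chunks
)
print(retriever)
```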
So I import os and set os.environ["HUGGINGFACEHUB_API_TOKEN"] to my token, because the model will be downloaded and used from the Hugging Face Hub, and this is the key it looks for. I am hardcoding the token here only because I copied it from my account — don't use mine; I am going to regenerate this token, so it will be of no use to you. If you don't know about the Hugging Face Hub, guys, here is a short description: Hugging Face is a platform with over 350k models, 75k datasets, and 150k demo apps (Spaces), all open source and publicly available, an online platform where people can easily collaborate and build ML together.

So I write from langchain_community.llms import HuggingFaceHub, and I create hf as a HuggingFaceHub instance. The main parameter here is repo_id — since we are going to use an open-source model, the repo here is Mistral 7B; this is basically a skeleton for you, so you can swap in whatever model you want just by changing the repo_id. Along with it, one more parameter you really want to give is model_kwargs, where I set temperature=0.1 and max_length=500. These are the two parameters you need in order to use this open-source model — and understand one thing: it runs directly from the Hugging Face Hub, so you don't need to download it locally (if you do want to download and check it locally, I will also give you the code for that, which you can try out).

Now the same query: "What is the health insurance coverage?" To invoke it, I call hf.invoke(query), and I get a response — but it is the most generic response, what Mistral knows from its general internet knowledge. My main aim is to create a Q&A RAG app, and it is still not communicating with the data available in my PDFs; it has just given me a general output. So the next step: if we want this Hugging Face LLM to interact with our PDFs, we have to create our own prompt template.
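A minimal sketch of the Hub-hosted LLM call; the Mistral repo id is my assumption for the model shown on screen, and the token is read from the environment rather than hardcoded:

```python
import os
from langchain_community.llms import HuggingFaceHub

# Requires HUGGINGFACEHUB_API_TOKEN to be set in the environment beforehand
hf = HuggingFaceHub(
    repo_id="mistralai/Mistral-7B-v0.1",   # assumed repo id; swap in any hosted model
    model_kwargs={"temperature": 0.1, "max_length": 500},
)

query = "What is the health insurance coverage?"
print(hf.invoke(query))  # generic answer: not yet grounded in the PDFs
```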
One more thing I really wanted to show: Hugging Face models can also be run locally through HuggingFacePipeline. Here is that code — where we used HuggingFaceHub before, you have HuggingFacePipeline with the same pattern for the text-generation task, and you can execute it the same way. I am not going to use it here, since I am already running from the Hub, and local execution takes some time — but it's there if you want it.

Now, as I said, I need this LLM model to interact with my PDFs, so for that I create my prompt template. It is a pretty simple prompt: "Use the following piece of context to answer the question asked. Please try to provide the answer only based on the context." Then my {context} placeholder goes here, my {question} goes below it — we have done this many times in my previous videos — and one more thing I like to give: I end with "Helpful Answers:", leaving it blank for the model to complete. To turn this into a proper object I use PromptTemplate, assigning my template string to template and setting input_variables to the two things: "context" and "question". I execute this, and my prompt is ready.

Now, understand: if I want to combine my template with the LLM model so that the context and the question both flow in, one more way to do it — as I mentioned in my previous video — is this RetrievalQA chain. I call RetrievalQA.from_chain_type and start giving the parameters it requires: the first is llm=hf, the Hugging Face model we created; then chain_type="stuff"; then retriever=retriever, which we defined earlier; then return_source_documents, which you can keep as True or False — True lets you see where the answer is coming from; and finally chain_type_kwargs={"prompt": prompt} — you can check this entire signature in the LangChain documentation. So most of it is complete; in RetrievalQA you give your Hugging Face LLM, the stuff chain type, your retriever, return_source_documents so you can trace the sources, and the chain_type_kwargs carrying the prompt.
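A minimal sketch of the grounded Q&A chain with the custom prompt:

```python
prompt_template = """
Use the following piece of context to answer the question asked.
Please try to provide the answer only based on the context.

{context}

Question: {question}

Helpful Answers:
"""

prompt = PromptTemplate(
    template=prompt_template, input_variables=["context", "question"]
)

retrievalQA = RetrievalQA.from_chain_type(
    llm=hf,
    chain_type="stuff",             # stuff retrieved chunks into {context}
    retriever=retriever,
    return_source_documents=True,   # keep the source chunks in the result
    chain_type_kwargs={"prompt": prompt},
)

# RetrievalQA expects the question under the "query" key
result = retrievalQA.invoke({"query": "What is the health insurance coverage?"})
print(result["result"])
```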
Here is the flow: the retriever returns the relevant documents, they are stuffed into the {context}, the question goes in alongside, and the whole RetrievalQA chain works. Let me quickly execute it. Now all that is left is to run it: I invoke the RetrievalQA chain with the query we already wrote — "What is the health insurance coverage?" — and check the result. Here it is: "This brief presents state-level estimates... from the American Community Survey. The U.S. Census Bureau conducts the ACS throughout the year; the survey asks respondents to report their coverage at the time of interview" — every piece of information is given.

You can go ahead and ask some other questions too if you want. Let me try "Differences in the uninsured rate by state". I write the new query, execute it, and check the response: a comparison of ACS measures of health insurance, with the uninsured rates by state all matching what is in the PDFs. All the information is coming straight from our documents — this is perfect. So now I have also shown you how to work with LangChain and Hugging Face; let me tidy up the notebook, and that is it.

I hope you liked this video. At the end of the day, the one thing I really want to say is: the more you practice and the more you build, the more you will definitely be able to achieve. That was it from my side — I will see you in the next video. Have a great day ahead; thank you all, take care, bye-bye.