Graph RAG is one of the best ways to improve the accuracy and reliability of your AI agents. But most people are put off using knowledge graphs because they seem too complicated to set up and too difficult to maintain. Today I'm going to debunk that by showing you how to quickly get your own knowledge graph set up, auto-populated from your own documents, and connected to your n8n AI agent within minutes. I'm going beyond what comes out of the box in n8n here by tying in an open-source system called LightRAG, but don't worry, the setup is pretty easy and anyone can do it.

In this video, I'll explain what a knowledge graph is, along with why graph RAG produces better responses than traditional RAG. I'll show a demo of LightRAG and how documents are ingested and processed. I'll walk through, step by step, how you can connect your knowledge graph expert to your AI agents in n8n to produce much more detailed and comprehensive responses. And I'll also show you how I built this into our state-of-the-art n8n RAG system, which includes fleshed-out RAG ingestion pipelines to process documents as well as hybrid search and reranking to produce super accurate AI agent responses. That system is available in our community, The AI Automators, so if you'd like a head start, check out the link in the description below. I put a lot of research into this video, so I'd really appreciate it if you give it a like and subscribe to the channel for more deep AI content like this.

To explain the concept of graph RAG, we first need to look at what a knowledge graph is. Essentially, a knowledge graph is a structured way to represent information about real-world entities and how they relate to each other. If we take this knowledge graph of Steve Jobs, for example, we can see that Steve Jobs was born in San Francisco, which is located in California. He is the founder of Apple, which is also headquartered in California. Apple created the iPhone, which was launched in 2007. So we have lots of different entities, and from the labels you can see how those entities relate to each other. It's essentially a massive mind map of things and how they're interconnected. Knowledge graphs can really help you identify patterns in data, as well as making the data easier to understand, navigate and use.

Google probably has the best-known knowledge graph online. To continue the Steve Jobs theme, if you Google his name you'll see this knowledge panel on the right-hand side, and similar to the last screen, you can see he was born in California, he's the founder of Apple, and there are lots of other interesting connections.

When you think about knowledge graphs, there are three specific concepts you need to understand. The first is nodes, or entities, which you can see here as circles: here we have a person, this one is a course, this one is a learning institution. The second is edges, or relationships: this person teaches this course, he lives in this city. And finally you have properties, which describe the nodes: this course is a computer science course that's taught in English, and this person's name is Fischer. So you have the ability to set custom properties for nodes within the graph. These three elements give you a huge amount of flexibility to model your data.
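If you think in code, here's a minimal sketch, purely illustrative and not tied to any particular graph database, of how that Steve Jobs example breaks down into nodes, edges and properties:

```python
# Nodes/entities, each with a label and optional properties.
nodes = {
    "steve_jobs":    {"label": "Person",  "name": "Steve Jobs"},
    "san_francisco": {"label": "City",    "name": "San Francisco"},
    "california":    {"label": "State",   "name": "California"},
    "apple":         {"label": "Company", "name": "Apple"},
    "iphone":        {"label": "Product", "name": "iPhone", "launched": 2007},
}

# Edges/relationships as (subject, predicate, object) triples.
edges = [
    ("steve_jobs", "BORN_IN", "san_francisco"),
    ("san_francisco", "LOCATED_IN", "california"),
    ("steve_jobs", "FOUNDER_OF", "apple"),
    ("apple", "HEADQUARTERED_IN", "california"),
    ("apple", "CREATED", "iphone"),
]

# Traversal is just following edges, e.g. everything directly connected to Apple:
print([(s, p, o) for (s, p, o) in edges if "apple" in (s, o)])
```

That's really all a knowledge graph is at its core: a set of labelled things, the typed connections between them, and a few properties hanging off each one.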
Traditionally, one of the biggest challenges with creating knowledge graphs was the manual human effort it took to design the schema for the graph and populate it. Machine learning models and natural language processing have been used for a long time to auto-generate graphs, but now, in the age of LLMs, it's pretty straightforward to have an LLM automatically extract the nodes, edges and properties from data that's buried within documents. And that's really where graph RAG kicks in.

All of these nodes, relationships and properties need to be stored somewhere, and that's where graph databases fit into the equation. Neo4j is the most popular one out there. These systems are designed to store and query data that's naturally represented as a network of relationships. And similar to how SQL queries a structured relational database, there are different graph query languages for querying graph databases; Cypher is one of the most popular, used in Neo4j for example. The important thing is you don't need to learn Cypher to get this system up and running.

So what is graph RAG? Essentially, it's just RAG, retrieval-augmented generation, using a knowledge graph. Take a traditional RAG application as an example: a user asks a question, which triggers a search of a knowledge base, in this case a vector store; the most semantically relevant results are returned; and then the question plus those chunks is sent to an LLM to generate an answer. With graph RAG, there are two stages to the process. First, we construct the knowledge graph from your documents: the documents are ingested and sent to an LLM so that entities and relationships can be extracted from the raw unstructured text, and those entities and relationships are stored in a graph database. Then at inference time, when a user asks a question, not only is the vector store queried for the most semantically relevant results, but the knowledge graph is also queried for the most relevant entities and relationships, plus other closely related entities in the neighborhood. The question, the document chunks from the vector store, and the entities and relationships from the knowledge graph are all sent to the LLM to generate the answer. Now, there are lots of different retrieval strategies with graph databases; the one I've described here is the one we're using in this system.

So the big question is: why graph RAG? It solves a number of inherent problems with semantic search. The first is lost context. Here we have example documents, and depending on the query, you're going to end up with what are essentially independent chunks of those documents. The problem with this approach is that it's quite fragmented. The example I always use: what if this page of the document was the exclusions section of a particular insurance policy? Depending on the query, the RAG system could pull a couple of paragraphs from the middle of that page without realizing they sit in the context of the policy's exclusions, and the LLM may then hallucinate a response describing them as what's included in the policy, because that's what came back from the vector store.
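Just to make the Cypher point concrete before we move on, here's a minimal sketch, assuming a local Neo4j instance and the official Python driver, of how extracted entities and relationships could be stored and queried in a graph database. The connection details and data are placeholders, not something we'll use later in this setup:

```python
# Minimal sketch: upsert two entities plus a relationship into Neo4j via Cypher,
# then query the relationships around one entity. Assumes a local Neo4j instance
# and the official driver (pip install neo4j); credentials and data are placeholders.
from neo4j import GraphDatabase

driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

with driver.session() as session:
    # MERGE is an upsert: create the nodes and relationship only if they don't already exist.
    session.run(
        """
        MERGE (p:Person {name: $person})
        MERGE (c:Company {name: $company})
        MERGE (p)-[:FOUNDER_OF]->(c)
        """,
        person="Steve Jobs", company="Apple",
    )

    # Ask for every outgoing relationship from one entity.
    result = session.run(
        "MATCH (p:Person {name: $person})-[r]->(n) RETURN type(r) AS rel, n.name AS target",
        person="Steve Jobs",
    )
    for record in result:
        print(record["rel"], "->", record["target"])

driver.close()
```

Again, you won't need to write queries like this yourself for the setup in this video; LightRAG handles the graph storage and querying for you.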
Back to the problem: the fragmented, independent nature of these chunks, and the context they lose, is a major issue with RAG systems, and it's something that graph RAG, along with other techniques like contextual embeddings, helps to solve. Another issue with retrieving these independent, effectively random chunks is that you're missing the relationships between entities. If the chunks don't contain a full summary of all the relationships around Google or OpenAI, for example, the LLM won't be able to generate an accurate response, whereas feeding all of that information into the LLM from a knowledge graph means you get a much more comprehensive answer.

A big benefit of graph RAG is multi-hop reasoning, and the best way to explain this is the Six Degrees of Kevin Bacon game. The premise is that every actor in Hollywood can be linked to Kevin Bacon through a series of co-starring roles, and the challenge is to find the shortest path connecting an actor to Kevin Bacon. You can see that Robin Williams was in The Butler with Cuba Gooding Jr., who was in A Few Good Men with Kevin Bacon, so that's two degrees of separation. It's this type of traversal of real-world networks that semantic search is pretty poor at. And there are plenty of more serious real-world use cases for multi-hop reasoning that knowledge graphs can help with. For example, if an agent is asked "Who should I contact for budget approval for a marketing automation project?", a semantic engine isn't going to handle that well: it might pull snippets about marketing automation projects and snippets about budgets, but it can't tie them together through a relationship. That's where AI agents can get really smart with knowledge graphs.

There are lots of different implementations of graph RAG. The most popular is Microsoft GraphRAG, which was released last year. With Microsoft GraphRAG, you get automated knowledge graph construction: it makes heavy use of LLMs to extract entities, relationships and properties, but it also carries out a lot of enrichment and processing to generate clusters and community summaries, which are brilliant for global-style questions around concepts. Looking at the various benchmarks, Microsoft GraphRAG is quite impressive compared to naive RAG on those global questions and on multi-hop reasoning, and it offers a variety of retrieval strategies. The problem is that it's quite expensive to run, both to ingest all of your documents to extract the entities and relationships and to summarize everything; it can be quite slow at inference time; and it's generally quite complex, which makes incremental updates of the knowledge graph challenging.

A variation on Microsoft's GraphRAG is LightRAG. This was released late last year and also features automated knowledge graph construction: you just ingest your documents, and it processes them and builds out the graph. A key difference is that LightRAG doesn't have those clusters or community summaries; instead it uses dual-level retrieval, which I'll talk about in a minute. On the pro side, LightRAG still shows strong performance against naive RAG in the benchmarks.
It's significantly cheaper to run and operate than Microsoft GraphRAG, you get faster responses, and it's a lot easier to update the knowledge graph. On the negative side, it's called LightRAG for a reason: it's a much-simplified graph, so the responses aren't as good as Microsoft GraphRAG's, and it doesn't handle multi-hop queries very well because it's essentially just retrieving the nearest neighbors of the entities it finds in its search. But it's still better than traditional RAG.

To speak to this dual-level retrieval, I think it's really interesting. Take a Formula 1 question, for example: "How has the FIA budget cap affected midfield teams' performance?" LightRAG extracts local keywords, which are the exact terms used in the query, so it'll pull out FIA, budget cap and midfield, for example. But it also extracts global keywords, which are broader concepts or themes, so here it might infer financial regulations, resource allocation or wind tunnel usage. That's how it's able to replicate what Microsoft GraphRAG does with its communities and clusters, because it can search the knowledge graph for these global keywords. The beauty of this approach is that it returns semantic context both locally, for the exact terms in the query, and at the level of the higher concepts and themes inferred from it.

There are lots of other graph RAG implementations, like RAGFlow, nano-graphrag and fast-graphrag. I had a brief look at those, but I feel LightRAG is one of the best out there to actually integrate into n8n. In terms of evaluations and benchmarks, as usual you need to take these with a grain of salt. Microsoft GraphRAG, for example, claimed their system outperforms naive RAG on comprehensiveness and diversity with a 70 to 80% win rate, and LightRAG has a full performance table where it's beating everyone. It's not hugely surprising that everyone claims to have the best system. I carried out my own benchmarks and evaluations across a number of questions on a tennis knowledge base, and I did find that graph RAG using LightRAG's hybrid mode performed better than naive RAG in most cases, but you definitely need to test this with your own data and configuration and tune it to get the best responses.

LightRAG is an open-source Python application that you can download from GitHub, so you could run it locally or spin it up on a server in the cloud. There's a Docker image available, so I'm quickly going to spin this up on Render so that we can connect it to n8n. For this, go to render.com and create an account. Once you log into your dashboard, click on create new project. I already have one up and running here, but let's spin up another one and give the project a name. From here you can create a new service within the project; we'll be creating a web service. You have a few options for hooking up your source code, so click on existing image, and where it says image URL, point it at the Docker image of the latest version of LightRAG, which is hosted on the GitHub Container Registry; I'll leave a link in the description below. Click connect, then give the service a name, specify a region and choose an instance type.
I'll just go with the starter plan here, which is $7 a month. Then, when it comes to environment variables, this is how you configure your LightRAG application. If you go into the LightRAG GitHub repository and click on the example env file, you can see all the possible configurations you can set, so we'll copy in a few of these.

The first one is AUTH_ACCOUNTS, because you want to be able to log into the LightRAG app. The value needs to be a username, a colon, and then a password, so I'll set the username to Daniel Walsh, then a colon, and we'll generate a password in LastPass and copy that in. We also want to set an API password so that n8n can authenticate and connect; that's the LIGHTRAG_API_KEY variable, and again we'll just generate another value.

We then need to set the API credentials for our embedding service and our LLM. Back in the example file we can see the variables for embeddings, so copy all of those and use Add from .env to paste them in. We're going to use OpenAI for this, so the embedding binding is openai, and the OpenAI base URL for embeddings is https://api.openai.com/v1. For the moment we'll use the text-embedding-3-small model, which has 1,536 dimensions, and then we need an API key, which we can also drop in; I'll recycle this key after the video is published. Those are the embedding environment variables, so click add variables and they're injected.

Next we need the LLM environment variables. We'll copy these in, again using OpenAI, and we'll use GPT-4.1 nano. The reason for that is that ingesting documents makes a lot of LLM calls to extract entities and relationships, so a large model will cost a lot of money and take a long time, whereas something like Nano is cheap and fast. The base URL is the same, and we need the same API key, then click add variables.

The only other things worth setting at this point are the concurrency configurations. From my experience, LightRAG's defaults are quite conservative here. It does depend on the size of the instance you've provisioned, but copying these in and removing the commented ones, I was able to set the maximum async LLM calls to 12, I increased parallel inserts of documents to 3, I increased the number of asynchronous embedding calls to 24, because the APIs can handle huge volumes of embedding requests, and I increased the embedding batch size to 100. As a comparison, within n8n the embedding batch size defaults to 200 chunks. You may need to play around with these settings, depending on whether you get throttled by the API endpoints or your server maxes out its resources. We'll add those variables, and I think we're all set.

Under advanced, we then need to add a disk, because obviously we want the files and data that we upload to persist. Click add disk, and I'm going to mount it at the app data folder, because that's where LightRAG writes the files it needs. For the moment I'll set it to a single gigabyte, and we're in good shape. So let's click deploy web service. That has now begun deploying the Docker image with those environment variables and loading up the disk.
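For reference, here's roughly what that set of environment variables looks like once it's all entered. The variable names follow LightRAG's example env file at the time of recording, so do double-check them against env.example for the version you deploy, and treat every value below as a placeholder:

```
# Login for the LightRAG web UI (username:password)
AUTH_ACCOUNTS=daniel:your-generated-password
# API key that n8n will send in the X-API-Key header
LIGHTRAG_API_KEY=your-generated-api-key

# Embeddings (OpenAI)
EMBEDDING_BINDING=openai
EMBEDDING_BINDING_HOST=https://api.openai.com/v1
EMBEDDING_MODEL=text-embedding-3-small
EMBEDDING_DIM=1536
EMBEDDING_BINDING_API_KEY=sk-...

# LLM used for entity/relationship extraction (and default responses)
LLM_BINDING=openai
LLM_BINDING_HOST=https://api.openai.com/v1
LLM_MODEL=gpt-4.1-nano
LLM_BINDING_API_KEY=sk-...

# Concurrency (tune to your instance size and API rate limits)
MAX_ASYNC=12
MAX_PARALLEL_INSERT=3
EMBEDDING_FUNC_MAX_ASYNC=24
EMBEDDING_BATCH_NUM=100
```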
With the disk attached, the data we load into LightRAG will persist even if the server restarts. Once the deploy finishes we should be able to log in, and there we go, the service is live and available at this URL. If we click through, we land on our login screen, so let's put in our username, fetch the password we used from the environment variables, and click login. There we go.

Within LightRAG you have a documents section where you can upload documents manually. There's the knowledge graph, which is built up as documents are ingested and entities and relationships are extracted. There's a retrieval tab where you can test out having a conversation with those documents. And then there's an API section, which is essentially the wrapper around this application that n8n will be talking to.

So let's upload a document and work through, step by step, what's actually happening. I've clicked on upload and dropped in our Formula 1 financial regulations. You'll see that it's now processing, and if you click on pipeline status you can see what's happening: it has entered the extraction stage and is now processing the document, working through the chunks to extract entities and relationships.

To walk through this: we've uploaded our document, and it goes through a filtering and deduplication step so that it's not re-uploading the same documents and duplicating everything. It breaks the document into chunks based on the chunk size set in the environment variables. The first stage is your typical vector store ingestion: the chunks are sent to an embedding model, vectors representing the chunks are created, and they're stored in a vector database inside the LightRAG application. There's nothing new here; it's the same as using Supabase or Pinecone within n8n. Where things get interesting is that after the chunks are embedded, they're sent to an LLM to extract entities and relationships. There are various preset prompts in the LightRAG codebase that it uses to extract these accurately. All of those entities and relationships are then parsed, transformed and merged, and if the LLM didn't return enough, it loops back to glean more entities from it. That's essentially what you see here with each of these chunks: for chunk 4 of 28 it extracted 20 entities and 12 relationships. As for the merging, it could be that in chunk 5, where there are 16 entities, maybe half of those were already gleaned from chunk 4, so entities need to be merged to avoid duplicates.

Once the chunk processing stage completes, it moves to the merging stage, and you can see we have 348 entities and 358 relationships within this 50-page document. And this is the important part here: merge and generate entity and relationship descriptions. In the case of the first entity, the FIA, you can see there were 17 entity mentions across the 28 chunks that referenced the FIA. Rather than simply appending all of those entity descriptions, it sends them all to an LLM to generate a single consolidated description. Whereas for something like cost cap, there were only three references within the 28 chunks, so it can simply concatenate or append those descriptions on the entity.
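To make that merge decision a bit more concrete, here's a simplified sketch of the logic as I understand it. This is not LightRAG's actual code; the threshold and the LLM helper are just illustrative parameters:

```python
# Simplified sketch of the entity-description merge decision (not LightRAG's
# actual implementation). For each entity, the description fragments gleaned
# from the chunks are either concatenated or consolidated by an LLM, depending
# on how many fragments there are.
def merge_entity_descriptions(entity_name, descriptions, llm_summarize, force_llm_threshold=4):
    if len(descriptions) >= force_llm_threshold:
        # Many fragments, e.g. the 17 FIA mentions in the demo:
        # ask the LLM for one consolidated description.
        return llm_summarize(entity_name, descriptions)
    # Few fragments, e.g. the 3 cost cap mentions in the demo:
    # simple concatenation is good enough.
    return " ".join(descriptions)
```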
There's a threshold built into LightRAG for this, and as you can see it's four: four or more references go through an LLM merge, whereas three or fewer get a simple merge. I'll show you these entity descriptions in a minute, but it's really impressive how it does this. Once the entities are resolved and merged, and the descriptions either concatenated or created by an LLM, those descriptions are sent to an embedding model to create more vectors. This is a key stage, because the way LightRAG finds the starting entities for a query is by running a semantic search over the knowledge graph, and that's what happens here: those vectors are saved into the semantic search store within LightRAG, and the entities and relationships are saved into its graph database. As you can see, there was a lot of activity in merging and resolving these entities, in some cases going to an LLM to generate more comprehensive descriptions. At that point the document is completely processed, and we can close out of this.

Now if we go into the knowledge graph and click refresh on the top left, you can see we have a graph of this data, and there are different visualizations of it; if you click the dots on the left, this is a circle pack, for example. Back to those entity descriptions I talked about: if we zoom in on the FIA entity and click it, then under properties on the right-hand side you'll see a description, and if you click that, you can see a full description of this entity and how it relates to all of the other entities it was connected to through the entity resolution process. There are some other interesting properties on the right-hand side, too. The source highlights the document this entity was extracted from, which is the financial regulations document, but it also references the various text chunks that were processed from the document, so it's now possible to trace back and cite the actual document chunks that entity information came from.

You can zoom in on any part of the knowledge graph and see the various connections. Again, if I click on the FIA, you can see it's connected to Formula 1. If you click on F1 teams, you can see all of the entities connected to that, and there's a lot of mention of the cost cap, because these are the financial regulations. If I click on cost cap, you can see it in turn has lots of connections, like full-year reporting periods, which connect back to the F1 team and the actual end of year. This interconnected web of information is brilliant when it comes to retrieval.

So let's test the retrieval side. Click on retrieval and ask it to tell me about the FIA, and you can see it streaming an answer. Actually, what we'll do instead is choose Only Need Context on the right-hand side and ask the same question again. What it has provided here is the list of entities, the list of relationships, and the various text chunks referenced by those entities; that's what grounds the model when it creates a response. You saw in the knowledge graph, where we zoomed in on the FIA, that there was a huge web of interconnected nodes, and that's essentially what we're getting back here. So this is the entity.
We're getting the full LLM-generated description that was built from the various connections, but we're also getting all of the nearest neighbors of that entity, and as you can see, this is quite a large JSON return from the knowledge graph. Within the relationships, for example, you can see that the FIA as an entity is connected to Formula 1, with a description of that relationship: the FIA is the governing body responsible for overseeing and regulating F1 racing. This really is incredible context for an LLM to generate an accurate answer.

One thing I just realized is that the query mode is set to global here, which brings us back to the dual-level retrieval I talked about earlier. Your options are: naive, which doesn't use the knowledge graph at all and is just your standard semantic vector store search; local, which uses the knowledge graph but only searches for near-exact matches of terms in the query; global, which extracts concepts; and hybrid, which is a mix of local and global, and is what I'd recommend if you're looking to return knowledge graph information. If instead you want LightRAG to act as a standalone expert system, I'd recommend mix mode, which combines semantic search and the knowledge graph.

The key to using mix mode, I think, is reranking, because you're going to end up with a large number of document chunks from both the knowledge graph and the semantic retrieval, and the reranker looks at all of those and gives you the ten most relevant for the question that was asked. When I was evaluating the system, I found that mix mode with a reranker performed way better than without one. The other thing you can see is that quite a lot of data is about to be passed into your LLM, so it's important to set reasonable maximum token budgets for entities and for relationships, because if you have a very large knowledge graph with entities that have vast numbers of nearest neighbors, you could absolutely burn through your LLM token usage. It's worth setting these to realistic levels.

And this is the step-by-step process for mix mode. The user asks a question, like I did there. LightRAG extracts local and global keywords, as we covered with dual-level retrieval, generates embeddings of those keywords, and carries out a semantic search for the entities and relationships that were embedded during ingestion. Off the back of the entities and relationships that come back, it carries out a graph traversal looking for the one-hop nearest neighbors, which is why I was saying LightRAG isn't as strong as other graph RAG solutions for multi-hop queries. It then gathers all the text chunks and, as I recommended, you send them into a reranker, which is a cross-encoder that compares the user's question with the various chunks to give you, say, the top ten. From there, those entities, those relationships and the top ten chunks are sent to the LLM to generate an answer, which is returned to the user.

Now that we have our LightRAG system up and running, if you click on API you can see the various endpoints you can hit so that we can connect it to n8n. First off, let's post a query to the system to get a response. I've come into n8n and created a new workflow, so let's add a chat trigger so that we can have a conversation with an AI agent.
Next, let's add an AI agent, and then a chat model; I'll just use OpenAI again and set it to GPT-4.1, and I've added simple memory just to test this out. Then, under tools, click on HTTP Request. Back in the documentation, if we go down to the query endpoint you can see the schema we need to use, and if you click execute it gives you a sample cURL request. We can copy that, click Import cURL in n8n, drop it in and import, and now you can see the URL we're going to hit, which ends in /query.

We do need to pass authentication, so I expect this will end in an error, but this is an example of the JSON we're going to send. If you click execute, you'll see "authorization failed, please check your credentials", which makes sense. So let's set up authentication. We'll choose a generic credential type and create a new header auth credential. Back in the API docs, at the very top, if you click authorize you can see the API key header; we set this in our environment variables on Render as LIGHTRAG_API_KEY, so paste that value in and click authorize. Now, if we execute a request via the documentation, you can see the header it sends, which is X-API-Key plus the key, and that's exactly what we need to set in our header auth credential. Save that.

Now if we click execute, we get a different error, which is good: "your request is invalid". Within the body we're passing, you can actually remove all of this and, instead of a raw JSON body, use the fields below and map what's in the schema. It's looking for "query", which needs the query text, so let's add that field and have its value defined by the AI; we just press this button and the agent will populate it automatically. We'll leave it at that for the minute. Click save, rename the tool to our F1 expert, and within the AI agent specify a system message: you must trigger the F1 expert tool.

Now let's ask a question: explain the F1 financial regulations. You can see it's going to the LightRAG application to generate a response, and there's the answer, covering the cost cap, reporting and compliance, breaches and penalties, and so on. If we look at the actual tool call, you can see we got back quite a detailed LLM response from LightRAG, including citations. We only passed the query parameter, which you can see on the left-hand side, but as you saw in the retrieval section, there are a lot of parameters you can set to configure and tune the responses. Within the API, the default query mode is mix, and as I mentioned, you should use reranking with that; for the rest of the parameters it just uses what's set in the environment variables, such as top K at 40, top chunks at 10, and max tokens at 30,000.

So as you can see, it's pretty straightforward to spin up the LightRAG application on Render, manually upload some documents to auto-generate your knowledge graph, and then connect it as a tool for an AI agent within n8n. Using mix mode and letting the LLM inside LightRAG generate the response means it can act as an independent expert based on whatever documents you upload.
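And if you ever want to hit that same endpoint outside of n8n, here's a minimal sketch in Python. The /query path and the X-API-Key header come straight from the API docs we just looked at, but the host, key and parameter values are placeholders, and it's worth confirming the exact request and response fields against the /docs page on your own instance:

```python
# Minimal sketch of calling the LightRAG /query endpoint directly. Swap in your
# own Render URL and the key you set in LIGHTRAG_API_KEY; the optional parameters
# mirror the options discussed above and may differ by LightRAG version.
import requests

LIGHTRAG_URL = "https://your-lightrag-service.onrender.com"  # placeholder
API_KEY = "your-lightrag-api-key"                             # placeholder

response = requests.post(
    f"{LIGHTRAG_URL}/query",
    headers={"X-API-Key": API_KEY},
    json={
        "query": "Explain the F1 financial regulations",
        "mode": "mix",               # naive | local | global | hybrid | mix
        "only_need_context": False,  # True returns entities/relationships/chunks instead of an LLM answer
    },
    timeout=120,
)
response.raise_for_status()
print(response.json()["response"])  # field name as seen in my testing; verify against /docs
```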
And this is where there's a lot of crossover between the functionality of n8n and of LightRAG. Both platforms can take uploaded documents, embed them and insert them into a vector store. Both can generate LLM responses. LightRAG has some capability to manage chat history. Both have API endpoints. So it's totally viable to have LightRAG as your independent knowledge base that you simply ping as a tool from an AI agent.

There are some shortcomings to using only LightRAG, though. There are no agentic capabilities in LightRAG like you have in n8n. There's no workflow logic, so you can't build RAG ingestion pipelines; sure, you can upload a document via the UI, but if you want to programmatically ingest documents from a folder, from Gmail or from web scraping, you can't really do that without building around LightRAG's APIs. There's only basic chunking in LightRAG: yes, you can configure the number of characters and tokens, but it's a rudimentary splitter that can cut in the middle of a word. You can only specify a single LLM for both ingestion and inference, and as I mentioned, we have GPT-4.1 nano set here to speed up ingestion, which means you're also using a really basic model to generate responses, whereas in n8n you can use different models for different tasks. And LightRAG doesn't support the more advanced features you need to get super accurate answers from a RAG agent: things like hybrid search, contextual retrieval, metadata filters, or chatting with your databases and spreadsheets.

That's why, instead of treating LightRAG as a black-box knowledge base that you query for answers, we prefer to use it purely for its knowledge graph capabilities. And that's what we've built here with our state-of-the-art n8n agent. If you've followed any of our RAG masterclass videos, you'll know we've built out extensive RAG ingestion pipelines covering advanced techniques like contextual embeddings, change tracking with a record manager, document enrichment with advanced metadata extraction and filtering, and hybrid search plus reranking at inference time. I've now extended this to add an additional tool to this agentic RAG agent: not only can it query a vector store, it can also query the knowledge graph and get back entities and relationships, and the agentic RAG system handles the query routing. If it makes sense to go to the knowledge graph, it does; if not, it goes straight to the vector store; and we have versions of the system that can go to a database, query a spreadsheet, carry out a web search and so on.

Looking at the changes from an ingestion perspective: we're picking up new files, in this case from a Google Drive folder, though that could be OneDrive or a local folder if you're self-hosting. Each document in the folder is processed by this RAG pipeline. First we extract data from the document, depending on its file type. From there we query the record manager, comparing the contents of the file with what we already have saved, to see whether there's anything new that needs saving. We then enter our document and metadata enrichment phase, then work through contextual vector embeddings to enrich each chunk and contextualize it within the document. And then we have this new section for knowledge graph updates.
What we do here is: if the document has never been seen before, we call the LightRAG server and pass the text extracted earlier in the pipeline into the LightRAG document store. This is essentially the same as uploading the document via the LightRAG UI. We then fetch that document to retrieve its document ID, which we use to update our record manager in Supabase. That allows us to carry out deletions in the future if the document is updated, and that's covered in this section here: if the contents have changed, we delete the document from the knowledge graph, which removes all of its entities and relationships, and then we re-ingest the new version. There's a little polling loop here that makes sure the document has been fully deleted before we add the new version.

With this RAG ingestion pipeline, documents are now ingested into our knowledge graph, so at inference time, when someone asks the agent a question, we can query the knowledge graph via this tool here. That tool triggers this retrieval sub-workflow, which now has a new switch that routes graph requests down this branch. Here we're querying the graph; it's the same endpoint we hit previously, but the difference is we're only retrieving the context, not the LLM response from within LightRAG, so we just get the entities and relationships JSON. The response is then tidied up and sent back to the agent to generate a grounded response based on those entities and relationships.

I've uploaded a few more documents to our RAG system, and you can see we have a much richer knowledge graph now. So let's ask this agent a question: what are the regulations on wind tunnel usage? Based on the system prompt, you'll see that it now queries both the vector store and the knowledge graph at the same time. With our vector store, we have really tidy contextual embeddings, where each chunk has an intro sentence that grounds it in the document, and we also have full-text search as part of the hybrid retrieval, as well as complex metadata filters. The agent has also gone to the knowledge graph and received the following entities and relationships, which you can see there. So the agent has retrieved these context-rich chunks from our vector store, plus the overall local and global context from the knowledge graph, and it's able to produce this response, which is incredibly comprehensive and also includes references to the various sections of the documents. You can see this is at another level compared to most n8n RAG systems.

If you'd like access to our state-of-the-art n8n RAG system, which now includes full knowledge graph creation, check out the link in the description to our community, The AI Automators, where you can join hundreds of fellow automators looking to leverage the latest in AI to automate their businesses and further their careers. We're obsessed with building accurate and reliable agents, which is why we dive so deep into these topics, and within the community you'll have all the resources you need to get your agents to another level. We'd love to see you there, so check it out below. Thanks for watching, and I'll see you in the next one.