Hello, how are you doing? In this video I'm going to talk about LLM embeddings. Have you heard the term "embedding" but haven't had time to research it? If so, then watch along with me for the next few minutes and I will quickly get you up to speed. Okay, let's get started.

What is an LLM embedding? Well, LLMs can take in text inputs and generate embeddings. Embeddings are one or more sequences of n-dimensional vectors. In this example, there is an LLM generating an embedding for a text document. Quick note: although in this example we are showing an embedding being created for a text document, you can also create embeddings for sentences and even individual words.

So what does the embedding generation process look like? Step one is tokenization: the LLM tokenizer converts raw text into tokens. In step two, each token is mapped to an n-dimensional vector through an embedding layer in the LLM. The result of step two is an embedding, which in the example of a text document is a sequence of vectors, where each vector corresponds to a token and captures its meaning in n-dimensional space. Quick callout: I'm showing a generic LLM in this diagram, but it turns out there are special LLMs optimized for creating embeddings. All of the major LLM providers also offer versions of their models optimized for generating embeddings.

A nuanced feature of LLM embeddings is that they are context sensitive. What does this mean, exactly? Well, the same word can have different embeddings depending on the surrounding text, allowing the LLM to capture the meaning of the word. In the first sentence, "The guy is cool," the word "cool" is used to describe someone who's trendy or hip. In the second sentence, "It is cool outside at night," the word "cool" is used to describe temperature. In these two examples, the same word "cool" will have a different embedding because of the surrounding text, allowing the LLM to correctly capture the meaning of "cool" in both sentences. I'll show a small code sketch of this in a minute.

So why do system builders care about LLM embeddings? Why should you care? Well, when you build a system with an LLM, you're likely to bring in proprietary data from your organization, and you'll want to augment your third-party LLM with this data. So how exactly do you integrate your proprietary data into an LLM-driven system? One approach is to take your proprietary text documents and systematically create embeddings for each of them.

Now, where do you store these embeddings? Well, the best place to store them is a vector database. I know what you might be thinking: great, another kind of database. After you mastered relational databases, along came key-value stores, and then columnar data stores, and then we got graph DBs for social-network-related use cases, and then came blockchain databases for distributed immutable ledgers, and now... and now we have vector databases. But here's the deal: vector databases are optimized for storing vectors and doing the types of vector operations that come with that. Specifically, they are optimized to efficiently handle similarity searches, as well as vector insert, update, and delete operations. They're able to create efficient n-dimensional vector indexes, which enable them to support all of these ML- and LLM-driven use cases. In a world of purpose-built data stores, you want to choose a data store that is optimized for the query patterns of your system's use case; doing this gives you the best performance at optimal operating cost. I have an upcoming video on vector databases where I will go deeper into this topic, but for now you should have a good conceptual understanding of why vector databases are needed for embeddings.
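Since the tokenize-then-embed steps and the context sensitivity of embeddings are the heart of this video, here's the promised minimal Python sketch of both. This is an illustration under assumptions I'm choosing myself: it uses the Hugging Face transformers library with the sentence-transformers/all-MiniLM-L6-v2 model (any encoder-style embedding model would behave the same way), and it compares the per-token vectors for "cool" in the two example sentences from earlier.

```python
# A minimal sketch, assuming the Hugging Face transformers library and the
# sentence-transformers/all-MiniLM-L6-v2 model (hidden size 384); any
# encoder-style embedding model would work similarly.
import torch
from transformers import AutoModel, AutoTokenizer

model_name = "sentence-transformers/all-MiniLM-L6-v2"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

def token_embeddings(text: str):
    # Step 1: tokenization -- the tokenizer converts raw text into tokens
    inputs = tokenizer(text, return_tensors="pt")
    # Step 2: each token is mapped to an n-dimensional vector by the model
    with torch.no_grad():
        vectors = model(**inputs).last_hidden_state[0]  # (num_tokens, 384)
    tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0].tolist())
    return tokens, vectors

# The same word "cool" in two different contexts
tokens_a, vecs_a = token_embeddings("The guy is cool.")
tokens_b, vecs_b = token_embeddings("It is cool outside at night.")

cool_a = vecs_a[tokens_a.index("cool")]
cool_b = vecs_b[tokens_b.index("cool")]

# Related but not identical: the surrounding text changed the embedding
sim = torch.cosine_similarity(cool_a, cool_b, dim=0).item()
print(f"cosine similarity between the two 'cool' vectors: {sim:.3f}")
```

If you run something along these lines, the two "cool" vectors come out similar but clearly not identical, which is exactly the context sensitivity we just talked about.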
So lastly, let me quickly show you the flow for an LLM-driven system that includes proprietary data stored in a vector database. First, the user sends a prompt to the LLM. Next, the LLM tokenizer converts the text in the user prompt into tokens. The user prompt is also submitted to an embedding LLM, which generates a query embedding and uses it to perform a similarity search on the vector database. The semantically relevant documents are then paired back up with the original user prompt, and the LLM uses both of these to generate a response: the response embeddings are converted to output tokens, and from there the output tokens are converted back into the text of the user's response. By the way, this type of LLM-driven system is also known as RAG, or retrieval-augmented generation. There's a small code sketch of this retrieval step at the very end of this transcript.

In summary, LLM embeddings enable you to capture your proprietary data sets and integrate them into your LLM-driven system. These types of LLM-driven systems are also known as RAG systems. I'm working on an upcoming video on LLMs and RAG, but this gives you a quick, high-level introduction to this type of LLM system.

Okay, thanks for watching. This video, along with all my other videos in the ML/AI Knowledge Concepts playlist, is listed in the YouTube description. I invite you to watch other videos on my channel, and if you like the way I'm sharing this content, please consider subscribing; when you subscribe, it really helps my channel grow. One last thing: we all love technology, and we're all excited about the innovation in the cloud, machine learning, and AI, but don't forget to carve out some time to live in the real world. Go outside, go swimming, go hiking, go climbing, go surfing. Get out and move your body, and if you do, tell me in the comments; I want to hear about it. And with that, have a great day. Thanks!
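To make that flow a bit more concrete, here's the promised sketch of the retrieval and prompt-assembly steps. Everything in it is illustrative rather than authoritative: I'm using the open-source chromadb library as just one example of a vector database (others expose similar insert and query operations), the documents and collection name are made up, and build_rag_prompt is a hypothetical helper I'm defining only to show how retrieved documents get paired back up with the user prompt.

```python
# An illustrative RAG retrieval + prompt-assembly sketch. chromadb is used as
# one example of a vector database; the documents, the collection name, and
# the build_rag_prompt helper are all hypothetical.
import chromadb

client = chromadb.Client()  # in-memory instance, fine for a demo

# Insert: Chroma embeds each document with its default embedding model
collection = client.create_collection(name="proprietary_docs")
collection.add(
    ids=["doc-1", "doc-2"],
    documents=[
        "Our returns policy allows refunds within 30 days of purchase.",
        "Support tickets are triaged within one business day.",
    ],
)

def build_rag_prompt(user_prompt: str, retrieved_docs: list[str]) -> str:
    # Pair the semantically relevant documents back up with the original prompt
    context = "\n\n".join(retrieved_docs)
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {user_prompt}"
    )

user_prompt = "How long do I have to return an item?"
# Similarity search: embed the query and find the nearest document vectors
hits = collection.query(query_texts=[user_prompt], n_results=1)
augmented_prompt = build_rag_prompt(user_prompt, hits["documents"][0])

# The augmented prompt would now go to your chat LLM of choice to generate
# the final response; that call is provider-specific, so it's omitted here.
print(augmented_prompt)
```

The design point is simply that the vector database answers "which of my proprietary documents are semantically closest to this prompt?", and the LLM then does the actual answering with that context in hand.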