Building a RAG Application: Overview

Feb 14, 2025

Lecture Notes: Creating a RAG Application with DeepSeek and Ollama

Introduction

  • Host: Prashna Aik
  • Purpose: Demonstrate the end-to-end creation of a RAG (Retrieval-Augmented Generation) application
  • Tools: DeepSeek, Ollama embeddings
  • Focus: installation and implementation
  • Promise: high accuracy with a fully local installation

Requirements and Setup

  • Libraries: pdfplumber for reading PDFs
  • Environment: local installation of Ollama; Streamlit for building the app
  • Objective: step-by-step demonstration of the code and its performance

Implementation Steps

Initial Steps

  • Import Libraries: import streamlit as st; pdfplumber is used through LangChain's PDFPlumberLoader
  • Load PDF Content: use PDFPlumberLoader to read the PDF data
  • Text Splitting (see the sketch after this list):
    • Perform recursive character text splitting
    • Convert the resulting chunks into embeddings
    • Store them in an in-memory vector store to avoid cloud dependency
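
A minimal sketch of the loading-and-splitting step, assuming LangChain's community loader and text-splitter packages; the chunk sizes are illustrative, not values stated in the lecture:

```python
from langchain_community.document_loaders import PDFPlumberLoader
from langchain_text_splitters import RecursiveCharacterTextSplitter

def load_pdf_documents(file_path: str):
    """Read a PDF from disk into LangChain Document objects."""
    return PDFPlumberLoader(file_path).load()

def chunk_documents(raw_documents):
    """Recursively split the documents into overlapping character chunks."""
    text_splitter = RecursiveCharacterTextSplitter(
        chunk_size=1000,      # illustrative values, tune per document
        chunk_overlap=200,
        add_start_index=True,
    )
    return text_splitter.split_documents(raw_documents)
```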

Embedding and Vector Store

  • Use langchain_ollama for Ollama embeddings
  • Ollama Embeddings: convert text chunks into vectors, stored locally (see the sketch below)
  • LangChain Core: import and use ChatPromptTemplate for chat prompts
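
A sketch of that wiring, assuming the langchain-ollama package and the deepseek-r1:1.5b Ollama model tag named later in the notes; the store lives entirely in process memory:

```python
from langchain_ollama import OllamaEmbeddings
from langchain_core.vectorstores import InMemoryVectorStore

# Embeddings come from a local Ollama model, so vectors never leave
# the machine; the store itself is held in RAM, with no cloud dependency.
embedding_model = OllamaEmbeddings(model="deepseek-r1:1.5b")
vector_store = InMemoryVectorStore(embedding_model)
```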

DeepSeek and Styling

  • DeepSeek: the model behind response generation; the lecture also adds styling and visual enhancements to the app
  • Prompt Template Configuration (example below):
    • Designed for concise, factual responses
    • The chat prompt includes the user query and the document context
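
A plausible template matching that description; the exact instruction wording is an assumption, not a quote from the lecture:

```python
from langchain_core.prompts import ChatPromptTemplate

# Instructs the model to answer concisely and factually from the
# retrieved context; placeholders are filled in at query time.
PROMPT_TEMPLATE = """
You are an expert research assistant. Use the provided context to
answer the query. If unsure, say that you don't know. Be concise
and factual.

Query: {user_query}
Context: {document_context}
Answer:
"""

conversation_prompt = ChatPromptTemplate.from_template(PROMPT_TEMPLATE)
```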

Document Handling

  • Document Store: set up local storage for uploaded PDFs
  • Model and Embedding Configuration (see the sketch below):
    • Embedding Model: Ollama embeddings running DeepSeek R1 1.5B
    • Vector Store: in-memory vector store paired with the Ollama LLM
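
Pulling the configuration together in one hedged sketch; the storage path is an assumed example, while deepseek-r1:1.5b is the Ollama tag for the model named above:

```python
from pathlib import Path
from langchain_ollama import OllamaLLM

# Assumed local folder for uploaded PDFs (the exact path is illustrative).
PDF_STORAGE_PATH = Path("document_store/pdfs")
PDF_STORAGE_PATH.mkdir(parents=True, exist_ok=True)

# DeepSeek R1 1.5B served locally by Ollama as the response model.
language_model = OllamaLLM(model="deepseek-r1:1.5b")
```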

Functionality

File Upload and Processing

  • Save Uploaded File: a helper function to persist file uploads (sketch below)
  • Load PDF Documents: loading begins as soon as a file is uploaded
  • Chunk Documents: text splitting prepares the content for indexing
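
A sketch of the upload helper, assuming Streamlit's UploadedFile interface and the PDF_STORAGE_PATH defined earlier; the function name is illustrative:

```python
def save_uploaded_file(uploaded_file):
    """Persist a Streamlit upload into the local document store."""
    file_path = PDF_STORAGE_PATH / uploaded_file.name
    with open(file_path, "wb") as f:
        f.write(uploaded_file.getbuffer())
    return str(file_path)
```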

Embedding and Similarity Search

  • Index Documents: store the chunks in the vector store
  • Similarity Search: a cosine-similarity search finds chunks related to the query (sketch below)
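
In this design, indexing and retrieval reduce to two thin wrappers around the in-memory vector store, which scores by cosine similarity; the default k=4 is an assumption:

```python
def index_documents(document_chunks):
    """Embed each chunk and add it to the in-memory vector store."""
    vector_store.add_documents(document_chunks)

def find_related_documents(query: str, k: int = 4):
    """Return the k chunks most similar to the query."""
    return vector_store.similarity_search(query, k=k)
```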

User Interaction and Response Generation

  • Generate Answer: combines the user query with the retrieved document context
  • Chat Prompt: chained with the conversational prompt template
  • Response Chain: invoked with the user query and the context text (sketch below)
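
A sketch of that chain using LangChain's pipe (|) composition, reusing conversation_prompt and language_model from the earlier sketches:

```python
def generate_answer(user_query: str, context_documents) -> str:
    """Fill the prompt with query and context, then invoke the local model."""
    context_text = "\n\n".join(doc.page_content for doc in context_documents)
    response_chain = conversation_prompt | language_model
    return response_chain.invoke(
        {"user_query": user_query, "document_context": context_text}
    )
```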

UI Configuration

  • Streamlit UI: set up for PDF upload and analysis (sketch below)
  • PDF Selection: a single PDF at a time, for simplicity
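
Tying the pieces together, a minimal Streamlit page in the spirit of the lecture; the widget labels are assumptions:

```python
import streamlit as st

st.title("Documind AI")  # the application name from the lecture

# A single-PDF uploader keeps the demo simple.
uploaded_pdf = st.file_uploader(
    "Upload a PDF document", type="pdf", accept_multiple_files=False
)

if uploaded_pdf:
    # Save, load, chunk, and index the document on upload.
    saved_path = save_uploaded_file(uploaded_pdf)
    index_documents(chunk_documents(load_pdf_documents(saved_path)))

    user_input = st.chat_input("Ask a question about the document...")
    if user_input:
        with st.chat_message("user"):
            st.write(user_input)
        related_docs = find_related_documents(user_input)
        with st.chat_message("assistant"):
            st.write(generate_answer(user_input, related_docs))
```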

Execution

  • Command Execution:
    • Activate environment: conda activate vnv
    • Run application: streamlit run rag_deep.py
  • User Interface: upload a PDF and interact with the application through queries

Conclusion

  • The local, open-source setup ensures data privacy
  • The application uses Ollama and DeepSeek for embedding and response generation
  • It provides accurate and efficient retrieval-augmented answers

Additional Notes

  • Naming: the application is named Documind AI
  • Resources: the code is provided in the video description
  • Performance: described as accurate and effective for data retrieval and query resolution

Hope you find this summary useful for understanding the implementation and capabilities of the RAG application as demonstrated by Prashna Aik.