
Implementing RAG with watsonx flows engine

Feb 10, 2025

Lecture Notes: Implementing RAG with watsonx flows engine

Introduction to RAG

  • RAG (Retrieval-Augmented Generation) retrieves relevant documents at query time and passes them to an LLM (Large Language Model) as context, which makes it an efficient way to use LLMs on business data.
  • Deploying RAG at scale involves complexities that do not show up in a development environment (e.g., a Jupyter notebook).
  • Key challenges include standing up vector databases, managing embeddings, creating authenticated APIs, and handling large data volumes and many users.
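
To make the "vector database" and "embeddings" challenges concrete, here is a minimal in-memory sketch of the retrieval half of RAG. It is an illustration only: the embed function is a toy bag-of-words stand-in for a real embedding model, and the names embed, cosine_similarity, and retrieve are hypothetical, not part of any IBM tooling.

```python
import math
from collections import Counter


def embed(text: str) -> Counter:
    """Toy 'embedding': a bag-of-words count vector (stand-in for a real model)."""
    return Counter(text.lower().split())


def cosine_similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the query."""
    q = embed(query)
    ranked = sorted(
        documents,
        key=lambda d: cosine_similarity(q, embed(d)),
        reverse=True,
    )
    return ranked[:k]


docs = [
    "revenue grew in the software segment",
    "the board approved a new dividend",
    "consulting revenue was flat year over year",
]
top = retrieve("software revenue", docs, k=1)
print(top)
```

A production deployment replaces each piece of this sketch (embedding model, similarity search, storage) with a managed service, which is where the operational complexity listed above comes from.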

Three Steps to RAG Implementation

Step 1: Installation

  • Objective: Install the watsonx flows engine CLI (wxflows).
  • Process:
    • Download the installer, then run the install command.
    • Verify the installation by running wxflows --version (demonstrated on a MacBook in the lecture).
    • Additional command: wxflows --help prints an overview of the available commands.
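
The verification step can also be scripted, for example in a setup check. The sketch below assumes only what the notes state, that wxflows accepts --version; the helper name cli_version is hypothetical, and the format of the version string is not assumed.

```python
import shutil
import subprocess
from typing import Optional


def cli_version(executable: str = "wxflows") -> Optional[str]:
    """Return the output of `<executable> --version`, or None if it is not on PATH."""
    path = shutil.which(executable)
    if path is None:
        return None
    result = subprocess.run(
        [path, "--version"],
        capture_output=True,
        text=True,
        check=False,
        timeout=30,
    )
    # Some CLIs print the version to stderr, so fall back to it.
    return result.stdout.strip() or result.stderr.strip()


version = cli_version()
print(version if version is not None else "wxflows not found on PATH")
```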

Step 2: Authentication

  • Objective: Authenticate with watsonx flows engine.
  • Process:
    • Run wxflows login to start the authentication.
    • Required inputs: environment, domain, and admin key.
    • Use wxflows whoami to verify authentication details (domain, environment, admin key, API key).

Step 3: Data Upload and Deployment

  • Objective: Upload data and deploy a RAG flow.
  • Process:
    • Run wxflows init --interactive to initiate data chunking.
    • Provide data location (e.g., IBM's annual report in markdown) and chunking parameters.
    • After chunking completes, three new files are generated.
    • Customize the RAG flow by altering its steps.
      • Options include adding prompt templates, hallucination score steps, and distance metrics.
    • Deploy the vector store using wxflows collection deploy.
    • Choose the desired RAG flow in the TOML file and uncomment it.
    • Deploy the flow using wxflows flows deploy.
    • Receive an API endpoint for the enterprise RAG application.
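
The notes do not say which chunking parameters the interactive prompt asks for. As an illustration of what such parameters typically control, here is a minimal sketch of fixed-size chunking with overlap. The character-based strategy and the names chunk_text, chunk_size, and overlap are assumptions for this example, not the engine's actual behavior.

```python
def chunk_text(text: str, chunk_size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into chunks of up to chunk_size characters.

    Consecutive chunks share `overlap` characters so that a sentence cut at a
    boundary still appears intact in at least one chunk.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must satisfy 0 <= overlap < chunk_size")

    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # the last chunk already reaches the end of the text
    return chunks


sample = "abcdefghij" * 3  # 30 characters
chunks = chunk_text(sample, chunk_size=12, overlap=4)
for c in chunks:
    print(repr(c))
```

Larger chunks give the model more context per retrieved passage, while smaller chunks make retrieval more precise; the overlap trades some storage for robustness at chunk boundaries.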

Outcomes

  • Once deployed, the endpoint can be queried and returns completions together with groundedness warnings, hallucination metrics, and source documents.
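
A client for such an endpoint might look like the sketch below. The notes do not give the request or response schema, so everything here is hypothetical: the URL, the Bearer authorization scheme, the "question" payload key, and the response keys "completion", "sources", and "groundedness_warning" are placeholders to be replaced with whatever the deployed flow actually exposes. The example builds the request without sending it.

```python
import json
import urllib.request


def build_query_request(endpoint: str, api_key: str, question: str) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request for a RAG endpoint.

    The payload shape and the auth header scheme are assumptions.
    """
    payload = json.dumps({"question": question}).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=payload,
        headers={
            "Content-Type": "application/json",
            "Authorization": f"Bearer {api_key}",  # hypothetical scheme
        },
        method="POST",
    )


def summarize(response: dict) -> dict:
    """Pull out the parts of a response the notes mention (hypothetical key names)."""
    return {
        "answer": response.get("completion"),
        "num_sources": len(response.get("sources", [])),
        "has_warning": bool(response.get("groundedness_warning")),
    }


req = build_query_request("https://example.invalid/rag", "MY_API_KEY", "What was revenue?")

# A made-up response, only to show how the summary helper behaves.
fake_response = {
    "completion": "Revenue grew.",
    "sources": ["annual_report.md"],
    "groundedness_warning": None,
}
summary = summarize(fake_response)
print(summary)
```

Sending the request would be a call to urllib.request.urlopen(req) once the real endpoint, credentials, and schema are known.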

Conclusion

  • Implementing RAG with watsonx flows engine simplifies deployment by automating complex tasks such as tokenization, retrieval, and setting up guardrails.