Hybrid Search in AI Systems

Overview

The lecture explains the limitations of pure vector (semantic) search and pure keyword (full-text) search in AI agents, and demonstrates how hybrid search combines the two, using Supabase and Pinecone, for more accurate and flexible retrieval-augmented generation (RAG) systems.

Vector Search and Its Limitations

Vector (semantic) search captures the meaning of queries by transforming them into dense numerical vectors.
It retrieves results with similar meanings, which is useful for broad, intent-based queries.
It struggles with precise queries involving specific names, product codes, acronyms, or exact terms.
It may return loosely related results because it ranks by semantic closeness rather than exact wording.
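As a rough illustration of how dense vectors are compared, the sketch below ranks documents by cosine similarity. The three-dimensional vectors are toy values invented for this example (real embedding models produce hundreds or thousands of dimensions), and the function names are hypothetical.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def rank_by_similarity(query_vector, docs):
    """Return (name, score) pairs, most similar first."""
    scored = [(name, cosine_similarity(query_vector, vec)) for name, vec in docs.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Toy "embeddings", made up for illustration only.
documents = {
    "cotton tee": [0.9, 0.1, 0.0],
    "crew-neck t-shirt": [0.8, 0.2, 0.1],
    "hiking boots": [0.1, 0.9, 0.3],
}
query_vector = [0.85, 0.15, 0.05]  # stands in for an embedded query about shirts

ranking = rank_by_similarity(query_vector, documents)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

Both shirt entries score close to each other and well above the boots, even though their wording differs. The same mechanism explains the weakness noted above: nothing in the score rewards an exact product code or acronym.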

Keyword (Full Text) Search and Its Limitations

Keyword search matches exact or partial words in titles and descriptions using sparse vectors.
Delivers precise results when queries match specific terms in the database.
It lacks understanding of synonyms or conceptual meaning; e.g., a query for "t-shirt" will not match "tee" unless the synonym is defined explicitly.
It is inflexible with varied or natural-language queries.
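The synonym problem can be seen in a minimal sketch of term matching. This is a deliberately simplified stand-in for a full-text engine (no stemming, stop words, or relevance weighting), and the catalog entries and function names are invented for illustration.

```python
import re

def tokenize(text):
    """Lowercase the text and split it into terms, keeping hyphenated terms whole."""
    return set(re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", text.lower()))

def keyword_search(query, docs):
    """Return (document, matched_term_count) pairs for documents sharing a term with the query."""
    query_terms = tokenize(query)
    results = []
    for doc in docs:
        overlap = query_terms & tokenize(doc)
        if overlap:
            results.append((doc, len(overlap)))
    return sorted(results, key=lambda pair: pair[1], reverse=True)

# Hypothetical catalog entries.
catalog = ["Blue cotton t-shirt", "Graphic tee, size M", "Wool sweater SKU-4471"]

print(keyword_search("t-shirt", catalog))   # finds only the entry containing "t-shirt"
print(keyword_search("SKU-4471", catalog))  # exact product codes match precisely
```

A query for "t-shirt" never reaches the "tee" entry, while the product code is found exactly, which is the mirror image of vector search's behavior.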

Hybrid Search: Concept and Implementation

Hybrid search merges vector and keyword search for both semantic relevance and precision.
User queries are transformed into both dense (vector) and sparse (keyword) vectors.
Search results from both methods are combined and ranked using techniques like reciprocal rank fusion.
Because items found by both methods accumulate score from each list, exact matches that are also semantically relevant tend to surface first, followed by semantically related items.
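Reciprocal rank fusion can be sketched in a few lines: each document receives 1 / (k + rank) from every result list it appears in, and the contributions are summed. The constant k = 60 is a commonly used default, not a value taken from the lecture, and the document IDs are placeholders.

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one list of (doc_id, score).

    Each list contributes 1 / (k + rank) per document, with rank starting at 1.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores.items(), key=lambda pair: pair[1], reverse=True)

# Placeholder result lists, best match first.
keyword_results = ["doc_a", "doc_b"]
vector_results = ["doc_c", "doc_a", "doc_d"]

fused = reciprocal_rank_fusion([keyword_results, vector_results])
for doc_id, score in fused:
    print(f"{doc_id}: {score:.4f}")
```

Here doc_a comes out on top because it appears in both lists, even though it is only second in the vector results. Since only ranks are used, the raw scores of the two search methods never need to be put on a common scale.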

Hybrid Search with Supabase

Supabase hybrid search requires a documents table with an embedding (dense vector) column and a tsvector (full-text) column.
The pgvector extension must be enabled for the embedding column (full-text search via tsvector is built into Postgres), and indexes are created on both columns for speed.
A custom database function performs both searches and fuses the results, with ranking details available.
Edge functions interact with OpenAI's embedding API to generate vectors from queries.
Action nodes in n8n send chat queries to the Supabase edge function, which returns ranked search results.

Hybrid Search with Pinecone

Pinecone hybrid search uses a single index configured with dense vectors and sparse (keyword) values.
Documents are chunked, and both dense (multilingual-e5-large) and sparse (pinecone-sparse-english-v0) embeddings are generated for each chunk.
At query time, embeddings are generated with the input type set to "query", which the models expect for search inputs.
The Pinecone API supports hybrid querying, returning results with both semantic and keyword matches scored.
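When a single index holds both dense and sparse values, one common way to balance the two signals is to scale the query vectors by a weight alpha before sending them. The sketch below shows only that client-side arithmetic; it makes no Pinecone calls, the function name is hypothetical, and the vectors are toy values. The sparse dictionary mirrors the indices/values shape used for sparse vectors.

```python
def weight_hybrid_query(dense, sparse, alpha):
    """Scale a dense/sparse query pair by a convex weight.

    alpha = 1.0 keeps only the dense (semantic) signal,
    alpha = 0.0 keeps only the sparse (keyword) signal.
    """
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must be between 0 and 1")
    weighted_dense = [value * alpha for value in dense]
    weighted_sparse = {
        "indices": list(sparse["indices"]),
        "values": [value * (1.0 - alpha) for value in sparse["values"]],
    }
    return weighted_dense, weighted_sparse

# Toy query embeddings, invented for illustration.
dense_query = [0.2, 0.4, 0.1]
sparse_query = {"indices": [12, 507], "values": [0.8, 0.3]}

dense_w, sparse_w = weight_hybrid_query(dense_query, sparse_query, alpha=0.75)
print(dense_w, sparse_w)
```

A higher alpha favors natural-language questions, while a lower alpha favors exact terms such as product codes; the right setting is something to tune against real queries.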

Advantages and Use Cases

Hybrid search excels at precise term searches (e.g., product codes, technical acronyms) and natural language questions.
Re-ranking models (e.g., Cohere) can further improve result ordering for answer generation.
Hybrid search lets AI agents ground answers in both contextually relevant and exact information from the knowledge base.

Key Terms & Definitions

Vector Search (Semantic Search) — Uses dense numerical representations to match the meaning of queries and data.
Keyword Search (Full Text Search) — Uses sparse vectors to find exact or partial textual matches in data.
Hybrid Search — Combines vector and keyword searches to leverage their respective strengths.
Dense Vector — Numerical array capturing the semantic meaning of text.
Sparse Vector — Array representing the presence of specific keywords or terms.
Reciprocal Rank Fusion — Method for merging and ranking results from different search systems.
RAG (Retrieval-Augmented Generation) — AI system architecture that retrieves relevant data for answer generation.

Action Items / Next Steps

Review Supabase and Pinecone documentation for hybrid search setup.
Set up the required database schemas and API credentials.
Test the hybrid search implementation with both broad and precise queries.
(Optional) Explore re-ranking models to further refine search result order.