Coconote
AI notes
AI voice & video notes
Try for free
👕
Building LLM Tool for T-Shirt Store
Aug 3, 2024
Lecture: Building an End-to-End LLM Project for a T-Shirt Store
Overview
The project aims to build a tool similar to ChatGPT that converts natural language questions into SQL queries to interact with a MySQL database.
The store data includes t-shirt inventory and discounts.
The project will use Google PaLM, LangChain, and other technologies.
The goal is to allow the store manager to query the database in plain English.
Project Requirements
Store Data
: Includes four brands (V Hussein, Levis, Nike, Adidas) stored in a MySQL database.
Tables
:
t-shirts
: Maintains inventory count, price per unit.
discounts
: Maintains discount percentage for each t-shirt.
Key People
:
Tony Sharma
: Store manager, uses software to query the database.
Lok
: Data analyst who writes SQL queries, often busy.
Peter P
: Data scientist tasked with building the query tool.
Technical Architecture
LLM (Google PaLM)
: Converts natural language questions into SQL queries.
LangChain Framework
: Utilizes Google PaLM within LangChain's
SQLDatabaseChain
class.
Few-Shot Learning
: Improves query accuracy by preparing a dataset of sample questions and corresponding SQL queries.
Vector Database
: Stores embeddings of the few-shot learning data.
Options include Pinecone, Milvus, ChromaDB (using ChromaDB for this project).
Streamlit
: Used to build the UI.
API Keys
: Set up through Google Maker Suite.
Implementation Steps
MySQL Database Setup
Use MySQL Workbench for database operations.
Create necessary tables (
t-shirts
and
discounts
) and populate sample data.
Jupyter Notebook Experimentation
Setup
: Import necessary libraries and set up Google PaLM API key.
Data Connection
: Connect Jupyter Notebook to the MySQL database.
Creating LLM Object
: Create and test Google PaLM LLM object.
Sample Queries
: Run sample queries to ensure the setup works.
Handling Errors
: Use few-shot learning to improve the model's performance on more complex queries.
Few-Shot Learning
Purpose
: Improve the LLM's ability to handle complex queries by providing sample question-query pairs.
Embeddings
: Use Hugging Face to generate embeddings for the sample data.
Vector Database
: Store embeddings in ChromaDB.
Example Selector
: Use semantic search to find the most similar examples from the vector database.
Custom SQL Prompt
: Use LangChain's pre-built SQL prompt to guide the LLM.
Testing
: Run various queries to test improvements.
Streamlit Interface
Setup
: Create a main Python file (
main.py
) to house the Streamlit code.
UI Elements
: Text input for questions, display answers.
Integration
: Transfer code from Jupyter Notebook to Python files, modularize the code.
Run Streamlit
: Test the user interface and functionality.
Conclusion
The tool allows the store manager to query the database using natural language, improving efficiency.
Encouraged to practice coding and explore the code provided in the video description.
Additional Resources
LangChain Basics Video
Vector Database Explanation Video
MySQL Tutorial Video
Embeddings Introduction Video
📄
Full transcript