Introduction to Large Language Models (LLMs)
Jun 28, 2024
Overview
Presenter’s experience with GPT and LLMs
Three main questions addressed:
What is an LLM?
How do LLMs work?
Business applications of LLMs
1. What is a Large Language Model (LLM)?
LLM: A type of foundation model specialized in text and text-like data (e.g., code)
Foundation Model: Pre-trained on large amounts of unlabeled data using self-supervised learning
Learns from patterns in the data
Produces generalizable and adaptable output
Training data: Books, articles, conversations, etc.
Training data can range from tens of gigabytes to petabytes in size
Example: 1 gigabyte ≈ 178 million words
1 petabyte = 1 million gigabytes
Parameters: The values the model adjusts during training in order to learn
More parameters = more complex model
Example: GPT-3 uses 175 billion parameters, pre-trained on 45 terabytes of data
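To make the idea of a parameter count concrete, here is a minimal sketch that counts the parameters of a tiny feed-forward network; the layer sizes are invented for illustration and have nothing to do with GPT-3's actual architecture.

```python
# Hypothetical sketch: each fully connected layer has (inputs x outputs)
# weight values plus one bias value per output, all of which are parameters.

def linear_layer_params(n_in, n_out):
    """Number of learnable parameters in one fully connected layer."""
    return n_in * n_out + n_out

# A toy 3-layer network (sizes chosen for illustration only)
layer_sizes = [512, 2048, 2048, 512]
total = sum(linear_layer_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:]))
print(total)  # about 6.3 million parameters, even for this tiny network
```

Scaling the same accounting up across dozens of much wider transformer layers is how models reach counts like 175 billion.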
2. How do LLMs Work?
Key Components:
Data: Enormous amounts of text data
Architecture: Neural network (specifically, a transformer for GPT)
Transformer architecture handles sequences of data
Understands context of each word in relation to others
Builds comprehensive understanding of sentence structure and meaning
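The "context of each word in relation to others" is computed by the transformer's attention mechanism. Below is a heavily simplified sketch of scaled dot-product self-attention with toy numbers; real models add learned query/key/value projections and many attention heads.

```python
import numpy as np

def self_attention(X):
    """Each row of X is a word vector; each output row is a context-aware mix."""
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)                    # similarity of every word pair
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax: attention weights
    return weights @ X                               # blend words by their weights

# Three "words" as 4-dimensional embedding vectors (values are arbitrary)
X = np.array([[1.0, 0.0, 0.0, 0.0],
              [0.0, 1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0, 0.0]])
out = self_attention(X)
print(out.shape)  # (3, 4): one context-mixed vector per input word
```

The key point is that every word's new representation depends on every other word in the sequence, which is how the model captures sentence-level context.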
Training:
Predicts the next word in a sentence
Starts with random guesses, updates internal parameters iteratively
Gradual improvement to generate coherent sentences
Fine-tuning: Refining the model for specific tasks using smaller datasets
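The next-word-prediction idea can be illustrated without any neural network at all. The toy model below simply counts which word follows which in a tiny corpus; a real LLM learns the same kind of statistics with a neural network trained by gradient descent.

```python
from collections import defaultdict, Counter

# Toy next-word predictor: a bigram model learned by counting word pairs.
corpus = "the cat sat on the mat the cat ate".split()

following = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    following[current][nxt] += 1   # record which word follows which

def predict_next(word):
    """Return the word most often seen after `word` in the training text."""
    return following[word].most_common(1)[0][0]

print(predict_next("the"))  # 'cat' follows 'the' twice, 'mat' only once
```

An LLM does this at vastly larger scale, with parameters standing in for these counts, which is why its early random guesses gradually improve into coherent sentences.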
3. Business Applications of LLMs
Customer Service: Intelligent chatbots to handle customer queries
Content Creation: Generating articles, emails, social media posts, and video scripts
Software Development: Generating and reviewing code
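As a sketch of the customer-service use case, the snippet below shows how a chatbot might assemble a prompt from order data and a customer question. The `call_llm` function is a stand-in assumption, not a real API; a deployment would replace it with a call to whatever LLM service is in use.

```python
# Hypothetical sketch of an LLM-backed customer-service chatbot.
# `call_llm` is a placeholder; its name and behavior are assumptions.

def call_llm(prompt):
    # A real deployment would send `prompt` to an LLM service here.
    return "Thanks for reaching out! Your order status is: shipped."

def answer_customer(question, order_info):
    prompt = (
        "You are a helpful customer-service assistant.\n"
        f"Order details: {order_info}\n"
        f"Customer question: {question}\n"
        "Answer politely and concisely:"
    )
    return call_llm(prompt)

reply = answer_customer("Where is my package?", {"order_id": 1234, "status": "shipped"})
print(reply)
```

The design point is that the business logic lives in the prompt construction; the LLM itself is swapped in behind a single function call.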
Future Prospects
Continued evolution of LLMs expected
More innovative applications anticipated
Conclusion
Encouragement to ask questions and subscribe for more content
Call to Action
Like and subscribe for more videos