
AI Foundations Overview

Jul 25, 2025

Overview

This lecture series introduces the foundations of artificial intelligence (AI), covering core algorithms, reasoning with knowledge and uncertainty, optimization, machine learning (including neural networks), and natural language processing.

Search and Problem Solving

  • AI can search for solutions to problems by modeling them as states, actions, and goals.
  • Key terminology: agent (decision-maker), state (situation), action (choice), transition model (state changes), goal test, path cost.
  • Search problems can be visualized as graphs with nodes (states) and edges (actions).
  • The frontier holds the states that have been discovered but not yet explored; each node records its state, parent, action, and path cost.
  • Depth-First Search (DFS) uses a stack (LIFO); Breadth-First Search (BFS) uses a queue (FIFO); see the sketch after this list.
  • DFS may not find an optimal solution; BFS guarantees a shortest path when all actions have equal cost.
  • Informed search (like Greedy Best-First and A*) uses problem-specific knowledge (heuristics).
  • A* expands the node with the lowest sum of path cost and heuristic estimate; its optimality guarantee requires an admissible (and, for graph search, consistent) heuristic.
  • Adversarial search (e.g., games) uses Minimax and alpha-beta pruning for optimal decisions.
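
To make the search terminology concrete, here is a minimal sketch of breadth-first search in Python. The `successors` and `goal_test` callables are placeholders for a problem-specific transition model and goal test, not names from the lecture; swapping the FIFO `deque` for a LIFO stack would turn this into depth-first search.

```python
from collections import deque

def bfs(start, goal_test, successors):
    """Breadth-first search: return the list of actions from start to a goal,
    or None if no goal state is reachable.
    `successors(state)` is assumed to yield (action, next_state) pairs."""
    frontier = deque([start])      # FIFO queue: states discovered but not yet explored
    parent = {start: None}         # maps each state to (previous state, action taken)
    while frontier:
        state = frontier.popleft()
        if goal_test(state):
            path = []              # reconstruct the action sequence back to the start
            while parent[state] is not None:
                prev, action = parent[state]
                path.append(action)
                state = prev
            return list(reversed(path))
        for action, nxt in successors(state):
            if nxt not in parent:  # skip states that are already discovered
                parent[nxt] = (state, action)
                frontier.append(nxt)
    return None
```

Because the queue is FIFO, states are expanded in order of increasing depth, which is why BFS finds a shortest path when every action has the same cost.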

Knowledge Representation and Inference

  • AI represents knowledge through propositional logic (using symbols, connectives: and, or, not, implies, biconditional).
  • Models assign truth values to every symbol; a knowledge base stores known sentences.
  • Inference is deriving new knowledge from known facts; entailment means one sentence necessarily follows from others.
  • Model checking evaluates all possible worlds (see the sketch after this list); proof procedures (like resolution) apply inference rules.
  • Constraint satisfaction problems (CSPs) involve assigning values to variables under specific constraints.
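
As a concrete illustration of entailment by model checking, the sketch below enumerates every assignment of truth values to the symbols and checks whether the query holds in all models of the knowledge base. The toy knowledge base and query (P implies Q, and P; query Q) are assumptions chosen for illustration, not an example from the lecture.

```python
from itertools import product

def model_check(kb, query, symbols):
    """Naive model checking: does the knowledge base entail the query?
    `kb` and `query` are assumed to be functions that take a model
    (a dict mapping symbol names to True/False) and return a truth value."""
    for values in product([True, False], repeat=len(symbols)):
        model = dict(zip(symbols, values))
        if kb(model) and not query(model):
            return False           # a model satisfies the KB but not the query
    return True                    # the query holds in every model of the KB

# Toy knowledge base: (P implies Q) and P.  Query: Q.  Entailment holds.
kb = lambda m: (not m["P"] or m["Q"]) and m["P"]
query = lambda m: m["Q"]
print(model_check(kb, query, ["P", "Q"]))   # True
```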

Reasoning Under Uncertainty

  • Probability theory models an agent's uncertainty about the world with random variables and probability distributions.
  • Conditional probability expresses likelihoods given evidence; joint and marginal probabilities combine and aggregate events (see the sketch after this list).
  • Bayesian networks represent dependencies as directed graphs; inference can be exact (enumeration) or approximate (sampling).
  • Markov and Hidden Markov Models handle processes over time with state transitions and noisy observations.
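
The sketch below works through conditional probability and inference by enumeration on a toy two-variable model (Rain and WetGrass). The probability numbers are illustrative assumptions, not figures from the lecture.

```python
# Toy model: prior P(Rain) and conditional P(WetGrass | Rain).
P_rain = {True: 0.2, False: 0.8}
P_wet_given_rain = {True: {True: 0.9, False: 0.1},
                    False: {True: 0.2, False: 0.8}}

def joint(rain, wet):
    """Joint probability P(Rain=rain, WetGrass=wet) via the chain rule."""
    return P_rain[rain] * P_wet_given_rain[rain][wet]

def p_rain_given_wet(wet=True):
    """Inference by enumeration: P(Rain=True | WetGrass=wet)."""
    numerator = joint(True, wet)
    evidence = sum(joint(r, wet) for r in (True, False))  # marginalize over Rain
    return numerator / evidence

print(p_rain_given_wet())   # about 0.53: rain becomes more likely given wet grass
```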

Optimization Problems

  • Local search algorithms (e.g., hill climbing, simulated annealing) improve a candidate solution by iteratively moving to better neighboring states; see the sketch after this list.
  • Linear programming solves optimization with linear constraints and objectives.
  • CSPs and optimization appear in scheduling, routing, and resource allocation problems.
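
Below is a minimal hill-climbing sketch. The `neighbors` and `score` callables and the toy objective being maximized are assumptions chosen for illustration; simulated annealing would differ by sometimes accepting worse neighbors to escape local maxima.

```python
import random

def hill_climb(initial, neighbors, score, max_steps=1000):
    """Steepest-ascent hill climbing: keep moving to the best neighbor
    until no neighbor improves the score (a local maximum)."""
    current = initial
    for _ in range(max_steps):
        candidates = neighbors(current)
        if not candidates:
            break
        best = max(candidates, key=score)
        if score(best) <= score(current):
            break                  # local maximum reached
        current = best
    return current

# Toy usage: maximize f(x) = -(x - 3)^2 over the integers, stepping by +/- 1.
f = lambda x: -(x - 3) ** 2
print(hill_climb(random.randint(-10, 10), lambda x: [x - 1, x + 1], f))   # 3
```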

Machine Learning Fundamentals

  • Supervised learning uses labeled data to learn functions mapping inputs to outputs (classification or regression).
  • Algorithms include nearest neighbor, linear models (perceptron), support vector machines, and decision trees.
  • Loss functions measure performance; regularization helps prevent overfitting.
  • Cross-validation repeatedly splits the data into training and test sets to estimate a model's accuracy on unseen data (see the sketch after this list).
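
As a small example of supervised learning with cross-validation, the sketch below uses scikit-learn (listed under next steps) to evaluate a k-nearest-neighbors classifier on the bundled iris dataset; the choice of dataset and of k=3 are assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)             # labeled data: features X, labels y
model = KNeighborsClassifier(n_neighbors=3)   # classify by the 3 nearest examples
scores = cross_val_score(model, X, y, cv=5)   # accuracy on each of 5 held-out folds
print(scores.mean())                          # estimated generalization accuracy
```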

Neural Networks and Deep Learning

  • Neural networks are layers of interconnected units ("neurons") whose connection weights are learned from data.
  • Activation functions such as sigmoid and ReLU introduce non-linearity; softmax turns outputs into a probability distribution.
  • Backpropagation computes gradients of the loss, and gradient descent uses them to adjust the weights; see the sketch after this list.
  • Deep networks with many hidden layers model complex functions (deep learning); dropout helps prevent overfitting.
  • Convolutional Neural Networks (CNNs) are specialized for images, using convolutions and pooling for feature extraction.
  • Recurrent Neural Networks (RNNs), such as LSTM, handle sequences for tasks like translation or text generation.
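
To show backpropagation and gradient descent mechanically, here is a minimal NumPy sketch of a two-layer network learning XOR. The layer sizes, learning rate, and iteration count are illustrative assumptions, and the gradients are derived by hand for a mean-squared-error loss.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # inputs
y = np.array([[0], [1], [1], [0]], dtype=float)               # XOR targets

W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)   # hidden-layer weights
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)   # output-layer weights
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 0.5                                         # learning rate

for _ in range(5000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)        # hidden activations
    out = sigmoid(h @ W2 + b2)      # predictions in (0, 1)
    # Backward pass (backpropagation) for mean-squared-error loss.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Gradient-descent weight updates.
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h);   b1 -= lr * d_h.sum(axis=0)

print(out.round(2).ravel())         # should approach [0, 1, 1, 0]
```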

Natural Language Processing (NLP)

  • NLP tasks include summarization, translation, question answering, and classification.
  • AI parses syntax (structure) and semantics (meaning); context-free grammars describe structure.
  • n-gram models and Markov chains capture statistical patterns in word sequences (see the sketch after this list).
  • Techniques like bag-of-words, word embeddings (Word2Vec), and neural sequence models provide numeric representations.
  • Transformers and attention mechanisms enable scalable, state-of-the-art language models by focusing on relevant context.
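
The sketch below trains a toy bigram (word-pair) Markov-chain language model and samples a short sequence from it. The tiny corpus and the `train_bigrams`/`generate` helpers are assumptions made for illustration.

```python
import random
from collections import defaultdict

def train_bigrams(text):
    """Count word bigrams: the core of a toy Markov-chain language model."""
    counts = defaultdict(list)
    words = text.split()
    for w1, w2 in zip(words, words[1:]):
        counts[w1].append(w2)       # repeated successors act as frequency weights
    return counts

def generate(counts, start, length=10):
    """Sample a short word sequence by following observed bigrams."""
    word, output = start, [start]
    for _ in range(length - 1):
        if word not in counts:
            break
        word = random.choice(counts[word])
        output.append(word)
    return " ".join(output)

corpus = "the cat sat on the mat and the cat ran after the dog"
model = train_bigrams(corpus)
print(generate(model, "the"))
```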

Key Terms & Definitions

  • Agent — Entity making decisions
  • State — Configuration of agent/environment
  • Frontier — States discovered but not yet explored during search
  • Heuristic — Estimate of proximity to goal
  • Knowledge Base — Set of known facts
  • Entailment — Logical consequence relation
  • Conditional Probability — Probability given evidence
  • Bayesian Network — Graphical model of variable dependencies
  • Loss Function — Quantifies error of a model
  • Regularization — Penalizes complexity to avoid overfitting
  • Neural Network — Model of connected units learning functions
  • Activation Function — Determines unit output (e.g., sigmoid, ReLU)
  • Backpropagation — Algorithm for neural network weight updates
  • Attention — Mechanism to focus on relevant input parts
  • Transformer — Deep learning architecture using attention for sequence tasks

Action Items / Next Steps

  • Review lecture concepts and definitions
  • Test search, optimization, and machine learning algorithms with sample problems
  • Explore AI programming libraries like NLTK, Scikit-learn, and TensorFlow
  • Experiment with AI models for language or image data
  • Continue studying advanced AI topics and current research trends