
AI Foundations Overview

Jul 25, 2025

Overview

This lecture series introduces the foundations of artificial intelligence (AI), covering core algorithms, reasoning with knowledge and uncertainty, optimization, machine learning (including neural networks), and natural language processing.

Search and Problem Solving

  • AI can search for solutions to problems by modeling them as states, actions, and goals.
  • Key terminology: agent (decision-maker), state (situation), action (choice), transition model (state changes), goal test, path cost.
  • Search problems can be visualized as graphs with nodes (states) and edges (actions).
  • The frontier holds the states that have been discovered but not yet explored; each node records its state, parent, action, and path cost.
  • Depth-First Search (DFS) uses a stack (LIFO); Breadth-First Search (BFS) uses a queue (FIFO); see the sketch after this list.
  • DFS may not find an optimal solution; BFS guarantees a shortest path when all actions have equal cost.
  • Informed search (like Greedy Best-First and A*) uses problem-specific knowledge (heuristics).
  • A* expands the node with the lowest sum of path cost and heuristic estimate; its optimality guarantee requires an admissible (and, for graph search, consistent) heuristic.
  • Adversarial search (e.g., games) uses Minimax and alpha-beta pruning for optimal decisions.
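
To make the search terminology concrete, here is a minimal sketch of breadth-first search in Python. The `successors` and `goal_test` callables are placeholders for a problem-specific transition model and goal test, not names from the lecture; swapping the FIFO `deque` for a LIFO stack would turn this into depth-first search.

```python
from collections import deque

def bfs(start, goal_test, successors):
    """Breadth-first search: return the list of actions from start to a goal,
    or None if no goal state is reachable.
    `successors(state)` is assumed to yield (action, next_state) pairs."""
    frontier = deque([start])      # FIFO queue: states discovered but not yet explored
    parent = {start: None}         # maps each state to (previous state, action taken)
    while frontier:
        state = frontier.popleft()
        if goal_test(state):
            path = []              # reconstruct the action sequence back to the start
            while parent[state] is not None:
                prev, action = parent[state]
                path.append(action)
                state = prev
            return list(reversed(path))
        for action, nxt in successors(state):
            if nxt not in parent:  # skip states that are already discovered
                parent[nxt] = (state, action)
                frontier.append(nxt)
    return None
```

Because the queue is FIFO, states are expanded in order of increasing depth, which is why BFS finds a shortest path when every action has the same cost.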

Knowledge Representation and Inference

  • AI represents knowledge through propositional logic (using symbols, connectives: and, or, not, implies, biconditional).
  • Models assign truth values to every symbol; a knowledge base stores known sentences.
  • Inference is deriving new knowledge from known facts; entailment means one sentence necessarily follows from others.
  • Model checking evaluates all possible worlds (see the sketch after this list); proof procedures (like resolution) apply inference rules.
  • Constraint satisfaction problems (CSPs) involve assigning values to variables under specific constraints.
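
As a concrete illustration of entailment by model checking, the sketch below enumerates every assignment of truth values to the symbols and checks whether the query holds in all models of the knowledge base. The toy knowledge base and query (P implies Q, and P; query Q) are assumptions chosen for illustration, not an example from the lecture.

```python
from itertools import product

def model_check(kb, query, symbols):
    """Naive model checking: does the knowledge base entail the query?
    `kb` and `query` are assumed to be functions that take a model
    (a dict mapping symbol names to True/False) and return a truth value."""
    for values in product([True, False], repeat=len(symbols)):
        model = dict(zip(symbols, values))
        if kb(model) and not query(model):
            return False           # a model satisfies the KB but not the query
    return True                    # the query holds in every model of the KB

# Toy knowledge base: (P implies Q) and P.  Query: Q.  Entailment holds.
kb = lambda m: (not m["P"] or m["Q"]) and m["P"]
query = lambda m: m["Q"]
print(model_check(kb, query, ["P", "Q"]))   # True
```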

Reasoning Under Uncertainty

  • Probability theory models an agent's uncertainty about the world with random variables and probability distributions.
  • Conditional probability expresses likelihoods given evidence; joint and marginal probabilities combine and aggregate events (see the sketch after this list).
  • Bayesian networks represent dependencies as directed graphs; inference can be exact (enumeration) or approximate (sampling).
  • Markov and Hidden Markov Models handle processes over time with state transitions and noisy observations.
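
The sketch below works through conditional probability and inference by enumeration on a toy two-variable model (Rain and WetGrass). The probability numbers are illustrative assumptions, not figures from the lecture.

```python
# Toy model: prior P(Rain) and conditional P(WetGrass | Rain).
P_rain = {True: 0.2, False: 0.8}
P_wet_given_rain = {True: {True: 0.9, False: 0.1},
                    False: {True: 0.2, False: 0.8}}

def joint(rain, wet):
    """Joint probability P(Rain=rain, WetGrass=wet) via the chain rule."""
    return P_rain[rain] * P_wet_given_rain[rain][wet]

def p_rain_given_wet(wet=True):
    """Inference by enumeration: P(Rain=True | WetGrass=wet)."""
    numerator = joint(True, wet)
    evidence = sum(joint(r, wet) for r in (True, False))  # marginalize over Rain
    return numerator / evidence

print(p_rain_given_wet())   # about 0.53: rain becomes more likely given wet grass
```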

Optimization Problems

  • Local search algorithms (e.g., hill climbing, simulated annealing) improve a candidate solution by iteratively moving to better neighboring states; see the sketch after this list.
  • Linear programming solves optimization with linear constraints and objectives.
  • CSPs and optimization appear in scheduling, routing, and resource allocation problems.
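
Below is a minimal hill-climbing sketch. The `neighbors` and `score` callables and the toy objective being maximized are assumptions chosen for illustration; simulated annealing would differ by sometimes accepting worse neighbors to escape local maxima.

```python
import random

def hill_climb(initial, neighbors, score, max_steps=1000):
    """Steepest-ascent hill climbing: keep moving to the best neighbor
    until no neighbor improves the score (a local maximum)."""
    current = initial
    for _ in range(max_steps):
        candidates = neighbors(current)
        if not candidates:
            break
        best = max(candidates, key=score)
        if score(best) <= score(current):
            break                  # local maximum reached
        current = best
    return current

# Toy usage: maximize f(x) = -(x - 3)^2 over the integers, stepping by +/- 1.
f = lambda x: -(x - 3) ** 2
print(hill_climb(random.randint(-10, 10), lambda x: [x - 1, x + 1], f))   # 3
```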

Machine Learning Fundamentals

  • Supervised learning uses labeled data to learn functions mapping inputs to outputs (classification or regression).
  • Algorithms include nearest neighbor, linear models (perceptron), support vector machines, and decision trees.
  • Loss functions measure performance; regularization helps prevent overfitting.
  • Cross-validation repeatedly splits the data into training and test sets to estimate a model's accuracy on unseen data (see the sketch after this list).
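
As a small example of supervised learning with cross-validation, the sketch below uses scikit-learn (listed under next steps) to evaluate a k-nearest-neighbors classifier on the bundled iris dataset; the choice of dataset and of k=3 are assumptions made for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)             # labeled data: features X, labels y
model = KNeighborsClassifier(n_neighbors=3)   # classify by the 3 nearest examples
scores = cross_val_score(model, X, y, cv=5)   # accuracy on each of 5 held-out folds
print(scores.mean())                          # estimated generalization accuracy
```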

Neural Networks and Deep Learning

  • Neural networks are layers of interconnected units ("neurons") whose connection weights are learned from data.
  • Activation functions such as sigmoid and ReLU introduce non-linearity; softmax turns outputs into a probability distribution.
  • Backpropagation computes gradients of the loss, and gradient descent uses them to adjust the weights; see the sketch after this list.
  • Deep networks with many hidden layers model complex functions (deep learning); dropout helps prevent overfitting.
  • Convolutional Neural Networks (CNNs) are specialized for images, using convolutions and pooling for feature extraction.
  • Recurrent Neural Networks (RNNs), such as LSTM, handle sequences for tasks like translation or text generation.
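
To show backpropagation and gradient descent mechanically, here is a minimal NumPy sketch of a two-layer network learning XOR. The layer sizes, learning rate, and iteration count are illustrative assumptions, and the gradients are derived by hand for a mean-squared-error loss.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # inputs
y = np.array([[0], [1], [1], [0]], dtype=float)               # XOR targets

W1 = rng.normal(size=(2, 8)); b1 = np.zeros(8)   # hidden-layer weights
W2 = rng.normal(size=(8, 1)); b2 = np.zeros(1)   # output-layer weights
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 0.5                                         # learning rate

for _ in range(5000):
    # Forward pass.
    h = sigmoid(X @ W1 + b1)        # hidden activations
    out = sigmoid(h @ W2 + b2)      # predictions in (0, 1)
    # Backward pass (backpropagation) for mean-squared-error loss.
    d_out = (out - y) * out * (1 - out)
    d_h = (d_out @ W2.T) * h * (1 - h)
    # Gradient-descent weight updates.
    W2 -= lr * (h.T @ d_out); b2 -= lr * d_out.sum(axis=0)
    W1 -= lr * (X.T @ d_h);   b1 -= lr * d_h.sum(axis=0)

print(out.round(2).ravel())         # should approach [0, 1, 1, 0]
```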

Natural Language Processing (NLP)

  • NLP tasks include summarization, translation, question answering, and classification.
  • AI parses syntax (structure) and semantics (meaning); context-free grammars describe structure.
  • n-gram models and Markov chains capture statistical patterns in word sequences (see the sketch after this list).
  • Techniques like bag-of-words, word embeddings (Word2Vec), and neural sequence models provide numeric representations.
  • Transformers and attention mechanisms enable scalable, state-of-the-art language models by focusing on relevant context.
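
The sketch below trains a toy bigram (word-pair) Markov-chain language model and samples a short sequence from it. The tiny corpus and the `train_bigrams`/`generate` helpers are assumptions made for illustration.

```python
import random
from collections import defaultdict

def train_bigrams(text):
    """Count word bigrams: the core of a toy Markov-chain language model."""
    counts = defaultdict(list)
    words = text.split()
    for w1, w2 in zip(words, words[1:]):
        counts[w1].append(w2)       # repeated successors act as frequency weights
    return counts

def generate(counts, start, length=10):
    """Sample a short word sequence by following observed bigrams."""
    word, output = start, [start]
    for _ in range(length - 1):
        if word not in counts:
            break
        word = random.choice(counts[word])
        output.append(word)
    return " ".join(output)

corpus = "the cat sat on the mat and the cat ran after the dog"
model = train_bigrams(corpus)
print(generate(model, "the"))
```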

Key Terms & Definitions

  • Agent — Entity making decisions
  • State — Configuration of agent/environment
  • Frontier — States discovered but not yet explored during search
  • Heuristic — Estimate of proximity to goal
  • Knowledge Base — Set of known facts
  • Entailment — Logical consequence relation
  • Conditional Probability — Probability given evidence
  • Bayesian Network — Graphical model of variable dependencies
  • Loss Function — Quantifies error of a model
  • Regularization — Penalizes complexity to avoid overfitting
  • Neural Network — Model of connected units learning functions
  • Activation Function — Determines unit output (e.g., sigmoid, ReLU)
  • Backpropagation — Algorithm for neural network weight updates
  • Attention — Mechanism to focus on relevant input parts
  • Transformer — Deep learning architecture using attention for sequence tasks

Action Items / Next Steps

  • Review lecture concepts and definitions
  • Test search, optimization, and machine learning algorithms with sample problems
  • Explore AI programming libraries like NLTK, Scikit-learn, and TensorFlow
  • Experiment with AI models for language or image data
  • Continue studying advanced AI topics and current research trends