Understanding Lexical Semantics and WSD

Oct 6, 2024

Lecture 3: Lexical Semantics and Word Sense Disambiguation

Overview

  • Focuses on lexical semantics and relationships between word forms.
  • Discusses WordNet and its hierarchical structure.
  • Introduces the concept of Word Sense Disambiguation (WSD).

Key Concepts

Lexical Semantics

  • Study of word meanings and relationships.
  • Words can have multiple meanings (senses).

Word Sense Disambiguation (WSD)

  • Problem of determining which sense of an ambiguous word is used in context.
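  • Example: The word "bass" can refer to a fish ("he caught a bass") or to a low musical range or instrument ("she plays bass"); the surrounding words indicate which.
  • Important for natural language processing (NLP) tasks such as machine translation and information retrieval, where picking the wrong sense produces the wrong output.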

Challenges in WSD

  • Ambiguity arises whenever a word has multiple senses, and many common words do.
  • Disambiguation methods depend on context; with little or no surrounding text there is too little evidence to choose a sense.

Approaches to WSD

Knowledge-Based Approaches

  1. Overlap-Based Methods

    • Use machine-readable dictionaries (like WordNet) to measure the word overlap between each sense's definition and the context of the target word.
    • Construct sense bags (sense packs) and context bags for comparison.
  2. Example of Sense Disambiguation Process

    • Given a sentence, identify all possible senses of a word.
    • Construct sense bags for each sense and a context bag from surrounding words.
    • Measure the overlap between the context bag and each sense bag, and choose the sense with the highest overlap.
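
The selection rule in the last step can be written compactly. The notation is introduced here for illustration and is not from the lecture: S(w) is the set of senses of the target word w, B(s) is the sense bag of sense s, and C is the context bag.

```latex
\hat{s} = \arg\max_{s \in S(w)} \left| B(s) \cap C \right|
```

Here |B(s) ∩ C| counts the words shared by the two bags; ties would need an extra rule, such as falling back to the most frequent sense.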

Algorithms

  • Lesk Algorithm: Chooses the sense whose dictionary definition shares the most words with the context.
    • Steps:
      1. Create sense bags for each sense based on definitions.
      2. Create a context bag from the sentence.
      3. Calculate overlaps and select the sense with the highest overlap.
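
The three steps above can be sketched in a few lines of Python. This is a minimal illustration of a simplified Lesk variant, not a reference implementation: the glosses in SENSES, the stopword list, and the function names are all hypothetical, and a real system would read definitions from a dictionary such as WordNet.

```python
import re

# Small illustrative stopword list (an assumption, not a standard list).
STOPWORDS = {"a", "an", "the", "of", "in", "on", "and", "or", "to",
             "with", "is", "for", "by", "he", "she", "at"}

# Hypothetical toy glosses for two senses of "bass".
SENSES = {
    "bass_fish": "a freshwater or sea fish caught for food and sport",
    "bass_music": "the lowest part in music, or an instrument that plays low notes",
}

def bag(text):
    """Lowercase, tokenize, and drop stopwords to form a bag (set) of words."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}

def simplified_lesk(sentence, senses, target):
    """Return the sense with the largest overlap, plus all overlap counts."""
    # Step 2: context bag from the sentence, excluding the target word itself.
    context_bag = bag(sentence) - {target}
    # Steps 1 and 3: build each sense bag and count the shared words.
    overlaps = {name: len(bag(gloss) & context_bag)
                for name, gloss in senses.items()}
    best = max(overlaps, key=overlaps.get)
    return best, overlaps

best, overlaps = simplified_lesk(
    "He caught a huge bass while fishing in the sea", SENSES, "bass")
print(best, overlaps)
```

Note that "fishing" does not match "fish" here because the sketch does no stemming; exact word matching is one reason overlap counts tend to be small in practice.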

Graph-Based Methods

  • Use PageRank-like algorithms to disambiguate several words in a sentence jointly rather than one at a time.
  • Create a graph where nodes represent candidate senses and edges are weighted by the similarity between senses of different words.
  • Run PageRank on the graph and, for each ambiguous word, pick its highest-ranked sense.
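
A small sketch may make the graph construction concrete. The code below runs a plain power-iteration PageRank over a weighted, undirected sense graph. The sense names, the similarity weights in EDGES, and the helper names are hypothetical values chosen for illustration; in practice the weights would come from a similarity measure over a resource such as WordNet.

```python
from collections import defaultdict

# Candidate senses per word (hypothetical labels).
SENSES = {
    "bass": ["bass_fish", "bass_music"],
    "scale": ["scale_fish", "scale_music"],
    "guitar": ["guitar"],
}

# Assumed similarity weights between senses of *different* words.
EDGES = {
    ("bass_music", "guitar"): 0.9, ("bass_fish", "guitar"): 0.1,
    ("scale_music", "guitar"): 0.8, ("scale_fish", "guitar"): 0.1,
    ("bass_music", "scale_music"): 0.7, ("bass_music", "scale_fish"): 0.1,
    ("bass_fish", "scale_music"): 0.1, ("bass_fish", "scale_fish"): 0.6,
}

def pagerank(edges, damping=0.85, iterations=50):
    """Power-iteration PageRank on an undirected graph with positive weights."""
    neighbors = defaultdict(dict)
    for (a, b), w in edges.items():
        neighbors[a][b] = w
        neighbors[b][a] = w
    nodes = list(neighbors)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        new_rank = {}
        for n in nodes:
            # Each neighbor m passes on its rank in proportion to edge weight.
            incoming = sum(rank[m] * w / sum(neighbors[m].values())
                           for m, w in neighbors[n].items())
            new_rank[n] = (1 - damping) / len(nodes) + damping * incoming
        rank = new_rank
    return rank

rank = pagerank(EDGES)
# For each word, keep the sense with the highest PageRank score.
chosen = {word: max(senses, key=rank.get) for word, senses in SENSES.items()}
print(chosen)
```

Because the music senses reinforce each other through "guitar", they accumulate more rank than the fish senses, which is the sense in which the words are disambiguated together.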

Machine Learning Approaches

  1. Naive Bayes Classifier: A probabilistic model that classifies senses based on features of the context.

    • Features might include grammatical and semantic characteristics of surrounding words.
    • Chooses the sense s that maximizes P(s) × Π P(f_i | s), treating the features f_i as conditionally independent given the sense.
  2. Decision List Classifier: An ordered list of rules, each associating a collocation (a context feature) with a specific sense.

    • Ranks the rules by log likelihood ratio; the highest-ranked rule that matches the surrounding words decides the sense.
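
Both classifiers can be sketched on the same toy data. Everything below is an illustrative assumption rather than the lecture's implementation: the training examples, the bag-of-words features, the add-one smoothing in Naive Bayes, and the smoothing constant alpha in the decision list. The decision list here handles exactly two senses.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled contexts for the word "bass": (context words, sense).
TRAIN = [
    (["caught", "river", "fishing"], "fish"),
    (["fishing", "lake", "boat"], "fish"),
    (["grilled", "river", "dinner"], "fish"),
    (["guitar", "band", "play"], "music"),
    (["play", "jazz", "band"], "music"),
    (["guitar", "sound", "low"], "music"),
]

def train_counts(data):
    """Count senses and (sense, word) co-occurrences."""
    sense_counts = Counter()
    feature_counts = defaultdict(Counter)  # sense -> word -> count
    for words, sense in data:
        sense_counts[sense] += 1
        feature_counts[sense].update(words)
    vocab = {w for words, _ in data for w in words}
    return sense_counts, feature_counts, vocab

def naive_bayes(context, sense_counts, feature_counts, vocab):
    """Pick the sense maximizing log P(s) + sum of log P(w | s), add-one smoothed."""
    total = sum(sense_counts.values())
    scores = {}
    for sense, n in sense_counts.items():
        denom = sum(feature_counts[sense].values()) + len(vocab)
        score = math.log(n / total)
        for w in context:
            if w in vocab:  # words never seen in training are ignored
                score += math.log((feature_counts[sense][w] + 1) / denom)
        scores[sense] = score
    return max(scores, key=scores.get)

def build_decision_list(sense_counts, feature_counts, vocab, alpha=0.1):
    """Rank word -> sense rules by |log(P(s1 | w) / P(s2 | w))| for two senses."""
    s1, s2 = sorted(sense_counts)
    rules = []
    for w in vocab:
        # The ratio of smoothed counts equals the ratio P(s1 | w) / P(s2 | w).
        llr = math.log((feature_counts[s1][w] + alpha) /
                       (feature_counts[s2][w] + alpha))
        rules.append((abs(llr), w, s1 if llr > 0 else s2))
    return sorted(rules, reverse=True)

def decision_list_classify(context, rules, default):
    """Apply the first (strongest) rule whose word occurs in the context."""
    for _, w, sense in rules:
        if w in context:
            return sense
    return default

counts = train_counts(TRAIN)
rules = build_decision_list(*counts)
print(naive_bayes(["fishing", "boat", "river"], *counts))
print(decision_list_classify(["jazz", "guitar"], rules, default="fish"))
```

The design difference is visible in the two classify functions: Naive Bayes sums evidence from every context word, while the decision list commits to the single strongest matching rule and ignores the rest.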

Summary

  • WSD is an important problem in NLP, approached here with knowledge-based methods (Lesk-style overlap, graph-based PageRank) and machine learning methods (Naive Bayes, decision lists).
  • All of these methods rely on the context around an ambiguous word to choose its sense.
  • Upcoming lectures will explore additional approaches and techniques in WSD.