Improving Language Models through Interventions

Jul 10, 2024

Introduction

  • Language Models (LMs):

    • Used in various fields such as medicine, finance, science, and entertainment.
    • Can behave unpredictably and generate incorrect or harmful content.
    • User needs change due to new regulations, limited resources, outdated knowledge, or copyright issues.
    • Need for quick solutions to prevent LMs from becoming inaccurate, outdated, or biased.
  • Interventions:

    • Efficient updates for LMs after initial training.
    • Aim to address issues such as outdated information and changing user requirements.
    • Examples: model compression, knowledge editing, and unlearning.
    • Typically developed independently, posing challenges for effective combination.
  • Proposed Solution:

    • Interventions should be composable, meaning they can work together harmoniously.
    • Metrics introduced:
      • Order-free error: Assesses whether one intervention undermines the success of the others, irrespective of order.
      • Order sensitivity: Assesses whether the order in which interventions are applied changes the outcome.
  • Experiments:

    • Conducted using the LLaMA 3-8B model.
    • Results showed that model compression can hinder other interventions.
    • Emphasis on developing interventions with composability in mind.

Composable Interventions for Language Models

  • Intervention Composition:

    • Involves applying multiple interventions in a specific order to a model.
    • For simplicity, each intervention is assumed to target a single criterion (e.g., compress, edit, or unlearn).
  • Measuring Effectiveness:

    • Consider how one intervention affects the application of subsequent interventions.
    • Metrics introduced (formalized in the sketch after this list):
      • Order-free error: How well composed interventions succeed, regardless of application order.
      • Order sensitivity: How much the outcome changes when the application order is swapped.
  • Evaluation Method:

    • Apply one intervention and measure its impact, then apply a second and measure again.
    • Reverse the order of the two interventions and compare the results.
    • Helps identify direct interactions between interventions.
  • Experimental Setup:

    • Study of the impact of multiple sequential interventions on model performance.
    • Importance of order in applying interventions for effective results.
    • Detailed evaluation crucial for developing practical and composable interventions.
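
To make these metrics and the evaluation loop concrete, here is a minimal Python sketch. The helper names (composed_error, evaluate_error) and the exact formulas (mean of the two orders for order-free error, absolute gap between orders for order sensitivity) are illustrative assumptions, not the paper's precise definitions.

```python
import copy

def composed_error(model, first, second, evaluate_error):
    """Apply two interventions in the given order, then score the result.

    `first` and `second` are callables that take a model and return an
    intervened model (e.g., prune, quantize, edit, unlearn);
    `evaluate_error` maps a model to a scalar error on the relevant task.
    """
    intervened = second(first(copy.deepcopy(model)))
    return evaluate_error(intervened)

def composability_metrics(model, a, b, evaluate_error):
    """Score a pair of interventions under both application orders."""
    err_ab = composed_error(model, a, b, evaluate_error)  # a first, then b
    err_ba = composed_error(model, b, a, evaluate_error)  # b first, then a
    order_free_error = (err_ab + err_ba) / 2  # assumption: mean over both orders
    order_sensitivity = abs(err_ab - err_ba)  # assumption: gap between orders
    return order_free_error, order_sensitivity
```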

Implementation Details

  • Model & Metrics:

    • Experiments conducted with the LLaMA 3-8B model.
    • Focus on how interventions work individually and combined.
  • Findings:

    • Model compression impacts the success of knowledge editing and overall performance.
    • Order of applying interventions affects the outcome.
    • Emphasis on creating tailored methods for successful composition.
  • Utility Evaluations:

    • Overall utility evaluations may not accurately capture composability.
    • Detailed evaluations that treat composability as a design factor are needed; a sketch of such a multi-metric evaluation follows below.
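
As an illustration of what such a detailed evaluation could record, the sketch below scores both application orders on several metrics at once. The metric names and the harness itself are hypothetical placeholders, not the evaluation code used in the experiments.

```python
import copy

def composability_report(model, a, b, metrics):
    """Evaluate both application orders on several metrics at once.

    `metrics` maps metric names (e.g., "overall_utility", "edit_success",
    "unlearning_success") to callables that score a model. Breaking results
    out per metric and per order surfaces interactions that a single
    aggregate utility number would hide.
    """
    report = {}
    for order, (first, second) in {"a_then_b": (a, b), "b_then_a": (b, a)}.items():
        intervened = second(first(copy.deepcopy(model)))
        report[order] = {name: score(intervened) for name, score in metrics.items()}
    return report
```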

Composing Model Compression with Machine Unlearning

  • Interventions:

    • Composed three model compression methods with three machine unlearning methods.
    • Evaluated unlearning success on the WMDP (Weapons of Mass Destruction Proxy) benchmark.
  • Results:

    • Compressing a model can make subsequent unlearning more difficult.
    • Pruning before unlearning reduces unlearning performance, especially at higher sparsity (a minimal pruning sketch appears at the end of this section).
    • The optimal order varies across methods and intervention pairs.
    • Compression tends to hinder other interventions and makes targeted updates harder.
  • Metrics & Techniques:

      • GD and RMU outperform Gau in order-free error on WMDP.
    • RMU shows lower overall order sensitivity, while GD has higher order sensitivity for pruning compared to quantization.
  • Knowledge Editing and Unlearning:

    • High composability between some unlearning methods and knowledge editing.
    • RMU stands out due to low order-free error and order sensitivity.
    • Precise edits do not disrupt unlearning targets.
  • Summary:

    • Compression alters how the model stores knowledge, making targeted updates difficult.
    • RMU emerges as the most composable unlearning technique.
    • Need thorough evaluations using multiple metrics and datasets to assess intervention composability effectively.
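
For illustration, here is a minimal unstructured magnitude-pruning sketch in PyTorch. The compression methods actually composed in these experiments are more sophisticated, so treat this only as a stand-in showing how pruning zeroes low-magnitude weights, the kind of change that can disturb where knowledge is stored and make later unlearning or editing harder.

```python
import torch

@torch.no_grad()
def magnitude_prune_(model, sparsity=0.5):
    """Zero out the smallest-magnitude weights of every linear layer, in place.

    Higher `sparsity` removes more weights; per the results above, applying
    this before unlearning would be expected to hurt unlearning success,
    and more so at higher sparsity.
    """
    for layer in model.modules():
        if isinstance(layer, torch.nn.Linear):
            weights = layer.weight
            k = int(weights.numel() * sparsity)
            if k == 0:
                continue
            # The k-th smallest absolute value serves as the pruning threshold.
            threshold = weights.abs().flatten().kthvalue(k).values
            weights.mul_((weights.abs() > threshold).to(weights.dtype))
```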