📊

DGIM Algorithm Overview and Key Elements

Oct 22, 2024

Notes on Motwani Algorithm (DGIM)

Overview

  • The Motwani algorithm is used to count the number of ones in a continuous data stream.
  • Also known as DGIM (Datar-Gehrke-Indyk-Motwani).
  • Uses logarithmic space to represent a window of n bits, requiring less space than the total number of bits (n).
  • Probability of correct result is always greater than 50%.

Key Elements

1. Timestamp

  • Each element in the stream gets a timestamp based on its position.
    • Example:
      • 1st bit = timestamp 1
      • 2nd bit = timestamp 2
      • 3rd bit = timestamp 3

2. Buckets

  • Buckets represent time intervals in the data stream, dynamically created.
  • Each bucket contains bits (zeros and ones).
  • Size of each bucket is in the power of 2 (1, 2, 4, 8, etc.).

Rules for Bucket Formation

  1. Contain at least one '1':

    • A valid bucket must have at least one '1'.
    • Example: Bucket with four ones is valid; empty bucket is invalid.
  2. Rightmost bit must be '1':

    • The last bit in the bucket must be '1' for the bucket to be valid.
    • Example: Rightmost '1' = valid; rightmost '0' = invalid.
  3. Length of the bucket:

    • Length equals the number of '1's it contains.
    • Example: Bucket with four '1's has a length of 4.
  4. Bucket length in powers of 2:

    • Valid lengths are powers of 2 (1, 2, 4, 8, etc.).
    • Example: Length 4 is valid (2^2); length 3 is invalid.
  5. Increasing size from right to left:

    • Bucket sizes must not decrease as you move from right to left.
    • Example: Lengths of buckets should increase (1, 2, 4, ...).
  6. No more than two buckets of the same size:

    • Only two buckets can have the same length; more than two is invalid.
    • Example: Two length-2 buckets are valid; three length-2 buckets are invalid.

Example of Bucket Formation

  • Given a stream of bits, start forming buckets from the right side.
  • Example process:
    1. Size 1 bucket: Valid (1 '1').
    2. Size 2 bucket: Valid (2 bits, rightmost '1').
    3. Size 4 bucket: Valid (4 '1's).
    4. Merge buckets as needed, following rules.

Handling New Bits

  • New bits (especially '0's) do not require bucket adjustments.
  • If a new '1' bit enters:
    • If there are already two buckets of the same length, merge smaller buckets to accommodate the new bit.

Counting Ones in a Stream

  • To find the number of ones in the most recent 18 bits:
    • Sum lengths of relevant buckets within that range.
    • Example: If buckets of lengths 1, 2, 2, and 4 exist under recent bits, total = 1 + 2 + 2 + 4 = 9 ones.

Conclusion

  • The DGIM algorithm efficiently counts the number of ones in a stream using a logarithmic space approach.
  • Important to follow bucket formation rules.
  • For questions or clarifications, engage in comments.
  • Remember to like, share, and subscribe for more content!