Lecture 15 on 03/18/2026 - Epsilon-Delta Approximate Median; Morris+ and ++; Variance of Morris Counter
(ε,δ)-Approximate Median
Why Approximate Median?
In an ideal scenario, we would compute the exact median: the element with rank exactly $\lceil n/2 \rceil$ in a stream of $n$ elements. However, this is often infeasible in streaming contexts where we cannot afford to store or revisit all elements. Instead, we settle for an approximate median: an element whose rank is close enough to the true median rank.
By “close enough,” we mean within an additive error band: we allow the rank to deviate from $\frac{n}{2}$ by up to $\varepsilon n$ in either direction. This gives us a practical solution that uses limited space and time while still providing a meaningful approximation.
Problem Definition
Input: A stream $x_1, x_2, \dots, x_n$ and parameters $\varepsilon, \delta \in (0, 1)$.
Goal: Return an element $y$ such that $\mathrm{rank}(y) \in \left[\frac{n}{2} - \varepsilon n,\ \frac{n}{2} + \varepsilon n\right]$ w.p. $\geq 1 - \delta$.
Acceptable Range for Median Rank
The algorithm succeeds if the returned median’s rank falls within the acceptable window. Consider the sorted stream of $n$ elements. The acceptable range for the rank of the median spans from $\frac{n}{2} - \varepsilon n$ (the left boundary) to $\frac{n}{2} + \varepsilon n$ (the right boundary).
The size of this acceptable range is $2\varepsilon n$.
The remaining $n - 2\varepsilon n$ elements fall in the two bad regions combined. If the returned element’s rank falls outside the acceptable range, either below $\frac{n}{2} - \varepsilon n$ or above $\frac{n}{2} + \varepsilon n$, the algorithm fails.
Algorithm
The algorithm is surprisingly simple and relies on the power of sampling:
- Uniformly sample $t$ elements independently (with replacement) from the stream
- Compute the median of these samples
- Return this sampled median as our estimate
The intuition is that when we sample uniformly, elements in the “bad” regions (far from the true median) will also be sampled proportionally to their frequency. If we sample enough elements, it becomes unlikely that more than half of our samples fall into these bad regions, which would be necessary for the median of our samples to also be bad.
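The three steps above can be sketched in Python. This is a minimal offline sketch (the function name `approx_median` is illustrative); in a true one-pass setting, the $t$ uniform samples would be maintained with reservoir sampling rather than by indexing into a stored stream.

```python
import math
import random

def approx_median(stream, eps, delta):
    """Sampling-based (eps, delta)-approximate median (offline sketch)."""
    # Sample size from the Chernoff analysis: t = O((1/eps^2) log(1/delta)).
    t = math.ceil(3 / (2 * eps**2) * math.log(2 / delta))
    # Uniform samples with replacement; a streaming version would draw
    # these in one pass via reservoir sampling.
    samples = sorted(random.choice(stream) for _ in range(t))
    # Return the median of the samples as the estimate.
    return samples[len(samples) // 2]
```

Note that $t$ depends only on $\varepsilon$ and $\delta$, not on the stream length $n$, which is the whole point: the space used is independent of $n$.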
Theorem
Let $y$ be the output of the Algorithm. Then,
$\mathrm{rank}(y) \in \left[\frac{n}{2} - \varepsilon n,\ \frac{n}{2} + \varepsilon n\right]$
with probability at least $1 - \delta$.
Understanding When the Algorithm Fails
To understand why the algorithm works, let’s first identify when it would fail.
Partition the sorted stream into three regions:
- Left bad region ($B_L$): the smallest $\frac{n}{2} - \varepsilon n$ elements
- Good region (middle): the middle $2\varepsilon n$ elements; this is our acceptable range
- Right bad region ($B_R$): the largest $\frac{n}{2} - \varepsilon n$ elements
Let $S$ be the multiset of $t$ samples. If the median of our samples falls in $B_L$, then by definition of median, at least half of the samples must also be in $B_L$. The same argument applies symmetrically for $B_R$.
To formalize this, let $X_L = |S \cap B_L|$ and $X_R = |S \cap B_R|$ be the numbers of samples that fall in the left and right bad regions respectively. Then the algorithm fails if and only if $X_L \geq \frac{t}{2}$ or $X_R \geq \frac{t}{2}$.
Bounding the Failure Probability
Now we’ll use concentration inequalities to bound how likely it is that too many samples fall into bad regions.
Step 1: Probability of sampling a bad element
When we uniformly sample an element from the stream, the probability it lands in $B_L$ is simply:
$\Pr[\text{sample} \in B_L] = \frac{|B_L|}{n} = \frac{n/2 - \varepsilon n}{n} = \frac{1}{2} - \varepsilon$
Since we sample independently, each of the $t$ samples is a Bernoulli trial with success probability $p = \frac{1}{2} - \varepsilon$ (where “success” means landing in $B_L$).
Step 2: Expected number of bad samples
By linearity of expectation:
$\mathbb{E}[X_L] = t\left(\frac{1}{2} - \varepsilon\right)$
This tells us that we expect about a $\left(\frac{1}{2} - \varepsilon\right)$ fraction of our samples to fall in $B_L$. Since $\frac{1}{2} - \varepsilon < \frac{1}{2}$, we expect fewer than half of the samples to be bad, which is good! But we need to bound how much the actual count can deviate from this expectation.
Applying Chernoff Bound
Step 3: Setting up the concentration problem
We want to bound the probability that $X_L$ exceeds a critical threshold. The critical threshold is when at least half our samples are bad: $X_L \geq \frac{t}{2}$.
To apply Chernoff’s bound, we need to express this event as a multiplicative deviation from the expectation. Let’s rewrite:
$\frac{t}{2} = (1 + \gamma)\,\mathbb{E}[X_L] = (1 + \gamma)\, t\left(\frac{1}{2} - \varepsilon\right)$
where $\gamma$ is a deviation parameter. Solving for $\gamma$:
Dividing both sides by $t\left(\frac{1}{2} - \varepsilon\right)$:
$1 + \gamma = \frac{1/2}{1/2 - \varepsilon} \quad\Longrightarrow\quad \gamma = \frac{\varepsilon}{1/2 - \varepsilon} \geq 2\varepsilon$
Step 4: Applying Chernoff’s bound
Chernoff’s bound tells us that for a sum $X$ of independent Bernoulli random variables with expectation $\mu = \mathbb{E}[X]$, the probability of exceeding $(1 + \gamma)\mu$ decays exponentially (for $\gamma \leq 1$):
$\Pr[X \geq (1 + \gamma)\mu] \leq e^{-\gamma^2 \mu / 3}$
Plugging in our values ($\mu = t\left(\frac{1}{2} - \varepsilon\right)$ and $\gamma = \frac{\varepsilon}{1/2 - \varepsilon}$), and using $\gamma^2 \mu = \frac{2\varepsilon^2 t}{1 - 2\varepsilon} \geq 2\varepsilon^2 t$:
$\Pr\left[X_L \geq \frac{t}{2}\right] \leq e^{-2\varepsilon^2 t / 3}$
Similarly:
$\Pr\left[X_R \geq \frac{t}{2}\right] \leq e^{-2\varepsilon^2 t / 3}$
By union bound:
$\Pr[\text{fail}] \leq \Pr\left[X_L \geq \frac{t}{2}\right] + \Pr\left[X_R \geq \frac{t}{2}\right] \leq 2e^{-2\varepsilon^2 t / 3}$
Setting the Sample Size
We want to bound the failure probability by $\delta$:
$2e^{-2\varepsilon^2 t / 3} \leq \delta$
Solving for $t$:
$t \geq \frac{3}{2\varepsilon^2} \ln\frac{2}{\delta}$
Conclusion: If $t \geq \frac{3}{2\varepsilon^2} \ln\frac{2}{\delta}$, then with probability at most $\delta$ the algorithm fails.
Hence, if $t = \Theta\left(\frac{1}{\varepsilon^2} \log\frac{1}{\delta}\right)$ (which is $\frac{3}{2\varepsilon^2} \ln\frac{2}{\delta}$ up to constant factors), then with probability at least $1 - \delta$ the algorithm succeeds.
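As a concrete check of this bound (the function name `sample_size` is illustrative), even a fairly demanding setting like $\varepsilon = 0.05$ and $\delta = 0.01$ needs only a few thousand samples, regardless of the stream length $n$:

```python
import math

def sample_size(eps, delta):
    # t >= (3 / (2 eps^2)) * ln(2 / delta), from the Chernoff analysis above.
    return math.ceil(3 / (2 * eps**2) * math.log(2 / delta))

print(sample_size(0.05, 0.01))  # a few thousand samples suffice for any n
```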
Recall: Morris Counter
Problem Setup
Input: A stream of $n$ items.
Goal: Count the number of elements $n$ using $O(\log \log n)$ bits.
Algorithm
Let $X$ denote the current counter value, initialized to $X = 0$.
Update rule: upon each arrival, $X \leftarrow X + 1$ with probability $2^{-X}$.
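A direct implementation of this update rule (a sketch; the class name `MorrisCounter` is illustrative, and a genuinely space-bounded implementation would store only the $O(\log \log n)$-bit value $X$):

```python
import random

class MorrisCounter:
    """Morris's probabilistic counter: stores only X, roughly log2 of the count."""

    def __init__(self):
        self.x = 0  # counter state X, starts at 0

    def update(self):
        # Increment X with probability 2^{-X}.
        if random.random() < 2.0 ** -self.x:
            self.x += 1

    def estimate(self):
        # E[2^{X_n}] = n + 1, so 2^X - 1 is an unbiased estimate of n.
        return 2 ** self.x - 1
```

Note that the very first update always fires (probability $2^{0} = 1$), and each subsequent increment becomes exponentially less likely as $X$ grows.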
Guarantee
$\mathbb{E}\left[2^{X_n}\right] = n + 1$, so the estimator $\hat{n} = 2^{X_n} - 1$ is unbiased: $\mathbb{E}[\hat{n}] = n$.
Computing Variance
To compute variance, use the formula:
$\mathrm{Var}\left(2^{X_n}\right) = \mathbb{E}\left[2^{2X_n}\right] - \left(\mathbb{E}\left[2^{X_n}\right]\right)^2$
We need $\mathbb{E}\left[2^{2X_n}\right]$, which can be found in a similar fashion to $\mathbb{E}\left[2^{X_n}\right]$.
Main Issue with Morris Counter
Morris Counter provides an elegant solution using only $O(\log \log n)$ bits, with the following attributes:
- Mean: $\mathbb{E}\left[2^{X_n} - 1\right] = n$ (correct on average)
- Variance: $\mathrm{Var}\left(2^{X_n}\right) = \frac{n^2 - n}{2}$ (highly variable, a significant drawback)
The large variance means that while the expected output is correct, the actual output can deviate wildly from $n$. The standard deviation is on the order of $n$ itself, so the output could be as far as $\Omega(n)$ away from the true count, rendering the estimate unreliable.
Boosting via Averaging: Morris+
The natural solution is to run multiple independent Morris Counters and average their outputs. Exercise 4.9(b) tells us that to get an $(\varepsilon, \delta)$-estimate (within $\varepsilon n$ of $n$ with probability $\geq 1 - \delta$), we need:
$s = \frac{c}{\varepsilon^2 \delta}$
where $c$ is a constant. So we need $O\left(\frac{1}{\varepsilon^2 \delta}\right)$ parallel machines.
By averaging the outputs of $s$ Morris Counters, we get an estimate within $\varepsilon n$ of $n$ with probability at least $1 - \delta$.
Boosting via Median: Morris++
But we still have a problem: the number of machines scales linearly with $\frac{1}{\delta}$, which is too expensive for small $\delta$. We can boost confidence more cheaply: run Morris+ with a constant failure probability (say $\frac{1}{3}$, using only $s = \Theta\left(\frac{1}{\varepsilon^2}\right)$ counters), then run $m$ copies of this Morris+ in parallel and take their median.
By Exercise 4.9(c), taking the median of $m = \Theta\left(\log\frac{1}{\delta}\right)$ independent weak estimates (each with failure probability $\frac{1}{3}$) yields a strong estimate with failure probability at most $\delta$.
Final result: Morris++ uses $O\left(\frac{1}{\varepsilon^2} \log\frac{1}{\delta}\right)$ parallel machines and achieves an $(\varepsilon, \delta)$-estimate of the count.
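The two boosting steps can be combined into one self-contained sketch (function names are illustrative; `morris` simulates one basic counter per copy):

```python
import random
import statistics

def morris(n, rng):
    """One Morris counter over n updates; returns the estimate 2^X - 1."""
    x = 0
    for _ in range(n):
        if rng.random() < 2.0 ** -x:
            x += 1
    return 2 ** x - 1

def morris_plus(n, s, rng):
    """Morris+: average s independent counters, shrinking variance by a factor s."""
    return sum(morris(n, rng) for _ in range(s)) / s

def morris_plus_plus(n, s, m, seed=0):
    """Morris++: median of m independent Morris+ runs; the failure
    probability drops exponentially in m (the median trick)."""
    rng = random.Random(seed)
    return statistics.median(morris_plus(n, s, rng) for _ in range(m))
```

A simulation like `morris_plus_plus(1000, 100, 9)` runs $9 \times 100$ counters and typically lands close to the true count of 1000.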
Variance of Morris Counter
Let $X_n$ denote the counter’s state after $n$ items have passed. We want $\mathbb{E}\left[2^{2X_n}\right]$.
Expanding based on the transition probabilities (from state $j$, the counter moves to $j + 1$ with probability $2^{-j}$ and stays at $j$ otherwise):
$\mathbb{E}\left[2^{2X_n}\right] = \sum_j \Pr[X_{n-1} = j] \left( 2^{-j} \cdot 2^{2(j+1)} + \left(1 - 2^{-j}\right) \cdot 2^{2j} \right)$
Simplifying (noting $2^{-j} \cdot 2^{2(j+1)} = 4 \cdot 2^{j}$ and $2^{-j} \cdot 2^{2j} = 2^{j}$):
$= \sum_j \Pr[X_{n-1} = j] \left( 2^{2j} + 3 \cdot 2^{j} \right)$
Breaking this into two separate expectations:
$= \mathbb{E}\left[2^{2X_{n-1}}\right] + 3\,\mathbb{E}\left[2^{X_{n-1}}\right]$
Note: From the Morris Counter algorithm, $\mathbb{E}\left[2^{X_{n-1}}\right] = n$, so $3\,\mathbb{E}\left[2^{X_{n-1}}\right] = 3n$. Therefore:
$\mathbb{E}\left[2^{2X_n}\right] = \mathbb{E}\left[2^{2X_{n-1}}\right] + 3n$
Solving by telescoping (with base case $\mathbb{E}\left[2^{2X_0}\right] = 1$):
$\mathbb{E}\left[2^{2X_n}\right] = 1 + 3\sum_{i=1}^{n} i = \frac{3}{2} n^2 + \frac{3}{2} n + 1$
Therefore, the variance is:
$\mathrm{Var}\left(2^{X_n}\right) = \mathbb{E}\left[2^{2X_n}\right] - \left(\mathbb{E}\left[2^{X_n}\right]\right)^2 = \frac{3}{2} n^2 + \frac{3}{2} n + 1 - (n + 1)^2 = \frac{n^2 - n}{2} < \frac{n^2}{2}$
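Both moments can be checked with a quick Monte Carlo simulation (a sketch with illustrative names): for $n = 64$, the theory predicts $\mathbb{E}\left[2^{X_n}\right] = 65$ and $\mathrm{Var}\left(2^{X_n}\right) = \frac{64^2 - 64}{2} = 2016$, and the empirical mean and variance over many trials should land close to those values.

```python
import random

def morris_power(n, rng):
    """Run one Morris counter for n updates and return 2^{X_n}."""
    x = 0
    for _ in range(n):
        if rng.random() < 2.0 ** -x:
            x += 1
    return 2 ** x

rng = random.Random(0)
n, trials = 64, 20000
vals = [morris_power(n, rng) for _ in range(trials)]
mean = sum(vals) / trials
var = sum((v - mean) ** 2 for v in vals) / trials
# Theory: E[2^{X_n}] = n + 1 = 65 and Var = (n^2 - n)/2 = 2016.
```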