
Lecture 15 on 03/18/2026 - Epsilon-Delta Approximate Median; Morris+ and ++; Variance of Morris Counter

In an ideal scenario, we would compute the exact median: the element with rank exactly $\frac{N}{2}$. However, this is often infeasible in streaming contexts, where we cannot afford to store or revisit all elements. Instead, we settle for an approximate median: an element whose rank is close enough to the true median rank.

By “close enough,” we mean within an additive error band: we allow the rank to deviate by up to $\varepsilon N$ in either direction. This gives us a practical solution that uses limited space and time while still providing a meaningful approximation.

$$\text{rank}(\text{median}) = \frac{N}{2}$$

Input: A stream $S = \{x_1, x_2, \ldots, x_N\}$ and $\varepsilon, \delta \in (0,1)$

Goal: Return $m$ such that $\left(\frac{1}{2} - \varepsilon\right)N \leq \text{rank}(m) \leq \left(\frac{1}{2} + \varepsilon\right)N$ with probability $\geq 1-\delta$

The algorithm succeeds if the returned median's rank falls within the acceptable window. Consider the sorted stream of $N$ elements. The acceptable range for the rank of the median spans from $(1/2 - \varepsilon)N$ (the left boundary) to $(1/2 + \varepsilon)N$ (the right boundary).

The size of this acceptable range is:

$$\left(\frac{1}{2} + \varepsilon\right)N - \left(\frac{1}{2} - \varepsilon\right)N = 2\varepsilon N$$

The remaining $(1 - 2\varepsilon)N$ elements fall in the two bad regions combined. If the returned element's rank falls outside the acceptable range, either below $(1/2 - \varepsilon)N$ or above $(1/2 + \varepsilon)N$, the algorithm fails.

The algorithm is surprisingly simple and relies on the power of sampling:

  • Uniformly sample $t = O(\varepsilon^{-2} \log \delta^{-1})$ elements independently from the stream
  • Compute the median of these $t$ samples
  • Return this sampled median as our estimate
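The steps above can be sketched in code. A minimal sketch, with names and constants of my own choosing: since $N$ is unknown in advance, each of the $t$ samples is maintained by an independent size-1 reservoir sampler, giving $t$ independent uniform samples (i.e., sampling with replacement, as the analysis assumes).

```python
import math
import random
import statistics

def approx_median(stream, eps, delta, seed=None):
    """Sampling-based approximate median (sketch of the lecture's algorithm).

    Runs t = O(eps^-2 log(1/delta)) independent size-1 reservoir samplers;
    each one holds a uniform element of the stream when it ends, and the
    samplers are mutually independent. Per-item cost is O(t), a tradeoff
    made here for simplicity.
    """
    rng = random.Random(seed)
    t = math.ceil(3 / eps**2 * math.log(2 / delta))  # constant from the analysis
    reservoir = [None] * t
    n = 0
    for x in stream:
        n += 1
        for i in range(t):
            # Each sampler independently replaces its element with
            # probability 1/n, the standard size-1 reservoir update.
            if rng.random() < 1.0 / n:
                reservoir[i] = x
    return statistics.median_low(reservoir)
```

On the stream $1, \ldots, 2000$ with $\varepsilon = 0.1$, the returned value should lie between ranks $800$ and $1200$ with probability well above $1 - \delta$.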

The intuition is that a uniform sample lands in the “bad” regions in proportion to their share of the stream. If we sample enough elements, it becomes unlikely that more than half of our samples fall into one bad region, which is what it would take for the median of our samples to also be bad.

Let $m$ be the output of the algorithm. Then,

$$\left(\frac{1}{2} - \varepsilon\right)N \leq \text{rank}(m) \leq \left(\frac{1}{2} + \varepsilon\right)N$$

with probability at least $1-\delta$.

To understand why the algorithm works, let’s first identify when it would fail.

Partition the sorted stream into three regions:

  • Left bad region ($S_L$): the smallest $\left(\frac{1}{2} - \varepsilon\right)N$ elements
  • Good region (middle): the middle $2\varepsilon N$ elements; this is our acceptable range
  • Right bad region ($S_R$): the largest $\left(\frac{1}{2} - \varepsilon\right)N$ elements

Let $T$ be the set of $t$ samples. If the median of our samples falls in $S_L$, then by the definition of the median, at least half of the $t$ samples must also be in $S_L$. The same argument applies symmetrically for $S_R$.

To formalize this, let $T_L = T \cap S_L$ and $T_R = T \cap S_R$ be the samples that fall in the left and right bad regions, respectively. Then the algorithm fails if and only if $|T_L| > \frac{t}{2}$ or $|T_R| > \frac{t}{2}$.

Now we’ll use concentration inequalities to bound how likely it is that too many samples fall into bad regions.

Step 1: Probability of sampling a bad element

When we uniformly sample an element from the stream, the probability that it lands in $S_L$ is simply:

$$\Pr(\text{sampled element} \in S_L) = \frac{|S_L|}{N} = \frac{(1/2 - \varepsilon)N}{N} = \frac{1}{2} - \varepsilon$$

Since we sample independently, each of the $t$ samples is a Bernoulli trial with success probability $p = \frac{1}{2} - \varepsilon$ (where “success” means landing in $S_L$).

Step 2: Expected number of bad samples

By linearity of expectation:

$$E[|T_L|] = t \cdot \left(\frac{1}{2} - \varepsilon\right)$$

This tells us that we expect about a $\left(\frac{1}{2} - \varepsilon\right)$ fraction of our samples to fall in $S_L$. Since $\frac{1}{2} - \varepsilon < \frac{1}{2}$, we expect fewer than half of the samples to be bad, which is good! But we need to bound how much the actual count can deviate from this expectation.

Step 3: Setting up the concentration problem

We want to bound the probability that $|T_L|$ exceeds a critical threshold. The critical threshold is when more than half our samples are bad: $|T_L| > \frac{t}{2}$.

To apply Chernoff’s bound, we need to express this event as a multiplicative deviation from the expectation. Let’s rewrite:

$$P\left(|T_L| > \frac{t}{2}\right) = P\left(|T_L| > (1 + \delta') E[|T_L|]\right)$$

where $\delta'$ is a deviation parameter (not to be confused with the failure probability $\delta$). Solving for $\delta'$:

$$\frac{t}{2} = (1 + \delta')\left(\frac{1}{2} - \varepsilon\right)t$$

Dividing both sides by $\left(\frac{1}{2} - \varepsilon\right)t$:

$$\delta' = \frac{t/2}{(1/2 - \varepsilon)t} - 1 = \frac{1/2}{1/2 - \varepsilon} - 1 = \frac{\varepsilon}{1/2 - \varepsilon}$$

Step 4: Applying Chernoff’s bound

Chernoff's bound tells us that for a sum of independent Bernoulli random variables with expectation $\mu$, the probability of exceeding $(1+\delta')\mu$ decays exponentially (this form of the bound requires $0 < \delta' \leq 1$, which holds here whenever $\varepsilon \leq 1/4$):

$$P(X > (1+\delta')\mu) \leq e^{-\frac{\delta'^2 \mu}{3}}$$

Plugging in our values ($\mu = E[|T_L|] = (1/2 - \varepsilon)t$ and $\delta' = \frac{\varepsilon}{1/2 - \varepsilon}$):

$$P\left(|T_L| > \frac{t}{2}\right) \leq e^{-\frac{1}{3} \cdot \left(\frac{\varepsilon}{1/2 - \varepsilon}\right)^2 \cdot (1/2 - \varepsilon)t} = e^{-\frac{\varepsilon^2}{3(1/2 - \varepsilon)} \cdot t}$$

Similarly:

$$P\left(|T_R| > \frac{t}{2}\right) \leq e^{-\frac{1}{3} \cdot \frac{\varepsilon^2}{1/2 - \varepsilon} \cdot t}$$

By union bound:

$$P\left(|T_L| > \frac{t}{2} \text{ or } |T_R| > \frac{t}{2}\right) \leq P\left(|T_L| > \frac{t}{2}\right) + P\left(|T_R| > \frac{t}{2}\right) \leq 2e^{-\frac{1}{3} \cdot \frac{\varepsilon^2}{1/2 - \varepsilon} \cdot t}$$

We want to bound $2e^{-\frac{1}{3} \cdot \frac{\varepsilon^2}{1/2 - \varepsilon} \cdot t}$ by $\delta$:

$$2e^{-\frac{1}{3} \cdot \frac{\varepsilon^2}{1/2 - \varepsilon} \cdot t} < \delta \iff -\frac{1}{3} \cdot \frac{\varepsilon^2}{1/2 - \varepsilon} \cdot t < \ln\frac{\delta}{2}$$

Solving for tt:

$$t > 3 \cdot \frac{1/2 - \varepsilon}{\varepsilon^2} \ln\frac{2}{\delta}$$

Conclusion: Since $\frac{1}{2} - \varepsilon < 1$, it suffices to take $t > \frac{3}{\varepsilon^2} \ln\frac{2}{\delta}$, in which case the algorithm fails with probability at most $\delta$.

Hence, if $t = O(\varepsilon^{-2} \log \delta^{-1})$ with a suitable constant (matching $t > \frac{3}{\varepsilon^2} \ln\frac{2}{\delta}$ up to constant factors), then with probability at least $1-\delta$ the algorithm succeeds.
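To get a feel for the numbers (the values below are illustrative, not from the lecture), the bound can be evaluated directly; note that the required sample count depends only on $\varepsilon$ and $\delta$, not on the stream length $N$:

```python
import math

def sample_bound(eps, delta):
    """Samples sufficient for an (eps, delta) approximate median,
    per the bound t > (3/eps^2) * ln(2/delta) derived above."""
    return math.ceil(3 / eps**2 * math.log(2 / delta))

print(sample_bound(0.05, 0.01))  # 6358 samples, regardless of N
```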

Input: A stream $x_1, x_2, \ldots, x_N$

Goal: Count the number of elements using $\log \log N$ bits.

Let $C_i$ denote the current counter value.

Update rule: $C_i \leftarrow C_i + 1$ with probability $\frac{1}{2^{C_i}}$. Output: $\text{Out} = 2^{C} - 1$, where $C$ is the final counter value.

  • $E[\text{Out}] = N$
  • $\text{Var}(\text{Out}) = \frac{N(N-1)}{2}$
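The update rule above fits in a few lines. A minimal sketch (class and method names are my own):

```python
import random

class MorrisCounter:
    """Sketch of the Morris counter described above.

    Stores only the counter C (about log log N bits in principle);
    the estimate Out = 2^C - 1 satisfies E[Out] = N and
    Var(Out) = N(N-1)/2.
    """
    def __init__(self, rng=None):
        self.c = 0
        self.rng = rng or random.Random()

    def update(self):
        # Increment with probability 1 / 2^C.
        if self.rng.random() < 2.0 ** -self.c:
            self.c += 1

    def estimate(self):
        return 2 ** self.c - 1
```

Averaging many independent copies recovers $N$ in expectation, which is exactly the idea behind Morris+ below.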

To compute the variance, use the formula $\text{Var}(x) = E[x^2] - E[x]^2$.

We need $E\left[(2^{C_{i+1}})^2\right]$, which can be found in a similar fashion to $E[2^{C_{i+1}}]$.

Morris Counter provides an elegant solution using only $O(\log \log N)$ bits, with the following attributes:

  • Mean: $E[\text{Out}] = N$ (correct on average)
  • Variance: $\text{Var}(\text{Out}) = O(N^2)$ (highly variable, a significant drawback)

The large variance means that while the expected output is correct, the actual output can deviate wildly from $N$: with constant probability, the estimate can be off by an amount on the order of $N$ itself, rendering it unreliable.

The natural solution is to run multiple independent Morris Counters and average their outputs. Exercise 4.9(b) tells us that to get an $(\varepsilon, 3/4)$-estimate (within $\varepsilon N$ with probability $3/4$), we need:

$$\text{Number of parallel instances} = O\left(\frac{r^2}{\varepsilon^2}\right)$$

where $r = \frac{\sqrt{\text{Var}(\text{Out})}}{E[\text{Out}]} = \frac{\sqrt{N(N-1)/2}}{N} \approx \frac{1}{\sqrt{2}}$ is a constant. So we need $O(1/\varepsilon^2)$ parallel machines.

By averaging the outputs of $O(1/\varepsilon^2)$ Morris Counters, we get an estimate within $\varepsilon N$ with probability at least $3/4$.

But we still have a problem: the failure probability is $1/4$, which may be too high. We can boost confidence further by running $O(\log(1/\delta))$ copies of Morris+ in parallel and taking the median of their outputs.

By Exercise 4.9(c), taking the median of $O(\log(1/\delta))$ independent weak estimates (each with failure probability $1/4$) yields a strong estimate with failure probability at most $\delta$.

Final result: Morris++ uses $O\left(\log(1/\delta) \cdot \frac{1}{\varepsilon^2}\right)$ parallel machines and achieves an $(\varepsilon, \delta)$-estimate of the count.
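Putting the pieces together, here is a sketch of Morris++; the constants choosing the number of copies are illustrative (not the ones from Exercise 4.9), and the function names are my own.

```python
import math
import random
import statistics

def morris_step(c, rng):
    """One Morris update: increment c with probability 1/2^c."""
    return c + 1 if rng.random() < 2.0 ** -c else c

def morris_pp(stream_len, eps, delta, rng=None):
    """Morris++ sketch: median of O(log(1/delta)) Morris+ estimates,
    each Morris+ averaging O(1/eps^2) basic Morris counters."""
    rng = rng or random.Random()
    k = math.ceil(18 * math.log(2 / delta))  # copies of Morris+ (illustrative)
    m = math.ceil(3 / eps**2)                # counters per Morris+ (illustrative)
    counters = [[0] * m for _ in range(k)]
    for _ in range(stream_len):              # simulate the stream of updates
        for group in counters:
            for i in range(m):
                group[i] = morris_step(group[i], rng)
    # Average within each Morris+ group, then take the median across groups.
    averages = [sum(2 ** c - 1 for c in group) / m for group in counters]
    return statistics.median_low(averages)
```

The averaging drives the variance down by a factor of $m$; the median step then makes the failure probability decay exponentially in $k$.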

Let $X_n$ denote the counter's state after $n$ items have passed.

$$E[(2^{X_n})^2] = E[2^{2X_n}] = \sum_{j=0}^{\infty} P(X_{n-1} = j)\, E[2^{2X_n} \mid X_{n-1} = j]$$

Expanding based on the transition probabilities:

$$= \sum_j P(X_{n-1} = j) \left(\frac{1}{2^j} \cdot 2^{2(j+1)} + \left(1 - \frac{1}{2^j}\right) \cdot 2^{2j}\right)$$

Simplifying, using $\frac{1}{2^j} \cdot 2^{2(j+1)} = 2^{j+2}$ and $2^{j+2} - 2^j = 3 \cdot 2^j$:

$$= \sum_j P(X_{n-1} = j) \left(2^{2j} + 3 \cdot 2^j\right)$$

Breaking this into two separate expectations:

$$= E[2^{2X_{n-1}}] + 3\, E[2^{X_{n-1}}]$$

Note: From the Morris Counter analysis, $E[2^{X_n}] = n + 1$, so $E[2^{X_{n-1}}] = n$. Therefore:

$$E[2^{2X_n}] = E[2^{2X_{n-1}}] + 3n$$

Solving by telescoping (with base case $E[2^{2X_0}] = 2^0 = 1$):

$$\begin{aligned} E[2^{2X_n}] &= E[2^{2X_0}] + 3(1 + 2 + \cdots + n) \\ &= 1 + 3 \cdot \frac{n(n+1)}{2} \\ &= \frac{3}{2}n^2 + \frac{3}{2}n + 1 \end{aligned}$$

Therefore, the variance is:

$$\text{Var}[2^{X_n}] = \left(\frac{3}{2}n^2 + \frac{3}{2}n + 1\right) - (n+1)^2 = \frac{n(n-1)}{2} \leq n^2$$
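Both moments can be sanity-checked by simulation. A small Monte Carlo sketch (the trial count and tolerances are arbitrary choices of mine) that estimates $E[2^{X_n}]$ and $\text{Var}[2^{X_n}]$ for $n = 50$, where the derivation predicts $n + 1 = 51$ and $\frac{n(n-1)}{2} = 1225$:

```python
import random

def run_counter(n, rng):
    """Run a basic Morris counter over n updates; return 2^X_n."""
    c = 0
    for _ in range(n):
        if rng.random() < 2.0 ** -c:
            c += 1
    return 2 ** c

# Monte Carlo check of E[2^{X_n}] = n + 1 and Var[2^{X_n}] = n(n-1)/2.
rng = random.Random(1)
n, trials = 50, 100_000
vals = [run_counter(n, rng) for _ in range(trials)]
mean = sum(vals) / trials
var = sum((v - mean) ** 2 for v in vals) / trials
print(mean, var)  # expected to be close to 51 and 1225
```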