Skip to content

Final Exam Formulas & Identities Sheet

Streaming Algorithms & Approximate Counting

Section titled “Streaming Algorithms & Approximate Counting”

Update rule: CiCi+1C_i \leftarrow C_i + 1 with probability 12Ci\frac{1}{2^{C_i}}

Estimate: n^=2C1\hat{n} = 2^C - 1

Variance:

Var(2C1)N22\text{Var}(2^C - 1) \approx \frac{N^2}{2}

Run T=1/ε2T = \lceil 1/\varepsilon^2 \rceil independent Morris counters, return average:

Output=1Ti=1T(2Ci1)\text{Output} = \frac{1}{T} \sum_{i=1}^T (2^{C_i} - 1)

Success probability: 3/4\geq 3/4 (output within εN\varepsilon N of truth)

Run log(1/δ)\lceil \log(1/\delta) \rceil independent Morris+ instances, return median:

Output=median{M1,M2,,Mlog(1/δ)}\text{Output} = \text{median}\{M_1, M_2, \ldots, M_{\lceil \log(1/\delta) \rceil}\}

Success probability: 1δ\geq 1 - \delta (arbitrary confidence)

Total counters needed: log(1/δ)×1/ε2\lceil \log(1/\delta) \rceil \times \lceil 1/\varepsilon^2 \rceil


Find element with rank between (1/2ε)n(1/2 - \varepsilon)n and (1/2+ε)n(1/2 + \varepsilon)n with confidence 1δ\geq 1-\delta.

Run T=1/ε2log(1/δ)T = \lceil 1/\varepsilon^2 \cdot \log(1/\delta) \rceil independent uniform samples in parallel, return median of samples.

By Chernoff bounds, probability that more than half the samples fall in either “bad” tail region is δ\leq \delta.