CSCI 328 Midterm Fundamentals Problem Sets
Problem 1: Expectation Basics
(a) If we flip a fair coin 100 times, what is the expected number of heads?
(b) Using linearity of expectation, explain why the expectation of a binomial random variable $X \sim \text{Bin}(n, p)$ is $E[X] = np$.
(c) If the expected number of heads is 50, what is the expected number of tails?
Solution
(a) For $n = 100$ flips with $p = \frac{1}{2}$, we have $E[X] = np = 100 \cdot \frac{1}{2} = 50$ heads.
(b) We can write $X = X_1 + X_2 + \cdots + X_n$, where each $X_i \sim \text{Bernoulli}(p)$. By linearity of expectation (which holds even without independence): $E[X] = \sum_{i=1}^{n} E[X_i] = np$.
(c) The number of tails is $100 - X$, so $E[100 - X] = 100 - E[X] = 50$ tails.
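A quick simulation (an illustrative sketch, not part of the original solution; the function name and trial count are hypothetical choices) confirms that the average number of heads over many 100-flip experiments lands near 50:

```python
import random

def expected_heads_empirical(n_flips=100, trials=20_000, seed=0):
    """Estimate E[# heads] by averaging over many simulated experiments."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        # Each experiment: n_flips fair coin flips, count the heads.
        total += sum(rng.randint(0, 1) for _ in range(n_flips))
    return total / trials

# Linearity of expectation predicts E[X] = n * p = 100 * 0.5 = 50.
```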
Problem 2: Bernoulli Random Variables and Variance
Consider flipping a fair coin once. Let $X = 1$ if we get heads, and $X = 0$ if we get tails.
(a) What is $E[X]$ and $\text{Var}(X)$?
(b) Now flip the coin 100 times. Let $Y$ be the total number of heads. Express $Y$ as a sum of independent Bernoulli random variables and use linearity to find $E[Y]$ and $\text{Var}(Y)$.
(c) For $n$ coin flips with heads probability $p$, compute $E[Y]$ and $\text{Var}(Y)$.
Solution
(a) $P(X = 1) = P(X = 0) = \frac{1}{2}$, so: $E[X] = \frac{1}{2}$ and $\text{Var}(X) = E[X^2] - E[X]^2 = \frac{1}{2} - \frac{1}{4} = \frac{1}{4}$.
(b) $Y = X_1 + X_2 + \cdots + X_{100}$, where each $X_i \sim \text{Bernoulli}(\frac{1}{2})$. By linearity: $E[Y] = 100 \cdot \frac{1}{2} = 50$; by independence, $\text{Var}(Y) = 100 \cdot \frac{1}{4} = 25$.
(c) For general $n$ and $p$: $E[Y] = np$ and $\text{Var}(Y) = np(1 - p)$.
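The mean and variance formulas can be checked empirically; this sketch (function name and trial count are illustrative choices) estimates both for a sum of Bernoulli trials:

```python
import random

def bernoulli_sum_stats(n=100, p=0.5, trials=20_000, seed=1):
    """Empirical mean and variance of Y = X_1 + ... + X_n, X_i ~ Bernoulli(p)."""
    rng = random.Random(seed)
    samples = [sum(1 for _ in range(n) if rng.random() < p)
               for _ in range(trials)]
    mean = sum(samples) / trials
    var = sum((s - mean) ** 2 for s in samples) / trials
    return mean, var

# Theory: E[Y] = n*p = 50 and Var(Y) = n*p*(1-p) = 25 for n=100, p=1/2.
```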
Problem 3: Hashing with Chaining - Indicator Variables
In hashing with chaining, we query for an element $q$. Let $X_i$ be an indicator variable: $X_i = 1$ if element $x_i$ collides with $q$ (i.e., $h(x_i) = h(q)$), and $X_i = 0$ otherwise. Assume the hash function $h$ is perfectly random.
(a) If we have $n$ elements and $m$ buckets, what is $P(X_i = 1)$ for any element $x_i$?
(b) The query time is $T = 1 + \sum_{i=1}^{n} X_i$. Using linearity of expectation, what is $E[T]$?
(c) If $m = n$, what should $\text{Var}\left(\sum_{i=1}^{n} X_i\right)$ be approximately?
Solution
(a) Since the hash function is perfectly random, each element hashes to each of the $m$ buckets with equal probability: $P(X_i = 1) = P(h(x_i) = h(q)) = \frac{1}{m}$.
(b) Each $X_i$ is a Bernoulli random variable with $p = \frac{1}{m}$. By linearity of expectation: $E[T] = 1 + \sum_{i=1}^{n} E[X_i] = 1 + \frac{n}{m}$.
(c) For a Bernoulli variable with $p = \frac{1}{m}$: $\text{Var}(X_i) = \frac{1}{m}\left(1 - \frac{1}{m}\right)$.
Assuming independence: $\text{Var}\left(\sum_{i=1}^{n} X_i\right) = \frac{n}{m}\left(1 - \frac{1}{m}\right)$.
If $m = n$, then $\text{Var}\left(\sum_{i=1}^{n} X_i\right) \approx 1$ (for large $n$).
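A small simulation of the chaining model (hypothetical names; buckets chosen uniformly, matching the perfectly random hash assumption) shows the expected query time $1 + n/m$ is about 2 when the table has as many buckets as elements:

```python
import random

def expected_query_time(n=1000, m=1000, trials=2000, seed=2):
    """Average query time 1 + (# stored elements colliding with q)
    under uniformly random hashing of n elements into m buckets."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        bucket_of_q = rng.randrange(m)
        # Count stored elements that land in q's bucket.
        collisions = sum(1 for _ in range(n) if rng.randrange(m) == bucket_of_q)
        total += 1 + collisions
    return total / trials

# Theory: E[T] = 1 + n/m = 2 when m = n.
```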
Problem 4: Hashing with Chaining - Expected Query Time
For hashing with chaining, assume $n$ keys are stored in $m$ buckets using a perfectly random hash function.
(a) What is the expected length of a chain?
(b) If $m = n$ (load factor $\alpha = 1$), what is $E[T]$, where $T$ is the query time?
(c) Using Markov’s inequality, bound $P(T \geq 10)$ when $E[T] = 2$.
Solution
(a) Each of the $n$ keys hashes to one of the $m$ buckets uniformly at random. By linearity of expectation: $E[\text{chain length}] = \frac{n}{m} = \alpha$.
(b) With $m = n$: $E[T] = 1 + \frac{n}{m} = 2$.
(c) By Markov’s inequality: $P(T \geq 10) \leq \frac{E[T]}{10} = \frac{2}{10} = \frac{1}{5}$.
Problem 5: Hash Function Properties
A hash function $h$ is “perfectly random” if $P(h(x) = j) = \frac{1}{m}$ for any key $x$ and bucket $j$.
(a) What does it mean for a hash function to be perfectly random?
(b) Why does the theoretical analysis of hashing depend on this assumption?
(c) If a hash function is biased (not perfectly random), how would it affect the expected query time?
Solution
(a) A perfectly random hash function maps each key uniformly to any of the $m$ buckets with equal probability $\frac{1}{m}$, independent of other keys.
(b) The expected query time analysis assumes that each key is equally likely to land in any bucket, collisions are independent, and the expected chain length is $\frac{n}{m}$. Without perfect randomness, some buckets might be overloaded.
(c) If the hash function is biased, some buckets receive more keys than expected, the expected chain length rises above $\frac{n}{m}$, and the expected query time becomes worse than $O\left(1 + \frac{n}{m}\right)$. In the worst case (all keys hash to one bucket), query time is $O(n)$.
Problem 6: Balls and Bins - Foundational Analysis
You throw $n$ balls (keys) uniformly at random into $n$ bins (hash slots).
(a) What is the expected load (number of balls per bin)?
(b) What is the expected number of empty bins?
(c) Using Chernoff bounds, estimate the probability that a particular bin receives at least 10 balls.
(d) Using a union bound, estimate the probability that the maximum load exceeds 10.
Solution
(a) Each ball lands in a uniformly random bin. By linearity of expectation: $E[\text{load of a bin}] = \frac{n}{n} = 1$.
(b) A bin is empty if no balls land in it. The probability a given bin is empty is: $\left(1 - \frac{1}{n}\right)^n \approx \frac{1}{e}$.
Expected number of empty bins: $n\left(1 - \frac{1}{n}\right)^n \approx \frac{n}{e}$.
(c) Let $X$ = load of a particular bin. Then $X \sim \text{Bin}\left(n, \frac{1}{n}\right)$ with $\mu = E[X] = 1$.
We want $P(X \geq 10)$. With $\mu = 1$, we have $1 + \delta = 10$: $P(X \geq 10) \leq \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{\mu} = \frac{e^9}{10^{10}} \approx 8 \times 10^{-7}$.
(d) The probability that any bin (out of $n$) exceeds load 10: $P(\text{max load} \geq 10) \leq n \cdot \frac{e^9}{10^{10}} \approx \left(8 \times 10^{-7}\right) n$.
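The $n/e$ empty-bin estimate and the small maximum load are easy to observe empirically; this sketch (illustrative names and sizes) throws $n$ balls into $n$ bins:

```python
import math
import random

def balls_and_bins(n=100_000, seed=3):
    """Throw n balls into n bins; report (# empty bins, maximum load)."""
    rng = random.Random(seed)
    loads = [0] * n
    for _ in range(n):
        loads[rng.randrange(n)] += 1
    return loads.count(0), max(loads)

# Theory: about n/e bins stay empty, and the max load is small
# (roughly ln n / ln ln n with high probability).
```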
Problem 7: Balls and Bins - Comprehensive Analysis
Balls and Bins: throw $n$ balls uniformly at random into $n$ bins.
(a) What is the probability that a specific bin receives no balls?
(b) By linearity of expectation, what is the expected number of empty bins?
(c) Write out the explicit product expression for the probability that all $n$ balls land in different bins (no collisions).
(d) Write the generalized closed-form formula for the probability that all $m$ balls land in different bins when thrown into $n$ bins.
(e) Using the balls and bins framework, explain why collisions (two items hashing to the same bucket) are likely even when the number of items equals the number of buckets.
Solution
(a) Each ball has probability $1 - \frac{1}{n}$ of missing a specific bin. With $n$ independent balls: $P(\text{bin empty}) = \left(1 - \frac{1}{n}\right)^n \approx \frac{1}{e}$.
(b) By linearity of expectation, the expected number of empty bins is: $n\left(1 - \frac{1}{n}\right)^n \approx \frac{n}{e}$.
(c) The probability that all $n$ balls land in different bins is: $1 \cdot \frac{n-1}{n} \cdot \frac{n-2}{n} \cdots \frac{1}{n} = \frac{n!}{n^n}$.
(d) The generalized closed-form formula for $m$ balls and $n$ bins is: $P(\text{no collision}) = \prod_{i=0}^{m-1}\left(1 - \frac{i}{n}\right) = \frac{n!}{(n-m)!\, n^m}$.
For our case, where we have equal numbers of balls and bins ($m = n$): $P(\text{no collision}) = \frac{n!}{n^n}$.
(e) If the expected number of empty bins is about $\frac{n}{e}$, then the expected number of non-empty bins is about $n\left(1 - \frac{1}{e}\right) \approx 0.63n$. By the pigeonhole principle, if fewer than $n$ bins are used for $n$ balls, some bins must have more than one ball.
Problem 8: Birthday Paradox and Hash Collisions
Suppose you have a hash table with $n$ slots and you insert $m$ keys using a random hash function.
(a) Using the birthday paradox, estimate the probability that at least two keys collide (hash to the same slot).
(b) Use Chernoff bounds to bound the probability that a particular slot receives 3 or more keys.
(c) Using a union bound, estimate the probability that any slot receives 3 or more keys.
Solution
(a) We want $P(\text{at least one collision})$. It’s easier to compute the complement:
$P(\text{collision}) = 1 - P(\text{no collision}) = 1 - \prod_{i=0}^{m-1}\left(1 - \frac{i}{n}\right)$.
With $m$ keys and $n$ slots, the standard birthday approximation gives: $P(\text{collision}) \approx 1 - e^{-m(m-1)/(2n)}$.
(b) Let $X$ = number of keys hashing to a particular slot. Then $X \sim \text{Bin}\left(m, \frac{1}{n}\right)$ with $\mu = E[X] = \frac{m}{n}$.
We want $P(X \geq 3)$. Using Chernoff in the form $P(X \geq a) \leq \left(\frac{e\mu}{a}\right)^a$ with $a = 3$: $P(X \geq 3) \leq \left(\frac{e\mu}{3}\right)^3$.
(c) The probability that any of the $n$ slots has 3 or more keys is at most: $n \cdot \left(\frac{e\mu}{3}\right)^3$.
Problem 9: Classic Birthday Paradox
(a) In the birthday paradox with 365 days, what is the probability that no two people in a room of 24 people share a birthday?
(b) Using the complement, what is the probability that at least two people share a birthday?
(c) How is the birthday paradox related to the “balls and bins” framework?
(d) Explain why this probability is counterintuitive to many people.
Solution
(a) The probability that all 24 people have different birthdays is: $\prod_{i=0}^{23}\left(1 - \frac{i}{365}\right) = \frac{365 \cdot 364 \cdots 342}{365^{24}} \approx 0.46$.
(b) Using the complement rule: $P(\text{at least two share}) = 1 - 0.46 = 0.54$.
(c) In balls and bins, we have 24 balls (people) and 365 bins (days). The probability formula for no collisions is exactly the same as in the birthday paradox.
(d) Many people intuitively think about the probability of sharing a specific birthday (which is low) rather than the probability of any two people sharing a birthday (which is much higher). With 24 people, there are $\binom{24}{2} = 276$ pairs to compare, which is why collisions are likely.
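The product above is straightforward to evaluate directly; this helper (an illustrative sketch, not part of the original solution) computes it:

```python
def prob_all_distinct(people=24, days=365):
    """P(no shared birthday) = prod over i of (1 - i/days), i = 0..people-1."""
    p = 1.0
    for i in range(people):
        p *= (days - i) / days
    return p

# For 24 people: P(no collision) is about 0.46, so P(some collision) is about 0.54.
```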
Problem 10: Markov’s Inequality - First Concentration Inequality
(a) State Markov’s inequality and identify what information you need to apply it.
(b) Using Markov’s inequality, if the expected query time for hashing is $E[T] = 2$ and you want to bound $P(T \geq 10)$, what is the bound?
(c) Why is Markov’s inequality weaker than Chebyshev’s inequality for the same random variable?
Solution
(a) Markov’s Inequality: For a non-negative random variable $X$ and threshold $a > 0$: $P(X \geq a) \leq \frac{E[X]}{a}$.
You only need to know the expectation $E[X]$, and $X$ must be non-negative.
(b) With $E[T] = 2$ and $a = 10$: $P(T \geq 10) \leq \frac{2}{10} = \frac{1}{5}$.
(c) Markov uses only the expectation, while Chebyshev uses both expectation and variance. If the variance is small, Chebyshev gives a much tighter bound. In the same example, Chebyshev would give $P(|T - E[T]| \geq a) \leq \frac{\text{Var}(T)}{a^2}$, which is typically much smaller than $\frac{E[T]}{a}$ when the variance is small.
Problem 11: Chebyshev’s Inequality
(a) State Chebyshev’s inequality.
(b) For a random variable with $E[X] = 50$ and $\text{Var}(X) = 25$, use Chebyshev’s inequality to bound $P(|X - 50| \geq 10)$.
(c) What does Chebyshev’s inequality tell you about how concentrated a random variable is around its mean?
Solution
(a) Chebyshev’s Inequality: For any random variable $X$ and threshold $a > 0$: $P(|X - E[X]| \geq a) \leq \frac{\text{Var}(X)}{a^2}$.
(b) With $\mu = 50$, $\sigma^2 = 25$, and $a = 10$: $P(|X - 50| \geq 10) \leq \frac{25}{10^2} = \frac{1}{4}$.
(c) Chebyshev’s inequality says that the probability of being far from the mean (by more than $a$ units) is bounded by $\frac{\text{Var}(X)}{a^2}$. Small variance means the random variable is tightly concentrated around its mean. As $a$ increases, the bound on the probability of large deviations decreases quadratically.
Problem 12: Chernoff Bounds
(a) State Chernoff bounds for a sum of independent Bernoulli random variables.
(b) For $X \sim \text{Bin}\left(100, \frac{1}{2}\right)$ with $\mu = 50$, use the simplified Chernoff bound for $0 < \delta \leq 1$ to bound $P(X \geq 60)$.
(c) Compare this to what Chebyshev’s inequality would give for the same event. Which bound is tighter?
Solution
(a) Chernoff Bounds: For independent Bernoulli random variables $X_1, \ldots, X_n$ with $X = \sum_{i} X_i$ and $\mu = E[X]$: $P(X \geq (1+\delta)\mu) \leq \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{\mu}$.
For $0 < \delta \leq 1$: $P(X \geq (1+\delta)\mu) \leq e^{-\mu\delta^2/3}$.
(b) We have $(1+\delta)\mu = 60$, so $\delta = 0.2$: $P(X \geq 60) \leq e^{-50 \cdot (0.2)^2/3} = e^{-2/3} \approx 0.51$.
(c) For Chebyshev with $\text{Var}(X) = 25$ and deviation $a = 10$: $P(X \geq 60) \leq P(|X - 50| \geq 10) \leq \frac{25}{100} = 0.25$.
Chernoff gives $\approx 0.51$ while Chebyshev gives $0.25$. Chebyshev is tighter in this case, but Chernoff becomes exponentially better for larger deviations.
Problem 13: Comparing Markov, Chebyshev, and Chernoff
Compare three concentration bounds (Markov, Chebyshev, Chernoff) applied to the maximum load in balls-and-bins with $n$ balls and $n$ bins.
(a) State each inequality and the requirements for applying it.
(b) For $n$ balls in $n$ bins, compute each bound on $P(\text{max load} \geq 10)$.
(c) Explain why Chernoff is exponentially tighter than the others.
(d) Which bound would you use in practice, and why?
Solution
(a)
Markov’s Inequality: $P(X \geq a) \leq \frac{E[X]}{a}$. Requires only $E[X]$ (non-negative RV).
Chebyshev’s Inequality: $P(|X - E[X]| \geq a) \leq \frac{\text{Var}(X)}{a^2}$. Requires both $E[X]$ and $\text{Var}(X)$.
Chernoff Bound: $P(X \geq (1+\delta)\mu) \leq \left(\frac{e^{\delta}}{(1+\delta)^{1+\delta}}\right)^{\mu}$. Requires that $X$ is a sum of independent Bernoulli RVs.
(b) For a single bin with load $X \sim \text{Bin}\left(n, \frac{1}{n}\right)$ ($\mu = 1$, $\text{Var}(X) \approx 1$), bounding $P(X \geq 10)$ and then union bounding over all $n$ bins:
Markov: $P(X \geq 10) \leq \frac{1}{10}$; the union bound gives $\frac{n}{10}$ (vacuous).
Chebyshev: $P(X \geq 10) \leq P(|X - 1| \geq 9) \leq \frac{1}{81}$; the union bound gives $\frac{n}{81}$ (absurd: exceeds 1 for large $n$).
Chernoff: $P(X \geq 10) \leq \frac{e^9}{10^{10}} \approx 8 \times 10^{-7}$. Union bound: $P(\text{max load} \geq 10) \leq \left(8 \times 10^{-7}\right) n$, which stays small for realistic $n$.
(c) Why Chernoff is exponentially better:
- Markov uses only expectation (weakest info)
- Chebyshev uses expectation + variance (better, but still limited)
- Chernoff exploits the structure: when $X = \sum_i X_i$ with independent Bernoullis $X_i$, concentration is exponential
(d) In practice:
Use Chernoff when: You’re analyzing sums of independent Bernoulli RVs (common in randomized algorithms) or need tight bounds for large-scale systems.
Use Chebyshev when: You don’t have independence (only pairwise is needed) or the distribution is unknown.
Use Markov when: You only know the expectation or need a simple, quick bound.
For balls-and-bins, Chernoff is the right choice because the load is literally a sum of independent Bernoullis.
Problem 14: Comparing Markov, Chebyshev, and Chernoff with Specific Example
Consider $X \sim \text{Bin}\left(100, \frac{1}{2}\right)$ with $E[X] = 50$ and $\text{Var}(X) = 25$.
(a) Use all three inequalities (Markov, Chebyshev, Chernoff) to bound $P(X \geq 75)$.
(b) Rank the three bounds from loosest to tightest. Which inequality is most useful for this tail event, and why?
(c) As we ask for increasingly extreme deviations (e.g., $P(X \geq 90)$), which inequality’s advantage grows?
Solution
(a)
- Markov: $P(X \geq 75) \leq \frac{50}{75} = \frac{2}{3} \approx 0.667$
- Chebyshev: $P(X \geq 75) \leq P(|X - 50| \geq 25) \leq \frac{25}{25^2} = 0.04$
- Chernoff: With $\delta = 0.5$, we get $P(X \geq 75) \leq e^{-50 \cdot (0.5)^2/3} = e^{-25/6} \approx 0.015$
(b) Ranking from loosest to tightest: Markov (0.667) > Chebyshev (0.04) > Chernoff (0.015).
Chernoff is most useful because it exploits the structure of independent Bernoullis and gives exponentially small bounds for tail events.
(c) As deviations grow larger, Chernoff’s exponential decay becomes increasingly powerful. For instance, at $P(X \geq 90)$ ($\delta = 0.8$), Chernoff gives $e^{-50 \cdot (0.8)^2/3} \approx 2 \times 10^{-5}$, while the Markov and Chebyshev bounds shrink much more slowly ($\frac{50}{90} \approx 0.56$ and $\frac{25}{40^2} \approx 0.016$, respectively).
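The three bounds can be computed side by side; this sketch (hypothetical function name; it uses the simplified Chernoff form $e^{-\mu\delta^2/3}$, valid for $0 < \delta \leq 1$) reproduces the comparison for $X \sim \text{Bin}(100, 1/2)$ at threshold 75:

```python
import math

def tail_bounds(n=100, p=0.5, a=75):
    """Markov, Chebyshev, and simplified Chernoff bounds on P(X >= a)
    for X ~ Bin(n, p)."""
    mu = n * p
    var = n * p * (1 - p)
    delta = a / mu - 1                        # a = (1 + delta) * mu
    markov = mu / a
    chebyshev = var / (a - mu) ** 2           # via P(|X - mu| >= a - mu)
    chernoff = math.exp(-mu * delta ** 2 / 3) # valid for 0 < delta <= 1
    return markov, chebyshev, chernoff
```

For the default arguments this returns roughly (0.667, 0.04, 0.015), matching the ranking above.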
Problem 15: Coupon Collector Problem
Consider the Coupon Collector problem: you want to collect all 6 unique Pokemon characters by buying cereal boxes. Each box contains one random character with equal probability.
(a) If you already have 3 unique characters, what is the probability that the next box gives you a new character?
(b) Let $X_i$ be the number of boxes bought after collecting $i$ unique characters until obtaining the $(i+1)$-th character. What distribution does $X_i$ follow, and what is $E[X_i]$ for $i = 3$?
(c) What is the expected total number of boxes needed to collect all 6 characters?
Solution
(a) You have 3 characters, so 3 of the 6 are still new. The probability of getting a new character is $\frac{3}{6} = \frac{1}{2}$.
(b) $X_i$ follows a geometric distribution with success probability $p_i = \frac{6-i}{6}$. For $i = 3$: $E[X_3] = \frac{1}{p_3} = \frac{6}{3} = 2$.
(c) By linearity of expectation: $E[T] = \sum_{i=0}^{5} \frac{6}{6-i} = 6\left(1 + \frac{1}{2} + \frac{1}{3} + \frac{1}{4} + \frac{1}{5} + \frac{1}{6}\right) = 6 H_6 = 14.7$.
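Both the exact formula $6 H_6$ and a direct simulation agree; this sketch (illustrative names and trial counts) computes each:

```python
import random

def expected_boxes(n=6):
    """Exact E[T] = n * H_n for the coupon collector with n types."""
    return n * sum(1 / k for k in range(1, n + 1))

def simulate_boxes(n=6, trials=20_000, seed=4):
    """Average number of boxes needed to collect all n characters."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        seen, boxes = set(), 0
        while len(seen) < n:
            seen.add(rng.randrange(n))  # one random character per box
            boxes += 1
        total += boxes
    return total / trials

# Theory: 6 * H_6 = 14.7 boxes on average.
```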
Problem 16: Coupon Collector - General Analysis
In the Coupon Collector problem, you repeatedly draw coupons uniformly at random from a set of $n$ distinct types. The goal is to understand how many draws are needed to collect at least one coupon of every type.
(a) Let $X_i$ denote the number of coupons needed to go from having $i$ distinct types to $i+1$ distinct types. What distribution does $X_i$ follow?
(b) Compute $E[X_i]$ and derive the total expected number of coupons needed to collect all $n$ types.
(c) For $n = 365$ (as in the birthday problem analog), compute the expected total number of coupons needed to see all 365 types.
Solution
(a) When we have $i$ types, there are $n - i$ unseen types out of $n$ total. Each drawn coupon is new with probability $\frac{n-i}{n}$. Thus $X_i$ follows a geometric distribution with success probability $p_i = \frac{n-i}{n}$.
(b) For a geometric RV with success probability $p_i$: $E[X_i] = \frac{1}{p_i} = \frac{n}{n-i}$.
By linearity of expectation: $E[T] = \sum_{i=0}^{n-1} \frac{n}{n-i} = n \sum_{k=1}^{n} \frac{1}{k} = n H_n$.
For large $n$: $E[T] \approx n \ln n$.
(c) For $n = 365$: $E[T] = 365 \cdot H_{365} \approx 365 \cdot 6.48 \approx 2365$ coupons.
Problem 17: Applying Concentration Inequalities to Collision Analysis
In FKS hashing, we hash $n$ keys into $m = n$ buckets. Let $B_i$ be the number of keys in bucket $i$, and let $C$ be the total number of collisions (ordered pairs of distinct keys in the same bucket). We know $C = \sum_i B_i (B_i - 1)$.
(a) When hashing $n$ keys uniformly into $n$ buckets, compute $E[C]$ by considering: for each key $x$, how many other keys collide with $x$ in expectation?
(b) Use Markov’s inequality to bound $P(C \geq 2n)$ given that $E[C] \leq n$.
(c) Suppose instead we use Chebyshev. If $\text{Var}(C) \approx n^2$, can Chebyshev give a strong enough bound to show the probability is less than 1/2? Discuss why or why not.
Solution
(a) For key $x$, there are $n - 1$ other keys. Each collides with $x$ with probability $\frac{1}{n}$. So: $E[\text{collisions with } x] = \frac{n-1}{n} < 1$.
Summing over all $n$ keys: $E[C] = n \cdot \frac{n-1}{n} = n - 1 < n$.
(b) Using $E[C] \leq n$ with threshold $2n$, by Markov: $P(C \geq 2n) \leq \frac{E[C]}{2n} \leq \frac{n}{2n} = \frac{1}{2}$.
(c) Chebyshev would give: $P(C \geq 2n) \leq P(|C - E[C]| \geq n) \leq \frac{\text{Var}(C)}{n^2} \approx 1$.
This is not strong enough. This illustrates why Markov’s bound is actually better for this problem: the deviation we care about ($2n$ vs. a mean of about $n$) is only a factor of 2, and Markov is perfectly calibrated for this regime.
Problem 18: Linear Probing Basics
Linear probing resolves collisions by placing keys in consecutive empty slots: if slot $h(x)$ is occupied, try $h(x) + 1$, $h(x) + 2$, etc. (wrapping around mod $m$).
(a) Describe the insertion and query operations in linear probing.
(b) What is “primary clustering,” and why does it occur?
(c) Using Donald Knuth’s analysis, what is the expected query time for a successful lookup with load factor $\alpha$?
Solution
(a) Insertion of key $x$:
- Compute $h(x)$
- If slot $h(x)$ is empty, insert $x$ there
- Otherwise, probe $h(x) + 1, h(x) + 2, \ldots$ until finding an empty slot
Query for key $x$:
- Compute $h(x)$
- Probe $h(x), h(x) + 1, \ldots$ until finding $x$ or an empty slot
(b) Primary clustering is the formation of long contiguous runs of occupied slots.
Why it occurs: A key hashes to slot $j$. If $j$ is occupied, the key is placed at the next free slot. Any future key whose hash lands anywhere in the occupied run probes into it, extending the run further. The run grows like a “snowball.”
(c) By Donald Knuth’s analysis, the expected query time for a successful lookup is: $\frac{1}{2}\left(1 + \frac{1}{1 - \alpha}\right)$.
With $\alpha = \frac{1}{2}$: $\frac{1}{2}\left(1 + \frac{1}{1 - 1/2}\right) = \frac{1}{2}(1 + 2) = 1.5$ probes.
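The insert and query loops can be sketched as follows (a minimal illustration assuming the load factor stays below 1; deletion and resizing are omitted):

```python
class LinearProbingTable:
    """Minimal linear-probing hash table (insert/contains only)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.slots = [None] * capacity

    def insert(self, key):
        i = hash(key) % self.capacity
        while self.slots[i] is not None:       # probe h(x), h(x)+1, ...
            if self.slots[i] == key:
                return                         # key already present
            i = (i + 1) % self.capacity
        self.slots[i] = key

    def contains(self, key):
        i = hash(key) % self.capacity
        while self.slots[i] is not None:       # stop at the first empty slot
            if self.slots[i] == key:
                return True
            i = (i + 1) % self.capacity
        return False
```

Note how an unsuccessful lookup walks to the end of the occupied run it lands in, which is exactly why primary clustering hurts query time.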
Problem 19: Linear Probing - Detailed Analysis
In a hash table with $m$ cells and $n$ keys (load factor $\alpha = \frac{n}{m}$), let $L_i$ be the length of chain $i$.
(a) What is $E[L_i]$?
(b) For $m = 1000$ and $\alpha = 0.1$, use Chernoff bounds to bound the probability that any chain exceeds length 3.
(c) Using Chebyshev’s inequality together with a union bound, explain why this approach would not give a tight bound for this problem.
Solution
(a) Each key independently hashes to cell $i$ with probability $\frac{1}{m}$. Thus $E[L_i] = \frac{n}{m} = \alpha$.
(b) For $m = 1000$ and $\alpha = 0.1$, we have $n = 100$ keys. For a single cell with $\mu = 0.1$, Chernoff (in the form $P(L_i \geq a) \leq \left(\frac{e\mu}{a}\right)^a$) with $a = 4$ gives:
$P(L_i \geq 4) \leq \left(\frac{e \cdot 0.1}{4}\right)^4 \approx 2 \times 10^{-5}$
By union bound over the $m = 1000$ cells: $P(\text{any } L_i \geq 4) \leq 1000 \cdot 2 \times 10^{-5} = 0.02$.
(c) For a chain with $\mu = 0.1$, we have $\text{Var}(L_i) \approx 0.1$. Using Chebyshev to bound $P(L_i \geq 4)$: $P(|L_i - 0.1| \geq 3.9) \leq \frac{0.1}{3.9^2} \approx 0.0066$.
By union bound: $1000 \cdot 0.0066 \approx 6.6$.
This is completely useless (bound > 1). Chebyshev only uses expectation and variance, while Chernoff exploits the structure of sums of independent Bernoullis, giving exponentially tighter bounds.
Problem 20: FKS Hashing
FKS hashing uses a two-level approach: a primary hash table with secondary hash tables in each bucket.
(a) Why does FKS guarantee $O(1)$ worst-case lookup time?
(b) What is the expected preprocessing time for FKS, and how does it compare to hashing with chaining?
Solution
(a) FKS uses:
- Primary table: Hash the $n$ keys into $n$ buckets using a function $h$
- Secondary tables: For each bucket $i$ with $B_i$ keys, use a secondary table of size $B_i^2$ with its own function $h_i$
This guarantees zero collisions at the secondary level with positive probability, so:
- Query = 1 probe in primary + 1 probe in secondary = $O(1)$ worst-case
(b)
- Hashing with chaining: $O(n)$ worst-case preprocessing (insert each key once)
- FKS: $O(n)$ expected preprocessing
The difference: FKS may need to retry hash functions if the choice is unlucky (i.e., the total secondary space $\sum_i B_i^2$ comes out too large, or a secondary table has a collision), but by Markov’s inequality, the expected number of retries is constant.
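A toy version of the two-level construction makes the retry logic concrete. Everything here is an illustrative sketch: the linear hash family, the prime modulus, and the retry threshold $\sum_i B_i^2 \leq 4n$ are assumptions for the demo (not the lecture's exact constants), and keys are assumed to be non-negative integers below the modulus:

```python
import random

P = 2_147_483_647  # the Mersenne prime 2^31 - 1, used as the hash modulus

def _random_hash(m, rng):
    """Random hash from a (roughly) universal family: x -> ((a*x + b) % P) % m."""
    a, b = rng.randrange(1, P), rng.randrange(P)
    return lambda x: ((a * x + b) % P) % m

def build_fks(keys, seed=5):
    """Two-level FKS-style structure: primary table of size n, secondary
    tables of size B_i^2 rebuilt until collision-free."""
    rng = random.Random(seed)
    n = len(keys)
    while True:  # retry primary hash until total secondary space is O(n)
        h = _random_hash(n, rng)
        buckets = [[] for _ in range(n)]
        for x in keys:
            buckets[h(x)].append(x)
        if sum(len(b) ** 2 for b in buckets) <= 4 * n:
            break
    secondary = []
    for b in buckets:
        size = max(len(b) ** 2, 1)
        while True:  # retry secondary hash until it is collision-free
            hi = _random_hash(size, rng)
            table = [None] * size
            ok = True
            for x in b:
                j = hi(x)
                if table[j] is not None:
                    ok = False
                    break
                table[j] = x
            if ok:
                break
        secondary.append((hi, table))
    return h, secondary

def fks_contains(struct, x):
    """Membership test with exactly 2 probes: one primary, one secondary."""
    h, secondary = struct
    hi, table = secondary[h(x)]
    return table[hi(x)] == x
```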
Problem 21: HWC vs FKS Hashing
Compare Hashing with Chaining (HWC) and FKS Hashing.
(a) What are the query time guarantees for each?
(b) What are the preprocessing time guarantees for each?
(c) When would you choose HWC over FKS, and vice versa?
Solution
(a) Query time:
- HWC: $O\left(1 + \frac{n}{m}\right)$ expected. Worst-case is $O(n)$.
- FKS: $O(1)$ worst-case (exactly 2 probes)
(b) Preprocessing time:
- HWC: $O(n)$ always (insert each key once)
- FKS: $O(n)$ expected (may retry hash functions, but the expected number of retries is constant)
(c)
Choose HWC when: Simplicity and ease of implementation matter, insertions and deletions are frequent, the load factor might be high, or constant factors are important.
Choose FKS when: Worst-case query time guarantees are critical, the set is static or rarely updated, or you need theoretical guarantees.
Problem 22: Bloom Filters - Basic Operations
A Bloom filter uses $m$ bits and $k$ hash functions to test set membership with false positives but no false negatives.
(a) Describe the insert and query operations.
(b) Why is the false positive probability approximately $\left(1 - e^{-kn/m}\right)^k$?
(c) What is the optimal number of hash functions $k$ to minimize the false positive rate?
Solution
(a) Insert element $x$ into set $S$:
- Compute $h_1(x), h_2(x), \ldots, h_k(x)$
- Set the bits at these $k$ positions to $1$
Query whether $y \in S$:
- Compute $h_1(y), h_2(y), \ldots, h_k(y)$
- If all bits at these positions are $1$, return “possibly in $S$”
- If any bit is $0$, return “definitely not in $S$”
(b) After inserting $n$ elements with $k$ hash functions:
The probability a specific bit is not set by any of the $kn$ hash function invocations: $\left(1 - \frac{1}{m}\right)^{kn} \approx e^{-kn/m}$
So the probability a bit is set to $1$: $1 - e^{-kn/m}$
A false positive occurs when all $k$ queried bits are $1$: $P(\text{false positive}) \approx \left(1 - e^{-kn/m}\right)^k$
(c) To minimize the false positive probability, take the derivative with respect to $k$ and set it to zero. The optimal value is: $k^* = \frac{m}{n} \ln 2$
At this optimal $k$, the false positive probability becomes: $\left(\frac{1}{2}\right)^{k^*} \approx (0.6185)^{m/n}$
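The insert and query operations can be sketched as follows (a toy illustration: the $k$ hash functions are simulated by seeding Python's `random` with the item's hash, a stand-in for real hash functions like MurmurHash; integer keys hash deterministically, while string hashing is randomized per process):

```python
import random

class BloomFilter:
    """Minimal Bloom filter sketch: m bits, k simulated hash functions."""

    def __init__(self, m, k):
        self.m, self.k = m, k
        self.bits = [False] * m

    def _positions(self, item):
        # Derive k pseudo-random bit positions from the item.
        rng = random.Random(hash(item))
        return [rng.randrange(self.m) for _ in range(self.k)]

    def insert(self, item):
        for pos in self._positions(item):
            self.bits[pos] = True

    def query(self, item):
        # True  -> "possibly in S" (may be a false positive)
        # False -> "definitely not in S" (never a false negative)
        return all(self.bits[pos] for pos in self._positions(item))
```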
Problem 23: Bloom Filters - Comprehensive Analysis
A Bloom filter uses $m$ bits and $k$ hash functions to store a set of $n$ elements.
(a) What is the expected number of bits set to 1?
(b) Estimate the false positive probability.
(c) If you want to reduce the false positive rate to $\epsilon$, how should you adjust $m$ and/or $k$?
Solution
(a) Each of the $kn$ hash function evaluations sets a bit to 1 (possibly one that is already set).
The probability a given bit remains 0 is: $\left(1 - \frac{1}{m}\right)^{kn} \approx e^{-kn/m}$
Expected number of bits set to 1: $m\left(1 - e^{-kn/m}\right)$
(b) For a query of a non-member, all $k$ hash function positions must coincidentally be set to 1: $P(\text{false positive}) \approx \left(1 - e^{-kn/m}\right)^k$
(c) From the formula $\epsilon \approx \left(1 - e^{-kn/m}\right)^k$:
For fixed $k$, to achieve a smaller $\epsilon$, we need to increase $m$.
At the optimal $k = \frac{m}{n} \ln 2$, solving $\epsilon = 2^{-k}$ gives: $m = \frac{n \log_2(1/\epsilon)}{\ln 2} \approx 1.44\, n \log_2(1/\epsilon)$ bits.
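The sizing rules $m \approx 1.44\, n \log_2(1/\epsilon)$ and $k = \frac{m}{n}\ln 2$ translate directly into a small helper (illustrative name; a sketch under the optimal-$k$ assumption):

```python
import math

def bloom_parameters(n, eps):
    """Bits m and hash count k for n elements and target false-positive
    rate eps, using m = n*log2(1/eps)/ln 2 and k = (m/n)*ln 2."""
    m = math.ceil(n * math.log2(1 / eps) / math.log(2))
    k = max(1, round((m / n) * math.log(2)))
    return m, k

# Example: n = 1000, eps = 1% needs roughly 9.6 bits per element and k = 7.
```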
Problem 24: Bloom Filter False Positive Dynamics
In Bloom filters, how does the false positive rate change as more elements are inserted?
(a) As $n$ increases, does the false positive probability increase or decrease?
(b) If you fix $m$ (the number of bits) and insert more and more elements, what happens to the false positive rate?
(c) How should you adjust $m$ and $k$ as $n$ grows to maintain a constant false positive rate?
Solution
(a) The false positive probability increases as $n$ increases.
With $n$ elements inserted and the optimal $k$: $P(\text{false positive}) = 2^{-k} \approx (0.6185)^{m/n}$
As $n$ increases, the exponent $\frac{m}{n}$ decreases (for fixed $m$), so the base raised to a smaller power gives a larger result.
(b) If you fix and keep inserting elements:
- More elements more bits are set to
- Higher fraction of bits higher chance all queried bits are
- False positive rate increases monotonically
(c) To maintain a constant false positive rate as $n$ grows:
Solution: Increase $m$ proportionally to $n$ while adjusting $k$ to maintain the target rate.
In practice:
- Space: Use $m = O\left(n \log \frac{1}{\epsilon}\right)$ bits (linear in $n$, logarithmic in the target error)
- Hash functions: $k = \frac{m}{n} \ln 2$ (adjusts automatically as long as $\frac{m}{n}$ is fixed)
Problem 25: Linear Probing vs Bloom Filters
Compare Linear Probing and Bloom Filters.
(a) What is the main difference in what they do?
(b) In terms of space, which is more efficient for very large sets?
(c) When would you use Linear Probing vs Bloom Filter?
Solution
(a) Linear Probing:
- Solves the dictionary problem: store keys and support insert, delete, lookup with exact answers
- Uses $O(n)$ words of space
- Gives exact answers
Bloom Filter:
- Solves the approximate membership problem: test if an element is in the set, with false positives allowed
- Uses $O\left(n \log \frac{1}{\epsilon}\right)$ bits of space
- Gives probabilistic answers (no false negatives, but false positives allowed)
(b) Space comparison: For a set of $n$ elements from a large universe of size $u$:
- Linear Probing: Must store each full key. Space is $O(n \log u)$ bits
- Bloom Filter: Uses only $O\left(n \log \frac{1}{\epsilon}\right)$ bits, independent of universe size
For large $u$, the Bloom filter is far more space-efficient.
(c)
Use Linear Probing when: You need exact membership testing, you need to support insertions and deletions, the set fits in memory, or false positives are unacceptable.
Use Bloom Filter when: Space is critical, false positives are tolerable, you only need membership testing (no need to retrieve associated data), or the universe is very large.
Problem 26: FKS vs Bloom Filters
FKS Hashing vs. Bloom Filters: which should you use for a membership test on $n$ elements?
(a) Compare the space used by each.
(b) Compare the query time of each.
(c) What are the key tradeoffs?
Solution
(a) Space:
- FKS: $O(n)$ words (each word stores a key). Total: $O(n \log u)$ bits for universe size $u$
- Bloom Filter: $O\left(n \log \frac{1}{\epsilon}\right)$ bits, independent of $u$
For universe size $u = 2^{32}$ and error rate $\epsilon = 1\%$:
- FKS: $\approx 32n$ bits
- Bloom: $\approx 10n$ bits (roughly 3× smaller)
(b) Query time:
- FKS: $O(1)$ worst-case (exactly 2 probes)
- Bloom: $O(k)$, since $k$ hash function evaluations are needed per query
(c) Tradeoffs:
| Aspect | FKS | Bloom |
|---|---|---|
| Exactness | Exact answers | False positives allowed |
| Space | $O(n \log u)$ bits | $O(n \log(1/\epsilon))$ bits |
| Speed | $O(1)$ worst-case | $k$ hashes |
| Dynamism | Hard to insert/delete | Cannot delete |
| Use case | Exact dictionary | Space-critical membership |
Decision rule:
- Use FKS for exact membership when you can afford the space
- Use Bloom when space is critical and false positives are acceptable