CSCI 328 Midterm Example 1
CSCI 381/780 Midterm
The exam has 5 questions, adding up to 120 points. The max score is 100. Please do the problems in order.
Chernoff Bound: Let $X_1, \ldots, X_n$ be i.i.d. Bernoulli random variables and $X = \sum_{i=1}^{n} X_i$. Let $\mu = E[X]$. Then for any $\delta > 0$:

$$\Pr[X \geq (1+\delta)\mu] \leq e^{-\frac{\delta^2}{2+\delta}\mu}$$
- Let $x$ and $y$ be two random $n$-bitvectors, i.e., $x$ and $y$ are bitvectors of length $n$ such that each position is equally likely to be 0 or 1.
  - (15 points) The Hamming distance $H(x, y)$ between two bitvectors is defined as the number of positions in which they differ. For example, $H(0110, 1100) = 2$. Calculate $E[H(x, y)]$, i.e., the expected Hamming distance between $x$ and $y$. Using Chebyshev, give an upper bound on the probability that the Hamming distance between $x$ and $y$ is larger than $\frac{3n}{4}$ or less than $\frac{n}{4}$.
  - (20 points) The dot product of two vectors $x$ and $y$ is defined as $x \cdot y = \sum_{i=1}^{n} x_i y_i$. Calculate $E[x \cdot y]$, i.e., the expected dot product of $x$ and $y$. Using Chernoff, give an upper bound on the probability that the dot product of $x$ and $y$ is larger than $\frac{n}{3}$.
Solution
Part (a): Hamming Distance
Let’s start by understanding what we’re looking for. The Hamming distance counts how many positions differ between the two vectors. Let’s denote it as $H = H(x, y)$.
Finding the expected Hamming distance:
Here’s the key insight: we can think of the total Hamming distance as the sum of indicator variables for each position. At position $i$, the bits differ if $x_i \neq y_i$.
Let’s calculate the probability that position $i$ differs:
- If $x_i = 0$ and $y_i = 1$: probability is $\frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}$
- If $x_i = 1$ and $y_i = 0$: probability is $\frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}$

Since we have $n$ independent positions, and at each position there’s a $\frac{1}{4} + \frac{1}{4} = \frac{1}{2}$ chance of differing:

$$E[H] = \frac{n}{2}$$
Finding the variance:
To use Chebyshev’s inequality, we need the variance. Think of each position as an independent trial: position $i$ contributes 1 to the Hamming distance (if it differs) or 0 (if it matches). Each position is a Bernoulli(1/2) random variable.
For a Bernoulli(1/2) random variable, the variance is $\frac{1}{2}\left(1 - \frac{1}{2}\right) = \frac{1}{4}$. Since all positions are independent:

$$\text{Var}(H) = \frac{n}{4}$$
Applying Chebyshev’s inequality:
We want to bound the probability that $H$ deviates significantly from its expected value $\frac{n}{2}$. Specifically, we want the probability that $H$ is either very large ($H > \frac{3n}{4}$) or very small ($H < \frac{n}{4}$). Notice that both of these are exactly $\frac{n}{4}$ away from the mean:

$$\Pr\left[H > \tfrac{3n}{4} \text{ or } H < \tfrac{n}{4}\right] = \Pr\left[\left|H - \tfrac{n}{2}\right| > \tfrac{n}{4}\right]$$

By Chebyshev’s inequality:

$$\Pr\left[\left|H - \tfrac{n}{2}\right| > \tfrac{n}{4}\right] \leq \frac{\text{Var}(H)}{(n/4)^2} = \frac{n/4}{n^2/16} = \frac{4}{n}$$
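As a quick sanity check (not part of the exam answer), the sketch below simulates random bitvector pairs and compares the empirical mean and tail frequency against $\frac{n}{2}$ and the Chebyshev bound $\frac{4}{n}$; all names here are illustrative:

```python
import random

def hamming_trial(n):
    """Sample two random n-bitvectors and return their Hamming distance."""
    x = [random.randint(0, 1) for _ in range(n)]
    y = [random.randint(0, 1) for _ in range(n)]
    return sum(a != b for a, b in zip(x, y))

n, trials = 100, 20000
dists = [hamming_trial(n) for _ in range(trials)]

mean_h = sum(dists) / trials  # should land close to n/2 = 50
tail = sum(d > 3 * n / 4 or d < n / 4 for d in dists) / trials

print(mean_h)
print(tail)  # in practice far below the Chebyshev bound 4/n = 0.04
```

The tail frequency is usually far smaller than $\frac{4}{n}$, which is expected: Chebyshev only uses the variance and is quite loose here.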
Part (b): Dot Product
The dot product sums the products of corresponding bits. Let’s think about when $x_i y_i = 1$: this happens only when both $x_i = 1$ AND $y_i = 1$.
Finding the expected dot product:
Since each bit is independently 0 or 1 with probability 1/2:
- Probability that $x_i = 1$ is $\frac{1}{2}$
- Probability that $y_i = 1$ is $\frac{1}{2}$
- These are independent, so $\Pr[x_i y_i = 1] = \frac{1}{2} \cdot \frac{1}{2} = \frac{1}{4}$

Each position contributes an expected value of $\frac{1}{4}$ to the dot product:

$$E[x \cdot y] = \frac{n}{4}$$
Applying the Chernoff bound:
We want to bound $\Pr\left[x \cdot y > \frac{n}{3}\right]$. The Chernoff bound is useful here because it gives us exponentially small probabilities.
The Chernoff bound has the form: $\Pr[X \geq (1+\delta)\mu] \leq e^{-\frac{\delta^2}{2+\delta}\mu}$, where $\mu = E[X]$.
We need to express our target in this form. We have $\mu = \frac{n}{4}$, so:

$$(1+\delta)\frac{n}{4} = \frac{n}{3}$$

Solving for $\delta$: $1 + \delta = \frac{4}{3}$, so $\delta = \frac{1}{3}$.
Now we apply the Chernoff bound with $\mu = \frac{n}{4}$ and $\delta = \frac{1}{3}$:

$$\Pr\left[x \cdot y \geq \frac{n}{3}\right] \leq e^{-\frac{(1/3)^2}{2 + 1/3}\cdot\frac{n}{4}} = e^{-\frac{1/9}{7/3}\cdot\frac{n}{4}} = e^{-n/84}$$

This exponential bound tells us that as $n$ grows, the probability of the dot product exceeding $\frac{n}{3}$ decreases exponentially, which is very reassuring!
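The same kind of empirical sanity check works here (again an illustrative sketch, not part of the answer): estimate the frequency of $x \cdot y > \frac{n}{3}$ and compare it to the bound $e^{-n/84}$.

```python
import math
import random

def dot_trial(n):
    """Dot product of two random n-bitvectors."""
    return sum(random.randint(0, 1) * random.randint(0, 1) for _ in range(n))

n, trials = 84, 20000
dots = [dot_trial(n) for _ in range(trials)]

mean_dot = sum(dots) / trials                      # should land close to n/4 = 21
empirical = sum(d > n / 3 for d in dots) / trials  # empirical tail frequency
chernoff = math.exp(-n / 84)                       # bound from the solution, e^{-1} here

print(mean_dot, empirical, chernoff)
```

With $n = 84$ the bound evaluates to $e^{-1} \approx 0.368$, and the empirical tail frequency sits well below it, as a valid (if loose) upper bound should.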
- (15 points) Consider the following wheel-of-fortune game: A player bets $\$1$ on one of the numbers 1 through 6. Three dice are then rolled, and if the number bet by the player appears $k$ times, $k = 1, 2, 3$, then the player wins $k$ dollars (note that it doesn’t matter what the actual number the player bet is, only how many times it appears). If the number bet by the player does not appear on any of the dice, then the player loses $\$1$. Decide if this game is fair to the player by calculating expected winnings from one play of the game.
Solution
Let $N$ denote the number of times the bet number appears on the three dice. We need to find the probability distribution of $N$, since the payout depends on $N$.
Setting up the probabilities:
For each die:
- Probability the bet number appears: $\frac{1}{6}$
- Probability it doesn’t appear: $\frac{5}{6}$
Let’s calculate $\Pr[N = k]$ for each possible value:
When $N = 0$ (number doesn’t appear on any die):

$$\Pr[N = 0] = \left(\frac{5}{6}\right)^3 = \frac{125}{216}$$

When $N = 1$ (appears on exactly one die):

$$\Pr[N = 1] = \binom{3}{1}\left(\frac{1}{6}\right)\left(\frac{5}{6}\right)^2 = \frac{75}{216}$$

When $N = 2$ (appears on exactly two dice):

$$\Pr[N = 2] = \binom{3}{2}\left(\frac{1}{6}\right)^2\left(\frac{5}{6}\right) = \frac{15}{216}$$

When $N = 3$ (appears on all three dice):

$$\Pr[N = 3] = \left(\frac{1}{6}\right)^3 = \frac{1}{216}$$
Computing expected winnings:
Now recall the payout structure:
- If $N = 0$: lose $\$1$
- If $N = 1$: win $\$1$
- If $N = 2$: win $\$2$
- If $N = 3$: win $\$3$

By the definition of expectation:

$$E[\text{winnings}] = (-1)\cdot\frac{125}{216} + 1\cdot\frac{75}{216} + 2\cdot\frac{15}{216} + 3\cdot\frac{1}{216} = \frac{-125 + 75 + 30 + 3}{216} = -\frac{17}{216} \approx -0.079$$
Conclusion:
The expected winnings are negative, $-\frac{17}{216} \approx -\$0.08$ per play. This means that on average, over many plays, a player loses money. Therefore, the game is not fair to the player.
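The expectation can be double-checked with a short exact computation (a sketch; the payout table simply mirrors the solution above):

```python
from fractions import Fraction
from math import comb

p = Fraction(1, 6)  # probability the bet number shows on a single die

# P(N = k): N = number of dice (out of three) showing the bet number
probs = {k: comb(3, k) * p**k * (1 - p)**(3 - k) for k in range(4)}

payout = {0: -1, 1: 1, 2: 2, 3: 3}  # dollars won for each value of N

expected = sum(payout[k] * probs[k] for k in range(4))
print(expected)  # -17/216, roughly -$0.079 per play
```

Using `Fraction` keeps the arithmetic exact, so the result is the same $-\frac{17}{216}$ as in the derivation rather than a rounded float.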
- (20 points) Suppose you are given that $\sum_{k=1}^{\infty}\frac{1}{k^3} = c$ (the actual value doesn’t matter, just that it is a constant) and that $\sum_{k=1}^{\infty}\frac{1}{k^2} = \frac{\pi^2}{6}$.

  Using this information, design a random variable $X$ that can take any value in $\{1, 2, 3, \ldots\}$ (so infinitely many values), such that the expectation of $X$ is finite, but the variance of $X$ is infinite.
Solution
Designing the random variable:
The key is to use the given series to control the behavior of expectation and variance. Let’s define:

$$\Pr[X = k] = \frac{A}{k^3}, \quad k = 1, 2, 3, \ldots$$

where $A$ is chosen to make this a valid probability distribution (normalize so probabilities sum to 1).
Finding the normalization constant:
For probabilities to sum to 1:

$$\sum_{k=1}^{\infty}\frac{A}{k^3} = A\sum_{k=1}^{\infty}\frac{1}{k^3} = 1$$

We’re given that $\sum_{k=1}^{\infty}\frac{1}{k^3} = c$, so:

$$A = \frac{1}{c}$$
Computing the expectation:
Now let’s check if the expectation is finite:

$$E[X] = \sum_{k=1}^{\infty} k \cdot \frac{A}{k^3} = A\sum_{k=1}^{\infty}\frac{1}{k^2}$$

We’re told that $\sum_{k=1}^{\infty}\frac{1}{k^2} = \frac{\pi^2}{6}$. Therefore:

$$E[X] = \frac{\pi^2}{6c}$$

Since $A = \frac{1}{c}$ is a finite constant and the sum is finite, $E[X]$ is finite.
Computing the second moment:
For the variance, we need $E[X^2]$:

$$E[X^2] = \sum_{k=1}^{\infty} k^2 \cdot \frac{A}{k^3} = A\sum_{k=1}^{\infty}\frac{1}{k}$$

Here’s the crucial difference: $\sum_{k=1}^{\infty}\frac{1}{k}$ is the harmonic series, which diverges to infinity. This is a famous result!
Therefore $E[X^2] = \infty$, which means:

$$\text{Var}(X) = E[X^2] - (E[X])^2 = \infty$$
We’ve successfully constructed a random variable with finite expectation but infinite variance!
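The contrast between the two series can be illustrated numerically (a sketch with illustrative cutoffs; a finite computation can only suggest divergence, not prove it): the partial sums behind $E[X]$ settle down, while those behind $E[X^2]$ keep growing.

```python
import math

def partial_sums(limit):
    """Partial sums of sum 1/k^2 (behind E[X]) and sum 1/k (behind E[X^2])."""
    s2 = sum(1 / k**2 for k in range(1, limit + 1))
    s1 = sum(1 / k for k in range(1, limit + 1))
    return s2, s1

for limit in (10**2, 10**4, 10**6):
    s2, s1 = partial_sums(limit)
    print(limit, round(s2, 4), round(s1, 4))
# s2 stabilizes near pi^2/6 ~ 1.6449, while s1 keeps growing like ln(limit)
```

This matches the analysis: $\sum 1/k^2$ converges to $\frac{\pi^2}{6}$, while the harmonic series grows roughly like $\ln(\text{limit})$ without bound.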
- (30 points) Alice and her younger brother Bob play chess. Alice is the better player, and wins any match with probability 0.75. They play a tournament of 200 matches, with ice-cream as the prize. What is the expected number of games that Alice wins?
  Bob wants to negotiate a deal where his chances of getting ice-cream are at least $0.9$. He can stipulate a number $X$, and demand Alice win at least $X$ games in order to win the tournament. Obviously, Bob can stipulate $X = 200$, and he will almost surely get ice-cream, but that would be cheating. What is the lowest value of $X$ where Bob can still get ice-cream with probability at least $0.9$?
Solution
Part 1: Expected Number of Games Alice Wins
Let $A$ be the number of games Alice wins. Since $A \sim \text{Binomial}(200, 0.75)$:

$$E[A] = 200 \times 0.75 = 150$$
Part 2: Finding the Lowest Value of X
Bob gets ice-cream if Alice wins fewer than $X$ games, i.e., if $A < X$.
We want to find the smallest $X$ such that $\Pr[A < X] \geq 0.9$, or equivalently $\Pr[A \geq X] \leq 0.1$.
Using the Chernoff bound, for $X = (1+\delta)\mu$ where $\mu = E[A] = 150$:

$$\Pr[A \geq (1+\delta)\cdot 150] \leq e^{-\frac{\delta^2}{2+\delta}\cdot 150}$$

We want $e^{-\frac{\delta^2}{2+\delta}\cdot 150} \leq 0.1$, which requires:

$$\frac{\delta^2}{2+\delta}\cdot 150 \geq \ln 10 \approx 2.303$$

Solving the quadratic $150\delta^2 - 2.303\delta - 4.605 \geq 0$ gives $\delta \geq 0.183$, so $X \geq 150(1 + 0.183) \approx 177.5$, i.e., $X = 178$.
With $X = 178$: $\delta = \frac{178 - 150}{150} \approx 0.187$.
We verify: $e^{-\frac{(0.187)^2}{2 + 0.187}\cdot 150} \approx e^{-2.39} \approx 0.092 \leq 0.1$.
Therefore, the lowest value Bob can stipulate is $X = 178$.
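The bound values can be sanity-checked numerically (an illustrative sketch; the helper names are mine). Comparing the Chernoff bound with the exact binomial tail also shows how conservative the bound is:

```python
import math

n, p = 200, 0.75
mu = n * p  # expected number of Alice's wins: 150

def chernoff_bound(x):
    """Chernoff upper bound on P(A >= x) for A ~ Binomial(n, p), with x > mu."""
    delta = x / mu - 1
    return math.exp(-(delta**2 / (2 + delta)) * mu)

def exact_tail(x):
    """Exact P(A >= x) for A ~ Binomial(n, p)."""
    return sum(math.comb(n, k) * p**k * (1 - p)**(n - k) for k in range(x, n + 1))

print(chernoff_bound(178))  # about 0.092, just under the required 0.1
print(chernoff_bound(177))  # above 0.1, so X = 177 is not enough for this bound
print(exact_tail(178))      # much smaller: the Chernoff bound is conservative
```

Because Chernoff is only an upper bound, the exact binomial tail at $X = 178$ is far below $0.1$; the exam answer $178$ is the smallest $X$ certified by this particular bound, not by the exact distribution.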
- (20 points) What is the main advantage/disadvantage of using Cuckoo Hashing over Hashing with Chaining? What is the main advantage/disadvantage of using Hashing with Chaining over FKS hashing?