Skip to content

CSCI 328 Midterm Example 2

No calculators allowed (if I feel that you have used a calculator, you get minus 20 points). The exam has 5 questions, adding up to 110 points. Max score is 100. Please do the problems in order.

Chernoff Bound: Let {Xi}i=1n\{X_i\}_{i=1}^n be i.i.d. Bernoulli random variables and X=i=1nXiX = \sum_{i=1}^n X_i. Let μ=E[X]\mu = E[X]. Then for any 0<δ10 < \delta \le 1:

Pr(X(1+δ)μ)eμδ2/3Pr(X \ge (1+\delta)\mu) \le e^{-\mu\delta^2/3}

Here e=2.71828e = 2.71828\ldots

  1. (9 points) We roll two standard six-sided dice. Find the probability of the following events, assuming the outcomes of the dice are independent.
    • Two dice show the same number
    • The sum of the dice is even
    • The product of the dice is a perfect square

  1. (11 points) Define a random variable XX that takes the values {2,1,0,1,2}\{-2, -1, 0, 1, 2\} with probabilities {1/6,1/4,p,1/4,1/6}\{1/6, 1/4, p, 1/4, 1/6\}, respectively. For what value of pp is XX well-defined? Calculate the expectation and variance of XX with this value of pp.

  1. (30 points) Alice and her younger brother Bob play chess. Alice is the better player, and wins any match with probability 0.7. They play a tournament of 80 matches, with ice-cream as the prize. However, unlike a regular tournament, Bob has negotiated a deal with their mom where he only needs to win at least 36 matches in order to win the tournament. Show that Bob still has a less than one-in-seven chance of getting the ice cream. [Hint: At the last step, use the numerical value of ee given above.]

    Solution

    Let’s set up the problem carefully. Bob needs to win at least 36 matches, and we want to find the probability that he succeeds.

    Setting up the model:

    Let XX denote the number of matches Bob wins. Since:

    • Alice wins each match with probability 0.7
    • Bob wins each match with probability 10.7=0.31 - 0.7 = 0.3
    • They play 80 independent matches

    We can write XX as a sum of i.i.d. Bernoulli(0.3) random variables:

    X=i=180XiXiBernoulli(0.3)\begin{align*} X &= \sum_{i=1}^{80} X_i \\ X_i &\sim \text{Bernoulli}(0.3) \end{align*}

    The expected number of matches Bob wins is:

    μ=E[X]=800.3=24\mu = E[X] = 80 \cdot 0.3 = 24

    Notice that Bob’s target (36 matches) is significantly higher than the expected value (24 matches). This suggests the probability is quite small.

    Applying the Chernoff bound:

    The Chernoff bound allows us to bound P(Xa)P(X \geq a) exponentially. We need to express 36 in the standard form (1+δ)μ(1 + \delta)\mu:

    36=(1+δ)2436 = (1 + \delta) \cdot 24

    Solving for δ\delta:

    1+δ=3624=32=1.5δ=0.5\begin{align*} 1 + \delta &= \frac{36}{24} = \frac{3}{2} = 1.5 \\ \delta &= 0.5 \end{align*}

    By the Chernoff bound: P(X(1+δ)μ)eμδ2/3P(X \geq (1+\delta)\mu) \leq e^{-\mu\delta^2/3}

    P(X36)eμδ2/3=e24(0.5)2/3=e240.25/3=e2\begin{align*} P(X \geq 36) &\leq e^{-\mu\delta^2/3} \\ &= e^{-24 \cdot (0.5)^2/3} \\ &= e^{-24 \cdot 0.25/3} \\ &= e^{-2} \end{align*}

    Computing the final answer:

    Now we use the numerical value e=2.71828e = 2.71828\ldots given in the problem:

    e2=1e2=1(2.71828)217.389e^{-2} = \frac{1}{e^2} = \frac{1}{(2.71828)^2} \approx \frac{1}{7.389}

    Since 17.389<17\frac{1}{7.389} < \frac{1}{7}, we have P(X36)<17P(X \geq 36) < \frac{1}{7}.

    Therefore, Bob has less than a one-in-seven chance of winning the tournament.

  2. (30 points) A certain component is critical to the operation of an electrical system and must be replaced immediately upon failure. If the mean lifetime of this type of component is 100 hours and its standard deviation is 30 hours, how many of these components must be in stock so that the probability that the system is in continual operation for the next 2000 hours is at least 0.95? [Hint: Let NN be the number of components required. Define NN random variables XiX_i denoting how long component ii lasts, and use Chebyshev]

    Solution

    Setting up the Model

    Let’s think about what we’re trying to achieve. We have components with lifetimes that vary, and we need enough of them so that with probability at least 0.95, they can run the system continuously for 2000 hours.

    Let XiX_i denote the lifetime of component ii. We’re given:

    • E[Xi]=100E[X_i] = 100 hours (average component lifetime)
    • σ=std(Xi)=30\sigma = \text{std}(X_i) = 30 hours (variability in lifetime)

    If we keep NN components on hand, the total operating time is SN=i=1NXiS_N = \sum_{i=1}^N X_i. We want:

    P(SN2000)0.95P(S_N \geq 2000) \geq 0.95

    Equivalently, we need P(SN<2000)0.05P(S_N < 2000) \leq 0.05.

    Computing the Mean and Variance

    Assuming the component lifetimes are independent:

    E[SN]=N100=100NVar(SN)=N302=900N\begin{align*} E[S_N] &= N \cdot 100 = 100N \\ \text{Var}(S_N) &= N \cdot 30^2 = 900N \end{align*}

    Notice that if NN is large enough such that E[SN]=100N2000E[S_N] = 100N \gg 2000, then it’s unlikely we’ll fall short. The question is: how large must NN be?

    Applying Chebyshev's Inequality

    The event "SN<2000S_N < 2000" means we fall short of our target. This happens when SNS_N deviates below its mean by more than 100N2000100N - 2000 hours. By Chebyshev’s inequality:

    P(SN100N100N2000)Var(SN)(100N2000)2=900N(100N2000)2\begin{align*} P(|S_N - 100N| \geq 100N - 2000) &\leq \frac{\text{Var}(S_N)}{(100N - 2000)^2} \\ &= \frac{900N}{(100N - 2000)^2} \end{align*}

    For this to be at most 0.05, we need:

    900N(100N2000)20.05\frac{900N}{(100N - 2000)^2} \leq 0.05

    Rearranging:

    900N0.05(100N2000)218000N(100N2000)218000N10000N2400000N+4000000010000N2418000N+40000000N241.8N+400\begin{align*} 900N &\leq 0.05(100N - 2000)^2 \\ 18000N &\leq (100N - 2000)^2 \\ 18000N &\leq 10000N^2 - 400000N + 4000000 \\ 0 &\leq 10000N^2 - 418000N + 4000000 \\ 0 &\leq N^2 - 41.8N + 400 \end{align*}

    Solving the Quadratic

    Using the quadratic formula:

    N=41.8±(41.8)24(400)2=41.8±1747.2416002=41.8±147.24241.8±12.132\begin{align*} N &= \frac{41.8 \pm \sqrt{(41.8)^2 - 4(400)}}{2} \\ &= \frac{41.8 \pm \sqrt{1747.24 - 1600}}{2} \\ &= \frac{41.8 \pm \sqrt{147.24}}{2} \\ &\approx \frac{41.8 \pm 12.13}{2} \end{align*}

    This gives us N26.97N \approx 26.97 or N14.84N \approx 14.84.

    Since the parabola N241.8N+400N^2 - 41.8N + 400 opens upward, the inequality N241.8N+4000N^2 - 41.8N + 400 \leq 0 is satisfied when 14.84N26.9714.84 \leq N \leq 26.97. But wait—that’s not what we want! We need N241.8N+4000N^2 - 41.8N + 400 \geq 0, so N14.84N \leq 14.84 or N26.97N \geq 26.97.

    Since we also need 100N>2000100N > 2000 (i.e., N20N \geq 20) just to have a positive expected total, we must have N26.97N \geq 26.97. Rounding up to the nearest integer: N27N \geq 27.

    Verification

    Let’s verify with N=27N = 27:

    • E[S27]=2700E[S_{27}] = 2700 hours
    • Var(S27)=90027=24300\text{Var}(S_{27}) = 900 \cdot 27 = 24300
    • The deviation from 2000 is: 27002000=7002700 - 2000 = 700 hours

    By Chebyshev:

    P(S27<2000)243007002=243004900000.0495<0.05\begin{align*} P(S_{27} < 2000) &\leq \frac{24300}{700^2} \\ &= \frac{24300}{490000} \\ &\approx 0.0495 < 0.05 \end{align*}

    Therefore, we need at least 27 components in stock to ensure 95% probability of 2000-hour operation.

  3. (20 points) What is the main advantage of Cuckoo Hashing over Hashing with Chaining? What is the main advantage of Hashing with Chaining over FKS hashing?