Lecture 7 on 02/18/2026 - Chernoff Bound and Birthday Paradox
Scribe: Mauricio Monje
Summary of the Lecture
- Recap FKS Hashing and HWC Analysis
- The Chernoff Bound
- The Birthday Paradox
- Balls and Bins Framework
Birthday Paradox
Problem
In a room with 24 people, what is the probability that some two people share the same birthday?
At first, this might seem unlikely. After all, there are 365 possible days, so intuitively you might think you’d need many more people for a collision. However, when you actually compute this probability, the result is surprisingly high - this is why it’s called the “paradox.” It’s not truly a mathematical paradox, but rather a result that contradicts most people’s intuition.
Approach: Using the Complement
When an event’s probability is difficult to compute directly, we can instead compute the probability of its complement (the opposite event) and subtract from 1. The complement of “some 2 people share a birthday” is “no 2 people share a birthday,” which means everyone has a different birthday. This complementary event is actually much easier to count.
Solution
We set up the calculation using the complement:
$$\Pr[\text{some two people share a birthday}] = 1 - \Pr[\text{no two people share a birthday}]$$
To compute the probability of the complement, we count favorable outcomes over total possible outcomes.
Counting total possible outcomes: Each of the 24 people independently has any of 365 possible birthdays, so there are $365^{24}$ total configurations.
Counting favorable outcomes: For everyone to have different birthdays:
- The first person can have any of 365 days
- The second person can have any of the remaining 364 days
- The third person can have any of the remaining 363 days
- And so on…
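This gives us $365 \cdot 364 \cdot 363 \cdots 342$ favorable outcomes, which can also be written as $\frac{365!}{341!}$.
Therefore:
$$\Pr[\text{no two people share a birthday}] = \frac{365!}{341! \cdot 365^{24}} \approx 0.4616$$
$$\Pr[\text{some two people share a birthday}] \approx 1 - 0.4616 \approx 0.538$$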
What this means: In a room with just 24 people, there is more than a 50% chance that some two will share a birthday. This is the surprising result that motivates the name “birthday paradox.”
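The counting argument above can be checked numerically. The following is a minimal sketch, assuming 365 equally likely birthdays; the function name `prob_shared_birthday` is made up for illustration.

```python
from math import perm

def prob_shared_birthday(people: int, days: int = 365) -> float:
    """Exact probability that at least two of `people` share a birthday."""
    if people > days:
        return 1.0  # more people than days forces a repeat
    # perm(days, people) = days * (days - 1) * ... * (days - people + 1)
    favorable = perm(days, people)   # everyone has a different birthday
    total = days ** people           # all possible birthday assignments
    return 1 - favorable / total

print(prob_shared_birthday(24))  # roughly 0.538
```

Using exact integers for the counts and dividing once at the end avoids accumulating rounding error across the 24 factors.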
Balls and Bins
Motivation: Scheduling Problem
Imagine you are a scheduler with 10 machines. People come to you with different tasks (codes they want to run), and your job is to assign a machine to each task. These are called scheduling problems: you have a certain number of jobs and a certain number of machines, and you want to assign jobs to machines to finish them in the least amount of time (or optimize some other objective).
One of the simplest approaches is random scheduling: when someone comes with a task, you just randomly pick one of your machines and assign the task to it.
To analyze these randomized algorithms, we use a framework called balls and bins. This framework is also useful for understanding hashing algorithms, since a hash function takes keys and hashes them into cells of a hash table, where all cells are equally likely.
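As an illustration of random scheduling, here is a small sketch that assigns tasks to machines uniformly at random and reports the load on each machine. The function name `random_schedule` and the fixed seed are assumptions made for this example, not part of the lecture.

```python
import random
from collections import Counter

def random_schedule(num_tasks: int, num_machines: int, seed: int = 0) -> Counter:
    """Assign each task to a uniformly random machine; return tasks per machine."""
    rng = random.Random(seed)  # seeded only so the run is reproducible
    loads = Counter()
    for _ in range(num_tasks):
        machine = rng.randrange(num_machines)  # every machine equally likely
        loads[machine] += 1
    return loads

loads = random_schedule(num_tasks=10, num_machines=10)
print("tasks per machine:", dict(loads))
print("busiest machine holds", max(loads.values()), "tasks")
```

Here tasks play the role of balls and machines play the role of bins, which is why the same analysis applies to both.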
The Balls and Bins Framework
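When we throw $n$ balls into $m$ bins, each ball landing independently in a uniformly random bin, the probability that no bin has 2 or more balls is:
$$\Pr[\text{no collision}] = 1 \cdot \left(1 - \frac{1}{m}\right)\left(1 - \frac{2}{m}\right) \cdots \left(1 - \frac{n-1}{m}\right) = \prod_{i=1}^{n-1}\left(1 - \frac{i}{m}\right)$$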
Understanding Each Term: Let’s think through this product carefully. For no collisions to occur, each ball must land in an empty bin.
- First ball: Can go anywhere, so its “probability of success” (going into an empty bin) is 1.
- Second ball: The first ball occupies 1 bin. So there are $m - 1$ empty bins out of $m$ total. The probability it avoids collision is $\frac{m-1}{m} = 1 - \frac{1}{m}$.
- Third ball: The first two balls occupy 2 different bins. So there are $m - 2$ empty bins out of $m$ total. The probability it avoids collision is $\frac{m-2}{m} = 1 - \frac{2}{m}$.
- $k$-th ball: The first $k - 1$ balls occupy $k - 1$ different bins. So there are $m - (k - 1)$ empty bins. The probability it avoids collision is $1 - \frac{k-1}{m}$.
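To have no collisions at all, every one of these events must happen. Each probability above is conditional on all earlier balls having landed in distinct bins, so by the chain rule the probability of no collision is the product of these terms.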
Note: This formula assumes $n \le m$ (the number of balls does not exceed the number of bins). If $n > m$, then by the pigeonhole principle, we are guaranteed to have at least one collision, so $\Pr[\text{no collision}] = 0$.
Connection to Birthday Paradox: If we relate people to balls and birthdays to bins, then throwing 24 balls into 365 bins gives us the birthday paradox. Here, 365 corresponds to the number of days in a year, and we’re asking for the probability that no date gets assigned more than one person, which is the same as everyone having a different birthday. When we plug in $n = 24$ and $m = 365$ into the product above, we get the same 0.4616 that we calculated earlier for the birthday paradox.
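The no-collision product can be evaluated directly with a short loop. This is a sketch under the assumption that balls land independently and uniformly at random; the name `prob_no_collision` and the use of `n` for balls and `m` for bins are choices made for this example.

```python
def prob_no_collision(n: int, m: int) -> float:
    """Probability that n balls thrown uniformly into m bins land in distinct bins."""
    if n > m:
        return 0.0  # pigeonhole: some bin must hold at least two balls
    p = 1.0
    for i in range(1, n):   # i bins are already occupied when ball i + 1 is thrown
        p *= 1 - i / m      # chance that ball i + 1 lands in an empty bin
    return p

print(prob_no_collision(24, 365))  # about 0.46, matching the birthday calculation
```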
Simplifying Products with the Constant $e$
The product we have contains many terms like $1 - \frac{1}{m}$, $1 - \frac{2}{m}$, etc. To make this more manageable, we use a fundamental mathematical constant called $e$.
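Approximation: For large values of $m$, we have the relationship:
$$1 - \frac{1}{m} \approx e^{-1/m}$$
This approximation comes from the well-known mathematical limits:
$$\lim_{m \to \infty}\left(1 + \frac{1}{m}\right)^m = e \qquad \text{and} \qquad \lim_{m \to \infty}\left(1 - \frac{1}{m}\right)^m = \frac{1}{e}$$
We won’t prove this here, but this is a fundamental result in calculus. The intuition is that when you have a term like $1 - \frac{1}{m}$ and $m$ is large, it’s very close to $e^{-1/m}$. The same reasoning gives $1 - \frac{i}{m} \approx e^{-i/m}$ as long as $i$ is small compared to $m$.
Why this helps: When we convert each term like $1 - \frac{i}{m}$ into $e^{-i/m}$, the product becomes much easier to work with. Instead of multiplying many complicated terms, we can add exponents.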
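To get a feel for how good the approximation is, the sketch below prints the difference between $e^{-1/m}$ and $1 - \frac{1}{m}$ for a few values of $m$. The helper name `gap` and the sample values are chosen only for illustration.

```python
import math

def gap(m: int) -> float:
    """Difference between e^(-1/m) and 1 - 1/m."""
    return math.exp(-1 / m) - (1 - 1 / m)

for m in (10, 100, 365, 10_000):
    print(f"m={m:>6}  1-1/m={1 - 1 / m:.6f}  e^(-1/m)={math.exp(-1 / m):.6f}  gap={gap(m):.2e}")
```

The gap shrinks roughly like $\frac{1}{2m^2}$, so it is already tiny for $m = 365$.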
Applying the Approximation to Balls and Bins
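Now we apply our approximation to the product formula. We start with the probability of no collision for $n$ balls and $m$ bins:
$$\Pr[\text{no collision}] = \prod_{i=1}^{n-1}\left(1 - \frac{i}{m}\right)$$
Step 1: Replace each term using our approximation $1 - \frac{i}{m} \approx e^{-i/m}$:
$$\Pr[\text{no collision}] \approx \prod_{i=1}^{n-1} e^{-i/m}$$
Step 2: Combine exponentials by adding exponents (since $e^a \cdot e^b = e^{a+b}$):
$$\Pr[\text{no collision}] \approx e^{-\frac{1}{m}\sum_{i=1}^{n-1} i}$$
Step 3: Simplify the sum in the exponent. We know that $\sum_{i=1}^{n-1} i = \frac{n(n-1)}{2}$. For large $n$, this is approximately $\frac{n^2}{2}$:
$$\Pr[\text{no collision}] \approx e^{-\frac{n^2}{2m}}$$
Step 4: Find the probability of at least one collision by subtracting from 1:
$$\Pr[\text{at least one collision}] \approx 1 - e^{-\frac{n^2}{2m}}$$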
What this gives us: Instead of laboriously calculating the product for specific values, we now have a simple closed-form approximation. This is much faster to use and reveals the essential behavior of the system.
Sanity Check: Comparing with the Birthday Paradox
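Let’s verify our approximation works by checking it against the birthday paradox numbers we calculated earlier.
For the birthday paradox, we have $n = 23$ people and $m = 365$ days. Let’s compute:
$$\Pr[\text{at least one collision}] \approx 1 - e^{-\frac{n^2}{2m}}$$
Substituting our values:
$$1 - e^{-\frac{23^2}{2 \cdot 365}} = 1 - e^{-\frac{529}{730}}$$
Evaluating $e^{-529/730} \approx 0.484$, we get:
$$\Pr[\text{at least one collision}] \approx 1 - 0.484 = 0.516$$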
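Comparison: Earlier we calculated the exact probability for 24 people to be about 54%. Our formula here with 23 people gives about 51.6%. For a like-for-like check, the exact probability for 23 people is about 50.7%, and the formula with 24 people gives about 54.6%, so in both cases the approximation is within one percentage point of the exact value. This shows that our approximation using $e$ captures the behavior of the birthday paradox very well, and the approximation is accurate even for moderate values of $n$ and $m$.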
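The comparison can be reproduced side by side. This is a minimal sketch that evaluates the exact product and the exponential approximation for the same inputs; the function names `exact_collision` and `approx_collision` are made up for this example.

```python
import math

def exact_collision(n: int, m: int) -> float:
    """Exact collision probability from the product of (1 - i/m) terms."""
    p_none = 1.0
    for i in range(1, n):
        p_none *= 1 - i / m
    return 1 - p_none

def approx_collision(n: int, m: int) -> float:
    """Approximate collision probability 1 - exp(-n^2 / (2m))."""
    return 1 - math.exp(-n * n / (2 * m))

for n in (23, 24):
    exact = exact_collision(n, 365)
    approx = approx_collision(n, 365)
    print(f"n={n}  exact={exact:.3f}  approx={approx:.3f}  diff={approx - exact:.3f}")
```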
Inverting the Question: How Many People Do We Need?
So far we’ve been given the number of people and calculated the probability. Now let’s flip it: How many people do we need for an 80% chance of a collision?
With $n$ unknown and $m = 365$, we set up:
$$\Pr[\text{at least one collision}] = 0.8$$
Using our formula:
$$1 - e^{-\frac{n^2}{730}} = 0.8$$
Step 1: Isolate the exponential. Subtract 1 from both sides and multiply by -1:
$$e^{-\frac{n^2}{730}} = 0.2$$
Step 2: Take the natural logarithm. This helps us get $n$ out of the exponent:
$$-\frac{n^2}{730} = \ln(0.2)$$
Step 3: Solve for $n^2$. Multiply both sides by -730:
$$n^2 = -730 \ln(0.2)$$
(Note: $\ln(0.2) \approx -1.6094$, so $n^2 \approx 1174.9$)
Taking the square root:
$$n \approx \sqrt{1174.9} \approx 34.3$$
Conclusion: Since we need a whole number of people, we need 35 people in a room to have approximately an 80% chance that two will share a birthday.
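The same inversion works for any target probability. The sketch below solves $1 - e^{-n^2/(2m)} \ge p$ for the smallest whole number $n$; the function name `people_needed` is an assumption for this example, and the result comes from the approximation, not the exact product.

```python
import math

def people_needed(target: float, m: int = 365) -> int:
    """Smallest n with 1 - exp(-n^2 / (2m)) >= target, using the approximation."""
    if not 0 < target < 1:
        raise ValueError("target must be strictly between 0 and 1")
    n_squared = -2 * m * math.log(1 - target)  # from n^2 = -2m * ln(1 - target)
    return math.ceil(math.sqrt(n_squared))     # round up to a whole person

print(people_needed(0.8))  # 35
print(people_needed(0.5))  # 23
```

Because $n$ grows like $\sqrt{m}$, a modest number of people is enough for a collision to become likely even when the number of days is large.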