6.1 An applied scenario — one second of vibration samples
Back to the motor from the last chapter. The accelerometer streams 1 000 samples per second, and your threshold rule fires a 1 whenever a sample exceeds +2\,\mathrm{g}. On a healthy machine, calibration gave you p = 0.005 (1 in 200 samples is a transient).
You decide to summarise the stream one second at a time: count how many 1s show up in each 1 000-sample window. That count is the alarm metric the operator sees.
A few obvious questions:
What count should you expect on a healthy machine?
How much can the count drift from one second to the next without anything actually being wrong?
If you see a count of 20 in one window, is that worrying — or just normal variation?
Each window is n = 1000 independent Bernoulli trials with p = 0.005. The thing you’re counting — successes out of n trials — has a name.
6.2 Intuition
If a Bernoulli trial is one coin flip, the Binomial distribution answers: “If I run n independent trials, each with success probability p, how many successes will I get?”
Vibration window: in 1 000 samples, each with probability p = 0.005 of crossing the threshold, how many crossings k?
Manufacturing batch: 200 units, each defective with probability p = 0.02, how many defects k?
Photon counting: n photons strike a sensor and each has probability p = \text{QE} of being detected, how many electrons k?
In every case the structure is identical: n independent Bernoulli trials with the same p, count the successes.
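The shared structure can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the helper name `count_successes` is made up here, and the photon-counting numbers (500 photons, QE = 0.6) are assumed values chosen only for the example.

```python
import random

def count_successes(n, p, rng):
    """Run n independent Bernoulli(p) trials and return how many succeed."""
    return sum(1 for _ in range(n) if rng.random() < p)

rng = random.Random(0)  # fixed seed so the run is repeatable

# (label, n, p). The photon-counting values are illustrative assumptions.
scenarios = [
    ("vibration window", 1000, 0.005),
    ("manufacturing batch", 200, 0.02),
    ("photon counting", 500, 0.6),
]

# Same function, three different physical stories.
counts = {label: count_successes(n, p, rng) for label, n, p in scenarios}
for label, k in counts.items():
    print(f"{label}: k = {k}")
```

Only n and p change between the scenarios; the counting code is identical, which is the point of the Binomial model.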
6.3 The math
P(k \mid n, p) \;=\; \binom{n}{k} p^k (1-p)^{n-k}
where \binom{n}{k} = \frac{n!}{k!(n-k)!} is the number of ways to choose which k out of n trials succeed.
Mean: E[k] = np
Variance: \operatorname{Var}(k) = np(1-p)
Breaking the formula down:
\binom{n}{k} — how many ways exactly k trials can succeed out of n total
p^k — probability that those k trials all succeed
(1-p)^{n-k} — probability that the remaining n-k trials all fail
multiply: total probability of exactly k successes
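The breakdown above maps one-to-one onto code. The sketch below uses only the Python standard library; the function name `binom_pmf` is chosen here for illustration, and the manufacturing batch (n = 200, p = 0.02) from the intuition section serves as the test case.

```python
import math

def binom_pmf(k, n, p):
    """P(exactly k successes in n independent trials, each with probability p)."""
    ways = math.comb(n, k)        # which k of the n trials succeed
    p_success = p ** k            # those k trials all succeed
    p_fail = (1 - p) ** (n - k)   # the other n - k trials all fail
    return ways * p_success * p_fail

# Manufacturing batch: 200 units, each defective with probability 0.02.
n, p = 200, 0.02
for k in range(9):
    print(f"P(k = {k}) = {binom_pmf(k, n, p):.4f}")

# A valid PMF must sum to 1 over all possible counts.
total = sum(binom_pmf(k, n, p) for k in range(n + 1))
print(f"sum over all k = {total:.12f}")
```

Multiplying the factors directly is fine at this scale. For much larger n the binomial coefficient can exceed what a float can hold, so working in log space is the more robust choice.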
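Note (show the math): E[k] = np from the Bernoulli sum
A Binomial random variable is just the sum of n independent Bernoulli(p) trials:
k = X_1 + X_2 + \cdots + X_n, \quad X_i \sim \text{Bernoulli}(p)
Each trial has E[X_i] = p, and expectation is linear, so
E[k] = \sum_{i=1}^{n} E[X_i] = np
Each trial has \operatorname{Var}(X_i) = p(1-p), and variances of independent variables add, so
\operatorname{Var}(k) = \sum_{i=1}^{n} \operatorname{Var}(X_i) = np(1-p)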
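As a numerical sanity check, the closed forms np and np(1-p) can be compared against the mean and variance computed directly from the PMF. This is a sketch using the motor parameters; `binom_pmf` is a helper defined here, not a library function.

```python
import math

def binom_pmf(k, n, p):
    """Binomial PMF evaluated straight from the formula."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 1000, 0.005
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

# Moments by brute-force summation over every possible count.
mean = sum(k * pk for k, pk in enumerate(pmf))
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))

print(f"mean from PMF     = {mean:.6f}   closed form np      = {n * p:.6f}")
print(f"variance from PMF = {var:.6f}   closed form np(1-p) = {n * p * (1 - p):.6f}")
```

The two columns should agree up to floating-point rounding, which is what the derivation predicts.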
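Multiplying the three factors of the PMF by hand once makes the formula concrete. scipy.stats.binom.pmf computes the same probability, so a hand-rolled version should agree with it to within floating-point error.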
6.5 Back to the vibration sensor
For the motor:
n = 1000, \quad p = 0.005 \;\Rightarrow\; E[k] = np = 5, \quad \operatorname{Var}(k) = np(1-p) = 4.975, \quad \sigma = \sqrt{4.975} \approx 2.23
So on a healthy machine you should see about 5 crossings per second, with a typical fluctuation of about \pm 2. A count of 7 is unremarkable. A count of 20 sits (20 - 5)/2.23 \approx 6.7\sigma above the mean: that is not normal variation but a state change worth investigating.
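The sigma argument can be backed up with exact tail probabilities. The sketch below sums the PMF directly; `binom_pmf` and `tail_prob` are helper names introduced here for illustration.

```python
import math

def binom_pmf(k, n, p):
    """Binomial PMF evaluated straight from the formula."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 1000, 0.005
mean = n * p
sigma = math.sqrt(n * p * (1 - p))

def tail_prob(threshold):
    """P(k >= threshold) for one healthy window, summed from the PMF."""
    return sum(binom_pmf(k, n, p) for k in range(threshold, n + 1))

for count in (7, 20):
    z = (count - mean) / sigma
    print(f"count {count}: z = {z:.2f}, P(k >= {count}) = {tail_prob(count):.3g}")
```

A count of 7 or more should turn up in a sizeable fraction of healthy windows, while 20 or more should come out below one in a million. Because the distribution is right-skewed, that exact tail is noticeably larger than a Gaussian 6.7\sigma tail would suggest, so the sigma count is best read as a rough guide rather than a precise probability.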
Top row: fix p = 0.7, vary n. Distribution widens and becomes more bell-shaped (CLT preview). Bottom row: fix n = 30, vary p. Peak shifts; symmetric at p = 0.5.
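The same trends can be read off numerically without a plot. In this sketch the mode, standard deviation, and skewness are computed from the PMF itself. The n = 30 and p = 0.7 settings come from the caption; the remaining n and p values are assumptions picked for illustration, since the figure's exact panels are not listed here.

```python
import math

def binom_pmf(k, n, p):
    """Binomial PMF evaluated straight from the formula."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def shape_summary(n, p):
    """Return (mode, standard deviation, skewness) computed from the PMF."""
    pmf = [binom_pmf(k, n, p) for k in range(n + 1)]
    mean = sum(k * pk for k, pk in enumerate(pmf))
    var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))
    third = sum((k - mean) ** 3 * pk for k, pk in enumerate(pmf))
    mode = max(range(n + 1), key=lambda k: pmf[k])
    return mode, math.sqrt(var), third / var ** 1.5

print("fixed p = 0.7, varying n (n values are illustrative)")
for n in (5, 30, 100):
    mode, sd, skew = shape_summary(n, 0.7)
    print(f"  n = {n:3d}: mode = {mode:3d}, sd = {sd:.2f}, skewness = {skew:+.3f}")

print("fixed n = 30, varying p (p values are illustrative)")
for p in (0.1, 0.5, 0.9):
    mode, sd, skew = shape_summary(30, p)
    print(f"  p = {p:.1f}: mode = {mode:3d}, sd = {sd:.2f}, skewness = {skew:+.3f}")
```

With p fixed, the standard deviation grows and the skewness shrinks toward zero as n increases, which is the widening and bell-shaping described in the caption. With n fixed, the mode moves with p and the skewness vanishes at p = 0.5.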
For the motor case, p = 0.005 and n = 1000: the distribution is right-skewed and concentrated at small counts. That regime, with many trials that each rarely succeed, is exactly where the Binomial is closely approximated by the Poisson distribution (next chapter).
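As a preview of that connection, the sketch below compares the Binomial PMF for the motor case with a Poisson PMF of rate \lambda = np = 5. The Poisson formula e^{-\lambda} \lambda^k / k! is standard but is not derived in this chapter, so treat this as a look-ahead rather than part of the argument.

```python
import math

def binom_pmf(k, n, p):
    """Binomial PMF evaluated straight from the formula."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

def poisson_pmf(k, lam):
    """Poisson PMF with rate lam (covered properly in the next chapter)."""
    return math.exp(-lam) * lam**k / math.factorial(k)

n, p = 1000, 0.005
lam = n * p  # match the means

print(" k   binomial   poisson")
for k in range(13):
    print(f"{k:2d}   {binom_pmf(k, n, p):.5f}    {poisson_pmf(k, lam):.5f}")

# Largest pointwise disagreement over the counts that carry the probability mass.
max_gap = max(abs(binom_pmf(k, n, p) - poisson_pmf(k, lam)) for k in range(41))
print(f"largest gap for k = 0..40: {max_gap:.2e}")
```

The two columns should be hard to tell apart at this n and p.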
6.7 Simulation — repeated windows
Simulating many 1-second windows is the empirical version of the formula. Each simulated window is one independent set of n Bernoulli trials.
Simulation histogram (bars) hugs the theoretical Binomial PMF (line); residual differences shrink as the number of simulated windows grows (law of large numbers).
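A minimal version of that experiment, using only the standard library, might look like the following. The number of windows (2 000) and the seed are arbitrary assumptions chosen to keep the pure-Python run short; more windows tighten the match.

```python
import math
import random
from collections import Counter

def binom_pmf(k, n, p):
    """Binomial PMF evaluated straight from the formula."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 1000, 0.005
n_windows = 2000           # assumed value; increase for a closer match
rng = random.Random(42)    # fixed seed so the run is repeatable

# Each window: n independent Bernoulli(p) samples, reduced to a count.
counts = [sum(1 for _ in range(n) if rng.random() < p) for _ in range(n_windows)]
freq = Counter(counts)
sim_mean = sum(counts) / n_windows

print(f"simulated mean count = {sim_mean:.3f} (theory: {n * p})")
print(" k   simulated   theory")
for k in range(13):
    print(f"{k:2d}   {freq[k] / n_windows:.4f}      {binom_pmf(k, n, p):.4f}")
```

The simulated column is the histogram from the figure in table form, and the theory column is the PMF line it should track.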
Tip: Why simulate when we have the formula?
Simulation validates the theory and builds intuition. If simulation and formula disagree, one of them is wrong. That debugging technique survives all the way into deep learning, where closed-form answers often stop existing and Monte Carlo becomes the main tool.
6.8 Where else binomial counts appear
| Domain | n | p | Count k |
|---|---|---|---|
| Vibration monitoring | Samples per window | P(threshold crossing) | Crossings per window |
| Quality control | Units per batch | P(defective) | Defects per batch |
| A/B testing | Visitors in a bucket | P(conversion) | Conversions per bucket |
| Image thresholding | Pixels in a patch | P(intensity > T) | Bright pixels per patch |
| Photon counting | Incident photons | Quantum efficiency | Detected electrons |
The formula doesn’t care what the trial is — only that the trials are independent and share the same p. When p varies across trials or trials aren’t independent, you need a different model.
6.9 Exercises
Manually compute P(k = 3) for n = 10, p = 0.4. Confirm against scipy.stats.binom.pmf.
For n = 1000 and p = 0.005, simulate 10\,000 windows. What fraction of windows produce a count \geq 20?
Plot Binomial PMFs for p = 0.5 at n = 5, 20, 100, 500. Watch the convergence to a bell shape.
Show by simulation that mean and variance match np and np(1-p) to within Monte Carlo error.
6.10 Glossary
Binomial distribution — count of successes in n independent Bernoulli trials with constant p.
n — number of trials.
p — success probability per trial.
k — number of successes (random).
PMF — probability mass function P(k).
CDF — cumulative distribution function P(X \leq k).