There is a great essay by Jaikumar Radhakrishnan on this topic, but here we will look at a humble connection between binomial coefficients and counting.
We often need a closed form for sums of binomial coefficients of the following form.
$$\binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{k}$$

There are three basic cases where the sum can be evaluated easily.
In general, however, closed forms for sums of the above nature, where $k$ can be arbitrary, are difficult to obtain. There is one other important case, though, where we can at least get the asymptotic order of the sum. Curiously, this estimate is related to the notion of Shannon Entropy.
When $k$ is a constant fraction of $n$, say $k = cn$ with $0 < c < 0.5$, we have the following estimate, where $H(c) = -c \log_2 c - (1-c) \log_2 (1-c)$, the Shannon Entropy of the probability distribution on the binary alphabet with $P(0) = c$, makes an unexpected appearance:

$$\sum_{i=0}^{cn} \binom{n}{i} \;=\; 2^{nH(c) + O(\log n)}$$
Note that we are summing up to at most $n/2$. The case $c > 0.5$ can be handled in a symmetric manner.
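Before deriving this, a quick numeric sanity check (a sketch of my own, not part of the argument): we can compute the partial sum directly and compare $\log_2$ of it, normalized by $n$, against $H(c)$.

```python
from math import comb, log2

def H(c):
    """Binary Shannon Entropy of the distribution (c, 1 - c), in bits."""
    return -c * log2(c) - (1 - c) * log2(1 - c)

c = 0.3  # an arbitrary constant fraction below 0.5
for n in (100, 1000, 10000):
    k = int(c * n)
    partial_sum = sum(comb(n, i) for i in range(k + 1))
    # log2(partial_sum) / n should approach H(c) as n grows
    print(n, log2(partial_sum) / n, H(c))
```

The normalized logarithm creeps toward $H(0.3) \approx 0.881$, with the gap shrinking like $O(\log n)/n$.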
Stirling's approximation says that the ratio of $n!$ to

$$\sqrt{2\pi n}\,\left(\frac{n}{e}\right)^{n}$$

tends to 1 as $n$ goes to $\infty$.
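To watch Stirling's approximation converge, here is a minimal numeric sketch (my addition, not from the original text):

```python
from math import e, factorial, pi, sqrt

# Ratio of n! to its Stirling approximation sqrt(2*pi*n) * (n/e)**n;
# the ratio should tend to 1 as n grows.
for n in (1, 5, 10, 50, 100):
    stirling = sqrt(2 * pi * n) * (n / e) ** n
    print(n, factorial(n) / stirling)
```

Already at $n = 100$ the ratio is within a tenth of a percent of 1.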
Since we have

$$\binom{n}{cn} = \frac{n!}{(cn)!\,(n - cn)!},$$

we can apply Stirling's approximation to the numerator and to each factor in the denominator separately. The powers of $n$ and of $e$ cancel, and we get
$$\frac{1}{c^{cn}\,(1-c)^{(1-c)n}\,\sqrt{2\pi c(1-c)\,n}}$$
We can rewrite this, using $x = 2^{\log_2 x}$, as
$$\frac{2^{-cn \log_2 c \;-\; (1-c)n \log_2 (1-c)}}{\sqrt{2\pi c(1-c)\,n}}$$

In other words,
$$\frac{2^{nH(c)}}{\sqrt{2\pi c(1-c)\,n}}$$

There. We have the entropy term in the numerator. This is the estimate of the largest term in the sum. Since $c < 0.5$, we know that $H(c) < 1$, and hence $nH(c) < n$.
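As a quick check of this largest-term estimate (again a sketch of my own, not from the original):

```python
from math import comb, log2, pi, sqrt

def H(c):
    """Binary Shannon Entropy of the distribution (c, 1 - c), in bits."""
    return -c * log2(c) - (1 - c) * log2(1 - c)

c = 0.3
for n in (100, 1000, 10000):
    # log2 of the exact term binom(n, cn) vs. log2 of the estimate
    exact = log2(comb(n, int(c * n)))
    estimate = n * H(c) - log2(sqrt(2 * pi * c * (1 - c) * n))
    print(n, exact, estimate)
```

The two columns agree to several digits even for modest $n$.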
The sum has at most $cn + 1 \le n$ terms, each no larger than this largest one, so the whole sum is at most

$$n \cdot \frac{2^{nH(c)}}{\sqrt{2\pi c(1-c)\,n}} \;=\; \sqrt{\frac{n}{2\pi c(1-c)}}\; 2^{nH(c)},$$

which can be upper bounded by
$$2^{n(H(c) + \epsilon)}$$

for any positive $\epsilon$, for all sufficiently large $n$.
The following method of approximating sums of contiguous binomial coefficients, well known in probability theory, can also come in handy when we do not need closed forms for sums of binomial terms and estimates will do.
The Central Limit Theorem roughly says that the suitably normalized averages of iid random variables (no matter which distribution they follow) tend in distribution to the normal distribution as $n$, the number of samples, tends to $\infty$. The excellent Galton board illustrates this.
The honeycomb structure sets up the "randomizer" which causes each ball bearing to deflect left or right at random. The balls are dropped from the center of the board. To end up in the $k$th bucket left of the central bucket, a ball must make $k$ more left deflections than right ones out of the $N$ total, where $N$ is the depth of the honeycomb; the number of ways this can happen is $\binom{N}{(N+k)/2}$. As more balls are dropped, the histogram of bucket counts approaches this binomial distribution, which in turn tends to the normal as $N$ grows.
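Here is a small simulation sketch of a Galton board (my own illustration; the depth and ball count are arbitrary choices):

```python
import random
from collections import Counter

N = 20            # depth of the honeycomb: number of deflection rows
BALLS = 100_000   # number of balls dropped

counts = Counter()
for _ in range(BALLS):
    # Each row deflects the ball left (-1) or right (+1) with equal probability;
    # the sum of deflections is the bucket, as a signed offset from the center.
    bucket = sum(random.choice((-1, 1)) for _ in range(N))
    counts[bucket] += 1

# Crude text histogram; for even N only even offsets are reachable.
for bucket in sorted(counts):
    print(f"{bucket:+4d} {'#' * (counts[bucket] // 200)}")
```

The printout shows the familiar bell shape: the binomial bucket counts hug the normal curve.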