There is a great essay by Jaikumar Radhakrishnan on this topic, but here we will look at a humble connection between binomial coefficients and counting.
We often need a closed form for sums of binomial coefficients of the following form.
$$\binom{n}{0} + \binom{n}{1} + \binom{n}{2} + \cdots + \binom{n}{k}$$

There are three basic cases where the sum can be evaluated easily.
In general, however, closed forms for sums of the above nature, where $k$ can be arbitrary, are difficult to obtain. There is one other important case, though, where we can at least get the asymptotic order of the sum. Curiously, this estimate is related to the notion of Shannon Entropy.
When $k$ is a constant fraction of $n$, say $k = cn$ with $0 < c < 0.5$, we have the following estimate, where $H(c) = -c \log_2 c - (1-c) \log_2 (1-c)$, the Shannon Entropy of the probability distribution on the binary alphabet with $P(0) = c$, makes an unexpected appearance:

$$\sum_{i=0}^{cn} \binom{n}{i} \;=\; 2^{nH(c) + O(\log n)}$$
Note that we are summing up to at most $n/2$. The case $c > 0.5$ can be handled in a symmetric manner.
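Before deriving this, a quick numeric sanity check (a sketch of my own, not part of the argument): we can compute the partial sum directly and compare $\log_2$ of it, normalized by $n$, against $H(c)$.

```python
from math import comb, log2

def H(c):
    """Binary Shannon Entropy of the distribution (c, 1 - c), in bits."""
    return -c * log2(c) - (1 - c) * log2(1 - c)

c = 0.3  # an arbitrary constant fraction below 0.5
for n in (100, 1000, 10000):
    k = int(c * n)
    partial_sum = sum(comb(n, i) for i in range(k + 1))
    # log2(partial_sum) / n should approach H(c) as n grows
    print(n, log2(partial_sum) / n, H(c))
```

The normalized logarithm creeps toward $H(0.3) \approx 0.881$, with the gap shrinking like $O(\log n)/n$.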
Stirling's approximation says that the ratio of $n!$ to

$$\sqrt{2\pi n}\,\left(\frac{n}{e}\right)^{n}$$

tends to 1 as $n$ goes to $\infty$.
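To watch Stirling's approximation converge, here is a minimal numeric sketch (my addition, not from the original text):

```python
from math import e, factorial, pi, sqrt

# Ratio of n! to its Stirling approximation sqrt(2*pi*n) * (n/e)**n;
# the ratio should tend to 1 as n grows.
for n in (1, 5, 10, 50, 100):
    stirling = sqrt(2 * pi * n) * (n / e) ** n
    print(n, factorial(n) / stirling)
```

Already at $n = 100$ the ratio is within a tenth of a percent of 1.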
Since we have

$$\binom{n}{cn} = \frac{n!}{(cn)!\,(n - cn)!},$$

we can apply Stirling's approximation to the numerator and to each factor in the denominator separately. The powers of $n$ and of $e$ cancel, and we get
$$\frac{1}{c^{cn}\,(1-c)^{(1-c)n}\,\sqrt{2\pi c(1-c)\,n}}$$
We can rewrite this, using $x = 2^{\log_2 x}$, as
$$\frac{2^{-cn \log_2 c \;-\; (1-c)n \log_2 (1-c)}}{\sqrt{2\pi c(1-c)\,n}}$$

In other words,
$$\frac{2^{nH(c)}}{\sqrt{2\pi c(1-c)\,n}}$$

There. We have the entropy term in the numerator. This is the estimate of the largest term in the sum. Since $c < 0.5$, we know that $H(c) < 1$, and hence $nH(c) < n$.
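As a quick check of this largest-term estimate (again a sketch of my own, not from the original):

```python
from math import comb, log2, pi, sqrt

def H(c):
    """Binary Shannon Entropy of the distribution (c, 1 - c), in bits."""
    return -c * log2(c) - (1 - c) * log2(1 - c)

c = 0.3
for n in (100, 1000, 10000):
    # log2 of the exact term binom(n, cn) vs. log2 of the estimate
    exact = log2(comb(n, int(c * n)))
    estimate = n * H(c) - log2(sqrt(2 * pi * c * (1 - c) * n))
    print(n, exact, estimate)
```

The two columns agree to several digits even for modest $n$.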
The sum has at most $cn + 1 \le n$ terms, each no larger than this largest one, so the whole sum is at most

$$n \cdot \frac{2^{nH(c)}}{\sqrt{2\pi c(1-c)\,n}} \;=\; \sqrt{\frac{n}{2\pi c(1-c)}}\; 2^{nH(c)},$$

which can be upper bounded by
$$2^{n(H(c) + \epsilon)}$$

for any positive $\epsilon$, for all sufficiently large $n$.
The following method of approximating sums of contiguous binomial coefficients, well known in probability theory, can also come in handy when we do not need closed forms for sums of binomial terms and estimates will do.
The Central Limit Theorem roughly says that the suitably normalized averages of iid random variables (no matter which distribution they follow) tend in distribution to the normal distribution as $n$, the number of samples, tends to $\infty$. The excellent Galton board illustrates this.
The honeycomb structure sets up the "randomizer" which causes each ball bearing to deflect left or right at random. The balls are dropped from the center of the board. To end up in the $k$th bucket left of the central bucket, a ball must make $k$ more left deflections than right ones out of the $N$ total, where $N$ is the depth of the honeycomb; the number of ways this can happen is $\binom{N}{(N+k)/2}$. As more balls are dropped, the histogram of bucket counts approaches this binomial distribution, which in turn tends to the normal as $N$ grows.
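Here is a small simulation sketch of a Galton board (my own illustration; the depth and ball count are arbitrary choices):

```python
import random
from collections import Counter

N = 20            # depth of the honeycomb: number of deflection rows
BALLS = 100_000   # number of balls dropped

counts = Counter()
for _ in range(BALLS):
    # Each row deflects the ball left (-1) or right (+1) with equal probability;
    # the sum of deflections is the bucket, as a signed offset from the center.
    bucket = sum(random.choice((-1, 1)) for _ in range(N))
    counts[bucket] += 1

# Crude text histogram; for even N only even offsets are reachable.
for bucket in sorted(counts):
    print(f"{bucket:+4d} {'#' * (counts[bucket] // 200)}")
```

The printout shows the familiar bell shape: the binomial bucket counts hug the normal curve.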