Balls in Boxes: Variations on a Theme of Warren Ewens and Herbert Wilf

By Shalosh B. Ekhad and Doron Zeilberger

.pdf   .ps   .tex
[Appeared in "Advances in Combinatorics: Waterloo Workshop in Computer Algebra, W80, May 26-29, 2011", (edited by Ilias Kotsireas and Eugene Zima), 161-174].
Written: June 13, 2011.

מ ו ק ד ש   ל ה רב ר ט   ש א ו ל   ו י ל ף   ב ה ג י ע ו   ל ג ב ו ר ו ת

Dedicated to Herbert Saul Wilf on his 80th Birthday

The wonderful conference W80 (organized by Ilias Kotsireas and Eugene Zima), celebrated, a bit early (May 26-29, 2011), Herb Wilf's 80th Birthday. One of the 29 invited speakers was Herb himself, and that was lucky, first because his talk was (as usual) great, and second, because it inspired the present note. Herb talked about two "exercises" (his word!) in combinatorial biology, that he did in collaboration with the great population geneticist Warren Ewens. One of these "exercises", written up in this article, is commented on, and elaborated, in the present article (and Maple package). In particular we meta-apply their ingenious method to show that it is not really needed, and that one is better off using the so-called Poisson Approximation, at least in applications to the real world, because extremely unlikely events mever happen in real life.

# Sample Input and Output for BallsInBoxes

• If you want to see a web-book with asymptotic formulas for the probability, upon placing a times n balls in b times n boxes, of no box containing more than m balls for
1 ≤ a,b ≤ 2 and 2 ≤ m ≤ 7,
the input file yields the output file.

• If you want to see linear recurrences (with polynomial coefficients) for the number of ways of placing a times n balls in b times n boxes, such that no box contains more than m balls for
1 ≤ a,b ≤ 3 and 2 ≤ m ≤ 6,
the input file gives the output file.

• If you place 2n balls in n boxes, and you you want to get an asymptotic formula (in n), to order 4, for the probability that no box contains more than 6 balls, then
the input file gives the output file.

• If you want to get the "probability distribution" (without the tails of course) for the random variable "largest number of balls in a box" when 14400 balls are placed, uniformly at random, in 9000 boxes, followed by the list consisting of the average, standard deviation, and (normalized) higher moments, followed by the normalized distribution, using, respectively, the exact, "Empirical", and "Poisson" approximation,
the input file gives the output file.

• If you want to get the "probability distribution" (without the tails of course) for the random variable "largest number of balls in a box" when 8000 balls are placed, uniformly at random, in 12000 boxes, followed by the list consisting of the average, standard deviation, and (normalized) higher moments, followed by the normalized distribution, using, respectively, the exact, "Empirical", and "Poisson" approximation,
the input file gives the output file.

• If you want to see empirically derived approximate formulas, in terms of n (or rather log(n)) for the expected value of the random variable "maximum number of balls in the same box" when nR balls are thrown into n boxes, for 1 ≤ R ≤ 5, and
the input file gives the output file.

• If you want to see the EXACT values of P(14400,9000,m); for m between 6 and 12, using rational arithmetic (NOT floating-point), and the time it took.
the input file gives the output file.

• If you want to see the EXACT values of P(8000,12000,m); for m between 4 and 8, using rational arithmetic (NOT floating-point), and the time it took.
the input file gives the output file.

• If you want to see whether you can brag when your Calculus section is more popular than your peers, or blame chance when it is much less popular (with confidence 99%) [Using the (very good!) Poisson Approximation]
the input file gives the output file.

• Same as above, but with confidence 99.9%,
the input file gives the output file.

Doron Zeilberger's List of Papers