HISTABRUT: A Maple Package for Symbol-Crunching in Probability theory

By Doron Zeilberger

.pdf   .ps   .tex

(Exclusively published in the Personal Journal of Shalosh B. Ekhad and Doron Zeilberger)

Posted: Aug. 25, 2010.

Last update of this webpage (but not article): Jan. 3, 2016.

One of the main uses of computers is to do statistical analysis of data. But, so far, the theory of statistics, and its noble mother, Probability theory, were all discovered and developed by lowly humans. No more! Computers can also develop probability theory (and statistics), and discover (and prove!) general theorems of much larger depth than those discovered by human-kind. Of course, at this time of writing, they still need these inferior humans to give them a head-start by teaching (i.e. programming) them to develop probability theory ab initio (only better!), but even this will soon be superfluous.

[Some browsers need the .txt extension here is the same package with .txt HISTABRUT.txt]

# Sample Input and Output for the Maple package HISTABRUT

• If you toss a fair coin n times, and win a dollar if it lends Heads and lose a dollar if it lends Tails, in order to find the mean, variance, and an explicit formula for the normailzed even moments α2r to order 2 (i.e. O(1/n3)) (of course the odd moments are zero),
the input gives the output.

• If you toss a loaded coin n times, and win a dollar with probality 1/3 and lose a dollar with probability 2/3, in order to find the mean, variance, and an explicit formulas for the normailzed even moments α2r, and normalized odd moments α2r+1, to order 2 (i.e. O(1/n3)) ,
the input gives the output.

• You roll a fair (cubic) die n times, and for each roll you win as many shekels as the number of dots it landed on, and consider the random variable: "total number of shekels won". The mean, variance, and an explicit formula for the normailzed even moments α2r to order 2 (i.e. O(1/n3)) (of course the odd moments are zero),
the input gives the output.

• Suppose you toss, n times, a loaded three-faced coin with prob. 1/2 of winning 3 dollars, 1/4 of winning one dollar and 1/4 of losing 4 dollars. Consider your total win. In order to find the mean, variance, and an explicit formulas for the normailzed even moments α2r, and normalized odd moments α2r+1, to order 2 (i.e. O(1/n3)) ,
the input gives the output.

• If you are wondering about the statistical behavior of the random variable "number of three consectutive Heads" (that may or may not be parts of longer Head-runs), upon tossing a fair coin n times,
the input gives the output.

• If you are wondering about the statistical behavior of the random variable "number of three consectutive Heads" (that are NOT part of a longer Head-run, i.e. the number of occurrences of the string THHHT), upon tossing a fair coin n times,
the input gives the output.

• To get a 5-second proof of the asymptotic normality of the random variable "number of inversions" in a random n-permutation (first proved by Feller (vol. I, X.6)), with first-order asymptotics for the normalized moments to boot,
the input gives the output.

• To get a proof of the asymptotic normality of the random variable "number of inversions" in a (uniform) random word with n 1's and n 2's, as well as a first-order finer asymptotics for the normalized moments
the input gives the output.

• For information about the asymptotics of the discrete probability distribution given by the q-Catalan polynomials
(1-q)(q)2n/(q)n2/(1-qn+1)
the input gives the output.

• To prove that the "continuous Stanley distribution" is the same as the normal distribution, but also to get a more refined asymptotics
the input gives the output.

(Note: see Shalosh B. Ekhad's note.)

• To get beautiful pictures of the limiting distribution of the random variable "duration of the game" in a gambler's ruin game that starts at the middle, or one third from either end,
the input gives the output.
(Notice the fat tails!)

• Suppose you walk in the plane, and with prob. 1/3 you stay in place, and with prob. 1/3 each you go to the left or up, and if you consider the bi-variate distibution (number of horizontal steps taken, number of vertical steps taken) after n steps. To get the means, variances, and the (normalized) mixed moments (up to the 10-th order)
the input gives the output.

• Conisder the bi-variate distribution (inv,maj) defined on n-permutations. To find the means, variances and, more interestingly, the normalized mixed moments, up to the fourth-order
the input gives the output.
(Note that this seems to indicate that these are joint-independently asymptotically normal. A rigorous proof is coming up by Andrew Baxter and Doron Zeilberger)

• To get the means, variances, and mixed moments (up to the 8th) for the pair of random variables "number of two consecutive Heads" and "number of two consecutive Tails" upon tossing a fair coin n times,
the input gives the output.

• To get the means, variances, and mixed moments (up to the 8th) for the pair of random variables "number of HHTH" and "number of TTHT" upon tossing a fair coin n times,
the input gives the output.

• To (empirically, symbolically) check that the bivariate pair of r.v. (number of occurrences of HTT, number of occurrences of THH), in tossing a fair coin n times, as n goes to infinity, tends to a pair of correlated normal variables (with the apropriate correlation),
the input gives the output.

• To get a verbose account of the mean, variance, correlation, and first few mixed moments for the pair of r.v. (Number of HT, Number of TH),
the input gives the output.

• To get the means, variances, and most interestingly, the bi-variate moment-(exponential)generating-function (i.e. inverse Fourier Transform) of the limiting distribution of the sequence of discrete prob. distributions whose prob. generating functions is ((1+x1+x2)/3)n,
the input gives the output.

• To get the means, variances, and most interestingly, the tri-variate moment-(exponential)generating-function (i.e. inverse Fourier Transform) of the limiting distribution of the sequence of tri-variate discrete prob. distributions whose prob. generating functions is (1+x[1]+x[2]+x[3]+2*x[1]*x[2]*x[3])/6)n
the input gives the output.

• To get the mixed-moments for the famous pair (NumberOfInversions,MajorIndex) defined on n-permutations up to order 10 (except that it was unable to find the (10,10)-mixed moment), using the previously computed first 30 terms (gotten from procedure GnD), and saved in the file oGnD30,
the input gives the output.

• To guess from the above output EXPLICIT expression for the asymptotics (to order 1) of the famous pair (NumberOfInversions,MajorIndex), proving the Baxter-Zeilberger deep theorem that they are asymptotically joint-independently-asymptotically-normal,
the input gives the output.

• Added May 28, 2012: My beloved wife, Jane Legrange, a regular participant in Rabbi Annie Tucker's "Bible B'Boker" Saturday morning class, told me that last Shabbat they discussed the military census of the twelve tribes of Israel (see the wikipedia entry). While this is a "trivial" finite distribution, HISTABRUT can also handle such. Indeed
the input gives the output.

In a very interesting article linking lambda calculus to planar maps. Noam Zeilberger came up with a sequence of polynomials P_n(x), whose ordinary generating function, with respect to z, satisfies the functional-differential equation (Eq. (6), p. 12) there, Lind(z,x) (let's call if f(z,x))

f(z,x)=x+z*(f(z,x)-f(z,0))^2 + z*diff(f(z,x),x)

Here x is the catalytic variable, and the main interest is the sequence of numbers P_n(0), in other words, the coefficients of f(z,0).

But this also raises the question of whether the random variable accounted for by the exponent of x, is asympotically normal. Using the first 300 terms, it seems that it is.

In order to crank-out this sequence we the maple code

yields the following input and output files

• To see the first 301 terms (starting at n=0) of the sequence of polynomials (called NoamX), and the sequences of numbers when x=0 (called Noam0) and when x=1 (called Noam1)

input file yields the output file

[Note: To just see Noam0, see here]

• It seems unlikey that Noam0 is P-recursive (alias holonomic), in any case no simple recurrence exists that fits the first 300 terms. This can be seen from

input file yields the output file

[Note: it also needs the Maple package Findrec

• Last put not least, using this project's Maple package HISTABRUT.txt Here is a numerical analysis of the statistical properties of the random variable, indicating that is is most probably asymptotically normal

input file yields the output file .

Personal Journal of Shalosh B. Ekhad and Doron Zeilberger