%crv.tex: Automated Derivation of Limiting Distributions Of Combinatorial Random Variables Whose Generating Functions are Rational
%Plain TeX
%begin macros
\def\mod{\rm mod}
\def\Z{\bf Z}
\def\L{{\cal L}}
\baselineskip=14pt
\parskip=10pt
\def\halmos{\hbox{\vrule height0.15cm width0.01cm\vbox{\hrule height
0.01cm width0.2cm \vskip0.15cm \hrule height 0.01cm width0.2cm}\vrule
height0.15cm width 0.01cm}}
\font\eightrm=cmr8 \font\sixrm=cmr6
\font\eighttt=cmtt8
\magnification=\magstephalf
\def\W{{\cal W}}
\def\P{{\cal P}}
\def\Q{{\cal Q}}
\def\1{{\overline{1}}}
\def\2{{\overline{2}}}
\parindent=0pt
\overfullrule=0in
\def\Tilde{\char126\relax}
\def\frac#1#2{{#1 \over #2}}
%\headline={\rm \ifodd\pageno \RightHead \else \LeftHead \fi}
%\def\RightHead{\centerline{
%Title
%}}
%\def\LeftHead{ \centerline{Doron Zeilberger}}
%end macros
\centerline
{\bf Automated Derivation of Limiting Distributions Of Combinatorial Random Variables}
\centerline
{\bf Whose Generating Functions are Rational}
\rm
\bigskip
\centerline
{\it By Doron ZEILBERGER}
\bigskip
{\bf Abstract}: The author's methodology of automated derivation of explicit expressions for the average, variance, and higher
moments of combinatorial random-variables
that possess explicit generating functions
is extended to those whose bi-variate generating
function is an arbitrary rational function.
{\bf Maple Package and Sample Output}
This article is accompanied by a Maple package, {\tt BiVariateMoms.txt}, that is available, along with some sample input and output files, from
{\tt http://www.math.rutgers.edu/\~{}zeilberg/mamarim/mamarimhtml/crv.html} \quad .
{\bf Ancient History}
If you toss a fair coin $n$ times, and are interested in the random variable `number of Heads', let's call it $H_n$, and plot the
graph of
$$
k \rightarrow Pr(H_n=k) \quad,
$$
then, if $n$ is large enough, you {\bf very famously} get
something close to a {\it bell-shaped curve}, and as $n$ gets larger and larger, the graph gets closer and closer to
that curve. The official names of the bell-shaped curve are `{\it standard normal distribution}' and
`{\it Gaussian distribution}'.
More precisely, letting $\mu_n$ denote the {\it expectation} (aka {\it mean}, aka {\it average}),
and $\sigma_n$ the standard deviation
(that happen to be equal to $\frac{n}{2}$ and $\frac{\sqrt{n}}{2}$ respectively), and defining
the {\it centralized and scaled} version,
$$
X_n:=\frac{H_n -\mu_n}{\sigma_n} \quad ,
$$
then the sequence of discrete random variables, $X_n$, `tends', in a certain precise sense, to the {\bf continuous distribution}
called {\it standard normal distribution}, whose {\it probability density function} is {\it famously}
$$
\frac{1}{\sqrt{2 \pi}} e^{-\frac{x^2}{2}} \quad .
$$
This means that
$$
\lim_{n \rightarrow \infty } Pr(X_n \leq x) = \int_{-\infty}^{x} \frac{1}{\sqrt{2 \pi}}e^{-t^2/2} \, dt \quad .
$$
This was first proved, by de Moivre and Laplace, using fairly {\it ad hoc} methods,
but has since been greatly generalized, and the collective name is {\it Central Limit Theorems}.
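This convergence is easy to witness numerically. The following Python sketch (an illustration only, not part of the Maple package accompanying this article) compares the exact CDF of the scaled binomial random variable $X_n$ with the standard normal CDF:

```python
from math import comb, erf, sqrt

def binomial_cdf_scaled(n, x):
    """P((H_n - mu_n)/sigma_n <= x) for H_n ~ Binomial(n, 1/2), exactly."""
    mu, sigma = n / 2, sqrt(n) / 2
    kmax = int(mu + x * sigma)  # largest k with (k - mu)/sigma <= x
    return sum(comb(n, k) for k in range(kmax + 1)) / 2 ** n

def normal_cdf(x):
    """Standard normal CDF, via the error function."""
    return 0.5 * (1 + erf(x / sqrt(2)))

# The discrepancy shrinks as n grows (de Moivre-Laplace):
for n in (10, 100, 1000):
    print(n, abs(binomial_cdf_scaled(n, 1.0) - normal_cdf(1.0)))
```

By the Berry--Esseen bound, the discrepancy decays on the order of $1/\sqrt{n}$.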
\vfill\eject
{\bf From Enumeration to Statistics}
Suppose that we have a finite set $S$, on which a certain numerical attribute, called {\it random variable}, $X$,
(using the probability/statistics lingo), is defined.
For any non-negative integer $i$, let's define
$$
N_i:=\sum_{s \in S} X(s)^i \quad .
$$
In particular, $N_0$ is the number of elements of $S$.
The expectation of $X$, $E[X]$, denoted by $\mu$, is, of course,
$$
\mu \, = \, \frac{N_1}{N_0} \quad .
$$
For $i>1$, the $i$-th straight moment is
$$
E[X^i] \, = \, \frac{N_i}{N_0} \quad .
$$
The $i$-th {\it moment about the mean} is
$$
m_i:=E[(X-\mu)^i]= E[\sum_{r=0}^{i} {{i} \choose {r}} (-1)^r \mu^r X^{i-r}]=
\sum_{r=0}^{i} (-1)^r {{i} \choose {r}} \mu^r E[X^{i-r}]
$$
$$
=\, \sum_{r=0}^{i} (-1)^r {{i} \choose {r}} \left ( \frac{N_1}{N_0} \right )^r \frac{N_{i-r}}{N_0}
$$
$$
= \, \frac{1}{N_0^i} \sum_{r=0}^{i} (-1)^r {{i} \choose {r}} N_1^r N_0^{i-r-1} N_{i-r} \quad .
$$
{\bf Finally}, the most interesting quantities, statistically speaking, apart from the mean $\mu$ and the variance $m_2$, are
the {\bf scaled moments}, also known as {\it alpha coefficients}:
$$
\alpha_i :=\frac{m_i}{m_2^{i/2}} \quad .
$$
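For a concrete toy case, take $S=\{H,T\}^4$ with $X$ the number of Heads; then $N_i$, $\mu$, $m_i$, and $\alpha_i$ can be computed straight from the definitions above, in exact rational arithmetic. A Python sketch (for illustration only; the actual computations in this article are done in Maple):

```python
from fractions import Fraction
from itertools import product
from math import comb

# Toy case: S = all coin-toss sequences of length 4, X = number of Heads.
S = list(product("HT", repeat=4))
def X(s):
    return s.count("H")

def N(i):
    """N_i = sum over s in S of X(s)^i, as an exact rational."""
    return sum(Fraction(X(s)) ** i for s in S)

N0 = N(0)            # |S| = 16
mu = N(1) / N0       # expectation, here 2

def m(i):
    """i-th moment about the mean, via the binomial expansion above."""
    return sum((-1) ** r * comb(i, r) * mu ** r * N(i - r) / N0
               for r in range(i + 1))

# alpha_4 = m_4 / m_2^2, exact arithmetic throughout
alpha4 = m(4) / m(2) ** 2
```
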
\vfill\eject
{\bf Getting the Moments via Generating Functions}
The {\it generating function} of our random variable, $X$, also called {\it weight-enumerator},
with respect to the variable $t$, is defined by
$$
f(t)=f_{S,X}(t) \, := \, \sum_{s \in S} t^{X(s)} \quad .
$$
Note that when $t=1$, we get the {\it naive enumeration}, $|S|$.
More generally we have, with the above notation,
$$
N_i= \left (t \frac{d}{dt} \right )^i f(t) \Bigl \vert_{t=1} \quad .
$$
In many enumeration scenarios, we don't have just {\it one} finite set, $S$, but
an {\it infinite sequence} of sets $S_n$, and the generating function $f_n(t)$ can be expressed
{\it explicitly} in terms of {\bf both} the `continuous' variable, $t$, and discrete variable $n$.
For example, if the set $S_n$ is the set $\{H,T\}^n$ and $X_n(s)$ is the ``number of heads'', then
by `independence', we have a {\bf closed form formula} for $f_n(t)$:
$$
f_n(t)= (1+t)^n \quad .
$$
More generally, if $S_n$ is the set of sequences of outcomes of rolling, $n$ times, a $k$-faced die
whose faces are marked with
$a_1, \dots, a_k$ dots, in other words the set $\{a_1, \dots, a_k\}^n$,
and $X_n$ is the random variable `total number of dots' (i.e. the sum of the sequence of outcomes), then
$$
f_n(t)= \left ( \sum_{j=1}^{k} t^{a_j} \right )^n \quad .
$$
So, for each non-negative integer $i$, we have a {\it sequence of numbers}, $N_i(n)$, obtained by applying $(t \frac{d}{dt})^i$ to $f_n(t)$,
and then plugging-in $t=1$. In these scenarios, the straight moments, $N_i(n)/N_0(n)$, always turn out to be {\it polynomials} in $n$. From them we can get polynomial expressions
for the moments-about-the-mean, $m_i(n)$.
This is better delegated to computers. In many
cases, one can also get the leading terms for {\bf symbolic} $i$, and prove completely automatically,
{\it asymptotic normality} up to any desired moment (i.e. that the scaled moments, $\alpha_i(n)$, tend, as $n$ goes to infinity,
to those of the standard normal distribution: $1,0,3,0,15, 0, 105, \dots$).
In fact, it is even possible to prove it for an arbitrary moment (symbolic $i$), hence giving fully automated proofs of
central limit theorems. This was described in [Z1] and [Z2].
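The operator $t\,\frac{d}{dt}$ acts on a coefficient list by multiplying the $k$-th coefficient by $k$, so extracting the $N_i$ from an explicit $f_n(t)$ takes only a few lines. A Python sketch for the coin-tossing case (illustrative only; the names are not those of the Maple package):

```python
from math import comb

def theta(coeffs):
    """Apply t*d/dt to a polynomial given as a coefficient list in t."""
    return [k * c for k, c in enumerate(coeffs)]

def N(i, coeffs):
    """(t d/dt)^i f(t), evaluated at t = 1."""
    for _ in range(i):
        coeffs = theta(coeffs)
    return sum(coeffs)

# f_n(t) = (1+t)^n for coin tossing; here n = 6
n = 6
f = [comb(n, k) for k in range(n + 1)]

# Cross-check against the definition N_i = sum_k C(n,k) k^i
for i in range(4):
    assert N(i, f) == sum(comb(n, k) * k ** i for k in range(n + 1))
```
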
{\bf The Grand Generating Function}
Since we have an {\it infinite} sequence of polynomials, $f_n(t)$, we can form the {\bf grand generating function}
$$
F(t,z):=\sum_{n=0}^{\infty} f_n(t) z^n \quad .
$$
For example, for the coin-tossing example, we have
$$
F(t,z)= \frac{1}{1-(1+t)z} \quad,
$$
and for the more general die-rolling example, we have
$$
F(t,z)= \frac{1}{1- z \left ( \sum_{j=1}^{k} t^{a_j} \right) } \quad .
$$
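The coefficients of the grand generating function can be extracted by truncated power-series arithmetic; the coefficient of $t^m z^n$ counts the objects of size $n$ with statistic $m$. A Python sketch for a 3-faced die (an illustrative choice of faces):

```python
from itertools import product
from collections import Counter

faces = (1, 2, 3)   # a 3-faced die (illustrative choice)

def polymul(p, q):
    """Multiply polynomials in t, given as dicts {exponent: coeff}."""
    r = Counter()
    for a, ca in p.items():
        for b, cb in q.items():
            r[a + b] += ca * cb
    return dict(r)

c = {a: 1 for a in faces}           # sum_j t^{a_j}
f = {0: 1}                          # f_0(t) = 1
coeffs = [f]
for _ in range(4):                  # F(t,z) = sum f_n(t) z^n, f_n = c * f_{n-1}
    f = polymul(c, f)
    coeffs.append(f)

# coefficient of t^m z^n = number of n-roll sequences with dot-total m
n, m = 3, 5
brute = sum(1 for s in product(faces, repeat=n) if sum(s) == m)
assert coeffs[n].get(m, 0) == brute
```
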
Notice that both these grand generating functions are {\it rational functions} in {\bf both} $t$ and $z$.
But there are {\bf many} examples in enumerative combinatorics (and statistical physics!), where there is no
`closed-form' expressions for the {\it individual} $f_n(t)$, but the `grand-generating function', $F(t,z)$ is
rational. These occur, for example, in {\it tiling} problems, see the beautiful article [Z3] with its
accompanying Maple package {\tt TILINGS}.
Note that in such cases, the enumerating sequence, $\{ N_0(n)\}$, is $C$-finite, i.e. satisfies a homogeneous linear
recurrence equation with {\bf constant} coefficients (see [Z4] for a brief overview of $C$-finite sequences, and
the modern classic [KP] for an in-depth account). Obviously,
$$
\sum_{n=0}^{\infty} N_0(n) z^n =F(1,z) \quad .
$$
More generally,
$$
\sum_{n=0}^{\infty} N_i(n) z^n =\left (t \frac{d}{dt} \right)^i F(t,z) \Bigl \vert_{t=1} \quad ,
$$
are always rational functions of $z$, and hence the sequences $\{ N_i(n) \}_{n=0}^{\infty}$ are always $C$-finite.
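To illustrate, take $F(t,z)=1/(1-z-tz^2)$, which weight-enumerates tilings of a $1 \times n$ strip by monomers ($z$) and dominoes ($tz^2$), with $t$ marking the number of dominoes (an illustrative rational $F$, in the spirit of the tiling examples of [Z3]). A Python sketch checking that both $N_0(n)$ and $N_1(n)$ are $C$-finite:

```python
# F(t,z) = 1/(1 - z - t z^2): monomer-domino tilings, t marks dominoes.
M = 20
N0 = [0] * M
N0[0], N0[1] = 1, 1
for n in range(2, M):                       # Q(1,z) = 1 - z - z^2
    N0[n] = N0[n - 1] + N0[n - 2]           # Fibonacci-type recurrence

# N_1(n) = [z^n] z^2/(1-z-z^2)^2: a convolution of N_0 with itself
N1 = [sum(N0[a] * N0[n - 2 - a] for a in range(n - 1)) for n in range(M)]

# N_1 is C-finite too: its denominator is (1-z-z^2)^2 = 1-2z-z^2+2z^3+z^4
for n in range(4, M):
    assert N1[n] == 2 * N1[n-1] + N1[n-2] - 2 * N1[n-3] - N1[n-4]
```

For instance, $N_1(4)=5$: the five tilings of a $1 \times 4$ strip carry $0+1+1+1+2$ dominoes altogether.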
Since $F(t,z)$ is a rational function, we can write it as a quotient of polynomials
$$
F(t,z)=\frac{P(t,z)}{Q(t,z)} \quad .
$$
It follows from the {\it quotient rule} of `calc1' that, for each $i$,
$$
\sum_{n=0}^{\infty} N_i(n) z^n =\left (t \frac{d}{dt} \right)^i F(t,z) \Bigl \vert_{t=1}=
\frac{P_i(z)}{Q(1,z)^{i+1}} \quad ,
$$
for {\it some} polynomial $P_i(z)$. Hence, by partial fraction decomposition, it follows that, denoting by $L$ the
order of the recurrence for $N_0(n)$ (alias the degree, in $z$, of $Q(1,z)$), we are {\it guaranteed} that
for each $i$, there exist polynomials $A_{i,j}(n)$, $0 \leq j \leq L-1$, of degree $i$ in $n$, such that
$$
N_i(n)= \sum_{j=0}^{L-1} A_{i,j}(n) \cdot N_0(n-j) \quad .
$$
Since we are guaranteed, {\it a priori}, for each $i$, that the $L$ polynomials $A_{i,0}(n), \dots, A_{i,L-1}(n)$
(of degree $i$ in $n$), {\bf exist}, we can ask our computers to find them empirically, by cranking-out sufficiently many terms
of the sequence $\{N_0(n)\}$ and $\{N_i(n)\}$, and using {\it undetermined coefficients}.
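The undetermined-coefficients step can be sketched as follows, for the coin-tossing case, where $Q(1,z)=1-2z$ has degree $L=1$, so that $N_i(n)=A_{i,0}(n)\,N_0(n)$ with $A_{i,0}(n)$ of degree $i$ (a Python sketch in exact arithmetic; the helper names are hypothetical, not those of the Maple package):

```python
from fractions import Fraction
from math import comb

# Coin tossing: F(t,z) = 1/(1-(1+t)z), so Q(1,z) = 1-2z has degree L = 1.
def N(i, n):
    return sum(comb(n, k) * k ** i for k in range(n + 1))  # brute force

def fit_poly(pairs):
    """Solve for polynomial coefficients by Gauss-Jordan elimination."""
    d = len(pairs)                                  # degree d-1
    rows = [[Fraction(n) ** j for j in range(d)] + [Fraction(v)]
            for n, v in pairs]
    for col in range(d):
        piv = next(r for r in range(col, d) if rows[r][col] != 0)
        rows[col], rows[piv] = rows[piv], rows[col]
        for r in range(d):
            if r != col and rows[r][col] != 0:
                fac = rows[r][col] / rows[col][col]
                rows[r] = [x - fac * y for x, y in zip(rows[r], rows[col])]
    return [rows[j][d] / rows[j][j] for j in range(d)]

i = 2
data = [(n, Fraction(N(i, n), N(0, n))) for n in range(1, i + 2)]
A = fit_poly(data)             # coefficients of A_{i,0}(n), constant term first
# verify the empirical fit on fresh values of n
for n in range(10, 15):
    assert sum(c * n ** j for j, c in enumerate(A)) * N(0, n) == N(i, n)
```

Here the fit recovers $A_{2,0}(n)=n(n+1)/4$, i.e. the exact second straight moment of the number of Heads.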
Once we have `explicit' expressions (in the above sense) for $N_i(n)$, we can get `explicit' expressions
(in the same sense), for the (straight) moments, and from them, the more
informative {\it moments about the mean}, that, in turn, lead to the scaled moments $\alpha_i(n)$. Denoting
by $\beta$ the smallest positive root of $Q(1,z)$ (a certain {\it algebraic} number), and noting that
$$
\lim_{n \rightarrow \infty } \frac{N_0(n-1)}{N_0(n)} \, = \, \beta \quad,
$$
and hence, more generally, for $0 \leq j