Saturday, March 2, 2024

Summing Uniform random variables to reach a target sum

The problem of finding the expected number of standard Uniform variables to get a sum greater than 1 is quite famous. In this post, we deal with the same problem but for a given target $t$.

This problem is not too difficult. Knowing that the sum of uniform random variables follows the Irwin-Hall distribution and the CDF of this distribution has a closed form solves the problem easily.

For example, if we let $N$ be the random variable denoting the number of draws needed for the sum to exceed $t$ for the first time, then

$\displaystyle \mathbb{P}(N>n)=\mathbb{P}(U_1+U_2+\cdots+U_n \leq t)=\frac{1}{n!}\sum_{k=0}^{\lfloor t \rfloor}(-1)^k\binom{n}{k}(t-k)^n$

Then $\mathbb{P}(N=n)=\mathbb{P}(N>n-1)-\mathbb{P}(N>n)$

But we can get this more directly using the idea in Exercise 6.65 (more importantly, its solution) of Concrete Mathematics.

For standard uniform random variables $U_1,U_2,\cdots,U_n$, let $X_1=\{U_1\}$, $X_2=\{U_1+U_2\}$ and so on, where $\{x\}$ denotes the fractional part of $x$.

It can be easily seen that the $X_k$'s are independent uniform random variables. More importantly, a descent $X_k<X_{k-1}$ occurs exactly when adding $U_k$ carries the running sum past an integer, so the number of descents in the sequence of $X_k$'s is $\lfloor U_1+U_2+\cdots+U_n\rfloor$.

Therefore, for integer $t$, the sum of $n$ uniform random variables exceeding $t$ for the first time corresponds to a sequence $X_1,X_2,\cdots,X_n$ that has exactly $t$ descents and ends with a descent.

The probability of such a sequence is related to the number of permutations of $1,2,\cdots,n$ that have exactly $t$ descents and end with a descent. We have already looked at this exact problem on this blog. Using that result, we finally have,

$\displaystyle\mathbb{P}(N=n)=\frac{(n-t)A(n-1,t-1)}{n!}$
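As a sanity check for integer $t$, here is a minimal Python sketch that compares the Eulerian-number formula with the difference of Irwin-Hall CDFs using exact rational arithmetic. The function names are mine, and the Eulerian numbers are computed with the standard alternating-sum formula rather than the recurrence.

```python
from fractions import Fraction
from math import comb, factorial


def eulerian(n, k):
    """Eulerian number A(n, k): permutations of 1..n with exactly k descents."""
    if k < 0:
        return 0
    return sum((-1) ** i * comb(n + 1, i) * (k + 1 - i) ** n for i in range(k + 1))


def survival(n, t):
    """P(N > n) = P(U_1 + ... + U_n <= t) for a non-negative integer t."""
    total = sum((-1) ** k * comb(n, k) * (t - k) ** n for k in range(t + 1))
    return Fraction(total, factorial(n))


def pmf_via_cdf(n, t):
    """P(N = n) as a difference of two Irwin-Hall CDF values."""
    return survival(n - 1, t) - survival(n, t)


def pmf_via_eulerian(n, t):
    """P(N = n) from the formula (n - t) A(n-1, t-1) / n!."""
    return Fraction((n - t) * eulerian(n - 1, t - 1), factorial(n))


if __name__ == "__main__":
    for n in range(1, 7):
        print(n, pmf_via_cdf(n, 2), pmf_via_eulerian(n, 2))
```

Both columns agree for every small case I tried, which is reassuring but of course not a proof.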

Though we have solved this for integer $t$, with a giant leap of faith guided by some experimentation, we can also see that this works for any real $t$.

With this, we get,

$\mathbb{E}(\text{No. of Uniform variables needed to get a sum greater than }\pi)\approx 6.949504$

$\mathbb{E}(\text{No. of Uniform variables needed to get a sum greater than }e)\approx 6.104002$
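These values can be reproduced without the Eulerian route by summing the tail probabilities, $\mathbb{E}(N)=\sum_{n \geq 0}\mathbb{P}(N>n)$, with the Irwin-Hall CDF given earlier. The sketch below does that in floating point; the truncation point `max_n` is an arbitrary choice of mine, picked so the neglected tail is negligible for small targets.

```python
from math import comb, e, factorial, floor, pi


def irwin_hall_cdf(n, t):
    """P(U_1 + ... + U_n <= t) for real t >= 0."""
    if t >= n:
        return 1.0  # the sum of n uniforms never exceeds n
    total = sum((-1) ** k * comb(n, k) * (t - k) ** n for k in range(floor(t) + 1))
    return total / factorial(n)


def expected_draws(t, max_n=80):
    """E(N) = sum over n >= 0 of P(N > n), truncated at max_n terms."""
    return sum(irwin_hall_cdf(n, t) for n in range(max_n))


if __name__ == "__main__":
    print(expected_draws(1.0))  # the classical answer is e
    print(expected_draws(pi))
    print(expected_draws(e))
```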

Hope you enjoyed this. See ya soon.

Until then, Yours Aye

Me

Saturday, December 30, 2023

A nice property on the recurrence of Eulerian Numbers

I was recently looking at Eulerian numbers for a problem and I noticed a nice, and probably lesser/never known, property on their recursive definition. In summary, it seemed like an exercise problem out of a Combinatorics textbook except I couldn't find one.

Eulerian numbers $A(n,k)$ count the number of permutations of the numbers from $1$ to $n$ in which exactly $k$ elements are smaller than the previous element. In other words, they count the number of permutations with exactly $k$ descents.

With this interpretation, we can set up the following recurrence, as shown in Proposition 3 of this lecture.

$A(n,k)=(n-k)A(n-1,k-1)+(k+1)A(n-1,k)$

The first term here corresponds to the number of permutations in which the element $n$ contributes to the descent count and the second term to those where it doesn't.

The problem that I was interested in was that of counting the number of permutations with $k$ descents that end with a descent.

After some thought, it seemed easy to set up a recurrence for this count as well.

Let $A_d(n,k)$ be the number of permutations of $n$ elements with exactly $k$ descents that end with a descent. Let $A_a(n,k)$ be the same count for those that end with an ascent. Then,

$A_d(n,k)=k A_d(n-1,k)+A_a(n-1,k-1)+(n-k)A_d(n-1,k-1)$

The first term here corresponds to inserting $n$ in an existing descent. The second term is where we insert $n$ in the second-to-last position, thereby simultaneously making the permutation end with a descent and increasing the descent count.

The last one is where we take a permutation of $n-1$ elements with $k-1$ descents that ends with a descent. There are $n-1$ gaps where we can insert $n$, counting the position before the first element (we don't want $n$ to be in the last position as it would make the permutation end with an ascent). Of the $n-1$ gaps, $k-1$ are already descents, which means there are $n-k$ gaps that correspond to an ascent. Inserting $n$ here would increase the descent count by one, giving us the requisite permutation.

Similarly, we can write the following recurrence for $A_a(n,k)$:

$A_a(n,k)=A_d(n-1,k)+(n-k-1)A_a(n-1,k-1)+(k+1)A_a(n-1,k)$

Using the fact that $A(n,k)=A_a(n,k)+A_d(n,k)$ and some simple manipulations, we get

$A_d(n,k)=(n-k)A(n-1,k-1)$ and $A_a(n,k)=(k+1)A(n-1,k)$

I hope you can see now why this surprised me. The recurrence usually written for Eulerian numbers also has another simpler combinatorial representation without any change!!

That is, the permutations with $k$ descents in which $n$ contributes to the descent count are equinumerous with the permutations with $k$ descents that end with a descent.
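The two closed forms $A_d(n,k)=(n-k)A(n-1,k-1)$ and $A_a(n,k)=(k+1)A(n-1,k)$ are easy to check by brute force for small $n$. Here is a minimal sketch that enumerates all permutations; the helper names are mine and the enumeration is only practical for small $n$.

```python
from collections import Counter
from itertools import permutations


def descents(p):
    """Number of positions where an element is smaller than the previous one."""
    return sum(1 for a, b in zip(p, p[1:]) if a > b)


def tally(n):
    """Return (A, A_d, A_a) as Counters keyed by the number of descents k."""
    total, end_desc, end_asc = Counter(), Counter(), Counter()
    for p in permutations(range(1, n + 1)):
        k = descents(p)
        total[k] += 1
        if n >= 2:
            if p[-2] > p[-1]:
                end_desc[k] += 1
            else:
                end_asc[k] += 1
    return total, end_desc, end_asc


def check(n):
    """True if both closed forms hold for every k at this n (n >= 2)."""
    prev_total, _, _ = tally(n - 1)
    _, end_desc, end_asc = tally(n)
    return all(
        end_desc[k] == (n - k) * prev_total[k - 1]
        and end_asc[k] == (k + 1) * prev_total[k]
        for k in range(n)
    )


if __name__ == "__main__":
    print([check(n) for n in range(2, 8)])
```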

Hope you enjoyed this. See ya soon.

Until then, Yours Aye

Me

Sunday, December 24, 2023

The Rounding Error Conundrum

Like all managers, I too, from time to time, will be in a position to present a bunch of numbers in an Excel Sheet. To bring clarity to the sheet, I follow the etiquette of rounding the numbers to (MS-Excel's default) two decimal places.

The problem this brings is that when you sum a list of numbers and round everything to, say, two decimal places, the numbers don't always add up, and someone or the other points out that the numbers are not adding up in the last decimal place. I got tired of pointing out that it is a result of rounding.

At first, I thought rounding with more decimal places might bring some resolution to this situation. But then, it became apparent (and intuitively obvious) that no matter the rounding, the problem persists. Being an amateur mathematician at heart, I wanted to solve this problem mathematically.

Let $X_1,X_2,\cdots,X_n$ be standard uniform random variables $U(0,1)$ and $Y=X_1+X_2+\cdots+X_n$. We are then interested in the probability that $[Y]=[X_1]+[X_2]+\cdots+[X_n]$ where $[x]$ denotes the value of $x$ rounded to $r$ places after the decimal point.

As I started thinking about this problem, the only progress I could make is that $r$ doesn't affect the probability and we could as well consider it to be $0$. Frustrated, I posted in my puzzle group and Atul Gupta from the group was able to quickly solve this.

The key idea is to use the periodicity of the 'round' function. That is, $[n+x]=n+[x]$ for integer $n$ and real $x$.

The way to solve this problem then is to transform the $X$'s such that $X_k=B_k+Z_k$ where the $B_k$'s are either $0$ or $1$ and $Z_k \sim U(-0.5,0.5)$.

Because we take $r=0$, we can see that $[Z_k]=0$ for all $k$. Therefore, $[X_k]=[B_k+Z_k]=B_k+[Z_k]=B_k$ which means $[X_1]+[X_2]+\cdots+[X_n]=B_1+B_2+\cdots+B_n=\text{(say)}B$.

On the other hand,

$\begin{align} Y&=X_1+X_2+\cdots+X_n\\ &=B_1+Z_1+B_2+Z_2+\cdots+B_n+Z_n\\ &=B+Z_1+Z_2+\cdots+Z_n \end{align}$

Using the periodicity of the 'round' function, $[Y]=B+[Z_1+Z_2+\cdots+Z_n]$.

Hence, the condition we are interested in simplifies to $[Z_1+Z_2+\cdots+Z_n]=0$ which is equivalent to saying $-0.5 \leq Z_1+Z_2+\cdots+Z_n \leq 0.5$

Transforming the $Z_k$'s to standard Uniform variables $U_k=Z_k+1/2$, this becomes $(n-1)/2 \leq U_1+U_2+\cdots+U_n \leq (n+1)/2$

It is well known that the sum of $n$ standard uniform random variables defines the Irwin-Hall distribution (also known as Uniform Sum distribution) $IH_n$.

Irwin-Hall distribution and Eulerian numbers are closely related. For example, this should already be apparent from the PDF of $IH_n$ in the Wikipedia page and the integral identity of Eulerian Numbers.

While the CDF of $IH_n$ also nearly resembles Eulerian numbers, it is not quite there. To make Eulerian numbers more useful to our cause, we slightly modify the formula given in Wiki page.

$\displaystyle A(n,x)=\sum_{i=0}^{\lfloor x+1 \rfloor}(-1)^i\binom{n+1}{i}(x+1-i)^n$

where $x$ can be a real number (the floor in the upper limit only matters for non-integer $x$). With this definition, we have the following useful result.

$\displaystyle\mathbb{P}(t \leq IH_n \leq t+1)=\frac{1}{n!}A(n,t)$

Therefore, we finally have what we were looking for.

$\displaystyle \mathbb{P}([Y]=[X_1]+[X_2]+\cdots+[X_n])=\mathbb{P}\left(\frac{n-1}{2} \leq \sum_{k=1}^nU_k \leq \frac{n+1}{2}\right)=\frac{1}{n!}A(n,(n-1)/2)$
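To get a feel for the numbers, the final formula can be evaluated exactly with rational arithmetic. In this sketch the upper summation limit is taken as $\lfloor x+1 \rfloor$, which is my reading of the definition for non-integer $x$, and the function names are mine.

```python
from fractions import Fraction
from math import comb, factorial, floor


def eulerian_real(n, x):
    """Generalised Eulerian number A(n, x) for rational x."""
    x = Fraction(x)
    upper = floor(x + 1)
    return sum((-1) ** i * comb(n + 1, i) * (x + 1 - i) ** n for i in range(upper + 1))


def match_probability(n):
    """P(rounded sum equals sum of rounded terms) for n standard uniforms."""
    return eulerian_real(n, Fraction(n - 1, 2)) / factorial(n)


if __name__ == "__main__":
    for n in (1, 2, 3, 5, 10, 50):
        print(n, float(match_probability(n)))
```

For $n=2$ this gives $3/4$, which matches the direct computation $\mathbb{P}(1/2 \leq U_1+U_2 \leq 3/2)=1-2\cdot\frac{1}{8}$.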

For large $n$, we can use the normal approximation to show that 

$\displaystyle \mathbb{P}([Y]=[X_1]+[X_2]+\cdots+[X_n]) \approx \sqrt{\frac{2}{\pi}}\sin\left(\sqrt{\frac{3}{n}}\right)$

which shows the probability of the sum matching only worsens as $n$ increases.

On an unrelated note, I also noticed that

$\displaystyle \mathbb{E}(IH_n|t \leq IH_n \leq t+1)=(t+1)\frac{n}{n+1}$

Given the simple expression for the expected value, I suspect there is a very nice reason for this but am yet to see it. In case you can, please do share it with us in the comments.

Hope you enjoyed this. See ya soon.

Until then, Yours Aye

Me

Friday, October 27, 2023

Generalising a famous Problem in Probability

As always, when I was looking for newer problems in probability, I came across the famous problem of determining the probability that $n$ points chosen uniformly randomly on a circle of unit circumference all lie in one semicircle.

This problem is quite famous, with a neat solution as posted in the above SE post. However, I was intrigued by the choice of semicircle in the question. I mean, what if we choose a quarter-circle? Or a three-fourths circular arc? Or any arc of length $x$ from that circle, for that matter?

It should be obvious that the argument given in the SE post applies verbatim whenever $x \leq 1/2$. It's the other case that makes for an interesting question.

This complication arises primarily because the $x \leq 1/2$ case exploits the fact that the events 'all points lie on the arc of length $x$ starting at the $i$-th point' are mutually exclusive across the $n$ points (which in fact contributes the factor of $n$ in the numerator). This exclusivity breaks down when $x>1/2$.

A better way to solve this problem is to realize that if the largest gap between two consecutive chosen points is greater than '$1-x$', the selected points can all be covered by a circular segment of length $x$.

This is great because it takes us into the realm of order statistics on a circle which is an academic problem that was intensively studied. There was so much literature on this, I did not even find the motivation to work it out myself.

For example, in 'On the Lengths of the Pieces of a Stick Broken at Random', Lars Holst shows, among other results, that

$\displaystyle\mathbb{P}(\text{largest spacing created by }n\text{ points chosen on a circle} \leq x)=\sum_{k=0}^n (-1)^k \binom{n}{k}(1-kx)_+^{n-1}$

where $x_+=\text{max}(x,0)$.

Therefore,

$\displaystyle \mathbb{P}(n \text{ points can be covered by }x \text{-circle})=\sum_{k=1}^n (-1)^{k-1} \binom{n}{k}(1-k(1-x))_+^{n-1}$

We now turn our attention towards the following problem: Points are selected one at a time on a circle of unit circumference until we get a configuration of points which cannot be covered by a circular arc of length $x$. What is the expected number of points that will be chosen given $x$?

If we let $N$ be the random variable denoting the number of points needed, then

$\begin{align}\displaystyle\mathbb{E}(N|x) &= \sum_{n=0}^\infty\mathbb{P}(N>n)\\ &= 1+\sum_{n=1}^\infty\mathbb{P}(n\text{ points lie on circular segment of length }x) \\ &= 1+\sum_{n=1}^\infty\sum_{k=1}^n (-1)^{k-1} \binom{n}{k}(1-k(1-x))_+^{n-1}\\ \end{align}$

With WA's help, we can simplify this expression to give

$\displaystyle \mathbb{E}(N|x)=1+\frac{1}{(1-x)^2}\sum_{k=1}^{\lfloor 1 / (1-x) \rfloor}\frac{1}{k^2}\left(1-\frac{1}{k(1-x)}\right)^{k-1}$

As a quick check, $x=1/2$ gives $1+4=5$, which agrees with summing $n/2^{n-1}$ directly.
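Here is a small numerical sketch comparing the double series with a closed form. The closed form in the code uses the prefactor $1/(1-x)^2$, which is what summing $\sum_{n \geq k}\binom{n}{k}y^{n-1}=y^{k-1}/(1-y)^{k+1}$ with $y=1-k(1-x)$ produces; the function names and the truncation point `max_n` are my own choices.

```python
from math import comb


def cover_probability(n, x):
    """P(n uniform points on a unit-circumference circle fit in an arc of length x)."""
    total = 0.0
    for k in range(1, n + 1):
        y = 1 - k * (1 - x)
        if y <= 0:
            break  # the (.)_+ truncation kills all later terms
        total += (-1) ** (k - 1) * comb(n, k) * y ** (n - 1)
    return total


def expected_points_series(x, max_n=2000):
    """E(N | x) by truncating the sum over n."""
    return 1 + sum(cover_probability(n, x) for n in range(1, max_n + 1))


def expected_points_closed(x):
    """E(N | x) from the closed form with prefactor 1 / (1 - x)^2."""
    total, k = 0.0, 1
    while 1 - k * (1 - x) > 0:
        total += (1 - 1 / (k * (1 - x))) ** (k - 1) / k ** 2
        k += 1
    return 1 + total / (1 - x) ** 2


if __name__ == "__main__":
    for x in (0.3, 0.5, 0.75):
        print(x, expected_points_series(x), expected_points_closed(x))
```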


Hope you enjoyed this. See ya soon.

Until then, Yours Aye

Me

Saturday, July 15, 2023

Vechtmann's theorem and more on Bernoulli Lemniscate

The Bernoulli Lemniscate is one of the most beautiful curves in Mathematics. The curve, which in some sense kick-started the study of elliptic functions, has a lot of interesting properties, and some of those will be the subject of this post.

The Bernoulli Lemniscate can be defined in at least two ways. It is the locus of a point $P$ such that the product of its distances from two points, say $P_1$ and $P_2$, stays constant. Or we can define it as the inverse of the hyperbola $x^2-y^2=1$ w.r.t. the unit circle, which is the definition we will be using in this post.

$DG \cdot DF = \text{constant}$
$F \equiv (1/\sqrt{2},0)$, $G\equiv (-1/\sqrt{2},0)$

One of the first things that we might want to do is to construct a tangent at a given point on the Lemniscate.

A useful property of the hyperbola that comes in handy here is the fact that the line from the origin drawn perpendicular to the tangent of the hyperbola intersects the hyperbola again at a point which is the reflection of the tangent point w.r.t. the $x$-axis. Also, the point of intersection of the tangent and the perpendicular line is the inverse of the reflected point w.r.t. the unit circle.

$I$ and $D'$ are reflections w.r.t $x$ axis
$I$ and $H$ are reflections w.r.t the unit circle

The inverse of the tangent line shown above w.r.t. the unit circle becomes a circle passing through the origin. Because the tangent line and the blue line intersect at $H$, their corresponding inverses intersect at $I$ (which is the inverse of $H$), showing that the inverted circle also passes through $I$. Note that $BI$ is the diameter of the circle.

 


For a given point $D$ on the lemniscate, we can extend the radial line $BD$ so that it intersects the hyperbola at $D'$. As the green line is the tangent to the hyperbola at $D'$, and inversion preserves tangency, we know that the green circle must be tangent to the lemniscate at $D$. As $BD$ is a chord of the circle, the point of intersection of its perpendicular bisector (not shown in the picture below) with the diameter $BI$ gives the center of the circle $E$.

As the lemniscate and the green circle are tangent at $D$, they share a common tangent at that point. With the circle's center known, it is now trivial to draw both the tangent and normal to the lemniscate at $D$.

In summary, to draw the tangent at point $D$ on the lemniscate, extend the radial line to find $D'$ on the hyperbola. Reflect it w.r.t the $x$-axis to get $I$. We can then locate $E$ as the intersection of $BI$ and the perpendicular bisector of $BD$. Then $ED$ is the normal to the lemniscate at $D$ from which the tangent follows trivially.

Let $D \equiv (r,\theta)$ be the polar coordinates in the following figure. Because $D$ and $D'$ are inverses w.r.t the unit circle, we have $BD'=1/r$. As $D'$ and $I$ are reflections, we have $BI=BD'=1/r$ and $\angle D'BI=2\theta$.


Because $BI$ is the diameter of the circle, we know that $BD \perp DI$. Therefore using this right angled triangle, we can see that $\cos 2\theta=r/(1/r)$, thereby showing the polar equation of the lemniscate is $r^2=\cos 2\theta$.

Also, because $DE$ (which is a normal to the lemniscate at $D$) and $EB$ are radii of the same circle, triangle $BED$ is isosceles and we have $\angle BDE=\angle DBE=2\theta$, showing that the angle between the radial line and the normal is twice the radial angle. This result is called Vechtmann's theorem, which is mostly proved using calculus.

Vechtmann theorem shows that the normal to the lemniscate makes an angle of $3\theta$ with the $x$-axis i.e. three times the radial angle. Therefore, we can see that the angle between two normals (and hence two tangents) at two given points is three times the difference between the radial angle of those points.

Time to go infinitesimal. Consider two infinitesimally close points $D \equiv (r,\theta)$ and $J$. Let $DM$ be a line perpendicular to the radial line $BJ$.

From the infinitesimal right $\triangle DMN$ in the figure above, we easily see that $DM \approx r d\theta$, $DN \approx ds$ and $MN \approx dr$. Also, because the angle between the radial line and the normal at $D$ is $2\theta$, we have $\angle MDN=2\theta$. Therefore,

$\displaystyle \cos2\theta=\frac{DM}{DN}=\frac{r d\theta}{ds} \implies r \cdot ds=d\theta$ using the polar equation of the lemniscate.

The radius of curvature $R$ for a curve is given by the relation $ds=R\cdot d\varphi$ where $\varphi$ is the tangential angle (the angle a tangent makes with a common reference line). Therefore, change in tangential angle between two points is just the angle between the tangents.

We just saw that for a lemniscate, the tangential angle is thrice the polar angle. Therefore, $d\varphi=3 d\theta$. The above three relations clearly show that the curvature $\kappa$, the inverse of the radius of curvature $R$, is $3r$.

Going back again to the infinitesimal triangle, we have

$\displaystyle \sin 2\theta=\frac{dr}{ds} \implies ds=\frac{dr}{\sqrt{1-r^4}}$

again using the polar equation of the lemniscate. It is this relation that made the lemniscate an extremely important, and the first, curve to be studied in the context of elliptic integrals.

If we think about it, the idea that makes this all work is the fact that polar equation of the lemniscate is of a very specific form. In fact, we can generalize the same to see that curves defined by the polar equation

$r^n=\cos n\theta$

are all amenable to a similar approach, with which we can see that their tangential angle is $(n+1)\theta$, their curvature is $(n+1)r^{n-1}$ and their differential arc length is

$\displaystyle ds=\frac{dr}{\sqrt{1-r^{2n}}}$

These curves are called the Sinusoidal spirals, which include circles, parabolas and hyperbolas, among others, as special cases, and have many other interesting properties as well, which I leave the readers to explore.
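The curvature claim $\kappa=(n+1)r^{n-1}$ can be checked numerically with the standard polar curvature formula $\kappa=|r^2+2r'^2-rr''|/(r^2+r'^2)^{3/2}$. The sketch below is a rough check only: it uses central finite differences with a step size `h` that I picked by hand, and the function names are mine.

```python
from math import cos


def radius(theta, n):
    """r(theta) on the sinusoidal spiral r^n = cos(n * theta)."""
    return cos(n * theta) ** (1.0 / n)


def polar_curvature(theta, n, h=1e-4):
    """Curvature from the polar formula, with derivatives by central differences."""
    r0 = radius(theta, n)
    r_plus = radius(theta + h, n)
    r_minus = radius(theta - h, n)
    d1 = (r_plus - r_minus) / (2 * h)
    d2 = (r_plus - 2 * r0 + r_minus) / h ** 2
    return abs(r0 ** 2 + 2 * d1 ** 2 - r0 * d2) / (r0 ** 2 + d1 ** 2) ** 1.5


if __name__ == "__main__":
    for n in (1, 2, 3):
        for theta in (0.0, 0.1, 0.2):
            claimed = (n + 1) * radius(theta, n) ** (n - 1)
            print(n, theta, polar_curvature(theta, n), claimed)
```

The case $n=2$ is the lemniscate, where the two columns should both read $3r$.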

Until then
Yours Aye
Me

Thursday, June 29, 2023

'Leftmost' random points on Circles and Spheres

As usual, I was randomly (pun unintended) thinking about some problems in Geometric Probability and thought I would share some of those.

Consider a unit circle centered at the origin. If we select $n$ points uniformly randomly in this circle, how far 'left' will the leftmost point be?

Mathematically, I'm looking for the expected value of the abscissa of the point with the 'minimum' x-coordinate. This way, this becomes an order statistic problem on a circle.

If we parametrize the chords perpendicular to the x-axis by the central angle $2\theta$ they subtend, then the probability that a given point lies on the right of the chord is

$\displaystyle\mathbb{P}(\text{point lies on the right side of chord})=\frac{\theta-\sin\theta\cos\theta}{\pi}$

In other words, the above is the complementary CDF $\bar{F}(x)$ for a given value of $x=\cos\theta$.

Now the expected value of min. of $n$ i.i.d. random variables $X_1,X_2,\cdots,X_n$ each with a Complementary CDF of $\bar{F}(x)$ is given by

$\displaystyle\mathbb{E}(\text{min.})=\int x \,d(1-\bar{F}(x)^n)=x(1-\bar{F}(x)^n)-\int (1-\bar{F}(x)^n)\,dx$

The expected value of the min. abscissa is given by

$\displaystyle\mathbb{E}(\text{min. abscissa})=\int_\pi^0 \cos\theta \,d(1-\bar{F}(\cos\theta)^n)$

Therefore, in our case, we have

$\begin{align}\displaystyle\mathbb{E}(\text{min.})&=\cos 0 (1-\bar{F}(\cos 0)^n)-\cos\pi(1-\bar{F}(\cos\pi)^n)-\int_\pi^0 \left(1-\bar{F}(\cos\theta)^n\right)\,d(\cos\theta)\\ &= 1 - \int_0^\pi \sin\theta(1-\bar{F}(\cos\theta)^n)\,d\theta \\ &= \int_0^\pi \sin\theta\left(\frac{\theta-\sin\theta\cos\theta}{\pi}\right)^n\,d\theta -1 \end{align}$

From this, we can see

$\displaystyle \mathbb{E}(\text{min. abscissa among two points})=-\frac{128}{45\pi^2} \approx -0.2882$ using WA and

$\displaystyle \mathbb{E}(\text{min. abscissa among three points})=-\frac{64}{15\pi^2} \approx -0.4323$ using WA with

further values having slightly more complicated expressions. However, we can get somewhat decent approximations by noting the following graph.


Therefore,

$\displaystyle \mathbb{E}(\text{min. abscissa of }n\text{ points}) \approx \int_0^\pi \sin\theta \sin^{2n}(\theta/2)\,d\theta -1 = \frac{1-n}{1+n}$

which unfortunately only gets worse as $n$ gets larger.
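The exact integral is easy to evaluate numerically, which lets us see how far the $(1-n)/(1+n)$ approximation drifts. This is a minimal sketch using a hand-rolled composite Simpson rule; the function names and the number of steps are my own choices.

```python
from math import cos, pi, sin


def simpson(f, a, b, steps=2000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def expected_min_abscissa_disk(n):
    """E(min abscissa) of n uniform points in the unit disk, from the integral above."""
    def integrand(theta):
        return sin(theta) * ((theta - sin(theta) * cos(theta)) / pi) ** n
    return simpson(integrand, 0.0, pi) - 1


if __name__ == "__main__":
    for n in (1, 2, 3, 5, 10):
        print(n, expected_min_abscissa_disk(n), (1 - n) / (1 + n))
```

For $n=2$ and $n=3$ the first column should reproduce $-128/(45\pi^2)$ and $-64/(15\pi^2)$.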

A very similar approach can be used in the case of a sphere as well, just by noting that the complementary CDF is

$\displaystyle \bar{F}(x)=\frac{1}{4}(2+\cos\theta)(1-\cos\theta)^2$

for $x=\cos\theta$, using the volume of a spherical cap.

Therefore,

$\displaystyle\mathbb{E}(\text{min. abscissa of }n\text{ points})=\int_0^\pi \sin\theta \left(\frac{(2+\cos\theta)(1-\cos\theta)^2}{4}\right)^n\,d\theta - 1=-1+\int_{-1}^1 \left(\frac{(2+x)(1-x)^2}{4}\right)^n\,dx$

Because of the polynomial integrand, the expected values in the case of a sphere all turn out to be rational. In fact, by expanding the integration range (and adjusting later), we can apply Laplace's method to show that

$\displaystyle \mathbb{E}(\text{min. abscissa of }n\text{ points}) \approx \sqrt{\frac{\pi}{3n}}-1$

which gets better for large $n$.
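Since $(2+x)(1-x)^2=2-3x+x^3$, the sphere integral can be done exactly by expanding the polynomial power and integrating term by term. The sketch below does this with rational arithmetic and prints the Laplace estimate $\sqrt{\pi/(3n)}-1$ alongside; the helper names are mine.

```python
from fractions import Fraction
from math import pi, sqrt


def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (lowest degree first)."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            out[i + j] += a * b
    return out


def expected_min_abscissa_ball(n):
    """Exact E(min abscissa) of n uniform points in the unit ball."""
    base = [Fraction(2, 4), Fraction(-3, 4), Fraction(0), Fraction(1, 4)]  # (2 - 3x + x^3) / 4
    poly = [Fraction(1)]
    for _ in range(n):
        poly = poly_mul(poly, base)
    # On [-1, 1] odd powers integrate to zero and x^j (j even) integrates to 2 / (j + 1).
    integral = sum(c * Fraction(2, j + 1) for j, c in enumerate(poly) if j % 2 == 0)
    return integral - 1


if __name__ == "__main__":
    for n in (1, 2, 5, 20, 100):
        exact = expected_min_abscissa_ball(n)
        print(n, exact if n <= 2 else float(exact), sqrt(pi / (3 * n)) - 1)
```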

In fact, using the power series of the log of the integrand at its maximum of $x=-1$ and more terms in Laplace's method like we did here, we can show that

$\displaystyle \int_{-2}^1 \left(\frac{(2+x)(1-x)^2}{4}\right)^n\,dx \approx \sqrt{\frac{4\pi}{3n}}\left(1-\frac{17}{72n}\right)$

which can be used to give a slightly better approximation.

More generally, by this point, it should be clear that for $d$-dimensional case, the expected value can be written as

$\displaystyle \mathbb{E}(\text{min. abscissa of }n\text{ points in }d\text{ dimensions})=\int_{-1}^1 \bar{F}_d(x)^n\,dx-1$

where $\bar{F}_d(x)$ is the ratio of the volume of the $d$-ball cap to the volume of the $d$-ball. By symmetry, the expected value for the 'rightmost' point will be the negative of this.

An associated question could be how far is the 'leftmost' point from the center of the $d$-ball?

Let $L_d$ denote the random variable we are interested in. In the circle case, we could construct the density function of $L_d$ by proceeding as follows:

First, any of the $n$ points could be the 'leftmost' point.

Second, the probability that the point lies at a distance $r$ from the center is $2rdr$.

Third, we note that this point will be uniformly distributed on the circle of radius $r$. The probability that the line joining the point to the origin makes an angle $\theta$ with the $x$-axis is $d\theta / \pi$ (considering only the upper half of the circle).

Finally, if we want this point to have the min. abscissa, the remaining points should have abscissa greater than $r\cos\theta$.

Therefore, the pdf of $L_2$ would be

$\displaystyle f(l_2) = n \cdot 2r dr \cdot \frac{d\theta}{\pi}\cdot \left(\frac{\cos^{-1}(r\cos\theta)-r\cos\theta\sqrt{1-r^2\cos^2\theta}}{\pi}\right)^{n-1}$

with $0 \le r \le 1$ and $0 \le \theta \le \pi$

We can verify that the integral of this pdf over the interval is indeed $1$ using WA.

We then have

$\displaystyle \mathbb{E}(L_2)=\int_0^1\int_0^\pi r\cdot n \cdot 2r \cdot \frac{1}{\pi}\cdot \left(\frac{\cos^{-1}(r\cos\theta)-r\cos\theta\sqrt{1-r^2\cos^2\theta}}{\pi}\right)^{n-1}\,d\theta\,dr$

Using WA, we can see

$\displaystyle \mathbb{E}(L_2\text{ with 2 points})=\frac{2}{3}$ and $\displaystyle \mathbb{E}(L_2\text{ with 3 points})=\frac{4200\pi^2+3360\log{2}-457}{6300\pi^2}\approx 0.69677$
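A quick Monte Carlo run is a cheap way to cross-check these circle values. The sketch below samples points with radial density $2r\,dr$ and a uniform direction, then records the distance of the leftmost one; the seed and the number of trials are arbitrary choices of mine, so expect agreement only to two or three decimal places.

```python
import random
from math import cos, pi, sqrt


def leftmost_distance(n, rng):
    """Distance from the center of the leftmost of n uniform points in the unit disk."""
    best_x, best_r = 2.0, 0.0
    for _ in range(n):
        r = sqrt(rng.random())       # radial density 2r dr
        phi = 2 * pi * rng.random()  # uniform direction
        x = r * cos(phi)
        if x < best_x:
            best_x, best_r = x, r
    return best_r


def estimate_mean_distance(n, trials=200_000, seed=2023):
    """Monte Carlo estimate of E(L_2) for n points."""
    rng = random.Random(seed)
    return sum(leftmost_distance(n, rng) for _ in range(trials)) / trials


if __name__ == "__main__":
    print(estimate_mean_distance(2))  # compare with 2/3
    print(estimate_mean_distance(3))  # compare with the value quoted above
```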

We can do something similar in the Sphere case as well. The density function of $L_3$ would be

$\displaystyle f(l_3) = n \cdot 3r^2 dr \cdot \frac{2\pi r\sin\theta\cdot r d\theta}{4\pi r^2}\cdot \left(\frac{(2+r\cos\theta)(1-r\cos\theta)^2}{4}\right)^{n-1}$

using the idea that the first point will be uniformly distributed on the surface of a sphere with radius $r$ which could then be sliced into rings (as shown here) parametrized by $\theta$.

Again using WA, we can see

$\displaystyle \mathbb{E}(L_3\text{ with 2 points})=\frac{3}{4}$ and $\displaystyle \mathbb{E}(L_3\text{ with 3 points})=\frac{1719}{2240}\approx 0.7674$

Hope you enjoyed this. See ya later.

Until then
Yours Aye
Me

Saturday, June 17, 2023

Certain Inverse Sums involving Binomial Coefficients

Consider the PMF of the binomial distribution, denoted here by $f(p,k,n)$. We know what happens to this function when it is summed over $k$ or integrated over $p$ in the appropriate interval. That is,

$\displaystyle \sum_{k=0}^n\binom{n}{k}p^kq^{n-k}=1$ and $\displaystyle \int_0^1 \binom{n}{k}p^kq^{n-k} \,dp=\frac{1}{n+1}$

The first by the story of the binomial and the second by Bayes' billiard argument. But what is the sum of this PMF summed over all possible $n$? If we let $S$ denote the sum, we want

$\displaystyle S=\sum_{n=k}^\infty \binom{n}{k}p^kq^{n-k}$

This is not too hard. If we multiply the above by $p$, we can see that the summand becomes the PMF of the Negative binomial distribution and must therefore sum to $1$ by definition.  That clearly shows $S=1/p$.

We can ask the same for the PMF of the Hypergeometric distribution as well. That is, we are interested in

$\displaystyle S=\sum_{N=n+K-k}^\infty \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}$

I wasn't able to solve this directly but with some luck and Wolfram Alpha, I was able to guess that

$\displaystyle S=\sum_{N=n+K-k}^\infty \frac{\binom{K}{k}\binom{N-K}{n-k}}{\binom{N}{n}}=\frac{nK}{k(k-1)}$

At about this time, I saw the following two identities in PiMuEpsilon proved using telescoping sums.

$\displaystyle \sum_{n=k}^\infty\frac{1}{\binom{n}{k}}=\frac{k}{k-1}$ and $\displaystyle \sum_{n=m}^\infty\frac{1}{\binom{n}{k}}=\frac{k}{k-1}\binom{m-1}{k-1}^{-1}$

But with this many binomials, I was sure there must be some probabilistic interpretation for the same. And I'm glad I found one: we are going to prove both of these using probability in this post.

Consider a Polya Urn containing $k$ marbles of which one is white and the rest are black. Our task is to pick one black marble from this Urn in a maximum of $m$ draws. Because this is a Polya Urn, every time we draw a marble, we put that marble back in the Urn along with an additional marble of the same color.

Let $I_j$ ($0\leq j<m$) be an indicator random variable for the event of picking a black marble for the first time in the '$j+1$'th draw. Then,

$\displaystyle\mathbb{P}(I_j=1)=\frac{1}{k}\cdot\frac{2}{k+1}\cdot\frac{3}{k+2}\cdots\frac{j}{k+j-1}\cdot\frac{k-1}{k+j}=\frac{k-1}{k}\binom{k+j}{j}^{-1}$

Therefore,

$\displaystyle\mathbb{P}(\text{picking a black marble in }m\text{ draws})=\sum_{j=0}^{m-1}\mathbb{P}(I_j=1)=\frac{k-1}{k}\sum_{j=0}^{m-1}\binom{k+j}{j}^{-1}$

On the other hand, the probability of picking a black marble is the complement of the probability of not picking a black marble in $m$ draws.

$\displaystyle\mathbb{P}(\text{Not picking a black marble in }m\text{ draws})=\frac{1}{k}\frac{2}{k+1}\cdots\frac{m}{k+m-1}=\binom{k+m-1}{m}^{-1}$

This clearly shows that

$\displaystyle\sum_{j=0}^{m-1}\binom{k+j}{j}^{-1}=\frac{k}{k-1}\left[1-\binom{k+m-1}{m}^{-1}\right]$

The above after suitable relabelling gives,

$\displaystyle\sum_{n=k}^{k+m-1}\binom{n}{k}^{-1}=\frac{k}{k-1}\left[1-\binom{k+m-1}{k-1}^{-1}\right]$

Both the PiMuEpsilon identities given above are easy corollaries of the above identity.

We could have also started with $a$ white marbles and $b$ black marbles. In this case we would have arrived at the following result.

$\displaystyle \frac{a}{a+b}\sum_{k=0}^{c-1}\binom{b-1+k}{b-1}\binom{a+b+k}{a+b}^{-1}=1-\binom{b+c-1}{c}\binom{a+b+c-1}{c}^{-1}$

The above after some relabelling can also be written as

$\displaystyle \frac{a}{a+b}\sum_{n=b}^{c-1}\binom{n-1}{b-1}\binom{n+a}{a+b}^{-1}=1-\binom{c-1}{b-1}\binom{a+c-1}{a+b-1}^{-1}$
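Finite identities like these are easy to test exhaustively over small parameter ranges. The sketch below checks the single-white-marble identity (in its relabelled form) and the general $a$, $b$, $c$ identity (in its first form) with exact fractions; the function names and the tested ranges are mine.

```python
from fractions import Fraction
from math import comb


def lhs_single(k, m):
    """Sum of 1 / C(n, k) for n = k .. k + m - 1."""
    return sum(Fraction(1, comb(n, k)) for n in range(k, k + m))


def rhs_single(k, m):
    """k / (k - 1) * (1 - 1 / C(k + m - 1, k - 1)), valid for k >= 2."""
    return Fraction(k, k - 1) * (1 - Fraction(1, comb(k + m - 1, k - 1)))


def lhs_general(a, b, c):
    """a / (a + b) times the sum over j of C(b-1+j, b-1) / C(a+b+j, a+b)."""
    total = sum(Fraction(comb(b - 1 + j, b - 1), comb(a + b + j, a + b)) for j in range(c))
    return Fraction(a, a + b) * total


def rhs_general(a, b, c):
    """1 - C(b + c - 1, c) / C(a + b + c - 1, c)."""
    return 1 - Fraction(comb(b + c - 1, c), comb(a + b + c - 1, c))


if __name__ == "__main__":
    print(lhs_single(2, 2), rhs_single(2, 2))
    print(lhs_general(2, 3, 4), rhs_general(2, 3, 4))
```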


Hope you enjoyed this. See ya later.

Until then
Yours Aye
Me