Monday, April 30, 2018

The Birthday Problem


The Birthday problem is one of the classic problems in probability theory, concerning the probability that, in a set of $n$ randomly chosen people, some pair will have the same birthday. The idea in this post is to generalize this setup and find the expected value in this generalization.

The generalization here is this: assume that there is a very long line of persons ready to enter a very large room one by one. Each person is let in and declares her birthday upon entering the room. How many people, on average, must enter in order to find $k$ people whose birthdays are all within $d$ days of one another (given an $r$-day year)?

Non-Spoiler: Some readers may recognize this as Problem 584 of Project Euler. But note that this post gives neither the answer nor the right way to solve that problem. All we are going to do here is find an approximation of the required value, one that is far off from the correct answer.

Well, when we say 'approximation', we are immediately reminded of the Poisson approximation of the Birthday problem in Wikipedia. But most of the approximations I found on the web concern only the probability, not the expected value. A special case of this problem is solved in Strategic Practice 5 of the (amazing) Stat 110 course by the (extraordinary) Prof. Joseph Blitzstein, and I got curious as to how good the approximation really is.

Let $\mathbb{E}(k,d,r)$ be the expected number of people that have to enter the room before we have, for the first time, $k$ people with birthdays within $d$ days of one another.

Let $X_n$ be the number of $k$-set birthday matches when $n$ people have entered the room. By the Poisson paradigm, $X_n$ can be approximated by a Poisson random variable whose parameter $\lambda_n$ is given by,

$\displaystyle\lambda_n=\displaystyle\binom{n}{k}\cdot\frac{1}{(k-1)!}\left(\frac{2d+1}{r}\right)^{k-1}$  (why?)

Therefore we immediately have, $\mathbb{P}(X_n=0)=e^{-\lambda_n}$

Now, by the standard trick for expected values, $\mathbb{E}[N]=\sum_{n\geq0}\mathbb{P}(N>n)$: at least $k$ people are always needed, and for $n\geq k$ the event $\{N>n\}$ occurs exactly when $X_n=0$. So we have,

$\begin{align}
\displaystyle\mathbb{E}(k,d,r)&=k+\sum_{n=k}^\infty\mathbb{P}(X_n=0)=k+\displaystyle\sum_{n=k}^\infty e^{-\lambda_n}
\end{align}$

Though we have an infinite sum, the terms decrease rapidly because of the exponential factor. For the values given in the said Project Euler problem, this approximation gives $\mathbb{E}_{\text{Poisson}}(3,1,10)\approx 6.15092$ and $\mathbb{E}_{\text{Poisson}}(3,7,100)\approx 8.79751$.

But we can take it a little further. When the parameter of a Poisson variable gets very large, a Normal variable with the same mean and variance approximates it better. Therefore,

$\begin{align}
\displaystyle\mathbb{E}(k,d,r)&=k+\sum_{n=k}^\infty\mathbb{P}(X_n=0)\\
&=k+\displaystyle\sum_{n=k}^\infty \mathbb{P}(-1/2 \leq X_n \leq 1/2)
\end{align}$

where $X_n \sim \mathcal{N}(\lambda_n,\lambda_n)$, the interval $[-1/2, 1/2]$ being the continuity correction for the event $\{X_n=0\}$.

Here again, the terms decrease fast and the sum converges in about fifty terms. Now we have $\mathbb{E}_{\text{Normal}}(3,1,10)\approx 5.68554$ and $\mathbb{E}_{\text{Normal}}(3,7,100)\approx 8.04791$.

In fact for the required answer, the normal approximation gives $\mathbb{E}_{\text{Normal}}(4,7,365)\approx 33.6396$ with a relative error of just about $2.5\%$. Not bad.

These values were calculated with the following Mathematica code.

Clear["Global`*"];
PoissonApprox[k_, d_, r_] :=
  k + Sum[N[
     Exp[-Binomial[n, k] Power[(1 + 2 d)/r, k - 1]/
        Factorial[k - 1]]], {n, k, 100}];
NormalApprox[k_, d_, r_] :=
 k + Sum[N[
    CDF[NormalDistribution[
       Binomial[n, k] Power[(1 + 2 d)/r, k - 1]/Factorial[k - 1],
       Sqrt[Binomial[n, k] Power[(1 + 2 d)/r, k - 1]/
          Factorial[k - 1]]], 1/2] -
     CDF[NormalDistribution[
       Binomial[n, k] Power[(1 + 2 d)/r, k - 1]/Factorial[k - 1],
       Sqrt[Binomial[n, k] Power[(1 + 2 d)/r, k - 1]/
          Factorial[k - 1]]], -1/2]], {n, k, 100}]
PoissonApprox[4, 7, 365]
NormalApprox[4, 7, 365]
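
For an independent check, here's a quick Monte Carlo sketch in Python (my own, not part of the solution above). It assumes the reading that the $k$ birthdays must all fall within some window of $d+1$ consecutive days of a circular $r$-day year, and gives an estimate of the true expectation to compare the approximations against.

import random

def simulate(k, d, r, trials=5000):
    # Estimate E(k, d, r): people enter one by one with uniform birthdays
    # in a circular r-day year; stop once some window of d + 1 consecutive
    # days holds at least k birthdays. Assumes d is small relative to r.
    total = 0
    for _ in range(trials):
        counts = [0] * r
        entered = 0
        done = False
        while not done:
            counts[random.randrange(r)] += 1
            entered += 1
            done = any(sum(counts[(s + i) % r] for i in range(d + 1)) >= k
                       for s in range(r))
        total += entered
    return total / trials

print(simulate(3, 1, 10))   # compare against 6.15092 and 5.68554 above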

Hope you enjoyed this. See you again with a nice problem.


Until then
Yours Aye
Me

Tuesday, April 24, 2018

Probability on a Chess Match


Consider two friends $A$ and $B$ playing a series of bullet chess games. As the player who lost the most recent game always wants to continue, they come up with an idea. They'll play the first game. After the game, they'll toss a fair coin. If the coin comes up heads, they'll play another game; else they end the match. They'll continue this procedure until the coin turns up tails.

The question we are interested in here is the probability of either player winning the match, given the probability $\alpha$ of $A$ winning a game, the probability $\gamma$ of $B$ winning a game, and the remaining probability $\beta=1-\alpha-\gamma$ of a game ending in a draw. WLOG assume $\alpha<\gamma$.

All the equations referred to from here on are from A list of Integrals and Identities.

Let $A_{n,k}$ denote the event of $A$ winning a match of $n$ games by $k$ more wins (ignoring the coin tosses). Then,

$\displaystyle\mathbb{P}(A_{n,k})=\sum_{a+b+c =n\\a-c=k}\binom{n}{a,b,c}\alpha^a\beta^b\gamma^c$

where $a$ denotes the number of games won by $A$, $c$ the number of games won by $B$ and $b$ the number of games that ended in a draw.

First, let's tackle an easy question. That is finding the probability of the match ending in a draw.

$\begin{align}
\mathbb{P}(\text{Draw})&=\displaystyle\sum_{n=1}^\infty\mathbb{P}(A_{n,0}\text{ | }n\text{ games played})\cdot\mathbb{P}(n\text{ games being played})\\
\end{align}$

Since the probability of the match lasting exactly $n$ games is $2^{-n}$, using (1) we have,

$\begin{align}
\mathbb{P}(\text{Draw})&=\displaystyle\frac{\cot\phi}{\sqrt{\alpha\gamma}}-1\\
\end{align}$,   where $\phi=\cos^{-1}\displaystyle\left(\frac{2\sqrt{\alpha\gamma}}{1+\alpha+\gamma}\right)$
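
As a quick numerical sanity check (mine, in Python, not part of the derivation), we can truncate the series $\sum_n 2^{-n}\mathbb{P}(A_{n,0})$ directly and compare it with the closed form; the multinomial coefficient counts the ways to split $n$ games into $a$ wins apiece and $n-2a$ draws.

from math import comb, acos, cos, sin, sqrt

def p_draw_closed(alpha, gamma):
    # cot(phi)/sqrt(alpha*gamma) - 1, with phi as defined above
    phi = acos(2 * sqrt(alpha * gamma) / (1 + alpha + gamma))
    return (cos(phi) / sin(phi)) / sqrt(alpha * gamma) - 1

def p_draw_series(alpha, gamma, nmax=80):
    beta = 1 - alpha - gamma
    total = 0.0
    for n in range(1, nmax + 1):
        # P(A_{n,0}): a wins each for A and B, the other n - 2a games drawn
        p = sum(comb(n, a) * comb(n - a, a)
                * alpha**a * gamma**a * beta**(n - 2 * a)
                for a in range(n // 2 + 1))
        total += p / 2**n
    return total

print(p_draw_closed(0.3, 0.4), p_draw_series(0.3, 0.4))   # should agree closely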


Similarly, we now look at the probability of $A$ not losing the match. This time we have,

$\begin{align}
\mathbb{P}(A \text{ not losing})&=\displaystyle\sum_{n=1}^\infty\sum_{k=0}^\infty\mathbb{P}(A_{n,k}\text{ | }n\text{ games played})\cdot\mathbb{P}(n\text{ games being played})\\
\end{align}$

Using (6), this reduces to,

$\begin{align}
\mathbb{P}(A \text{ not losing})&=\displaystyle\frac{\cot\phi}{\sqrt{\alpha\gamma}-\alpha(\sec\phi-\tan\phi)}-1\\
\end{align}$

In fact, instead of using a fair coin, suppose they use a biased coin, continuing play with probability $q$ and ending the match with probability $p=1-q$. Then,

$\begin{align}
\displaystyle\frac{q}{1-q}\mathbb{P}(\text{Draw})&=\frac{\frac{1}{2}\cot\phi}{\sqrt{(q\alpha)(q\gamma)}}-1\\
\end{align}$

$\displaystyle\frac{q}{1-q}\mathbb{P}(A \text{ wins the match})=\displaystyle\frac{\frac{1}{2}\cot\phi}{(q\gamma)(\sec\phi+\tan\phi)-\sqrt{(q\alpha)(q\gamma)}}$

$\displaystyle\frac{q}{1-q}\mathbb{P}(B \text{ wins the match})=\displaystyle\frac{\frac{1}{2}\cot\phi}{(q\alpha)(\sec\phi+\tan\phi)-\sqrt{(q\alpha)(q\gamma)}}$

where $\displaystyle\phi=\cos^{-1}\left(\frac{2\sqrt{(q\alpha)(q\gamma)}}{(1-q)+q\alpha+q\gamma}\right)$
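
These expressions are easy to get wrong, so here is a short Monte Carlo sketch (my own, in Python; the parameter values are made up) that pits the closed form for $\mathbb{P}(A \text{ wins the match})$ against a direct simulation of the biased-coin match.

import random
from math import acos, cos, sin, tan, sqrt

def p_win_sim(alpha, gamma, q, trials=200000):
    # Play games (A wins w.p. alpha, B w.p. gamma); continue w.p. q
    wins = 0
    for _ in range(trials):
        score = 0
        while True:
            u = random.random()
            if u < alpha:
                score += 1
            elif u < alpha + gamma:
                score -= 1
            if random.random() >= q:   # coin says stop
                break
        wins += score > 0
    return wins / trials

def p_win_closed(alpha, gamma, q):
    qa, qg = q * alpha, q * gamma
    phi = acos(2 * sqrt(qa * qg) / ((1 - q) + qa + qg))
    half_cot = 0.5 * cos(phi) / sin(phi)
    return (1 - q) / q * half_cot / (qg * (1 / cos(phi) + tan(phi)) - sqrt(qa * qg))

print(p_win_sim(0.2, 0.3, 0.5), p_win_closed(0.2, 0.3, 0.5))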

The expressions above can be written algebraically without the use of trigonometric functions, but they are clumsy and I prefer this form. Also, there are a couple of nice details hidden in these expressions. First, with the right choice of $q$, the weaker player can decrease his opponent's chance of winning the match, which was surprising to me.

Second, in the interesting limiting case $q\to1$, the LHSs of the above three expressions can be seen as the expected number of games after which the match is level, player $A$ is in the lead, and player $B$ is in the lead, respectively.

$\displaystyle\lim_{q\to1}\displaystyle\frac{q}{1-q}\mathbb{P}(\text{Draw})=\frac{1}{|\gamma-\alpha|}-1$,

$\displaystyle\lim_{q\to1}\displaystyle\frac{q}{1-q}\mathbb{P}(A \text{ wins the match})=\frac{\alpha}{\text{max}(0,\gamma-\alpha)^2}$ and

$\displaystyle\lim_{q\to1}\displaystyle\frac{q}{1-q}\mathbb{P}(B \text{ wins the match})=\frac{\gamma}{\text{max}(0,\alpha-\gamma)^2}$

ASIDE: An interesting discussion on a related problem is given in Joint probability of geometric random variables. However, there the distributions are independent. Our problem can be stated similarly using the result that if,

$\vec{\mathbf{X}}\sim \text{Mult}(N, \vec{\mathbf{p}})$ where $N \sim \text{NBin}(m,s)$, then $X_i \sim \displaystyle\text{NBin}\left(m,\frac{s}{s+(1-s)p_i}\right)$

But the $X_i$'s are not independent (they would be had $N$ been a Poisson variable).

Even though we started with the assumption $\alpha<\gamma$, fortunately the final results hold for all values of $\alpha$ and $\gamma$. The symmetry of the formulae is really enchanting to me, and I'm greatly satisfied that I've solved this. Hope you enjoyed it.

Until then
Yours Aye
Me

A List of Integrals and Identities


A list of the integrals and identities used in solving the problem in this post.

Throughout this post we consider $0<\alpha, \gamma, r \leq 1$, and $a, b, c, k$ are non-negative integers.


(1) I'm not sure where I saw this. I'll update the reference if I find it. UPDATE: I found the reference. It's given in Advances in Combinatorial Methods and Applications to Probability and Statistics

$\displaystyle\sum_{a,b,c \geq 0 \\ a+b+c=n \\ a-c=k}\binom{n}{a,b,c}\alpha^a(1-\alpha-\gamma)^b\gamma^c=\frac{(\alpha/\gamma)^{k/2}}{\pi}\int_0^\pi  \cos k\theta\text{ } (1-\alpha-\gamma+2\sqrt{\alpha\gamma}\cos \theta)^n \, d\theta$
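
A quick numerical check of (1) in Python (mine): brute-force enumeration of the left side against a simple trapezoidal evaluation of the integral, for sample values.

from math import comb, sqrt, cos, pi

def lhs(n, k, alpha, gamma):
    beta = 1 - alpha - gamma
    total = 0.0
    for a in range(n + 1):
        c, b = a - k, n - 2 * a + k   # from a + b + c = n and a - c = k
        if c < 0 or b < 0:
            continue
        total += comb(n, a) * comb(n - a, b) * alpha**a * beta**b * gamma**c
    return total

def rhs(n, k, alpha, gamma, steps=20000):
    beta = 1 - alpha - gamma
    s = 2 * sqrt(alpha * gamma)
    f = lambda t: cos(k * t) * (beta + s * cos(t))**n
    h = pi / steps
    integral = h * ((f(0) + f(pi)) / 2 + sum(f(i * h) for i in range(1, steps)))
    return (alpha / gamma)**(k / 2) * integral / pi

print(lhs(10, 3, 0.3, 0.4), rhs(10, 3, 0.3, 0.4))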


(2) This is given in Generality of algebra. Anyway, it is easy to prove once we recognize that $r^k \cos kx=\text{Re}(r^k e^{ikx})$, after which it's just a geometric series.

$1+r\cos x+r^2 \cos 2x+r^3 \cos 3x+\cdots=\displaystyle\frac{1-r\cos x}{1-2r\cos x+r^2}$


(3) This is just pattern recognition by using different values in Mathematica.

$\displaystyle\int \frac{1-r \cos x}{1-2r \cos x +r^2}\,dx=\frac{x}{2}+\tan^{-1}\left(\frac{1+r}{1-r}\tan\left(\frac{x}{2}\right)\right)+\text{constant}$

Interestingly,

$\displaystyle\frac{1}{\pi}\int_0^\pi \frac{1-r \cos x}{1-2r \cos x +r^2}\,dx=H[1-r]$

where $H[x]$ is the Heaviside Unit step function.


(4) I definitely saw this in Wikipedia, but again I couldn't find it at first. UPDATE: Found this too, in List of definite integrals.

$\displaystyle\frac{1}{\pi}\int_0^\pi\frac{\cos kx}{\sec \phi-\cos x}\,dx=\frac{(\sec \phi - \tan \phi)^k}{\tan \phi}$


(5) Using $k=0$ in (4),

$\displaystyle\frac{1}{\pi}\int_0^\pi\frac{1}{\sec \phi-\cos x}\,dx=\cot\phi$


(6) Multiplying (4) by $r^k$ and summing across all $k$,

$\displaystyle\frac{1}{\pi}\int_0^\pi\frac{1+r\cos x+r^2 \cos 2x+r^3 \cos 3x+\cdots}{\sec \phi-\cos x}\,dx=\frac{\cot \phi}{1-r(\sec \phi - \tan \phi)}$
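
Again a small numerical check (mine, in Python): by (2), the series in the numerator of (6) sums to $\frac{1-r\cos x}{1-2r\cos x+r^2}$, so (6) can be verified directly for sample $r$ and $\phi$.

from math import cos, sin, tan, pi

def lhs(r, phi, steps=100000):
    sec = 1 / cos(phi)
    # numerator of (6) collapsed via identity (2)
    f = lambda x: (1 - r * cos(x)) / (1 - 2 * r * cos(x) + r * r) / (sec - cos(x))
    h = pi / steps
    return ((f(0) + f(pi)) / 2 + sum(f(i * h) for i in range(1, steps))) * h / pi

def rhs(r, phi):
    return (cos(phi) / sin(phi)) / (1 - r * (1 / cos(phi) - tan(phi)))

print(lhs(0.5, 0.7), rhs(0.5, 0.7))   # should agree to several decimals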



Until then
Yours Aye
Me

Thursday, April 19, 2018

A mild generalization of the Ménage problem


We all know about the classic Ménage problem, which asks for the number of different ways in which it is possible to seat a set of male-female couples at a dining table so that men and women alternate and nobody sits next to his or her partner.

Here again we consider a similar setup, except that each family consists of four members: a father, a mother, a son and a daughter. To be clear, we seek the number of different ways in which it is possible to seat a set of families at a dining table so that men and women alternate and no family ends up sitting together. Note that we consider a family to be sitting together only when all members of the same family sit next to each other.

The idea here is inspired by the Non-sexist solution of the menage problem. It is easy to see that there are $2(2n)!^2$ seatings in total in which the dinner table has a male-female alternating arrangement: $2$ ways to decide which seats are for men and which for women, and $(2n)!$ ways each to seat the men and the women.

Consider $n>1$. Let $L_n$ be the answer we are trying to find for the case of $n$ families. Then, using inclusion-exclusion, we have

$L_n=\displaystyle\sum_{k=0}^n(-1)^k\binom{n}{k}w_k$

where $w_k$ denotes the number of seatings in which a given set of $k$ families are seated together (there may be other families sitting together as well, but we don't care about them). To compute $w_k$, let's define $d_k$ as the number of ways of placing $k$ non-overlapping curved dominoes, each spanning $4$ points, on a circle of $4n$ equally spaced points.

Calculating the $d_k$'s is very easy. It is just the stars and bars problem in a circular setup. For the problem at hand, we have,

$d_k=\displaystyle\frac{4n}{4n-3k}\binom{4n-3k}{k}$

The idea behind the above formula is exactly like that of this Stack Exchange question.

Now we compute $w_k$. Note that there are $8$ ways in which a family can be seated together with men and women alternating ($2$ choices of seat parity times $4$ internal arrangements).

Once we have chosen the $k$ families, there are $2$ ways to decide which are men's seats and which are women's. We have $d_k$ ways to place $k$ dominoes on the circular table and $k!$ ways to assign the families to them. As the men's (and women's) seats are already fixed, each of the $k$ families sitting together can be arranged internally in $4$ ways. Finally, the remaining $2n-2k$ men and $2n-2k$ women can be permuted in the remaining seats. In short, we have,

$w_k=2\cdot d_k \cdot 4 ^k \cdot k! \cdot (2n-2k)!^2$

Therefore,

$L_n=\displaystyle\sum_{k=0}^n(-1)^k\binom{n}{k}\cdot 2\cdot 4^k\cdot k!\cdot (2n-2k)!^2\cdot d_k$

Substituting $d_k$ and simplifying, we finally end up with,

$L_n=2\cdot n!\displaystyle\sum_{k=0}^n(-4)^k\frac{4n}{4n-3k}\binom{4n-3k}{k}\frac{(2n-2k)!^2}{(n-k)!}$

And there we have our final answer.
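
For the record, here is a direct transcription of the final formula in Python (mine); it assumes nothing beyond the derivation above and is handy for checking small cases.

from math import comb, factorial

def L(n):
    # Seatings of n four-member families at a 4n-seat round table with the
    # sexes alternating and no family seated entirely together (n > 1)
    total = 0
    for k in range(n + 1):
        # d_k: placements of k curved dominoes on 4n circular points
        d_k = 4 * n * comb(4 * n - 3 * k, k) // (4 * n - 3 * k)
        total += (-4)**k * d_k * factorial(2 * n - 2 * k)**2 // factorial(n - k)
    return 2 * factorial(n) * total

print([L(n) for n in range(2, 6)])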

Another interesting variation of our 'generalized Ménage problem' asks for the number of ways such that no two members of the same family sit together. This seems rather complicated, in that it doesn't seem to admit any closed form. Of course, this is just my opinion.

Trying to find an answer to this variation, I started thinking about arranging balls of $n$ different colors with $c[i]$ balls of color $i$. DP seems to be the best approach, as discussed in AMBALLS - Editorial. But let's postpone that for a later post.

Hope you enjoyed this post. I'll write again with a nice post.


Until then
Yours Aye
Me

Monday, February 19, 2018

Python Code - Gauss's Circle Problem

This code uses the algorithm discussed here.

Iterative version of the code - Python

from math import sqrt, ceil, floor
from timeit import default_timer

start = default_timer()

def matmult (A, B):
    rows_A = len(A)
    cols_A = len(A[0])
    rows_B = len(B)
    cols_B = len(B[0])

    if cols_A != rows_B:
      print("Cannot multiply the two matrices. Incorrect dimensions.")
      return

    # Create the result matrix
    # Dimensions would be rows_A x cols_B
    C = [[0 for row in range(cols_B)] for col in range(rows_A)]

    for i in range(rows_A):
        for j in range(cols_B):
            for k in range(cols_A):
                C[i][j] += A[i][k] * B[k][j]
    return C

def gauss(n):
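    # Count lattice points (x, y) with x^2 + y^2 <= n, where n is the
    # *square* of the radius; works on the first quadrant, then uses symmetry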
    sn = ceil(sqrt(n))
    res = 0
    stk = [([[1, 0, 0], [0, 1, 0], [0, 0, -n]], sn, sn)]
    while stk:
        m, w, h = stk.pop()
        if w <= 0 or h <= 0:
            continue
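        # Conic coefficients: a x^2 + b xy + c y^2 + d x + e y + f = 0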
        a, b, c, d, e, f = m[0][0], 2 * m[0][1], m[1][1], 2 * m[0][2], 2 * m[1][2], m[2][2]
        if d * d - 4 * a * f < 0 or e * e - 4 * c * f < 0:
            res += w * h
            continue
        x0, y0 = (- d + sqrt(d * d - 4 * a * f)) / 2 / a, (- e + sqrt(e * e - 4 * c * f)) / 2 / c
        xint, yint, uint, vint, flag = 0, 0, 0, 0, 0
        if floor(x0) == ceil(x0):
            xint = 1
        if floor(y0) == ceil(y0):
            yint = 1
        if w <= 1:
            res += (h - ceil(y0)) * w - yint
            continue
        if h <= 1:
            res += (w - ceil(x0)) * h - xint
            continue
        if w - x0 > 1:
            res += (w - ceil(x0)) * h - xint
            stk.append((m, ceil(x0), h))
            continue
        if h - y0 > 1:
            res += (h - ceil(y0)) * w - yint
            stk.append((m, w, ceil(y0)))
            continue
        temp = (a - b + c) * (c * d * d - b * d * e + a * e * e + b * b * f - 4 * a * c * f)
        if temp < 0:
            continue
        u = ((a - b + c) * (b * e - 2 * c * d) - (b - 2 * c) * sqrt(temp)) / (a - b + c) / (4 * a * c - b * b)
        if floor(u) == ceil(u):
            uint = 1
        if u > w:
            u, flag = w, 1
        if u < 0:
            u, flag = 0, 1
        temp = (e  + b * u) * (e + b * u) - 4 * c * (f + u * (d + a * u))
        if temp < 0:
            continue
        v = (- e - b * u + sqrt(temp)) / 2 / c
        if floor(v) == ceil(v):
            vint = 1
        if v < 0:
            v = 0
        u, v = ceil(u), ceil(v)
        if u == w and v == h:
            continue
        if uint == 1 and vint == 1 and flag == 0:
            res -= 1
        res += (w - u + h - 1 - v) * (w - u + h - v) // 2
        stk.append((matmult(matmult([[1, 0, 0], [-1, 1, 0], [h - v, v, 1]], m), [[1, -1, h - v], [0, 1, v], [0, 0, 1]]), u - (h - v), h - v))
        stk.append((matmult(matmult([[1, -1, 0], [0, 1, 0], [u, w - u, 1]], m), [[1, 0, u], [-1, 1, w - u], [0, 0, 1]]), w - u, v - (w - u)))
    res = (sn - 1) * (sn - 1) - res
    res += sn - 1
    res *= 4
    res += 1
    if sn * sn == n:
        res += 4
    return res

print(gauss(10 ** 20))
print(default_timer() - start)


Recursive version of the code - Python

from math import sqrt, ceil, floor
from timeit import default_timer

start = default_timer()

def matmult (A, B):
    rows_A = len(A)
    cols_A = len(A[0])
    rows_B = len(B)
    cols_B = len(B[0])

    if cols_A != rows_B:
      print("Cannot multiply the two matrices. Incorrect dimensions.")
      return

    # Create the result matrix
    # Dimensions would be rows_A x cols_B
    C = [[0 for row in range(cols_B)] for col in range(rows_A)]

    for i in range(rows_A):
        for j in range(cols_B):
            for k in range(cols_A):
                C[i][j] += A[i][k] * B[k][j]
    return C

def poins(m, w, h):
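    # Count lattice points (x, y) with 0 <= x < w, 0 <= y < h and C(x, y) > 0,
    # where C is the conic represented by the symmetric matrix m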
    res = 0
    if w <= 0 or h <= 0:
        return 0
    a, b, c, d, e, f = m[0][0], 2 * m[0][1], m[1][1], 2 * m[0][2], 2 * m[1][2], m[2][2]
    if d * d - 4 * a * f < 0 or e * e - 4 * c * f < 0:
        return w * h
    x0, y0 = (- d + sqrt(d * d - 4 * a * f)) / 2 / a, (- e + sqrt(e * e - 4 * c * f)) / 2 / c
    xint, yint, uint, vint, flag = 0, 0, 0, 0, 0
    if floor(x0) == ceil(x0):
        xint = 1
    if floor(y0) == ceil(y0):
        yint = 1
    if w <= 1:
        return (h - ceil(y0)) * w - yint
    if h <= 1:
        return (w - ceil(x0)) * h - xint
    if w - x0 > 1:
        return (w - ceil(x0)) * h - xint + poins(m, ceil(x0), h)
    if h - y0 > 1:
        return (h - ceil(y0)) * w - yint + poins(m, w, ceil(y0))
    temp = (a - b + c) * (c * d * d - b * d * e + a * e * e + b * b * f - 4 * a * c * f)
    if temp < 0:
        return 0
    u = ((a - b + c) * (b * e - 2 * c * d) - (b - 2 * c) * sqrt(temp)) / (a - b + c) / (4 * a * c - b * b)
    if floor(u) == ceil(u):
        uint = 1
    if u > w:
        u, flag = w, 1
    if u < 0:
        u, flag = 0, 1
    temp = (e  + b * u) * (e + b * u) - 4 * c * (f + u * (d + a * u))
    if temp < 0:
        return 0
    v = (- e - b * u + sqrt(temp)) / 2 / c
    if floor(v) == ceil(v):
        vint = 1
    if v < 0:
        v = 0
    if uint == 1 and vint == 1 and flag == 0:
        res -= 1
    u, v = ceil(u), ceil(v)
    if u == w and v == h:
        return 0
    res += (w - u + h - 1 - v) * (w - u + h - v) // 2
    res += poins(matmult(matmult([[1, 0, 0], [-1, 1, 0], [h - v, v, 1]], m), [[1, -1, h - v], [0, 1, v], [0, 0, 1]]), u - (h - v), h - v)
    res += poins(matmult(matmult([[1, -1, 0], [0, 1, 0], [u, w - u, 1]], m), [[1, 0, u], [-1, 1, w - u], [0, 0, 1]]), w - u, v - (w - u))
    #print(m, w, h, u, v, res)
    return res

def gauss(n):
    m = ceil(sqrt(n))
    res = poins([[1, 0, 0], [0, 1, 0], [0, 0, -n]], m, m)
    res = (m - 1) * (m - 1) - res
    res += m - 1
    res *= 4
    res += 1
    if m * m == n:
        res += 4
    return res

print(gauss(10 ** 14))

print(default_timer() - start)


Recursive version of the code - Mathematica


Clear["Global`*"];
Poins[m0_, w0_, h0_] :=
  Poins[m0, w0, h0] =
   Block[{$RecursionLimit = Infinity, m = m0, w = w0, h = h0, a, b, c,
      d, e, f, temp, u, v, res = 0, trans, x0, y0, Flag = 0},
    If[Or[w <= 0, h <= 0], Return[0];];
    If[w < h,
     temp = {{0, 1, 0}, {1, 0, 0}, {0, 0, 1}};
     m = Transpose[temp].m.temp;
     temp = w; w = h; h = temp;
     ];
    a = m[[1, 1]]; b = 2 m[[1, 2]]; c = m[[2, 2]]; d = 2 m[[1, 3]];
    e = 2 m[[2, 3]]; f = m[[3, 3]];
    If[Or[d*d - 4*a*f < 0, e*e - 4*c*f < 0], Return[w h];];
    x0 = (-d + Sqrt[d*d - 4*a*f])/2/a;
    y0 = (-e + Sqrt[e*e - 4*c*f])/2/c;
    If[w <= 1, Return[(h - Ceiling[y0]) w - Boole[IntegerQ[y0]]];];
    If[h <= 1, Return[(w - Ceiling[x0]) h - Boole[IntegerQ[x0]]];];
    If[w - x0 > 1,
     Return[(w - Ceiling[x0]) h - Boole[IntegerQ[x0]] +
        Poins[m, Ceiling[x0], h]];
     ];
    If[h - y0 > 1,
     Return[(h - Ceiling[y0]) w - Boole[IntegerQ[y0]] +
        Poins[m, w, Ceiling[y0]]];
     ];
    temp = (a - b + c)*(c*d*d - b*d*e + a*e*e + b*b*f - 4*a*c*f);
    If[temp < 0, Return[0];];
    u = ((a - b + c)*(b*e - 2*c*d) - (b - 2*c)*Sqrt[temp])/(a - b +
         c)/(4*a*c - b*b);
    If[u > w, u = w; Flag = 1;];
    If[u < 0, u = 0; Flag = 1;];
    temp = (e + b*u)*(e + b*u) - 4*c*(f + u*(d + a*u));
    If[temp < 0, Return[0];];
    v = (-e - b*u + Sqrt[temp])/2/c;
    If[v < 0, v = 0;, v = Ceiling[v];];
    If[And[IntegerQ[u], IntegerQ[v], Flag == 0], res -= 1;];
    u = Ceiling[u]; v = Ceiling[v];
    If[And[u == w, v == h], Return[0];];
    res += (w - u + h - 1 - v)*(w - u + h - v)/2;
    trans = {{1, -1, h - v}, {0, 1, v}, {0, 0, 1}};
    res += Poins[Transpose[trans].m.trans, u - (h - v), h - v];
    trans = {{1, 0, u}, {-1, 1, w - u}, {0, 0, 1}};
    res += Poins[Transpose[trans].m.trans, w - u, v - (w - u)];
    res
    ];
GaussCnt[n0_] := Block[{n = n0, m, res},
   m = Ceiling[Sqrt[n]];
   res = Poins[{{1, 0, 0}, {0, 1, 0}, {0, 0, -n}}, m, m];
   res = (m - 1)*(m - 1) - res;
   res += m - 1; res *= 4; res += 1;
   If[m m == n, res += 4;];
   res
   ];
Rad = Power[10, 6];
GaussCnt[Rad*Rad] // AbsoluteTiming



Note that in all the codes, the core function takes as input the square of the radius of the circle under consideration.



Until then
Yours Aye
Me

A Sub-linear algorithm for Gauss's Circle problem


I was so proud of my previous two posts that I thought I would not have anything to share for a while. But I'm glad I found something very interesting.

I found this PDF, titled A Successive Approximation Algorithm for Computing the Divisor Summatory Function, a long time back (in the OEIS, I think). It gives an $O(n^{1/3})$ algorithm for calculating the divisor summatory function. I have read it many times, both because I find it interesting and because I could not fully understand the finer details, though I got the crux of what the algorithm is trying to do.

I was reading it again some time last week and was on page 6. The PDF uses the properties of a hyperbola to do the summation, and the author notes "... the hyperbolic segment has the same basic shape as the full hyperbola". Then suddenly it hit me: after all, the hyperbola is a conic section, and if the idea holds for a hyperbola, why not for a different conic? And if that conic is an ellipse (of which the circle is a special case), then pretty much the same algorithm should solve Gauss's Circle problem. Indeed it did.

As I said earlier, I could not get all the details of the original algorithm, so I proceeded in a more natural way. In essence, let

$C(x,y)=Ax^2+Bxy+Cy^2+Dx+Ey+F=0$ be a conic section with an increasing slope in the first quadrant bounded by the lines $x=w$ and $y=h$.

Then we will be counting the lattice points such that,

$0 \leq x < w$, $0 \leq y < h$ and $C(x,y) > 0$

Consider the first quadrant of the circle as shown in the left figure below. We can draw an 'almost' tangent with a slope of $-1$ as shown in the right figure below. Let $(u,v)$ be the almost-tangent point (shown as a black dot).


Now we can simply count, in $O(1)$ time, the number of lattice points lying inside the triangle that don't violate our conditions. We can then make two parallelograms that cover the rest of our region of interest, something like in the figure below.


Now comes the beautiful part. Let's concentrate on the upper parallelogram. The reasoning for the other parallelogram is almost identical except for a tiny detail.

We can transform the co-ordinate system so that the lower left corner of the upper parallelogram becomes the origin. In addition, the parallelogram can be sheared horizontally so that it becomes a rectangle. Simply put, we are trying to do the following:





Here the first figure, which is the upper parallelogram, becomes the second figure after shifting the origin and shearing. Note that the number of lattice points inside both figures remains the same after this operation. This fact can be understood with a bit of linear algebra, but let's not get into the details.

Now we are in the same situation as before: we have a conic and a bounding rectangle, and we have to count the points inside the rectangle and outside the conic. This essentially forms the crux of the recursion.

The almost-tangent point can be found from the point where a line of slope $-1$ is tangent to the conic, by solving the conic's equation together with the tangency condition. Mathematica solves the symbolic equations in the blink of an eye.

The 'exact' point of intersection $(u, v)$ is found to be,

$u_e=\displaystyle\frac{(A-B+C)(BE-2CD)-(B-2C)\sqrt{\Delta_u}}{(A-B+C)(4AC-B^2)}$ and $v_e=\displaystyle\frac{-E-Bu+\sqrt{\Delta_v}}{2C}$

where $\Delta_u=(A - B + C) (CD^2 - BDE + AE^2 + B^2F - 4ACF)$ and $\Delta_v=(E+Bu)^2 - 4C(F + u (D + Au))$

The almost-tangent point is then obtained by setting $u=\lceil u_e \rceil$ and $v=\lceil v_e \rceil$.

Now for the transformations. These can be expressed beautifully with matrices. A conic can be represented as a matrix in the following way:

$C=\mathbf{x}^T A_Q\mathbf{x}$

where the matrix $A_Q$ is given in Matrix representation of Conic Sections. Now the 'shifting and shearing' operations can be done using the following two matrices.

$X=\begin{pmatrix}1 & -1 & h-v\\0 & 1 & v \\0&0&1\end{pmatrix}$ and $Y=\begin{pmatrix}1 & 0 & u\\-1 & 1 & w-u \\0&0&1\end{pmatrix}$

Now the matrix $X^TA_QX$ gives the conic after we have translated the origin to $(u, v)$ and sheared horizontally; the matrix $Y$ does the same for the vertical shearing.
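
A tiny illustration (mine, with made-up numbers) of why the lattice count survives this: the matrices are integer with determinant $1$, so they map lattice points bijectively to lattice points, and evaluating the old conic at a point equals evaluating the transformed conic at that point's preimage.

import numpy as np

v, h = 4, 6                               # made-up values for illustration
X = np.array([[1, -1, h - v],
              [0,  1, v],
              [0,  0, 1]])                # shift-and-shear matrix from above
A = np.array([[1, 0, 0],
              [0, 1, 0],
              [0, 0, -25]])               # the circle x^2 + y^2 = 25
for p in [(3, 4), (5, 0), (1, 2)]:
    x = np.array([*p, 1])
    y = np.linalg.solve(X, x)             # preimage of p under the map
    assert abs(x @ A @ x - y @ (X.T @ A @ X) @ y) < 1e-9
print("conic values agree under the transformation")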

Finally, though we start with a simple conic, the conics that subsequently arise in the recursion are a little tricky, and there are a number of cases that have to be dealt with carefully. A better programmer could've done it in a day or two, but it took me almost ten days to finally complete the code. You can find the recursive and the iterative versions of the Python code in this post.

Assuming no precision issues, the iterative version of the code finds $N(10^{9})=3141592653590319313$ in about 5 minutes and $N(10^{10})=314159265359046103545$ in about 20 minutes, where $N(R)$ counts the lattice points in a circle of radius $R$. The timing could be halved by exploiting the eight-fold symmetry of the circle, but I haven't done that yet.

Hope you enjoyed it. See ya in the next post.


Until then
Yours Aye
Me

Sunday, January 21, 2018

Complicating a simple Probability Problem II


Hi all, we complicated a simple problem in Probability in the previous post. Here again, we do the same thing.

The setup is this. Consider the unit interval $(0, 1)$. We start at the origin and choose $n$ random points in the part of the interval that lies to our right. We then move to the $k$th of these points nearest to the origin. We again choose $n$ points in the interval to our right, and we keep moving right until the length of the interval remaining to our right falls inside $[0, x)$ for a given $x$.

Let $\mathbb{E}_{n,k}(x)$ be the expected number of times we have to generate the points (in other words, the number of moves we make). As you may have guessed, this post is about finding this number.

Okay, the procedure pretty much follows what we saw the previous time. It gets all messy and I'm not even sure how to present it, so I'm directly giving the result.

Let $P_{n,k}(z)$ denote the characteristic polynomial. We have,

$P_{n,k}(z)=\displaystyle\frac{z^\underline{n}}{n!}-(-1)^{n-k}\sum_{j=0}^{n-k}(-1)^j\binom{n-1-j}{k-1}\frac{z^\underline{j}}{j!}$

where $z^\underline{n}$ is the falling factorial.

Let $\eta_1,\eta_2,\cdots,\eta_n$ be the $n$ roots of the equation $P_{n,k}(z)=0$. Then we have,

$\begin{align}
\mathbb{E}_{n,k}(x)=\displaystyle\frac{H_n-\log{x}}{H_n-H_{n-k}}&+\frac{x^n}{n!}\sum_{r=1}^n\frac{x^{-\eta_r}\eta_r^\underline{n}}{\eta_rP'_{n,k}(\eta_r)}\\
&-\displaystyle\frac{x^nn^{n-1}}{n!(H_n-H_{n-k})}\sum_{r=1}^n\sum_{i=0}^{n-1}\sum_{j=0}^i\frac{(n-i)x^{-\eta_r}\eta_r^\underline{n}\eta_r^{j-1}a_{i-j}}{n^{i}P'_{n,k}(\eta_r)}
\end{align}$

where $a_m=[z^{n-m}]P_{n,k}(z)$ and $H_n$ is the $n$th Harmonic number.

Using this, we can see that

$\mathbb{E}_{n,1}(x)=1-n\log{x}$

$\mathbb{E}_{n,2}(x)=1+\displaystyle\frac{n(n-1)}{(2n-1)^2}(x^{2n-1}-1)-\frac{n(n-1)\log{x}}{2n-1}$
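
To convince myself, here is a quick simulation sketch (my own, in Python) under the reading that we stop once the gap remaining to the right endpoint falls inside $[0,x)$; it is compared against the two formulas above.

import random
from math import log

def sim(n, k, x, trials=20000):
    # From position p, draw n uniform points in (p, 1), move to the k-th
    # nearest to the origin, and stop once the remaining gap 1 - p is in [0, x)
    total = 0
    for _ in range(trials):
        p, moves = 0.0, 0
        while 1.0 - p >= x:
            p = sorted(random.uniform(p, 1.0) for _ in range(n))[k - 1]
            moves += 1
        total += moves
    return total / trials

n, x = 3, 0.05
print(sim(n, 1, x), 1 - n * log(x))
print(sim(n, 2, x),
      1 + n*(n-1)/(2*n-1)**2 * (x**(2*n-1) - 1) - n*(n-1)*log(x)/(2*n-1))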

Considering the number of identities involving falling factorials, there seems to be enough room for simplifying the above formula, but I am not able to see things clearly. I think adding an animation would make it easier to see what the 'procedure' means, and I am working on it. I'll update this post as soon as possible.


Until then
Yours Aye
Me