Friday, July 22, 2016

Lemma - Dyadic (XOR) Convolution Theorem


Note1: This lemma is used to prove the Dyadic Convolution Theorem.

Note2: To understand this proof, you should have a good understanding of Bit-wise operators. The first answer of this question has a nice explanation in case you need it.

Lemma Let $n(k)$ count the number of $1$s in the binary expansion of the non-negative integer $k$. We then claim that $n(i)+n(j)$ and $n(i \oplus j)$ have the same parity. Here again $\oplus$ denotes the BitXOR operator.

(In the following proof, we use $'\&'$ to denote BitAND and $'|'$ to denote BitOR)

Proof It suffices to prove that $n(i)+n(j)-n(i\oplus j)$ is even. To do this, we find $n(i|j)$ in two ways. First, $i|j$ has a 0 if both bits of $i$ and $j$ in a position are 0, while otherwise it has a 1. Therefore,

$n(i|j)=n(i \oplus j)+n(i\&j)$

Here, $n(i \oplus j)$ counts the number of 1's in positions that have a 1 in exactly one of the two numbers $i$ and $j$, while $n(i\&j)$ counts the number of 1's in positions that have a 1 in both numbers.

On the other hand, we can simply use the inclusion-exclusion principle to compute $n(i|j)$. Here we have,

$n(i|j)=n(i)+n(j)-n(i\&j)$

Equating the two we get, $n(i)+n(j)-n(i\oplus j)=2\text{ }n(i\&j)$. $\blacksquare$
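If you would like to see the lemma in action before trusting the algebra, here is a minimal brute-force sketch. The helper names `n` and `lemma_holds` are my own choices for illustration; the check simply tests the two expressions for $n(i|j)$ and the final identity for all pairs below a limit.

```python
def n(k: int) -> int:
    """Number of 1s in the binary expansion of the non-negative integer k."""
    return bin(k).count("1")


def lemma_holds(limit: int) -> bool:
    """Check the identities used in the proof for all 0 <= i, j < limit."""
    for i in range(limit):
        for j in range(limit):
            # First way of counting the 1s of i|j.
            if n(i | j) != n(i ^ j) + n(i & j):
                return False
            # Second way: inclusion-exclusion.
            if n(i | j) != n(i) + n(j) - n(i & j):
                return False
            # Equating the two: the difference is twice n(i&j), hence even.
            if n(i) + n(j) - n(i ^ j) != 2 * n(i & j):
                return False
    return True


print(lemma_holds(64))
```

This is only a finite check, of course; the proof above is what covers all $i$ and $j$.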

Dyadic (XOR) Convolution theorem

The Dyadic (XOR) convolution of the sequences $a$ and $b$ is the sequence $c=a*b$ defined by

$c_k=\displaystyle\sum_{i\oplus j = k}a_ib_j=\sum_{i}a_ib_{i\oplus k}$

where the symbol $\oplus$ stands for the bit-wise XOR operator. All three sequences must be of the same length, which must be a power of 2 (we can pad them with zeroes if they are not).
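The definition translates almost word for word into code. The following is a minimal sketch (the function names are mine, purely for illustration) that computes $c$ with the double sum over $i \oplus j = k$ and with the single sum over $i$, so you can see that the two forms agree.

```python
def xor_convolve(a, b):
    """Dyadic (XOR) convolution via the double sum over i XOR j = k."""
    size = len(a)
    # Both sequences must share a length that is a power of 2,
    # so that i ^ j always stays a valid index.
    assert size == len(b) and size > 0 and size & (size - 1) == 0
    c = [0] * size
    for i in range(size):
        for j in range(size):
            c[i ^ j] += a[i] * b[j]
    return c


def xor_convolve_single_sum(a, b):
    """Same convolution, written as c_k = sum_i a_i * b_{i XOR k}."""
    size = len(a)
    return [sum(a[i] * b[i ^ k] for i in range(size)) for k in range(size)]


a = [1, 2, 3, 4]
b = [5, 6, 7, 8]
print(xor_convolve(a, b))             # [70, 68, 62, 60]
print(xor_convolve_single_sum(a, b))  # [70, 68, 62, 60]
```

Both versions take quadratic time; the convolution theorem below is what lets us do better.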

The Dyadic (XOR) convolution theorem states that for two sequences $a$ and $b$,

$c=a*b \implies H_mc=H_ma\cdot H_mb$

where $\cdot$ denotes pointwise multiplication and $H_m$ is the Hadamard matrix. We omit the normalization of the matrix throughout this post.

Though all these are available in Wikipedia and other webpages, nowhere was I able to find a proof of the Dyadic (XOR) convolution theorem. So that is the point of this post. Let's attack the theorem head-on.

Before we proceed to the proof of the convolution theorem, we note a property of the Hadamard matrices. Indexing the rows and columns of the Hadamard matrix from 0, we can obtain the $(i,j)$ entry of the matrix directly from the following formula.

$(H_m)_{(i,j)}=(-1)^{n(i\&j)}$

where $n(k)$ counts the number of 1s in the binary expansion of the non-negative integer $k$. This definition plays a crucial role in our proof. We now have all that we need to prove the Dyadic (XOR) convolution theorem. For simplicity, we drop the suffix in the Hadamard matrix $H_m$.
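As a sanity check on the entry formula, here is a small sketch that compares it against a recursively built matrix. I am assuming the usual Sylvester construction ($H_0=[1]$, and each step places $H$, $H$, $H$, $-H$ in the four blocks), which this post does not spell out; the function names are hypothetical.

```python
def sylvester(m: int):
    """Unnormalized 2^m x 2^m Hadamard matrix (assumed Sylvester construction)."""
    h = [[1]]
    for _ in range(m):
        # Top half: [H, H]; bottom half: [H, -H].
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    return h


def hadamard_entry(i: int, j: int) -> int:
    """Entry formula (-1)^{n(i & j)}, with rows and columns indexed from 0."""
    return -1 if bin(i & j).count("1") % 2 else 1


def formula_matches(m: int) -> bool:
    h = sylvester(m)
    size = 1 << m
    return all(
        h[i][j] == hadamard_entry(i, j)
        for i in range(size)
        for j in range(size)
    )


print(all(formula_matches(m) for m in range(6)))
```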

Proof of the Convolution Theorem Notice that the $k$th element of the vector $Ha$ is given by

$(Ha)_k=(-1)^{n(0\&k)}a_0+(-1)^{n(1\&k)}a_1+(-1)^{n(2\&k)}a_2+\cdots$

We get this by directly multiplying the $k$th row of the Hadamard matrix with the vector $a$. A similar expression holds for $Hb$. Therefore, the $k$th entry of the pointwise product is

$\begin{align}
(Ha)_k\cdot(Hb)_k&=\displaystyle\left(\sum_i (-1)^{n(i\&k)}a_i\right)\left(\sum_j (-1)^{n(j\&k)}b_j\right)\\
&=\bigl((-1)^{n(0\&k)}a_0+(-1)^{n(1\&k)}a_1+(-1)^{n(2\&k)}a_2+\cdots\bigr)\bigl((-1)^{n(0\&k)}b_0+(-1)^{n(1\&k)}b_1+(-1)^{n(2\&k)}b_2+\cdots\bigr)
\end{align}$

Therefore,

$[a_ib_j](Ha)_k\cdot(Hb)_k=(-1)^{n(i\&k)+n(j\&k)}$

where $[\,]$ denotes the coefficient extraction operator.

Now the $k$th element of $Hc$ is

$(Hc)_k=(-1)^{n(0\&k)}c_0+(-1)^{n(1\&k)}c_1+\cdots$

Therefore, $[c_r](Hc)_k=(-1)^{n(r\&k)}$ from which we get

$[a_ib_{i \oplus r}](Hc)_k=(-1)^{n(r\&k)}$ (Using $c_r=a_0b_{0 \oplus r}+a_1b_{1 \oplus r}+a_2b_{2 \oplus r}+\cdots$)

Plugging in $r=i \oplus j$ gives

$[a_ib_j](Hc)_k=(-1)^{n((i \oplus j)\&k)}=(-1)^{n((i\&k) \oplus (j\&k))}$

For the Dyadic Convolution theorem to hold, we have to show that the two coefficients of $a_ib_j$ are equal, that is, that the exponents $n(i\&k)+n(j\&k)$ and $n((i\&k)\oplus(j\&k))$ have the same parity. Taking $i'=i\&k$ and $j'=j\&k$ in the lemma, we see that this is indeed true. $\blacksquare$
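To see the theorem at work numerically, here is a minimal sketch. I use the standard in-place butterfly (the fast Walsh-Hadamard transform) to compute the unnormalized product $Hv$; that algorithm is my own addition and is not derived in this post, and all function names are hypothetical. Since the unnormalized matrix satisfies $H\cdot H = 2^m I$, applying the transform once more and dividing by the length recovers $c$.

```python
import random


def wht(v):
    """Unnormalized Walsh-Hadamard transform H v (length must be a power of 2)."""
    v = list(v)
    h = 1
    while h < len(v):
        for start in range(0, len(v), 2 * h):
            for t in range(start, start + h):
                x, y = v[t], v[t + h]
                v[t], v[t + h] = x + y, x - y
        h *= 2
    return v


def xor_convolve(a, b):
    """Direct quadratic-time dyadic convolution, used as the reference."""
    c = [0] * len(a)
    for i in range(len(a)):
        for j in range(len(b)):
            c[i ^ j] += a[i] * b[j]
    return c


def theorem_holds(a, b) -> bool:
    """Check H c == (H a) . (H b) pointwise, where c is the XOR convolution."""
    lhs = wht(xor_convolve(a, b))
    rhs = [x * y for x, y in zip(wht(a), wht(b))]
    return lhs == rhs


def xor_convolve_fast(a, b):
    """Convolution via the theorem: transform, multiply, transform back."""
    product = [x * y for x, y in zip(wht(a), wht(b))]
    size = len(a)
    # H * H = size * I for the unnormalized matrix, so divide by size.
    return [value // size for value in wht(product)]


random.seed(2016)
a = [random.randint(-9, 9) for _ in range(16)]
b = [random.randint(-9, 9) for _ in range(16)]
print(theorem_holds(a, b))
print(xor_convolve_fast(a, b) == xor_convolve(a, b))
```

The integer division is exact here because the inputs are integers; with floating-point inputs you would use ordinary division instead.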

Friday, July 15, 2016

An application of Bivariate Ordinary Generating functions

It's interesting to note how each type of generating function has its own advantages and disadvantages. For example, we saw how, given an OGF or an EGF, partial sums are readily available with certain simple manipulations. But in the case of DGFs, we had to rely on Dirichlet's Hyperbola method to evaluate the partial sums.

On the other hand, if you are able to calculate the number of numbers satisfying a 'property' via DGF, then it's not too hard to also calculate the sum of numbers satisfying the same 'property' with a simple change in the DGF. However, it is not the case with OGFs.

Consider the question of finding the number of numbers with at most $n$ digits whose digit-sum is $k$. We can easily construct its OGF as follows.

$f(x) = (1+x+x^2+\cdots+x^9)^n$

The above expression simply says that each digit can take any value from 0 to 9. Expand $f(x)$ and the coefficient of $x^k$ will give you the required answer.
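Here is a minimal sketch of that expansion, with a brute-force count alongside for comparison. The helper names are mine; note that the count includes $0$ itself (digit-sum 0), since a number with fewer than $n$ digits is treated as having leading zeros.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = exponent)."""
    r = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r


def digit_sum_counts(n: int):
    """Coefficients of (1 + x + ... + x^9)^n; entry k counts digit-sum k."""
    f = [1]
    for _ in range(n):
        f = poly_mul(f, [1] * 10)
    return f


def brute_count(n: int, k: int) -> int:
    """Directly count the integers in [0, 10^n) whose digits add up to k."""
    return sum(1 for v in range(10 ** n) if sum(map(int, str(v))) == k)


counts = digit_sum_counts(3)
print(counts[3], brute_count(3, 3))
```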

Here we consider the question of finding the sum of numbers with at most $n$ digits with a digit-sum of $k$. I'm not saying that this cannot be done with a one-variable OGF, but certainly using a bivariate OGF makes things simpler. In addition to noting how much each digit adds to the digit-sum, we must also keep track of what each digit, weighted by its place value, adds to the number itself. We define some functions as follows:

$f_n(x,y) = 1+xy^{10^{n-1}}+x^2y^{2\cdot 10^{n-1}}+\cdots+x^9y^{9\cdot 10^{n-1}}$

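Each function $f_n(x,y)$ encodes the information about the $n$th digit (counting from the units place): the exponent of $x$ records what the digit adds to the digit-sum, and the exponent of $y$ records what the digit, with its place value, adds to the number. Defining a new function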

$f(x,y)=f_1(x,y)\cdot f_2(x,y)\cdots f_n(x,y)$

gives what we need. Differentiating the above generating function w.r.t. $y$ and substituting $y=1$ (if you want to know what this does, refer to 'Multivariate Generating Functions' in 'Analytic Combinatorics') tells us how much each digit contributes to the overall 'sum of numbers with at most $n$ digits'. We now let

$g(x)=\dfrac{\partial}{\partial y}f(x,y)\biggr\vert_{y=1}$

The generating function we are left with is a single-variable OGF in which the coefficient of $x^k$ gives the sum of the numbers with at most $n$ digits whose digit-sum is $k$. Play around with numbers with certain other 'properties' and you will really see what this post is trying to convey.
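To make this concrete, here is a minimal sketch under one extra step of my own: applying the product rule to $f(x,y)$ and then setting $y=1$. Each factor becomes $P(x)=1+x+\cdots+x^9$ at $y=1$, and the derivative of the $m$th factor becomes $10^{m-1}Q(x)$ with $Q(x)=x+2x^2+\cdots+9x^9$, so $g(x)=\frac{10^n-1}{9}\,Q(x)\,P(x)^{n-1}$. The function names below are hypothetical, and the brute-force loop is there only as a check.

```python
def poly_mul(p, q):
    """Multiply two polynomials given as coefficient lists (index = exponent)."""
    r = [0] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            r[i + j] += pi * qj
    return r


def digit_sum_totals(n: int):
    """Coefficients of g(x) = ((10^n - 1) / 9) * Q(x) * P(x)^(n - 1)."""
    p = [1] * 10           # P(x) = 1 + x + ... + x^9
    g = list(range(10))    # Q(x) = x + 2x^2 + ... + 9x^9
    for _ in range(n - 1):
        g = poly_mul(g, p)
    repunit = (10 ** n - 1) // 9  # 1 + 10 + ... + 10^(n-1)
    return [repunit * coeff for coeff in g]


def brute_total(n: int, k: int) -> int:
    """Directly add up the integers in [0, 10^n) whose digits sum to k."""
    return sum(v for v in range(10 ** n) if sum(map(int, str(v))) == k)


totals = digit_sum_totals(2)
print(totals[3], brute_total(2, 3))  # 66 66  (3 + 12 + 21 + 30)
```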

I think my next post will be on Multiset Cycle Indices, though I'm not sure where and/or how to start. C ya then.



Until then
Yours Aye,
Me