Fourier Analysis

Fourier analysis of ANY real valued boolean fn.

The function space for Uniform distribution

Dimensionality

Take the real space $R^{2^{n}}$: Not $\infty$ dimensional Hilbert space. Any real valued boolean function can be represented by a point in this space.

Can also be applied to randomized boolean functions: Alter expectations and probabilites to handle extra randomness.

Inner product

$\dprod{f,g} \dfn 2^{-n} \sum_{x} f(x)g(x) = E_{x \in_{U} \set{\pm 1}^{n}}[f(x)g(x)]$: defined thus scaled to make $\norm{p_{S}} = \norm{f} = 1$. If $S \neq T$: $\dprod{p_{S}, p_{T}} = 0$.

The basis

These form an alternate basis: Parity functions $p_{S}$, where S is the index vector.

The standard basis is just $\set{e_i}$.

Transforms

Discrete Fourier series

Discrete Fourier series for any real valued fn over $\set{\pm 1}^{n}$: $f = \sum_{S\subseteq [n]} \hat{f}(S)p_{S}$.

$\norm{f}^{2} = \sum_{S} \hat{f}(S)^{2} = E_{x}[f(x)^{2}] = 1$: [Parseval].

Correlation / Discrete Fourier coefficient

Aka Fourier weights on basis vectors or Fourier spectrum. $\hat{f}(S) = \dprod{f, p_{S}} = E_{x}[f(x)p_{S}(x)]= Pr_{x}(f(x) = p_S(x)) - Pr_{x}(p(x) \neq p_S(x))$: so, a measure of how good an approx of f $p_S$ is. $\dprod{f,g} = \sum_{S} \hat{f}(S) \hat{g}(S)$.

Fourier transform of f

$\hat{f}(S) = 2^{-n} \sum_{x} f(x)p_{S}(x)$. All $\hat{f}(S)$ can be computed in time $O(n2^{n})$ using FFT alg, given all f(x). \why Inverse FFT can be similarly done. Also see Functional analysis ref.

Estimating coefficients

Estimate all Fourier coefficients of f using FFT

$f_R$: A boolean fn on $\set{0,1}^m$: Take random m*n matrix, define $f_R(y) = f(yR)$. Fourier coefficients of $f_R$ are Fourier coefficients of f restricted to a random subspace.

For random non singular m*n R, with m large enough to give good estimation of $Pr(f(x) p_a(x) = 1)$ whp. Thence find approximations of all the Fourier coefficients of f by finding all coefficients of $f_R$ using FFT.

Relationship with total influence

$I(f) = \sum_{S}|S|\hat{f}(S)^{2}$. \why

Relationship with noise sensitivity

Eg: $NS_{\eps}(\Land_{i} x_{i}) = \frac{2}{2^{n}}(1-(1-\eps)^{n}) \approx 0$; For parity: $NS_{\eps}(p_{1^{n}}(x)) = 2^{-1}-2^{-1}(1-2 \eps)^{n} \approx 2^{-1}$; note correlation with good/ bad Fourier concentration.

$NS_{\eps}(f) = 2^{-1}-2^{-1} \sum_{S}(1-2\eps)^{|S|}\hat{f}(S)^{2}$:\ $Pr_{x,y}(f(x) \neq f(y)) = 2^{-1} - 2^{-1}E_{x,y}[f(x)f(y)]$; \ But $$E_{x,y}[f(x)f(y)] = E_{x,y}[(\sum_{S}\hat{f}(S)p_{S}(x)) (\sum_{T}\hat{f}(T)p_{T}(x))] \ = \sum_{S, T}\hat{f}(S)\hat{f}(T)E_{x,y}[p_{S}(x)p_{T}(y)] = \sum_{S}\hat{f}(S)\hat{f}(T)E_{x,y}[p_{S}(x)p_{S}(y)]$$; but\ $E_{x,y}[p_{S}(x)p_{S}(y)] = \prod_{i} E[p_{\set{i}}(x)p_{\set{i}}(y)] = (1-2\eps)^{|S|}$.

$\sum_{|S| \geq \eps^{-1}} \hat{f}(S)^{2} \leq NS_{\eps}(f)$, k const: $2NS_{\eps}(f) = 1 - \sum_{S}(1-2\eps)^{|S|}\hat{f}(S)^{2} = \sum_{S} \hat{f}(S)^{2}-\sum_{S}(1-2\eps)^{|S|}\hat{f}(S)^{2} = \sum_{S} \hat{f}(S)^{2}(1-(1-2\eps)^{|S|}) \approx \sum_{S} \hat{f}(S)^{2}(1-e^{-2}) \geq k’ \sum_{S} \hat{f}(S)^{2}$.

As, $\sum_{|S| \geq \eps^{-1}} \hat{f}(S)^{2} \leq NS_{\eps}(f)$, any f is $(NS_{\eps}(f), \eps^{-1})$ concentrated.

Concentration in the spectrum

Aka Fourier concentration.

(eps, d) concentrated function f

$\norm{\sum_{|S| >d} \hat{f}(S)p_{S}}_{2}^{2} \leq \eps$. Visualize with histogram.

If C = polysize DNF formulae, every $c\in C$ is $(\eps, \log |c| \log\frac{1}{\eps})$ concentrated. \why

Depth d ckts of size $|c|$: $(\eps,\log |c| \log\frac{1}{\eps})^{d-1}$ concentrated. \why

Best fitting d degree polynomial

$\norm{f-g}^{2} = \sum_{S} (\hat{f}(S)-\hat{g}(S))^{2} = E_{x}[(f(x)-g(x)))^{2}]$; \ so $\min_{g} E_{x}[(f(x)-g(x))^{2}] = \sum_{|S|\leq d} \hat{f}(S)p_{S}$.

t - sparse f

Number of non zero coefficients in f $\leq t$.

Sparse approximator for any f

$\exists \frac{\norm{f}{1}^{2}}{\eps}$ \ sparse g with $\norm{f-g}^{2} \leq \eps$: Remove all $\hat{f}< \frac{\eps}{\norm{f}{1}}$; then g is $\frac{\norm{f}{1}^{2}}{\eps}$ sparse: $\norm{f}{1} = \sum \hat{f}(S)$ and each $\hat{f}(S)$ has min size $\frac{\eps}{\norm{f}{1}}$. $$\norm{f-g}^{2} = \sum{p_{S} \notin basis(g)} \hat{f}(S)^{2} \leq \ (\max_{p_{S} \notin basis(g)}) \hat{f}(S) \sum_{p_{S} \notin basis(g)} \hat{f}(S) \leq \eps$$.

So, sgn(g) approximates f.

Weak parity learning problem

Let f have a t-heavy Fourier component; then find a t/2 heavy Fourier component. Thence find weakly correlated parity.

KM alg solves this; See colt ref.