Fourier Analysis

Fourier analysis of ANY real valued boolean fn.

The function space for Uniform distribution

Dimensionality

Take the real space R2n: Not dimensional Hilbert space. Any real valued boolean function can be represented by a point in this space.

Can also be applied to randomized boolean functions: Alter expectations and probabilites to handle extra randomness.

Inner product

f,g:=2nxf(x)g(x)=ExU{±1}n[f(x)g(x)]: defined thus scaled to make pS=f=1. If ST: pS,pT=0.

The basis

These form an alternate basis: Parity functions pS, where S is the index vector.

The standard basis is just {ei}.

Transforms

Discrete Fourier series

Discrete Fourier series for any real valued fn over {±1}n: f=S[n]f^(S)pS.

f2=Sf^(S)2=Ex[f(x)2]=1: [Parseval].

Correlation / Discrete Fourier coefficient

Aka Fourier weights on basis vectors or Fourier spectrum. f^(S)=f,pS=Ex[f(x)pS(x)]=Prx(f(x)=pS(x))Prx(p(x)pS(x)): so, a measure of how good an approx of f pS is. f,g=Sf^(S)g^(S).

Fourier transform of f

f^(S)=2nxf(x)pS(x). All f^(S) can be computed in time O(n2n) using FFT alg, given all f(x). \why Inverse FFT can be similarly done. Also see Functional analysis ref.

Estimating coefficients

Estimate all Fourier coefficients of f using FFT

fR: A boolean fn on {0,1}m: Take random m*n matrix, define fR(y)=f(yR). Fourier coefficients of fR are Fourier coefficients of f restricted to a random subspace.

For random non singular m*n R, with m large enough to give good estimation of Pr(f(x)pa(x)=1) whp. Thence find approximations of all the Fourier coefficients of f by finding all coefficients of fR using FFT.

Relationship with total influence

I(f)=S|S|f^(S)2. \why

Relationship with noise sensitivity

Eg: NSϵ(ixi)=22n(1(1ϵ)n)0; For parity: NSϵ(p1n(x))=2121(12ϵ)n21; note correlation with good/ bad Fourier concentration.

NSϵ(f)=2121S(12ϵ)|S|f^(S)2:\ Prx,y(f(x)f(y))=2121Ex,y[f(x)f(y)]; \ But Ex,y[f(x)f(y)]=Ex,y[(Sf^(S)pS(x))(Tf^(T)pT(x))] =S,Tf^(S)f^(T)Ex,y[pS(x)pT(y)]=Sf^(S)f^(T)Ex,y[pS(x)pS(y)]; but\ Ex,y[pS(x)pS(y)]=iE[p{i}(x)p{i}(y)]=(12ϵ)|S|.

|S|ϵ1f^(S)2NSϵ(f), k const: 2NSϵ(f)=1S(12ϵ)|S|f^(S)2=Sf^(S)2S(12ϵ)|S|f^(S)2=Sf^(S)2(1(12ϵ)|S|)Sf^(S)2(1e2)kSf^(S)2.

As, |S|ϵ1f^(S)2NSϵ(f), any f is (NSϵ(f),ϵ1) concentrated.

Concentration in the spectrum

Aka Fourier concentration.

(eps, d) concentrated function f

|S|>df^(S)pS22ϵ. Visualize with histogram.

If C = polysize DNF formulae, every cC is (ϵ,log|c|log1ϵ) concentrated. \why

Depth d ckts of size |c|: (ϵ,log|c|log1ϵ)d1 concentrated. \why

Best fitting d degree polynomial

fg2=S(f^(S)g^(S))2=Ex[(f(x)g(x)))2]; \ so mingEx[(f(x)g(x))2]=|S|df^(S)pS.

t - sparse f

Number of non zero coefficients in f t.

Sparse approximator for any f

\(\exists \frac{\norm{f}{1}^{2}}{\eps}\) \ sparse g with fg2ϵ: Remove all \(\hat{f}< \frac{\eps}{\norm{f}{1}}\); then g is \(\frac{\norm{f}{1}^{2}}{\eps}\) sparse: \(\norm{f}{1} = \sum \hat{f}(S)\) and each f^(S) has min size \(\frac{\eps}{\norm{f}{1}}\). $$\norm{f-g}^{2} = \sum{p_{S} \notin basis(g)} \hat{f}(S)^{2} \leq \ (\max_{p_{S} \notin basis(g)}) \hat{f}(S) \sum_{p_{S} \notin basis(g)} \hat{f}(S) \leq \eps$$.

So, sgn(g) approximates f.

Weak parity learning problem

Let f have a t-heavy Fourier component; then find a t/2 heavy Fourier component. Thence find weakly correlated parity.

KM alg solves this; See colt ref.