2 Mean and variance

Mean: estimation

Consistency

Aka Law of large numbers

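Let \(\{X_i\}\) be iid with mean \(\mu\) and variance \(\sigma^2\), and let \(\hat{X}_n = n^{-1}\sum_{i=1}^{n} X_i\). As \(var[\hat{X}_n] = \sigma^2/n \to 0\) as \(n \to \infty\), the weak law follows (by Chebyshev's inequality): \(\hat{X}_n\) is a consistent estimator of \(\mu\).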
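The shrinking variance of the sample mean can be checked by simulation. The sketch below assumes Uniform(0, 1) draws (so \(\sigma^2 = 1/12\)); the function names, trial counts and seed are illustrative choices, not part of the notes.

```python
import random
import statistics

def sample_mean(n, rng):
    """Mean of n iid Uniform(0, 1) draws (true mean 0.5, variance 1/12)."""
    return statistics.fmean(rng.random() for _ in range(n))

def variance_of_sample_mean(n, trials, rng):
    """Empirical variance of the sample mean over repeated experiments."""
    return statistics.pvariance([sample_mean(n, rng) for _ in range(trials)])

rng = random.Random(0)  # fixed seed, only for reproducibility
for n in (10, 100, 1000):
    empirical = variance_of_sample_mean(n, trials=500, rng=rng)
    theoretical = (1 / 12) / n  # sigma^2 / n
    print(n, empirical, theoretical)
```

The empirical column should track \(\sigma^2/n\) up to simulation noise.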

Normalness of estimator distribution

Aka Central limit theorem (CLT)

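Take estimator \(U_n = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}\). Then \(\lim_{n \to \infty} Pr(U_n \leq u) = \int_{-\infty}^{u} \frac{1}{\sqrt{2\pi}} e^{-t^2/2} dt\): so the CDF of \(U_n\) approaches the CDF of \(N(0, 1)\). See convergence of moment generating function (MGF) below. Also, as n increases, \(var[\bar{X}]\) becomes smaller: visualize pdfs of \(X, \bar{X}_{30}, \bar{X}_{50}\); see how the curve becomes more normal and gets thinner and taller. As a rule of thumb, the CLT approximation is commonly used when \(n > 30\).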
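A rough simulation of the statement above: standardized means of a skewed distribution should fall within \(\pm 1.96\) about 95% of the time, as for \(N(0, 1)\). The choice of Exponential(1) draws (for which \(\mu = \sigma = 1\)) and the helper names are assumptions made for illustration.

```python
import math
import random

def standardized_mean(n, rng):
    """U_n = (X_bar - mu) / (sigma / sqrt(n)) for n iid Exponential(1) draws."""
    mu, sigma = 1.0, 1.0  # mean and standard deviation of Exponential(rate=1)
    x_bar = sum(rng.expovariate(1.0) for _ in range(n)) / n
    return (x_bar - mu) / (sigma / math.sqrt(n))

def coverage(n, trials, rng, u=1.96):
    """Fraction of simulated U_n values with |U_n| <= u."""
    hits = sum(abs(standardized_mean(n, rng)) <= u for _ in range(trials))
    return hits / trials

rng = random.Random(0)  # fixed seed, only for reproducibility
print(coverage(n=50, trials=2000, rng=rng))  # compare with 0.95 for N(0, 1)
```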

Proof showing convergence to Normal MGF

Theorem: MGF \(m_{U_n}(t) \to\) MGF of \(N(0, 1)\) as \(n \to \infty\).

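Proof: iid \(\{X_i\}\). \(m_{U_n}(t) = E\left[e^{t\frac{\sum_i X_i - n\mu}{\sqrt{n}\sigma}}\right] = E\left[e^{\frac{t}{\sqrt{n}}\sum_i \frac{X_i - \mu}{\sigma}}\right] = m_Z(t/\sqrt{n})^n\) by independence: implicitly defining \(Z = \frac{X_i - \mu}{\sigma}\) with \(E[Z] = 0, var[Z] = E[Z^2] = 1\).

But, by Taylor, \(m_Z(t/\sqrt{n}) = m_Z(0) + m_Z'(0)(t/\sqrt{n}) + m_Z''(h)(t/\sqrt{n})^2(1/2!) = 1 + E[Z](t/\sqrt{n}) + m_Z''(h)\frac{t^2}{2n}\) for some \(h \in (0, t/\sqrt{n})\); so, as \(E[Z] = 0\), \(m_Z(t/\sqrt{n}) = 1 + m_Z''(h)\frac{t^2}{2n}\), where \(m_Z''(h) \to m_Z''(0) = E[Z^2] = 1\) as \(n \to \infty\). So, \(m_{U_n}(t) \to \lim_{n \to \infty}\left(1 + \frac{t^2}{2n}\right)^n = e^{t^2/2}\), the MGF of \(N(0, 1)\).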
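The final limit in the proof can be sanity-checked numerically. This small sketch only compares \((1 + t^2/(2n))^n\) with \(e^{t^2/2}\) for growing n; the function name is an illustrative choice.

```python
import math

def mgf_limit_gap(t, n):
    """|(1 + t^2 / (2n))^n - exp(t^2 / 2)| for given t and n."""
    return abs((1 + t * t / (2 * n)) ** n - math.exp(t * t / 2))

for n in (10, 1000, 10**6):
    print(n, mgf_limit_gap(1.0, n))  # the gap shrinks as n grows
```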

Normal distr: Pivotal quantity to estimate mean

Student’s t distribution is used to estimate μ when distribution is assumed to be Normal, n is small and σ is unknown. Tables only go up to n = 30 or 40. If σ were known, would use normal distribution, or if n>30 would estimate σ and use normal distribution tables.

As \(\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}\),

\(\sqrt{n}\frac{\bar{X} - \mu}{S} \sim t_{n-1}\).
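To see why the t distribution rather than the normal is needed for small n, one can simulate the pivot and look at its tails. The sketch below assumes \(N(0, 1)\) samples and the normal cutoff 1.96; the helper names and trial counts are illustrative.

```python
import math
import random
import statistics

def t_statistic(sample, mu):
    """sqrt(n) * (x_bar - mu) / S, with S the (n - 1)-denominator std deviation."""
    n = len(sample)
    return math.sqrt(n) * (statistics.fmean(sample) - mu) / statistics.stdev(sample)

def tail_fraction(n, trials, rng, cutoff=1.96):
    """Fraction of simulated pivots with |T| > cutoff, for N(0, 1) samples."""
    count = 0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        if abs(t_statistic(sample, 0.0)) > cutoff:
            count += 1
    return count / trials

rng = random.Random(0)  # fixed seed, only for reproducibility
print(tail_fraction(n=5, trials=2000, rng=rng))   # well above 0.05: heavy tails
print(tail_fraction(n=30, trials=2000, rng=rng))  # closer to the normal's 0.05
```

For small n the pivot exceeds the normal cutoff far more often than 5% of the time, which is why intervals built from normal tables would be too narrow.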

Goodness of empirical estimate

Can apply Chernoff bounds, the Azuma-Hoeffding inequality, etc. to judge goodness of empirical estimate.
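As one concrete instance, Hoeffding's inequality for n iid draws bounded in \([a, b]\) gives \(Pr(|\bar{X} - \mu| \geq \epsilon) \leq 2e^{-2n\epsilon^2/(b-a)^2}\). The sketch below evaluates this bound and inverts it for a sample size; the function and parameter names are illustrative assumptions.

```python
import math

def hoeffding_tail_bound(n, eps, low=0.0, high=1.0):
    """Upper bound on Pr(|X_bar - mu| >= eps) for n iid draws in [low, high]."""
    return min(1.0, 2.0 * math.exp(-2.0 * n * eps ** 2 / (high - low) ** 2))

def hoeffding_sample_size(eps, delta, low=0.0, high=1.0):
    """Smallest n for which the bound above is at most delta."""
    return math.ceil((high - low) ** 2 * math.log(2.0 / delta) / (2.0 * eps ** 2))

n = hoeffding_sample_size(eps=0.05, delta=0.05)
print(n, hoeffding_tail_bound(n, eps=0.05))
```

The bound is distribution-free, so it is usually looser than what the CLT approximation suggests for the same n.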

For Binary valued random variables: A/B testing confidence interval, precision calculator here.

Variance estimation

The biased and unbiased estimators

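\(S^2 = n^{-1}\sum_i (X_i - \bar{X})^2\) is biased: \(B[S^2] = n^{-1}E\left[\sum_i X_i^2 - 2\bar{X}\sum_i X_i + n\bar{X}^2\right] - \sigma^2 = n^{-1}(nE[X^2] - 2E[n\bar{X}^2] + nE[\bar{X}^2]) - \sigma^2 = n^{-1}(n\sigma^2 + n\mu^2 - n\,var[\bar{X}] - n\mu^2) - \sigma^2 = n^{-1}(n-1)\sigma^2 - \sigma^2 = -\sigma^2/n\), using \(var[\bar{X}] = \sigma^2/n\) for iid \(X_i\). So, \(S^2\) is instead defined as \((n-1)^{-1}\sum_i (X_i - \bar{X})^2\) to get an unbiased estimator. The difference between the two is small as \(n \to \infty\).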
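The bias can be observed by averaging both estimators over many small samples. This sketch assumes \(N(0, 1)\) data (so \(\sigma^2 = 1\)) and uses `statistics.pvariance` (divides by n) and `statistics.variance` (divides by n - 1) from the Python standard library; the wrapper function is illustrative.

```python
import random
import statistics

def average_variance_estimates(n, trials, rng):
    """Average the n- and (n - 1)-denominator estimates over many N(0, 1) samples."""
    biased_total = 0.0
    unbiased_total = 0.0
    for _ in range(trials):
        sample = [rng.gauss(0.0, 1.0) for _ in range(n)]
        biased_total += statistics.pvariance(sample)   # divides by n
        unbiased_total += statistics.variance(sample)  # divides by n - 1
    return biased_total / trials, unbiased_total / trials

rng = random.Random(0)  # fixed seed, only for reproducibility
biased, unbiased = average_variance_estimates(n=5, trials=4000, rng=rng)
print(biased, unbiased)  # expect roughly (n - 1) / n = 0.8 and 1.0
```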

Normal distr: Pivotal quantity to estimate variance

\(N(\mu, \sigma^2)\) assumed. If \(S^2 = \frac{\sum_i (X_i - \bar{X})^2}{n-1}\), then \(\frac{(n-1)S^2}{\sigma^2} \sim \chi^2_{n-1}\) \why.

So, can use this as pivotal quantity.
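A simulation cannot prove the distributional claim, but it can check two of its consequences: a \(\chi^2_{n-1}\) variable has mean \(n - 1\) and variance \(2(n - 1)\), whatever \(\sigma\) is. The sketch below assumes normal samples with an arbitrary \(\sigma\); the names and constants are illustrative.

```python
import random
import statistics

def pivot_moments(n, sigma, trials, rng):
    """Empirical mean and variance of (n - 1) * S^2 / sigma^2 for N(0, sigma^2) samples."""
    pivots = []
    for _ in range(trials):
        sample = [rng.gauss(0.0, sigma) for _ in range(n)]
        pivots.append((n - 1) * statistics.variance(sample) / sigma ** 2)
    return statistics.fmean(pivots), statistics.pvariance(pivots)

rng = random.Random(0)  # fixed seed, only for reproducibility
mean, var = pivot_moments(n=6, sigma=2.0, trials=4000, rng=rng)
print(mean, var)  # compare with n - 1 = 5 and 2 * (n - 1) = 10
```

Because the pivot's distribution does not depend on \(\sigma\), quantiles of \(\chi^2_{n-1}\) can be inverted into a confidence interval for \(\sigma^2\).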

Sequential data Sample statistics

k-step Moving averages

Suppose that \(ran(X_i) \subseteq R\), and that the sample size is n.

Simple moving average

This is simply the mean of the last k \(X_i\).
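A minimal sketch of a k-step simple moving average over a stream, keeping a running total so each step costs constant time. The generator name and the choice to emit values only once k observations are available are illustrative assumptions.

```python
from collections import deque

def simple_moving_average(xs, k):
    """Yield the mean of the last k values, once k values have been seen."""
    if k < 1:
        raise ValueError("k must be at least 1")
    window = deque(maxlen=k)
    total = 0.0
    for x in xs:
        if len(window) == k:
            total -= window[0]  # value about to be evicted from the window
        window.append(x)
        total += x
        if len(window) == k:
            yield total / k

print(list(simple_moving_average([1, 2, 3, 4, 5], k=3)))  # [2.0, 3.0, 4.0]
```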

Exponentially weighted

Here, one uses an exponentially decreasing weight (with decreasing i) while taking a weighted average of the last k \(X_i\).
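One way to realize this, as a sketch: give the value j steps back the weight \(\lambda^j\) and normalize by the sum of the weights. The decay parameter \(\lambda\) (`decay` below), its default of 0.5, and the function name are assumptions for illustration; the notes do not fix a particular weighting.

```python
def exponentially_weighted_average(xs, k, decay=0.5):
    """Weighted mean of the last k values; the value j steps back gets weight decay**j."""
    if k < 1 or len(xs) < k:
        raise ValueError("need 1 <= k <= len(xs)")
    if not 0.0 < decay <= 1.0:
        raise ValueError("decay must lie in (0, 1]")
    recent = xs[-k:]
    weights = [decay ** j for j in range(k)]  # j = 0 is the newest value
    weighted_sum = sum(w * x for w, x in zip(weights, reversed(recent)))
    return weighted_sum / sum(weights)

print(exponentially_weighted_average([1.0, 2.0, 3.0], k=3, decay=0.5))
```

With `decay=1.0` all weights are equal and the result reduces to the simple moving average over the same window.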

Applications

This is useful while predicting stock prices, for example.