Linear Transformation
Definition and Linearity
Linearity
\(A(ax+by) = aA(x)+bA(y)\). \(A\) is called an operator, from viewing vectors as functions.
Mapping between vector spaces
For any field \(F\), can consider linear transformations \(A: F^{n} \to F^{m}\).
Applications, examples
Vector spaces can model many real world things (see vector spaces ref), even functions. In all of these, linear transformations have deep meanings.
Over function spaces: Differentiation, integration, multiplication by fixed polynomial in \(P_n\).
Geometric operations: \(Ax\). Rotation, projection, reflection, stretching, shearing.
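A minimal numpy sketch of two of these operators in 2-d (the matrices are the standard rotation and coordinate-projection examples):

```python
import numpy as np

theta = np.pi / 4  # rotate by 45 degrees
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # rotation matrix
P = np.array([[1.0, 0.0],
              [0.0, 0.0]])                        # projection onto the x-axis

x = np.array([1.0, 1.0])
print(R @ x)  # x rotated by 45 degrees: [0., 1.41421356]
print(P @ x)  # x projected onto the x-axis: [1., 0.]
```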
As a matrix
Use action on basis vectors
Take the linear operator \(A\) and the standard basis vectors \(\set{e_i}\) of dom(A). Take an input \(x\), which, as a combination of basis vectors, is \(x = \sum_i x_i e_i\). Now, by linearity, \(A(x) = \sum_i x_i A(e_i)\).
+++(So, \(\set{A(e_i)}\) spans the range space range(A)!)+++
Matrix
Now, write a matrix \(A\), a 2-dim array of numbers such that the ith column is \(a_i = A_{:,i} = A(e_i)\). This will be an \(m \times n\) array. Reason for doing this: \(Ax\) can now be defined to equal \(A(x)\).
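As an illustration of building the matrix column by column from \(A(e_i)\), a sketch in numpy; the choice of the differentiation operator on \(P_3\) with the monomial basis \(1, x, x^2, x^3\) is just an example:

```python
import numpy as np

# Differentiation on P_3 (basis 1, x, x^2, x^3): d/dx(x^k) = k x^(k-1).
# Column i of D is the coefficient vector of A(e_i).
n = 4
D = np.zeros((n, n))
for k in range(1, n):
    D[k - 1, k] = k

p = np.array([5.0, 3.0, 0.0, 2.0])   # 5 + 3x + 2x^3
print(D @ p)                          # 3 + 6x^2 -> [3., 0., 6., 0.]
```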
Matrix * vector multiplication
Define \(Ax = \sum_i a_i x_i\). +++(Voila - linear operation represented by matrix vector multiplication!)+++
For other views of \(Ax\), see a later section.
Vector dot product
+++(This also defines row vector * column vector multiplication!)+++ This is also the standard inner product.
Row view: inner product with rows
Take the standard inner product. Then, \(Ax\) is the vector formed by the entries \(\dprod{A_{i,:}, x}\).
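A small numpy check that the column view \(\sum_i x_i a_i\) and the row view above agree with \(Ax\) (the matrix and vector are arbitrary examples):

```python
import numpy as np

A = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])
x = np.array([10.0, -1.0])

col_view = sum(x[i] * A[:, i] for i in range(A.shape[1]))      # sum_i x_i a_i
row_view = np.array([A[i, :] @ x for i in range(A.shape[0])])  # inner product per row
assert np.allclose(col_view, A @ x) and np.allclose(row_view, A @ x)
```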
Changing basis vectors
Representation of the same geometric point can change with the choice of bases.
Take the point \(x = \sum_i x_i e_i\), written according to the old standard basis \(\set{e_i}\). Express each \(e_i\) in terms of the new basis \(\set{e'_j}\): \(e_i = \sum_j u_{j,i} e'_j\); write these coefficients as the vector \(u_i\). Thence, construct the matrix U with columns \(u_i\). Then, \(Ux = y\) is the representation of the point in terms of the new basis \(\set{e'_j}\).
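A sketch of this change of coordinates in numpy, assuming the new basis vectors are supplied as the columns of a matrix B (so that \(U = B^{-1}\)):

```python
import numpy as np

# New basis vectors, written in the old (standard) basis, as columns of B.
B = np.array([[1.0, 1.0],
              [0.0, 1.0]])
# Column i of U holds the coordinates of e_i in the new basis, so U = B^{-1}.
U = np.linalg.inv(B)

x = np.array([3.0, 2.0])        # point, in old-basis coordinates
y = U @ x                        # same point, in new-basis coordinates
assert np.allclose(B @ y, x)     # reassembling from the new basis recovers x
```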
Representation wrt i/p and o/p bases
Similarly, representation of a linear transformation can also change with choice of bases.
When the right input and output bases are chosen, every matrix is actually diagonal, i.e., mere scaling: see SVD.
Forward operation: Ax
Spaces related to range(A)
Range space
As \(A\) is linear, \(range(A) = \set{Ax : x \in dom(A)}\) is the vector (sub)space \(\linspan{a_i}\) by definition of matrix-vector multiplication. So, aka the column space.
By linearity, \(A\) maps \(x = x_{r}+x_{n} \in C^{n}, x_n \in N(A), x_r \perp N(A) \) to \(Ax_r \in C^{m}\).
Left null space
Also, the orthogonal complement: \(range(A)^{\perp} = N(A^{T})\), the left null space; both lie in \(C^{m}\).
Subspaces of the domain
Null space (Kernel)
N(A): \(\set{x: Ax = 0}\). By linearity of \(A\), this is a vector space. dim N(A) = degrees of freedom = number of free variables.
Left null space is \(N(A^{T})\).
Find null-space
Ax=0; reduce \(A\) to U; identify pivot and free vars; find values of pivot vars in terms of the f free vars; rewrite the solution as a combination of f vectors (a basis of the null space). If \(N(A)=\set{0}\), every \(b \in range(A)\) is a unique combination of the columns of \(A\).
Similarly, find a basis for the left null space \(N(A^{T})\).
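The elimination recipe above is the by-hand route; a numerical sketch instead reads both bases off the SVD (the helper name and tolerance are illustrative): right singular vectors with \(\sw_i \approx 0\) span \(N(A)\).

```python
import numpy as np

def null_space_basis(A, tol=1e-12):
    """Orthonormal basis of N(A): right singular vectors with sigma ~ 0."""
    U, s, Vh = np.linalg.svd(A)
    rank = int(np.sum(s > tol))
    return Vh[rank:].T                      # each column is a basis vector

A = np.array([[1.0, 2.0, 3.0],
              [2.0, 4.0, 6.0]])             # rank 1
N = null_space_basis(A)                      # 3x2: dim N(A) = 3 - 1 = 2
assert np.allclose(A @ N, 0)
left_N = null_space_basis(A.T)               # basis of the left null space N(A^T)
assert np.allclose(A.T @ left_N, 0)
```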
Row space
This is the space spanned by the rows of \(A\), so it is \(S = range(A^{*})\). \(S \perp N(A)\) wrt the standard inner product, and \(S \oplus N(A) = F^{n}\).
Every \(x\) with \(Ax \neq 0\) has a component in the row space.
SVD view
See elsewhere.
Rank of A
Row and column ranks
The number of linearly independent rows in \(A\) is the row rank; the column rank is defined similarly. So, column rank = dim(range(A)).
Row rank = column rank
Do triangular row elimination to get PA = LU. Then, row rank = number of non-zero rows / pivots in U.
But every column \(a_i\) of \(A\) corresponding to a 0 pivot is a linear combination of the non-0 pivot columns: construct the matrix A' with only the pivot columns, and solve \(A'x = a_i\) using triangular row elimination. So, column rank = number of non-zero pivots.
SVD view
Take the SVD: \(A = U \SW V^{*}\). The rank r of \(A\) is the number of non-zero \(\sw_i\); can take the reduced SVD: \(A = U_r \SW_r V_r^{*}\). \(A\) actually acts between r-dim subspaces.
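A numpy sketch: read r off the singular values and rebuild \(A\) from the reduced SVD (the example matrix is rank 1 by construction):

```python
import numpy as np

A = np.array([[2.0, 4.0],
              [1.0, 2.0],
              [3.0, 6.0]])                 # rank 1: second column = 2 * first
U, s, Vh = np.linalg.svd(A)
r = int(np.sum(s > 1e-12))                  # numerical rank = 1
A_r = U[:, :r] @ np.diag(s[:r]) @ Vh[:r]    # reduced SVD rebuilds A exactly
assert r == 1 and np.allclose(A_r, A)
```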
Invertibility
Both the row space and range(A) have dimension r. Every \(x_{r} \in range(A^{T})\) is mapped to a unique \(Ax_{r}\), and every \(Ax_{r} \in range(A)\) comes from a unique \(x_{r}\): else, if \(Ax_{r'} - Ax_{r} = 0\), then \(x_{r'} - x_{r}\) lies in both the row space and N(A), so it is 0. Also, \(A \in C^{m \times n}\) (\(m \geq n\)) has full column rank iff no two \(x_{r}\) are mapped to the same \(Ax_{r}\). So, \(A\) is invertible (both as a function and group-theoretically) iff r=m=n: then every b has a unique decomposition in the basis of columns.
Else, use the pseudoinverse to get a left inverse if \(m>n\), a right inverse if \(n>m\). To invert Ax=b for \(n>m\), identify n-m linearly dependent columns in \(A\), set the corresponding \(x_{i} = 0\), drop those columns to get A', and solve \(A'x' = b\) for x'.
Identity operation
\(Ix = x\).
Matrix multiplication: AB
Composition of transformations
AB is defined to conform to the composed transformation: \((AB)x = A(Bx)\). So, a product like ABCx is associative.
As sum of rank 1 matrices
\(u\otimes v = uv^{T}\) is a rank-1 matrix. \(AB = a_{1}b_{1}^{*} + \dots + a_{p}b_{p}^{*}\), where \(b_{i}^{*} = B_{i,:}\). This ensures that \(ABx = \sum_i (B_{i,:}x) a_i\), as intended! +++(Remarkable!)+++
Similarly, for a diagonal matrix D: \(ADB = \sum_{i} a_{i}d_{i,i}b_{i}^{*}\). So, for symmetric \(S = U \SW U^{T}\) with columns \(u_i\): \(A^{T}SB = A^{T}U\SW U^{T}B = \sum_i \sw_i (A^{T}u_i)(u_i^{T}B)\); in particular, for vectors, \(a^{T}Sb = \sum_i \sw_i (u_i^{T}a)(u_i^{T}b)\).
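A numpy check of both rank-1 expansions, AB and ADB (random matrices; sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((3, 4))
B = rng.standard_normal((4, 2))

# AB as a sum of rank-1 outer products: columns of A times rows of B.
AB = sum(np.outer(A[:, i], B[i, :]) for i in range(A.shape[1]))
assert np.allclose(AB, A @ B)

# With a diagonal D in between: ADB = sum_i d_i a_i b_i^T.
d = rng.standard_normal(4)
ADB = sum(d[i] * np.outer(A[:, i], B[i, :]) for i in range(4))
assert np.allclose(ADB, A @ np.diag(d) @ B)
```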
Elementwise description
From the sum of rank 1 matrices form: \((AB)_{i,j} = A_{i,:}B_{:,j}\).
Columns of the result
A acts on B's rows; B acts on A's columns. Every column of C=AB is a linear combination of A's columns according to the corresponding column of B: \(c_i = Ab_i\).
Form of ABC
Consider the expansion of a quadratic functional (see vector spaces ref). Similarly, D = ABC has \(D_{i,j} = A_{i,:}(BC_{:,j})\).
So, \(D_{i,j} = \sum_{k,l} A_{i,k}B_{k,l}C_{l,j}\) is a linear combination of the entries of \(B\).
Computation
Needs \(O(n^{3})\) naively; \(O(n^{2.81})\) by Strassen's algorithm. If sparse, with \(\nu\) non-zeros per row, needs \(O(n^{2}\nu)\).
Rank
rank(AB) \(\leq\) min(rank(A), rank(B)): every column of AB is a combination of the columns of A, so rank(AB) \(\leq\) rank(A); every row of AB is a combination of the rows of B, so rank(AB) \(\leq\) rank(B).
Inverse and transpose
\((AB)^{-1}=B^{-1}A^{-1}\) (take ABx to x). \((AB)^{*}=B^{*}A^{*}\) from the defn.
So, as \(I = (AA^{-1})^{*} = (A^{-1})^{*}A^{*}\), \((A^{-1})^{*} = (A^{*})^{-1}\) (aka \(A^{-*}\)). Also, \((AA^{*})^{*} = AA^{*}\), so \(AA^{*}\) and \(A^{*}A\) are Hermitian.
Block matrices
Block multiplication works.
Other matrix products
Entrywise product
Aka Hadamard product, Schur product: \((A.B)_{i,j} = A_{i,j}B_{i,j}\).
Kronecker product
Aka outer product. \(A\) is \(m\times n\), \(B\) is \(p \times q\); \(C = A \kron B\) is the \(mp \times nq\) block matrix with blocks \(C_{i,j} = A_{i,j}B\).
From the defn: \(\exists A, B: A \kron B \neq B \kron A\); \((A \kron B)^{T} = A^{T} \kron B^{T}\); \(A \kron (B+C) = A \kron B + A \kron C\); \(aA \kron bB = ab\,(A \kron B)\). \((A \kron B)(C \kron D) = AC \kron BD\) by block multiplication. So, \((A \kron B)^{-1} = A^{-1} \kron B^{-1}\). Also, using \(QA = LDU\): \(rank(A \kron B) = rank(A)\,rank(B)\).
\(\ew\) vector: \(\ew(A \kron B) = \ew(A)\kron \ew(B)\): take eigenpairs of \(A\) and \(B\): \((\ew_{i}, v_{i}), (\mu_{j}, u_{j})\); \((A \kron B)(v_{i} \kron u_{j}) = Av_{i} \kron Bu_{j} = \ew_{i}\mu_{j}(v_{i} \kron u_{j})\).
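A numpy sketch verifying the mixed-product property and the eigenpair computation above (random matrices; only the first eigenpair is checked):

```python
import numpy as np

rng = np.random.default_rng(1)
A = rng.standard_normal((3, 3))
B = rng.standard_normal((2, 2))
C = rng.standard_normal((3, 3))
D = rng.standard_normal((2, 2))

# Mixed-product property: (A kron B)(C kron D) = (AC) kron (BD).
assert np.allclose(np.kron(A, B) @ np.kron(C, D), np.kron(A @ C, B @ D))

# Eigenpair check: (A kron B)(v kron u) = (lambda * mu)(v kron u).
lam, V = np.linalg.eig(A)
mu, W = np.linalg.eig(B)
w = np.kron(V[:, 0], W[:, 0])
assert np.allclose(np.kron(A, B) @ w, lam[0] * mu[0] * w)
```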
Inverse operation: Solve Ax = b for x
Solvability Assumption
\(b \in range(A)\).
Left and right inverses
Right inverse \(A_{r}^{-1}\): \(I=AA_{r}^{-1}\). Similarly, the left inverse \(A_{l}^{-1}\) is defined: \(A_{l}^{-1}A = I\).
Existence conditions
The right inverse exists exactly when you can solve \(Ax = b\) for every \(b \in C^{m}\), i.e., full row rank (r=m). The left inverse exists exactly when you can solve \(x^{T}A = b^{T}\) for every \(b \in C^{n}\), i.e., \(N(A) = \set{0}\): full column rank (r=n).
Equivalence if both exist
If both left and right inverses exist, they're the same: grouping \(B_{l}^{-1}BB_{r}^{-1}\) both ways, \(B_{l}^{-1}(BB_{r}^{-1}) = B_{l}^{-1}\) and \((B_{l}^{-1}B)B_{r}^{-1} = B_{r}^{-1}\).
Solutions
Full column rank
If \(A\) has full column rank, you have a unique solution; the left inverse exists and \(x = A_{l}^{-1}b\). Proof: triangularization by row elimination goes through.
Rank-deficient matrix
If \(A\) is column-rank deficient, e.g., short and fat, you have an underdetermined set of linear equations: many 'equally good' solutions exist, but you may want one with certain properties (like sparsity). See numerical analysis ref for details.
Finding solutions
See numerical analysis reference for various techniques.
Finding inverses
Find the right inverse via Gauss-Jordan: make the augmented matrix \([A : I]\); use row operations to reduce it to \([I : A^{-1}]\). So, \(\exists A^{-1}\) (right) iff r=m. Similar trick for the left inverse.
Or make cofactor matrix C, use \(A^{-1} = \frac{C^{T}}{det(A)}\).
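A minimal numpy sketch of the Gauss-Jordan route above, with partial pivoting added for stability (assumes a square, invertible \(A\)):

```python
import numpy as np

def gauss_jordan_inverse(A):
    """Reduce the augmented matrix [A | I] to [I | A^{-1}] by row operations."""
    n = A.shape[0]
    M = np.hstack([A.astype(float), np.eye(n)])
    for i in range(n):
        p = i + np.argmax(np.abs(M[i:, i]))   # partial pivoting
        M[[i, p]] = M[[p, i]]                  # swap rows i and p
        M[i] /= M[i, i]                        # scale pivot row
        for j in range(n):
            if j != i:
                M[j] -= M[j, i] * M[i]         # eliminate column i elsewhere
    return M[:, n:]

A = np.array([[2.0, 1.0], [1.0, 3.0]])
assert np.allclose(gauss_jordan_inverse(A) @ A, np.eye(2))
```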
Block matrices
Block Gaussian elimination
Take \(X = \mat{A & B \\ C & D}\), solve \(X \mat{x \\ y} = \mat{u \\ v}\) for \(\mat{x \\ y}\), thence derive \(X^{-1}\). Solving for x in the top equation and substituting it in the bottom, you will have reduced the problem to:
\(\mat{I & A^{-1}B \\ 0 & S} \mat{x \\ y} = \mat{A^{-1} & 0 \\ -CA^{-1} & I} \mat{u \\ v}\), where \(S = D-CA^{-1}B\).
Block LU
Thence, get the block LU, aka Leibniz factorization: \(X = \mat{A & 0 \\ C & I} \mat{I & A^{-1}B \\ 0 & S}\).
Schur complement
S is called the Schur complement of \(A\) in X.
Inverse of X
Do block back substitution to get \(X^{-1}\).
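A numpy check of the block LU factorization via the Schur complement (the diagonal shifts are only there to keep \(A\) safely invertible; sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)
A = rng.standard_normal((3, 3)) + 3 * np.eye(3)   # keep A invertible
B = rng.standard_normal((3, 2))
C = rng.standard_normal((2, 3))
D = rng.standard_normal((2, 2))

X = np.block([[A, B], [C, D]])
Ainv = np.linalg.inv(A)
S = D - C @ Ainv @ B                               # Schur complement of A in X

# Block LU: X = [[A, 0], [C, I]] @ [[I, A^{-1}B], [0, S]].
L = np.block([[A, np.zeros((3, 2))], [C, np.eye(2)]])
Uf = np.block([[np.eye(3), Ainv @ B], [np.zeros((2, 3)), S]])
assert np.allclose(L @ Uf, X)
```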
Pseudoinverse for long thin A
Projector + inverse. For rectangular matrices that are not column-rank deficient: \(m\geq n\). Takes \(b = A(x_{r}+x_{n})\) to \(x_{r}\) in the row space; cannot revert \(Ax_{n}=0\) (\(x_{n} \in N(A)\)).
\(A^{+} = (A^{*}A)^{-1}A^{*} = \hat{R}^{-1}\hat{Q}^{*}\) (as \(A=\hat{Q}\hat{R}\)) \(= V\hat{\SW}^{-1}\hat{U}^{*}\) (from the SVD).
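A numpy sketch checking two of these expressions for \(A^{+}\) against np.linalg.pinv and the least-squares solution (a random full-column-rank \(A\) is assumed):

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((5, 3))                    # long thin, full column rank
b = rng.standard_normal(5)

pinv_normal = np.linalg.inv(A.T @ A) @ A.T         # (A^T A)^{-1} A^T
Q, R = np.linalg.qr(A)                             # reduced QR: A = QR
pinv_qr = np.linalg.solve(R, Q.T)                  # R^{-1} Q^T
assert np.allclose(pinv_normal, np.linalg.pinv(A))
assert np.allclose(pinv_qr, np.linalg.pinv(A))
assert np.allclose(pinv_qr @ b, np.linalg.lstsq(A, b, rcond=None)[0])
```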
Restriction to a subspace S
Let Q be the matrix whose columns form an orthonormal basis of S. Then, the projection of \(a_i\) onto S is \(QQ^{T}a_i\). So, \(Ax = \sum_i x_i a_i\), when restricted to S, becomes \(\sum_i x_i QQ^{T}a_i\). So, \(QQ^{T}A\) is the operator \(A\) with its output restricted to the subspace S.
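A small numpy example, taking S to be the xy-plane in \(R^{3}\) (the 3x2 basis matrix Q below is illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((3, 3))
# S = xy-plane in R^3; columns of Q are an orthonormal basis of S.
Q = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.0, 0.0]])
P = Q @ Q.T                       # orthogonal projector onto S

x = rng.standard_normal(3)
y = P @ (A @ x)                   # apply A, then project the output onto S
assert np.allclose(y, (P @ A) @ x)
assert np.allclose(y[2], 0.0)     # the result lies in the xy-plane
```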
Submatrices
Principal submatrices: \(A_{1:k, 1:k}\ \forall k\). For each principal submatrix, a Schur complement is defined: see the determinant section.
Random matrices
Random projections
(Johnson Lindenstrauss) See randomized algorithms survey.