MATHS 253, Algebra. Lecture 16
4 April 2022
Today’s topic: Conics and Quadrics
Colour code: Red
Preparation: Go through the text: Textbook: §5.5, pp. 432–440. Lecture Notes: Sec. 6.3.
Conic Sections
Geometrically, conic sections are defined as the intersection of a plane with a right circular cone.
These are the parabola, the ellipse and the hyperbola, shown below:
The equation of the conic section
Algebraically, with the help of calculus, it is not too difficult to show that in any coordinate system in the intersecting plane the equation of the curve will be
ax_1^2 + 2bx_1x_2 + cx_2^2 + dx_1 + ex_2 + f = 0
for some real numbers a, b, c, d, e, f.
This can be written as
x^T A x + Mx + f = 0,
where
A = \begin{bmatrix} a & b \\ b & c \end{bmatrix}, M = \begin{bmatrix} d & e \end{bmatrix}, x = \begin{bmatrix} x_1 \\ x_2 \end{bmatrix},
and
ax_1^2 + 2bx_1x_2 + cx_2^2 = x^T A x
is the associated quadratic form.
Parabola
y = ax^2 or x = ay^2. The parabola opens along the axis corresponding to the variable that is not squared.
The picture shows x = ay^2.
Ellipse
x_1^2/a^2 + x_2^2/b^2 = 1.
An ellipse looks like an elongated circle. It extends in the x_1-axis direction from −a to a and in the x_2-axis direction from −b to b. For example,
x^2/4 + y^2/9 = 1.
The points (0, ±3) are called the vertices of the ellipse.
Hyperbola
x_1^2/a^2 − x_2^2/b^2 = 1 or x_2^2/a^2 − x_1^2/b^2 = 1.
The hyperbola never crosses the axis corresponding to the variable with the negative sign, e.g.,
−x^2/4 + y^2/9 = 1 and x^2/4 − y^2/9 = 1.
Example
Classify the curve
3x^2 − 4xy + 3y^2 = 5
(i.e., say whether it is a parabola, ellipse or hyperbola, etc.), calculate its parameters and draw the picture.
We need to diagonalise this quadratic form. As
\begin{vmatrix} 3−λ & −2 \\ −2 & 3−λ \end{vmatrix} = λ^2 − 6λ + 5 = (λ − 1)(λ − 5),
in the coordinates relative to the basis of principal axes the equation of the curve will be
z^2 + 5t^2 = 5, or z^2/5 + t^2/1 = 1.
So this is an ellipse whose basis B of principal axes consists of the normalised eigenvectors
v_1 = (1/√2)(1, 1)^T, v_2 = (1/√2)(−1, 1)^T.
The vertices of this ellipse in the new coordinate system will be (±√5, 0). The vertices in the old coordinates will be
P_{E←B} \begin{bmatrix} √5 \\ 0 \end{bmatrix} = (1/√2) \begin{bmatrix} 1 & −1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} √5 \\ 0 \end{bmatrix} = \begin{bmatrix} √(5/2) \\ √(5/2) \end{bmatrix},
P_{E←B} \begin{bmatrix} −√5 \\ 0 \end{bmatrix} = (1/√2) \begin{bmatrix} 1 & −1 \\ 1 & 1 \end{bmatrix} \begin{bmatrix} −√5 \\ 0 \end{bmatrix} = \begin{bmatrix} −√(5/2) \\ −√(5/2) \end{bmatrix},
hence the vertices are at (√5/√2, √5/√2) and (−√5/√2, −√5/√2).
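The computation above is easy to confirm numerically. The following sketch is my own (not from the slides), using numpy's eigh on the matrix of the quadratic form; all variable names are mine.

```python
import numpy as np

# Matrix of the quadratic form 3x^2 - 4xy + 3y^2.
A = np.array([[3.0, -2.0],
              [-2.0, 3.0]])

# eigh is for symmetric matrices: eigenvalues in ascending order,
# columns of V are orthonormal eigenvectors (the principal axes).
lam, V = np.linalg.eigh(A)
print(lam)                   # [1. 5.]  ->  z^2 + 5 t^2 = 5, an ellipse
print(V)                     # principal axes, up to sign

# Vertex (sqrt(5), 0) in principal-axes coordinates, back in old coordinates:
print(V[:, 0] * np.sqrt(5))  # about (1.58, 1.58) = (sqrt(5/2), sqrt(5/2)), up to sign
```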
Geometric property of a parabola
A parabola is the set of points in a plane that are equidistant from a fixed point F (called the focus) and a fixed line (called the directrix).
|PF| = √(x^2 + (y − p)^2) = |y + p| (the distance to the line)
x^2 + (y − p)^2 = (y + p)^2
x^2 = 4py.
Geometric property of an ellipse
An ellipse is the set of points in a plane the sum of whose distances from two fixed points F_1 and F_2 (called the foci) is a constant.
|PF_1| + |PF_2| = 2a
√((x + c)^2 + y^2) + √((x − c)^2 + y^2) = 2a
(a^2 − c^2)x^2 + a^2y^2 = a^2(a^2 − c^2)
x^2/a^2 + y^2/(a^2 − c^2) = 1.
Geometric property of a hyperbola
A hyperbola is the set of points in a plane the difference of whose distances from two fixed points F_1 and F_2 (called the foci) is a constant, i.e., |PF_1| − |PF_2| = ±2a:
x^2/a^2 − y^2/b^2 = 1, where b^2 = c^2 − a^2.
As x and y get large, x^2/a^2 = 1 + y^2/b^2 behaves like x^2/a^2 = y^2/b^2, i.e., ±x/a = y/b, and the graph approaches the asymptotes y = ±(b/a)x. If the hyperbola is rotated through 90°, the minus sign shifts from y^2 to x^2.
Getting rid of linear terms
We shift a conic h units to the right and k units up by taking the standard equation (of a parabola, ellipse or hyperbola) and replacing x and y by x − h and y − k.
Sketch the conic
9x^2 − 4y^2 − 72x + 8y + 176 = 0.
After completing squares and dividing by 36 we get
−(x − 4)^2/4 + (y − 1)^2/9 = 1.
Getting rid of linear terms is not always possible
Sketch the conic and find the vertex and the focus:
y^2 − 8y = 6x − 16.
Solution: After completing squares we get
(y − 4)^2 = 6x.
Hence the vertex is at (0, 4) and (since 4p = 6) the focus is at (1.5, 4).
When there is no square you cannot get rid of the corresponding linear term.
Getting rid of cross-product terms
The presence of cross-product and linear terms means that the conic has been rotated and translated out of standard position, e.g.,
5x_1^2 − 4x_1x_2 + 8x_2^2 + 4√5 x_1 − 16√5 x_2 + 4 = 0
is a shifted and rotated ellipse.
Example continued
We must choose new axes which coincide with the principal axes of the associated quadratic form
5x_1^2 − 4x_1x_2 + 8x_2^2 = [x_1 x_2] \begin{bmatrix} 5 & −2 \\ −2 & 8 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = x^T A x,
and then adjust the linear form
4√5 x_1 − 16√5 x_2 = [4√5, −16√5] \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = Mx.
The characteristic equation of A is
det(A − λI) = (5 − λ)(8 − λ) − 4 = λ^2 − 13λ + 36 = (λ − 4)(λ − 9),
that is, the eigenvalues are λ_1 = 4 and λ_2 = 9.
The orthonormal bases of the one-dimensional eigenspaces are v_1 = (1/√5)(2, 1)^T and v_2 = (1/√5)(−1, 2)^T, respectively.
Example continued
Let F = {v_1, v_2} be the corresponding basis of R^2. Then we can write the change of basis matrix
P = P_{E←F} = (1/√5) \begin{bmatrix} 2 & −1 \\ 1 & 2 \end{bmatrix}.
This is the matrix of anticlockwise rotation through the angle arctan(1/2), which is approximately 26.6°. Then we have
P^T A P = D = \begin{bmatrix} 4 & 0 \\ 0 & 9 \end{bmatrix},
and we express the vector of new coordinates y = [x]_F of x by substituting x = [x]_E = P_{E←F}[x]_F = Py. Then we obtain
y^T D y + MPy + f = 0, where MP = [−8, −36],
or
4y_1^2 + 9y_2^2 − 8y_1 − 36y_2 + 4 = 0.
We have gotten rid of the cross-product terms.
Example continued
However, the origin is still shifted. Completing the squares:
4(y_1 − 1)^2 + 9(y_2 − 2)^2 = 36, or (y_1 − 1)^2/9 + (y_2 − 2)^2/4 = 1.
This shows that in the new coordinates the curve is a shifted ellipse with one semiaxis of length 3 and the other of length 2.
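A quick numerical check of the whole change of coordinates (my own sketch, not part of the slides):

```python
import numpy as np

A = np.array([[5.0, -2.0],
              [-2.0, 8.0]])                       # quadratic part
M = np.array([4 * np.sqrt(5), -16 * np.sqrt(5)])  # linear part
P = np.array([[2.0, -1.0],
              [1.0, 2.0]]) / np.sqrt(5)           # columns are v1, v2

print(np.round(P.T @ A @ P, 10))  # diag(4, 9)
print(np.round(M @ P, 10))        # [-8. -36.], the new linear coefficients MP
```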
Quadric surfaces. Elliptic Cylinder
This degenerate quadric surface is called the elliptic cylinder:
x^2/a^2 + y^2/b^2 = 1.
It appears when one eigenvalue (the third one) is zero and the coefficient of z is also zero.
Quadric surfaces. Hyperbolic Cylinder
This degenerate quadric surface is called the hyperbolic cylinder:
x^2/a^2 − y^2/b^2 = 1.
It appears when one eigenvalue (the third one) is zero and the coefficient of z is also zero.
Quadric surfaces. Parabolic Cylinder
This degenerate quadric surface is called the parabolic cylinder:
x^2 + 2ay = 0.
It appears when two eigenvalues (the second and the third) are zero and the coefficient of z is also zero.
Quadric surfaces. Ellipsoid
This quadric surface is called the ellipsoid:
x^2/a^2 + y^2/b^2 + z^2/c^2 = 1.
It has semiaxes a, b, c, respectively.
Quadric surfaces. Hyperboloid of one sheet
This quadric surface is called the hyperboloid of one sheet:
x^2/a^2 + y^2/b^2 − z^2/c^2 = 1.
It has semiaxes a and b, respectively.
Quadric surfaces. Hyperboloid of two sheets
This quadric surface is called the hyperboloid of two sheets:
−x^2/a^2 − y^2/b^2 + z^2/c^2 = 1.
A good rule to remember: one minus, one sheet; two minuses, two sheets.
Quadric surfaces. Circular Cone
This degenerate quadric surface is called the circular cone:
x^2/a^2 + y^2/b^2 − z^2/c^2 = 0.
It appears when one eigenvalue is negative and the constant term is zero.
Example
Let Q : R^3 → R be the quadratic form given by
Q(x) = 2x_1^2 + 2x_2^2 + 2x_3^2 + 2x_1x_2 + 2x_1x_3 + 2x_2x_3.
• Write down the matrix of this form in the standard basis E.
• Diagonalise the form Q, that is, find an orthonormal basis F in which the form will be given by the formula
Q(x) = λ_1y_1^2 + λ_2y_2^2 + λ_3y_3^2,
where [x]_F = (y_1, y_2, y_3)^T. Write down this basis and this formula, with all coefficients given numerically.
• Determine the type of the surface Q(x) = 1.
Example continues
The matrix of this form is
A = \begin{bmatrix} 2 & 1 & 1 \\ 1 & 2 & 1 \\ 1 & 1 & 2 \end{bmatrix}.
The characteristic polynomial of this matrix is equal to
det(A − λI) = \begin{vmatrix} 2−λ & 1 & 1 \\ 1 & 2−λ & 1 \\ 1 & 1 & 2−λ \end{vmatrix} = \begin{vmatrix} 1−λ & −1+λ & 0 \\ 0 & 1−λ & −1+λ \\ 1 & 1 & 2−λ \end{vmatrix}
(we subtract the second row from the first and the third row from the second)
= (λ − 1)^2 \begin{vmatrix} −1 & 1 & 0 \\ 0 & −1 & 1 \\ 1 & 1 & 2−λ \end{vmatrix} = −(λ − 1)^2(λ − 4).
Therefore there are two distinct eigenvalues: λ_{1,2} = 1 and λ_3 = 4; the eigenvalue 1 has algebraic multiplicity 2.
Example continues
Let us find the eigenvectors. Take λ_{1,2} = 1 first.
A − λ_1I = A − I = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} → \begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.
This row reduced echelon form corresponds to the system
x_1 + x_2 + x_3 = 0,
whose solutions are spanned by the vectors f_1 = (−1, 1, 0)^T and f_2 = (−1, 0, 1)^T. Orthogonalising, we get:
g_1 = f_1 = (−1, 1, 0)^T,
g_2 = f_2 − (f_2 · g_1/|g_1|^2)g_1 = (−1, 0, 1)^T − (1/2)(−1, 1, 0)^T = (−1/2, −1/2, 1)^T.
Example continues
After normalisation we have
v_1 = (1/√2)(−1, 1, 0)^T, v_2 = (1/√6)(1, 1, −2)^T.
The third eigenvector, belonging to λ_3 = 4, can be found as a unit vector orthogonal to v_1 and v_2. It can be taken as
v_3 = (1/√3)(1, 1, 1)^T.
Example continues
In this basis F = {v_1, v_2, v_3} the form is represented by
Q(v) = z_1^2 + z_2^2 + 4z_3^2, where [v]_F = (z_1, z_2, z_3)^T.
The surface Q(v) = 1 is an ellipsoid whose two semiaxes are equal to 1 (so that it is an ellipsoid of revolution) and whose third semiaxis is equal to 1/2.
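The eigenvalues, and hence the semiaxes, can be checked with a few lines of numpy (my sketch, not from the slides):

```python
import numpy as np

A = np.array([[2.0, 1.0, 1.0],
              [1.0, 2.0, 1.0],
              [1.0, 1.0, 2.0]])
lam, V = np.linalg.eigh(A)
print(lam)               # [1. 1. 4.]  ->  Q = z1^2 + z2^2 + 4 z3^2
print(1 / np.sqrt(lam))  # semiaxes of the ellipsoid Q = 1: [1.  1.  0.5]
```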
To Summarise
As a result of today's lecture (together with your further study) you should
• be able to determine whether a conic section is a parabola, hyperbola or ellipse, and determine its position;
• be able to determine the type of a quadric.
What to do now
Consolidation: Try exercises: Textbook: §5.5, pp. 441–442: # 25, 27, 31, 33, 35, 37, 39, 61, 63, 65, 67, 69, 73, 75, 79. Lecture Notes: Exercises of Sec. 6.3.
Next topic: Positive definite quadratic forms.
Colour code: Red
Preparation: Go through the text: Textbook: §5.5, pp. 429–432. Lecture Notes: Sec. 6.4–6.5.
MATHS 253, Lecture 8
16 March 2022
Today's topic: The Cayley–Hamilton Theorem, Minimal Polynomials
Colour code: Red.
Lecture Notes: Sec. 3.1–3.3
Textbook: §6.1
Algebraically Closed Fields
A field F is called algebraically closed if every polynomial f(x) ∈ F[x] of degree at least 1 can be written as a product of linear factors:
f(x) = c(x − r_1)(x − r_2)⋯(x − r_n)
for some c, r_1, r_2, …, r_n ∈ F.
Example 1
1. C is algebraically closed.
2. R is not algebraically closed (e.g. f(x) = x^2 + 1 cannot be written as a product of linear factors over R).
For any field F, there exists another field E ⊇ F such that E is algebraically closed.
Uniqueness of Linear Factor Decompositions
Theorem 2
Let F be an algebraically closed field, and let f ∈ F[x]. Then the decomposition of f into linear factors is unique (up to the order of the factors).
Proof.
Suppose we have two factorizations
f(x) = c(x − r_1)⋯(x − r_n) = c′(x − r′_1)⋯(x − r′_{n′}).
We have
1. n = n′, since otherwise the two expressions have different degrees.
2. c = c′, since otherwise the two expressions have different x^n coefficients.
3. If some r_j ∉ {r′_1, r′_2, …, r′_{n′}}, then the first expression says f(r_j) = 0, while the second says f(r_j) ≠ 0. So {r_1, r_2, …, r_n} ⊆ {r′_1, r′_2, …, r′_{n′}}, and vice versa.
Polynomials of Operators
If T : V → V is a linear operator on an F-vector space V and f ∈ F[x], we define the operator f(T) as follows: write f(x) = a_ℓx^ℓ + a_{ℓ−1}x^{ℓ−1} + ⋯ + a_1x + a_0; then
f(T)(v) = a_ℓT^ℓv + a_{ℓ−1}T^{ℓ−1}v + ⋯ + a_1Tv + a_0v.
Notice that for any basis B, we have
[f(T)(v)]_B = [a_ℓT^ℓv + a_{ℓ−1}T^{ℓ−1}v + ⋯ + a_1Tv + a_0v]_B
= [a_ℓT^ℓ + a_{ℓ−1}T^{ℓ−1} + ⋯ + a_1T + a_0I]_B [v]_B
= (a_ℓ[T]_B^ℓ + a_{ℓ−1}[T]_B^{ℓ−1} + ⋯ + a_1[T]_B + a_0[I]_B)[v]_B
= f([T]_B)[v]_B,
so that [f(T)]_B = f([T]_B).
Annihilating Polynomials
Definition 3
Let V be an n-dimensional vector space over F, and let T : V → V be a linear operator. A polynomial f ∈ F[x] is called an annihilating polynomial for T if f(T) = 0, that is, if f(T)(v) = 0 for all v ∈ V.
Definition 4
The minimal polynomial µ_T of an operator T is the polynomial of lowest degree, with leading coefficient 1, such that µ_T(T) = 0.
Example: Annihilating Polynomials
Example 5
Let V = C^3, and let T : V → V have standard matrix
A = \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 1 \\ 0 & 0 & 2 \end{bmatrix}.
Then f(x) = x^3 − 4x^2 + 5x − 2 and g(x) = x^2 − 3x + 2 are annihilating polynomials for T.
How do we know this? We compute
f(A) = g(A) = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},
so f(T)(v) = f(A)v = 0 and g(T)(v) = g(A)v = 0 for any v ∈ V.
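This is easy to verify directly; here is a small numpy check of my own (not from the slides):

```python
import numpy as np

A = np.array([[1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 2.0]])
I = np.eye(3)

f_A = A @ A @ A - 4 * (A @ A) + 5 * A - 2 * I    # f(A) = A^3 - 4A^2 + 5A - 2I
g_A = A @ A - 3 * A + 2 * I                      # g(A) = A^2 - 3A + 2I
print(np.allclose(f_A, 0), np.allclose(g_A, 0))  # True True
```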
On Annihilating Polynomials
Lemma 6
Let V be an n-dimensional F-vector space with basis B = {v_1, v_2, …, v_n}, and let T : V → V be a linear operator. A polynomial f ∈ F[x] annihilates T if and only if f([T]_B) = 0 (the n × n zero matrix), i.e., f annihilates [T]_B.
Proof.
If f([T]_B) = 0, then [f(T)(v)]_B = f([T]_B)[v]_B = [0]_B for all v, so T is annihilated by f.
If f([T]_B) ≠ 0, suppose its i-th column is a_i ≠ 0. Then
[f(T)(v_i)]_B = f([T]_B)[v_i]_B = f([T]_B)e_i = a_i ≠ 0,
so T is not annihilated by f.
Existence of Annihilating Polynomials
Lemma 7
Let V be an n-dimensional F-vector space and let T : V → V be a linear operator. Then there is a non-zero polynomial f ∈ F[x] of degree at most n^2 which annihilates T.
Proof.
Let A be the standard matrix of T. It suffices to show that there is f ∈ F[x] of degree at most n^2 such that f(A) = 0.
Consider the collection {I, A, A^2, …, A^{n^2}} ⊆ F^{n×n}. Since dim F^{n×n} = n^2, this set of n^2 + 1 matrices is linearly dependent. So there are b_0, b_1, …, b_{n^2}, not all zero, such that
b_0I + b_1A + b_2A^2 + ⋯ + b_{n^2}A^{n^2} = 0.
We can take f(x) = b_0 + b_1x + ⋯ + b_{n^2}x^{n^2}.
Cayley-Hamilton Theorem
It turns out we can do much better:
Theorem 8
Let V be an n-dimensional F-vector space and let T : V → V be a linear operator. Then the characteristic polynomial p_T annihilates T.
Proof.
The proof is quite technical. See course notes Sections 3.2–3.3 for details.
How is this better? Two ways:
1. deg p_T(x) = n ≪ n^2.
2. This way is constructive (it gives a method to find an annihilating polynomial).
Example 9
Let T : R^3 → R^3 have standard matrix
A = \begin{bmatrix} 1 & 2 & 2 \\ 2 & 1 & −2 \\ −2 & 2 & 5 \end{bmatrix}.
Find a polynomial f ∈ R_3[x] which annihilates A.
By the Cayley–Hamilton theorem we can take f(x) = p_A(x). We compute
p_A(λ) = det(A − λI_3) = \begin{vmatrix} 1−λ & 2 & 2 \\ 2 & 1−λ & −2 \\ −2 & 2 & 5−λ \end{vmatrix} = \begin{vmatrix} 1−λ & 2 & 2 \\ 2 & 1−λ & −2 \\ 0 & 3−λ & 3−λ \end{vmatrix}
= (3 − λ) \begin{vmatrix} 1−λ & 2 & 2 \\ 2 & 1−λ & −2 \\ 0 & 1 & 1 \end{vmatrix} = (3 − λ) \begin{vmatrix} 1−λ & 0 & 0 \\ 2 & 1−λ & −2 \\ 0 & 1 & 1 \end{vmatrix}
= (1 − λ)(3 − λ) \begin{vmatrix} 1−λ & −2 \\ 1 & 1 \end{vmatrix} = (1 − λ)(3 − λ)^2.
So we can take f(x) = (1 − x)(3 − x)^2.
Finding the Minimal Polynomial
Perhaps surprisingly, the Cayley–Hamilton theorem gives us a reasonably efficient method to find the minimal polynomial of an operator.
Lemma 10
Let T : V → V be a linear operator, and suppose f ∈ F[x] annihilates T. Then µ_T(x) | f(x).
Corollary 11
Let T : V → V be a linear operator. Then µ_T(x) | p_T(x).
Proof (of Lemma).
Let A be the standard matrix of T.
Write f(x) = µ_T(x)q(x) + r(x) with deg r < deg µ_T.
We have 0 = f(A) = µ_T(A)q(A) + r(A) = r(A).
If r(x) ≠ 0, then r annihilates A and has degree smaller than that of µ_T. That can't happen.
So r(x) = 0, so that f(x) = µ_T(x)q(x); that is, µ_T(x) | f(x).
Proof (of Corollary).
Take f(x) = p_T(x) and apply the Cayley–Hamilton theorem and the Lemma.
Finding the Minimal Polynomial
We can use this to find the minimal polynomial of T as follows (a code sketch follows the list):
1. Write down the standard matrix A of T.
2. Compute p_T(x) (= p_A(x)).
3. Write down the divisors {f_1(x), f_2(x), …, f_ℓ(x)} of p_T(x).
4. Find the smallest-degree f_j(x) which satisfies f_j(A) = 0. That's µ_T(x).
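Here is a sketch of these four steps in sympy (my own code, not from the course materials; the helper name eval_at_matrix is mine). It reproduces the answer of Example 12 below.

```python
from itertools import product
from sympy import Matrix, Poly, eye, zeros, factor_list, prod, symbols

x = symbols('x')
A = Matrix([[1, 2, 2], [2, 1, -2], [-2, 2, 5]])   # matrix from Example 12
n = A.rows

p = A.charpoly(x).as_expr()   # step 2: characteristic polynomial
_, factors = factor_list(p)   # its irreducible factors with multiplicities

def eval_at_matrix(expr):
    """Evaluate a polynomial at A by Horner's scheme (x^k -> A^k, c -> c*I)."""
    result = zeros(n, n)
    for c in Poly(expr, x).all_coeffs():
        result = result * A + c * eye(n)
    return result

# Step 3: all monic divisors of p, ordered by degree.
divisors = sorted(
    (Poly(prod(f**k for (f, _), k in zip(factors, powers)), x)
     for powers in product(*[range(m + 1) for _, m in factors])),
    key=lambda q: q.degree())

# Step 4: the smallest-degree divisor that annihilates A is mu_T.
mu = next(q for q in divisors if eval_at_matrix(q.as_expr()) == zeros(n, n))
print(mu.as_expr())           # x**2 - 4*x + 3
```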
Example 12
Find the minimal polynomial of T : R^3 → R^3, which has standard matrix
A = \begin{bmatrix} 1 & 2 & 2 \\ 2 & 1 & −2 \\ −2 & 2 & 5 \end{bmatrix}.
We already know p_T(λ) = (1 − λ)(3 − λ)^2. The polynomials that divide p_T are:
f_1(λ) = 1, f_2(λ) = 1 − λ, f_3(λ) = 3 − λ, f_4(λ) = (1 − λ)(3 − λ), f_5(λ) = (3 − λ)^2, f_6(λ) = (1 − λ)(3 − λ)^2.
We test them:
f_1(A) = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & 0 \\ 0 & 0 & 1 \end{bmatrix}, f_2(A) = \begin{bmatrix} 0 & −2 & −2 \\ −2 & 0 & 2 \\ 2 & −2 & −4 \end{bmatrix}, f_3(A) = \begin{bmatrix} 2 & −2 & −2 \\ −2 & 2 & 2 \\ 2 & −2 & −2 \end{bmatrix}.
More testing:
f_4(A) = \begin{bmatrix} 0 & −2 & −2 \\ −2 & 0 & 2 \\ 2 & −2 & −4 \end{bmatrix} \begin{bmatrix} 2 & −2 & −2 \\ −2 & 2 & 2 \\ 2 & −2 & −2 \end{bmatrix} = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix},
f_5(A) = \begin{bmatrix} 2 & −2 & −2 \\ −2 & 2 & 2 \\ 2 & −2 & −2 \end{bmatrix}^2 = \begin{bmatrix} 4 & −4 & −4 \\ −4 & 4 & 4 \\ 4 & −4 & −4 \end{bmatrix},
f_6(A) = \begin{bmatrix} 0 & −2 & −2 \\ −2 & 0 & 2 \\ 2 & −2 & −4 \end{bmatrix} \begin{bmatrix} 2 & −2 & −2 \\ −2 & 2 & 2 \\ 2 & −2 & −2 \end{bmatrix}^2 = \begin{bmatrix} 0 & 0 & 0 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.
So f_4 and f_6 annihilate A. f_4 has the smaller degree, so µ_T(x) = f_4(x) = x^2 − 4x + 3.
To Summarise
As a result of today's lecture (together with your further study) you should
• know what it means for a field to be algebraically closed;
• know how to compute the evaluation of a polynomial at a matrix/operator;
• know the definitions of annihilating polynomial and minimal polynomial;
• be able to state the Cayley–Hamilton theorem;
• be able to find the minimal polynomial of a matrix/operator.
What to do now
Consolidation: Lecture Notes: Exercises of Sec. 3.3; Textbook: §4.3, Exercises 33–38.
Next topic: Jordan Normal Form
Colour code: Red.
Preparation: Go through the text: Lecture Notes: Sec. 3.4–3.5.
MATHS 253, Lecture 2
2 March, 2022
Today’s topic: Basis, Dimension, The Coordinate Mapping
Colour code: Orange/Red
Lecture Notes: Sec. 1.4–1.5; pp. 11–16;
Textbook: §6.2, pp. 461–474.
Linear Independence
Definition 1
Let V be a vector space over a field F. A subset {v_1, v_2, …, v_k} of V is said to be linearly dependent if there exist scalars a_1, a_2, …, a_k in F, not all of which are 0, such that
a_1v_1 + a_2v_2 + ⋯ + a_kv_k = 0.
Otherwise it is linearly independent.
Example 2
• Let E_{ij} be the matrix in F^{m×n} whose (i, j)-th entry is 1 and all other entries are 0. Such a matrix is called a matrix unit. The set of all mn matrix units is linearly independent.
• The set of monomials {1, x, x^2, …, x^n} is linearly independent in F_n[x].
Basis
Definition 3
Let V be a vector space. A basis B for V is a subset of V such that
• B is a linearly independent set of vectors,
• B spans V.
Example 4
• The set of all mn matrix units E_{ij} is linearly independent and spans F^{m×n}, hence it is a basis of this vector space.
• The set of monomials {1, x, x^2, …, x^n} is linearly independent in F_n[x] and also spans F_n[x], so it is a basis of this vector space.
Exercise
Exercise 1
Do the matrices
M_1 = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}, M_2 = \begin{bmatrix} 1 & 0 \\ 0 & −1 \end{bmatrix}
form a basis of the subspace D of all diagonal matrices from R^{2×2}?
Yes, they do. They are linearly independent (one is not a multiple of the other) and
((a + b)/2) \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + ((a − b)/2) \begin{bmatrix} 1 & 0 \\ 0 & −1 \end{bmatrix} = \begin{bmatrix} a & 0 \\ 0 & b \end{bmatrix},
hence {M_1, M_2} spans D.
The Main Technical Result
Lemma 5
Suppose V = span{u_1, …, u_m} and k > m. Then any k vectors v_1, …, v_k from V are linearly dependent.
Proof. Let
v_1 = a_{11}u_1 + ⋯ + a_{1m}u_m,
v_2 = a_{21}u_1 + ⋯ + a_{2m}u_m,
…
v_k = a_{k1}u_1 + ⋯ + a_{km}u_m.
Let us prove that there exist coefficients x_1, …, x_k, not all zero, such that x_1v_1 + ⋯ + x_kv_k = 0. We write
x_1v_1 + ⋯ + x_kv_k = x_1(a_{11}u_1 + ⋯ + a_{1m}u_m) + ⋯ + x_k(a_{k1}u_1 + ⋯ + a_{km}u_m)
= (a_{11}x_1 + ⋯ + a_{k1}x_k)u_1 + ⋯ + (a_{1m}x_1 + ⋯ + a_{km}x_k)u_m = 0.
We notice that this equation will be satisfied if (x_1, …, x_k) is a solution to the system of linear equations
a_{11}x_1 + ⋯ + a_{k1}x_k = 0
…
a_{1m}x_1 + ⋯ + a_{km}x_k = 0.
Since k > m, this system has more unknowns than equations, hence it has a nonzero solution. This proves that {v_1, …, v_k} is linearly dependent, and the lemma follows.
Exercise 2
Let V be a vector space. What is the smallest number k such that every k vectors of V are linearly dependent, in the cases:
• V = R^5: k = 6;
• V = R_7[x]: k = 9;
• V = R^{3×3}: k = 10.
Theorem 6
All bases for a finite-dimensional vector space V are equinumerous (have the same number of vectors).
Proof.
Suppose that B = {u_1, …, u_m} and C = {v_1, …, v_k} are two bases for V with k > m. Then by Lemma 5, C is linearly dependent, a contradiction.
Properties of Bases
Theorem 7
Let V be a finite-dimensional vector space. Then:
(a) Every spanning subset of V can be reduced to a basis;
(b) Every linearly independent subset of V can be extended to a basis.
Corollary 1
Every finite-dimensional vector space has a finite basis.
Dimension
Definition 8
Let V be a finite-dimensional vector space over F. The number of vectors in some basis (which is the number of elements in all other bases too) is called the dimension of V over F, denoted dim_F V or simply dim V.
Example 9
• dim F_n[x] = n + 1.
• dim F^{m×n} = mn.
Example
Let us consider the set U of all 2 × 2 matrices of the form
\begin{bmatrix} a & b \\ b & a \end{bmatrix}, a, b ∈ R.
Then this is a 2-dimensional subspace of R^{2×2}. Indeed,
\begin{bmatrix} a & b \\ b & a \end{bmatrix} = a \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + b \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = aM_1 + bM_2.
Hence U is spanned by B = {M_1, M_2}. On the other hand, B is linearly independent. Suppose that
x_1M_1 + x_2M_2 = x_1 \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix} + x_2 \begin{bmatrix} 0 & 1 \\ 1 & 0 \end{bmatrix} = \begin{bmatrix} x_1 & x_2 \\ x_2 & x_1 \end{bmatrix} = \begin{bmatrix} 0 & 0 \\ 0 & 0 \end{bmatrix}.
Then comparing entries of these matrices, we get x_1 = x_2 = 0. Hence B is indeed linearly independent.
Coordinates relative to a basis
Theorem 10
Let B = {v_1, v_2, …, v_n} be a basis of a vector space V over F. Then every vector u in V can be uniquely represented as a linear combination
u = a_1v_1 + a_2v_2 + ⋯ + a_nv_n
with coefficients a_1, a_2, …, a_n from F. These coefficients are called the coordinates of u relative to the basis B.
The coordinate mapping
Let B = {v_1, v_2, …, v_n} be a basis of a vector space V over F, let u be a vector in V, and let
u = a_1v_1 + a_2v_2 + ⋯ + a_nv_n.
Then we define the coordinate column of u relative to B as
[u]_B = (a_1, a_2, …, a_n)^T ∈ F^n.
The mapping from V to F^n,
u ↦ [u]_B,
is called the coordinate mapping.
The Properties of the Coordinate Mapping
Theorem 11
1. The coordinate mapping is a one-to-one and onto mapping from V to F^n;
2. for any u, v from V and any scalar a in F,
[u + v]_B = [u]_B + [v]_B, [au]_B = a[u]_B.
Corollary 12
For any vectors v_1, …, v_k and any scalars a_1, …, a_k,
[a_1v_1 + ⋯ + a_kv_k]_B = a_1[v_1]_B + a_2[v_2]_B + ⋯ + a_k[v_k]_B.
The Main Property of the Coordinate Mapping
Theorem 13
Let V be a vector space over F. Then vectors u_1, …, u_k ∈ V are linearly dependent in V if and only if their coordinate columns [u_1]_B, …, [u_k]_B are linearly dependent in F^n.
First we observe that [0]_B = 0. Assume that the system {u_1, …, u_k} is linearly dependent:
0 = a_1u_1 + ⋯ + a_ku_k
for certain coefficients a_1, …, a_k, not all of which are zero. Applying the coordinate mapping to both sides, by Corollary 12 we obtain
0 = [0]_B = [a_1u_1 + ⋯ + a_ku_k]_B = a_1[u_1]_B + a_2[u_2]_B + ⋯ + a_k[u_k]_B,
i.e., the coordinate columns are also linearly dependent.
On the other hand, if we have a linear dependency among the coordinate columns
a_1[u_1]_B + a_2[u_2]_B + ⋯ + a_k[u_k]_B = 0,
then again by Corollary 12 we may represent this equation as
[a_1u_1 + ⋯ + a_ku_k]_B = [0]_B
and deduce that a_1u_1 + ⋯ + a_ku_k = 0, because the coordinate mapping is one-to-one.
Exercise
Decide if the polynomials
f_1(x) = 1 + x, f_2(x) = x + x^2, f_3(x) = 1 + x^2
form a linearly dependent or independent system.
Answer: Let us choose the standard basis B = {1, x, x^2} of R_2[x]. Then we form
([f_1]_B [f_2]_B [f_3]_B) = \begin{bmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{bmatrix}.
We have
\begin{vmatrix} 1 & 0 & 1 \\ 1 & 1 & 0 \\ 0 & 1 & 1 \end{vmatrix} = 2 ≠ 0.
This means {[f_1]_B, [f_2]_B, [f_3]_B} is linearly independent, hence {f_1, f_2, f_3} is linearly independent too.
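The determinant computation is a one-liner in numpy (my sketch, not from the slides):

```python
import numpy as np

# Columns are [f1]_B, [f2]_B, [f3]_B in the basis {1, x, x^2}.
C = np.array([[1, 0, 1],
              [1, 1, 0],
              [0, 1, 1]])
print(np.linalg.det(C))   # 2.0 != 0, so {f1, f2, f3} is linearly independent
```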
Challenging Questions
1. Let V be a finite-dimensional vector space over R and let W be a proper subspace of V. Prove that W is also finite-dimensional.
2. Show that the vector space C_R[a, b] of all continuous real-valued functions on [a, b] is not finite-dimensional.
To Summarise
As a result of today's lecture (together with your further study) you should
• know the definition of linear dependence and linear independence;
• know the definition of a basis;
• understand the dimension;
• be able to find coordinates of a vector relative to a basis;
• know how to decide whether or not a vector is in the span of other vectors, and be able to decide if a system of vectors is linearly dependent or independent by using the coordinate mapping;
• given a spanning subset of a subspace, know how to find a basis – this will be practiced in Tutorial 1.
What to do now
Consolidation: Try exercises: Lecture Notes: Exercises of Sec. 1.4–1.5; Textbook: Exercises pp. 474–475: # 7, 13, 19, 29, 35, 45, 51.
Next topic: Linear Transformations
Colour code: Yellow
Preparation: Go through the text: Lecture Notes: Sec. 2.1–2.2. Textbook: §6.4, pp. 490–497; §6.6, pp. 515–521.
MATHS 253, Lecture 1
28 February, 2022
Today's topic: Introduction. Fields, Vector Spaces and Subspaces
Colour code: Orange.
Lecture Notes: Sec. 1.1–1.3; pp. 7–11
Textbook: §6.1, pp. 447–459.
Goals of Algebra in 253
• Extend (slightly) the concept of scalar.
• Extend (significantly) the concept of vector space, most importantly to functional vector spaces with applications to polynomial and Fourier approximations.
• Learn to “understand” linear operators geometrically.
• Extend our knowledge of discrete time system evolution (to systems whose matrices have complex eigenvalues).
• Prove the fundamental theorem on orthogonal diagonalisability of Hermitian operators (symmetric matrices).
• Apply the latter to the classification of quadratic forms.
• Use quadratic forms for the classification of critical points of functions of several variables.
Scalars
Galileo's slogan was:
“Measure everything that is measurable, and make measurable everything which is not yet so.”
This led to two big ideas in Algebra: coordinatisation and representation. In both, objects (e.g., groups or geometries) are “understood” in terms of an algebraic construction built from some sort of numbers (scalars).
Gradually it has become clear that an all-embracing concept of number is needed.
Nowadays scalars are taken from objects called fields that are defined axiomatically.
The most important fields of scalars are the real numbers R and the complex numbers C.
Complex Numbers
The complex numbers form a two-dimensional vector space over R with the basis {1, i}, on which a multiplication is defined:
1 · 1 = 1, 1 · i = i · 1 = i, i^2 = −1.
A complex number can be written as a + bi, where a, b ∈ R. We have a + bi = a_1 + b_1i if and only if a = a_1 and b = b_1.
We multiply complex numbers as follows:
(a + bi)(c + di) = (ac − bd) + (ad + bc)i.
If z = a + bi, then z̄ = a − bi is called the complex conjugate of z. The following properties of conjugation are either obvious or easy to check:
\bar{\bar{z}} = z, \overline{z_1 + z_2} = \bar{z}_1 + \bar{z}_2, \overline{z_1z_2} = \bar{z}_1\bar{z}_2.
Complex Numbers
If z = a + bi ≠ 0, then
z z̄ = (a + bi)(a − bi) = a^2 + b^2 > 0
is a real number which is positive. Hence
(a + bi)(a/(a^2 + b^2) − (b/(a^2 + b^2))i) = 1,
and we obtain the inverse of z:
z^{−1} = a/(a^2 + b^2) − (b/(a^2 + b^2))i = z̄/|z|^2,
where |z| = √(z z̄) = √(a^2 + b^2) is called the modulus (absolute value) of z.
We can add, multiply and divide complex numbers exactly as the real ones.
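Python's built-in complex type makes this easy to check numerically (a sketch of mine, not from the slides):

```python
z = 3 + 4j                       # a = 3, b = 4, so |z|^2 = 25
inv = z.conjugate() / abs(z)**2  # z^{-1} = conj(z) / |z|^2
print(inv)                       # (0.12-0.16j), the same as 1/z
print(z * inv)                   # (1+0j), as expected
```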
Fields
So we can add, subtract, multiply and divide (by a non-zero number) real and complex numbers. Any set where we can perform such operations in mathematics is called a field.
But for the purposes of this course a field will mean one of the two sets of scalars, R or C.
There are other fields, for example the set Q of rational numbers, but we will deal only with R and C.
What about the set of all integers Z? It is not a field, since division is not always possible within Z (e.g. 1/2 is not an integer).
Fundamental Theorem of Algebra
Complex numbers have one very important advantage which will be a significant factor in this course: the so-called Fundamental Theorem of Algebra¹, which tells us that every polynomial
f(z) = z^n + a_1z^{n−1} + ⋯ + a_{n−1}z + a_n
with complex coefficients a_i ∈ C has at least one root in C, and therefore the polynomial f(z) can always be factorized into a product of linear factors:
f(z) = (z − λ_1)(z − λ_2)⋯(z − λ_n),
where λ_i ∈ C are the roots of f.
¹This theorem is not a part of this course.
Importance of linear combinations
The ability to form linear combinations of vectors was important in the study of R^n. It led us to the notions of:
• linear dependence and independence, span,
• basis,
• dimension.
To define linear combinations we need a field F of “scalars”, a set V of “vectors” and “an operation of scalar multiplication”.
The properties we require are:
1 · v = v, 0 · v = 0,
(a_1v_1 + ⋯ + a_kv_k) + (b_1v_1 + ⋯ + b_kv_k) = (a_1 + b_1)v_1 + ⋯ + (a_k + b_k)v_k,
a · (a_1v_1 + ⋯ + a_nv_n) = (aa_1)v_1 + ⋯ + (aa_n)v_n.
If these are satisfied we say that we have a vector space V over F. (See the Appendix of the Lecture Manual for the complete set of axioms of a vector space.)
Vector spaces
1. R^n and C^n;
2. sets of matrices R^{m×n} and C^{m×n};
3. R_n[x] and C_n[x] – polynomials of degree at most n;
4. R[x] and C[x] – polynomials without restriction on their degree;
5. real- and complex-valued functions Fun_R[a, b] and Fun_C[a, b] on [a, b], or Fun_R(a, b) and Fun_C(a, b) on (a, b). For functions, linear combinations are defined as:
(a_1f_1 + ⋯ + a_nf_n)(x) := a_1f_1(x) + ⋯ + a_nf_n(x).
For the purposes of this course a vector space is one of the objects listed above or their subspaces (to be defined on the next slide).
Subspaces
Definition 1
Let V be a vector space over a field F. A subspace of V is a subset W of V which satisfies the following conditions:
S1. 0 belongs to W;
S2. for any vectors w_1, w_2 in W, the sum w_1 + w_2 is again in W;
S3. for each scalar a in F and vector w in W, the product aw is again in W.
If the three conditions hold, we can form linear combinations: if a_1, …, a_n are scalars and w_1, …, w_n are from W, then
a_1w_1 + ⋯ + a_nw_n ∈ W.
So W may itself be considered as a vector space over F.
Examples of Subspaces
1. Let p(x) be a polynomial in F[x]. The set W of all polynomials of F[x] which are divisible by p(x) is a subspace of F[x].
2. Let V = Fun_R[a, b] be the vector space of all functions on [a, b]. The set C_R[a, b] of all continuous functions on [a, b] is a subspace of V.
3. The set D_R[a, b] of all differentiable real-valued functions on [a, b] is a subspace of C_R[a, b].
4. Let V = C_R[a, b] be the vector space of all continuous functions on [a, b] and α ∈ [a, b]. The set W of all functions f(x) such that f(α) = 0 is a subspace of V.
Exercise
Exercise 1
Is the set W of functions f such that f(4) = 2f(0) a subspace of Fun_R[0, 4]?
Yes, because 0 ∈ W and, if f, g ∈ W, then
(λf)(4) = λf(4) = 2λf(0) = 2(λf)(0)
and
(f + g)(4) = f(4) + g(4) = 2f(0) + 2g(0) = 2(f + g)(0).
Span
Definition 2
Let V be a vector space over the field F and let v_1, …, v_k be arbitrary vectors in V. Then the span of v_1, …, v_k is the set
span{v_1, …, v_k} = {a_1v_1 + a_2v_2 + ⋯ + a_kv_k | a_i ∈ F for all i}.
Theorem 3
W = span{v_1, …, v_k} is a subspace of V.
Proof.
Since 0 = 0v_1 + ⋯ + 0v_k ∈ W, S1 is satisfied. Also
(a_1v_1 + ⋯ + a_kv_k) + (b_1v_1 + ⋯ + b_kv_k) = (a_1 + b_1)v_1 + ⋯ + (a_k + b_k)v_k,
a · (a_1v_1 + ⋯ + a_nv_n) = (aa_1)v_1 + ⋯ + (aa_n)v_n
are also linear combinations, hence S2 and S3 are true.
Span Examples
1. The set of monomials {1, x, x^2, …, x^n} spans F_n[x]: if f ∈ F_n[x], then
f = a_0 · 1 + a_1 · x + ⋯ + a_n · x^n.
2. In C_R(−∞, ∞) we have
sin(x + π/4) ∈ span{sin x, cos x},
since
sin(x + π/4) = (1/√2) sin x + (1/√2) cos x.
Exercise
Exercise 2
Let f_0(x) = 1, f_1(x) = 1 + x, f_2(x) = 1 + x + x^2.
Is it true that {f_0, f_1, f_2} spans the vector space R_2[x] of polynomials of degree at most 2?
Answer: Yes. This is because each of
1 = f_0, x = f_1 − f_0, x^2 = f_2 − f_1
is in the span of {f_0, f_1, f_2}, and hence any polynomial from R_2[x] is in the span too.
Finite-dimensional space
Definition 4
Let V be a vector space over the field F. The space V is said to be finite-dimensional if there exists a finite set of vectors {v_1, …, v_k} which spans V, that is, V = span{v_1, …, v_k}.
Example 5
• The space of polynomials F_n[x] is finite-dimensional:
F_n[x] = span{1, x, x^2, …, x^n}.
• The space of polynomials F[x] is not finite-dimensional.
Proof that F[x] is not finite-dimensional
Proof.
Let us prove the second statement. Suppose
F[x] = span{f_1(x), …, f_n(x)}.
Let us choose a positive integer N such that N > deg f_i for all i = 1, …, n and consider x^N.
As {f_1(x), …, f_n(x)} is assumed to span F[x], we can find scalars a_1, …, a_n such that x^N = a_1f_1(x) + ⋯ + a_nf_n(x). Then the polynomial
G(x) = x^N − a_1f_1(x) − ⋯ − a_nf_n(x) ≡ 0
is a polynomial of degree N which is identically zero. This is a contradiction, since G(x), being a nonzero polynomial of degree N, cannot have more than N roots.
To Summarise
As a result of today's lecture (together with your further study) you should
• know examples of fields and vector spaces (but not their axiomatic definition);
• know the definition of subspace;
• know examples of subspaces;
• know the definition of span;
• know what it means for a vector space to be finite-dimensional;
• know an example of a vector space of infinite dimension.
What to do now
Consolidation: Lecture Notes: pp. 7–11, Exercises of Sec. 1.1–1.3; Textbook: §6.1, pp. 447–459, Exercises pp. 459–460: # 1–11; 24–47; 59–62.
Next topic: Basis, Dimension and The Coordinate Mapping
Colour code: Orange/Red.
Preparation: Go through the text: Lecture Notes: Sec. 1.4–1.5; Textbook: §6.2, pp. 461–474.
MATHS 253, Algebra. Lecture 7
14 March 2022
Today’s topic: Applications of Diagonalisation.
Colour code: Yellow/Orange
Preparation: Go through the text: Lecture Notes: Sec. 2.3. Textbook: §4.6, pp. 336–343.
Discrete time system evolution
Definition 1
An n-dimensional discrete time dynamical system is called a linear (homogeneous) discrete time dynamical system if there is an n × n matrix A such that
x_{k+1} = Ax_k, k = 0, 1, 2, …,
where x_k ∈ R^n; these vectors are called states. The vector x_0 is called the initial state of the system.
Eigenvalues and eigenvectors give us a powerful tool for analysing such systems.
Example
Suppose we are interested in studying a population, say of fish in a lake, that in the nth year contains
• x_n immature members and
• y_n adult members.
We assume that:
1. Immature fish that survive until the end of the year mature into adult fish;
2. A fraction β of immature fish survive until the end of the year;
3. No adult fish survive at the end of the year;
4. Adult fish give birth to immature fish at rate α.
Then we obtain
\begin{bmatrix} x_{n+1} \\ y_{n+1} \end{bmatrix} = \begin{bmatrix} 0 & α \\ β & 0 \end{bmatrix} \begin{bmatrix} x_n \\ y_n \end{bmatrix}.
The main theorem
Theorem 2
Let x_{k+1} = Ax_k, k = 0, 1, 2, …, be a discrete time linear dynamical system. Suppose that the initial state x_0 can be written as a linear combination of the eigenvectors of A,
x_0 = a_1v_1 + ⋯ + a_mv_m,
where v_i is an eigenvector of A belonging to the eigenvalue λ_i.¹ Then
x_k = a_1λ_1^k v_1 + ⋯ + a_mλ_m^k v_m
for all k = 0, 1, 2, ….
¹We do not assume diagonalisability, so m may be less than n.
Proof of the theorem
As v_i is an eigenvector belonging to λ_i, we have Av_i = λ_iv_i. Let us multiply both sides of this equation by A on the left. We obtain
A^2v_i = A(Av_i) = λ_iAv_i = λ_i^2v_i.
After k − 1 such multiplications we obtain
A^kv_i = λ_i^kv_i.
We use this to obtain
x_k = A^kx_0 = A^k(a_1v_1 + ⋯ + a_mv_m) = a_1A^kv_1 + ⋯ + a_mA^kv_m = a_1λ_1^kv_1 + ⋯ + a_mλ_m^kv_m,
as required.
Steps of the analysis
1. Find the matrix A (if the system is given verbally) and the initial state x_0;
2. Find the eigenvalues of A;
3. Find the eigenvectors of A;
4. Make sure that x_0 is a linear combination of the eigenvectors and find the coefficients of this linear combination;
5. Substitute all the above ingredients into the formula
x_k = a_1λ_1^kv_1 + ⋯ + a_mλ_m^kv_m;
6. Analyse the result. Make predictions.
Example
Griffins and Dragons live in the Enchanted Forest. The number of dragons D and the number of griffins G in the forest each year is determined by the formulas:
D_n = 1.5D_{n−1} + G_{n−1},
G_n = D_{n−1},
where D_n is the number of dragons at year n and G_n is the number of griffins at year n.
In year 0 there were 25 dragons and no griffins.
What would be the dynamics of these populations?
1. Finding the matrix of the system and the initial state
Let x_i = \begin{bmatrix} D_i \\ G_i \end{bmatrix}. Then x_0 = \begin{bmatrix} D_0 \\ G_0 \end{bmatrix} = \begin{bmatrix} 25 \\ 0 \end{bmatrix}. Furthermore,
x_{n+1} = \begin{bmatrix} D_{n+1} \\ G_{n+1} \end{bmatrix} = \begin{bmatrix} 1.5D_n + G_n \\ D_n \end{bmatrix} = D_n \begin{bmatrix} 1.5 \\ 1 \end{bmatrix} + G_n \begin{bmatrix} 1 \\ 0 \end{bmatrix} = \begin{bmatrix} 1.5 & 1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} D_n \\ G_n \end{bmatrix} = \begin{bmatrix} 1.5 & 1 \\ 1 & 0 \end{bmatrix} x_n.
Thus we have x_{n+1} = Ax_n, where
A = \begin{bmatrix} 1.5 & 1 \\ 1 & 0 \end{bmatrix}.
2 & 3. Finding eigenvalues & eigenvectors
Solving the characteristic equation
0 = \begin{vmatrix} 1.5−λ & 1 \\ 1 & −λ \end{vmatrix} = λ^2 − 1.5λ − 1,
we find that the eigenvalues of A are λ_1 = 2 and λ_2 = −1/2.
For λ_1 = 2:
A − λ_1I = \begin{bmatrix} −0.5 & 1 \\ 1 & −2 \end{bmatrix} → \begin{bmatrix} 1 & −2 \\ 0 & 0 \end{bmatrix} (rref),
and the respective eigenvector will be v_1 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}.
For λ_2 = −0.5:
A − λ_2I = \begin{bmatrix} 2 & 1 \\ 1 & 0.5 \end{bmatrix} → \begin{bmatrix} 1 & 0.5 \\ 0 & 0 \end{bmatrix} (rref),
and the respective eigenvector will be v_2 = \begin{bmatrix} 1 \\ −2 \end{bmatrix}.
4. Finding the decomposition of the initial state
We write x_0 = a_1v_1 + a_2v_2, and this is a system of linear equations
\begin{bmatrix} 25 \\ 0 \end{bmatrix} = a_1 \begin{bmatrix} 2 \\ 1 \end{bmatrix} + a_2 \begin{bmatrix} 1 \\ −2 \end{bmatrix},
from which a_1 = 10 and a_2 = 5, i.e.
x_0 = \begin{bmatrix} 25 \\ 0 \end{bmatrix} = 10v_1 + 5v_2.
5. Writing down the solution and analysing it
Substituting all found parameters into the formula we obtain:
x_k = \begin{bmatrix} D_k \\ G_k \end{bmatrix} = 10 · 2^k \begin{bmatrix} 2 \\ 1 \end{bmatrix} + 5 · (−1/2)^k \begin{bmatrix} 1 \\ −2 \end{bmatrix}.
In the long run (1/2)^k → 0 and
x_k = \begin{bmatrix} D_k \\ G_k \end{bmatrix} → \begin{bmatrix} 20 · 2^k \\ 10 · 2^k \end{bmatrix},
so the populations will grow and approximately double each year. The ratio of dragons to griffins in the limit is
20 · 2^k : 10 · 2^k = 2 : 1.
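The closed formula can be checked against a direct simulation (my own sketch, not from the slides):

```python
import numpy as np

A = np.array([[1.5, 1.0],
              [1.0, 0.0]])
x = np.array([25.0, 0.0])   # x0 = (D0, G0)
for k in range(1, 11):
    x = A @ x               # one year of evolution
    closed = 10 * 2**k * np.array([2.0, 1.0]) + 5 * (-0.5)**k * np.array([1.0, -2.0])
    assert np.allclose(x, closed)
print(x[0] / x[1])          # approaches the limiting ratio 2
```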
Example with complex eigenvalues
Example 3
Let us investigate a discrete time system x_{n+1} = Ax_n with
A = \begin{bmatrix} 0 & −1 \\ 1 & 0 \end{bmatrix}, x_0 = \begin{bmatrix} 0 \\ 2 \end{bmatrix}.
As A is the standard matrix of the anticlockwise rotation through the angle 90°, the sequence (x_k) is periodic with period 4:
x_{4s} = \begin{bmatrix} 0 \\ 2 \end{bmatrix}, x_{4s+1} = \begin{bmatrix} −2 \\ 0 \end{bmatrix}, x_{4s+2} = \begin{bmatrix} 0 \\ −2 \end{bmatrix}, x_{4s+3} = \begin{bmatrix} 2 \\ 0 \end{bmatrix}.
How can we obtain this using the formula
x_k = a_1λ_1^kv_1 + a_2λ_2^kv_2?
Example continued
Matrix A has complex eigenvalues λ_1 = i and λ_2 = −i, with respective eigenvectors v_1 = \begin{bmatrix} i \\ 1 \end{bmatrix} and v_2 = \begin{bmatrix} −i \\ 1 \end{bmatrix}.
We have x_0 = \begin{bmatrix} 0 \\ 2 \end{bmatrix} = v_1 + v_2, so a_1 = a_2 = 1.
Substituting into
x_k = a_1λ_1^kv_1 + a_2λ_2^kv_2
we get
x_k = i^k \begin{bmatrix} i \\ 1 \end{bmatrix} + (−i)^k \begin{bmatrix} −i \\ 1 \end{bmatrix},
which is the same result as above (check it!).
Population growth model from 250
The herd of animals is divided into three cohorts:
• C1: newborn (in direct parental care);
• C2: juvenile (free-ranging but not yet reproducing);
• C3: adults (those capable of reproduction).
Adults reproduce at the rate of 0.4 newborn per adult per year,
65% of the newborn survive to become juveniles,
75% of juveniles survive to adults, and
95% of adults survive to live another year.
If we denote the number of newborn at year n as u_n, the number of juveniles at year n as v_n, and the number of adults at year n as w_n, then:
u_{k+1} = 0.4w_k
v_{k+1} = 0.65u_k
w_{k+1} = 0.75v_k + 0.95w_k,
or in matrix form
x_{k+1} = Ax_k, x_k = (u_k, v_k, w_k)^T ∈ R^3,
where
A := \begin{bmatrix} 0 & 0 & 0.4 \\ 0.65 & 0 & 0 \\ 0 & 0.75 & 0.95 \end{bmatrix}.
Matlab (or Wolfram Alpha) tells us:
• the matrix A has one real eigenvalue λ_1 ≈ 1.11 with a normalised eigenvector v_1 ≈ (0.23, 0.13, 0.64)^T, and
• two complex ones, λ_{2,3} ≈ 0.1 ± 0.4i, with two complex eigenvectors v_2 and v_3 belonging to them.
Therefore the matrix is diagonalisable over the field of complex numbers. Another important thing to notice is that λ_1 > 1 while |λ_2| < 1 and |λ_3| < 1. In particular,
λ_2^k → 0, λ_3^k → 0.
Since A is diagonalisable we may express any x_0 as a linear combination of the eigenvectors:
x_0 = a_1v_1 + a_2v_2 + a_3v_3.
Now, according to the theorem,
x_k = a_1λ_1^kv_1 + a_2λ_2^kv_2 + a_3λ_3^kv_3 ∼ a_1λ_1^kv_1 = a_1(1.11)^k (0.23, 0.13, 0.64)^T
when k is large, because λ_2^k → 0 and λ_3^k → 0.
Therefore the herd will grow at a rate of about 11% a year, and we would expect 23% of the population to be newborns, 13% to be juveniles, and 64% to be adults.
In such circumstances we say that λ_1 is the dominant eigenvalue. If the dominant eigenvalue exists, it determines the long-term dynamics of the system.
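The quoted eigendata are easy to reproduce with numpy instead of Matlab (my sketch, not from the slides; rounded as in the slides):

```python
import numpy as np

A = np.array([[0.0,  0.0,  0.4],
              [0.65, 0.0,  0.0],
              [0.0,  0.75, 0.95]])
lam, V = np.linalg.eig(A)
i = np.argmax(lam.real)          # index of the dominant (real) eigenvalue
v = np.abs(V[:, i].real)
print(round(lam[i].real, 2))     # 1.11
print(np.round(v / v.sum(), 2))  # [0.23 0.13 0.64], the cohort proportions
```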
Solving recurrence relations
Let us define a sequence of numbers as follows:
p_1 = 1, p_2 = 3, p_{n+1} = 3p_n − 2p_{n−1}.
Find a closed formula for p_n.
Solution: In order to use the theorem we convert this sequence into the following discrete time linear system:
x_0 = \begin{bmatrix} p_2 \\ p_1 \end{bmatrix} = \begin{bmatrix} 3 \\ 1 \end{bmatrix}, x_1 = \begin{bmatrix} p_3 \\ p_2 \end{bmatrix}, …, x_n = \begin{bmatrix} p_{n+2} \\ p_{n+1} \end{bmatrix}, ….
Since p_{n+1} = 3p_n − 2p_{n−1}, we have the following equations:
x_n = \begin{bmatrix} p_{n+2} \\ p_{n+1} \end{bmatrix} = \begin{bmatrix} 3p_{n+1} − 2p_n \\ p_{n+1} \end{bmatrix} = p_{n+1} \begin{bmatrix} 3 \\ 1 \end{bmatrix} + p_n \begin{bmatrix} −2 \\ 0 \end{bmatrix} = \begin{bmatrix} 3 & −2 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} p_{n+1} \\ p_n \end{bmatrix} = \begin{bmatrix} 3 & −2 \\ 1 & 0 \end{bmatrix} x_{n−1}.
Continuation of the example
Let us find the eigenvalues and eigenvectors of A. The characteristic polynomial of A will be
det(A − λI) = \begin{vmatrix} 3−λ & −2 \\ 1 & −λ \end{vmatrix} = λ^2 − 3λ + 2 = (λ − 1)(λ − 2),
and the eigenvalues are λ_1 = 1, λ_2 = 2.
λ_1 = 1. Row reducing the characteristic matrix we get
A − λ_1I = \begin{pmatrix} 2 & −2 \\ 1 & −1 \end{pmatrix} → \begin{pmatrix} 1 & −1 \\ 0 & 0 \end{pmatrix},
from which we see that an eigenvector of A will be v_1 = \begin{bmatrix} 1 \\ 1 \end{bmatrix}.
End of the example
λ_2 = 2. Row reducing the characteristic matrix we get
A − λ_2I = \begin{pmatrix} 1 & −2 \\ 1 & −2 \end{pmatrix} → \begin{pmatrix} 1 & −2 \\ 0 & 0 \end{pmatrix},
from which we see that an eigenvector of A will be v_2 = \begin{bmatrix} 2 \\ 1 \end{bmatrix}.
Since x_0 = −v_1 + 2v_2, we have that
x_{k−1} = −1^{k−1}v_1 + 2 · 2^{k−1}v_2 = −\begin{bmatrix} 1 \\ 1 \end{bmatrix} + 2^k \begin{bmatrix} 2 \\ 1 \end{bmatrix} = \begin{bmatrix} −1 + 2^{k+1} \\ −1 + 2^k \end{bmatrix},
and therefore p_k = 2^k − 1.
Fibonacci sequence
Example 4
In many biological systems we observe the Fibonacci numbers
1, 1, 2, 3, 5, 8, 13, 21, 34, 55, 89, …,
which are defined as follows: F_1 = 1, F_2 = 1, F_{n+1} = F_n + F_{n−1}.
Using the developed technique we may obtain the formula
F_n = (1/√5)((1 + √5)/2)^n − (1/√5)((1 − √5)/2)^n.
It is called Binet's formula but was actually discovered first by A. de Moivre (1667–1754) (see Textbook p. 349).
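A short numerical check of Binet's formula (my own sketch, not from the slides):

```python
from math import sqrt

def binet(n):
    s5 = sqrt(5)
    return (((1 + s5) / 2) ** n - ((1 - s5) / 2) ** n) / s5

F = [1, 1]                 # F_1, F_2
while len(F) < 20:
    F.append(F[-1] + F[-2])
print(all(round(binet(n + 1)) == F[n] for n in range(20)))   # True
```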
To Summarise
As a result of today's lecture (together with your further study) you should
• be able to apply the diagonalisation technique to analyse discrete time system evolution;
• be able to find a closed formula for a recurrence relation.
What to do now
Consolidation: Textbook: Try exercises: §4.4: # 3, 7, 13, 15, 27; Lecture Notes: Exercises of Sec. 3.2–3.4.1.
Next topic: The Cayley–Hamilton Theorem, Minimal Polynomials
Colour code: Red
Preparation: Go through the text: Textbook: §6.1. Lecture Notes: Sec. 3.1–3.3.
MATHS 253, Algebra. Lecture 6
11 March 2022
Today’s topic: Diagonalisation.
Colour code: Yellow/Orange
Preparation: Go through the text: Lecture Notes: Sec. 2.2. Textbook: §4.3, 4.4; pp. 307–320.
Diagonalisable operators
Definition 1
Let V be a finite-dimensional vector space over an arbitrary field F with dim(V) = n. A linear operator T : V → V is called diagonalisable if it possesses a set of eigenvectors {v_1, …, v_n} which form a basis of V.
Example 2
Let T be a linear operator on the vector space R_1[x] of all polynomials of degree ≤ 1 given by
T(a + bx) = (a + 2b)(3x − 5).
Then the basis {f_1(x), f_2(x)}, where f_1(x) = −2 + x and f_2(x) = 3x − 5, diagonalises T, since T(f_1) = 0 and T(f_2) = f_2, and moreover the system {f_1(x), f_2(x)} is linearly independent.
What are we diagonalising?
The rotation R_θ : R^2 → R^2, for 0 < θ < π, is not diagonalisable over R since it has no eigenvectors at all. Indeed, let us take for example θ = 90°. Then
R_{π/2} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} 0 & −1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} x_1 \\ x_2 \end{bmatrix} = \begin{bmatrix} −x_2 \\ x_1 \end{bmatrix},
and the characteristic polynomial of the latter matrix will be
\begin{vmatrix} −λ & −1 \\ 1 & −λ \end{vmatrix} = λ^2 + 1,
which has no roots in R.
Matrix or the operator?
However, it has two roots in C, namely i and −i. The linear operator T : C^2 → C^2 given by the same matrix,
T \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} = \begin{bmatrix} 0 & −1 \\ 1 & 0 \end{bmatrix} \begin{bmatrix} z_1 \\ z_2 \end{bmatrix}, \begin{bmatrix} z_1 \\ z_2 \end{bmatrix} ∈ C^2,
is diagonalisable.
The characteristic polynomial
\begin{vmatrix} −λ & −1 \\ 1 & −λ \end{vmatrix} = λ^2 + 1
will now have two roots in C, which are λ_{1,2} = ±i. The basis of eigenvectors will consist of v_1 = (i, 1)^T and v_2 = (−i, 1)^T.
Hence diagonalisability is a property of the operator, not of the matrix!
Why is it called diagonalisation?
Suppose that T : F^n → F^n has the standard matrix A, is diagonalisable, and let B = {v_1, …, v_n} be a basis of eigenvectors of T belonging to the eigenvalues λ_1, …, λ_n, respectively. Then
T(v_1) = λ_1v_1, T(v_2) = λ_2v_2, …, T(v_n) = λ_nv_n ⟹ [T]_B = \begin{bmatrix} λ_1 & 0 & ⋯ & 0 \\ 0 & λ_2 & ⋯ & 0 \\ ⋯ & ⋯ & ⋯ & ⋯ \\ 0 & 0 & ⋯ & λ_n \end{bmatrix}.
We have
P^{−1}AP = P_{B←E}[T]_EP_{E←B} = [T]_B = \begin{bmatrix} λ_1 & 0 & ⋯ & 0 \\ 0 & λ_2 & ⋯ & 0 \\ ⋯ & ⋯ & ⋯ & ⋯ \\ 0 & 0 & ⋯ & λ_n \end{bmatrix}
for P = P_{E←B} = [[v_1]_E … [v_n]_E] = [v_1 … v_n].
Criterion of diagonalisability 1
Let dim V = n and let T : V → V be a linear operator which has n distinct eigenvalues λ_1, …, λ_n. Then T is diagonalisable.
Let {v_1, …, v_n} be the corresponding eigenvectors. Suppose
a_1v_1 + ⋯ + a_kv_k = 0 (1)
is the “shortest” nontrivial linear combination of these vectors. Note that all coefficients a_1, …, a_k are nonzero.
Let us apply T to both sides of equation (1):
T(a_1v_1 + ⋯ + a_kv_k) = λ_1a_1v_1 + ⋯ + λ_ka_kv_k = T(0) = 0. (2)
Multiplying (1) by λ_1 and subtracting the result from (2) we get
(λ_2 − λ_1)a_2v_2 + ⋯ + (λ_k − λ_1)a_kv_k = 0.
This new linear combination is nontrivial and “shorter” than (1). This contradiction proves the theorem.
Example
By calculating the eigenvalues, show that the linear operator T : R^3 → R^3 with the standard matrix
A = \begin{bmatrix} 1 & 1 & −1 \\ 2 & 1 & 2 \\ 2 & −1 & 4 \end{bmatrix}
is diagonalisable. Find an invertible matrix P and a diagonal matrix D such that P^{−1}AP = D.
Example continued
The characteristic polynomial of the matrix A is equal to
det(A − λI) = \begin{vmatrix} 1−λ & 1 & −1 \\ 2 & 1−λ & 2 \\ 2 & −1 & 4−λ \end{vmatrix} = \begin{vmatrix} 1−λ & 1 & −1 \\ 2 & 1−λ & 2 \\ 0 & −2+λ & 2−λ \end{vmatrix}
= (λ − 2) \begin{vmatrix} 1−λ & 1 & −1 \\ 2 & 1−λ & 2 \\ 0 & 1 & −1 \end{vmatrix} = (λ − 2) \begin{vmatrix} 1−λ & 0 & 0 \\ 2 & 1−λ & 2 \\ 0 & 1 & −1 \end{vmatrix}
= −(λ − 2)(λ − 1) \begin{vmatrix} 1−λ & 2 \\ 1 & −1 \end{vmatrix} = −(λ − 1)(λ − 2)(λ − 3).
Therefore there are three eigenvalues λ_1 = 1, λ_2 = 2, λ_3 = 3, each of algebraic multiplicity 1, and the operator is diagonalisable.
Example continued
λ_1 = 1. The eigenvectors are the non-zero solutions to the system (A − λ_1I)x = 0, where x = (x_1, x_2, x_3)^T. We find
A − λ_1I = A − I = \begin{bmatrix} 0 & 1 & −1 \\ 2 & 0 & 2 \\ 2 & −1 & 3 \end{bmatrix} → \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & −1 \\ 0 & 0 & 0 \end{bmatrix} (rref).
This row reduced echelon form corresponds to the system
x_1 + x_3 = 0
x_2 − x_3 = 0
with solutions spanned by the vector f_1 = (−1, 1, 1)^T. This is the first eigenvector sought.
Example continued
λ_2 = 2. The eigenvectors are the non-zero solutions to the system (A − λ_2I)x = 0, where x = (x_1, x_2, x_3)^T. We find
A − λ_2I = A − 2I = \begin{bmatrix} −1 & 1 & −1 \\ 2 & −1 & 2 \\ 2 & −1 & 2 \end{bmatrix} → \begin{bmatrix} 1 & 0 & 1 \\ 0 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} (rref).
This row reduced echelon form corresponds to the system
x_1 + x_3 = 0
x_2 = 0
with solutions spanned by the vector f_2 = (−1, 0, 1)^T. This is the second eigenvector sought.
Example continued
λ_3 = 3. The eigenvectors are the non-zero solutions to the system (A − λ_3I)x = 0, where x = (x_1, x_2, x_3)^T. Let us row reduce the characteristic matrix:
A − λ_3I = A − 3I = \begin{bmatrix} −2 & 1 & −1 \\ 2 & −2 & 2 \\ 2 & −1 & 1 \end{bmatrix} → \begin{bmatrix} 1 & 0 & 0 \\ 0 & 1 & −1 \\ 0 & 0 & 0 \end{bmatrix} (rref).
This row reduced echelon form corresponds to the system
x_1 = 0
x_2 − x_3 = 0
with solutions spanned by the vector f_3 = (0, 1, 1)^T. This is the third eigenvector sought.
Example continued
The invertible matrix P can now be obtained as
P = [f_1 f_2 f_3] = \begin{bmatrix} −1 & −1 & 0 \\ 1 & 0 & 1 \\ 1 & 1 & 1 \end{bmatrix},
and the diagonal matrix D such that P^{−1}AP = D will be
D = \begin{bmatrix} λ_1 & 0 & 0 \\ 0 & λ_2 & 0 \\ 0 & 0 & λ_3 \end{bmatrix} = \begin{bmatrix} 1 & 0 & 0 \\ 0 & 2 & 0 \\ 0 & 0 & 3 \end{bmatrix}.
We do not have to compute D, as we know the eigenvalues already.
It is essential that the eigenvalues appear in the same order in D as their corresponding eigenvectors do in P.
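A numpy check of P⁻¹AP = D for this example (my sketch, not from the slides):

```python
import numpy as np

A = np.array([[1.0, 1.0, -1.0],
              [2.0, 1.0, 2.0],
              [2.0, -1.0, 4.0]])
P = np.array([[-1.0, -1.0, 0.0],
              [1.0, 0.0, 1.0],
              [1.0, 1.0, 1.0]])   # columns are f1, f2, f3
print(np.round(np.linalg.inv(P) @ A @ P, 10))   # diag(1, 2, 3)
```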
Example 3
Let λ_1, …, λ_n be distinct real numbers. Prove that the set of functions {exp(λ_1x), …, exp(λ_nx)} is linearly independent.
The operator of differentiation D : f(x) ↦ f′(x) is an operator on the subspace V = span{exp(λ_1x), …, exp(λ_nx)}, and
D(exp(λ_ix)) = λ_i exp(λ_ix),
so these functions are all eigenvectors belonging to distinct eigenvalues λ_1, …, λ_n.
In particular, the restriction of D onto V is diagonalisable.
Algebraic and geometric multiplicity (recap)
Definition 4
Let λ_0 be an eigenvalue of A and let (λ − λ_0)^k be the highest power of λ − λ_0 that divides p_A(λ). Then k is said to be the algebraic multiplicity of λ_0.
Definition 5
The eigenvectors belonging to the eigenvalue λ_0, together with the zero vector, form a subspace, which is the null space of the matrix A − λ_0I_n. This subspace is called the eigenspace belonging to λ_0. The dimension of this space is called the geometric multiplicity of λ_0.
Exercise 1
What is the geometric multiplicity of the eigenvalue 3 of the matrix
A = \begin{bmatrix} 3 & 0 & 1 \\ 0 & 3 & 1 \\ 0 & 0 & 3 \end{bmatrix}?
The matrix
A − 3I = \begin{bmatrix} 0 & 0 & 1 \\ 0 & 0 & 1 \\ 0 & 0 & 0 \end{bmatrix}
has rank 1, hence the geometric multiplicity of this eigenvalue is 3 − 1 = 2.
Relation between the two multiplicities
Theorem 6
The geometric multiplicity of an eigenvalue is never greater than its algebraic multiplicity.
Proof of the theorem
Let g = dim E(λ_0) be the geometric multiplicity of λ_0 and let {v_1, …, v_g} be a basis of E(λ_0). Then T(v_i) = λ_0v_i for i = 1, 2, …, g.
Extend this set to a basis B = {v_1, …, v_g, v_{g+1}, …, v_n}, where n = dim V. The first g columns of A = [T]_B will then be
[T(v_i)]_B = (0, …, λ_0, …, 0)^T (λ_0 in the i-th position) ⟹ A = \begin{bmatrix} λ_0I_g & ? \\ 0 & C \end{bmatrix},
where I_g is the identity matrix of order g and ? can be anything. Then
p_A(λ) = (λ_0 − λ)^g p_C(λ),
and the algebraic multiplicity is at least g.
Criterion of diagonalisability 2
Theorem 7
A linear operator T : V → V acting on an n-dimensional vector space V is diagonalisable if and only if
• the sum of the algebraic multiplicities is equal to n, and
• for each eigenvalue, its geometric multiplicity is equal to its algebraic multiplicity.
Sketch of the proof:
Let λ_1, …, λ_k be the eigenvalues of T and n_1, …, n_k their algebraic multiplicities. If n_1 + ⋯ + n_k < n, then
dim E(λ_1) + ⋯ + dim E(λ_k) ≤ n_1 + ⋯ + n_k < n,
and T is not diagonalisable, as we cannot find n linearly independent eigenvectors.
End of the proof
If n_1 + ⋯ + n_k = n, then
dim E(λ_1) + ⋯ + dim E(λ_k) ≤ n_1 + ⋯ + n_k = n.
T is diagonalisable in this case if and only if dim E(λ_i) = n_i for each i, i.e., all geometric multiplicities are equal to their respective algebraic ones.
Exercise 2
Is the matrix
A = \begin{bmatrix} 2 & 1 & 4 & 8 \\ 0 & 3 & 2 & 3 \\ 0 & 0 & 1 & 7 \\ 0 & 0 & 0 & 6 \end{bmatrix}
diagonalisable?
Yes: its eigenvalues are its diagonal entries 1, 2, 3, 6. There are four of them and they are all distinct.
Exercise 3
Is the matrix
B = \begin{bmatrix} 2 & 1 & 0 & 0 \\ 0 & 2 & 0 & 0 \\ 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 2 \end{bmatrix}
diagonalisable?
2 is its only eigenvalue. The matrix
B − 2I = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 \end{bmatrix}
has rank 2, hence the geometric multiplicity of 2 is 4 − 2 = 2 ≠ 4. Not diagonalisable.
Calculation of large powers of matrices
If a matrix A is diagonalisable, it is easy to calculate, say, A^2019. Indeed, we have for some invertible matrix P that P^{−1}AP = D, where D is diagonal. Then A = PDP^{−1} and
A^m = PDP^{−1} · PDP^{−1} · … · PDP^{−1} = PD^mP^{−1}.
Also
D = \begin{bmatrix} λ_1 & 0 & ⋯ & 0 \\ 0 & λ_2 & ⋯ & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & ⋯ & λ_n \end{bmatrix} ⟹ D^m = \begin{bmatrix} λ_1^m & 0 & ⋯ & 0 \\ 0 & λ_2^m & ⋯ & 0 \\ ⋮ & ⋮ & ⋱ & ⋮ \\ 0 & 0 & ⋯ & λ_n^m \end{bmatrix}.
Hence A^m can be calculated.
Calculation of large powers of matrices. Example
Calculate A^2019, where A = (1/5) \begin{bmatrix} 4 & −3 \\ −3 & −4 \end{bmatrix}.
Let us diagonalise this matrix first. The characteristic polynomial of A would be
\begin{vmatrix} 4/5 − λ & −3/5 \\ −3/5 & −4/5 − λ \end{vmatrix} = λ^2 − 1.
So we have two eigenvalues λ_1 = 1, λ_2 = −1.
Calculation of large powers of matrices. Example
The corresponding eigenvectors will be:
1. λ_1 = 1. We have
A − λ_1I = \begin{bmatrix} −1/5 & −3/5 \\ −3/5 & −9/5 \end{bmatrix} → \begin{bmatrix} 1 & 3 \\ 0 & 0 \end{bmatrix} (rref), v_1 = \begin{bmatrix} 3 \\ −1 \end{bmatrix}.
2. λ_2 = −1. We have
A − λ_2I = \begin{bmatrix} 9/5 & −3/5 \\ −3/5 & 1/5 \end{bmatrix} → \begin{bmatrix} 3 & −1 \\ 0 & 0 \end{bmatrix} (rref), v_2 = \begin{bmatrix} 1 \\ 3 \end{bmatrix}.
So for P = \begin{bmatrix} 3 & 1 \\ −1 & 3 \end{bmatrix} we have P^{−1}AP = \begin{bmatrix} 1 & 0 \\ 0 & −1 \end{bmatrix} = D. Now
A^2019 = PD^2019P^{−1} = PDP^{−1} = A.
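Indeed, since D² = I we have A² = I, and the claim can be confirmed with numpy (my sketch, not from the slides):

```python
import numpy as np

A = np.array([[4.0, -3.0],
              [-3.0, -4.0]]) / 5
print(np.allclose(A @ A, np.eye(2)))                    # True: A^2 = I
print(np.allclose(np.linalg.matrix_power(A, 2019), A))  # True: A^2019 = A
```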
To Summarise
As a result of today's lecture (together with your further study) you should
• know the concept of diagonalisation and criteria of diagonalisability;
• know the definition of the algebraic and geometric multiplicities of an eigenvalue and the relation between them;
• know how to calculate both multiplicities;
• be able to apply diagonalisability to calculate large powers of matrices.
What to do now
Consolidation: Textbook: Try exercises: §4.4: # 3, 7, 13, 15, 27; Lecture Notes: Exercises of Sec. 3.2–3.3.
Next topic: Discrete time system evolution.
Colour code: Yellow/Orange
Preparation: Go through the text: Textbook: §4.6, pp. 336–343. Lecture Notes: Sec. 3.4.
MATHS 253, Algebra. Lecture 14
30 March 2022
Today's topic: Orthogonal diagonalisation of symmetric matrices. Spectral decomposition.
Colour code: Red
Preparation: Go through the text: Textbook: §5.4, pp. 411–418. Lecture Notes: Sec. 5.2
Orthogonally diagonalisable matrices
Definition 1
A real n × n matrix A is called orthogonally diagonalisable if for some diagonal matrix D (with real entries) and some orthogonal matrix Q we have
Q^{−1}AQ = D.
We note that orthogonal diagonalisability of a real n × n matrix A is equivalent to orthogonal diagonalisability of the operator T(x) = Ax.
Theorem 2
A real matrix A is orthogonally diagonalisable if and only if it is symmetric, i.e., A^T = A.
Proof. Part 1
Suppose that A is orthogonally diagonalisable and Q^{−1}AQ = D, where Q^T = Q^{−1}, i.e., Q is orthogonal. Then
A = QDQ^{−1} = QDQ^T
and
A^T = (QDQ^T)^T = (Q^T)^T D^T Q^T = QDQ^T = A,
that is, A is symmetric.
Proof. Part 2
Let now A be a real symmetric matrix. We know that the operator T : C^n → C^n given by T(z) = Az, z ∈ C^n, is Hermitian and hence orthogonally diagonalisable. Can we deduce our theorem from this?
Not immediately! We have an orthonormal basis of eigenvectors {v_1, …, v_n} for A in C^n, but they may not be real.
Let us prove that we can choose them real. Suppose λ is an eigenvalue of geometric multiplicity k. Then the corresponding eigenspace E_λ in C^n is the k-dimensional solution-space of
(A − λI_n)x = 0.
In particular, rank(A − λI_n) = n − k.
We know that diagonalisability of A means that the sum of the geometric multiplicities will be equal to n.
Proof. Part 2 continues
As A and λ are real, our algorithm for finding a basis of the solution-space will find a basis of the eigenspace E_λ consisting of k real vectors.
Within each eigenspace we can orthonormalise the basis obtained. For distinct eigenvalues the eigenspaces are orthogonal.
Gathering them all together, we can find an orthonormal basis B = {v_1, …, v_n} of R^n, with v_i being an eigenvector belonging to λ_i, which diagonalises the operator T : R^n → R^n given by T(x) = Ax, where x ∈ R^n.
Since B is orthonormal, the matrix
Q = P_{E←B} = [v_1 | v_2 | … | v_n]
is orthogonal and
D = [T]_B = P_{B←E}AP_{E←B} = Q^{−1}AQ.
Example
Find an orthonormal basis B consisting of eigenvectors of the linear operator T : R^3 → R^3 given by T(x) = Ax, where
A = \begin{bmatrix} −1 & 2 & 2 \\ 2 & −1 & 2 \\ 2 & 2 & −1 \end{bmatrix},
and the diagonal matrix D of T relative to B. Find an orthogonal matrix P such that P^TAP = P^{−1}AP = D.
Characteristic polynomial
Solution: The characteristic polynomial of the matrix A is
det(A − λI_3) = \begin{vmatrix} −1−λ & 2 & 2 \\ 2 & −1−λ & 2 \\ 2 & 2 & −1−λ \end{vmatrix} = \begin{vmatrix} −1−λ & 2 & 2 \\ 2 & −1−λ & 2 \\ 0 & 3+λ & −3−λ \end{vmatrix}
= (λ + 3) \begin{vmatrix} −1−λ & 2 & 2 \\ 2 & −1−λ & 2 \\ 0 & 1 & −1 \end{vmatrix} = (λ + 3) \begin{vmatrix} −1−λ & 4 & 2 \\ 2 & 1−λ & 2 \\ 0 & 0 & −1 \end{vmatrix}
= −(λ + 3)(λ^2 − 9) = −(λ + 3)^2(λ − 3).
Therefore there are two distinct eigenvalues, λ_{1,2} = −3 and λ_3 = 3. The first one has multiplicity 2 (both algebraic and geometric).
Eigenvectors for λ_{1,2} = −3
A − λ_1I = A + 3I = \begin{bmatrix} 2 & 2 & 2 \\ 2 & 2 & 2 \\ 2 & 2 & 2 \end{bmatrix} → \begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix}.
This row reduced echelon form corresponds to the system
x_1 + x_2 + x_3 = 0,
whose solutions are spanned by the vectors f_1 = (−1, 1, 0)^T and f_2 = (−1, 0, 1)^T. We have to orthonormalise this system:
g_1 = f_1 = (−1, 1, 0)^T,
g_2 = f_2 − (f_2 · g_1/‖g_1‖^2)g_1 = (−1, 0, 1)^T − (1/2)(−1, 1, 0)^T = (−1/2, −1/2, 1)^T.
Normalising these vectors we have v_1 = (1/√2)(−1, 1, 0)^T and v_2 = (1/√6)(−1, −1, 2)^T.
Eigenvectors for λ_3 = 3
A − λ_3I = A − 3I = \begin{bmatrix} −4 & 2 & 2 \\ 2 & −4 & 2 \\ 2 & 2 & −4 \end{bmatrix} → \begin{bmatrix} 1 & 1 & −2 \\ 1 & −2 & 1 \\ 0 & 0 & 0 \end{bmatrix} → \begin{bmatrix} 1 & 1 & −2 \\ 0 & −3 & 3 \\ 0 & 0 & 0 \end{bmatrix} → \begin{bmatrix} 1 & 0 & −1 \\ 0 & 1 & −1 \\ 0 & 0 & 0 \end{bmatrix}.
This row reduced echelon form corresponds to the system
x_1 − x_3 = 0
x_2 − x_3 = 0,
whose solutions are spanned by the vector f_3 = (1, 1, 1)^T. Normalising it we have the third vector v_3 = (1/√3)(1, 1, 1)^T of the orthonormal basis {v_1, v_2, v_3} sought.
Discussion
It was not necessary to solve the last system if we knew the theory.
After we discovered that the eigenspace belonging to λ_{1,2} = −3 is given as the solution-space of the system x_1 + x_2 + x_3 = 0, it was clear that the orthogonal complement of this solution-space is spanned by the vector (1, 1, 1)^T.
Since an orthonormal basis of eigenvectors exists, this vector must be an eigenvector for the eigenvalue λ_3 = 3.
The matrix of T relative to B = {v_1, v_2, v_3} will be
[T]_B = \begin{bmatrix} −3 & 0 & 0 \\ 0 & −3 & 0 \\ 0 & 0 & 3 \end{bmatrix}.
The answer
For the orthogonal matrix
P = [v_1 | v_2 | v_3] = \begin{bmatrix} −1/√2 & −1/√6 & 1/√3 \\ 1/√2 & −1/√6 & 1/√3 \\ 0 & 2/√6 & 1/√3 \end{bmatrix}
we have
P^{−1}AP = P^TAP = \begin{bmatrix} −3 & 0 & 0 \\ 0 & −3 & 0 \\ 0 & 0 & 3 \end{bmatrix}.
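A quick numpy verification of the answer (my own sketch, not from the slides):

```python
import numpy as np

A = np.array([[-1.0, 2.0, 2.0],
              [2.0, -1.0, 2.0],
              [2.0, 2.0, -1.0]])
s2, s6, s3 = np.sqrt(2), np.sqrt(6), np.sqrt(3)
P = np.array([[-1/s2, -1/s6, 1/s3],
              [ 1/s2, -1/s6, 1/s3],
              [ 0.0,   2/s6, 1/s3]])   # columns are v1, v2, v3

print(np.allclose(P.T @ P, np.eye(3)))  # True: P is orthogonal
print(np.round(P.T @ A @ P, 10))        # diag(-3, -3, 3)
```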
Spectral Decomposition
The following theorem says that every Hermitian operator is a linear combination of projections with real coefficients.
Theorem 3 (Spectral Decomposition)
Let T be a Hermitian operator acting on a vector space V which is R^n or C^n. Let {v_1, …, v_n} be an orthonormal basis of V consisting of eigenvectors of T belonging to real eigenvalues λ_1, …, λ_n, respectively. Then
T = λ_1P_{v_1} + ⋯ + λ_nP_{v_n},
where P_{v_i} is the projection operator onto v_i.
Proof
An arbitrary vector v can be represented as a linear combination v = a_1v_1 + ⋯ + a_nv_n of the eigenvectors. Firstly, we find
(λ_iP_{v_i})(a_jv_j) = λ_i(P_{v_i}(a_jv_j)) = λ_ia_jP_{v_i}(v_j) = λ_ia_iv_i if i = j, and 0 if i ≠ j.
Now
T(a_1v_1 + ⋯ + a_nv_n) = a_1λ_1v_1 + ⋯ + a_nλ_nv_n
and
[λ_1P_{v_1} + ⋯ + λ_nP_{v_n}](a_1v_1 + ⋯ + a_nv_n) = a_1λ_1v_1 + ⋯ + a_nλ_nv_n,
i.e., T(v) = [λ_1P_{v_1} + ⋯ + λ_nP_{v_n}](v). The two operators T and λ_1P_{v_1} + ⋯ + λ_nP_{v_n} agree on an arbitrary vector v of V and hence are equal.
Example
Let
A = \begin{bmatrix} −1 & 1 & 1 \\ 1 & −1 & 1 \\ 1 & 1 & −1 \end{bmatrix}.
Find the spectral decomposition of A.
We note that λ_{1,2} = −2 is an eigenvalue of geometric multiplicity 2, since
A − (−2)I = \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix} → \begin{bmatrix} 1 & 1 & 1 \\ 0 & 0 & 0 \\ 0 & 0 & 0 \end{bmatrix} (rref).
The eigenspace is spanned by v_1 = (1/√2)(1, −1, 0)^T and v_2 = (1/√6)(1, 1, −2)^T.
Example continued
As in a similar situation considered before, the vector v_3 = (1/√3)(1, 1, 1)^T will be the third eigenvector.
Av_3 = (1/√3) \begin{bmatrix} −1 & 1 & 1 \\ 1 & −1 & 1 \\ 1 & 1 & −1 \end{bmatrix} \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = (1/√3) \begin{bmatrix} 1 \\ 1 \\ 1 \end{bmatrix} = v_3,
so the third eigenvalue is λ_3 = 1.
So we get
A = λ_{1,2}(P_{v_1} + P_{v_2}) + λ_3P_{v_3} = −2 ( (1/2) \begin{bmatrix} 1 & −1 & 0 \\ −1 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} + (1/6) \begin{bmatrix} 1 & 1 & −2 \\ 1 & 1 & −2 \\ −2 & −2 & 4 \end{bmatrix} ) + 1 · (1/3) \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}.
The spectral decomposition is useful when functions of matrices are considered.
Functions of matrices
Definition 4
Let f(x) = a0 + a1x + a2x² + · · · + anx^n be a polynomial and T : V → V be a linear operator. We define

f(T) := a0I + a1T + a2T² + · · · + anT^n,

where I is the identity operator.

Theorem 5
Let f(x) be a polynomial and

T = λ1Pv1 + · · · + λnPvn

be the spectral decomposition of a Hermitian operator T. Then

f(T) = f(λ1)Pv1 + · · · + f(λn)Pvn.

16 / 23
Proof

It is enough to notice that

T^k = λ1^k Pv1 + · · · + λn^k Pvn.

This follows from the fact that PviPvj = 0 when i ≠ j and PviPvi = Pvi for all i. □

The formula
f(T) = f(λ1)Pv1 + · · · + f(λn)Pvn
can be used to define such functions of a matrix as the logarithm, the exponential, etc.

17 / 23
Example

Let A =
[ 5 4 ]
[ 4 5 ]

Find a matrix B such that A = B².

Solution: The two eigenvalues of A are λ1 = 1 and λ2 = 9.

The normalised eigenvectors will be v1 = (1/√2)(1, −1)ᵀ and v2 = (1/√2)(1, 1)ᵀ, respectively.

The projection matrices will be

Pv1 = (1/2)
[  1 −1 ]
[ −1  1 ]
 , Pv2 = (1/2)
[ 1 1 ]
[ 1 1 ]

18 / 23
Example continued

Then

A = 1 · (1/2)
[  1 −1 ]
[ −1  1 ]
 + 9 · (1/2)
[ 1 1 ]
[ 1 1 ]

is the spectral decomposition of A. As (±3)² = 9, we look for B in the form:

B1 = 1 · (1/2)
[  1 −1 ]
[ −1  1 ]
 + 3 · (1/2)
[ 1 1 ]
[ 1 1 ]
 =
[ 2 1 ]
[ 1 2 ]

B2 = 1 · (1/2)
[  1 −1 ]
[ −1  1 ]
 − 3 · (1/2)
[ 1 1 ]
[ 1 1 ]
 =
[ −1 −2 ]
[ −2 −1 ]

We check that indeed B1² = B2² = A. So we have four solutions: ±B1, ±B2.
19 / 23
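A short numpy check of the square root just computed (illustrative only): assemble B1 from the two projection matrices and confirm that B1² = A.

import numpy as np

A = np.array([[5.0, 4.0],
              [4.0, 5.0]])

# Spectral decomposition A = 1*Pv1 + 9*Pv2, square root B1 = 1*Pv1 + 3*Pv2.
Pv1 = 0.5 * np.array([[1.0, -1.0], [-1.0, 1.0]])
Pv2 = 0.5 * np.array([[1.0, 1.0], [1.0, 1.0]])

B1 = 1.0 * Pv1 + 3.0 * Pv2
print(B1)                          # [[2. 1.] [1. 2.]]
print(np.allclose(B1 @ B1, A))     # True: B1 is a square root of A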
Cayley-Hamilton Theorem
Theorem 6
Every linear operator T : V → V acting on a finite-dimensional vector space V is a root of its characteristic polynomial. That is, if

pT(λ) = det(A − λI),

where A is a matrix of T in one of the bases, then pT(T) = 0.

Proof: We will prove this theorem for all Hermitian operators (although it is true for all operators). For any Hermitian operator T the spectral decomposition exists:

T = λ1Pv1 + · · · + λnPvn,

and therefore
pT(T) = pT(λ1)Pv1 + · · · + pT(λn)Pvn = 0

since pT(λi) = 0 for all i = 1, . . . , n.

20 / 23
Cayley-Hamilton Theorem. Example

Let

A =
[ 1 2 ]
[ 2 1 ]

Then

pA(λ) =
| 1−λ   2  |
|  2   1−λ |
 = λ² − 2λ − 3.

We have

A² =
[ 5 4 ]
[ 4 5 ]

and

A² − 2A − 3I =
[ 5 4 ]
[ 4 5 ]
 − 2
[ 1 2 ]
[ 2 1 ]
 − 3
[ 1 0 ]
[ 0 1 ]
 =
[ 0 0 ]
[ 0 0 ]
21 / 23
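The same computation takes one line in numpy; a minimal sketch (not from the slides) confirming that A is a root of its characteristic polynomial:

import numpy as np

A = np.array([[1.0, 2.0],
              [2.0, 1.0]])

# p_A(lambda) = lambda^2 - 2*lambda - 3, so p_A(A) should be the zero matrix.
pA_of_A = A @ A - 2.0 * A - 3.0 * np.eye(2)
print(pA_of_A)  # [[0. 0.] [0. 0.]]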
To Summarise

As a result of today's lecture (together with your further study) you should
• be able to find an orthonormal basis of eigenvectors for a symmetric matrix;
• be able to find the spectral decomposition of a Hermitian operator;
• know how functions of matrices are defined and be able to calculate, say, the square root of a matrix;
• know the Cayley-Hamilton theorem.

22 / 23

What to do now

Consolidation: Try exercises:
Textbook: §5.4; pp. 418–419: # 1, 3, 5, 17, 19, 21, 23.
Lecture Notes: Exercises of Chap. 5.

Next topic: Quadratic Forms.

Colour code: Red

Preparation: Go through the text:
Textbook: §5.5; pp. 425–432.
Lecture Notes: Sec. 6.1, 6.2.
23 / 23
MATHS 253, Algebra. Lecture 18
8 April 2022
Today’s topic: Congruent diagonalisation. Sufficiency of Sylvester’s criterion.
Colour code: Orange
Preparation: Go through the text: Lecture Notes: Sec. 6.5.2-6.5.4.
1 / 22
Introduction to Congruent Diagonalisation

To prove the sufficiency of Sylvester's criterion we need to learn about congruent diagonalisation of matrices.

We also need a less computationally involved method of determining whether the form is positive definite, negative definite or indefinite.
• In general eigenvalues are very difficult to find,
• and calculating a large number of determinants for Sylvester's criterion may not be much fun either.

The idea of a new method: we do not need an orthonormal basis to check whether the form is positive definite or not. Any basis, in which the form does not have cross-terms, will do.

2 / 22
The idea explained

The problem which we now put forward is to find an arbitrary basis B = {v1, . . . , vn} in which the form is given by

Q(x) = c1z1² + · · · + cnzn²,

where [x]B = (z1, . . . , zn)ᵀ. This is equivalent to the fact that there is a nondegenerate matrix P such that PᵀAP = D, where D is a diagonal matrix

D =
[ c1  0  . . .  0  ]
[ 0   c2 . . .  0  ]
[ . . . . . . . . .]
[ 0   0  . . .  cn ]

Now we cannot use diagonalisability of A, since for a non-orthogonal matrix P we normally have P⁻¹ ≠ Pᵀ.

3 / 22
Congruent diagonalisation
This process is called congruent diagonalisation because of the followingdefinition.
Definition 1
Matrices A and B are called congruent if there exists an invertible matrix P such that B = PᵀAP. The process of finding an invertible matrix P such that PᵀAP is diagonal is called congruent diagonalisation of A.
We will use elementary matrices for this so we will briefly recap what they do.
4 / 22
Elementary matrices
We have three types of elementary row operations that were used in reducing a matrix to its reduced row echelon form (rref):
Type 1. R1 := Row i ↔ Row j;
Type 2. R2 := Row i → a Row i, where a ≠ 0;
Type 3. R3 := Row i → Row i + b Row j.

Definition 2
An elementary matrix is the result of application of one of the elementary row operations to the identity matrix In.
Respectively, we also have three types of elementary matrices.
5 / 22
Elementary matrices examples

The three matrices below are elementary:

E1 =
[ 1 0 0 ]
[ 0 0 1 ]
[ 0 1 0 ]
 , E2 =
[ 1  0 0 ]
[ 0 −3 0 ]
[ 0  0 1 ]
 , E3 =
[ 1 0 −2 ]
[ 0 1  0 ]
[ 0 0  1 ]

They are obtained from I3 by applying
• Row 2 ↔ Row 3,
• Row 2 → (−3) · Row 2, and
• Row 1 → Row 1 − 2 · Row 3,
respectively.
6 / 22
Use of elementary matrices
Now we know that to perform an elementary row operation R over the rows ofmatrix A it is enough to:
1. perform this elementary row operation over the rows of the identity matrix I,and
2. premultiply A by the elementary matrix E resulting from this elementary rowoperation.
That is:

If I −R→ E, then A −R→ EA.
7 / 22
Column operations

Since AEᵀ = (EAᵀ)ᵀ, it is clear that postmultiplication of a matrix A by Eᵀ will perform the corresponding column operation with columns of A. For example,

[ 1 0 0 ] [ 1 2 3 ]   [ 1 2 3 ]
[ 0 0 1 ] [ 4 5 6 ] = [ 7 8 9 ]
[ 0 1 0 ] [ 7 8 9 ]   [ 4 5 6 ]

[ 1 2 3 ] [ 1 0 0 ]   [ 1 3 2 ]
[ 4 5 6 ] [ 0 0 1 ] = [ 4 6 5 ]
[ 7 8 9 ] [ 0 1 0 ]   [ 7 9 8 ]
8 / 22
Elementary column operations
Our goal is to reduce a symmetric matrix to a diagonal matrix keeping thesymmetry all the way. The idea is to use matching row-column reductions. Everyrow operation has the corresponding counterpart among column operations:
Type 1. C1 := Col i ↔ Col j;
Type 2. C2 := Col i → a Col i, where a ≠ 0;
Type 3. C3 := Col i → Col i + b Col j.
9 / 22
Reducing a symmetric matrix
We have already observed that:
If I −R→ E, then A −R→ EA and A −C→ AEᵀ,

where R and C are matching row and column operations, e.g., R := Row i + a Row j and C := Col i + a Col j.

If A was symmetric, after consecutively performing operations R and C

A −R→ EA −C→ EAEᵀ

we obtain a symmetric matrix again.
10 / 22
Example
Let

A =
[ 1 0 2 ]
[ 0 2 0 ]
[ 2 0 3 ]

and let

E =
[  1 0 0 ]
[  0 1 0 ]
[ −2 0 1 ]

which corresponds to Row 3 → Row 3 − 2 Row 1. Then

EA =
[ 1 0  2 ]
[ 0 2  0 ]
[ 0 0 −1 ]
 , EAEᵀ =
[ 1 0  0 ]
[ 0 2  0 ]
[ 0 0 −1 ]
11 / 22
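A small numpy sketch of this example (not part of the slides): premultiplying by E performs the row operation, and postmultiplying by Eᵀ performs the matching column operation, preserving symmetry.

import numpy as np

A = np.array([[1.0, 0.0, 2.0],
              [0.0, 2.0, 0.0],
              [2.0, 0.0, 3.0]])

# E encodes Row 3 -> Row 3 - 2*Row 1.
E = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [-2.0, 0.0, 1.0]])

print(E @ A)           # row operation only: no longer symmetric
print(E @ A @ E.T)     # diag(1, 2, -1): symmetric again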
Congruent diagonalisability theorem
Theorem 3
Every symmetric matrix A is congruent to a diagonal matrix. It is possible to congruently diagonalise A using only elementary row and column operations of Type 3.

The algorithm of congruent diagonalisation of a square symmetric matrix A proceeds in two ways depending on whether we have

Case 1. a11 ≠ 0 or

Case 2. a11 = 0.
12 / 22
Case 1: a11 ≠ 0
By simultaneous row and column reduction, using the operations

Ri := Row i − (ai1/a11) Row 1,  Ci := Col i − (a1i/a11) Col 1,

for i = 2, . . . , n, due to ai1 = a1i, we can reduce A to a matrix with zeros in the first row and in the first column, except a11:

En · · · E2 A E2ᵀ · · · Enᵀ =
[ a11 0 ]
[ 0   B ]

with an (n−1) × (n−1) matrix B in the right bottom corner. If we denote P1 = E2ᵀ · · · Enᵀ, then

P1ᵀAP1 =
[ a11 0 ]
[ 0   B ]

The matrix P1ᵀAP1 is symmetric ((P1ᵀAP1)ᵀ = P1ᵀAP1), hence B is also symmetric. We apply the same steps to the (n−1) × (n−1) matrix B, and so on.
13 / 22
Case 2: a11 = 0
Suppose that for some i we have ai1 = a1i ≠ 0. Then we reduce the situation to the previous case by performing the two operations

R := Row 1 + Row i,  C := Col 1 + Col i,

and apply the first operations as above to get a matrix P1 such that

P1ᵀAP1 =
[ 2a1i 0 ]
[ 0    B ]

where the element 2a1i in the upper left corner is non-zero. If all elements of the first row and the first column are zeros, then we have

A =
[ 0 0 ]
[ 0 B ]

and we can apply the procedure to the matrix B.
14 / 22
Congruent diagonalisability theorem

Theorem 4
Every symmetric matrix A is congruent to a diagonal matrix. It is possible to congruently diagonalise A using only elementary row operations of Type 3, that is, to find a nondegenerate matrix P such that

PᵀAP =
[ c1  0  · · ·  0  ]
[ 0   c2 · · ·  0  ]
[ . . . . . . . . .]
[ 0   0  · · ·  cn ]

For every quadratic form Q(x) we can choose a basis F in which

Q(x) = c1y1² + c2y2² + . . . + cnyn²,

where [x]F = (y1, . . . , yn)ᵀ. The number of positive coefficients (the signature of the form) does not depend on the choice of F.

15 / 22
Example

Let Q : R³ → R be the quadratic form given by

Q(x) = x1² + 3x2² + x3² + 4x1x2 + 2x2x3.

1. Write down the matrix of this form in the standard basis E.
2. Congruently diagonalise the form Q, that is, find a basis F in which the form will be given by the formula

Q(x) = c1y1² + c2y2² + c3y3²,

where [x]F = (y1, y2, y3)ᵀ. Write down this basis and this formula, with all coefficients given numerically.

3. Is Q positive definite, negative definite or indefinite?
16 / 22
Solution

The matrix of this form is

A =
[ 1 2 0 ]
[ 2 3 1 ]
[ 0 1 1 ]

Let us do simultaneous row and column reduction of A by performing

R1 := Row 2 − 2 Row 1,  C1 := Col 2 − 2 Col 1:

[ 1 2 0 ]    [ 1  2 0 ]    [ 1  0 0 ]
[ 2 3 1 ] −→ [ 0 −1 1 ] −→ [ 0 −1 1 ]  = E1AE1ᵀ,
[ 0 1 1 ]    [ 0  1 1 ]    [ 0  1 1 ]

where

E1 =
[  1 0 0 ]
[ −2 1 0 ]
[  0 0 1 ]
17 / 22
Solution continued
Reducing further by performing

R2 := Row 3 + Row 2,  C2 := Col 3 + Col 2

we have

E1AE1ᵀ =
[ 1  0 0 ]    [ 1  0 0 ]    [ 1  0 0 ]
[ 0 −1 1 ] −→ [ 0 −1 1 ] −→ [ 0 −1 0 ]  = E2E1AE1ᵀE2ᵀ,
[ 0  1 1 ]    [ 0  0 2 ]    [ 0  0 2 ]

where E2E1 can be obtained as

E1 =
[  1 0 0 ]       [  1 0 0 ]
[ −2 1 0 ]  −R2→ [ −2 1 0 ]  = E2E1.
[  0 0 1 ]       [ −2 1 1 ]

We obtain this product by applying the row operation R2 to E1.

18 / 22
Solution continued
Denote

P = (E2E1)ᵀ =
[ 1 −2 −2 ]
[ 0  1  1 ]
[ 0  0  1 ]

Then

PᵀAP =
[ 1  0 0 ]
[ 0 −1 0 ]
[ 0  0 2 ]

and therefore in the basis F = {v1, v2, v3} of columns of P,

v1 = (1, 0, 0)ᵀ,  v2 = (−2, 1, 0)ᵀ,  v3 = (−2, 1, 1)ᵀ,

the form is presented by

Q(x) = z1² − z2² + 2z3²,

where [x]F = (z1, z2, z3)ᵀ. This form is therefore indefinite.

19 / 22
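As a numerical sanity check (an illustration, not part of the original solution), numpy confirms that the P found above congruently diagonalises A even though P is not orthogonal:

import numpy as np

A = np.array([[1.0, 2.0, 0.0],
              [2.0, 3.0, 1.0],
              [0.0, 1.0, 1.0]])
P = np.array([[1.0, -2.0, -2.0],
              [0.0, 1.0, 1.0],
              [0.0, 0.0, 1.0]])

# P is invertible but not orthogonal, so P^T != P^(-1); still P^T A P is diagonal.
print(P.T @ A @ P)   # diag(1, -1, 2): one negative coefficient -> indefinite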
Sufficiency of Sylvester's criterion
Suppose that

∆1 > 0, ∆2 > 0, . . . , ∆n > 0.

All elementary operations of congruent diagonalisation are of Type 3; moreover, in Case 1 we used only those that do not change the leading minors, i.e.,

Row i → Row i − a · Row j (with j < i).

We don't encounter Case 2, because otherwise we would have a leading minor of the form

∆k =
| c1  0  · · · 0     b1   |
| 0   c2 · · · 0     b2   |
| . . . . . . . . . . . . |
| 0   0  · · · ck−1  bk−1 |
| b1  b2 · · · bk−1  0    |
 = − Σ_{i=1}^{k−1} bi² ∏_{j≠i} cj ≤ 0,

contradicting ∆k > 0.

20 / 22
Sufficiency of Sylvester's criterion

Once an invertible matrix P is found such that

PᵀAP =
[ c1  0  . . .  0  ]
[ 0   c2 . . .  0  ]
[ . . . . . . . . .]
[ 0   0  . . .  cn ]

then ∆1 = c1, ∆2 = c1c2, . . . , ∆n = c1c2 . . . cn. If they are all positive, then all the ci are positive and the quadratic form is positive definite. If

∆1 < 0, ∆2 > 0, ∆3 < 0, ∆4 > 0, . . . ,

then all the ci are negative and the form is negative definite.

This proves the theorem.
21 / 22
What to do now
Consolidation: Lecture Notes: Sec. 6.5.
22 / 22
MATHS 253, Algebra. Lecture 12
25 March 2022

Today's topic: Complex inner product. Adjoint of an operator. Normal equation.

Colour code: Red

Preparation: Go through the text:
Lecture Notes: Sec. 4.4, 5.1.
Textbook: pp. 566–570; §7.3; pp. 591–607 (repetition of 250).
1 / 33
The Dot Product in Cⁿ

We have to extend the dot product to Cⁿ. We do this by setting

x · y := x̄ᵀy = x̄1y1 + x̄2y2 + . . . + x̄nyn

for any vectors x = (x1, . . . , xn)ᵀ and y = (y1, . . . , yn)ᵀ from Cⁿ, where x̄ = (x̄1, x̄2, . . . , x̄n)ᵀ.

The four main properties of the dot product survive virtually intact:
DP1. u · (v + w) = u · v + u · w, (u + v) · w = u · w + v · w;
DP2. u · v = \overline{v · u};
DP3. au · v = ā(u · v), u · av = a(u · v) for all u, v in Cⁿ and for all scalars a from C;
DP4. u · u ≥ 0 for each vector u in Cⁿ, and u · u = 0 only for u = 0.
2 / 33
Let us prove DP2, for example. For this we need to know the following properties of conjugation:

\overline{x + y} = x̄ + ȳ,  \overline{xy} = x̄ ȳ,  \overline{x̄} = x.

Using them for u = (x1, . . . , xn)ᵀ and v = (y1, . . . , yn)ᵀ we get

\overline{v · u} = \overline{ȳ1x1 + ȳ2x2 + . . . + ȳnxn} = y1x̄1 + y2x̄2 + . . . + ynx̄n = x̄1y1 + x̄2y2 + . . . + x̄nyn = u · v.
3 / 33
Let us prove DP4 now. If u = (x1, . . . , xn)ᵀ, then

u · u = x̄1x1 + . . . + x̄nxn = |x1|² + . . . + |xn|² ≥ 0.

Also |x1|² + . . . + |xn|² = 0 if and only if |x1| = . . . = |xn| = 0, and this is equivalent to u = 0.

Definition 1
Two vectors u, v ∈ Cⁿ are orthogonal if u · v = 0.

For example,

(1, i)ᵀ · (i, 1)ᵀ = 1̄ · i + ī · 1 = i − i = 0,

so (1, i)ᵀ is orthogonal to (i, 1)ᵀ.

All properties of the dot product in Rⁿ can be proved also for Cⁿ, including Gram-Schmidt orthogonalisation, projections, etc.
4 / 33
Dot Product as Matrix Multiplication

For matrices in C^{m×n} the combination of conjugation and transposition will appear quite often, so we will, for any vector x ∈ Cⁿ, define x* := x̄ᵀ. Then in terms of matrix multiplication the dot product can be written as

x · y = x*y. (1)

This is consistent with
x · y = xᵀy
for two vectors x, y ∈ Rⁿ.
5 / 33
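In numpy the same convention is available out of the box: np.vdot conjugates its first argument, exactly matching x · y = x*y. A minimal sketch (not in the slides) reusing the orthogonal pair from the previous slide:

import numpy as np

x = np.array([1.0, 1j])
y = np.array([1j, 1.0])

# np.vdot conjugates its FIRST argument, matching x . y = conj(x)^T y above.
print(np.vdot(x, y))         # 0j: the vectors are orthogonal
print(np.conj(x).T @ y)      # same value computed explicitly as x* y
print(np.vdot(x, x).real)    # 2.0 = |x1|^2 + |x2|^2 >= 0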
Projections

Definition 2 (4.3.1)
Let u be a nonzero vector of Cⁿ and x be an arbitrary vector of Cⁿ. The vector proju(x) ∈ span{u} such that x − proju(x) ⊥ u is called the orthogonal projection of x onto u.

The projection onto a subspace is defined similarly.

Theorem 3 (4.3.1)
If W is a subspace of Cⁿ and {w1, . . . , wm} is an orthogonal basis for W, then the orthogonal projection projW(x) of an arbitrary vector x ∈ Cⁿ onto W exists and

projW(x) = ((w1 · x)/‖w1‖²) w1 + · · · + ((wm · x)/‖wm‖²) wm. (2)

Note: the order of vectors in the numerators is important.
6 / 33
Adjoint of a Matrix

Definition 4 (5.2.1)
Let A ∈ C^{m×n} be a complex matrix. Then the matrix A* = \overline{Aᵀ}, which is A transposed with each element replaced by its complex conjugate, is called the adjoint of A.

Example 5 (5.2.1)

Let A =
[ 1    2  1+i ]
[ 1−i  i  4   ]
 , then A* =
[ 1    1+i ]
[ 2    −i  ]
[ 1−i  4   ]

For any A ∈ R^{m×n} we have A* = Aᵀ; in particular, for the identity matrix In we have In* = In.

No wonder the properties of the adjoint are similar to the properties of the transpose.
7 / 33
Properties of the Adjoint of a Matrix
Proposition 1 (5.2.1)
Let A and B be two matrices over R or C. Then
A1. (A*)* = A;
A2. If the sum is defined, (A + B)* = A* + B*;
A3. (cA)* = c̄A*;
A4. If the product is defined, (AB)* = B*A*;
A5. If the inverse of A exists, then A* is also invertible, and (A⁻¹)* = (A*)⁻¹.
8 / 33
Proof

Proof.
Let us prove A4. We need the property (AB)ᵀ = BᵀAᵀ of the transpose and the properties of the conjugate. We have

(AB)* = \overline{(AB)}ᵀ = (Ā B̄)ᵀ = B̄ᵀ Āᵀ = B*A*.

For A5 we apply the adjoint operation to both sides of AA⁻¹ = In and use A4. We get

(A⁻¹)*A* = In* = In.

Thus, (A*)⁻¹ = (A⁻¹)* as required. We leave the rest as exercises.
9 / 33
Adjoint of a Linear Operator
In the operator world the adjoint of a matrix corresponds to the adjoint operator.
Definition 6 (5.2.1)
Let T : V → W be a linear transformation from an inner product space V to an inner product space W over a field F = R or C. The adjoint T* of T is a linear transformation T* : W → V such that

(T(v), w) = (v, T*(w)) for all v ∈ V and w ∈ W.

For arbitrary inner product spaces the existence of the adjoint is not guaranteed. But for finite-dimensional vector spaces its existence can be proved.
10 / 33
Existence and the Matrix of the Adjoint
Proposition 2 (5.2.2)
Let T : Fⁿ → Fᵐ be a linear transformation with the standard matrix A ∈ F^{m×n}, where F is either R or C. Then the adjoint T* of T exists and the standard matrix of the adjoint T* of T is the adjoint A* of A.

Proof.
The inner product in Cⁿ and Rⁿ is the dot product. Thus, using the expression x · y = x*y for the dot product, by property A4 we have

T(v) · w = Av · w = (Av)*w = (v*A*)w = v*(A*w) = v · A*w.

Hence T*(w) = A*w.
11 / 33
Example
Let V = CR[0, 1] be the inner product space of all continuous functions on [0, 1] with the integral inner product. Consider the linear operator

(Tf)(x) = ∫₀ˣ f(t) dt

on V. Then the adjoint of this operator will be

(T*f)(x) = ∫ₓ¹ f(t) dt.

Indeed, we have

(Tf, g) = ∫₀¹ (Tf)(t) g(t) dt = ∫₀¹ ( ∫₀ᵗ f(τ) dτ ) g(t) dt

= ∫₀¹ f(τ) ( ∫_τ¹ g(t) dt ) dτ = ∫₀¹ f(τ)(T*g)(τ) dτ = (f, T*g).

The change of order of integration is due to Fubini's theorem.

12 / 33
Properties of the Adjoint
Let T and S be two linear transformations from an inner product space V to an inner product space W. Then
A1. (T*)* = T;
A2. (T + S)* = T* + S*;
A3. (aT)* = āT*;
A4. If the product is defined, (TS)* = S*T*;
A5. If the inverse of T exists, then T* is also invertible, and (T⁻¹)* = (T*)⁻¹.
Proof.
Let us prove A4. For any v ∈ V and w ∈ W we have
(TS(v), w) = (S(v), T∗(w)) = (v, S∗T∗(w)),
whence (TS)∗ = S∗T∗.
13 / 33
Erik Ivar Fredholm
Erik Ivar Fredholm (1866 – 1927) Uppsala University
14 / 33
Fredholm’s Theorem
Theorem 7 (5.2.4)
Let T : V → W be a linear transformation of an inner product space V into an inner product space W and T* be the adjoint transformation of T. Then

(i) range(T)⊥ = ker(T*);
(ii) ker(T)⊥ = range(T*).
Proof.
The fact that w ∈ ker(T*) is equivalent to (v, T*(w)) = 0 for all v ∈ V. The latter is equivalent to (T(v), w) = 0 for all v ∈ V, which is equivalent to w ⊥ range(T).
The second equality can be obtained by substituting T∗ instead of T in (i) andusing the fact that range(T )⊥⊥ = range(T ).
15 / 33
Important Corollary
Corollary 8 (5.2.5)
The equation T(x) = f has a solution (is consistent) if and only if f is orthogonal to all solutions of the homogeneous equation T*(y) = 0.
Proof.
That the equation T(x) = f has a solution means that f ∈ range(T). By Fredholm's Theorem this is equivalent to f ∈ ker(T*)⊥, which means that f is orthogonal to all solutions of the homogeneous equation T*(y) = 0.
16 / 33
Example
Let W be the space of all twice differentiable functions f defined on the interval [0, π] such that f(0) = f(π) = 0. Consider the linear operator on W given by T(f) = f′′ + λf for some fixed real number λ.

It is easy to calculate T*, since it coincides with T. Let f, g ∈ W. Integrating by parts twice we have

∫₀^π f′′(x) g(x) dx = ∫₀^π f(x) g′′(x) dx,

and therefore

(T(f), g) = ∫₀^π (f′′(x) + λf(x)) g(x) dx = ∫₀^π f(x)(g′′(x) + λg(x)) dx = (f, T(g)),

and T* = T.

17 / 33
Example Continued
The general solution of the equation

f′′ + λf = 0

without the boundary conditions f(0) = f(π) = 0 is a two-dimensional subspace spanned by the functions sin(√λ x) and cos(√λ x). The restriction f(0) = 0 kills the cosine, and the restriction f(π) = 0 kills the sine as well, unless λ = n² for an integer n. In this latter case there will be a one-dimensional subspace of solutions with the function sin(nx) as a basis of it. Therefore for λ = n² the equation

f′′ + λf = h

has a solution if and only if h ⊥ sin(nx), or

∫₀^π h(x) sin(nx) dx = 0.
18 / 33
Recapping 250

Theorem 9 (Fundamental Theorem)
Let A be any m × n matrix. Then col(A)⊥ = null(Aᵀ).

Proof.
Let A = (a1 a2 . . . an), where ai is the i-th column of A, and let x ∈ Rᵐ. Then

Aᵀx =
[ a1ᵀ ]       [ a1ᵀx ]   [ a1 · x ]
[ a2ᵀ ]       [ a2ᵀx ]   [ a2 · x ]
[  ⋮  ]  x =  [  ⋮   ] = [    ⋮   ]
[ anᵀ ]       [ anᵀx ]   [ an · x ]

We see that x ∈ null(Aᵀ), i.e. Aᵀx = 0, if and only if x ⊥ ai for all i = 1, 2, . . . , n, and the latter means x ∈ col(A)⊥.
19 / 33
Solving Inconsistent Equations

Definition 10 (5.2.3)
Let V, W be inner product spaces. A linear operator T : V → W is said to be a finite-rank operator if its range range(T) is a finite-dimensional subspace of W.

Very often in practical applications we have to solve finite-rank operator equations of the type

T(x) = y, y ∈ W,

which do not have an exact solution.

By the Best-Approximation Theorem the projection onto the finite-dimensional subspace range(T) always exists. Then we look for a vector x0 ∈ V which minimises the norm ‖y − T(x)‖. It is often called a pseudo-solution.
20 / 33
The Normal Equation
The norm ‖y − T(x)‖ will be minimised by a vector x0 if and only if

T(x0) = proj_{range(T)}(y).

In such a case

y − T(x0) ⊥ range(T),

which by Fredholm's theorem is equivalent to

y − T(x0) ∈ ker(T*),

thus x0 satisfies the so-called normal equation

T*T(x) = T*y.
21 / 33
A technique to solve the normal equation
Let T : Rⁿ → W be a linear mapping and let us denote ui = T(ei), i = 1, . . . , n. Then (ei, T*y) = (T(ei), y) = (ui, y). So

T*y =
[ (u1, y) ]
[    ⋮    ]
[ (un, y) ]

If x = (x1, . . . , xn)ᵀ, then T(x) = Σ_{i=1}^n xiui due to the linearity of T, and

T*T(x) =
[ (u1, T(x)) ]   [ x1(u1, u1) + . . . + xn(u1, un) ]   [ (u1, u1) (u1, u2) · · · (u1, un) ] [ x1 ]
[     ⋮      ] = [                ⋮                ] = [     ⋮        ⋮      · · ·   ⋮    ] [  ⋮ ]
[ (un, T(x)) ]   [ x1(un, u1) + . . . + xn(un, un) ]   [ (un, u1) (un, u2) · · · (un, un) ] [ xn ]
22 / 33
So the normal equation T*T(x) = T*y reduces to the matrix equation

[ (u1, u1) (u1, u2) · · · (u1, un) ] [ x1 ]   [ (u1, y) ]
[     ⋮        ⋮      · · ·   ⋮    ] [  ⋮ ] = [    ⋮    ]
[ (un, u1) (un, u2) · · · (un, un) ] [ xn ]   [ (un, y) ]

from which x0 can be found.

The matrix of this system is called the Gram matrix, denoted G(u1, . . . , un).
23 / 33
Example
Let us find the best polynomial approximation of degree ≤ 2 for the function f(x) = x^{2/3} on [−1, 1].

Let V be the inner product vector space of continuous functions on [−1, 1] with the integral inner product. Consider the mapping T : R³ → V defined by T(a1, a2, a3)ᵀ = a1 + a2x + a3x². Then u1 = 1, u2 = x, u3 = x². Then

(u1, u1) = ∫₋₁¹ dx = 2,  (u2, u2) = ∫₋₁¹ x² dx = 2/3,  (u3, u3) = ∫₋₁¹ x⁴ dx = 2/5,

(u1, u2) = (u2, u3) = 0,  (u1, u3) = ∫₋₁¹ x² dx = 2/3.
24 / 33
Example Continued

And also

(u1, f) = 6/5,  (u2, f) = 0,  (u3, f) = 6/11.

Hence the normal equation for the unknown coefficients a0, a1, a2 will be

[ 2    0   2/3 ] [ a0 ]   [ 6/5  ]
[ 0   2/3  0   ] [ a1 ] = [ 0    ]
[ 2/3  0   2/5 ] [ a2 ]   [ 6/11 ]

We find (a0, a1, a2)ᵀ = (18/55, 0, 9/11)ᵀ, so the best approximation is

18/55 + (9/11)x²,

i.e., the same result as in Lecture 10.
25 / 33
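A quick numerical confirmation of this solution (illustrative, not from the slides): solve the Gram system with numpy and compare with the exact coefficients 18/55 and 9/11.

import numpy as np

# Gram matrix and right-hand side computed exactly above.
G = np.array([[2.0, 0.0, 2.0/3.0],
              [0.0, 2.0/3.0, 0.0],
              [2.0/3.0, 0.0, 2.0/5.0]])
rhs = np.array([6.0/5.0, 0.0, 6.0/11.0])

a = np.linalg.solve(G, rhs)
print(a)                        # [0.32727... 0. 0.81818...]
print(18.0/55.0, 9.0/11.0)      # 0.32727..., 0.81818...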
Example

Suppose that we want to evaluate the vector x = (x1, x2)ᵀ, with true value (1, 2)ᵀ. We measure the values

x1 + x2,  x1 − x2,  x1,  x2

in independent experiments, each with a random error of ±0.5. If our tools were exact we would obtain the measurements 3, −1, 1, 2, respectively, and x could be determined from the system

[ 1  1 ]          [ 3  ]
[ 1 −1 ] [ x1 ]   [ −1 ]
[ 1  0 ] [ x2 ] = [ 1  ]
[ 0  1 ]          [ 2  ]
26 / 33
But since the results may be distorted, we may obtain an inconsistent system like the following one:

[ 1  1 ]          [ 3.5  ]
[ 1 −1 ] [ x1 ]   [ −1.5 ]
[ 1  0 ] [ x2 ] = [ 1.5  ]
[ 0  1 ]          [ 1.5  ]

We will denote it Ax = b. Let us find its least squares solution. We have

AᵀA =
[ 3 0 ]
[ 0 3 ]

The normal equation will be

[ 3 0 ] [ x1 ]   [ 1  1 1 0 ]   [ 3.5  ]   [ 3.5 ]
[ 0 3 ] [ x2 ] = [ 1 −1 0 1 ] · [ −1.5 ] = [ 6.5 ]
                                [ 1.5  ]
                                [ 1.5  ]

Therefore x0 = (7/6, 13/6)ᵀ. This is rather close to the true value.

27 / 33
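In numpy this least squares solution can be obtained either by solving the normal equation directly or with the built-in solver; a minimal sketch of both routes (not part of the slides):

import numpy as np

A = np.array([[1.0, 1.0],
              [1.0, -1.0],
              [1.0, 0.0],
              [0.0, 1.0]])
b = np.array([3.5, -1.5, 1.5, 1.5])

# Solve the normal equation A^T A x = A^T b directly ...
x_normal = np.linalg.solve(A.T @ A, A.T @ b)
# ... or let lstsq minimise ||b - Ax||; both give (7/6, 13/6).
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)
print(x_normal, x_lstsq)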
Curve fitting
We wish to find a polynomial of degree not greater than n,

y = a0 + a1x + a2x² + . . . + anx^n

(usually n = 1 or n = 2), which approximates as closely as possible some experimental data, given as points

(x1, y1), (x2, y2), . . . , (xm, ym).

Our first inclination would be to try to solve the system

a0 + a1x1 + a2x1² + . . . + anx1^n = y1,
a0 + a1x2 + a2x2² + . . . + anx2^n = y2,
· · ·
a0 + a1xm + a2xm² + . . . + anxm^n = ym,

where the unknowns are the coefficients a0, a1, . . . , an. In other words, we have to solve the system of linear equations Ax = b.
28 / 33
Thus we have
Ax = b,

where x = (a0, a1, . . . , an)ᵀ, b = (y1, . . . , ym)ᵀ and

A =
[ 1  x1  x1² . . . x1^n ]
[ 1  x2  x2² . . . x2^n ]
[ . . . . . . . . . . . ]
[ 1  xm  xm² . . . xm^n ]

This system is often inconsistent, as we normally have many more measurements than unknowns. In this case we find the least squares solution x0 = (a0⁰, a1⁰, . . . , an⁰)ᵀ and consider the polynomial

y = a0⁰ + a1⁰x + a2⁰x² + . . . + an⁰x^n

to be the "best fit" for our experimental data.
29 / 33
Example
Suppose that we have measurements (−1, 2), (0, −1), (1, 2), (2, 4) and we suspect that this data can be best approximated by a quadratic y = a + bx + cx².

If the given points satisfied this equation, then we would obtain

a − b + c = 2 (substituting (−1, 2)),
a = −1 (substituting (0, −1)),
a + b + c = 2 (substituting (1, 2)),
a + 2b + 4c = 4 (substituting (2, 4)).

We then have to solve the system

[ 1 −1 1 ]         [ 2  ]
[ 1  0 0 ] [ a ]   [ −1 ]
[ 1  1 1 ] [ b ] = [ 2  ]
[ 1  2 4 ] [ c ]   [ 4  ]
30 / 33
This system is inconsistent. We form the normal equation to find the least squares solution; it will be

[ 4 2 6  ] [ a ]   [ 7  ]
[ 2 6 8  ] [ b ] = [ 8  ]
[ 6 8 18 ] [ c ]   [ 20 ]

whose solution is (a, b, c)ᵀ = (1/20, −7/20, 5/4)ᵀ. Hence the best fit quadratic will be

y = 1/20 − (7/20)x + (5/4)x².
31 / 33
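For comparison, a short numpy sketch (not part of the slides) that fits the same quadratic: lstsq on the Vandermonde-style matrix and np.polyfit both return the coefficients found above.

import numpy as np

xs = np.array([-1.0, 0.0, 1.0, 2.0])
ys = np.array([2.0, -1.0, 2.0, 4.0])

# Design matrix for y = a + b*x + c*x^2.
A = np.column_stack([np.ones_like(xs), xs, xs**2])
coeffs, *_ = np.linalg.lstsq(A, ys, rcond=None)
print(coeffs)                  # [0.05 -0.35 1.25] = (1/20, -7/20, 5/4)

# np.polyfit fits the same least-squares quadratic (highest degree first).
print(np.polyfit(xs, ys, 2))   # [1.25 -0.35 0.05]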
To Summarise

As a result of today's lecture (together with your further study) you should
• know the definition of a complex inner product;
• know what an adjoint operator is and what its properties are;
• know what the adjoint of a matrix is and its properties;
• know Fredholm's theorem and how to use it for approximate solutions, in particular the normal equation;
• be able to compute approximations to a function.
32 / 33
What to do now
Consolidation: Try exercises: Textbook: §7.5, p. 610 # 19–22.
Lecture Notes: Exercises of Sec. 4.3 and 5.2.

Next topic: Orthogonal matrices. Orthogonal diagonalisation of Hermitian operators.

Colour code: Red

Preparation: Go through the text:
Textbook: §5.1, pp. 384–387; §5.4, pp. 411–418.
Lecture Notes: Sec. 5.1, 5.3.
33 / 33
MATHS 253, Algebra. Lecture 10
21 March 2022

Today's topic: Inner products and inner product spaces.

Colour code: Red

Preparation: Go through the text:
Textbook: §7.1, pp. 554–563; §7.2, pp. 575–578.
Lecture Notes: Sec. 4.1.1–4.1.2.
1 / 24
Inner Product Spaces
Definition 1
A (real) inner product space consists of:

1. a vector space V over R;
2. a rule (or operation) called the inner product, which associates with each pair of vectors u, v in V a scalar (u, v) in R, called the inner product of u and v, in such a way that
IP1. (u, v + w) = (u, v) + (u, w), (u + v, w) = (u, w) + (v, w);
IP2. (u, v) = (v, u);
IP3. (au, v) = a(u, v), (u, av) = a(u, v) for all u, v in V and for all scalars a from R;
IP4. (u, u) ≥ 0 for each vector u in V, and (u, u) = 0 only for u = 0.

These four conditions are modelled straight from the dot product properties DP1 – DP4.
Note: the inner product is sometimes written 〈u, v〉.
2 / 24
Integral inner product

Example 2
The vector space V = CR[a, b] of continuous real valued functions with the inner product

(f, g) = ∫ₐᵇ f(x) g(x) dx

is an inner product space.
Properties IP1–IP3 follow from the properties of the integral.

For IP4, first note that for any f ∈ V, f²(x) = f(x)² ≥ 0.

Now suppose f ≠ 0; then there exists some c ∈ (a, b) such that f(c) ≠ 0. By the definition of continuity, for any ε > 0 there exists δ > 0 such that

|f(x) − f(c)| < ε whenever |x − c| < δ.
3 / 24
Taking ε = (1/2)|f(c)|, there is some δ such that

|f(x)| > (1/2)|f(c)| for all x ∈ (c−δ, c+δ).

Now we can write

(f, f) = ∫ₐᵇ f²(x) dx

= ∫ₐ^{c−δ} f²(x) dx + ∫_{c−δ}^{c+δ} f²(x) dx + ∫_{c+δ}^{b} f²(x) dx

≥ ∫ₐ^{c−δ} 0 dx + ∫_{c−δ}^{c+δ} ((1/2) f(c))² dx + ∫_{c+δ}^{b} 0 dx

= (1/2) δ f²(c) > 0

as required.

4 / 24
Cauchy-Schwarz inequality

As for Rⁿ, for an arbitrary vector v of an inner product space V we can now define the norm

‖v‖ = √(v, v).

One of the most important facts about inner products is contained in the following:

Theorem 3 (4.2.1)
For any two vectors u, v of an inner product space V,

|(u, v)| ≤ ‖u‖‖v‖,

and equality is achieved only when one vector is a scalar multiple of the other.
5 / 24
Proof

Suppose λ ∈ R. Then by property IP4 of inner products we have (λu + v, λu + v) ≥ 0, and this can be equal to zero if and only if v = −λu. Since

(λu + v, λu + v) = λ²(u, u) + 2λ(u, v) + (v, v) = λ²‖u‖² + 2λ(u, v) + ‖v‖²,

this can be considered as a quadratic in λ. Since this quadratic can never be negative, its discriminant

∆ = (2(u, v))² − 4‖u‖²‖v‖² ≤ 0.

This implies (u, v)² ≤ ‖u‖²‖v‖² and, taking square roots, we obtain the Cauchy-Schwarz inequality.
6 / 24
Properties of the norm
Theorem 4 (4.2.2)
For any two vectors u, v of an inner product space V:
N1. ‖u‖ ≥ 0, and ‖u‖ = 0 if and only if u = 0;
N2. ‖au‖ = |a|‖u‖ for any scalar a ∈ R;
N3. ‖u + v‖ ≤ ‖u‖ + ‖v‖.

Proof.
We will prove only N3. Indeed, by the Cauchy-Schwarz inequality we get (u, v) ≤ ‖u‖‖v‖ and hence

‖u + v‖² = (u + v, u + v) = (u, u) + (u, v) + (v, u) + (v, v)
= ‖u‖² + 2(u, v) + ‖v‖² ≤ ‖u‖² + 2‖u‖‖v‖ + ‖v‖² = (‖u‖ + ‖v‖)².

Taking square roots, we obtain N3.

7 / 24
The distance
We can also define the distance: dist(u, v) = ‖u − v‖. The main properties of the distance:

Theorem 5 (4.2.3)
For any three vectors u, v, w of an inner product space V:
D1. dist(u, v) ≥ 0, and dist(u, v) = 0 if and only if u = v;
D2. dist(u, v) = dist(v, u);
D3. dist(u, v) ≤ dist(u, w) + dist(w, v).

Proof.
We will prove only D3, which is known as the Triangle Inequality:

dist(u, v) = ‖u − v‖ ≤ ‖u − w‖ + ‖w − v‖ = dist(u, w) + dist(w, v).

This follows from N3, since u − v = (u − w) + (w − v).

8 / 24
Angles

We may rewrite the Cauchy-Schwarz inequality as

−1 ≤ (u, v)/(‖u‖‖v‖) ≤ 1.

This allows us to define the angle between u and v by the formula

φ = arccos( (u, v)/(‖u‖‖v‖) ).

This angle is always defined.
9 / 24
Example
In the inner product space CR[0, 1] find the distance and the angle between the functions x³ and x².

We calculate

dist(x³, x²) = ‖x³ − x²‖ = ( ∫₀¹ (x³ − x²)² dx )^{1/2} = ( ∫₀¹ (x⁶ − 2x⁵ + x⁴) dx )^{1/2} = 1/√105.

The angle can be found as follows:

φ = arccos( (x³, x²)/(‖x³‖‖x²‖) ) = arccos( (1/6)/((1/√7)(1/√5)) ) = arccos( √35/6 ) ≈ 10°.
10 / 24
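A rough numerical check of these values (illustrative only): approximate the integral inner product on CR[0, 1] by the trapezoidal rule on a fine grid. (np.trapz is numpy's classic trapezoidal integrator; newer numpy versions also expose it as np.trapezoid.)

import numpy as np

# Integral inner product on C[0,1], approximated on a fine grid.
t = np.linspace(0.0, 1.0, 100001)
inner = lambda f, g: np.trapz(f(t) * g(t), t)

f = lambda x: x**3
g = lambda x: x**2

dist = np.sqrt(inner(lambda x: f(x) - g(x), lambda x: f(x) - g(x)))
angle = np.degrees(np.arccos(inner(f, g) / np.sqrt(inner(f, f) * inner(g, g))))
print(dist, 1 / np.sqrt(105))   # both about 0.0976
print(angle)                    # about 9.6 degrees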
Orthogonality in Inner Product Spaces
We say that u is orthogonal to v and denote this u ⊥ v, in the case (u, v) = 0.
Exercise 1
Prove that if a vector u of an inner product space is orthogonal to both v and w, then u is orthogonal to v + w.
Suppose (u, v) = 0 and (u, w) = 0. Then
(u, v + w) = (due to IP1) = (u, v) + (u, w) = 0 + 0 = 0.
Hence u ⊥ (v + w).
11 / 24
Linear independence of orthogonal systems
Definition 6
A basis B = {v1, . . . , vn} of an inner product space V is said to be orthogonal if it is a pairwise orthogonal set. B is said to be orthonormal if it is orthogonal and each vector in it has norm 1, i.e. ‖vi‖ = 1 for all i = 1, 2, . . . , n.
The set of pairwise orthogonal vectors, as expected, has a lot of nice properties.
Theorem 7 (4.2.4)
Any set S = {v1, . . . , vn} of nonzero pairwise orthogonal vectors is linearly independent.
We do not have to prove it as the proof will be the same as for Rn.
12 / 24
Even and odd functions
Let us consider the inner product space V = CR[−a, a]. A function f(x) on [−a, a] is said to be
• even if f(−x) = f(x) for all x ∈ [−a, a],
• odd if f(−x) = −f(x) for all x ∈ [−a, a].

Two observations:
1. If f is even and g is odd, then their product fg is odd;
2. If h is odd, then ∫₋ₐᵃ h(x) dx = 0.

Theorem 8
In the inner product space V = CR[−a, a], if f is even and g is odd, then f ⊥ g.

Proof.
Indeed, fg is odd and (f, g) = ∫₋ₐᵃ f(x) g(x) dx = 0 by Observation 2.

If the interval is not symmetric, this property disappears!

13 / 24
Gram-Schmidt orthogonalisation

We define projections as usual: projw(v) = ((w, v)/‖w‖²) w.

Theorem 9
Let W = span{v1, . . . , vk} be a subspace of an inner product space V. We construct a new system of vectors {w1, . . . , wk} by defining

w1 := v1;
w2 := v2 − projw1(v2);
...
wk := vk − projw1(vk) − . . . − projwk−1(vk).

Then W = span{w1, . . . , wk} and the vectors w1, . . . , wk are pairwise orthogonal. If {v1, . . . , vk} is a basis, then {w1, . . . , wk} is also a basis.
14 / 24
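A minimal Python sketch of the process in the theorem (an illustration, not the lecture's code), written for vectors in Rⁿ with the dot product playing the role of the inner product:

import numpy as np

def gram_schmidt(vectors):
    """Orthogonalise a list of vectors with the classical Gram-Schmidt process."""
    ws = []
    for v in vectors:
        w = v.astype(float)
        for u in ws:
            w = w - (u @ v) / (u @ u) * u   # subtract proj_u(v)
        ws.append(w)
    return ws

vs = [np.array([1.0, 1.0, 0.0]),
      np.array([1.0, 0.0, 1.0]),
      np.array([0.0, 1.0, 1.0])]
ws = gram_schmidt(vs)
# All pairwise inner products vanish (up to rounding).
print([round(ws[i] @ ws[j], 12) for i in range(3) for j in range(i + 1, 3)])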
Orthogonal basis for polynomials

The ordinary monomial basis B = {1, x, x², . . . , x^n} of Rn[x] on [−1, 1] relative to the integral inner product is NOT even orthogonal. Indeed,

(1, x²) = ∫₋₁¹ x² dx = 2/3 ≠ 0.

Let us apply the Gram-Schmidt algorithm:

{1, x, x², . . . , x^n} −GSO→ {p0, p1, . . . , pn}.

The first three are:

p0(x) = 1,  p1(x) = x,  p2(x) = x² − 1/3.

Indeed, the first two are orthogonal since the constant function p0(x) = 1 is even and the function p1(x) = x is odd.
15 / 24
Orthogonal basis for polynomials (continued)

Since x² is even, we know that (x², x) = 0. Thus

p2(x) = x² − ((x², x)/(x, x)) x − ((x², 1)/(1, 1)) 1 = x² − 1/3.

As the projections of x³ onto p0 and p2 are zero,

p3(x) = x³ − ((x³, p1)/(p1, p1)) x = x³ − ((2/5)/(2/3)) x = x³ − (3/5)x.

These polynomials are not normalised; we have

(p0, p0) = 2,  (p1, p1) = 2/3,  (p2, p2) = 8/45,  (p3, p3) = 8/175

(check this!).
16 / 24
Legendre polynomials
But normalisation which makes all norms equal to 1 is not the only one used. Legendre used a different one.

He normalised p1, . . . , pn, . . . as follows: Pn(x) = (1/pn(1)) pn(x).

Then after normalisation we have Pn(1) = 1. For example,

P0(x) = 1,
P1(x) = x,
P2(x) = (1/2)(3x² − 1),
P3(x) = (1/2)(5x³ − 3x),
P4(x) = (1/8)(35x⁴ − 30x² + 3),
P5(x) = (1/8)(63x⁵ − 70x³ + 15x).

17 / 24
Legendre polynomials
We plot the first five Legendre polynomials below:
18 / 24
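numpy ships these classical polynomials in numpy.polynomial.legendre; a small sketch (not in the slides) that checks Legendre's normalisation Pn(1) = 1 and recovers the monomial coefficients of P2:

import numpy as np
from numpy.polynomial import legendre

# Evaluate P_0..P_5 at x = 1: Legendre's normalisation gives P_n(1) = 1.
for n in range(6):
    coeffs = [0] * n + [1]            # selects P_n in the Legendre basis
    print(n, legendre.legval(1.0, coeffs))

# P_2 in the monomial basis: (3x^2 - 1)/2 -> coefficients [-0.5, 0, 1.5]
print(legendre.leg2poly([0, 0, 1]))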
Adrien-Marie Legendre (1752 – 1833)
19 / 24
An orthogonal system of trigonometric functions
The set B = {1, cos πx, sin πx, . . . , cos nπx, sin nπx} is an orthogonal subset of the inner product space V = CR[−1, 1].

We know that cos a cos b = (1/2)[cos(a + b) + cos(a − b)], so

cos mπx cos nπx = (1/2)[cos(m + n)πx + cos(m − n)πx],

hence, if m ≠ n,

(cos mπx, cos nπx) = ∫₋₁¹ cos mπx cos nπx dx

= [ sin(m + n)πx / (2π(m + n)) ]₋₁¹ + [ sin(m − n)πx / (2π(m − n)) ]₋₁¹ = 0.

The orthogonality conditions for sines can be proved in the same way. Since sines are odd and cosines are even, we get their orthogonality for free.
20 / 24
Trigonometric polynomials
Let

B = {1, cos πx, sin πx, . . . , cos nπx, sin nπx}

from the previous example. We know this set is orthogonal, hence it is linearly independent.

Let us consider the subspace Tn[x] = Span{B} in the inner product space V = CR[−1, 1]. The elements of Tn[x] have the form

t(x) = a0 + Σ_{k=1}^n (ak cos kπx + bk sin kπx),  ak, bk ∈ R;

they are called trigonometric polynomials. The set B is an orthogonal basis of the subspace Tn[x] of trigonometric polynomials.

Note that dim Tn[x] = 2n + 1.
21 / 24
Orthonormal basis for trigonometric polynomials
Let us make this basis orthonormal. If m = n, then cos(m − n)πx = 1, and

(cos mπx, cos mπx) = ∫₋₁¹ (cos mπx)² dx = ∫₋₁¹ (1 + cos 2mπx)/2 dx

= [ sin 2mπx / (4πm) ]₋₁¹ + [ x/2 ]₋₁¹ = 0 + 1 = 1.

This shows that ‖cos mπx‖ = 1, and the same is true for the sines, i.e. ‖sin mπx‖ = 1. We also have ‖1‖ = √(1, 1) = √2.

Thus to normalise B we need only to normalise the first vector. The orthonormal basis of Tn[x] will be

{ 1/√2, cos πx, sin πx, . . . , cos nπx, sin nπx }.
22 / 24
To Summarise
As a result of today's lecture (together with your further study) you should
• know the properties of an inner product;
• know the definitions of the geometric notions that result from introducing an inner product;
• know the properties of orthogonal systems of vectors in an inner product space;
• know an orthonormal basis for trigonometric polynomials;
• know an orthogonal basis for polynomials.
23 / 24
What to do now
Consolidation: Textbook: Try exercises: §7.1, pp. 563–564: # 5, 7, 9, 13, 15, 31, 39, 41;
Lecture Notes: Exercises of Sec. 4.1.

Next topic: Projections as best approximations. Polynomial and Fourier approximations.

Colour code: Orange/Red

Preparation: Go through the text:
Textbook: §7.5; pp. 633–640.
Lecture Notes: Sec. 4.2.3–4.2.5.
24 / 24
MATHS 253, Algebra. Lecture 17
6 April 2022
Today’s topic: Positive definite quadratic forms
Colour code: Red
Preparation: Go through the text:
Textbook: §5.5; pp. 429–432.
Lecture Notes: Sec. 6.4–6.5.1.
1 / 24
Positive definite quadratic forms
Definition 1
A quadratic form Q : Rⁿ → R is called
• positive definite if Q(x) > 0 for all nonzero x ∈ Rⁿ,
• negative definite if Q(x) < 0 for all nonzero x ∈ Rⁿ,
• indefinite if the form takes both positive and negative values.
There may be quadratic forms that do not satisfy any of these. For example, it maybe that Q(x) ≥ 0 but Q(x) = 0 for all x in a subspace W ⊆ Rn. These aredegenerate cases which we will touch only briefly.
2 / 24
Positive definite quadratic form visualised
3 / 24
Indefinite quadratic form visualised
Q(x) = 2x² + xy − 3y².
4 / 24
Our goals
Our goal is to find criteria for deciding whether a given quadratic form is positivedefinite or negative definite or indefinite.
One such criterion follows from finding the eigenvalues of the matrix of the form.
5 / 24
The first criterion
A quadratic form Q : Rⁿ → R given by

Q(x) = Σ_{i=1}^n Σ_{j=1}^n aij xi xj = xᵀAx,

where [x]E = (x1, . . . , xn)ᵀ, is:

• positive definite iff all eigenvalues of A are positive,
• negative definite iff all eigenvalues of A are negative,
• indefinite iff A has both positive and negative eigenvalues.
6 / 24
Proof
In the basis F = {v1, . . . , vn} of principal axes,

Q(x) = λ1y1² + · · · + λnyn²,

where [x]F = (y1, . . . , yn)ᵀ. Suppose all the eigenvalues λ1, . . . , λn of A are positive. Then Q(x) > 0 for all nonzero x ∈ Rⁿ, since for every nonzero x at least one of the coordinates y1, . . . , yn is nonzero.

On the other hand, if Q(x) > 0 for all 0 ≠ x ∈ Rⁿ, let us calculate the value of Q on the i-th vector vi of the basis F. Since

[vi]F = ei = (0, . . . , 0, 1, 0, . . . , 0)ᵀ,

we have
Q(vi) = λ1 · 0 + . . . + λi · 1 + . . . + λn · 0 = λi.

Thus λi > 0. And this is true for every i = 1, 2, . . . , n. For negative definite forms the proof is similar.
7 / 24
The leading minors
Definition 2
Let

A =
[ a11 a12 . . . a1n ]
[ a21 a22 . . . a2n ]
[ . . . . . . . . . ]
[ an1 an2 . . . ann ]

be a square n × n matrix. The determinants

∆1 = a11,  ∆2 =
| a11 a12 |
| a21 a22 |
 ,  . . . ,  ∆n =
| a11 a12 . . . a1n |
| a21 a22 . . . a2n |
| . . . . . . . . . |
| an1 an2 . . . ann |

are called the leading minors of A.
8 / 24
Sylvester’s Criterion
Theorem 3
Suppose that a quadratic form Q : Rⁿ → R is given by Q(x) = xᵀAx, and let ∆1, . . . , ∆n be the leading minors of A. Then

1. The form is positive definite iff

∆1 > 0, ∆2 > 0, . . . , ∆n > 0. (1)

2. The form is negative definite iff

∆1 < 0, ∆2 > 0, ∆3 < 0, ∆4 > 0, . . . . (2)

3. The form is indefinite if all the leading minors are nonzero but the pattern is different from those in (1) and (2).
9 / 24
Examples
Are the following quadratic forms positive definite, negative definite or indefinite?

• Q1(x) = x² − 2xy + 2y² = [x y] [ 1 −1; −1 2 ] [x; y].

It is positive definite: ∆1 = 1 > 0 and ∆2 = 1 > 0.

• Q2(x) = −2x1² − 2x1x2 + 2x1x3 − 3x2² − x3² = [x1 x2 x3] [ −2 −1 1; −1 −3 0; 1 0 −1 ] [x1; x2; x3].

It is negative definite: ∆1 = −2 < 0, ∆2 = 5 > 0 and ∆3 = −2 < 0.
10 / 24
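A small helper (illustrative only) that computes the leading minors numerically, applied to the matrix of Q2 above; the alternating sign pattern −, +, − confirms negative definiteness.

import numpy as np

def leading_minors(A):
    """Return the leading minors Delta_1, ..., Delta_n of a square matrix A."""
    return [np.linalg.det(A[:k, :k]) for k in range(1, A.shape[0] + 1)]

A = np.array([[-2.0, -1.0, 1.0],
              [-1.0, -3.0, 0.0],
              [1.0, 0.0, -1.0]])

print(leading_minors(A))   # [-2.0, 5.0, -2.0]: signs -, +, - -> negative definite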
Proof of necessity
We will prove that the condition (1) is necessary for positive definiteness and leavethe proof of sufficiency for later.
Suppose Q is positive definite. First we will prove that ∆n = det(A) > 0. We know that PᵀAP = D with P orthogonal and all diagonal elements λ1, . . . , λn of D (the eigenvalues) positive. Then we obtain

det(PᵀAP) = det(D) = λ1λ2 . . . λn > 0.

On the other hand, since det(Pᵀ) = det(P), we get

det(PᵀAP) = det(Pᵀ) det(A) det(P) = det(P)² det(A).

Since det(P)² is positive, we obtain that det(A) is positive.
11 / 24
Proof of necessity continues
To prove ∆k > 0 we consider a subspace Rᵏ ⊂ Rⁿ, which we identify with the set of vectors x ∈ Rⁿ such that x = (x′, 0)ᵀ for x′ ∈ Rᵏ. On this subspace

Q(x) = Σ_{i=1}^n Σ_{j=1}^n aij xi xj = Σ_{i=1}^k Σ_{j=1}^k aij xi xj = x′ᵀA′x′ =: Q′(x′),

which defines a quadratic form Q′ : Rᵏ → R, where

A′ =
[ a11 a12 . . . a1k ]
[ a21 a22 . . . a2k ]
[ . . . . . . . . . ]
[ ak1 ak2 . . . akk ]

Since Q′(x′) = x′ᵀA′x′ is also positive definite, we must have det(A′) > 0, but this is exactly ∆k > 0.
12 / 24
Negative definite case
We give only the idea.

If Q is negative definite, all the λi are negative too.

Hence det(D) = λ1 . . . λn will be negative for odd n and positive for even n.

From here the result for negative definite forms follows.

It is an exercise to fill in all the details of this proof.
13 / 24
Example 1
Decide whether or not the quadratic form

Q(x) = 4x1² + 3x2² + 6x3² − 4x1x2 − 6x2x3

is positive definite, negative definite, indefinite or none of the above.

Solution: The matrix of Q will be:

A :=
[  4 −2  0 ]
[ −2  3 −3 ]
[  0 −3  6 ]

Since ∆1 = 4, ∆2 = 8, ∆3 = 12, the form is positive definite.

Of course, we could also use a big cannon, i.e., Maple or Matlab, to find the eigenvalues:

λ1 = 0.3254,  λ2 = 4.5246,  λ3 = 8.1499.

As all of them are positive, we can draw the same conclusion. But it is not necessary.
14 / 24
Exercise
Does there exist a real number a for which the conic section

2x² + 2axy − y² = 1

is an ellipse?

We will get an ellipse only when the form 2x² + 2axy − y² is positive definite. The matrix of the form is

A =
[ 2  a ]
[ a −1 ]

Since ∆1 = 2 > 0, it is positive definite iff

∆2 =
| 2  a |
| a −1 |
 = −2 − a² > 0.

No number a will make this value positive, hence the conic section is never an ellipse.
15 / 24
Exercise
For which values of the parameter λ is the surface

2x1² + x2² + 3x3² + 2λx1x2 + 4x1x3 = c > 0

an ellipsoid?

It is an ellipsoid iff the form 2x1² + x2² + 3x3² + 2λx1x2 + 4x1x3 is positive definite. Its matrix will be

A =
[ 2 λ 2 ]
[ λ 1 0 ]
[ 2 0 3 ]

We have ∆1 = 2 > 0, ∆2 = 2 − λ², ∆3 = 2 − 3λ².

All of them are positive iff λ² < 2/3, i.e. |λ| < √(2/3).
16 / 24
Example 2
Decide whether or not the quadratic form

Q(x) = 4x1² + 9x2² + 12x1x2

is positive definite, negative definite, indefinite or none of the above.

Solution: The matrix of Q will be:

A :=
[ 4 6 ]
[ 6 9 ]

Since ∆1 = 4 > 0, ∆2 = 0, this case is in the category "none of the above". We have

Q(x) = 4x1² + 9x2² + 12x1x2 = (2x1 + 3x2)² ≥ 0.

So is it positive definite? No.
17 / 24
Example 2. Continued

(Plot of the surface Q(x) = (2x1 + 3x2)² over the plane.)

There is a line 2x1 + 3x2 = 0 where Q takes the value zero. Such forms are called positive semi-definite.
18 / 24
Positive and negative semi-definite forms
Definition 4
A quadratic form Q : Rⁿ → R is called
• positive semi-definite if Q(x) ≥ 0 for all x ∈ Rⁿ,
• negative semi-definite if Q(x) ≤ 0 for all x ∈ Rⁿ.

The criterion: A quadratic form Q(x) = xᵀAx is:

• positive semi-definite iff all eigenvalues of A are nonnegative,
• negative semi-definite iff all eigenvalues of A are nonpositive.
19 / 24
Example 2. Continued
The characteristic polynomial of the matrix

A :=
[ 4 6 ]
[ 6 9 ]

is

pA(λ) = (4 − λ)(9 − λ) − 36 = λ² − 13λ,

and λ1 = 13 and λ2 = 0. Here is the culprit!
20 / 24
Positive definite Quadratic Forms. Further Properties
Theorem 5
Let Q be a positive definite or negative definite form. Then there exists a positive real number λ > 0 such that

|Q(x)| ≥ λ‖x‖².

Proof: Suppose Q is positive definite. The negative definite case can be obtained by replacing Q with −Q. Let B be the basis of principal axes of Q. Then

Q(x) = λ1y1² + . . . + λnyn²,

where [x]B = (y1, . . . , yn)ᵀ = y. We know that x = Py, where P = PE←B is orthogonal. Then

‖x‖² = xᵀx = yᵀPᵀPy = yᵀy = ‖y‖².
21 / 24
Proof continues
We know the coefficients λi are all positive. Let λ be the smallest one. Then

Q(x) = λ1y1² + . . . + λnyn² ≥ λy1² + . . . + λyn² = λ‖y‖² = λ‖x‖².
This proves the theorem.
22 / 24
To Summarise
As a result of today's lecture (together with your further study) you should
• know the definitions of positive definite, negative definite and indefinite forms;
• know Sylvester's criterion and be able to use it.
23 / 24
What to do now
Consolidation: Try exercises: Textbook: §5.5; pp. 441–442: # 41, 43, 45, 47, 81, 83, 85.
Lecture Notes: Exercises of Sec. 6.5.1
Next topic: Congruent diagonalisation. Sufficiency of Sylvester’s criterion.
Colour code: Red
Preparation: Go through the text:Lecture Notes: Sec. 6.5.2-6.5.4.
24 / 24
MATHS 253, Lecture 9
18 March 2022
Today's topic: Jordan Normal Form

Colour code: Red

Lecture Notes: Sec. 3.4–3.5
1 / 24
Introduction
We saw in an earlier lecture that not all operators are diagonalizable.
e.g. If we take T : R³ → R³ with standard matrix

A =
[ 1 1 1 ]
[ 0 2 1 ]
[ 0 0 2 ]

then T is not diagonalizable, because it has an eigenvalue 2 with algebraic multiplicity 2 and geometric multiplicity 1.
Jordan Normal Form is a generalization of diagonalization that works for anyoperator, when F is algebraically closed.
2 / 24
Root Vectors and Root Spaces
Definition 1
Let λ be an eigenvalue of a linear operator T : V → V. Any v ∈ V which satisfies

(T − λI)ᵏv = 0

for some positive integer k is called a root vector of T belonging to λ. The smallest such k is called the height of v.
Lemma 2
The set V(λ) of all root vectors of T belonging to λ forms an invariant subspace of V, called the root space of λ.
Proof.
If v ∈ V(λ), then (T − λI)ᵏv = 0 for some k. Then (T − λI)ᵏ(Tv) = T(T − λI)ᵏv = T0 = 0, so that Tv ∈ V(λ).
3 / 24
Root Space Decomposition
Theorem 3
Let F be an algebraically closed field, V a finite-dimensional F-vector space, and T : V → V a linear operator. Let λ1, λ2, . . . , λk be the eigenvalues of T. Then for any vector v ∈ V, there exist v1 ∈ V(λ1), . . . , vk ∈ V(λk) such that

v = v1 + v2 + . . . + vk.

Moreover, these vectors are uniquely determined by v.
Corollary 4
The dimension of V(λ) is equal to the algebraic multiplicity of λ.
4 / 24
Write pT(x) = ∏_{i=1}^k (x − λi)^{mi}. Define the polynomials

fj(x) = pT(x)/(x − λj)^{mj} = ∏_{i≠j} (x − λi)^{mi}.

The polynomials f1, f2, . . . , fk are coprime, so there exist polynomials g1, g2, . . . , gk such that

f1(x)g1(x) + f2(x)g2(x) + . . . + fk(x)gk(x) = 1.

Thus Σ_{i=1}^k fi(T)gi(T) = I. Then

v = Iv = Σ_{i=1}^k fi(T)gi(T)v.

Define vi = fi(T)gi(T)v, so that v = v1 + v2 + . . . + vk. As well,

(T − λiI)^{mi} vi = (T − λiI)^{mi} fi(T)gi(T)v = pT(T)gi(T)v = 0,

so that vi ∈ V(λi).
5 / 24
For uniqueness, it suffices to show that 0 has a unique representation.

We know that 0 = 0 + 0 + · · · + 0 works, so suppose there is another decomposition

0 = v1 + v2 + · · · + vk

with some vi ≠ 0. Rewrite this as

vi = −v1 − v2 − · · · − vi−1 − vi+1 − · · · − vk.

We saw on the last slide that (T − λjI)^{mj} vj = 0 for all j. Thus the equation above tells us that

fi(T)vi = −fi(T)(v1 + v2 + · · · + vi−1 + vi+1 + · · · + vk)
= −(fi(T)v1 + fi(T)v2 + · · · + fi(T)vi−1 + fi(T)vi+1 + · · · + fi(T)vk)
= 0.
6 / 24
Finally, (x − λi)^{mi} and fi are coprime, so we can write

p(x)(x − λi)^{mi} + q(x)fi(x) = 1.

But then p(T)(T − λiI)^{mi} + q(T)fi(T) = I.

This causes a problem:

vi = Ivi
= (p(T)(T − λiI)^{mi} + q(T)fi(T))vi
= p(T)(T − λiI)^{mi} vi + q(T)fi(T)vi
= 0.

So the representation is unique after all.
7 / 24
Jordan Blocks and Jordan Matrices
Definition 5
For λ ∈ F, the n × n Jordan block of λ is the matrix

Jn(λ) =
[ λ 1 0 · · · 0 ]
[ 0 λ 1 · · · 0 ]
[ . . . . . . . ]
[ 0 0 · · · λ 1 ]
[ 0 0 · · · 0 λ ]

Definition 6
A Jordan matrix is a matrix of the form

A =
[ Jn1(λ1)    0     · · ·    0     ]
[    0    Jn2(λ2)  · · ·    0     ]
[   ⋮        ⋮       ⋱      ⋮     ]
[    0       0     · · · Jnk(λk)  ]
8 / 24
Jordan Basis and Jordan Normal Form
Definition 7
Let T : V → V be a linear operator. A basis B such that [T]B is a Jordan matrix is called a Jordan basis for T.

Theorem 8
Let F be an algebraically closed field, V a finite-dimensional F-vector space, and T : V → V a linear operator. Then

1. There exists a Jordan basis for T.
2. If B1 and B2 are two Jordan bases for T, then [T]B1 and [T]B2 have the same Jordan blocks (though possibly in a different order).

Proof.
The proof is quite technical. See lecture notes Section 3.5.
9 / 24
Computing the Jordan Canonical Form
Suppose [T]B is a Jordan matrix with blocks Jn1(λ1), Jn2(λ2), . . . , Jnk(λk). Write

B = {v1,1, . . . , v1,n1, v2,1, . . . , v2,n2, . . . , vk,1, . . . , vk,nk}.

From the form of [T]B we have

T(vj,1) = λjvj,1              (T − λjI)(vj,1) = 0
T(vj,2) = λjvj,2 + vj,1        (T − λjI)(vj,2) = vj,1
...                            ...
T(vj,nj) = λjvj,nj + vj,nj−1   (T − λjI)(vj,nj) = vj,nj−1

This suggests a method to find a Jordan basis for T|V(λ):
1. Find a basis {u1, u2, . . . , uℓ} of null((A − λI)^m).
2. For each uj, construct (T − λI)uj, (T − λI)²uj, . . . until you either have enough vectors (you're done!), or you hit 0 too early (move on to the next basis vector).

10 / 24
An Example
Example 9
Find a Jordan basis for the operator T with standard matrix

A =
[ −1 1 1 ]
[ −3 2 2 ]
[ −1 1 1 ]

We start by finding the eigenvalues:

pA(λ) =
| −1−λ   1     1   |
|  −3   2−λ    2   |
|  −1    1    1−λ  |
 = −λ(1 − λ)²,

so λ1 = 0, λ2 = 1. We see that V(0) will have dimension 1, while V(1) will have dimension 2.
11 / 24
For λ1 = 0, we only need one root vector, which will be an eigenvector. We compute

A =
[ −1 1 1 ]
[ −3 2 2 ]
[ −1 1 1 ]
 −→
[ 1 0 0 ]
[ 0 1 1 ]
[ 0 0 0 ]

The eigenspace E(0) is spanned by v1,1 = (0, 1, −1)ᵀ.

Since λ1 = 0 has algebraic multiplicity 1, we've found the whole root space for λ1.
12 / 24
For λ2 = 1, we'll need two root vectors. We start by looking at (T − I)²:

(A − I)² =
[  0 0  0 ]
[  1 0 −1 ]
[ −1 0  1 ]
 −→
[ 1 0 −1 ]
[ 0 0  0 ]
[ 0 0  0 ]

The nullspace is spanned by u1 = (1, 0, 1)ᵀ and u2 = (0, 1, 0)ᵀ. To find the next root vector, we'll try using u1:

(A − I)u1 =
[ −2 1 1 ] [ 1 ]   [ −1 ]
[ −3 1 2 ] [ 0 ] = [ −1 ]  =: v2,1.
[ −1 1 0 ] [ 1 ]   [ −1 ]

Setting v2,2 = u1, we now have enough vectors for a Jordan basis.
13 / 24
The Jordan Basis

So we have a set of vectors B = {v1,1, v2,1, v2,2} which satisfies

T(v1,1) = 0v1,1,  T(v2,1) = 1v2,1,  T(v2,2) = 1v2,2 + v2,1.

Thus

[T]B =
[ 0 0 0 ]
[ 0 1 1 ]
[ 0 0 1 ]
 =
[ J1(0)   0   ]
[   0   J2(1) ]
 ←− Jordan matrix!
14 / 24
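For comparison, sympy can compute a Jordan form symbolically; a minimal sketch on this example (not part of the slides). The blocks agree with [T]B above, possibly up to the order of the blocks and a different choice of Jordan basis.

from sympy import Matrix

A = Matrix([[-1, 1, 1],
            [-3, 2, 2],
            [-1, 1, 1]])

# jordan_form returns P and J with A = P * J * P**(-1).
P, J = A.jordan_form()
print(J)   # Jordan blocks J1(0) and J2(1), as found by hand above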
Another Example
Example 10
Find a Jordan basis for the operator T with standard matrix

A =
[ 3 −1 1 ]
[ 0  2 1 ]
[ 0 −1 4 ]

Again we'll start by finding the eigenvalues:

pA(λ) =
| 3−λ  −1    1  |
|  0   2−λ   1  |
|  0   −1   4−λ |
 = (3 − λ)
| 2−λ   1  |
| −1   4−λ |
 = (3 − λ)³

So there's only one eigenvalue: λ = 3, with algebraic multiplicity 3. Thus our root vectors will have height at most 3.
15 / 24
We compute

A − 3I =
[ 0 −1 1 ]
[ 0 −1 1 ]
[ 0 −1 1 ]
 , (A − 3I)² =
[ 0 0 0 ]
[ 0 0 0 ]
[ 0 0 0 ]
 , (A − 3I)³ =
[ 0 0 0 ]
[ 0 0 0 ]
[ 0 0 0 ]

Now, we see that null((A − 3I)³) is spanned by

{u1 = (1, 0, 0)ᵀ, u2 = (0, 1, 0)ᵀ, u3 = (0, 0, 1)ᵀ}.

Let's try to make a Jordan basis from these.

16 / 24
Starting from u1 = (1, 0, 0)ᵀ, we have

(A − 3I)u1 = (0, 0, 0)ᵀ,

so we've run into 0. Thus we set v1,1 = u1; we still need two more root vectors.

Starting from u2 = (0, 1, 0)ᵀ, we have

(A − 3I)u2 = (−1, −1, −1)ᵀ.

Setting v2,2 = u2 and v2,1 = (A − 3I)v2,2, we now have enough root vectors to make a basis B = {v1,1, v2,1, v2,2}.
17 / 24
Now we have a basis B such that

T(v1,1) = 3v1,1,  T(v2,1) = 3v2,1,  T(v2,2) = 3v2,2 + v2,1.

Thus

[T]B =
[ 3 0 0 ]
[ 0 3 1 ]
[ 0 0 3 ]
 =
[ J1(3)   0   ]
[   0   J2(3) ]
 ←− Jordan matrix!
18 / 24
One More Example

Example 11
Find a Jordan basis for the operator T with standard matrix

A =
[ 2  1 0 ]
[ 1  2 1 ]
[ 0 −1 2 ]

We start by computing the eigenvalues:

pT(λ) =
| 2−λ   1    0  |
|  1   2−λ   1  |
|  0   −1   2−λ |
 = (2 − λ)³

So the only eigenvalue is λ = 2, with multiplicity 3.
19 / 24
We compute

A − 2I =
[ 0  1 0 ]
[ 1  0 1 ]
[ 0 −1 0 ]
 , (A − 2I)² =
[  1 0  1 ]
[  0 0  0 ]
[ −1 0 −1 ]
 , (A − 2I)³ =
[ 0 0 0 ]
[ 0 0 0 ]
[ 0 0 0 ]

Now, we see that null((A − 2I)³) is spanned by

{u1 = (1, 0, 0)ᵀ, u2 = (0, 1, 0)ᵀ, u3 = (0, 0, 1)ᵀ}.

Let's try to make a Jordan basis from these.
20 / 24
We start with v1,3 = u1 = (1, 0, 0)ᵀ. We compute

(A − 2I)v1,3 =
[ 0  1 0 ] [ 1 ]   [ 0 ]
[ 1  0 1 ] [ 0 ] = [ 1 ]  =: v1,2.
[ 0 −1 0 ] [ 0 ]   [ 0 ]

Continuing on:

(A − 2I)v1,2 =
[ 0  1 0 ] [ 0 ]   [  1 ]
[ 1  0 1 ] [ 1 ] = [  0 ]  =: v1,1.
[ 0 −1 0 ] [ 0 ]   [ −1 ]

We've found the whole Jordan basis.
21 / 24
Now we have a basis B such that

T(v1,1) = 2v1,1,  T(v1,2) = 2v1,2 + v1,1,  T(v1,3) = 2v1,3 + v1,2.

Thus

[T]B =
[ 2 1 0 ]
[ 0 2 1 ]
[ 0 0 2 ]
 = J3(2) ←− Jordan matrix!
22 / 24
To Summarise
As a result of today's lecture (together with your further study) you should
• be able to define root vector, root space, and height;
• be able to define Jordan block and Jordan matrix;
• state the fundamental theorem of Jordan Normal Form (existence and uniqueness of JNF);
• be able to find the Jordan basis and Jordan Normal Form.
23 / 24
What to do now
Consolidation: Lecture Notes: Exercises of Sec. 3.3;
Textbook: §4.3, Exercises 33–38.

Next topic: Inner products and inner product spaces

Colour code: Red

Preparation: Go through the text:
Lecture Notes: Sec. 4.1
Textbook: Sec. 7.1
24 / 24
MATHS 253, Algebra. Lecture 15
1 April 2022
Today’s topic: Quadratic Forms
Colour code: Red
Preparation: Go through the text:
Textbook: §5.5; pp. 425–432.
Lecture Notes: Sec. 6.1, 6.2.
1 / 23
Linear forms
Any function f : Rⁿ → R given by

f(x) = a1x1 + . . . + anxn

is known as a linear form of n variables.

The matrix of this linear transformation is

[T(e1) T(e2) . . . T(en)] = [a1 a2 . . . an].

Hence

f(x) = [a1 a2 . . . an]
[ x1 ]
[ x2 ]
[  ⋮ ]
[ xn ]
2 / 23
Quadratic forms
We will now have to go beyond linearity and to consider quadratic forms.
Definition 1 (6.1.1)
A function Q : Rⁿ → R given by the formula

Q(x) = Σ_{i=1}^n Σ_{j=1}^n aij xi xj,

where x = (x1, . . . , xn)ᵀ, is said to be a quadratic form on Rⁿ. Sometimes the expression on the right-hand side of this equation is also called a quadratic form. Here is an example:

Q(x) = 2x1² − 3x1x2 + 5x2x1 − x2².
3 / 23
Symmetric presentation. Examples

In the previous example the quadratic form can be written as

Q(x) = 2x1² + x1x2 + x2x1 − x2².

As xixj = xjxi, it is always possible to write a quadratic form in such a way that aij = aji. We will call it the symmetric presentation of this quadratic form.

Example 2 (6.1.2)
The following three functions are quadratic forms:

Q1(x) = x1² + · · · + xn²,
Q2(x) = x1x2 = (1/2)x1x2 + (1/2)x2x1,
Q3(x) = x1x2 + x1x3 + x3² = (1/2)(x1x2 + x2x1 + x1x3 + x3x1) + x3²

in compact representation and symmetric representation.

4 / 23
The matrix of a quadratic form
Theorem 3 (Proposition 6.1.3)
Let Q(x) be a quadratic form. Then

Q(x) = x · Ax = xᵀAx,

where A = (aij) is a symmetric matrix which is called the standard matrix of Q. This matrix is uniquely defined.

Proof: We get

Q(x) = Σ_{i=1}^n Σ_{j=1}^n aij xi xj = Σ_{i=1}^n xi ( Σ_{j=1}^n aij xj ) = x · Ax = xᵀAx.
5 / 23
Proof continues
We note that, if A is symmetric, then, for all i, j,

aij = (1/4)(Q(ei + ej) − Q(ei − ej)) = (1/4)(aii + aij + aji + ajj − (aii − aij − aji + ajj)),

where ei, ej are vectors of the standard basis. Hence A is uniquely defined.

Example 4 (6.1.5)
The matrices of the quadratic forms Q1(x) = x1² + · · · + xn², Q2(x) = x1x2, Q3(x) = x1x2 + x1x3 + x3² in Example 2 are:

In,
[  0  1/2 ]
[ 1/2  0  ]
 ,
[  0  1/2 1/2 ]
[ 1/2  0   0  ]
[ 1/2  0   1  ]

respectively.

6 / 23
Change of basis and change of the coordinates
The parabola y = x² is rotated anticlockwise about the origin through the angle of 30°. What is the equation of the new curve (relative to the standard basis)?

We have a new basis B = {w1, w2}:

w1 = (√3/2)e1 + (1/2)e2 (rotated e1)
w2 = −(1/2)e1 + (√3/2)e2 (rotated e2)

In the new coordinate system the equation of the parabola will still be the same, i.e. y′ = (x′)². We have

[x]B =
[ x′ ]
[ y′ ]
 = PB←E
[ x ]
[ y ]
 = PB←E x.
7 / 23
Example continued

Hence

w1 =
[ √3/2 ]
[ 1/2  ]
 , w2 =
[ −1/2 ]
[ √3/2 ]

and

PE←B = [[w1]E | [w2]E] =
[ √3/2 −1/2 ]
[ 1/2  √3/2 ]

Then

PB←E = (PE←B)⁻¹ =
[  √3/2 1/2  ]
[ −1/2  √3/2 ]

and this is the matrix we need. Hence

[ x′ ]   [  √3/2 1/2  ] [ x ]
[ y′ ] = [ −1/2  √3/2 ] [ y ]
8 / 23
Example continued

This is the same as

x′ = (√3/2)x + (1/2)y,
y′ = −(1/2)x + (√3/2)y,

and substituting into y′ = (x′)², we get

−(1/2)x + (√3/2)y = ((√3/2)x + (1/2)y)² = (3/4)x² + (√3/2)xy + (1/4)y²

or
y² + 2√3 xy + 3x² − 2√3 y + 2x = 0.

Is it really our nice parabola? Yes.
9 / 23
Example continued
Let us now have a look at which equation the circle will have in the new coordinate system. It will still be given by

(x′)² + (y′)² = 1.

Substituting

x′ = (√3/2)x + (1/2)y,
y′ = −(1/2)x + (√3/2)y

we get

((√3/2)x + (1/2)y)² + (−(1/2)x + (√3/2)y)² = 1

or
x² + y² = 1.

The circle is so symmetric that its shape does not change under the rotation.

10 / 23
Quadratic form. Change of basis
Sometimes we have to use bases different from the standard one. A change ofbasis implies the change of coordinates.
It is useful to know how the defining formula of Q(x) will change when we switch toanother basis and change coordinates.
Theorem 5 (Proposition 6.1.7)
Let A be the standard matrix of a quadratic form Q : Rⁿ → R and let F be a basis of Rⁿ. Then the matrix B of Q relative to F is
B = PT AP,
where P = PE←F . This matrix is sometimes denoted as [Q]F .
11 / 23
Proof

Since
x = [x]E = PE←F [x]F = P[x]F,

we have

Q(x) = xᵀAx = (P[x]F)ᵀ A P[x]F = [x]Fᵀ PᵀAP [x]F = [x]Fᵀ B [x]F.

This proves the theorem.

We note that a quadratic form Q does not have cross-terms in a basis F if and only if the matrix [Q]F is diagonal.
12 / 23
The Principal Axes Theorem
Theorem 6 (6.2.1)
Let A be the standard matrix of a quadratic form Q : Rⁿ → R. Then, relative to the basis F = {v1, . . . , vn} of eigenvectors of A, the form will be written as

Q(x) = λ1y1² + · · · + λnyn²,

where [x]F = (y1, . . . , yn)ᵀ and λ1, . . . , λn are the eigenvalues of A. The vectors v1, . . . , vn are called the principal axes of Q.
13 / 23
Proof
Suppose now that a quadratic form Q : Rⁿ → R is given by

Q(x) = Σ_{i=1}^n Σ_{j=1}^n aij xi xj = xᵀAx.

Let us recall that the symmetric matrix A can be orthogonally diagonalised and there exists an orthonormal basis F = {v1, . . . , vn} consisting of eigenvectors of A belonging to λ1, λ2, . . . , λn.

Let P = PE←F = [v1 v2 . . . vn] be the orthogonal change-of-basis matrix from F to the standard basis E. Then P⁻¹ = Pᵀ and

D = P⁻¹AP = PᵀAP,

where D = diag(λ1, . . . , λn) is the diagonal matrix.
14 / 23
Proof continues

We switch to the orthonormal basis F = {v1, . . . , vn} consisting of eigenvectors of A. We then have

Q(x) = [x]Eᵀ A [x]E = [x]Fᵀ (PᵀAP) [x]F = [x]Fᵀ D [x]F = λ1y1² + · · · + λnyn²,

where [x]F = (y1, . . . , yn)ᵀ, i.e., the matrix of Q relative to F will be the diagonal matrix

D =
[ λ1 0  . . . 0  ]
[ 0  λ2 . . . 0  ]
[ . . . . . . . .]
[ 0  0  . . . λn ]

and written in this basis the quadratic form will not have cross-terms.
15 / 23
Example (6.2.2)
Let Q : R³ → R be given by

Q(x) = −x1² − x2² − x3² + 6x1x2 + 6x1x3 + 6x2x3.

Find the principal axes and the formula for Q in the basis of principal axes.

Solution: The matrix of Q relative to the standard basis is

A =
[ −1  3  3 ]
[  3 −1  3 ]
[  3  3 −1 ]

(we have seen similar matrices).

The eigenvalues of A, as we know, are: λ1,2 = −4, λ3 = 5. The corresponding eigenvectors are:

v1 = (1/√2)(−1, 1, 0)ᵀ,  v2 = (1/√6)(−1, −1, 2)ᵀ,  v3 = (1/√3)(1, 1, 1)ᵀ.
16 / 23
Example continues
The matrix of Q relative to F = {v_1, v_2, v_3} will be

[ −4  0  0 ]
[  0 −4  0 ]
[  0  0  5 ].

Therefore in coordinates relative to this new basis F the quadratic form can be written as

Q(x) = −4y_1^2 − 4y_2^2 + 5y_3^2,

where [x]_F = (y_1, y_2, y_3)^T.
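A quick numerical check of this example (my addition, assuming NumPy): np.linalg.eigh orthogonally diagonalises the symmetric matrix A and reproduces the diagonal form with no cross-terms.

import numpy as np

A = np.array([[-1, 3, 3],
              [ 3,-1, 3],
              [ 3, 3,-1]], dtype=float)
lam, P = np.linalg.eigh(A)        # orthonormal eigenvectors in the columns of P
print(lam)                        # [-4. -4.  5.]
print(np.round(P.T @ A @ P, 10))  # diag(-4, -4, 5): no cross-terms in this basis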
17 / 23
Quadratic optimization
We can now solve the problem of finding the minima and maxima of a given quadratic form on the unit sphere S^{n−1} given by |x| = 1. This application is possible due to the following lemma.

Lemma 7 (6.3.2)
In any orthonormal basis of R^n, the sphere S^{n−1} is given by the same equation

x_1^2 + · · · + x_n^2 = 1.
18 / 23
Proof
Let F be an arbitrary orthonormal basis of R^n and [x]_F = (y_1, . . . , y_n)^T. Then

[x]_F = P_{F←E}[x]_E = Px,

where P is an orthogonal matrix. We have

y_1^2 + · · · + y_n^2 = ([x]_F)^T [x]_F = (Px)^T Px = x^T P^T P x = x^T x = x_1^2 + · · · + x_n^2 = 1,

and the equation for the unit sphere in the new coordinate system will be the same.
19 / 23
Maxima and minima
Theorem 8 (6.3.3)
Let

Q(x) = ∑_{i=1}^{n} ∑_{j=1}^{n} a_{ij} x_i x_j = x^T A x

be a quadratic form on R^n with the standard matrix A. Let

λ_1 ≥ · · · ≥ λ_n

be the eigenvalues of A with the corresponding eigenvectors v_1, . . . , v_n. Then Q(v_1) = λ_1 is the maximal value of Q on S^{n−1} and Q(v_n) = λ_n is the minimal value of Q on S^{n−1}.
20 / 23
Proof
Let us introduce a new coordinate system by changing the standard basis to the basis F = {v_1, . . . , v_n} consisting of the eigenvectors. Then the sphere S^{n−1} is characterised by the equation y_1^2 + · · · + y_n^2 = 1, and for an arbitrary v ∈ S^{n−1}

Q(v) = λ_1 y_1^2 + · · · + λ_n y_n^2 ≤ λ_1(y_1^2 + · · · + y_n^2) = λ_1,

with equality at Q(v_1) = λ_1 as [v_1]_F = (1, 0, . . . , 0)^T. Also

Q(v) = λ_1 y_1^2 + · · · + λ_n y_n^2 ≥ λ_n(y_1^2 + · · · + y_n^2) = λ_n,

with equality at Q(v_n) = λ_n as [v_n]_F = (0, . . . , 0, 1)^T.
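One can illustrate Theorem 8 numerically by sampling random points of the unit sphere (an aside of mine, not from the slides; it assumes NumPy, and the symmetric matrix is randomly generated for illustration):

import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4)); A = (M + M.T) / 2
lam, V = np.linalg.eigh(A)                 # eigenvalues in increasing order
v = rng.standard_normal((4, 100000))
v /= np.linalg.norm(v, axis=0)             # random points on the sphere S^3
values = np.einsum('ij,ij->j', v, A @ v)   # Q(v) = v^T A v for each sample
print(lam[-1], values.max())               # sampled maximum approaches lambda_1
print(lam[0],  values.min())               # sampled minimum approaches lambda_n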
21 / 23
To Summarise
As a result of today’s lecture (together with your further study) you should
• know the definition of a quadratic form;
• be able to find a symmetric matrix of a quadratic form in any basis;
• be able to find the basis of principal axes in which the form does not have cross-products;
• be able to find the minimum and the maximum of a quadratic form on a sphere.
22 / 23
What to do now
Consolidation: Textbook: Try exercises: §5.5, p. 441: # 23, 25, 31, 33, 35, 37, 39.
Next topic: Conics and Quadrics.
Colour code: Red
Preparation: Go through the text: Textbook: §5.5 pp. 432–440. Lecture Notes: Sec. 6.3.
23 / 23
9630257bfdd169b72d1dd82fcf982eb0.pdf
MATHS 253, Lecture 3
4 March, 2022
Today’s topic: Basis, Dimension, The Coordinate Mapping
Colour code: Orange/Red
Lecture Notes: Sec. 1.4–1.5; pp. 11–16;
Textbook: §6.2, pp. 461–474.
1 / 22
Linear transformations defined
Definition 1
Let V and W be two vector spaces over a field F. A mapping T : V → W is a linear transformation if the following two conditions hold:

LT1. (additivity) T(u + v) = T(u) + T(v), for all u, v ∈ V;
LT2. (homogeneity) T(λu) = λT(u), for all u ∈ V, λ ∈ F.

The vector space V is called the domain of T and W is called the codomain of T.

When V = W, i.e., the domain and the codomain coincide, a linear transformation T : V → V is also called a linear operator on V.
2 / 22
A few consequences from the definition
Proposition 1
Let T : V → W be a linear transformation. Then
(i) T(0) = 0,
(ii) T(−u) = −T(u),
(iii) T(u − v) = T(u) − T(v).

Proof.
By homogeneity, T(0) = T(0 · 0) = 0 · T(0) = 0, hence (i) holds. By homogeneity again,

T(−u) = T((−1) · u) = (−1) · T(u) = −T(u).

By additivity and (ii),

T(u − v) = T(u) + T(−v) = T(u) − T(v).

3 / 22
One more consequence
Proposition 2
Let T : V → W be a linear transformation. Then the following condition, called linearity,

T(a_1u_1 + . . . + a_nu_n) = a_1T(u_1) + a_2T(u_2) + . . . + a_nT(u_n),

is satisfied for all u_i ∈ V, a_i ∈ F.
4 / 22
Examples I
Example 2
We know that matrix-vector multiplication satisfies the following properties: for any matrix A ∈ F^{m×n}, vectors x, y ∈ F^n and λ ∈ F,

A(x + y) = Ax + Ay,   A(λx) = λAx.

This shows that conditions LT1 and LT2 are satisfied, which makes the mapping T(x) = Ax a linear transformation from F^n to F^m. For instance,

T((x_1, x_2, x_3)^T) = [ 1 2 3; 4 5 6 ](x_1, x_2, x_3)^T = (x_1 + 2x_2 + 3x_3, 4x_1 + 5x_2 + 6x_3)^T

is a linear transformation from R^3 into R^2.
5 / 22
Examples II
Example 3 (The projection onto a subspace)
Let W be a subspace of R^n with a basis {w_1, . . . , w_k} and let A = (w_1 w_2 . . . w_k) be the matrix whose ith column is w_i. Then for any b ∈ R^n the projection of b onto the subspace W is

proj_W(b) = Pb,  where P = A(A^T A)^{−1}A^T.
This, in particular, shows that the projection onto a subspace is a linear operator.
Example 4
One important linear operator on any vector space V is the identity operator I : V → V, defined by I(v) = v for all v ∈ V.
6 / 22
Examples III
Example 5 (The coordinate mapping)
Let V be an n-dimensional vector space and B be any basis for V. Then the coordinate mapping [ · ]_B : V → R^n, which assigns to every vector v ∈ V its coordinate column [v]_B ∈ R^n, is a linear transformation.

Proof.
Indeed we proved

[u + v]_B = [u]_B + [v]_B,   [au]_B = a[u]_B,

which is exactly LT1 and LT2.
7 / 22
Exercise

Exercise 1
Find the matrix A such that T(x) = Ax for

T((x_1, x_2, x_3)^T) = (x_1 + 2x_2 + 3x_3, 4x_1 + 5x_2 + 6x_3, x_2)^T.

We have

T((x_1, x_2, x_3)^T) = x_1(1, 4, 0)^T + x_2(2, 5, 1)^T + x_3(3, 6, 0)^T =
[ 1 2 3 ]
[ 4 5 6 ] (x_1, x_2, x_3)^T.
[ 0 1 0 ]

The trick that was used: if A = [ a_1 a_2 . . . a_n ] is an m × n matrix and x = (x_1, . . . , x_n)^T ∈ F^n, then

Ax = x_1a_1 + . . . + x_na_n.
8 / 22
The standard matrix

Theorem 6
Every linear transformation T : F^n → F^m is of the form T(x) = Ax, where A is an m × n matrix, which can be found as

A = [T(e_1) | T(e_2) | · · · | T(e_n)],

where {e_1, e_2, . . . , e_n} is the standard basis of F^n. The matrix A is called the standard matrix of T.

Proof.
Let x = (x_1, x_2, . . . , x_n)^T ∈ F^n be arbitrary. We write it down as x = x_1e_1 + . . . + x_ne_n and apply T. We get

T(x) = T(x_1e_1 + . . . + x_ne_n) = x_1T(e_1) + x_2T(e_2) + . . . + x_nT(e_n) = Ax

as required (note that the same trick was used).
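As an illustration (mine, not from the slides; it assumes NumPy), the standard matrix of the transformation from Exercise 1 can be assembled column by column from T(e_i):

import numpy as np

def T(x):
    x1, x2, x3 = x
    return np.array([x1 + 2*x2 + 3*x3, 4*x1 + 5*x2 + 6*x3, x2])

E = np.eye(3)
A = np.column_stack([T(e) for e in E])   # A = [T(e1) | T(e2) | T(e3)]
print(A)                                 # [[1 2 3], [4 5 6], [0 1 0]]
x = np.array([1.0, -2.0, 5.0])
print(np.allclose(T(x), A @ x))          # True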
9 / 22
Rotation

Anticlockwise rotation R_θ of the plane R^2 about the origin through the angle θ is a linear transformation. Finding

R_θ(e_1) = cos θ · e_1 + sin θ · e_2 = (cos θ, sin θ)^T,
R_θ(e_2) = −sin θ · e_1 + cos θ · e_2 = (−sin θ, cos θ)^T,

we find that the matrix of R_θ will be

[ cos θ  −sin θ ]
[ sin θ   cos θ ].

E.g., [ 0 −1; 1 0 ] is the matrix of anticlockwise 90° rotation.
10 / 22
Projection

The projection P_θ of the plane R^2 onto the line ℓ through the origin at angle θ with the x-axis is a linear transformation.

P_θ (x, y)^T = [ P_θ(e_1)  P_θ(e_2) ](x, y)^T = [ cos^2 θ  sin θ cos θ; sin θ cos θ  sin^2 θ ](x, y)^T.

The matrix of P_θ will be

[ cos^2 θ      sin θ cos θ ]
[ sin θ cos θ  sin^2 θ     ].

In particular, [ 1 0; 0 0 ] and [ 0 0; 0 1 ] are the projection matrices onto the x-axis and y-axis, respectively.

11 / 22
Reflection
Example 7
The reflection Q_θ of the plane R^2 in the line drawn at angle θ with the x-axis is a linear transformation. In coordinate form it is given by the formula:

Q_θ (x, y)^T = [ cos 2θ  sin 2θ; sin 2θ  −cos 2θ ](x, y)^T.

Prove this as an exercise.

In particular, [ 1 0; 0 −1 ] and [ −1 0; 0 1 ] are the matrices of reflections in the x-axis and y-axis, respectively.
12 / 22
Recognition of Patterns

• Rotations. The matrix is [ cos θ  −sin θ; sin θ  cos θ ]. Examples:

(1/2)[ √2  −√2; √2  √2 ],   (1/2)[ 1  −√3; √3  1 ],   (1/√(a^2 + b^2))[ a  −b; b  a ].

• Projections. The matrix is [ cos^2 θ  sin θ cos θ; sin θ cos θ  sin^2 θ ]. Examples:

(1/2)[ 1  1; 1  1 ],   (1/4)[ 1  √3; √3  3 ],   (1/(a^2 + b^2))[ a^2  ab; ab  b^2 ].
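A small sketch of how one might test these patterns numerically (my addition, assuming NumPy): a rotation matrix is orthogonal with determinant 1, while a projection matrix is symmetric and idempotent.

import numpy as np

def is_rotation(M):
    # orthogonal with determinant 1
    return np.allclose(M.T @ M, np.eye(2)) and np.isclose(np.linalg.det(M), 1)

def is_projection(M):
    # symmetric and idempotent
    return np.allclose(M, M.T) and np.allclose(M @ M, M)

print(is_rotation(np.array([[1, -np.sqrt(3)], [np.sqrt(3), 1]]) / 2))   # True
print(is_projection(np.array([[1, np.sqrt(3)], [np.sqrt(3), 3]]) / 4))  # True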
13 / 22
The matrix of a rotation in space

Below the red vectors are images of e_1 and e_2; the image of e_3 is itself (c_γ = cos γ, s_γ = sin γ).

The matrix of the rotation about the z-axis shown above is

[ cos γ  −sin γ  0 ]
[ sin γ   cos γ  0 ]
[ 0       0      1 ].
14 / 22
Examples IV
Example 8
Let D_R[a, b] be the space of all differentiable real-valued functions on [a, b]. Define the operator of differentiation D by

(Df)(x) = f′(x).

This is a linear operator.

The differentiation operator can be considered on different domains, say R[x] or R_n[x].
15 / 22
Examples V
Example 9
Let R_n[x] be the space of all polynomials with coefficients in R of degree ≤ n. Define T by

(Tf)(x) = ∫_0^x f(t) dt.

This will not be a linear operator on R_n[x] because its values are in R_{n+1}[x]. It is, however, a linear transformation from R_n[x] to R_{n+1}[x].
16 / 22
Linear Transformations in Arbitrary Bases
Theorem 10
Let T : V → W be a linear transformation. Let B = {v_1, . . . , v_n} be a basis for V and C = {w_1, . . . , w_m} be a basis for W. Then

[T(x)]_C = A[x]_B,

where A is an m × n matrix, which can be found as

A = ([T(v_1)]_C | [T(v_2)]_C | · · · | [T(v_n)]_C).

The matrix A is called the matrix of T relative to the bases B and C. We will also denote A as [T]_BC so that the formula can be written as

[T(x)]_C = [T]_BC [x]_B.
17 / 22
Proof.
Let us express an arbitrary vector x as a linear combination of vectors of B = {v_1, . . . , v_n}:

x = x_1v_1 + . . . + x_nv_n,   (1)

which means that [x]_B = [x_1, . . . , x_n]^T. Let us apply T to both sides of (1). Then

T(x) = x_1T(v_1) + x_2T(v_2) + . . . + x_nT(v_n).   (2)

Now we apply the coordinate mapping w ↦ [w]_C to both sides of (2). We know this mapping is linear, hence

[T(x)]_C = x_1[T(v_1)]_C + x_2[T(v_2)]_C + . . . + x_n[T(v_n)]_C = [T]_BC [x]_B.
The same trick works again!
18 / 22
Example 11
Let us consider the integration operator f ↦ ∫_0^x f(t) dt from R_3[x] to R_4[x]. Choose the bases B = {1, x, x^2, x^3} and C = {1, x, x^2, x^3, x^4} in R_3[x] and R_4[x], respectively. Then

[T]_BC =
[ 0  0    0    0   ]
[ 1  0    0    0   ]
[ 0  1/2  0    0   ]
[ 0  0    1/3  0   ]
[ 0  0    0    1/4 ].

Let f(x) = 1 − 6x^2 + 4x^3. Then

[(Tf)(x)]_C = [T]_BC [f(x)]_B = [T]_BC (1, 0, −6, 4)^T = (0, 1, 0, −2, 1)^T,

hence ∫_0^x f(t) dt = x − 2x^3 + x^4.
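A quick check of this example (my addition, assuming NumPy): build [T]_BC for the monomial bases, where x^k integrates to x^{k+1}/(k+1), and apply it to the coordinate column of f:

import numpy as np

T = np.zeros((5, 4))
for k in range(4):
    T[k + 1, k] = 1 / (k + 1)     # column k: image of x^k is x^(k+1)/(k+1)

f = np.array([1, 0, -6, 4])       # f(x) = 1 - 6x^2 + 4x^3 in basis B
print(T @ f)                      # [0. 1. 0. -2. 1.]: that is x - 2x^3 + x^4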
19 / 22
Exercise

Let D : R_2[x] → R_2[x] be the operator of differentiation. Let us choose the basis B = {1, x, x^2/2}. What is [D]_BB?

Answer:

[D]_BB =
[ 0 1 0 ]
[ 0 0 1 ]
[ 0 0 0 ].

Indeed,

[D1]_B = [0]_B = (0, 0, 0)^T,   [Dx]_B = [1]_B = (1, 0, 0)^T,   [D(x^2/2)]_B = [x]_B = (0, 1, 0)^T.
20 / 22
To Summarise
As a result of today’s lecture (together with your further study) you should
• know the definition of a linear transformation and its consequences;
• be able to construct a matrix of a linear transformation in two given bases;
• understand the geometry of two-dimensional linear operators, in particular be able to recognise rotations, projections and reflections.
21 / 22
What to do now
Consolidation: Try exercises: Textbook: p. 498 # 1,9,15; p. 230 # 5,7,21,23; p. 530 # 3,5,15. Lecture Notes: Exercises of Sec. 2.1–2.3.
Next topic: Change of basis matrices and their properties. Algebra of linear operators.
Colour code: Orange/Red
Preparation: Go through the text: Textbook: §6.3: pp. 481–488; §6.6: pp. 516–526. Lecture Notes: Sec. 2.4–2.5.
22 / 22
A1 (2).pdf
Department of Mathematics
MATHS 253 Assignment 1 due 21 March 2022
This assignment will be marked out of a total of 100. Please submit your answers via Canvas, in the usual way, by 11:59 pm on the due date.
1. [10 points] Let V be the vector space of symmetric 2 × 2 real matrices. One basis for V is B = {M_1, M_2, M_3}, given by

M_1 = [ 1 0; 0 0 ],   M_2 = [ 0 1; 1 0 ],   M_3 = [ 0 0; 0 1 ].

Let A_1, A_2, A_3 ∈ V be given by

A_1 = [ 3 4; 4 −1 ],   A_2 = [ −2 −1; −1 2 ],   A_3 = [ 5 10; 10 1 ].

(a) [4 points] Write down the coordinate columns [A_1]_B, [A_2]_B, [A_3]_B of A_1, A_2, A_3 with respect to B.

(b) [6 points] Use the Linear Dependency Relationship Algorithm to find a maximal linearly independent subsystem of {A_1, A_2, A_3}, and express all other matrices as a linear combination of the matrices of the subsystem.
2. [15 points] Let

A = [ 4 1 −1; −2 1 1; 3 1 0 ],   B = [ 5 0 −3; −3 2 3; 6 0 −4 ].
(a) [4 points] Find the characteristic polynomials of A and B.
(b) [4 points] Find the minimal polynomials of A and B.
(c) [7 points] Find the Jordan normal forms of A and B, and the corresponding Jordan bases.
3. [20 points] Let V = R_2[x] be the real vector space of polynomials of degree at most 2 with real coefficients. The standard basis for this space is B = {f_0, f_1, f_2}, where f_0(x) = 1, f_1(x) = x, and f_2(x) = x^2.

(a) [8 points] Define the linear operator T : V → V by

T(f)(x) = f′(x) + (6/x) ∫_0^x f(t) dt.

Prove that C = {T(f_0), T(f_1), T(f_2)} is a basis for V.

(b) [4 points] Find the change-of-basis matrices P_{B←C} and P_{C←B}.

(c) [4 points] Find [1 + x + x^2]_C.

(d) [4 points] Let S be the linear operator on V whose representation in the basis B is

[S]_B = [ 1 2 0; 0 1 4; 0 0 1 ].

Find [S]_C.
4. [20 points] Let V = R_2[x] be the space of polynomials of degree at most 2 with real coefficients. Define the linear transformation T : V → R by

T(f) = ∫_{−1}^{1} x^2 f(x) dx.

Find ker T and range T, and show that T satisfies the rank-nullity theorem.
5. [15 points] Hamilton’s Quaternions H are a four-dimensional R-vector space with basis B = {1, i, j, k}; in particular, any element v ∈ H can be written as

v = a + bi + cj + dk

for some a, b, c, d ∈ R. There is a notion of multiplication for Hamilton’s quaternions: it is defined by

i^2 = j^2 = k^2 = −1,
ij = −ji = k,
jk = −kj = i,
ki = −ik = j.

For any u ∈ H, we define the linear map Λ_u : H → H (called the left regular representation) by

Λ_u(v) = uv.

In the following parts, let u = 1 − 2i − j + 3k.

(a) [7 points] Find [Λ_u]_B.

(b) [8 points] Show that [Λ_u]_B [Λ_u]_B^T = 15I, where I is the 4 × 4 identity matrix. Using this fact, find a quaternion w such that uw = 1 + i + j + k.
6. [20 points] Find a closed formula for a_n, which is defined by the following recurrence relation:

a_n = 3a_{n−1} + 13a_{n−2} − 15a_{n−3} for n ≥ 3,   a_0 = 2,  a_1 = 10,  a_2 = 18.

7. [Bonus question for up to 5 points of extra credit, if required] Let A be an invertible n × n matrix with real entries, such that each row of A sums to the same value r. Prove that r ≠ 0, and determine the sum of all entries of A^{−1}.
A2.pdf
Department of Mathematics
MATHS 253 Assignment 2 due 11 April 2022
This assignment will be marked out of a total of 100. Please submit your answers via Canvas, in the usual way, by 11:59 pm on the due date.
1. [15 points] Let f : [−1, 1] → R be defined by

f(x) = −1 if −1 ≤ x < 0,   0 if x = 0,   1 if 0 < x ≤ 1.

(a) [3 points] Is f odd, even, neither, or both?
(b) [12 points] Find the 6th degree Fourier series approximation to f.
2. [25 points] A real orthogonal matrix A is called special orthogonal if det A = 1.
(a) [5 points] Let A and B be orthogonal matrices. Show that AB is special orthogonal if and only if both A and B are special orthogonal, or neither A nor B is special orthogonal.

(b) [20 points] Let A ∈ R^{2×2} be special orthogonal. Show that there exists ϑ ∈ [0, 2π) such that

A = [ cos ϑ  −sin ϑ; sin ϑ  cos ϑ ].
3. [20 points] Let A = (1/3)[ 1 0 4; 0 5 −4; 4 −4 3 ]. Compute the spectral decomposition of A and use it to compute sin((π/6)A).
4. [20 points] Let V = R_2[x] be the real vector space of polynomials of degree at most 2. Let B = {f_0, f_1, f_2} with f_0(x) = 1, f_1(x) = x, and f_2(x) = x^2 be the standard basis of V. Define the inner product 〈·, ·〉 on V by

〈f, g〉 = ∫_0^1 f(x)g(x) dx.

(a) [10 points] Use the Gram-Schmidt algorithm on B to find an orthogonal basis of V.
(b) [10 points] Find the best quadratic approximation to f(x) = x^4 on [0, 1].
5. [20 points] Let Q be the quadratic form given by

Q(x_1, x_2, x_3) = αx_1^2 + 2x_1x_2 + αx_2^2 + 4x_1x_3 + 2x_2x_3 − 3x_3^2.

Find all values of α ∈ R such that Q is positive definite, and all values of α ∈ R such that Q is negative definite, if they exist.

6. [Bonus question for up to 5 points of extra credit, if required] Let A be a Jordan matrix whose Jordan blocks are J_{n_1}(λ_0), J_{n_2}(λ_0), . . . , J_{n_k}(λ_0) for some integers n_1, n_2, . . . , n_k and some real number λ_0 (which is the same for all the blocks). What are p_A(λ) and µ_A(λ)? Prove that your answer is correct.
ad575b3c65a653a72a6d1749dce4577c.pdf
MATHS 253, Algebra. Lecture 13
28 March 2022
Today’s topic: Orthogonal matrices. Orthogonal diagonalisation of Hermitian operators.
Colour code: Red
Preparation: Go through the text: Textbook: §5.1, pp. 384–387; §5.4, pp. 411–418. Lecture Notes: Sec. 5.2
1 / 19
Change-of-basis matrices & orthogonal bases
Let B = {v_1, . . . , v_n} be an orthonormal basis of R^n. Let us consider some properties of the change-of-basis matrix

Q = P_{E←B} = [I]_BE = [v_1 v_2 . . . v_n],
where E is the standard basis.
Example 1

P = [ 1/√2  1/√2; 1/√2  −1/√2 ],   Q =
[ 1/√2   1/√3   1/√6  ]
[ −1/√2  1/√3   1/√6  ]
[ 0      1/√3  −2/√6  ].

Indeed, the columns form an orthonormal basis of R^2 and R^3, respectively.
2 / 19
Orthogonal matrices

Let us calculate Q^T Q. We obtain

Q^T Q = [ v_1^T; v_2^T; . . . ; v_n^T ][ v_1 v_2 . . . v_n ] =
[ v_1 · v_1  · · ·  v_1 · v_n ]
[ v_2 · v_1  · · ·  v_2 · v_n ]
[ · · ·      · · ·  · · ·     ]
[ v_n · v_1  · · ·  v_n · v_n ]
= I_n.

The following two conditions are equivalent:
• The columns of a matrix Q form an orthonormal basis.
• Q^T Q = I_n.

Since Q is invertible (as a change-of-basis matrix), the second condition is equivalent to Q^{−1} = Q^T.

Definition 2
A square n × n matrix with real coefficients is called orthogonal if Q^{−1} = Q^T.
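These equivalent conditions are easy to verify numerically; a sketch (my addition, assuming NumPy) using the 3 × 3 matrix Q from Example 1:

import numpy as np

Q = np.array([[ 1/np.sqrt(2), 1/np.sqrt(3),  1/np.sqrt(6)],
              [-1/np.sqrt(2), 1/np.sqrt(3),  1/np.sqrt(6)],
              [ 0,            1/np.sqrt(3), -2/np.sqrt(6)]])
print(np.allclose(Q.T @ Q, np.eye(3)))      # True: Q^T Q = I
print(np.allclose(np.linalg.inv(Q), Q.T))   # True: Q^-1 = Q^T, so Q is orthogonal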
3 / 19
Summarising
Proposition 1For a square n × n matrix Q the following two conditions are equivalent:• The columns of a matrix Q form an orthonormal basis.• Q is orthogonal.
4 / 19
Orthogonal matrices

Proposition 2
If Q is an orthogonal matrix, then Q^T is also orthogonal.

Proof: Apply the transpose to QQ^{−1} = I_n to get (Q^{−1})^T Q^T = I_n^T = I_n. It implies that (Q^{−1})^T = (Q^T)^{−1}. Now we see that

(Q^T)^{−1} = (Q^{−1})^T = (Q^T)^T,

which means that Q^T is orthogonal.

Theorem 3
The following conditions on the n × n matrix Q are equivalent:
1. Q is orthogonal.
2. Columns of Q form an orthonormal basis of R^n.
3. Q = P_{E←B} for an orthonormal basis B.
4. Rows of Q form an orthonormal basis of R^n.
5 / 19
Exercise

Replace the stars with real numbers so that the matrix

A = (1/√7)[ 1 ∗; ∗ 1 ]

is orthogonal.

Let

A = (1/√7)[ 1 x; y 1 ].

Then 1 + x^2 = y^2 + 1 = 7, hence x^2 = y^2 = 6 and |x| = |y| = √6. To achieve orthogonality of the columns these numbers should have opposite signs, hence:

A = (1/√7)[ 1 √6; −√6 1 ]   or   A = (1/√7)[ 1 −√6; √6 1 ].
6 / 19
Hermitian operators on Rn and Cn
Definition 4 (5.3.1)
A linear operator T : F^n → F^n is called Hermitian if T* = T, or

T(u) · v = u · T(v)

for all u, v ∈ F^n.

Example 5 (5.3.1)
Let W be a subspace in F^n. Let us prove that the orthogonal projection T(v) = proj_W(v) onto W is a Hermitian operator.

Proof: Let u, v be arbitrary vectors of F^n. Then u = proj_W(u) + u^⊥ and v = proj_W(v) + v^⊥, where u^⊥, v^⊥ ∈ W^⊥. Hence

T(u) · v = proj_W(u) · (proj_W(v) + v^⊥) = proj_W(u) · proj_W(v) = (proj_W(u) + u^⊥) · proj_W(v) = u · T(v).
7 / 19
Charles Hermite
(1822 – 1901)

8 / 19
Hermitian matrices
A complex n × n matrix A = (a_{ij}) is called Hermitian if a_{ji} = ā_{ij} for all i and j.

We define the complex conjugate Ā of a matrix A by applying conjugation to each entry of A; then A is Hermitian iff Ā^T = A.

The matrix Ā^T has the standard notation A* and bears the name of the adjoint matrix of A. So a Hermitian matrix can also be defined as one which satisfies A* = A.

Example 6
Any real symmetric matrix, i.e., any matrix satisfying A^T = A, is Hermitian. Another example is:

A = [ 1  1 − i; 1 + i  −1 ].
9 / 19
Some important Hermitian matrices
Pauli introduced in 1927 the following three matrices, which are now known as Pauli spin matrices:

σ_x = [ 0 1; 1 0 ],   σ_y = [ 0 −i; i 0 ],   σ_z = [ 1 0; 0 −1 ].

These Hermitian matrices together with the identity matrix I_2 (and their linear combinations) represent all Hermitian operators on a two-dimensional vector space over the field of complex numbers: the spin space.

In the context of Pauli’s work, σ_u is the observable corresponding to spin along the u-th coordinate axis in R^3.
10 / 19
Hermitian operators and Hermitian matrices
Lemma 7 (5.3.1)
Let T be an operator T : C^n → C^n with the standard matrix A, i.e., T(x) = Ax. Then T is Hermitian if and only if A is Hermitian.

Proof.
If the operator T has A as its standard matrix, then T* has A* as its standard matrix. Hence T = T* if and only if A = A*.
11 / 19
Eigenvalues of Hermitian operators
Theorem 8 (5.3.2)All eigenvalues of a Hermitian operator on Cn are real.
Proof.
Assume that T(v) = λv and v ≠ 0. Then on the one hand (by DP3)

T(v) · v = (λv) · v = λ(v · v),

and on the other hand (by DP3 again)

v · T(v) = v · (λv) = λ̄(v · v).

For a Hermitian operator T these imply:

λ(v · v) = λ̄(v · v)   or   (λ − λ̄)(v · v) = 0.

We have v · v ≠ 0 and we conclude that λ = λ̄, i.e., λ is real.

12 / 19
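A numerical illustration (mine, not from the slides; it assumes NumPy) using the Hermitian matrix of Example 6: eigvalsh, which is designed for Hermitian input, returns real eigenvalues.

import numpy as np

A = np.array([[1, 1 - 1j],
              [1 + 1j, -1]])
print(np.allclose(A, A.conj().T))   # True: A* = A, so A is Hermitian
print(np.linalg.eigvalsh(A))        # real eigenvalues, here [-1.732..., 1.732...]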
Eigenvectors of Hermitian operators
Theorem 9 (5.3.3)
Any two eigenvectors u and v of a Hermitian operator T on C^n belonging to two distinct eigenvalues λ ≠ µ, respectively, are orthogonal.

Proof.
Assume that T(u) = λu and T(v) = µv, where λ ≠ µ. Then using DP3 we obtain

T(u) · v = (λu) · v = λ(u · v),

and

u · T(v) = u · (µv) = µ̄(u · v) = µ(u · v)

since λ and µ are real. For a Hermitian operator T this gives us λ(u · v) = µ(u · v), or (λ − µ)(u · v) = 0. As λ − µ ≠ 0, we obtain u · v = 0 as required.
13 / 19
Orthogonal diagonalisation of Hermitian operators
Definition 10 (5.3.3)
A linear operator T : V → V is called orthogonally diagonalisable if V has an orthonormal basis consisting of eigenvectors of T.

Theorem 11 (5.3.4)
Every Hermitian operator T : C^n → C^n is orthogonally diagonalisable. Moreover, it has an orthonormal basis consisting of eigenvectors belonging to real eigenvalues.

Proof: We will prove the theorem by induction on n = dim_C V. The theorem is obvious for n = 1. Indeed, in a one-dimensional vector space V any linear operator is orthogonally diagonalisable.
14 / 19
Fundamental Theorem of Algebra
Theorem 12Let f (z) ∈ C[z] be a polynomial of complex variable z with complex coefficients.Then, if deg f ≥ 1, it has at least one complex root z0, i.e., f (z0) = 0.
This theorem is not a part of this course.
15 / 19
Construction of the first eigenvector

We may now claim that T on C^n has at least one eigenvalue λ_1 ∈ C.

Indeed, by the Fundamental Theorem of Algebra the characteristic polynomial χ_T(λ) of T has a root λ_1.

As T is Hermitian, this eigenvalue must be real.

Let v be any eigenvector belonging to λ_1. We can normalise it and consider v_1 = v/‖v‖. Then ‖v_1‖ = 1 and it is again an eigenvector for λ_1.
16 / 19
Induction step
Let W = v_1^⊥ be the orthogonal complement of v_1 in C^n. Then dim_C W = n − 1. We claim that W is invariant under T.

To justify this we have to prove that if w belongs to W, so does T(w). To prove that T(w) belongs to W when w ⊥ v_1 means to prove that T(w) is also orthogonal to v_1. This is indeed true:

T(w) · v_1 = w · T(v_1) = w · (λ_1 v_1) = λ_1(w · v_1) = 0.

Now the restriction T|_W of T onto W will be a Hermitian linear operator on W. Since dim_C W = n − 1 we can apply the induction hypothesis to it.

By the induction hypothesis, W has an orthonormal basis of eigenvectors {v_2, . . . , v_n} of T|_W. Then {v_1, v_2, . . . , v_n} is an orthonormal basis of eigenvectors of T.
17 / 19
To Summarise
As a result of today’s lecture (together with your further study) you should
• know the definition of an orthogonal matrix and all equivalent alternative definitions;
• know what orthogonal diagonalisability means;
• know the definition and properties of the dot product in C^n;
• know the definition and examples of Hermitian operators;
• know that eigenvalues of a Hermitian operator are always real and that Hermitian operators are orthogonally diagonalisable.
18 / 19
What to do now
Consolidation: Try exercises: Textbook: §5.1: # 3,13,18,20,28. Lecture Notes: Exercises of Sec. 5.1.
Next topic: Orthogonal diagonalisation of symmetric matrices. Spectral decomposition.
Colour code: Red
Preparation: Go through the text: Textbook: §5.4 pp. 411–418. Lecture Notes: Sec. 5.2
19 / 19
bf3bd0b5836c7a42062cb61f188eb703.pdf
MATHS 253, Algebra. Lecture 11
23 March 2022
Today’s topic: Projections as best approximations. Polynomial and Fourier approximations.
Colour code: Red
Preparation: Go through the text: Textbook: §7.5, pp. 633–640. Lecture Notes: Sec. 4.1.3–4.3
1 / 24
The main idea of approximation theories
Approximation is one of the most important concepts in mathematics.

Polynomials and trigonometric polynomials are the two most common classes of functions used to approximate more complex functions.

The geometric idea that the projection is the best approximation is behind all approximation theories.
2 / 24
Projection onto a finite-dimensional subspace
Theorem 1 (4.2.6)
If W is a finite-dimensional subspace in an inner product space (which itself may not be finite-dimensional) and {w_1, . . . , w_m} is an orthogonal basis for W, then the orthogonal projection of an arbitrary vector v onto W exists and

proj_W(v) = ((v, w_1)/‖w_1‖^2) w_1 + · · · + ((v, w_m)/‖w_m‖^2) w_m = proj_{w_1}(v) + · · · + proj_{w_m}(v),

where

proj_{w_i}(v) = ((v, w_i)/‖w_i‖^2) w_i.
3 / 24
Proof
Let us write down the condition that for a vector w = a_1w_1 + · · · + a_mw_m of W we have v − w ⊥ W. It is equivalent to

(v − w, w_1) = · · · = (v − w, w_m) = 0.

Since

0 = (v − w, w_i) = (v − a_1w_1 − · · · − a_mw_m, w_i) = (v, w_i) − a_i(w_i, w_i) = (v, w_i) − a_i‖w_i‖^2,

we obtain

a_i = (v, w_i)/‖w_i‖^2.

So a_iw_i = proj_{w_i}(v) and the theorem is proved.
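A minimal numerical sketch of this projection formula (my addition, assuming NumPy; w_1, w_2 here are an arbitrarily chosen orthogonal pair spanning a plane in R^3):

import numpy as np

w1 = np.array([1.0, 1.0, 0.0])
w2 = np.array([1.0, -1.0, 1.0])
assert np.isclose(w1 @ w2, 0)            # the basis is orthogonal

v = np.array([2.0, 4.0, 6.0])
proj = (v @ w1) / (w1 @ w1) * w1 + (v @ w2) / (w2 @ w2) * w2
# v - proj_W(v) is orthogonal to W, exactly as the proof requires
print(np.isclose((v - proj) @ w1, 0), np.isclose((v - proj) @ w2, 0))  # True True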
4 / 24
Pythagoras’ Theorem 1
In a right-angled triangle with legs a and b and hypotenuse c we have

a^2 + b^2 = c^2.

Based on the proof from Zhou bi suan jing, a Chinese document dating from approximately 200 BC.
5 / 24
Pythagoras’ Theorem 2
Theorem 2 (4.2.7)
Let V be any real inner product space. If u, v ∈ V and u ⊥ v, then

‖u + v‖^2 = ‖u‖^2 + ‖v‖^2.

Proof.
Since (u, v) = 0, we have

‖u + v‖^2 = (u + v, u + v) = (u, u) + (v, v) = ‖u‖^2 + ‖v‖^2.
6 / 24
Best Approximation Theorem

Theorem 3 (4.2.8)
Let V be an inner product space, W be a finite-dimensional subspace of V and v be an arbitrary vector from V. Then for vectors w ∈ W the norm ‖v − w‖ is minimised when w = proj_W(v), and for all w ≠ proj_W(v)

‖v − w‖ > ‖v − proj_W(v)‖.

Proof: We can write v − w in the form

v − w = (v − proj_W(v)) + (proj_W(v) − w) = z_1 + z_2,

where z_1 ⊥ W and z_2 ∈ W. Using Pythagoras’ Theorem we get

‖v − w‖^2 = ‖z_1 + z_2‖^2 = ‖z_1‖^2 + ‖z_2‖^2 ≥ ‖z_1‖^2,

or ‖v − w‖ ≥ ‖v − proj_W(v)‖, with ‖v − w‖ = ‖v − proj_W(v)‖ if and only if w = proj_W(v).
7 / 24
Least squares approximations
When the integral inner product is used to approximate f(x) by functions g(x) from a class G, we minimise

‖f − g‖^2 = ∫_a^b (f(x) − g(x))^2 dx,   g(x) ∈ G,

so it is sometimes called the least squares approximation.
8 / 24
Approximation by polynomials
If we want to approximate a function f(x) by polynomials of degree not greater than n, we take the projection of f(x) onto the subspace R_n[x] of all polynomials of degree not greater than n.

We use the orthogonal basis consisting of the respective Legendre polynomials.

Let f(x) ∈ Fun_R[a, b] be a function. The nth polynomial approximation will be:

f_n(x) = ((f, p_0)/(p_0, p_0)) p_0 + · · · + ((f, p_n)/(p_n, p_n)) p_n.
9 / 24
Example

Let us find the best polynomial approximation of degree ≤ 2 for the function f(x) = x^{2/3} on [−1, 1]. We know: (p_0, p_0) = 2, (p_1, p_1) = 2/3, (p_2, p_2) = 8/45. We calculate:

(f, p_0) = ∫_{−1}^{1} x^{2/3} dx = 6/5,   (f, p_1) = 0 (as f is even),

(f, p_2) = ∫_{−1}^{1} x^{2/3}(x^2 − 1/3) dx = 8/55.

Thus

f_2(x) = ((f, p_0)/(p_0, p_0)) p_0 + ((f, p_1)/(p_1, p_1)) p_1 + ((f, p_2)/(p_2, p_2)) p_2 = (3/5) · 1 + (9/11)(x^2 − 1/3) = (9/11)x^2 + 18/55.
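These coefficients can be recomputed numerically; a sketch (my addition, assuming SciPy is available; x^{2/3} is written as |x|^{2/3}, which agrees with (x^2)^{1/3} and stays real for negative x):

from scipy.integrate import quad

f = lambda x: abs(x) ** (2 / 3)          # x^(2/3) on [-1, 1]
p0 = lambda x: 1.0
p2 = lambda x: x ** 2 - 1 / 3

c0 = quad(lambda x: f(x) * p0(x), -1, 1, points=[0])[0] / 2          # (f,p0)/(p0,p0)
c2 = quad(lambda x: f(x) * p2(x), -1, 1, points=[0])[0] / (8 / 45)   # (f,p2)/(p2,p2)
print(round(c0, 6), round(c2, 6))        # 0.6 and 0.818182, i.e. 3/5 and 9/11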
10 / 24
Graph of the quadratic approximation
[Figure: f(x) = x^{2/3} (true y) and its quadratic least-squares approximation f_2(x) on [−1, 1].]
11 / 24
Example

Calculate the polynomial of degree at most 3 that best approximates e^x over the interval [−1, 1] in the least-squares sense.

(e^x, p_0) = ∫_{−1}^{1} e^x dx = e − e^{−1},

(e^x, p_1) = ∫_{−1}^{1} x e^x dx = 2e^{−1},

(e^x, p_2) = ∫_{−1}^{1} e^x (x^2 − 1/3) dx = (2/3)e − (14/3)e^{−1},

(e^x, p_3) = ∫_{−1}^{1} e^x (x^3 − (3/5)x) dx = −2e + (74/5)e^{−1}.

(Wolfram Alpha: Computational Knowledge Engine used.)

12 / 24
Example (continued)
(e^x, p_0)/‖p_0‖^2 = 1.1752012,   (e^x, p_1)/‖p_1‖^2 = 1.1036383,
(e^x, p_2)/‖p_2‖^2 = 0.53672153,   (e^x, p_3)/‖p_3‖^2 = 0.17613908.

So now we have our approximation

g_3(x) = 1.1752012 · p_0(x) + 1.1036383 · p_1(x) + 0.53672153 · p_2(x) + 0.17613908 · p_3(x)
       = 0.99629402 + 0.99795487x + 0.53672153x^2 + 0.17613908x^3.
13 / 24
Graph of the cubic approximation
[Figure: e^x (true y) and its cubic approximation g_3(x), plotted on [−2, 2].]

On [−1, 1] the two curves can hardly be distinguished.

14 / 24
Jean Baptiste Joseph Fourier (1768-1830)
15 / 24
The projection formula for orthonormal basis – recap
Corollary 4
Let B = {w_1, . . . , w_m} be an orthonormal basis of a subspace W of an inner product space V and x ∈ V. Then

proj_W(x) = (x, w_1)w_1 + (x, w_2)w_2 + . . . + (x, w_m)w_m.

This will suit our next goal, which is approximation by trigonometric polynomials, since we know a nice orthonormal basis of T_n[x] on [−1, 1]:

{ 1/√2, cos πx, sin πx, . . . , cos nπx, sin nπx }.
16 / 24
Fourier approximations

If we want to approximate a function f defined on [−1, 1] by a trigonometric polynomial, we take the projection of f onto T_n[x], and what we get is called the nth Fourier approximation f_n:

f_n(x) = a_0 + ∑_{k=1}^{n} (a_k cos kπx + b_k sin kπx),

where

a_0 = (f(x), 1/√2) · (1/√2) = (1/2) ∫_{−1}^{1} f(x) dx,
a_k = (f(x), cos kπx) = ∫_{−1}^{1} f(x) cos kπx dx,
b_k = (f(x), sin kπx) = ∫_{−1}^{1} f(x) sin kπx dx.

The numbers a_i, i = 0, 1, . . . , n and b_i, i = 1, . . . , n are called the Fourier coefficients of f.
17 / 24
By the Best Approximation Theorem we know that

‖f − t‖ ≥ ‖f − f_n‖ = ( ∫_{−1}^{1} [f(x) − f_n(x)]^2 dx )^{1/2}

for any other trigonometric polynomial t of degree n. We say that f_n is the least squares approximation for f by trigonometric polynomials of degree ≤ n.

The role of Algebra is to suggest a formula for f_n(x), and the role of Calculus is to find necessary and sufficient conditions under which the convergence

f_n(x) → f(x)

takes place, and what the type of this convergence is.
18 / 24
Example

Let us find the nth Fourier approximation for f(x) = x. Since this function is odd, the projections on all cosines will be zero, hence f_n will be expressed using the sines only.

b_k = (f(x), sin kπx) = ∫_{−1}^{1} x sin kπx dx = [−x cos kπx/(kπ)]_{−1}^{1} + ∫_{−1}^{1} (cos kπx/(kπ)) dx = −2 cos kπ/(kπ) = −(−1)^k · 2/(kπ).

Thus

f_n(x) = (2/π)[ sin πx − (1/2) sin 2πx + (1/3) sin 3πx − . . . + (−1)^{n+1}(1/n) sin nπx ].

The next slide shows the 2nd and 5th order approximations to this function.

19 / 24
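A hedged numerical cross-check of these coefficients (my addition, assuming NumPy and SciPy):

import numpy as np
from scipy.integrate import quad

# b_k for f(x) = x on [-1, 1] should equal -(-1)^k * 2/(k*pi)
for k in range(1, 5):
    b_k = quad(lambda x, k=k: x * np.sin(k * np.pi * x), -1, 1)[0]
    print(k, round(b_k, 6), round(-(-1) ** k * 2 / (k * np.pi), 6))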
2nd order approximation
20 / 24
5th order approximation
21 / 24
Convergence

Fourier asked whether the Fourier series of a continuous function which is periodic on [−1, 1] converges pointwise to this function, i.e., whether or not

f_n(x) → f(x) for all x ∈ [−1, 1].

For continuously differentiable functions this was proved by Dirichlet, who expressed his belief that he would soon be able to extend his result to cover all continuous functions.

This was disproved by Paul du Bois-Reymond, who showed in 1876 that there is a continuous function whose Fourier series diverges at one point.

Theorem (Lennart Carleson, 1966). If the square of the function f is integrable (the function belongs to the class L^2), then its Fourier series converges almost everywhere.

This was conjectured by Nikolai Luzin (1915).

22 / 24
To Summarise
As a result of today’s lecture (together with your further study) you should
• be able to find an orthogonal basis in inner product spaces using GSO;
• know orthogonal bases for trigonometric and ordinary polynomials;
• be able to find the best polynomial approximation of degree ≤ n to a given function;
• be able to find the best nth degree Fourier approximation to a given function.
23 / 24
What to do now
Consolidation: Try exercises: Textbook: §7.5, p. 644 # 1,3,5,8,13,15,21,23,25.
Next topic: Complex inner product. Adjoint operator. The normal equation.
Colour code: Red
Preparation: Go through the text: Lecture Notes: Sec. 4.3–5.1. Textbook: pp. 566–570; §7.3; pp. 591–607 (repetition of 250).
24 / 24
c81c79b51b26713d245e6a8c84503937.pdf
MATHS 253, Algebra. Lecture 5
9 March 2022
Today’s topic: Eigenvectors and eigenvalues.
Colour code: Yellow/Orange/Red
Preparation: Go through the text: Lecture Notes: Sec. 1.4, 2.1; Textbook: §4.1,4.3.
1 / 24
Eigenvectors and eigenvalues
Let V be a vector space over a field F and T : V → V be a linear operator on V .
Definition 1
A nonzero vector v ∈ V is called an eigenvector of T if for some scalar λ ∈ F
T (v) = λv.
In this case λ is said to be an eigenvalue of T . We will also say that theeigenvector v belongs to the eigenvalue λ.
Note that we do not require that λ ≠ 0.
2 / 24
Examples of eigenvectors and eigenvalues
1. Let T : R^2 → R^2 be the projection operator onto the x-axis. Then

T(e_1) = e_1 = 1 · e_1,   T(e_2) = 0 = 0 · e_2.

The vector e_1 is an eigenvector of T belonging to the eigenvalue 1 and the vector e_2 is an eigenvector of T belonging to the eigenvalue 0.

2. Let T : R^2 → R^2 be the reflection operator in the x-axis. As

T(e_1) = e_1 = 1 · e_1,   T(e_2) = −e_2 = (−1) · e_2,

the vector e_1 is an eigenvector of T belonging to the eigenvalue 1 and the vector e_2 is an eigenvector of T belonging to the eigenvalue −1.
3 / 24
More examples of eigenvectors and eigenvalues
3. The function f(x) = e^{λx} is an eigenvector of the differentiation operator D belonging to the eigenvalue λ, as

D(e^{λx}) = λe^{λx}.
4. Every nonzero vector v is an eigenvector of the identity operator I:
I(v) = v = 1 · v.
4 / 24
Eigenvectors and eigenvalues of matrices
Definition 2
Let T : F^n → F^n be a linear operator on F^n with the standard matrix A, that is T(x) = Ax. Then the fact that x ∈ F^n is an eigenvector of T belonging to an eigenvalue λ ∈ F can be expressed as

Ax = λx,

in which case for brevity λ is also called an eigenvalue of A and x is also called an eigenvector of A belonging to the eigenvalue λ.

Shortly we will recap how to find the eigenvalues and eigenvectors of a square n × n matrix A. But first an exercise.
5 / 24
Exercise 1

Which of the two vectors (1, 1)^T and (1, −1)^T is an eigenvector of [ 3 1; −1 1 ]? Which eigenvalue does it belong to?

Solution:

[ 3 1; −1 1 ](1, 1)^T = (4, 0)^T ≠ λ(1, 1)^T.

Hence (1, 1)^T is not an eigenvector.

[ 3 1; −1 1 ](1, −1)^T = (2, −2)^T = 2(1, −1)^T.

Hence only the second vector (1, −1)^T is an eigenvector, belonging to the eigenvalue 2.
6 / 24
Finding eigenvectors and eigenvalues 1

Theorem 3
Let T : F^n → F^n be a linear operator with the standard matrix A ∈ F^{n×n}, that is T(x) = Ax. Then
1. The eigenvalues of T are the roots of the characteristic polynomial

p_A(λ) = det(A − λI_n),

which is a polynomial of degree n in λ.
2. If λ_0 is an eigenvalue, then the eigenvectors belonging to λ_0 are the nonzero solutions to (A − λ_0I_n)x = 0, i.e. the nonzero vectors of the null space of A − λ_0I_n.

Definition 4
If λ is an eigenvalue, then the null space of A − λI_n is said to be the eigenspace belonging to the eigenvalue λ, denoted

E_λ = null(A − λI_n).

7 / 24
Exercise 2

Which of 3 or −3 is an eigenvalue of [ −2 1; 1 −2 ]? Find an eigenvector belonging to the eigenvalue.

Solution: Let us try 3 first:

det[ −2 − 3  1; 1  −2 − 3 ] = det[ −5 1; 1 −5 ] = 24 ≠ 0.

Not an eigenvalue. Let us try −3:

det[ −2 + 3  1; 1  −2 + 3 ] = det[ 1 1; 1 1 ] = 0.

Hence only −3 is an eigenvalue, and an eigenvector may be chosen as (1, −1)^T (any nonzero vector from the null space would do).
8 / 24
Algebraic and geometric multiplicity
Definition 5
Let λ_0 be an eigenvalue of A and let (λ − λ_0)^k be the highest power of λ − λ_0 that divides p_A(λ). Then k is said to be the algebraic multiplicity of λ_0.

Definition 6
The eigenvectors belonging to an eigenvalue λ_0, together with the zero vector, form the eigenspace E_{λ_0}, which is the null space of the matrix (A − λ_0I_n). The dimension of this space is called the geometric multiplicity of λ_0.

Proposition 1
The geometric multiplicity of an eigenvalue λ_0 of an operator T : F^n → F^n, given by T(x) = Ax, is equal to n − rank(A − λ_0I_n).
9 / 24
Example

Find the eigenvalues and eigenvectors of the linear operator T : R^3 → R^3 with the standard matrix

A =
[ 7  −1  −2 ]
[ −1  7   2 ]
[ 1  −1   4 ].

The characteristic polynomial of the matrix A is equal to

det(A − λI) = det[ 7−λ  −1  −2; −1  7−λ  2; 1  −1  4−λ ]
            = det[ 7−λ  −1  −2; −1  7−λ  2; 0  6−λ  6−λ ]
            = (6 − λ) det[ 7−λ  −1  −2; −1  7−λ  2; 0  1  1 ]
            = (6 − λ) det[ 7−λ  1  −2; −1  5−λ  2; 0  0  1 ]
            = (6 − λ) det[ 7−λ  1; −1  5−λ ] = (6 − λ)^3.

Note that we used column operations too!

10 / 24
Example continued

Therefore there is only one eigenvalue, 6, but it has algebraic multiplicity 3. Let us find the eigenspace belonging to 6. The eigenvectors are the solutions to the system (A − 6I)x = 0, where x = (x_1, x_2, x_3)^T. We row reduce the characteristic matrix:

A − 6I =
[ 1  −1  −2 ]        [ 1  −1  −2 ]
[ −1  1   2 ]  rref→ [ 0   0   0 ]
[ 1  −1  −2 ]        [ 0   0   0 ].

This row reduced echelon form corresponds to the system

x_1 − x_2 − 2x_3 = 0,

whose solution space is spanned by the vectors f_1 = (1, 1, 0)^T and f_2 = (2, 0, 1)^T. This is a basis for the eigenspace belonging to the eigenvalue 6. The geometric multiplicity of 6 is two.
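A numerical companion to this example (my addition, assuming NumPy). The matrix is not diagonalisable, so floating-point eigenvalues only approximate the triple root 6, but the rank computation and the eigenvector checks are exact up to rounding:

import numpy as np

A = np.array([[ 7, -1, -2],
              [-1,  7,  2],
              [ 1, -1,  4]], dtype=float)
print(np.round(np.linalg.eigvals(A), 3))              # all three roots near 6
print(3 - np.linalg.matrix_rank(A - 6 * np.eye(3)))   # 2 = geometric multiplicity
for f in (np.array([1.0, 1.0, 0.0]), np.array([2.0, 0.0, 1.0])):
    print(np.allclose(A @ f, 6 * f))                  # True: eigenvectors for 6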
11 / 24
Finding eigenvalues and eigenvectors. The case of linear operators
Theorem 7
Let T : V → V be a linear operator on a finite-dimensional vector space V over F and let B be any basis of V. Then:
1. The eigenvalues of T are the eigenvalues of [T]_B (belonging to F);
2. The coordinate columns of the eigenvectors of T are the eigenvectors of [T]_B.

Proof.
Suppose T(u) = λu. Taking the coordinate column on both sides we get

[T(u)]_B = [T]_B[u]_B = λ[u]_B

and [u]_B is an eigenvector of [T]_B belonging to λ.
12 / 24
Example

Let V = R_2[x] be the vector space of polynomials of degree at most 2. The linear operator T on V is given by the formula

(Tf)(x) = (1 − 2x^2)f″(x) + xf′(x) + 2f(x).

Find the eigenvectors and eigenvalues of T.

Let B = {1, x, x^2}. Then

(T1)(x) = 2 · 1,   (Tx)(x) = 3 · x,   (Tx^2)(x) = 2 · 1.

The corresponding matrix will be

[T]_B =
[ 2 0 2 ]
[ 0 3 0 ]
[ 0 0 0 ].
13 / 24
Example continues

The eigenvalues of this matrix are 0, 2, 3. These will also be the eigenvalues of T.

For λ_1 = 0 we find

[T]_B − 0 · I =
[ 2 0 2 ]        [ 1 0 1 ]
[ 0 3 0 ]  rref→ [ 0 1 0 ]
[ 0 0 0 ]        [ 0 0 0 ].

The basis of its null space is (1, 0, −1)^T, which corresponds to the function f_1 = 1 − x^2.
14 / 24
Example continues

For λ_2 = 2 we find

[T]_B − 2 · I =
[ 0 0 2 ]         [ 0 1 0 ]
[ 0 1 0 ]  rref→  [ 0 0 1 ]
[ 0 0 −2 ]        [ 0 0 0 ].

The basis of its null space is (1, 0, 0)^T, which corresponds to the function f_2 = 1.

For λ_3 = 3 we find

[T]_B − 3 · I =
[ −1 0 2 ]        [ 1 0 0 ]
[ 0 0 0 ]  rref→  [ 0 0 1 ]
[ 0 0 −3 ]        [ 0 0 0 ].

The basis of its null space is (0, 1, 0)^T, which corresponds to the function f_3 = x.

Note that we have a basis {1, x, 1 − x^2} of eigenvectors.

15 / 24
Invariant subspaces
The concept of invariant subspace generalizes the concept of an eigenvector.
Definition 8
Let T : V → V be a linear operator acting on V. A subspace U ⊆ V is called invariant (under T) if T(U) ⊆ U.

Example 9
Let T : V → V be a linear operator acting on V and let v be an eigenvector of T. Then the one-dimensional subspace W = Span{v} is invariant.

Example 10
Let D be the differentiation operator on R[x]. Then for each n the subspace R_n[x] is invariant under D.
16 / 24
A criterion
Proposition 2 (Criterion)
A subspace W of a vector space V is invariant under T : V → V if for some basis {w_1, . . . , w_m} of W all the vectors T(w_1), . . . , T(w_m) belong to W.

Proof.
Indeed, if all the vectors T(w_1), . . . , T(w_m) belong to W, then for an arbitrary w = a_1w_1 + . . . + a_mw_m ∈ W we will get

T(w) = T(a_1w_1 + . . . + a_mw_m) = a_1T(w_1) + . . . + a_mT(w_m) ∈ W.

Whence W is invariant under T.
17 / 24
Example
W = span{e^x, xe^x, x^2e^x} is an invariant subspace of C_R[a, b] under the differentiation operator D.

Indeed,

D(e^x) = e^x ∈ W,
D(xe^x) = e^x + xe^x ∈ W,
D(x^2e^x) = 2xe^x + x^2e^x ∈ W.
Hence W is invariant.
18 / 24
Restriction of an operator
If U ⊆ V is an invariant subspace of an operator T : V → V, then we can associate with T an operator T|_U : U → U defined by

T|_U(u) = T(u).
We simply restrict the domain of T to the subspace U and get a new operator onthe restricted domain.
It is easy to see that T |U is well-defined and linear. It is called the restriction of Ton U.
19 / 24
Example
Let V = D[−∞, ∞] be the vector space of all differentiable real-valued functions and let W = Span{f_1, f_2}, where

f_1 = e^{ax} cos bx,   f_2 = e^{ax} sin bx   (a^2 + b^2 ≠ 0),

be a two-dimensional subspace of V. Check that it is invariant under the differentiation operator D.

It is easy to check that

D(f_1) = ae^{ax} cos bx − be^{ax} sin bx = af_1 − bf_2,
D(f_2) = be^{ax} cos bx + ae^{ax} sin bx = bf_1 + af_2.
So W is invariant as both D(f1) and D(f2) are in W .
20 / 24
Interesting observation about W

The restriction D|_W of D on W in the basis B = {f_1, f_2} has the matrix

[D|_W]_B = [ a  b; −b  a ].

Let us denote T = D|_W. Since the matrix of T is invertible, the inverse operator T^{−1} : W → W exists and we find

[T^{−1}]_B = [ a  b; −b  a ]^{−1} = (1/(a^2 + b^2))[ a  −b; b  a ].

So

(1/(a^2 + b^2))[ a  −b; b  a ](0, 1)^T = (1/(a^2 + b^2))(−b, a)^T

would be the coordinate column of ∫ e^{ax} sin bx dx.

21 / 24

Therefore,

∫ e^{ax} sin bx dx = −(b/(a^2 + b^2)) e^{ax} cos bx + (a/(a^2 + b^2)) e^{ax} sin bx + C.

(You may check the result by integrating by parts.)
22 / 24
To Summarise
As a result of today’s lecture (together with your further study) you should
• know the definition of eigenvectors and eigenvalues;
• be able to find eigenvectors and eigenvalues of a linear operator on a finite-dimensional vector space;
• know the definition of an invariant subspace and the criterion of being invariant;
• be able to find the matrix of the restriction of an operator to an invariant subspace.
23 / 24
What to do now
Consolidation: Try exercises: Lecture Notes: Exercises of Sec. 3.1–3.2; Textbook: pp. 271–273: # 3,5,9,11,13–18,23,29; p. 309: # 3,5,9,11.
Next topic: Diagonalisation. Applications to discrete time system evolution.
Colour code: Yellow/Orange
Preparation: Go through the text: Lecture Notes: Sec. 2.2–2.3.1. Textbook: §4.3, 4.4; pp. 307–320.
24 / 24
f14236f1fced173edaac716d41c39dd8.pdf
MATHS 253, Algebra. Lecture 4
7 March 2022
Today’s topic: Change of basis matrices and their properties. Algebra of linear operators.
Colour code: Orange/Red
Preparation: Go through the text: Lecture Notes: Sec. 1.2.2–1.2.3.Textbook: §6.3: pp. 481–488; §6.6; 516–526.
1 / 26
Linear Transformations in Arbitrary Bases – recap

Theorem 1
Let T : V → W be a linear transformation. Let B = {v_1, . . . , v_n} be a basis for V and C = {w_1, . . . , w_m} be a basis for W. Then

[T(x)]_C = A[x]_B,

where A is an m × n matrix, which can be found as

A = ([T(v_1)]_C | [T(v_2)]_C | · · · | [T(v_n)]_C).

The matrix A is called the matrix of T relative to the bases B and C. We will also denote A as [T]_BC so that the formula can be written as

[T(x)]_C = [T]_BC [x]_B   (green basis cancels).

The notation in the textbook for [T]_BC is [T]_{C←B}.

2 / 26
Example

Suppose V is a 2-dimensional vector space with the basis B = {v_1, v_2} and U is a 3-dimensional vector space with basis C = {u_1, u_2, u_3}. Suppose a linear transformation T : V → U acts so that

T(v_1) = u_1 + u_2 + u_3,   T(v_2) = u_1 − u_2.

What is the matrix [T]_BC of T relative to B and C?

We have

[T(v_1)]_C = (1, 1, 1)^T,   [T(v_2)]_C = (1, −1, 0)^T,

hence

[T]_BC =
[ 1  1 ]
[ 1 −1 ]
[ 1  0 ].
3 / 26
Example

Suppose V is a 3-dimensional vector space and B = {v_1, v_2, v_3} is a basis of V. Suppose a linear operator T : V → V acts on V so that

T(v_1) = v_1 + v_2 + v_3,   T(v_2) = v_1 + v_2,   T(v_3) = v_1.

What is the matrix [T]_BB of T relative to B and B?

We have

[T(v_1)]_B = (1, 1, 1)^T,   [T(v_2)]_B = (1, 1, 0)^T,   [T(v_3)]_B = (1, 0, 0)^T,

hence

[T]_BB =
[ 1 1 1 ]
[ 1 1 0 ]
[ 1 0 0 ].
4 / 26
Another example

Let V be the vector space spanned in C_R(−∞, ∞) by the three functions

f_1(x) = e^x,   f_2(x) = xe^x,   f_3(x) = x^2e^x.

These functions form a basis of V. Find the matrix of the differentiation operator D relative to this basis.

As D(e^x) = e^x, D(xe^x) = e^x + xe^x, D(x^2e^x) = 2xe^x + x^2e^x,

D(f_1) = f_1,   D(f_2) = f_1 + f_2,   D(f_3) = 2f_2 + f_3.

Then

[D(f_1)]_B = (1, 0, 0)^T,   [D(f_2)]_B = (1, 1, 0)^T,   [D(f_3)]_B = (0, 2, 1)^T,

and

[D]_BB =
[ 1 1 0 ]
[ 0 1 2 ]
[ 0 0 1 ].
5 / 26
Uniqueness of the matrix [T ]BC .
To see that this matrix is unique, let us assume that there is a matrix A = [a_1 . . . a_n] that satisfies

[T(v)]_C = A[v]_B

for all v ∈ V. Let us take v = v_i, the ith vector of B. Then

[v_i]_B = e_i,

the ith vector of the standard basis of F^n, and

[T(v_i)]_C = A[v_i]_B = Ae_i = a_i,

the ith column of A. The ith column of [T]_BC coincides with the ith column of A. Hence [T]_BC coincides with A. □
6 / 26
Change-of-basis matrix
Let V be an arbitrary n-dimensional vector space. Let also B = {v1, . . . , vn} andC = {w1, . . . , wn} be two bases for V .
Let us consider the identity transformation I : V → V and its matrix relative to these two bases:

[I]_BC = ([v_1]_C, . . . , [v_n]_C).

As we established in the previous lecture,

[v]_C = [I(v)]_C = [I]_BC [v]_B.

This matrix [I]_BC relates the coordinate columns [v]_C and [v]_B of an arbitrary vector v in V and is called the change-of-basis or change-of-coordinates matrix from B to C. We know it is unique. In the textbook it is denoted as P_{C←B}, so in this notation

[v]_C = P_{C←B}[v]_B.
7 / 26
Example

Suppose V is a 3-dimensional vector space and B = {v_1, v_2, v_3}, C = {u_1, u_2, u_3} are two bases of V such that

v_1 = u_1 − u_2 + 2u_3,   v_2 = u_1 + u_3,   v_3 = u_2.

What is the change of basis matrix [I]_BC of I relative to B and C?

We have

[v_1]_C = (1, −1, 2)^T,   [v_2]_C = (1, 0, 1)^T,   [v_3]_C = (0, 1, 0)^T,

hence

P_{C←B} = [I]_BC =
[ 1  1  0 ]
[ −1 0  1 ]
[ 2  1  0 ].
8 / 26
Example

Suppose V is a 3-dimensional vector space and B = {v_1, v_2, v_3}, C = {u_1, u_2, u_3} are two bases of V such that

u_1 = v_2,   u_2 = v_3,   u_3 = v_1.

What is the change of basis matrix [I]_BC of I relative to B and C?

We have

[v_1]_C = (0, 0, 1)^T,   [v_2]_C = (1, 0, 0)^T,   [v_3]_C = (0, 1, 0)^T,

hence

P_{C←B} = [I]_BC =
[ 0 1 0 ]
[ 0 0 1 ]
[ 1 0 0 ].

The order of vectors in the basis is important!!!

9 / 26
Properties of the change-of-basis matrix
Theorem 2For any two bases B = {v1, . . . , vn} and C = {w1, . . . , wn} of a vector space V ,the change-of-basis matrix
PC←B = ([v1]C, . . . , [vn]C)
is invertible.
Proof.The vectors in basis B are linearly independent, hence their coordinate columnsrelative to C are also linearly independent. Hence the determinant of PC←B isnon-zero and the matrix is invertible.
10 / 26
Further properties (several bases involved)

Theorem 3
Let B, C and D be any three bases of a finite-dimensional vector space V. Then
1. P_{C←B} = (P_{B←C})^{−1}.
2. P_{D←B} = P_{D←C} P_{C←B}.

Proof.
Since all change-of-basis matrices are invertible,

[v]_C = P_{C←B}[v]_B  =⇒  (P_{C←B})^{−1}[v]_C = [v]_B.

Since P_{B←C} is unique, P_{B←C} = (P_{C←B})^{−1}. Also,

[v]_D = P_{D←C}[v]_C = P_{D←C} P_{C←B}[v]_B,

hence, due to uniqueness again, P_{D←B} = P_{D←C} P_{C←B}.

11 / 26
Example

Given two bases B = {v_1, v_2} and C = {u_1, u_2} of R^2, where

v_1 = (1, 1)^T,   v_2 = (1, −1)^T,   u_1 = (1, 0)^T,   u_2 = (2, 1)^T,

find the change of basis matrices P_{E←B}, P_{E←C}, P_{C←B}.

We immediately find

P_{E←B} = [ 1 1; 1 −1 ],   P_{E←C} = [ 1 2; 0 1 ].

Then we can calculate

P_{C←B} = P_{C←E} P_{E←B} = P_{E←C}^{−1} P_{E←B} = [ 1 −2; 0 1 ][ 1 1; 1 −1 ] = [ −1 3; 1 −1 ].
12 / 26
One more Example

The polynomial f(x) = 2 + (1 + x) − 3(1 + x)^2 is written as a linear combination of powers of 1 + x. Write it as a linear combination of powers of 1 − x.

There are two bases of R_2[x] involved here: B = {1, 1 + x, (1 + x)^2} and C = {1, 1 − x, (1 − x)^2}. We need the standard basis E = {1, x, x^2} as an intermediate.

We have

1 = 1,   1 + x = 1 + x,   (1 + x)^2 = 1 + 2x + x^2,

and

P_{E←B} =
[ 1 1 1 ]
[ 0 1 2 ]
[ 0 0 1 ].

13 / 26

Continuation of Example

Also

1 = 1,   1 − x = 1 − x,   (1 − x)^2 = 1 − 2x + x^2,

and

P_{E←C} =
[ 1  1   1 ]
[ 0 −1  −2 ]
[ 0  0   1 ].

Hence

P_{C←B} = P_{C←E} P_{E←B} = P_{E←C}^{−1} P_{E←B} =
[ 1  2   4 ]
[ 0 −1  −4 ]
[ 0  0   1 ].

As f(x) = 2 + (1 + x) − 3(1 + x)^2, we have [f]_B = (2, 1, −3)^T. Hence

[f]_C = P_{C←B}[f]_B = (−8, 11, −3)^T,

and f(x) = −8 + 11(1 − x) − 3(1 − x)^2.

14 / 26
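A quick check of this computation (my addition, assuming NumPy):

import numpy as np

P_CB = np.array([[1,  2,  4],
                 [0, -1, -4],
                 [0,  0,  1]], dtype=float)
f_B = np.array([2, 1, -3], dtype=float)
f_C = P_CB @ f_B
print(f_C)                    # [-8. 11. -3.]

x = 0.7                       # compare both expressions at a sample point
lhs = 2 + (1 + x) - 3 * (1 + x) ** 2
rhs = f_C[0] + f_C[1] * (1 - x) + f_C[2] * (1 - x) ** 2
print(np.isclose(lhs, rhs))   # True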
Algebras

A vector space A over a field F where a multiplication of vectors is defined is called an algebra if for every three vectors u, v, w ∈ A and every a, b ∈ F the following holds:
• (u + v)w = uw + vw, w(u + v) = wu + wv;
• (a + b)u = au + bu, (ab)u = a(bu).

Three examples:

1. The vector space F^{n×n} of all n × n matrices over F is an algebra relative to the operation of multiplication of matrices.
2. The vector space L(V) of all linear operators on an n-dimensional vector space V over F is an algebra relative to the operation of composition of operators.
3. The vector space R^3 is an algebra relative to the operation of the cross product u × v of two vectors.
15 / 26
Algebras F^{n×n} and L(V) correspond to each other nicely and, as a result, you can calculate in one instead of calculating in the other. For any basis B of V the mapping

T ↦ [T]_BB

translates the results from one algebra to another.

For brevity, I’ll write [T]_BB as [T]_B.
16 / 26
Matrices of the product and the sum
Theorem 4
Let T and S be two linear operators acting on a finite-dimensional vector space V over F and let B be any basis of V. Then

[TS]_B = [T]_B[S]_B,   [T + S]_B = [T]_B + [S]_B,   [αT]_B = α[T]_B.

To prove the first one we note that on the one hand

[TS(v)]_B = [TS]_B[v]_B,

and on the other

[TS(v)]_B = [T(S(v))]_B = [T]_B[S(v)]_B = [T]_B[S]_B[v]_B.

Due to the uniqueness of the matrix of TS we conclude that [TS]_B = [T]_B[S]_B. The other equations are exercises.
17 / 26
Invertible operators
A linear operator T : V → V is said to be invertible if there is another operator S (which might actually coincide with T) such that TS = ST = I, where I is the identity operator.

For every invertible operator T the linear operator S such that TS = ST = I is unique. Indeed, TS_1 = S_1T = I and TS_2 = S_2T = I imply

S_1 = S_1I = S_1(TS_2) = (S_1T)S_2 = IS_2 = S_2,

and S_1 = S_2 as claimed. We then denote S = T^{−1} and call it the inverse.
18 / 26
Invertible operators examples
1. The rotation Rθ about the origin is invertible with the inverse R2π−θ.
2. Any reflection H of the plane in any line is self-invertible since H 2 = I leadingto H−1 = H.
3. Projection PW onto a proper subspace W of Rn is NOT invertible. You cannotundo it.
19 / 26
The matrix of the inverse
Corollary 5
Let V be an n-dimensional vector space and B be a basis of V. Suppose that T : V → V is an invertible linear operator. Then the matrix [T]_B is also invertible. Moreover,

([T]_B)^{−1} = [T^{−1}]_B.

Proof.
We apply Theorem 4 to the equation TT^{−1} = T^{−1}T = I to obtain

[T]_B[T^{−1}]_B = [T^{−1}]_B[T]_B = [I]_B = I_n.
This proves the corollary.
20 / 26
Change of the basis
Theorem 6
Let B, C be two bases of a finite-dimensional vector space V over F and T : V → V be a linear operator on V. Then

[T]_C = P_{C←B}[T]_B P_{B←C} = P^{−1}[T]_B P,

where P = P_{B←C}.

Proof.
We have to express [T(v)]_C as a matrix times [v]_C:

[T(v)]_C = P_{C←B}[T(v)]_B = P_{C←B}[T]_B[v]_B = P_{C←B}[T]_B P_{B←C}[v]_C,

hence [T]_C = P_{C←B}[T]_B P_{B←C}.
21 / 26
Example 1
Let T(x) = Ax, where

A = (1/5)[ 3 −4; −4 −3 ].

Find the matrix [T]_B of T relative to the basis B = {v_1, v_2}, where v_1 = (2, −1)^T and v_2 = (1, 2)^T.

We know A = [T]_E where E is the standard basis. Also

P = P_{E←B} = [v_1 v_2] = [ 2 1; −1 2 ].

Since

[T]_B = P_{B←E}[T]_E P_{E←B} = P^{−1}AP = [ 1 0; 0 −1 ],

we see that T is a reflection in v_1.
22 / 26
Example 2
Let T be the linear operator on R^3 with the standard matrix (the matrix of T relative to the standard basis of R^3)

A =
[ 2  −1  2 ]
[ 2   2 −1 ]
[ −1  2  2 ].

Let us find the matrix of T relative to the basis B = {u, v, w}, where

u = (1/√3)(1, 1, 1)^T,   v = (1/√6)(−1, −1, 2)^T,   w = (1/√2)(1, −1, 0)^T.

23 / 26

Let

P = P_{E←B} = [u | v | w] =
[ 1/√3  −1/√6   1/√2 ]
[ 1/√3  −1/√6  −1/√2 ]
[ 1/√3   2/√6   0    ].

Then

[T]_B = P^{−1}AP = 3[ 1  0  0; 0  1/2  −√3/2; 0  √3/2  1/2 ] = 3[ 1  0  0; 0  cos π/3  −sin π/3; 0  sin π/3  cos π/3 ],

and we can see that it is the rotation through 60° about the line x = y = z combined with a uniform magnification by a factor of 3 in all directions.
24 / 26
To Summarise
As a result of today’s lecture (together with your further study) you should
• know the definition of the change-of-basis matrix and its properties;
• be able to construct a change-of-basis matrix relative to two given bases;
• be able to find the new coordinates of a vector if the basis changes;
• find the new matrix of a linear operator if the basis changes.
25 / 26
What to do now
Consolidation: Try exercises: Textbook: p. 489: # 1,3,5,13,15; pp. 531–532: # 13,15,23,27. Lecture Notes: Exercises of Sec. 2.4–2.5.
Next topic: Eigenvectors and eigenvalues.
Colour code: Yellow/Orange
Preparation: Go through the text: Textbook: §4.1, pp. 264–271; §4.3, pp. 303–309. Lecture Notes: Sec. 1.4, 2.1
26 / 26
midterm-sem1-2019.pdf
Department of Mathematics
Maths 253 Midterm Test Due: Friday 3 May, 2019
The test is 90 minutes long. There are three questions (turn over this page). The questions will be marked out of 20 marks, 20 marks and 30 marks, respectively.
Please show all relevant working (and reasoning).
1. Let T : V → V be the linear operator on the quadratics V := span{1, x, x^2} given by

Tf(x) := f(0)x^2 + f′(0)x + (1/2)f″(0).
(a) Find the matrix representation of T with respect to the basis B = [1, x, x2].
(b) Find the eigenvalues of T .
(c) Find the eigenspaces of T .
(d) Is T diagonalisable?
(e) Is T invertible?
(f) Find T 2019.
2. (a) Let 〈·, ·〉 be a real or complex inner product on V, and W be a nonempty subset of V. Show that the orthogonal complement of W, i.e.,

W^⊥ := {v ∈ V : 〈v, w〉 = 0, ∀w ∈ W},

is a subspace of V, i.e., if v_1, v_2 ∈ W^⊥, then so is the linear combination αv_1 + βv_2.
(b) For the Legendre inner product 〈f, g〉 := ∫_{−1}^{1} f(t)g(t) dt on C[−1, 1],

(i) Apply the Gram-Schmidt orthogonalisation to the polynomials 1, x, x^2.
(ii) Find the orthogonal projection of the polynomials 1 + 2x + x^2 and x^3 onto the space of quadratic polynomials span{1, x, x^2}.
(c) Let W := span{(1, 1, 1)^T, (1, −1, 0)^T} and v = (2, 4, 6)^T.
(i) Find the matrix P giving the orthogonal projection onto W .
(ii) Find the orthogonal projection of v onto W .
(iii) Check that Pv is in W and v − Pv is in W ⊥.
Maths 253 Midterm Test Page 1 of 2
3. (a) Let T : V → W be a linear operator between real or complex inner product spaces V and W. Explain how the adjoint (Hermitian transpose) T* : W → V is defined.
(b) Let T : V → V be a linear operator on a real or complex inner product space V .
(i) Define what it means for T to be Hermitian, and show that Hermitian operators have real eigenvalues.

(ii) Define what it means for T, a Hermitian operator, to be positive definite and to be positive semidefinite.
(iii) Let A be an n × m matrix. Show that the square matrix A∗A is positive semidefinite.
(c) Let A = [ −3 1; 1 −3 ].
(i) Given that A is Hermitian, what can be said about its eigenvalues and eigenvectors.
(ii) What is the quadratic form Q(x, y) associated with A.
(iii) Determine whether A (equivalently Q) is positive definite, negative definite or indefinite.
(iv) Write Q(x, y) as a sum of squares.
(v) Find the maximum and minimum of Q on the unit ball B and unit sphere S, i.e.,

B = {x = (x_1, x_2) : x_1^2 + x_2^2 ≤ 1},   S = {x = (x_1, x_2) : x_1^2 + x_2^2 = 1}.
Remark: You need not calculate the points where these extreme values are taken.
Maths 253 Midterm Test Page 2 of 2
mocktest (1).pdf
Department of Mathematics
MATHS 253 Mock semester test Sem. 1, 2021
1. Consider a linear operator T on the vector space R^{2×2} of matrices given by

T(X) = XA + AX,   A = [ 0 1; 2 3 ].

(a) Find the matrix of T relative to the standard basis {E11, E12, E21, E22}, where

E11 = [ 1 0; 0 0 ],   E12 = [ 0 1; 0 0 ],   E21 = [ 0 0; 1 0 ],   E22 = [ 0 0; 0 1 ].
2. Give an example of a 2 × 2 matrix A such that for the operator T : R^2 → R^2 given by T(x) = Ax we have dim ker(T) = dim ran(T) and ker(T) ⊥ ran(T).

Solution. Due to the dimension theorem we must have dim ker(T) = dim ran(T) = 1, so the columns of A are linearly dependent. If we take, for example, A = [ 1 1; a a ], then the null space will be spanned by (1, −1)^T, which will be orthogonal to (1, a)^T only for a = 1. So the simplest choice is

A = [ 1 1; 1 1 ].

The column space is spanned by (1, 1)^T and the null space by (1, −1)^T.
3. Let A ∈ C^{n×n} be a square matrix.

(a) Prove that for any invertible matrix P the characteristic polynomials of A and P^{−1}AP coincide.

(b) Let

p_A(λ) = (−1)^n(λ^n − a_1λ^{n−1} + . . .)

be the characteristic polynomial of A. Prove that a_1 is the sum of all eigenvalues of A.
4. Let R_4[x] be the inner product space of polynomials of degree at most 4 defined on [−1, 1] with inner product

(f, g) = ∫_{−1}^{1} f(x)g(x) dx.

Compute any basis of the orthogonal complement (x^4)^⊥ of x^4 in R_4[x].
5. Suppose the operator T : R^5 → R^5 in the basis B = {e_1, e_2, e_3, e_4, e_5} has the matrix

J_5(2) =
[ 2 1 0 0 0 ]
[ 0 2 1 0 0 ]
[ 0 0 2 1 0 ]
[ 0 0 0 2 1 ]
[ 0 0 0 0 2 ].

Explain, with reason, in which basis T will have the matrix

[ 2 0 0 0 0 ]
[ 1 2 0 0 0 ]
[ 0 1 2 0 0 ]
[ 0 0 1 2 0 ]
[ 0 0 0 1 2 ].
MATHS 253 Mock semester test Page 1 of 2
6. A linear operator T : R^n → R^n has the following standard matrix:

A =
[ a 1 1 . . . 1 ]
[ 1 a 1 . . . 1 ]
[ 1 1 a . . . 1 ]
[ . . . . . . . ]
[ 1 1 1 . . . a ].

(a) Before any calculations are done, say why T is orthogonally diagonalisable.
(b) Spot an eigenvalue λ_1 of T whose geometric multiplicity is n − 1. Express its value in terms of a.
(c) Use orthogonal diagonalisability to find an eigenvector belonging to an eigenvalue λ_2 different from λ_1, and calculate λ_2 (also in terms of a).
MATHS 253 Mock semester test Page 2 of 2
mocktests (1).pdf
Department of Mathematics
MATHS 253 Mock semester test. Solutions Sem. 1, 2021
1. Consider a linear operator T on the vector space R^{2×2} of matrices given by

T(X) = XA + AX,   A = [ 0 1; 2 3 ].

(a) Find the matrix of T relative to the standard basis {E11, E12, E21, E22}, where

E11 = [ 1 0; 0 0 ],   E12 = [ 0 1; 0 0 ],   E21 = [ 0 0; 1 0 ],   E22 = [ 0 0; 0 1 ].
Solution. We have

T(E11) = E11A + AE11 = [ 0 1; 2 0 ] = E12 + 2E21,
T(E12) = E12A + AE12 = [ 2 3; 0 2 ] = 2E11 + 3E12 + 2E22,
T(E21) = E21A + AE21 = [ 1 0; 3 1 ] = E11 + 3E21 + E22,
T(E22) = E22A + AE22 = [ 0 1; 2 6 ] = E12 + 2E21 + 6E22.

Thus the matrix of T will be

[ 0 2 1 0 ]
[ 1 3 0 1 ]
[ 2 0 3 2 ]
[ 0 2 1 6 ].
2. Give an example of a 2 × 2 matrix A such that for the operator T : R^2 → R^2 given by T(x) = Ax we have dim ker(T) = dim ran(T) and ker(T) ⊥ ran(T).

Solution. Due to the dimension theorem we must have dim ker(T) = dim ran(T) = 1, so the columns of A are linearly dependent. If we take, for example, A = [ 1 1; a a ], then the null space will be spanned by (1, −1)^T, which will be orthogonal to (1, a)^T only for a = 1. So the simplest choice is

A = [ 1 1; 1 1 ].

The column space is spanned by (1, 1)^T and the null space by (1, −1)^T.
3. Let A ∈ C^{n×n} be a square matrix.

(a) Prove that for any invertible matrix P the characteristic polynomials of A and P^{−1}AP coincide.

(b) Let

p_A(λ) = (−1)^n(λ^n − a_1λ^{n−1} + . . .)

be the characteristic polynomial of A. Prove that a_1 is the sum of all eigenvalues of A.
Solution. (a) Since det(P^{−1}MP) = det M for any square matrix M, we have
det(P^{−1}AP − λI) = det(P^{−1}AP − λP^{−1}IP) = det(P^{−1}(A − λI)P) = det(A − λI).
(b) By Jacobi’s theorem there exists an invertible matrix P such that
P^{−1}AP = \begin{bmatrix} λ_1 & & ∗ \\ & \ddots & \\ 0 & & λ_n \end{bmatrix}
is upper triangular with the eigenvalues on the diagonal. Its characteristic polynomial is (λ_1 − λ)···(λ_n − λ), in which the coefficient of λ^{n−1} is (−1)^{n−1}(λ_1 + ... + λ_n). By (a) this equals the coefficient of λ^{n−1} in p_A(λ), namely (−1)^{n−1}a_1, so a_1 = λ_1 + ... + λ_n.
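(Outside the test, both parts can be sanity-checked symbolically with SymPy; the matrices below are placeholders chosen only for illustration.)

    import sympy as sp

    lam = sp.symbols('lambda')
    A = sp.Matrix([[0, 1], [2, 3]])   # placeholder square matrix
    P = sp.Matrix([[1, 1], [0, 1]])   # placeholder invertible matrix
    pA = (A - lam*sp.eye(2)).det().expand()
    pB = (P.inv()*A*P - lam*sp.eye(2)).det().expand()
    print(sp.simplify(pA - pB))       # 0: the characteristic polynomials coincide
    # a_1 equals the sum of the eigenvalues, which equals the trace
    print(sum(A.eigenvals(multiple=True)), A.trace())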
4. Let R_4[x] be the inner product space of polynomials of degree at most 4 on [−1, 1] with the inner product
(f, g) = ∫_{−1}^{1} f(x)g(x) dx.
Find a basis of the orthogonal complement (x^4)^⊥ of x^4 in R_4[x].
Solution. We know (x^4)^⊥ is 4-dimensional and we already know two linearly independent vectors in it, namely f_1(x) = x and f_2(x) = x^3 (this is because x^4 is even while x and x^3 are odd). We need to find two even functions orthogonal to x^4. We look for these functions in the form a + bx^2 + cx^4. Then
0 = ∫_{−1}^{1} (a + bx^2 + cx^4)x^4 dx = 2a/5 + 2b/7 + 2c/9.
We need two linearly independent solutions, so we can take a = −(5/7)b with c = 0, and a = 0 with b = −(7/9)c. For example, f_3(x) = −5 + 7x^2 and f_4(x) = −7x^2 + 9x^4 make {f_1(x), f_2(x), f_3(x), f_4(x)} the basis sought.
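(Outside the test, a SymPy sketch confirming that all four basis polynomials are orthogonal to x^4 under the integral inner product:)

    import sympy as sp

    x = sp.symbols('x')
    ip = lambda f, g: sp.integrate(f*g, (x, -1, 1))
    for f in (x, x**3, -5 + 7*x**2, -7*x**2 + 9*x**4):
        print(ip(f, x**4))   # each of the four inner products is 0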
5. Suppose the operator T : R^5 → R^5 has, in the basis B = {e_1, e_2, e_3, e_4, e_5}, the matrix
J_5(2) = \begin{bmatrix} 2 & 1 & 0 & 0 & 0 \\ 0 & 2 & 1 & 0 & 0 \\ 0 & 0 & 2 & 1 & 0 \\ 0 & 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 0 & 2 \end{bmatrix}.
Explain, with reasons, in which basis T will have the matrix
\begin{bmatrix} 2 & 0 & 0 & 0 & 0 \\ 1 & 2 & 0 & 0 & 0 \\ 0 & 1 & 2 & 0 & 0 \\ 0 & 0 & 1 & 2 & 0 \\ 0 & 0 & 0 & 1 & 2 \end{bmatrix}.
Solution. To obtain the new basis it is sufficient to reverse the order of the vectors in B, that is, take
B′ = {e_5, e_4, e_3, e_2, e_1}.
Indeed, in B we have T(e_1) = 2e_1 and T(e_{k+1}) = e_k + 2e_{k+1}; writing f_i = e_{6−i}, this reads T(f_i) = 2f_i + f_{i+1} for i < 5 and T(f_5) = 2f_5, which is exactly what the columns of the second matrix say.
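(Outside the test, a numerical check of this change of basis: conjugating J_5(2) by the permutation matrix that reverses the basis order produces the lower-triangular matrix above.)

    import numpy as np

    J = np.diag([2.0]*5) + np.diag([1.0]*4, k=1)   # J5(2), upper bidiagonal
    P = np.fliplr(np.eye(5))     # columns e5, e4, e3, e2, e1: the reversed basis
    print(np.linalg.inv(P) @ J @ P)   # the transposed Jordan block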
6. A linear operator T : R^n → R^n has the following standard matrix:
A = \begin{bmatrix} a & 1 & 1 & \dots & 1 \\ 1 & a & 1 & \dots & 1 \\ 1 & 1 & a & \dots & 1 \\ \vdots & & & \ddots & \vdots \\ 1 & 1 & 1 & \dots & a \end{bmatrix}.
(a) Before doing any calculations, say why T is orthogonally diagonalisable.
(b) Spot an eigenvalue λ_1 of T whose geometric multiplicity is n − 1. Express its value in terms of a.
(c) Use orthogonal diagonalisability to find an eigenvector belonging to an eigenvalue λ_2 different from λ_1 and calculate λ_2 (also in terms of a).
Solution. (a) T is orthogonally diagonalisable since A is symmetric.
(b) For an eigenvalue of geometric multiplicity n − 1, after subtracting this eigenvalue from the diagonal all rows must be multiples of each other. This can happen only when all rows become (1, 1, ..., 1). Indeed, if we subtract a − 1 from every entry on the main diagonal of A, we get the matrix consisting entirely of ones. The rank of this matrix is 1 and, in particular, its determinant is 0. Hence λ_1 = a − 1 is an eigenvalue and its geometric multiplicity is n − rank(A − λ_1 I) = n − 1. Its eigenspace E_1 is the hyperplane given by the equation x_1 + x_2 + ... + x_n = 0; in particular, v = (1, 1, ..., 1) is the normal vector of this hyperplane and a basis of its orthogonal complement.
(c) Since T is orthogonally diagonalisable, v must be an eigenvector of T. Calculating Av we get Av = (a + (n − 1))v. Thus λ_2 = a + n − 1 is the second eigenvalue. If we choose a basis in E_1 and complement it with v, then in the basis obtained T will have the matrix
D = \begin{bmatrix} λ_1 & 0 & \dots & 0 & 0 \\ 0 & λ_1 & \dots & 0 & 0 \\ \vdots & & \ddots & & \vdots \\ 0 & 0 & \dots & λ_1 & 0 \\ 0 & 0 & \dots & 0 & λ_2 \end{bmatrix}.
There cannot be any other eigenvalue since the characteristic polynomial is then ±(λ − λ_1)^{n−1}(λ − λ_2).
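(Outside the test, a numerical check with sample values; n and a below are arbitrary choices.)

    import numpy as np

    n, a = 6, 2.5                              # sample values, chosen arbitrarily
    A = (a - 1)*np.eye(n) + np.ones((n, n))    # a on the diagonal, 1 elsewhere
    print(np.round(np.linalg.eigvalsh(A), 10))
    # a - 1 = 1.5 repeated n - 1 times, then a + n - 1 = 7.5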
tests-new.pdf
Department of Mathematics
MATHS 253 Semester Test. Solutions 30 August, 2019
1. (15 points) Let T : R_2[x] → R^3 be the mapping defined by the formula
T(f) := (f(0), f(1), f(2))^T.
(a) Calculate T(1 + x + x^2);
(b) Find the matrix of T relative to the basis B = {1, x, x^2} of R_2[x] and the standard basis E of R^3.
Solution. (a) T(1 + x + x^2) = (1, 3, 7)^T;
(b) We have
T(1) = (1, 1, 1)^T,
T(x) = (0, 1, 2)^T,
T(x^2) = (0, 1, 4)^T.
Hence the matrix of T will be
[T]_{E←B} = \begin{bmatrix} 1 & 0 & 0 \\ 1 & 1 & 1 \\ 1 & 2 & 4 \end{bmatrix}.
2. (20 points) Let v = (1, 1, 1)^T ∈ R^3. Find the projection matrices of R^3 onto W = span{v} and onto W^⊥.
Solution. Vector v must be normalised to obtain w = (1/√3)(1, 1, 1)^T. The projection matrix P_W onto W will be
P_W = ww^T = (1/3) \begin{bmatrix} 1 & 1 & 1 \\ 1 & 1 & 1 \\ 1 & 1 & 1 \end{bmatrix}.
As proj_W(x) + proj_{W^⊥}(x) = x, we have P_W + P_{W^⊥} = I_3, the identity matrix. Therefore,
P_{W^⊥} = I_3 − P_W = (1/3) \begin{bmatrix} 2 & −1 & −1 \\ −1 & 2 & −1 \\ −1 & −1 & 2 \end{bmatrix}.
Alternatively, we can find an orthonormal basis of W^⊥ and substitute it into the formula for the projection matrix. Such a basis would be, for example, w_1 = (1/√2)(1, −1, 0)^T and w_2 = (1/√6)(1, 1, −2)^T. Then
P_{W^⊥} = w_1w_1^T + w_2w_2^T = (1/2) \begin{bmatrix} 1 & −1 & 0 \\ −1 & 1 & 0 \\ 0 & 0 & 0 \end{bmatrix} + (1/6) \begin{bmatrix} 1 & 1 & −2 \\ 1 & 1 & −2 \\ −2 & −2 & 4 \end{bmatrix} = (1/3) \begin{bmatrix} 2 & −1 & −1 \\ −1 & 2 & −1 \\ −1 & −1 & 2 \end{bmatrix}.
Comment: The main difficulty was to find a basis of W^⊥, which was practised a lot in MATHS 250. A typical mistake was either not normalising a basis of W^⊥, or not orthogonalising it, or both.
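(Outside the test, a NumPy sketch reproducing both projection matrices from the rank-one formula P_W = vv^T / (v^T v):)

    import numpy as np

    v = np.array([[1.0], [1.0], [1.0]])
    P_W = (v @ v.T) / (v.T @ v)    # projection onto span{v}; here v^T v = 3
    P_Wperp = np.eye(3) - P_W      # projection onto the orthogonal complement
    print(3*P_W)                   # the all-ones matrix
    print(3*P_Wperp)               # [[2 -1 -1] [-1 2 -1] [-1 -1 2]]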
3. (15 points) In the inner product space C_R[−1, 1] with the integral inner product
(f, g) = ∫_{−1}^{1} f(x)g(x) dx,
use the Gram–Schmidt orthogonalisation to orthogonalise the system of two polynomials
f_1(x) = x^2, f_2(x) = x^2 + x + 1.
Solution. We set
g_1(x) = f_1(x) = x^2,
g_2(x) = f_2(x) − ((f_2(x), g_1(x))/(g_1(x), g_1(x))) g_1(x) = x^2 + x + 1 − (8/3)x^2 = −(5/3)x^2 + x + 1,
since
(g_1(x), g_1(x)) = ∫_{−1}^{1} g_1(x)^2 dx = ∫_{−1}^{1} x^4 dx = 2/5,
(f_2(x), g_1(x)) = ∫_{−1}^{1} f_2(x)g_1(x) dx = ∫_{−1}^{1} x^2(x^2 + x + 1) dx = ∫_{−1}^{1} (x^4 + x^2) dx = 2/5 + 2/3 = 16/15.
Note that in this calculation we have used that x^2 ⊥ x.
Comment: Poor integration skills have let many students down.
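(Outside the test, the Gram–Schmidt step can be replayed in SymPy, which also confirms the orthogonality of the result:)

    import sympy as sp

    x = sp.symbols('x')
    ip = lambda f, g: sp.integrate(f*g, (x, -1, 1))   # the integral inner product
    f1, f2 = x**2, x**2 + x + 1
    g1 = f1
    g2 = sp.expand(f2 - ip(f2, g1)/ip(g1, g1)*g1)
    print(g2)          # -5*x**2/3 + x + 1
    print(ip(g1, g2))  # 0, so g1 ⊥ g2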
4. (15 points) Write down a 5 × 5 matrix over R whose eigenvalues are λ_1 = 1 and λ_2 = 2, of algebraic multiplicities 2 and 3, respectively, and geometric multiplicities 2 and 1, respectively. Justify your answer.
Solution. For example,
A = \begin{bmatrix} 1 & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 \\ 0 & 0 & 2 & 1 & 0 \\ 0 & 0 & 0 & 2 & 1 \\ 0 & 0 & 0 & 0 & 2 \end{bmatrix}.
The geometric multiplicity of λ_1 is
5 − rank(A − I_5) = 5 − rank \begin{bmatrix} 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & 1 & 0 \\ 0 & 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix} = 2
and the geometric multiplicity of λ_2 is
5 − rank(A − 2I_5) = 5 − rank \begin{bmatrix} −1 & 0 & 0 & 0 & 0 \\ 0 & −1 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix} = 1.
5. (20 points) Consider the function
f(x) = \begin{cases} −1 & for −1 ≤ x < 0, \\ 1 & for 0 ≤ x ≤ 1 \end{cases}
defined on [−1, 1].
(a) Is f(x) odd? Is it even?
(b) A trigonometric polynomial of order one
t(x) = a_0 + a_1 cos πx + b_1 sin πx
is the first-order Fourier approximation to f(x).
(c) Find a_0, a_1, b_1.
Solution. (a) f is odd (but see comment);
(b) Since f is odd, a_0 = a_1 = 0. We also have
b_1 = ∫_{−1}^{1} f(x) sin πx dx = 2 ∫_{0}^{1} sin πx dx = 4/π ≈ 1.2732.
Comments: 1) Strictly speaking, f(x) is not odd since f(0) ≠ 0, and several students pointed this out. However, for integration purposes it may still be considered odd, since the condition f(−x) = −f(x) is violated at a single point only. If you wrote that it is neither odd nor even, citing f(0) ≠ 0, or you wrote that f is odd, both answers were accepted. Again, poor integration skills let many people down.
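(Outside the test, a SymPy check of the integral:)

    import sympy as sp

    x = sp.symbols('x')
    # b1 = ∫_{-1}^{1} f(x) sin(πx) dx; by the (effective) oddness of f this is
    # twice the integral over [0, 1]
    b1 = 2*sp.integrate(sp.sin(sp.pi*x), (x, 0, 1))
    print(b1, sp.N(b1))   # 4/pi, approximately 1.2732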
6. (15 points)
(a) Explain what it means for an operator T : R^n → R^n to be orthogonally diagonalisable.
(b) Find all real values x, y, z that make the operator T(x) = Ax, where
A = \begin{bmatrix} x & 1 & 2 \\ y & 3 & z \\ 2 & 3 & 1 \end{bmatrix},
orthogonally diagonalisable. Give reasons.
Solution. (a) An operator T : R^n → R^n is orthogonally diagonalisable if there is an orthonormal basis of R^n consisting of eigenvectors of T.
(b) We learned a theorem stating that only operators with a symmetric standard matrix are orthogonally diagonalisable. Hence y = 1 and z = 3, giving
A = \begin{bmatrix} x & 1 & 2 \\ 1 & 3 & 3 \\ 2 & 3 & 1 \end{bmatrix},
where x is an arbitrary real number.
Comment: This question is very easy to answer if you study the material properly. It required formulating the definition, stating the theorem and drawing an easy conclusion from it. Many students do NOT distinguish between a definition and a theorem. If you got low marks for this question, you have to reconsider the way you learn.
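(Outside the test, a NumPy illustration that a symmetric choice of A is indeed orthogonally diagonalisable; x = 5 is an arbitrary sample value.)

    import numpy as np

    A = np.array([[5.0, 1.0, 2.0],    # x = 5 chosen arbitrarily; y = 1, z = 3 forced
                  [1.0, 3.0, 3.0],
                  [2.0, 3.0, 1.0]])
    lam, Q = np.linalg.eigh(A)        # eigh handles symmetric matrices
    print(np.allclose(Q.T @ Q, np.eye(3)))         # True: orthonormal eigenvectors
    print(np.allclose(Q @ np.diag(lam) @ Q.T, A))  # True: A = Q D Q^T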