Real symmetric matrices are diagonalizable
Here's my explanation for the diagonalizability of real symmetric matrices. There are lots of proofs and ways to show it, but many of them use facts that my students (in a typical sophomore linear algebra course) don't know, such as the details of Jordan Canonical Form, or seem to use too much machinery. What I have here requires these ideas:
eigenvalues and eigenvectors
eigenvalues of a real symmetric matrix are all real
inner product machinery, including ideas of orthogonal complement of a subspace
idea of an invariant subspace
I've looked in several linear algebra books and couldn't find any of them that address this issue at this level; all of them are (understandably) bringing in ideas of JCF, Schur's theorem/triangular form, or do things like have multiple matrix representations (of different sizes!) of a linear transformation -- one for the entire vector space, another for the transformation as it acts on an invariant subspace, which seems unnecessarily complicated.
My audience here is people who teach linear algebra, although this document right now is a mish-mash of what I would write for instructors and what I would write for students.
The idea that makes our recursion step work is this lemma.
Linear operators have eigenvectors
Lemma: Let be a linear operator on and a nontrivial subspace of . If is invariant under , then has an eigenvector in .
The proof is the standard determinant-free proof of the existence of eigenvectors from Axler's Down With Determinants: choose any nonzero vector . Denote the dimension of with ; then, since is invariant under , the set consists of vectors in a -dimensional space and is therefore linearly dependent. That means there is a nontrivial collection of coefficients so that [ a_0 v + a_1 L(v) + \cdots +a_k L^k(v) = \vec{0}. ] It might be the case that ; let be the largest index so that . The polynomial factors (over the complex numbers!) into linear factors: [ 0 = \sum_{i=0}^j a_i t^i = c(t - r_1)(t - r_2)\cdots (t-r_j) ] Then (I find this to be a bit of a leap, although it's plausible enough...a typical engineering student will likely find the switch from a polynomial in to a polynomial in to be a bit weird, though) the same equation must be true when we use : [ \vec{0} = \left(\sum_{i=0}^j a_i L^i\right)v = \left(c(L - r_1 I)(L - r_2 I)\cdots (L-r_j I)\right)v ] One of the factors must send a nonzero vector to zero...that is, has an eigenvector with eigenvalue .
A bit of cleanup
One last detail: above, could in principle be complex. But if you prove separately that real symmetric matrices only have real eigenvalues, you can then argue that is actually real.
Diagonalizability
The idea here is: find an eigenvector, show that the orthogonal complement is invariant, recurse.
Let be a real symmetric matrix / symmetric linear operator on .
Base case: is invariant under , so has an eigenvector . Let be the orthogonal complement of (the subspace spanned by) .
is invariant under : this is where we really need the symmetry of . Say . Is also in ? Yes: [ (L(w), v_1) = (w, L(v_1)) = (w, \lambda_1 v_1) = \lambda_1 \cdot 0 = 0. ] Since is orthogonal to , it is in .
Apply the above lemma: has an eigenvector in . That eigenvector is orthogonal to , of course.
Recurse: let be the orthogonal complement of . Is invariant under ? Well, certainly is in , since is invariant, but you can use the above argument to show that is orthogonal to for any , so is invariant and once again has an eigenvector in , which must be orthogonal to and .
The recursion bottoms out when you get to , which must be 1-dimensional. You find an eigenvector there, and then you have orthogonal eigenvectors...which must be linearly independent. Ah! You have an orthogonal basis of eigenvectors. Normalize all the lengths to and you have a lovely orthonormal basis of eigenvectors.
See? That was easy. Beyond the usual stuff to understand diagonalizability, you need some basic inner product machinery (which can be phrased entirely in the language of dot products and stuff) and the idea of an invariant subspace, which is only epsilon away from the idea of diagonalizability anyway. I don't know why my textbook (Kolman & Hill) punts on this.
Claim: the subspace of vectors of the form [ \vec{x} = a + b ] for some and is invariant under this matrix.
Let's compute.
...which equals [ (2a + b) + 2b . ] (Don't worry about how I got that, just make sure you understand that is still in the subspace, which means that subspace is invariant under .
Now let's find an eigenvector. The above proof says we can take any nonzero vector in the space (which is 2-dimensional) and find a linear dependence relation among :
We want solutions to the corresponding homogeneous system, so put those columns into a matrix and row reduce:
So, :
Let's factor the corresponding polynomial . We can do this by hand, but let's make Sage do it.
Now let's start multiplying to find an eigenvector:
That's not zero, keep going.
So is an eigenvector in this invariant subspace:
A subspace not invariant under
Let's see that the span of and is not an invariant subspace for this linear transformation.
Is there a linear combination of and that equals the above vector?
So...no. For some special choices of and , there are solutions, but not always. So that subspace is not invariant.
An example of the algorithm
Above I just demonstrated invariant subspaces, which can work for any linear transformation or matrix, not necessarily real symmetric ones. Let's now work through the algorithm described above.
So (-sqrt(6) + 2, 1, -1) is an eigenvector for . We need to work in the orthogonal complement of that vector, and take a nonzero vector there to find another eigenvector. Note that (0,1,1) is orthogonal to that vector; let's use that.
This one is so simple we don't even need to factor the polynomial: is in the kernel of that matrix, which means [ -3 \vec{x} + A \vec{x} = \vec{0} ] or... [ A \vec{x} = 3 \vec{x}. ] Hey, our guess is actually an eigenvector!
Now we need a vector orthogonal to the first two eigenvectors. The orthogonal complement must be 1-dimensional, so any nonzero vector in there will be an eigenvector. Let's put the 2 known eigenvectors into a matrix as rows and find the nullspace of that matrix -- any vector there is orthogonal to both eigenvectors.
So the vector is in the kernel / is an eigenvector:
Now let's check that we have an orthogonal basis. I'll put the vectors into a matrix and multiply by the transpose, which does all the dot products. We should get a diagonal matrix.
Now let's normalize the vectors to get an orthonormal basis.
Now we'll use to diagonalize :