Structure of Linear Operators
On this page:
- Introduction
- $k[x]$-modules
- Minimal Polynomial
- Cyclic Subspaces
- Structure of f.g. Modules over a PID
- Characteristic Polynomial
- Companion Matrices
- Rational Canonical Form
- Magic Formula: $\chi_A(x) = \det(xI - A)$ :)
- Worked Exercises
Introduction
In linear algebra, it is of much interest to study the structure of linear operators on finite-dimensional vector spaces. Our reasons are twofold:
- We can make statements about the vector space itself.
- We can study various properties of the operator itself.
A lot of resources that I have looked at attempt to derive the majority of this theory without resorting to module theory. I personally think that neglecting a module-theoretic perspective makes linear operator structure theory really dry, so I readily use the theory of modules over PIDs.
$k[x]$-modules
Throughout this page, let $V$ be a finite-dimensional $k$-vector space. Given a linear operator $T \in \End_k(V)$, there is a natural $k[x]$-module structure associated with $V$ paired with $T$. In particular, if $p(x) \in k[x]$ and $v\in V$, then
\[p(x) \cdot v \coloneqq p(T)v.\]Since $V$ is finite-dimensional as a $k$-vector space, it is necessarily finitely generated as a $k[x]$-module. We now point out some key facts:
[theorem] [thmtitle]Proposition %counter% ($V$ is a torsion $k[x]$-module)[/thmtitle]
Let $V$ be a finite-dimensional $k$-vector space and $T$ be a linear operator on $V$. Then $V$ is a torsion $k[x]$-module. [/theorem]
[proof] Recall that if $n \coloneqq \dim V$, then $\dim\End_k(V) = n^2$ by choosing a basis and establishing the linear isomorphism $\End_k(V) \cong M_{n}(k)$. This implies that the $n^2 + 1$ elements
\[1, \quad T, \quad T^2, \quad\ldots, \quad T^{n^2}\]are linearly dependent in $\End_k(V)$. Thus, there exists some nonzero $p(x) \in k[x]$ for which $p(T) = 0$. Since this means $p(x)$ annihilates the entirety of $V$, it follows that $V$ is torsion. [/proof]
[example] [extitle]Remark %counter%[/extitle]
Recall that $k[x]$ is a PID. The proposition that was just proved shows that the annihilator ideal $\ann_{k[x]}(V) \unlhd k[x]$ is a nonzero principal ideal. This will be an incredibly important idea as we walk through the following material. [/example]
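To make this $k[x]$-action concrete, here is a minimal computational sketch in pure Python with exact rational arithmetic over $k = \mathbb{Q}$ (the matrix `T` and the function names are illustrative choices, not part of the theory):

```python
from fractions import Fraction

def mat_vec(A, v):
    """Apply a matrix to a vector."""
    return [sum(A[i][j] * v[j] for j in range(len(v))) for i in range(len(A))]

def poly_action(coeffs, T, v):
    """The k[x]-module action: p(x) . v = p(T)v, where
    p(x) = coeffs[0] + coeffs[1] x + ... + coeffs[d] x^d."""
    result = [Fraction(0)] * len(v)
    power = list(v)                        # T^0 v = v
    for c in coeffs:
        result = [r + c * w for r, w in zip(result, power)]
        power = mat_vec(T, power)          # advance to the next power of T applied to v
    return result

# A rotation-like operator with T^2 = -I, so p(x) = x^2 + 1 annihilates everything.
T = [[Fraction(0), Fraction(-1)],
     [Fraction(1), Fraction(0)]]
v = [Fraction(1), Fraction(0)]
assert poly_action([1, 0, 1], T, v) == [0, 0]   # (x^2 + 1) . v = 0
```

Here $x^2 + 1$ annihilates all of $V$, illustrating that the annihilator ideal is nonzero, exactly as the proposition asserts.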
In elementary linear algebra, one learns about the invariant subspaces of a vector space and how they interact with the Jordan canonical form of a matrix. Personally, I did not understand this definition very well until I saw the following interpretation of it:
[definition] [deftitle]Definition %counter% (Invariant subspaces)[/deftitle]
Let $V$ be a finite-dimensional vector space and $T$ be a linear operator on $V$. If $S$ is a $k[x]$-submodule of $V$, then we call $S$ a $T$-invariant subspace of $V$. [/definition]
[theorem] [thmtitle]Theorem %counter% (Characterization of $T$-invariance)[/thmtitle]
Let $T$ be a linear operator on $V$. Then $S$ is a $T$-invariant subspace of $V$ if and only if $TS \subseteq S$. [/theorem]
[proof] ($\implies$) Since $S$ is a $k[x]$-submodule of $V$, we know that $T(s) \in S$ for any $s \in S$. So we certainly have $TS \subseteq S$ as desired.
($\impliedby$) Conversely, suppose $TS \subseteq S$. Then $T^n s \in S$ for every $s \in S$ by induction on $n$, so $p(T)s \in S$ for every $p(x) \in k[x]$. Thus $S$ is closed under the $k[x]$-action and is a $k[x]$-submodule. [/proof]
One final result we will mention in this section is the characterization of $k[x]$-module isomorphisms.
[theorem] [thmtitle]Theorem %counter% (Characterization of $k[x]$-module isomorphisms)[/thmtitle]
Let $T_1$ and $T_2$ be linear operators on $V$ and call the associated $k[x]$-modules $V_{T_1}$ and $V_{T_2}$. Then $V_{T_1}$ and $V_{T_2}$ are isomorphic as $k[x]$-modules if and only if $T_1 \sim T_2$. [/theorem]
[proof] ($\implies$) Let $\phi:V_{T_1} \to V_{T_2}$ be a $k[x]$-module isomorphism. Then
\[\phi(T_1v) = T_2\phi(v).\]Thus,
\[T_1v = \phi^{-1}T_2\phi(v)\]for every $v \in V$, so $T_1 = \phi^{-1}T_2\phi$. Since $\phi$ is in particular a $k$-vector space isomorphism, this says exactly that $T_1 \sim T_2$.
($\impliedby$) For the converse direction, pick $\phi\in\Aut_k(V)$ such that $T_1 = \phi^{-1}T_2\phi$, i.e. $\phi(T_1v) = T_2\phi(v)$ for all $v \in V$. By induction, $\phi(T_1^nv) = T_2^n\phi(v)$, so for every $p(x) \in k[x]$ we have
\[\phi(p(x)\cdot v) = \phi(p(T_1)v) = p(T_2)\phi(v) = p(x)\cdot\phi(v).\]Thus $\phi$ is a $k[x]$-module isomorphism from $V_{T_1}$ to $V_{T_2}$. [/proof]
Minimal Polynomial
There are two ways of defining the minimal polynomial. In treatments that disregard modules, the minimal polynomial is defined as the unique monic polynomial $p(x) \in k[x]$ of minimal degree such that $p(T) = 0$. This is how I first learned about the minimal polynomial, and I found this definition to be very strange and unmotivated. I personally understand the minimal polynomial better via the following definition:
[definition] [deftitle]Definition %counter% (Minimal polynomial)[/deftitle]
Let $V$ be a finite-dimensional vector space and $T$ a linear operator on $V$. Then the minimal polynomial $m_T(x)$ of $T$ is the unique monic generator of $\ann_{k[x]}(V) \unlhd k[x]$. [/definition]
Asking for the generator of a principal ideal is a very natural ring-theoretic question. And, from UFD theory, we know that monic polynomials, in particular, are nice to deal with.
Since the definition of the minimal polynomial for a matrix $A \in M_n(k)$ is practically identical to the one we just gave, we’ll omit writing it formally.
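Since the minimal polynomial is the monic generator of the annihilator, it can be computed directly as the first linear dependence among $I, A, A^2, \ldots$, exactly as in the torsion proposition above. Below is a sketch of that computation in pure Python over $k = \mathbb{Q}$ (the function name and the convention of representing polynomials as ascending coefficient lists are my own):

```python
from fractions import Fraction

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def minimal_polynomial(A):
    """Return the coefficients [a_0, ..., a_{d-1}, 1] of the monic m_A(x),
    found as the first linear dependence among I, A, A^2, ... ."""
    n = len(A)
    P = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # P = A^0 = I
    basis = []  # pairs (reduced flattened power, its expression in A^0, ..., A^k)
    k = 0
    while True:
        row = [P[i][j] for i in range(n) for j in range(n)]   # flatten A^k
        combo = [Fraction(0)] * k + [Fraction(1)]             # row == 1 * A^k
        for brow, bcombo in basis:                            # forward elimination
            piv = next(i for i, x in enumerate(brow) if x != 0)
            if row[piv] != 0:
                f = row[piv] / brow[piv]
                row = [r - f * b for r, b in zip(row, brow)]
                combo = [c - f * (bcombo[i] if i < len(bcombo) else 0)
                         for i, c in enumerate(combo)]
        if all(x == 0 for x in row):      # A^k is a combination of lower powers:
            return combo                  # combo encodes the monic dependence
        basis.append((row, combo))
        P = mat_mul(P, A)
        k += 1

A = [[Fraction(2), Fraction(1)], [Fraction(0), Fraction(2)]]
assert minimal_polynomial(A) == [4, -4, 1]   # m_A(x) = x^2 - 4x + 4 = (x - 2)^2
```

The bookkeeping vector `combo` records how each power of $A$ reduces against earlier powers, so when a dependence is found it is already the monic generator.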
Incredibly important to mention are the following:
[theorem] [thmtitle]Theorem %counter% ($m_T(x)$ is a similarity invariant)[/thmtitle]
Let $T_1$ and $T_2$ be similar linear operators on $V$. Then $m_{T_1}(x) = m_{T_2}(x)$. [/theorem]
[proof] Since $T_1 \sim T_2$, there is some $\phi\in\Aut_k(V)$ for which
\[T_1 = \phi T_2 \phi^{-1}.\]So it follows that
\[0 = m_{T_1}(T_1) = m_{T_1}(\phi T_2 \phi^{-1}).\]Say that $m_{T_1} = \sum_{i} a_i x^i$. Then
\[\begin{align*} m_{T_1}(\phi T_2 \phi^{-1}) &= \sum_{i} a_i (\phi T_2 \phi^{-1})^i \\ &= \phi\left(\sum_{i} a_i T_2^i\right)\phi^{-1} \\ &= \phi m_{T_1}(T_2) \phi^{-1}. \end{align*}\]So it follows that $m_{T_1}(T_2) = 0$. Thus,
\[\generator{m_{T_1}(x)} \subseteq \generator{m_{T_2}(x)}.\]An identical argument gives the $\supseteq$ direction. Thus, $m_{T_1}(x)$ and $m_{T_2}(x)$ are associates in the ring $k[x]$. Since both polynomials are monic, it follows that $m_{T_1}(x) = m_{T_2}(x)$ and we are done. [/proof]
[theorem] [thmtitle]Theorem %counter% ($m_A(x)$ is a similarity invariant)[/thmtitle]
Let $A$ and $B$ be similar matrices in $M_n(k)$. Then $m_{A}(x) = m_{B}(x)$. [/theorem]
[proof] Same argument as above. [/proof]
[theorem] [thmtitle]Theorem %counter% ($m_T(x) = m_{A}(x)$)[/thmtitle]
Let $A \in M_n(k)$ represent the linear operator $T$ on $V$. Then $m_T(x) = m_A(x)$. [/theorem]
[proof] Since $A$ represents $T$, it follows that $m_T(A)$ represents $m_T(T)$. Therefore, $m_T(A) = 0$ and so we obtain
\[\generator{m_T(x)} \subseteq \generator{m_A(x)}.\]A similar argument gives the $\supseteq$ direction from which it follows that $m_T(x)$ and $m_A(x)$ are associates. As both are monic, it follows that $m_T(x) = m_A(x)$ and we are done. [/proof]
Cyclic Subspaces
Eventually, we will want to recast the structure theorems for f.g. modules over a PID in terms of our $k[x]$-modules. Accordingly, we need to spend some time studying the cyclic $k[x]$-submodules of $V$. As it turns out, this amounts to studying the cyclic subspaces of our vector space.
[definition] [deftitle]Definition %counter% ($T$-cyclic subspace)[/deftitle]
Let $T \in \End_k(V)$. A $T$-invariant subspace $S$ of $V$ is $T$-cyclic if $S$ has a basis of the form
\[\mathcal{B} = \{v, Tv, \ldots, T^{n-1}v\}\]for some $v \in V$ and $n \in \ZZ_{>0}$. The basis $\mathcal{B}$ is called a $T$-cyclic basis for $S$. [/definition]
[theorem] [thmtitle]Theorem %counter% (Cyclic submodules = cyclic subspaces)[/thmtitle]
The cyclic $k[x]$-submodules of $V$ are the $T$-cyclic subspaces of $V$. In particular, the cyclic submodules with minimal polynomial of degree $n$ are the $T$-cyclic subspaces of $V$ with dimension $n$. [/theorem]
[proof] We first show that any cyclic $k[x]$-submodule of $V$ is a $T$-cyclic subspace of $V$. Let $v\in V$ and consider
\[\generator{v} = \{p(x)v \mid p(x) \in k[x]\}.\]It is immediate that this is a vector space. Let $m(x)$ be the minimal polynomial of $T|_{\generator{v}}$ and say it has degree $n$. We claim that
\[\mathcal{B} = \{v, Tv, \ldots, T^{n-1}v\}\]forms a $T$-cyclic basis for $\generator{v}$.
(Spanning) Given any $p(x) \in k[x]$, we have
\[p(x) = m(x)q(x) + r(x)\]by the Euclidean division algorithm. Furthermore,
\[p(T|_{\generator{v}}) = m(T|_{\generator{v}})q(T|_{\generator{v}}) + r(T|_{\generator{v}}) = r(T|_{\generator{v}}).\]Since $m(x)$ has degree $n$, we know that $r(x) = 0$ or $\deg r(x) < n$. So every $p(x)v = r(x)v$ is a linear combination of the elements of $\mathcal{B}$. Thus, $\mathcal{B}$ forms a spanning set for $\generator{v}$.
(Linear independence) A nontrivial linear combination of the elements of $\mathcal{B}$ equal to zero would produce a nonzero polynomial of degree less than $n$ annihilating $\generator{v}$, contradicting the minimality of $m(x)$.
It is straightforward to see that $T$-cyclic subspaces are, in fact, cyclic $k[x]$-submodules by applying the $T$-invariance assumption. For the final claim, we apply the linear independence of $\mathcal{B}$ to show that $m(x)$ has precisely degree $n$. [/proof]
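The proof above is effectively an algorithm: iterate $v, Tv, T^2v, \ldots$ until the first linear dependence appears. A sketch in pure Python over $k = \mathbb{Q}$ (the function name is my own choice):

```python
from fractions import Fraction

def cyclic_basis(T, v):
    """Return the T-cyclic basis {v, Tv, ..., T^{d-1}v} of <v>, where d is the
    degree of the minimal polynomial of T restricted to <v>."""
    def mat_vec(A, w):
        return [sum(A[i][j] * w[j] for j in range(len(w))) for i in range(len(A))]
    basis, echelon = [], []
    w = list(v)
    while True:
        r = list(w)
        for row in echelon:               # forward-reduce against stored rows
            piv = next(i for i, x in enumerate(row) if x != 0)
            if r[piv] != 0:
                f = r[piv] / row[piv]
                r = [a - f * b for a, b in zip(r, row)]
        if all(x == 0 for x in r):        # T^k v depends on earlier iterates: stop
            return basis
        basis.append(w)
        echelon.append(r)
        w = mat_vec(T, w)

# A nilpotent shift: e3 -> e2 -> e1 -> 0, so <e3> is all of k^3, while <e1> has dimension 1.
T = [[Fraction(0), Fraction(1), Fraction(0)],
     [Fraction(0), Fraction(0), Fraction(1)],
     [Fraction(0), Fraction(0), Fraction(0)]]
assert len(cyclic_basis(T, [Fraction(0), Fraction(0), Fraction(1)])) == 3
assert len(cyclic_basis(T, [Fraction(1), Fraction(0), Fraction(0)])) == 1
```

Note how the two generators give cyclic subspaces of different dimensions, matching the degree of the minimal polynomial of the restricted operator in each case.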
Structure of f.g. Modules over a PID
In this section, we will state the structure theorems for finitely generated modules over a PID and begin translating them into a linear-algebraic setting.
[theorem] [thmtitle]Theorem %counter% (Primary decomposition)[/thmtitle]
Let $M$ be a f.g. torsion module over a PID $R$. If
\[m = \prod_{i=1}^n p_i^{e_i}\]generates the annihilator $\ann_R(M)$ and the $p_i$’s are pairwise nonassociate primes in $R$, then $M$ can be decomposed as
\[M = \bigoplus_{i=1}^n M_{p_i}\]where each
\[M_{p_i} \coloneqq \frac{m}{p_i^{e_i}} M = \{v \in M \mid p_i^{e_i} v = 0\}\]is a primary submodule with annihilator $\generator{p_i^{e_i}}$. [/theorem]
[theorem] [thmtitle]Theorem %counter% (Cyclic decomposition of primary modules)[/thmtitle]
Let $M$ be a primary f.g. torsion module over a PID $R$ with annihilator $\generator{p^{e}}$. Then $M$ has the decomposition
\[M = \bigoplus_{i=1}^n \generator{m_i}\]such that $\ann_R{\generator{m_i}} = \generator{p^{e_i}}$ and
\[\ann_R{\generator{m_1}} \subseteq \cdots \subseteq \ann_R{\generator{m_n}}.\]Equivalently,
\[p^{e_n} \mid \cdots \mid p^{e_1} = p^{e}.\][/theorem]
In our particular case, we have the $k$-vector space $V$ interpreted as a f.g. torsion $k[x]$-module, and we can take $m$ to be the minimal polynomial $m_T(x)$.
[theorem] [thmtitle]Theorem %counter% (Primary decomposition for vector spaces)[/thmtitle]
Let $V$ be a finite-dimensional vector space and $T$ a linear operator on $V$. Let
\[m_T(x) = \prod_{i=1}^n p_i^{e_i}(x)\]be the minimal polynomial of $T$, where the $p_i$'s are distinct monic irreducible polynomials in $k[x]$. Then $V$ has the $k[x]$ direct sum decomposition
\[V = \bigoplus_{i=1}^n V_{p_i}\]where each
\[V_{p_i} = \frac{m_T(x)}{p_i^{e_i}(x)} V = \{v \in V \mid p_i^{e_i}(T) v = 0\}\]is a primary submodule of $V$ with annihilator $\generator{p_i^{e_i}(x)}$. That is, $V_{p_i}$ is a $T$-invariant subspace of $V$ and the minimal polynomial of $T|_{V_{p_i}}$ is $p_i^{e_i}(x)$. [/theorem]
This latter theorem, after some careful bookkeeping, leads to the two “very important” theorems: the invariant factor decomposition and the elementary divisor decomposition. Those in turn lead to the rational and Jordan canonical forms. In any case, let’s just keep pushing structure theorems.
This next theorem arises by decomposing the summands of the primary decomposition (they are primary submodules!). For instance, say that
\[M_{p_i} = \bigoplus_{j=1}^{k_i} \generator{m_{i,j}}.\]Then our primary decomposition becomes the rather gross-looking
\[M = \bigoplus_{i=1}^n \bigoplus_{j=1}^{k_i} \generator{m_{i,j}}.\]Each $\generator{m_{i,j}}$ has an annihilator $\generator{p_{i,j}^{e_{i,j}}}$ and we call $p_{i,j}^{e_{i,j}}$ an elementary divisor of $M$. Furthermore, we also know that
\[p_{i,k_i}^{e_{i,k_i}} \mid p_{i,k_i-1}^{e_{i,k_i-1}} \mid \cdots \mid p_{i,2}^{e_{i,2}} \mid p_{i,1}^{e_{i,1}}.\]Getting dangerously close to Riemannian geometry-style notation, notice that this tells us we may rearrange the summands as
\[\begin{align*} D_i &\coloneqq \generator{m_{1,i}} \oplus \cdots \oplus \generator{m_{n,i}}, \end{align*}\]omitting the summands $\generator{m_{j,i}}$ that do not exist when $i > k_j$. This means that we take the highest power of each prime for $D_1$, the second-highest power of each prime for $D_2$, and so on. We formalize this as the following:
[theorem] [thmtitle]Theorem %counter% (Invariant factor decomposition)[/thmtitle]
Let $V$ be a finite-dimensional vector space and $T$ a linear operator on $V$. Then
\[V = \bigoplus_{i=1}^m V_{d_i}\]where each $V_{d_i}$ is a cyclic $k[x]$-submodule of $V$ with annihilator $\generator{d_i(x)}$ and
\[d_m(x) \mid d_{m-1}(x) \mid \cdots \mid d_2(x) \mid d_1(x) = m_T(x).\]The $d_i(x)$’s are called the invariant factors of $V$ and $m_T(x)$ is the minimal polynomial of $T$. [/theorem]
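As a concrete (hypothetical) illustration of how elementary divisors regroup into invariant factors: suppose the elementary divisors of $V$ are $(x-1)^2$, $(x-1)$, and $(x-2)^3$. Taking the highest power of each prime for $d_1(x)$ and the next-highest for $d_2(x)$ gives
\[\begin{align*} d_1(x) &= (x-1)^2(x-2)^3 = m_T(x), \\ d_2(x) &= (x-1), \end{align*}\]so that $d_2(x) \mid d_1(x)$ and the product of the invariant factors is $(x-1)^3(x-2)^3$, the product of all the elementary divisors.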
The elementary divisor decomposition arises from the primary decomposition immediately after decomposing each individual primary submodule.
[theorem] [thmtitle]Theorem %counter% (Elementary divisor decomposition)[/thmtitle]
Let $V$ be a finite-dimensional vector space and $T$ a linear operator on $V$. Then
\[V = \bigoplus_{i=1}^n \bigoplus_{j=1}^{k_i} \generator{v_{i,j}}\]where each $\generator{v_{i,j}}$ has annihilator $\generator{p_{i,j}^{e_{i,j}}(x)}$. Each $p_{i,j}^{e_{i,j}}(x)$ is called an elementary divisor of $V$. [/theorem]
Characteristic Polynomial
In elementary linear algebra, we are told that the characteristic polynomial is this polynomial that is magically produced from the formula
\[\chi_A(\lambda) = \det(\lambda I - A).\]While this is true, and it actually makes computing the characteristic polynomial easy (well, for small matrices at least), it makes proofs of theorems like the Cayley-Hamilton theorem horribly technical and dry. I personally had a much better experience viewing the characteristic polynomial as the following:
[definition] [deftitle]Definition %counter% (Characteristic polynomial)[/deftitle]
Let $T$ be a linear operator on $V$. The characteristic polynomial $\chi_T(x)$ of $T$ is the product of all the elementary divisors of $T$. That is,
\[\chi_T(x) = \prod_{i,j} p_i^{e_{i,j}}(x).\][/definition]
Notice that this implies that $\deg(\chi_T(x)) = \dim_k(V)$: by the cyclic subspace theorem above, a cyclic summand whose annihilator is generated by a polynomial of degree $d$ has dimension $d$, and the dimensions of the summands add up. In any case, it should be clear that we define the characteristic polynomial of a matrix in the same way.
[example] [extitle]Remark %counter%[/extitle]
A rather embarrassing mistake I made when I first learned this viewpoint is that I struggled to see why the minimal polynomial and characteristic polynomial are not just the same polynomial.
The reason they are not in general is because each primary submodule decomposes so that
\[p_{i,k_i}^{e_{i,k_i}} \mid \cdots \mid p_{i,1}^{e_{i,1}}\]and, typically, $k_i > 1$. So we can pick up additional factors in the characteristic polynomial if the primary decomposition summands are not cyclic to begin with. This observation is so important that we will state it as a result. [/example]
[theorem] [thmtitle]Theorem [counter]characterization_of_cyclic_polynomial_modules[/counter] (Characterization of cyclic $k[x]$-modules)[/thmtitle]
Let $T \in \End_k(V)$ have minimal polynomial
\[m_T(x) = \prod_{i=1}^n p_{i}^{e_i}(x).\]Then $V$ is a cyclic $k[x]$-module if and only if $m_T(x) = \chi_T(x)$. [/theorem]
[proof] By definition, $m_T(x) = \chi_T(x)$ holds if and only if each prime $p_i$ contributes exactly one elementary divisor, i.e. each primary submodule $V_{p_i}$ is cyclic. If each $V_{p_i}$ is cyclic, then $V$ is cyclic: since the annihilators $\generator{p_i^{e_i}(x)}$ are pairwise comaximal, $V$ is generated by the sum of the generators of the $V_{p_i}$. Conversely, if $V$ is cyclic, then each $V_{p_i}$, being a submodule of a cyclic module over a PID, is itself cyclic. [/proof]
Now the following theorem is completely trivial:
[theorem] [thmtitle]Theorem %counter% (Cayley-Hamilton theorem)[/thmtitle]
The minimal polynomial of $T$ is a divisor of the characteristic polynomial of $T$. Equivalently, $\chi_T(T) = 0$. [/theorem]
Furthermore, we also get the following relationship between the two polynomials.
[theorem] [thmtitle]Theorem %counter%[/thmtitle]
The minimal polynomial and characteristic polynomial of $T$ have the same set of distinct roots. [/theorem]
[proof] For each prime $p_i$, the largest elementary divisor $p_{i,1}^{e_{i,1}}(x)$ is the $i$-th factor of $m_T(x)$, and every other elementary divisor $p_{i,j}^{e_{i,j}}(x)$ divides it. Hence $m_T(x)$ and $\chi_T(x)$ are products of powers of exactly the same irreducible polynomials $p_i(x)$, and therefore they have the same set of distinct roots. [/proof]
Companion Matrices
We now work toward the rational canonical form for matrices. We begin by considering the elementary divisor decomposition of $V$:
\[V = \bigoplus_{i, j} \generator{v_{i,j}}\]where each $v_{i,j}$ has annihilator $\generator{p_{i,j}^{e_{i,j}}}$, as usual. In any case, $V$ is the direct sum of cyclic $k[x]$-submodules and we need to figure out how to represent these cyclic modules via matrices. For now, let’s look at the simpler case where $V = \generator{v}$ with minimal polynomial
\[m_T(x) = a_0 + a_1x + \cdots + a_{n-1}x^{n-1} + x^n.\]Now, $V$ has the $T$-cyclic basis
\[\mathcal{B} = \{v, Tv, \ldots, T^{n-1}v\}.\]Then, we know that
\[[T]_\mathcal{B} = \left[ [Tv]_\mathcal{B} \;\mid\; [T(Tv)]_\mathcal{B} \;\mid\; \cdots \;\mid\; [T(T^{n-1}v)]_\mathcal{B}\right].\]Since
\[T^nv = -(a_0 + a_1T + \cdots + a_{n-1}T^{n-1})v,\]it follows that
\[[T]_\mathcal{B} = \begin{bmatrix} 0 & 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & 0 & \cdots & 0 & -a_1 \\ 0 & 1 & 0 & \cdots & 0 & -a_2 \\ 0 & 0 & 1 & \cdots & 0 & -a_3 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & -a_{n-1} \end{bmatrix}\]This is known as the companion matrix for $m_T(x)$. Let us formalize this now.
[definition] [deftitle]Definition %counter% (Companion matrix)[/deftitle]
The companion matrix of the monic polynomial
\[p(x) = a_0 + a_1x + \cdots + a_{n-1}x^{n-1} + x^n\]is the matrix
\[C[p(x)] = \begin{bmatrix} 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & \cdots & 0 & -a_1 \\ 0 & 1 & \cdots & 0 & -a_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & 1 & -a_{n-1} \end{bmatrix}.\][/definition]
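For computations, the companion matrix is easy to build directly from the coefficient list. A minimal sketch in pure Python (the function names and the convention that `coeffs` lists $a_0, \ldots, a_{n-1}$ with the leading $1$ implicit are my own choices):

```python
from fractions import Fraction

def companion(coeffs):
    """Companion matrix C[p] of the monic polynomial
    p(x) = coeffs[0] + coeffs[1] x + ... + coeffs[n-1] x^{n-1} + x^n."""
    n = len(coeffs)
    C = [[Fraction(0)] * n for _ in range(n)]
    for i in range(1, n):
        C[i][i - 1] = Fraction(1)           # subdiagonal of 1s
    for i in range(n):
        C[i][n - 1] = -Fraction(coeffs[i])  # last column: -a_0, ..., -a_{n-1}
    return C

def mat_mul(X, Y):
    n = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

# For p(x) = x^3 - 1, the companion matrix is a cyclic permutation, and
# p(C) = C^3 - I = 0, as Cayley-Hamilton predicts for a companion matrix.
C = companion([-1, 0, 0])
assert C == [[0, 0, 1], [1, 0, 0], [0, 1, 0]]
assert mat_mul(mat_mul(C, C), C) == [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
```
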
Why do we care about this kind of matrix? Well, because of the following!
[theorem] [thmtitle]Theorem [counter]cyclic_modules_are_represented_by_companion_matrices[/counter] (Cyclic modules are represented by companion matrices!)[/thmtitle]
Let $V$ be a finite-dimensional $k$-vector space and $T$ be a linear operator on $V$. Then $V$ is cyclic as a $k[x]$-module if and only if $T$ can be represented by a companion matrix. [/theorem]
[proof] ($\implies$) Derived above.
($\impliedby$) Now suppose
\[[T]_\mathcal{B} = \begin{bmatrix} 0 & 0 & 0 & \cdots & 0 & -a_0 \\ 1 & 0 & 0 & \cdots & 0 & -a_1 \\ 0 & 1 & 0 & \cdots & 0 & -a_2 \\ 0 & 0 & 1 & \cdots & 0 & -a_3 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & 1 & -a_{n-1} \end{bmatrix}\]For the sake of simplicity (and WLOG), say $\mathcal{B}$ is the standard basis of $k^n$. It suffices to show that $m_T(x) = \chi_T(x)$ by Theorem [countref]characterization_of_cyclic_polynomial_modules[/countref]. To do this, we will show that
\[p(x) = a_0 + a_1x + a_2x^2 + \cdots + a_{n-1}x^{n-1} + x^n\]is the minimal polynomial of $A \coloneqq [T]_\mathcal{B}$. First, we show that $p(A) = 0$. It suffices to show that $p(A)e_i = 0$ for every $i\in [n]$. Noting that
\[\begin{align*} p(A)e_1 &= \sum_{i=0}^{n-1} a_iA^i e_1 + A^n e_1 \\ &= \sum_{i=0}^{n-1} a_i e_{i+1} - \sum_{i=0}^{n-1}a_ie_{i+1} \\ &= 0, \end{align*}\]it follows that $p(A)e_1 = 0$. Furthermore, since
\[p(A)e_i = p(A)A^{i-1}e_1 = A^{i-1}p(A)e_1 = 0,\]it follows that $p(A)$ annihilates the standard basis. Thus, $p(A) = 0$. Furthermore, any monic polynomial of degree $m < n$
\[q(x) = b_0 + b_1x + \cdots + b_{m-1}x^{m-1} + x^m\]fails to annihilate $e_1$, as
\[q(A)e_1 = e_{m+1} + \sum_{i=0}^{m-1} b_ie_{i+1} \neq 0.\]Thus, $p(x)$ must be the minimal polynomial of $A$. Finally, since $m_A(x)$ has degree $n$ which is also the dimension of $V$, it follows that
\[\deg m_A(x) = \dim_k(V) = \deg \chi_A(x).\]Thus, $m_A(x) = \chi_A(x)$ and it follows that $m_T(x) = \chi_T(x)$ and we are done. [/proof]
Rational Canonical Form
Now that we have all the preliminaries in place, we will proceed on to the rational canonical form. Because we have two different ways of decomposing a vector space into cyclic submodules (elementary divisors and invariant factors), there are two different versions of the rational canonical form and they amount to translating the decomposition theorems into their matrix-oriented cousins.
Elementary Divisor RCF
Given the vector space $V$, there is an elementary divisor decomposition
\[V = \bigoplus_{i,j} \generator{v_{i,j}}\]and the canonical choice of basis $\mathcal{B}$ of $V$ is given by the union of the collections
\[\mathcal{B}_{i,j} = \{v_{i,j}, Tv_{i,j}, \ldots, T^{d_{i,j}-1}v_{i,j}\}.\]So then, it follows that
\[[T]_\mathcal{B} = \diag(C[p_i^{e_{i,j}}(x)] \mid i \in [n], j \in [k_i]).\]Matrices written in this form are said to be in the elementary divisor form of the rational canonical form.
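Assembling an elementary divisor RCF is then just a matter of placing companion blocks on the diagonal. A sketch in pure Python, using the hypothetical elementary divisors $(x-1)^2 = x^2 - 2x + 1$ and $(x-2)$:

```python
from fractions import Fraction

def companion(coeffs):
    """Companion matrix of monic p(x) = coeffs[0] + ... + coeffs[n-1] x^{n-1} + x^n."""
    n = len(coeffs)
    C = [[Fraction(0)] * n for _ in range(n)]
    for i in range(1, n):
        C[i][i - 1] = Fraction(1)
    for i in range(n):
        C[i][n - 1] = -Fraction(coeffs[i])
    return C

def block_diag(blocks):
    """Assemble diag(B_1, ..., B_k) from square blocks."""
    n = sum(len(B) for B in blocks)
    M = [[Fraction(0)] * n for _ in range(n)]
    off = 0
    for B in blocks:
        for i in range(len(B)):
            for j in range(len(B)):
                M[off + i][off + j] = B[i][j]
        off += len(B)
    return M

# RCF with elementary divisors (x - 1)^2 and (x - 2):
R = block_diag([companion([1, -2]), companion([-2])])
assert R == [[0, -1, 0],
             [1,  2, 0],
             [0,  0, 2]]
```
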
Invariant Factor RCF
On a very similar note, we can express $V$ in an invariant factor decomposition
\[V = \bigoplus_{i=1}^m D_i\]where each $D_i$ is a cyclic module with annihilator $\generator{d_i(x)}$ and
\[d_m(x) \mid d_{m-1}(x) \mid \cdots \mid d_2(x) \mid d_1(x) = m_T(x).\]Then there is a basis $\mathcal{B}$ for which
\[[T]_\mathcal{B} = \diag(C[d_i(x)] \mid i \in [m]).\]Matrices written in this form are said to be in the invariant factor form of the rational canonical form.
Magic Formula: $\chi_A(x) = \det(xI - A)$ :)
Unfortunately, a lot of the theory we developed above is not particularly useful in a non-theoretical environment because the minimal polynomial is typically difficult to compute. On the other hand, we all know that $\chi_A(x)$ is relatively easy to compute for small matrices as
\[\chi_A(x) = \det(xI - A).\]To prove this formula, we make use of the rational canonical form.
[theorem] [thmtitle]Lemma %counter%[/thmtitle]
Let $p(x)$ be a monic polynomial in $k[x]$. Then
\[\det(xI - C[p(x)]) = p(x).\][/theorem]
[proof] We proceed by induction on the degree $n$ of
\[p(x) = a_0 + a_1x + \cdots + a_{n-1}x^{n-1}+ x^n.\]For the base case $n = 1$, we have $C[p(x)] = [-a_0]$, so $\det(xI - C[p(x)]) = x + a_0 = p(x)$. For illustration, consider also $n = 2$. Then we see that
\[C[p(x)] = \begin{bmatrix} 0 & -a_0 \\ 1 & -a_1 \end{bmatrix}.\]Thus,
\[\begin{align*} \det(xI - C[p(x)]) &= \det\begin{bmatrix} x & a_0 \\ -1 & x + a_1 \end{bmatrix}\\ &= x^2 + a_1x + a_0 \end{align*}\]as desired. Now suppose the formula holds for monic polynomials of degree $n - 1$ and let $p(x)$ be monic of degree $n$. Then
\[\det(xI - C[p(x)]) = \det\begin{bmatrix} x & 0 & 0 & \cdots & 0 & a_0 \\ -1 & x & 0 & \cdots & 0 & a_1 \\ 0 & -1 & x & \cdots & 0 & a_2 \\ 0 & 0 & -1 & \cdots & 0 & a_3 \\ \vdots & \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & 0 & \cdots & -1 & x + a_{n-1} \end{bmatrix}\]Applying the Laplace expansion formula to the first row gives
\[x\det \begin{bmatrix} x & 0 & \cdots & 0 & a_1 \\ -1 & x & \cdots & 0 & a_2 \\ \vdots & \vdots & \ddots & \vdots & \vdots \\ 0 & 0 & \cdots & -1 & x + a_{n-1} \end{bmatrix} + (-1)^{1+n}a_0 \det \begin{bmatrix} -1 & x & 0 & \cdots & 0 \\ 0 & -1 & x & \cdots & 0 \\ \vdots & \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & 0 & \cdots & -1 \end{bmatrix}.\]The first determinant is $\det(xI - C[\tilde{p}(x)])$ for the monic degree-$(n-1)$ polynomial $\tilde{p}(x) = a_1 + a_2x + \cdots + a_{n-1}x^{n-2} + x^{n-1}$, so by the induction hypothesis it simplifies to
\[a_1 + a_2x^1 + \cdots + a_{n-1}x^{n-2} + x^{n-1}.\]The second determinant is upper triangular with $-1$'s on the diagonal, so it equals $(-1)^{n-1}$, and the second term simplifies to $(-1)^{1+n}a_0(-1)^{n-1} = a_0$. Thus,
\[\begin{align*} \det(xI - C[p(x)]) &= x(a_1 + a_2x^1 + \cdots + a_{n-1}x^{n-2} + x^{n-1}) + a_0 \\ &= a_0 + a_1x + \cdots + a_{n-1}x^{n-1}+ x^n \\ &= p(x) \end{align*}\]as desired. [/proof]
[theorem] [thmtitle]Theorem %counter% ($\chi_A(x) = \det(xI - A)$)[/thmtitle]
Let $A$ be a square matrix. Then
\[\chi_A(x) = \det(xI - A).\][/theorem]
[proof] Say $R$ is the rational canonical form (in elementary divisor form) of $A$. Then there is a change-of-basis $P$ for which
\[A = PRP^{-1}.\]Thus,
\[\begin{align*} \det(xI - A) &= \det(xI - PRP^{-1}) \\ &= \det(P(xI - R)P^{-1}) \\ &= \det(P)\det(xI - R)\det(P^{-1}) \\ &= \det(xI - R). \end{align*}\]Since the determinant of a block diagonal matrix is the product of the determinants of the blocks on the diagonal, it follows that
\[\det(xI - R) = \prod_{i,j}\det\left(xI - C[p_{i,j}^{e_{i,j}}(x)]\right)\]where the $p_{i,j}^{e_{i,j}}$ are the elementary divisors of $A$. By the lemma above, $\det\left(xI - C[p_{i,j}^{e_{i,j}}(x)]\right) = p_{i,j}^{e_{i,j}}(x)$, so it follows that
\[\det(xI - R) = \prod_{i,j}p_{i,j}^{e_{i,j}}(x) = \chi_A(x)\]and we are done. [/proof]
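We can sanity-check the lemma computationally. The Faddeev-LeVerrier recursion computes the coefficients of $\det(xI - A)$ without symbolic determinants; applied to a companion matrix it should reproduce the polynomial exactly. A sketch in pure Python (the function names and coefficient conventions are mine):

```python
from fractions import Fraction

def companion(coeffs):
    """Companion matrix of monic p(x) = coeffs[0] + ... + coeffs[n-1] x^{n-1} + x^n."""
    n = len(coeffs)
    C = [[Fraction(0)] * n for _ in range(n)]
    for i in range(1, n):
        C[i][i - 1] = Fraction(1)
    for i in range(n):
        C[i][n - 1] = -Fraction(coeffs[i])
    return C

def char_poly(A):
    """Ascending coefficients [c_n, ..., c_1, 1] of det(xI - A),
    computed by the Faddeev-LeVerrier recursion."""
    n = len(A)
    def mul(X, Y):
        return [[sum(X[i][k] * Y[k][j] for k in range(n)) for j in range(n)]
                for i in range(n)]
    M = [[Fraction(int(i == j)) for j in range(n)] for i in range(n)]  # M_1 = I
    cs = []
    for k in range(1, n + 1):
        AM = mul(A, M)
        c = -sum(AM[i][i] for i in range(n)) / k      # c_k = -tr(A M_k) / k
        cs.append(c)
        M = [[AM[i][j] + (c if i == j else 0) for j in range(n)] for i in range(n)]
    return list(reversed(cs)) + [Fraction(1)]

# det(xI - C[p]) = p(x): the lemma, checked for p(x) = x^3 - 1.
p = [Fraction(-1), Fraction(0), Fraction(0)]
assert char_poly(companion(p)) == p + [Fraction(1)]
```
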
Worked Exercises
[example] [extitle]Roman Advanced Linear Algebra Exercise 7.3[/extitle]
Let $A \in M_n(k)$. Then $\chi_A(x) = m_A(x)$ if and only if it is similar to a companion matrix. [/example]
[proof] Let $T$ be the linear operator induced by $A$. By Theorem [countref]characterization_of_cyclic_polynomial_modules[/countref], $\chi_A(x) = m_A(x)$ if and only if $V$ is a cyclic $k[x]$-module, and by Theorem [countref]cyclic_modules_are_represented_by_companion_matrices[/countref] this happens if and only if $T$ is represented by a companion matrix $B$ in some basis. Since change-of-basis means exactly that $A \sim B$, the two conditions are equivalent and we are done. [/proof]
[example] [extitle]Roman Advanced Linear Algebra Exercise 7.5[/extitle]
Let $A \in M_n(k)$. The entries of any rational canonical form for $A$ are rational expressions in the coefficients of $A$. [/example]
[proof] The magic determinant formula immediately tells us that the coefficients of the characteristic polynomial are polynomials in the entries of $A$. Since the entries of any rational canonical form come from polynomials that are factors of $\chi_A(x)$, it follows that the entries are rational expressions in the entries of $A$. [/proof]
[example] [extitle]Roman Advanced Linear Algebra Exercise 7.6[/extitle]
Let $T$ be a linear operator on the finite-dimensional vector space $V$. If $p(x) \in k[x]$ is irreducible and if $p(T)$ is not injective, prove that $p(x)$ divides the minimal polynomial of $T$. [/example]
[proof] Since $p(T)$ is not injective, there is some nonzero $v \in V$ for which $p(T)v = 0$. Consider the cyclic $k[x]$-submodule $\generator{v}$. Certainly $p(x)$ annihilates $\generator{v}$, so $p(x) \in \ann_{k[x]}(\generator{v})$. Since $k[x]$ is a PID, this ideal is generated by some monic $f(x)$, which implies that $f(x) \mid p(x)$. Since $v \neq 0$, the annihilator is a proper ideal, so $f(x)$ is not a unit; as $p(x)$ is irreducible and both polynomials are monic, $f(x) = p(x)$. But $m_T(x)$ annihilates $\generator{v}$, so $f(x) \mid m_T(x)$, and it follows that $p(x) \mid m_T(x)$ as desired. [/proof]