Rational Canonical Form: A Summary

The Basic Idea

This post is intended to be a hopefully-not-too-intimidating summary of the rational canonical form (RCF) of a linear transformation. Of course, anything which involves the word "canonical" is probably intimidating no matter what. But even so, I've attempted to write a distilled version of the material found in (the first half of) section 12.2 from Dummit and Foote's Abstract Algebra.

In sum, the RCF is important because it allows us to classify linear transformations on a vector space up to conjugation. Below we'll set up some background, then define the rational canonical form, and close by discussing why the RCF looks the way it does. Next week we'll go through an explicit example to see exactly how the RCF can be used to classify linear transformations.        

From English to Math            

The Background

Let $F$ be a field and let $V$ be a finite dimensional $F$-vector space. Given a linear transformation $T:V\to V$, we can simultaneously view $V$ as an $F[x]$-module by defining the action   $$x\cdot v := Tv$$ for any vector $v\in V$. Then for any polynomial $p(x)\in F[x]$, we know what the action $p(x)\cdot v$ "looks like," namely if $p(x)=a_nx^n+\cdots+a_1x+a_0$ with $a_i\in F$, then     $$  \begin{align} p(x)\cdot v &= (a_nT^n+\cdots+a_1T+a_0I)v\\ &= a_nT^nv + \cdots + a_1Tv+a_0v \end{align} $$   where $I:V\to V$ is the identity. Notice that the last line is indeed another vector in $V$ and so this action makes sense. (And one can easily check that the module axioms are satisfied.)
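To make the action concrete, here is a minimal Python sketch of $p(x)\cdot v = p(T)v$. It assumes $T$ is given as a matrix (a list of rows) in some fixed basis; the helper names `mat_vec` and `poly_action` are made up for this illustration.

```python
def mat_vec(T, v):
    """Multiply a square matrix T (given as a list of rows) by a vector v."""
    return [sum(T[i][j] * v[j] for j in range(len(v))) for i in range(len(T))]

def poly_action(coeffs, T, v):
    """Return p(x) . v = p(T) v, where coeffs = [a_0, a_1, ..., a_n].

    Uses Horner's rule: p(T)v = a_0 v + T(a_1 v + T(a_2 v + ...)).
    """
    result = [0] * len(v)
    for a in reversed(coeffs):
        result = mat_vec(T, result)                        # apply T once more
        result = [r + a * vi for r, vi in zip(result, v)]  # add a_i * v
    return result

# Example: T is rotation by 90 degrees in the plane, v = (3, 4).
T = [[0, -1],
     [1,  0]]
v = [3, 4]
print(poly_action([2, 1], T, v))     # (x + 2) . v   = Tv + 2v  -> [2, 11]
print(poly_action([1, 0, 1], T, v))  # (x^2 + 1) . v = T^2v + v -> [0, 0]
```

In this example $x^2+1$ sends $v$ to the zero vector, since $T^2=-I$ for this particular $T$.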

One key observation is that $V$ is a torsion* $F[x]$-module. In particular, if $m_T(x)$ denotes the minimal polynomial of $T$, then $m_T(x)$ is nonzero and for any $v\in V$ we have $m_T(x)\cdot v=0$ (since $m_T(x)\cdot v=m_T(T)v$ and $m_T(T)=0$ is the zero linear transformation by definition). What's more, $V$ is finitely generated (it has a finite number of basis vectors) by assumption. So since $F[x]$ is a PID (as $F$ is a field), we may conclude   $$V\cong F[x]/(a_1(x))\oplus\cdots\oplus F[x]/(a_d(x))$$   where $a_1(x)\mid\cdots\mid a_d(x)$ are the invariant factors of $V$ and where $a_d(x)=m_T(x)$. (This is the Fundamental Theorem of Finitely Generated Modules over a PID, the subject of last week's post.) These invariant factors play a key role in defining the rational canonical form.

 

The Rational Canonical Form

Let $a(x)=x^k+b_{k-1}x^{k-1}+\cdots+b_1x+b_0$ be any monic polynomial in $F[x]$. Construct a $k\times k$ matrix by placing 1's along the subdiagonal, the negatives of the coefficients of $a(x)$ (except the leading one) down the last column, and zeros everywhere else. We call this the companion matrix of $a(x)$:   $$ \mathscr{C}_{a(x)}:= \begin{pmatrix}0 & 0 & \cdots  &\cdots& \cdots & -b_0\\ 1 & 0 & \cdots  &\cdots& \cdots &-b_1\\ 0 & 1 & \cdots  &\cdots& \cdots &-b_2\\ 0 & 0 &\ddots  & & &\vdots\\ \vdots & \vdots & & \ddots & &\vdots\\ 0 & 0 &\ldots & \ldots & 1 & -b_{k-1} \end{pmatrix}.$$

(For example, if $a(x)=x^2+x+1\in \mathbb{Q}[x]$, then $\mathscr{C}_{a(x)}=\bigl( \begin{smallmatrix}  0 & -1\\ 1 & -1 \end{smallmatrix} \bigr).$ )
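The recipe for the companion matrix is easy to mechanize. The following is a small sketch, not a library routine: the function name `companion_matrix` and the convention of passing the non-leading coefficients as a list `[b_0, ..., b_{k-1}]` are choices made here for illustration.

```python
def companion_matrix(b):
    """Companion matrix of x^k + b[k-1] x^(k-1) + ... + b[1] x + b[0].

    b lists the non-leading coefficients, lowest degree first.
    """
    k = len(b)
    C = [[0] * k for _ in range(k)]
    for i in range(1, k):
        C[i][i - 1] = 1      # 1's along the subdiagonal
    for i in range(k):
        C[i][k - 1] = -b[i]  # negated coefficients down the last column
    return C

# a(x) = x^2 + x + 1, i.e. b_0 = 1 and b_1 = 1
print(companion_matrix([1, 1]))  # [[0, -1], [1, -1]]
```

The printed matrix agrees with the $2\times 2$ example above.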

Letting $n=\dim(V)$, we define the rational canonical form of $T:V\to V$ to be the $n\times n$ (block-diagonal) matrix $$ \begin{pmatrix} \mathscr{C}_{a_1(x)} & & & &\\ & \mathscr{C}_{a_2(x)} & & &\\ & & \ddots &&\\ &&& & \mathscr{C}_{a_d(x)} \end{pmatrix}$$   where the polynomials $a_1(x)\mid a_2(x)\mid\cdots \mid a_d(x)$ are the invariant factors of $V$ from the decomposition above. (The size works out because the degrees of the invariant factors sum to $n$.)
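Given the invariant factors, assembling the block-diagonal matrix is mechanical. Here is a sketch under stated assumptions: each invariant factor is passed as its list of non-leading coefficients `[b_0, ..., b_{k-1}]`, the names `companion_matrix` and `rational_canonical_form` are invented for this illustration, and the code does not check the divisibility condition $a_1(x)\mid\cdots\mid a_d(x)$. It also does not compute the invariant factors of a given $T$, which is the hard part.

```python
def companion_matrix(b):
    """Companion matrix of x^k + b[k-1] x^(k-1) + ... + b[0]."""
    k = len(b)
    C = [[0] * k for _ in range(k)]
    for i in range(1, k):
        C[i][i - 1] = 1      # subdiagonal of 1's
    for i in range(k):
        C[i][k - 1] = -b[i]  # last column: negated coefficients
    return C

def rational_canonical_form(invariant_factors):
    """Block-diagonal matrix of companion blocks, one per invariant factor.

    Assumes (without checking) that the factors are listed so that each
    divides the next.
    """
    blocks = [companion_matrix(b) for b in invariant_factors]
    n = sum(len(B) for B in blocks)
    R = [[0] * n for _ in range(n)]
    offset = 0
    for B in blocks:
        k = len(B)
        for i in range(k):
            for j in range(k):
                R[offset + i][offset + j] = B[i][j]
        offset += k  # next block starts further down the diagonal
    return R

# a_1(x) = x - 1 and a_2(x) = (x - 1)(x - 2) = x^2 - 3x + 2
for row in rational_canonical_form([[-1], [2, -3]]):
    print(row)
# [1, 0, 0]
# [0, 0, -2]
# [0, 1, 3]
```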

 

But Why?

Recall that our vector space $V$ can be written as the direct sum $$V\cong F[x]/(a_1(x))\oplus\cdots\oplus F[x]/(a_d(x)).$$ For the moment, let's focus our attention on just one of the factors $$F[x]/(a(x))$$ (I've taken off the index for simplicity). Just like $V$, this quotient space wears two hats: it is both an $F[x]$-module and it is an $F$-vector space! As an $F$-vector space, it has the basis $\mathscr{B}=\{\bar{1},\bar{x},\ldots,\overline{x}^{k-1}\}$ where $\bar{x}:=x+(a(x))$ indicates a coset (this is the example from Dummit and Foote, section 11.1, following Proposition 1).  And just like $V$, this quotient space gets "acted on" by the linear transformation $T$ via $$Tv=x\cdot v$$ where here $v$ is a vector (i.e. a coset!) in $F[x]/(a(x))$.

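A natural question to ask is, "What is the matrix representation of $T:F[x]/(a(x))\to F[x]/(a(x))$ with respect to the basis $\mathscr{B}$?" From undergrad linear algebra we know the $i$th column of this matrix is simply the coefficients (with respect to $\mathscr{B}$) of the image of the $i$th basis vector under $T$. Explicitly: $$ \begin{align} T(\bar{1}) &= \bar{x}\\ T(\bar{x}) &= \overline{x}^2\\ &\;\;\vdots\\ T(\overline{x}^{k-2}) &= \overline{x}^{k-1}\\ T(\overline{x}^{k-1}) &= \overline{x}^k = -b_{k-1}\overline{x}^{k-1}-\cdots-b_1\bar{x}-b_0\bar{1}. \end{align} $$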

 The last line follows because  $$  \begin{align}  \overline{x}^k=x^k+(a(x))&= a(x)-(b_{k-1}x^{k-1}+\cdots+b_0)+(a(x))\\  &= -b_{k-1}x^{k-1}-\cdots-b_0+(a(x))\\  &=-b_{k-1}\overline{x}^{k-1}-\cdots-b_0\overline{1}.  \end{align}  $$

Notice this matrix representation for $T$ is precisely the companion matrix of $a(x)$! But keep in mind, we obtained it by restricting the action of $T$ to just one of the summands $F[x]/(a(x))$ in the direct sum above. To obtain the matrix representation of $T$ as it acts on the entire space $V$, we need to take the direct sum** of all the companion matrices. This corresponds to writing down the matrix of $T:V\to V$ with respect to the basis $\mathscr{B}=\mathscr{B}_1\cup \cdots\cup \mathscr{B}_d$ where $\mathscr{B}_i$ is the basis of powers of $\bar{x}$ for the summand $F[x]/(a_i(x))$. And this is exactly how we defined the rational canonical form of $T$ above.
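The computation on a single summand can also be checked by machine. The sketch below represents a coset in $F[x]/(a(x))$ by its coefficient list with respect to $\bar{1},\bar{x},\ldots,\overline{x}^{k-1}$, multiplies by $x$, and reduces using $\overline{x}^k=-b_{k-1}\overline{x}^{k-1}-\cdots-b_0\bar{1}$. The function names are hypothetical, chosen only for this example.

```python
def times_x(c, b):
    """Multiply c[0] + c[1] x + ... + c[k-1] x^(k-1) by x, modulo
    a(x) = x^k + b[k-1] x^(k-1) + ... + b[0]."""
    top = c[-1]            # coefficient that would land on x^k
    shifted = [0] + c[:-1] # multiplying by x shifts every coefficient up
    # replace x^k by -(b[k-1] x^(k-1) + ... + b[0])
    return [s - top * bi for s, bi in zip(shifted, b)]

def matrix_of_times_x(b):
    """Matrix of 'multiply by x' in the basis 1, x, ..., x^(k-1)."""
    k = len(b)
    basis = [[1 if i == j else 0 for i in range(k)] for j in range(k)]
    columns = [times_x(e, b) for e in basis]  # image of each basis vector
    # the j-th image becomes the j-th column of the matrix
    return [[columns[j][i] for j in range(k)] for i in range(k)]

# a(x) = x^2 + x + 1
print(matrix_of_times_x([1, 1]))  # [[0, -1], [1, -1]]
```

For $a(x)=x^2+x+1$ the result is the same $2\times 2$ companion matrix written down earlier.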

As a final remark, any two similar matrices (or equivalently, similar linear transformations) share the same rational canonical form, and conversely, two matrices with the same rational canonical form are similar. (See Dummit and Foote, section 12.2 Theorem 15.) This means that the RCF acts like a "name tag" because it allows us to find representatives of the distinct conjugacy classes of linear transformations of a given order.

Next week we'll illustrate this fact with an explicit example.   

Footnotes:

* Recall: to say an $R$-module $M$ is torsion means for every $m\in M$ there exists a nonzero $r\in R$ such that $rm=0$.

 ** By definition, the direct sum of matrices $A_1,\ldots, A_d$ is the block diagonal matrix with the $A_i$'s down the diagonal and zeros elsewhere.
