Conjugation Action
General motivation
Many resources have the tendency to say that there are two very important group actions of a group on itself: left-multiplication and conjugation. Despite how important these actions are, I have not really found any books that motivate why these actions are so important. My personal impression is that they can be motivated from Cayley’s theorem. Recall that theorem shows that any group can be realized as a subgroup of a symmetric group (i.e. a permutation group).
When we proved Cayley’s theorem for the finite case, we assigned a particular labeling to the elements of $G$. But then that presupposes a particular labeling of the elements of $G$ and we may want to consider a different labeling of the elements. This is totally analogous to the change-of-basis from linear algebra: we may want to choose a different basis or perhaps we want to consider all such matrices up to equivalence.
In our case, this would mean that two permutations $\sigma$ and $\sigma’$ are equivalent provided that $\sigma’ = \tau \sigma \tau^{-1}$ for some permutation $\tau$. The natural choice (at least at this point) to take $\tau$ from is $S_{|G|}$. Later on, we will see that we may want to be more restrictive about our choice of $\tau$ due to parity reasons. In any case, we say that $\sigma’$ is the result of conjugation of $\sigma$ by $\tau$.
Conjugation in $S_n$ (in the language of orbits)
If we think of conjugation of $\sigma$ by $\tau$ as a change-of-labeling, then we would expect $\sigma’ = \tau\sigma\tau^{-1}$ to inherit the structure of the orbits of $\sigma$. This hints at a way of classifying what we will later call the conjugacy classes of $S_n$.
Proposition. Let $\sigma, \tau \in S_n$ and let $O$ be an orbit of $\langle \sigma \rangle$. Then $f:O \to [n]$ defined by $f(i) = \tau \sigma(i)$ is an injection and $f(O)$ is an orbit of the group generated by $\sigma’ = \tau \sigma \tau^{-1}$.
Proof. If $\tau\sigma(i) = \tau\sigma(j)$, then we can just apply the inverse permutation $(\tau\sigma)^{-1}$ on the left to get $i = j$. So $f$ bijects $O \to f(O)$. To show that $f(O)$ is an orbit of $\langle \sigma’\rangle$, we write
\[O = \{x, \sigma(x), \ldots, \sigma^{d-1}(x)\}\]where $d = |O|$ and $x$ is a representor of $O$. Noting that $\sigma’\tau = \tau \sigma$, we have that
\[f(O) = \{\tau(x), \sigma'\tau(x), \ldots, (\sigma')^{d-1}\tau(x)\}\]by direct computation. Furthermore, since $(\sigma’)^d\tau(x) = \tau\sigma(x) = \tau(x)$, it follows that $\sigma’$ acts as a $d$-cycle on $f(O)$, so $f(O)$ is an orbit of $\langle \sigma’\rangle$ and we are done.
Corollary. Every orbit of $\langle\sigma’\rangle$ is of the form $f(O)$ where $O$ is an orbit of $\langle\sigma\rangle$.
Proof. The map $f$ is a bijection so if $O’$ is an orbit of $\langle \sigma’ \rangle$, the same argument shows that $O := f^{-1}(O’)$ is an orbit of $\langle \sigma \rangle$. Thus, $f(O) = O’$ and the claim follows.
Since conjugation preserves the orbit shapes, we can equivalently say that conjugation preserves the cycle shapes of $\sigma$ thanks to the essential uniqueness of cycle decomposition. This leads us to suspect that we can actually classify the elements of $S_n$ by conjugation. For this to be true, however, we need to show that if $\sigma, \sigma’ \in S_n$ have the same cycle shapes, then there is some $\tau \in S_n$ such that $\sigma’ = \tau\sigma\tau^{-1}$. Computing $\tau$, in general, is pretty straightforward. As an example, suppose
\[\sigma = (12)(34)(5678) \quad\text{ and }\quad \sigma' = (13)(58)(2467).\]Then we claim that $\tau = 13582467$ (in cycle notation, $\tau = (235)(4876)$) satisfies $\sigma’ = \tau\sigma\tau^{-1}$ in $S_8$. To obtain $\tau$, we simply send the $k$th position in $\sigma$ (after we add in the fixed points) to the $k$th position in $\sigma’$. Note that this choice of $\tau$ is not unique because we can reorder disjoint cycles as well as cyclically reorder the cycles themselves.
This process holds in general and can be used to constructively show that cycles of the same cycle shape are conjugate in $S_n$.
Theorem. Suppose that $\sigma,\sigma’$ have the same cycle shapes in $S_n$. Then there exists $\tau$ such that $\sigma’ = \tau\sigma\tau^{-1}$.
Proof. Let $c_1,\ldots, c_r$ be the cycles in $\sigma$ and $c_1’,\ldots, c_r$ be the cycles in $\sigma’$. Also, assume that both lists of cycles are ordered so that $c_k$ has the same cycle length as $c_k’$ for each $k = 1, \ldots, r$. Additionally, assume that the lists above include the cycles of length $1$. Let
\[c_k = (c_{k,1}, \ldots, c_{k,|c_{k}|}) \quad\text{ and }\quad c_k' = (c_{k,1}', \ldots, c_{k,|c_{k}'|}')\]where $|c_k|$ is the cycle length of $c_k$ and $|c_{k}’|$ is the cycle length of $c_{k}’$. Define $\tau$ by sending $c_{k,i}$ to $c_{k,i}’$ for each $i$ and for each cycle. We claim that $\tau$ satisfies $\sigma’ = \tau\sigma\tau^{-1}$.
Let $x \in [n]$ and $c_k$ be the cycle $x$ is in. Furthermore, suppose $x = c_{k,i}$. Then
\[\tau\sigma(x) = \tau(c_{k,i+1}) = c_{k,i+1}'\]and
\[\sigma'\tau(x) = \sigma'(c_{k,i}') = c_{k,i+1}'\]where the indices are taken modulo some appropriate modulus. Thus, it follows that $\tau\sigma = \sigma’\tau$ and the claim follows.
Motivation for abstract conjugation
Notice that it’s quite easy to compute the change-of-labeling needed for conjugate permutations. Indeed, the proof above we gave shows that
\[\sigma = (a_1, a_2, \ldots)(b_1, b_2, \ldots)\]implies that
\[\tau\sigma\tau^{-1} = (\tau(a_1), \tau(a_2), \ldots)(\tau(b_1), \tau(b_2), \ldots).\]It can be tricky, however, to compute conjugation in a subgroup of the symmetric group (which, is important, because Cayley’s theorem shows isomorphism to a subgroup of a symmetric group). A common example is that $(123)$ and $(132)$ are not conjugate in the alternating group $A_3$ (because any conjugation map would always be a transposition but $A_3$ has no transpositions), yet, they are in $S_3$. Accordingly, we see that it does not suffice to study conjugation only in symmetric groups.
The conjugation action
Definition. Let $G$ be a group. Then the action of $G$ on itself given by $g\cdot x := gxg^{-1}$ for some fixed $g\in G$ is called the conjugation action of $G$ on itself. The orbits of the conjugation action are called conjugacy classes. If two elements of $G$ are in the same conjugacy class, they are said to be conjugate to each other.
Remark. By the classification theorem we proved for the conjugacy classes for the symmetric group $S_n$, it follows that the conjugacy classes are indexed by the partitions of $n$. Notice that it is a relatively elementary combinatorial problem to count each conjugacy class as a distinguishability and cyclical symmetry problem.
From abstract group action theory, we know that the orbits of an action partition the set the group acts on. Furthermore, each orbit is in bijection with the group modulo the stabilizer of a representative of that orbit. This immediately gives the famous result known as the class equation. A bit of terminology before we present the class equation: the stabilizer of $S\subseteq G$ under the conjugation action is known as the normalizer $N_G(S)$ of $S$. Furthermore, the pointwise stabilizer (i.e. $gsg^{-1} = s$ for each $s\in S$) of $S$ is known as the centralizer $C_G(S)$ of $S$.
Theorem (Class equation). Let $G$ be a finite group. Then
\[|G| = |Z(G)| + \sum_{x} [G:C_G(x)]\]where $x$ is summed over irredundant representatives of nontrivial conjugacy classes and $Z(G)$ is the center of $G$.
Proof. The conjugacy classes partition $G$. Furthermore, the trivial conjugacy classes with one element necessarily commute with all elements of $G$. This gives us
\[|G| = |Z(G)| + \sum_{x} |C(x)|\]where $C(x)$ is the conjugacy class of $x$ and the sum is taken over irredundant representatives of nontrivial conjugacy classes. Since $C(x)$ is in bijection with $G/C_G(x)$, the claim follows.
Admittedly, the class equation at a brief glance looks like a totally useless result. Despite this, however, it ends up being a potentially very powerful tool. By Lagrange’s theorem, we know that $[G:C_G(x)]$ is a factor of $|G|$, so by taking the class equation modulo some number, we can sometimes say something about the size of the center. This can be exploited to show the existence of a subgroup of a particular order nonconstructively. Here are some classical results that are almost trivial by the class equation:
Theorem. Let $G$ be a nontrivial $p$-group (order is a power of the prime $p$). Then $Z(G)$ is nontrivial.
Proof. Take the class equation modulo $p$. Then we have $|Z(G)| \equiv 0 \pmod p$. Since $Z(G)$ has at least one element (the identity of $G$), $|Z(G)| \ge p$.
Theorem. Let $G$ have order $p^2$ for some prime $p$. Then $G$ is abelian.
Proof. The previous result implies that either $|Z(G)| = p^2$ or $|Z(G)| = p$. If $|Z(G)| = p^2$, then $Z(G) = Z(G)$ and we are done. If $|Z(G)| = p$, then $G/Z(G)$ is a group of order $p$ (by Lagrange) and is cyclic (also by Lagrange). This tells us that if $g\in G$, then there is some $xZ(G)$ such that $gx^{-m} = r$ for some $r \in Z(G)$. This justifies setting $g = rx^a$ and $h=sx^b$, with $r, s \in Z(G)$. Then we see that
\[gh = rx^a sx^b = rx^{a + b} s = s x^{a + b} r = sx^{b}rx^{a} = hg,\]where most of the equalities follow from the fact that $r$ and $s$ must commute with every element of $G$. Note that this shows that $G$ is abelian so $|Z(G)| = p^2$, contradicting the assumption that $|Z(G)| = p$. Thus, $Z(G) = G$ and we are done.
Remark. Notice that, by the classification of finitely-generated abelian groups, this means that any group whose order is a square of a prime $p$ is isomorphic to $\mathbb{Z}/p^2\mathbb{Z}$ or $(\mathbb{Z}/p\mathbb{Z})^{\oplus 2}$. This can be shown directly too without much difficulty by showing that at most two nonidentity elements can be used to generate the entire group.