Rotations and angular momentum

At this point we've had at least a glimpse of all of the important symmetries of quantum mechanics in one dimension. However, when we turn to the full three-dimensional world, one more extremely important symmetry operation appears: rotation. Rotational symmetry is everywhere, and has widespread implications in all areas of quantum mechanics. Moreover, the algebraic structure of spatial rotations reappears in several places in more abstract form, as the "rotations" of nuclear and particle physics.

Let's first remind ourselves of how finite rotations are described in classical physics. An arbitrary finite rotation can be described by a 3x3 matrix \( R \) which encodes the effects of the rotation on any vector as \( \vec{v}' = R \vec{v} \). (In technical terms, the set of \( R \) is a representation of the rotation group.) By definition, a rotation can't change the length of a vector, so we must have \( |\vec{v}'| = |\vec{v}| \) for any \( \vec{v} \) that we choose; this implies that \( R \) must be an orthogonal matrix,

\[ \begin{aligned} R^T R = R R^T = 1. \end{aligned} \]

This is the analogue of a unitary matrix for a system where we only have real numbers. Notice also that this equation implies \( 1 = \det(R R^T) = \det(R) \det(R^T) \), so \( \det(R) = \pm 1 \). Technically, a rotation must satisfy the further constraint \( \det(R) = 1 \); we'll see what the negative-determinant maps mean a bit later.

It's useful to recall the explicit form of the rotation matrix for rotation about a specific axis. For example, rotation by \( \phi \) about the \( z \)-axis is given by

\[ \begin{aligned} R_z(\phi) = \left( \begin{array}{ccc} \cos \phi & -\sin \phi & 0 \\ \sin \phi & \cos \phi & 0 \\ 0 & 0 & 1 \end{array} \right). \end{aligned} \]

The matrices for \( x \) and \( y \) are constructed in the same way, with a \( 1 \) in the appropriate spot to leave the axis of rotation invariant. The minus sign in the top right is a convention (corresponding to the usual right-hand rule). Any arbitrary \( R \) can be constructed as a product of at most three sequential rotations about different axes (you may be having flashbacks to Euler angles right about now). As I have pointed out before, rotations are non-commutative: the order in which we apply them matters, which is easy to verify experimentally with your textbook or cell phone.
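These properties are easy to check numerically. Here's a minimal sketch (using numpy; the function names are my own, not standard) verifying orthogonality, unit determinant, and the non-commutativity of rotations about different axes:

```python
import numpy as np

def R_z(phi):
    """Rotation by phi about the z-axis (right-hand rule)."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[c, -s, 0],
                     [s,  c, 0],
                     [0,  0, 1]])

def R_x(phi):
    """Rotation by phi about the x-axis."""
    c, s = np.cos(phi), np.sin(phi)
    return np.array([[1, 0,  0],
                     [0, c, -s],
                     [0, s,  c]])

phi = 0.7  # arbitrary test angle
R = R_z(phi)

# Orthogonality: R^T R = 1, and det(R) = +1 for a proper rotation
assert np.allclose(R.T @ R, np.eye(3))
assert np.isclose(np.linalg.det(R), 1.0)

# Non-commutativity: rotations about different axes don't commute
assert not np.allclose(R_x(phi) @ R_z(phi), R_z(phi) @ R_x(phi))
```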

In classical mechanics, one typically starts by writing down the expression for an arbitrary finite rotation, as we just have. However, in quantum mechanics we like to build symmetries up from infinitesimal operations and identify the generators. We can easily work backwards to see what a rotation by an infinitesimal angle \( \epsilon \) looks like:

\[ \begin{aligned} R_z(\epsilon) = \left( \begin{array}{ccc} 1 - \epsilon^2/2 & -\epsilon & 0 \\ \epsilon & 1 - \epsilon^2 / 2 & 0 \\ 0 & 0 & 1 \end{array} \right), \end{aligned} \]

and similarly for \( R_x(\epsilon) \) and \( R_y(\epsilon) \). Even for these infinitesimal rotations, we can see the non-commutativity of rotations: it's easy to work out that

\[ \begin{aligned} R_x(\epsilon) R_y(\epsilon) - R_y(\epsilon) R_x(\epsilon) = \left( \begin{array}{ccc} 0 & -\epsilon^2 & 0 \\ \epsilon^2 & 0 & 0 \\ 0 & 0 & 0 \end{array} \right). \end{aligned} \]

Notice that if we ignore terms at order \( \epsilon^2 \), the two matrices actually do commute. However, it's also interesting to notice that this looks similar to an infinitesimal rotation about the \( z \) axis. In fact, we can write

\[ \begin{aligned} R_x(\epsilon) R_y(\epsilon) - R_y(\epsilon) R_x(\epsilon) = R_z(\epsilon^2) - 1 + \mathcal{O}(\epsilon^3). \end{aligned} \]

So for infinitesimal rotations, the commutator of rotations about two perpendicular axes depends explicitly on a rotation about the third axis.
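We can confirm this commutator identity numerically for a small angle. A sketch with numpy (the two sides should agree up to small higher-order corrections in \( \epsilon \)):

```python
import numpy as np

def R_x(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[1, 0, 0], [0, c, -s], [0, s, c]])

def R_y(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, 0, s], [0, 1, 0], [-s, 0, c]])

def R_z(t):
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])

eps = 1e-3
lhs = R_x(eps) @ R_y(eps) - R_y(eps) @ R_x(eps)
rhs = R_z(eps**2) - np.eye(3)

# the two sides agree up to higher-order corrections in eps
assert np.max(np.abs(lhs - rhs)) < eps**3
```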

Now to quantum mechanics. For other operators we've considered so far, the algebra in the classical limit was simple, with everything commuting. However, in this case our quantum algebra must reproduce the classical non-commutativity of rotation, so we'll have to construct it from the beginning with the above structure in mind. We start by supposing that for any classical rotation \( R(\vec{n}, \phi) \) there exists a unitary operator \( \hat{U}(\vec{n}, \phi) \) which transforms a ket from the un-rotated coordinate system to the rotated one,

\[ \begin{aligned} \ket{\psi}_R = \hat{U}(\vec{n}, \phi) \ket{\psi}. \end{aligned} \]

As I've written explicitly, any single rotation can be specified by choosing a rotation axis \( \vec{n} \), and an angle of rotation \( \phi \). Taking the angle to be an infinitesimal \( d\phi \), we know that we can write the unitary \( \hat{U} \) in terms of some new Hermitian operator,

\[ \begin{aligned} \hat{U}(\vec{n}, \phi) = 1 - \frac{i}{\hbar} \mathcal{O}_{\vec{n}} d\phi + ... \end{aligned} \]

Of course, we don't want to have an infinite number of operators for each possible choice of rotation axis. If we suppose that like the classical theory, any quantum rotation can be decomposed into rotations about the coordinate axes, then we can hypothesize a vector operator \( \hat{\vec{J}} \) as the generator, and

\[ \begin{aligned} \hat{U}(\vec{n}, \phi) = 1 - \frac{i}{\hbar} (\hat{\vec{J}} \cdot \vec{n}) d\phi + ... \end{aligned} \]

As always, we can go backwards to rebuild a finite rotation from a large number of infinitesimal ones, finding e.g.

\[ \begin{aligned} \hat{U}(\hat{z}, \phi) = \exp \left( \frac{-i \hat{J}_z \phi}{\hbar} \right). \end{aligned} \]
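The "compounding infinitesimals" picture can be checked directly in the classical setting. Here's a sketch (numpy; variable names are my own) that multiplies together many infinitesimal \( z \)-rotations and recovers the finite rotation matrix:

```python
import numpy as np

# Generator of z-rotations, read off from the infinitesimal form:
# R_z(eps) = 1 + eps * G + O(eps^2)
G = np.array([[0., -1., 0.],
              [1.,  0., 0.],
              [0.,  0., 0.]])

phi, N = 1.2, 1_000_000
# Compound N infinitesimal rotations of angle phi/N each
U = np.linalg.matrix_power(np.eye(3) + (phi / N) * G, N)

c, s = np.cos(phi), np.sin(phi)
R_exact = np.array([[c, -s, 0.], [s, c, 0.], [0., 0., 1.]])

# (1 + (phi/N) G)^N converges to exp(phi G) = R_z(phi) as N grows
assert np.allclose(U, R_exact, atol=1e-5)
```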

Now we remember that we want to reproduce the classical commutation relations between finite rotation operators, one of which we derived above. Using the exponential form and expanding to order \( \epsilon^2 \), we find for the quantum theory

\[ \begin{aligned} \left[ 1 - \frac{i \hat{J}_x \epsilon}{\hbar} - \frac{\hat{J}_x{}^2 \epsilon^2}{2\hbar^2}, 1 - \frac{i \hat{J}_y \epsilon}{\hbar} - \frac{\hat{J}_y{}^2 \epsilon^2}{2\hbar^2} \right] = \left( 1 - \frac{i \hat{J}_z \epsilon^2}{\hbar} \right) - 1 \end{aligned} \]

The commutator is a mess at first glance, but since everything commutes with \( 1 \), all of the terms of order \( \epsilon^0 \) and \( \epsilon^1 \) vanish immediately, and the \( \epsilon^2 \) term becomes quite simple:

\[ \begin{aligned} [-i\hat{J}_x \epsilon / \hbar, -i \hat{J}_y \epsilon/\hbar] = i \hat{J}_z \epsilon^2 / \hbar \\ \Rightarrow [\hat{J}_x, \hat{J}_y] = i \hbar \hat{J}_z. \end{aligned} \]

There is nothing particularly special about the \( z \) axis, of course. We can repeat this argument for other combinations of rotations, and the only difference will be a possible minus sign. Gathering all of the results, we find that

\[ \begin{aligned} [\hat{J}_i, \hat{J}_j] = i \hbar \epsilon_{ijk} \hat{J}_k. \end{aligned} \]

This is the defining commutation relation for the operator \( \hat{\vec{J}} \), which we identify as the angular momentum operator, since it generates rotations in the same way that linear momentum generates translations.
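As a concrete check of this algebra, here is a sketch using the smallest nontrivial representation, built from the Pauli matrices (working in units where \( \hbar = 1 \); this is the spin-1/2 system that appears at the end of this section):

```python
import numpy as np

hbar = 1.0  # units where hbar = 1

# J_i = (hbar/2) * Pauli matrices: the smallest nontrivial representation
Jx = hbar / 2 * np.array([[0, 1], [1, 0]], dtype=complex)
Jy = hbar / 2 * np.array([[0, -1j], [1j, 0]])
Jz = hbar / 2 * np.array([[1, 0], [0, -1]], dtype=complex)

def comm(A, B):
    """Commutator [A, B] = AB - BA."""
    return A @ B - B @ A

# [J_x, J_y] = i hbar J_z, and cyclic permutations
assert np.allclose(comm(Jx, Jy), 1j * hbar * Jz)
assert np.allclose(comm(Jy, Jz), 1j * hbar * Jx)
assert np.allclose(comm(Jz, Jx), 1j * hbar * Jy)
```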

Eigenvalues and eigenstates of angular momentum

Let's see what the commutation relations imply about a system with angular momentum in full generality. There is a new operator we can construct by summing the squares of the individual angular-momentum operators,

\[ \begin{aligned} \hat{J}^2 = \hat{J}_x{}^2 + \hat{J}_y{}^2 + \hat{J}_z{}^2. \end{aligned} \]

This operator commutes with all of the individual angular-momentum operators,

\[ \begin{aligned} [\hat{J}{}^2, \hat{J}_k] = 0. \end{aligned} \]

This is easy to prove: first, notice that we can write the general commutator

\[ \begin{aligned} [\hat{J}_i{}^2, \hat{J}_j] = \hat{J}_i [\hat{J}_i, \hat{J}_j] + [\hat{J}_i, \hat{J}_j] \hat{J}_i \\ = \sum_k i \hbar \epsilon_{ijk} (\hat{J}_i \hat{J}_k + \hat{J}_k \hat{J}_i). \end{aligned} \]

For both \( i \) and \( j \) fixed, the epsilon symbol just gives us a single term \( k \neq i \neq j \) with either a plus or minus sign. However, if we sum over the first index \( i \), then we can use the antisymmetry of epsilon: for every term \( \epsilon_{ijk} \), there will be a term \( \epsilon_{kji} \) with opposite sign. So

\[ \begin{aligned} \sum_i [\hat{J}_i{}^2, \hat{J}_j] = [\hat{J}{}^2, \hat{J}_j] = 0. \end{aligned} \]

We can thus choose one of the three directions \( \hat{J}_i \) to diagonalize simultaneously with \( \hat{J}^2 \); by convention, we usually take \( \hat{J}_z \). (We of course can't choose more than one, since the other directions don't commute with each other.) Before we start to look at the common eigenstates of these operators, it's useful to define the combinations

\[ \begin{aligned} \hat{J}_{\pm} = \hat{J}_x \pm i \hat{J}_y, \end{aligned} \]

which (just like their SHO counterparts) are known as angular-momentum ladder operators or raising and lowering operators, a name which will make more sense shortly. Notice that these are not Hermitian operators; in fact, the Hermitian conjugate of one is the other,

\[ \begin{aligned} \hat{J}_+^\dagger = \hat{J}_-. \end{aligned} \]

Their commutation relations are easily derived:

\[ \begin{aligned} [\hat{J}_+, \hat{J}_-] = 2\hbar \hat{J}_z \\ [\hat{J}_z, \hat{J}_{\pm}] = \pm \hbar \hat{J}_{\pm} \\ [\hat{J}{}^2, \hat{J}_{\pm}] = 0. \end{aligned} \]
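These relations can also be verified on an explicit matrix representation. A sketch assuming the standard \( j=1 \) matrices in the \( \ket{j,m} \) basis (their explicit matrix elements aren't derived in these notes, so take them as given here; units \( \hbar = 1 \)):

```python
import numpy as np

hbar = 1.0
# j = 1 representation in the |j, m> basis, ordered m = +1, 0, -1
Jz = hbar * np.diag([1.0, 0.0, -1.0]).astype(complex)
Jp = hbar * np.sqrt(2) * np.array([[0, 1, 0],
                                   [0, 0, 1],
                                   [0, 0, 0]], dtype=complex)
Jm = Jp.conj().T          # J_- is the adjoint of J_+
Jx = (Jp + Jm) / 2
Jy = (Jp - Jm) / (2 * 1j)
J2 = Jx @ Jx + Jy @ Jy + Jz @ Jz

def comm(A, B):
    return A @ B - B @ A

# ladder-operator commutation relations
assert np.allclose(comm(Jp, Jm), 2 * hbar * Jz)
assert np.allclose(comm(Jz, Jp), hbar * Jp)
assert np.allclose(comm(Jz, Jm), -hbar * Jm)
assert np.allclose(comm(J2, Jp), np.zeros((3, 3)))
```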

Now let's tackle the eigenvalue problem. Usually, the action of these operators and the nature of their eigenkets are just handed down from on high; I find it much more enlightening to go through the derivation as Sakurai does. So let's for the moment assume that the eigenvalues of \( \hat{J}{}^2 \) and \( \hat{J}_z \) are completely arbitrary, and label them \( a \) and \( b \):

\[ \begin{aligned} \hat{J}{}^2 \ket{a,b} = a \ket{a,b} \\ \hat{J}_z \ket{a,b} = b \ket{a,b}. \end{aligned} \]

The ladder operators have a familiar-looking effect on these eigenstates:

\[ \begin{aligned} \hat{J}_z (\hat{J}_{\pm} \ket{a,b}) = ([\hat{J}_z, \hat{J}_{\pm}] + \hat{J}_{\pm} \hat{J}_z) \ket{a,b} \\ = (b \pm \hbar) (\hat{J}_{\pm} \ket{a,b}). \end{aligned} \]

So in terms of the \( \hat{J}_z \) eigenvalue, the ladder operators give another \( \hat{J}_z \) eigenstate, with the eigenvalue raised or lowered by \( \hbar \). On the other hand, since the ladder operators commute with \( \hat{J}^2 \), they don't affect that eigenvalue:

\[ \begin{aligned} \hat{J}{}^2 (\hat{J}_{\pm} \ket{a,b}) = a (\hat{J}_{\pm} \ket{a,b}). \end{aligned} \]

Unlike the simple harmonic oscillator, it turns out that we can't continue raising or lowering the \( \hat{J}_z \) eigenvalue forever. To see why, notice that we can rewrite the difference

\[ \begin{aligned} \hat{J}{}^2 - \hat{J}_z{}^2 = \frac{1}{2} (\hat{J}_+ \hat{J}_- + \hat{J}_- \hat{J}_+) \\ = \frac{1}{2} ( \hat{J}_+ \hat{J}_+^\dagger + \hat{J}_+^\dagger \hat{J}_+). \end{aligned} \]

If we take the expectation value, we find that it must be non-negative:

\[ \begin{aligned} \ev{\hat{J}_+^\dagger \hat{J}_+} = \bra{a,b} \hat{J}_+^\dagger \hat{J}_+ \ket{a,b} = \sprod{a,b+\hbar}{a,b+\hbar} \geq 0, \end{aligned} \]

(remember that all states in our Hilbert space have non-negative norm!) The same is true for the other ordering, \( \hat{J}_+ \hat{J}_+^\dagger = \hat{J}_-^\dagger \hat{J}_- \), so we have immediately that

\[ \begin{aligned} \bra{a,b} (\hat{J}{}^2 - \hat{J}_z{}^2) \ket{a,b} = a - b^2 \geq 0. \end{aligned} \]

So we find both an upper and a lower limit on the eigenvalue of \( \hat{J}_z \); its square can't be larger than the eigenvalue \( a \) of \( \hat{J}{}^2 \). This means that for any \( a \), there must be a maximum eigenvalue \( b_{\textrm{max}} \) such that

\[ \begin{aligned} \hat{J}_+ \ket{a,b_{\textrm{max}}} = 0. \end{aligned} \]

Now, we can apply the lowering operator to \( 0 \) and we still have \( 0 \):

\[ \begin{aligned} \hat{J}_- \hat{J}_+ \ket{a,b_{\textrm{max}}} = 0. \end{aligned} \]

But this is an interesting combination of operators:

\[ \begin{aligned} \hat{J}_- \hat{J}_+ = (\hat{J}_x - i\hat{J}_y) (\hat{J}_x + i\hat{J}_y) \\ = \hat{J}_x{}^2 + \hat{J}_y{}^2 -i [\hat{J}_y, \hat{J}_x] \\ = \hat{J}{}^2 - \hat{J}_z{}^2 - \hbar \hat{J}_z. \end{aligned} \]

Thus,

\[ \begin{aligned} (\hat{J}{}^2 - \hat{J}_z{}^2 - \hbar \hat{J}_z) \ket{a,b_{\textrm{max}}} = (a - b_{\textrm{max}}^2 - \hbar b_{\textrm{max}}) \ket{a,b_{\textrm{max}}} = 0. \end{aligned} \]

By assumption, the eigenket \( \ket{a, b_{\textrm{max}}} \) certainly exists and is non-null, so the only way this can be true is if

\[ \begin{aligned} a = b_{\textrm{max}} (b_{\textrm{max}} + \hbar). \end{aligned} \]
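The operator identity \( \hat{J}_- \hat{J}_+ = \hat{J}{}^2 - \hat{J}_z{}^2 - \hbar \hat{J}_z \) used in this step is easy to verify on any concrete representation. A sketch with the spin-1/2 matrices (units \( \hbar = 1 \)):

```python
import numpy as np

hbar = 1.0
# spin-1/2 matrices as a concrete test case
Jx = hbar / 2 * np.array([[0, 1], [1, 0]], dtype=complex)
Jy = hbar / 2 * np.array([[0, -1j], [1j, 0]])
Jz = hbar / 2 * np.array([[1, 0], [0, -1]], dtype=complex)
Jp = Jx + 1j * Jy
Jm = Jx - 1j * Jy
J2 = Jx @ Jx + Jy @ Jy + Jz @ Jz

# J_- J_+ = J^2 - J_z^2 - hbar J_z
assert np.allclose(Jm @ Jp, J2 - Jz @ Jz - hbar * Jz)
```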

A similar argument at the other end yields a minimum possible \( b \), and

\[ \begin{aligned} a = b_{\textrm{min}} (b_{\textrm{min}} - \hbar). \end{aligned} \]

So by comparison, \( b_{\textrm{min}} = -b_{\textrm{max}} \); the other root of the quadratic, \( b_{\textrm{min}} = b_{\textrm{max}} + \hbar \), is ruled out since we need \( b_{\textrm{min}} \leq b_{\textrm{max}} \). Moreover, for all of this to be self-consistent we must be able to get from the minimum state to the maximum, and vice-versa, by successive use of the ladder operators. Thus, we find that

\[ \begin{aligned} b_{\textrm{max}} = b_{\textrm{min}} + n\hbar \end{aligned} \]

for some integer \( n \), and therefore

\[ \begin{aligned} b_{\textrm{max}} = \frac{n\hbar}{2}. \end{aligned} \]

The integer \( n \) is fixed by the eigenvalue of \( \hat{J}^2 \), which must satisfy

\[ \begin{aligned} a = \frac{1}{4} \hbar^2 n(n+2). \end{aligned} \]

Physically, you can think about what's happening here in fairly simple terms. Since \( \hat{J}^2 \) is in some sense the "length" of the angular momentum vector operator \( \hat{\vec{J}} \), the "length" \( \hat{J}_z^2 \) of one component of the vector can't be greater than the total vector length, hence the minimum and maximum conditions on \( b \). Combining this with the angular-momentum commutation relations then leads to quantization, with only certain values of the angular momentum allowed.

Now that we've done the derivation, let's switch to the standard notation. We label our states by two numbers \( j \) and \( m \), which may be integers or half-integers; in particular, from our derivation above we identify \( j=n/2 \) as the \( \hat{J}^2 \) label, and \( m = b/\hbar \) as the \( \hat{J}_z \) label. The eigenvalue equations are then

\[ \begin{aligned} \hat{J}{}^2 \ket{j,m} = \hbar^2 j(j+1) \ket{j,m} \\ \hat{J}_z \ket{j,m} = \hbar m \ket{j,m}, \end{aligned} \]

and the allowed values of \( m \) are determined by \( j \),

\[ \begin{aligned} m = \{-j, -j+1, ..., j-1, j\}. \end{aligned} \]

If \( j \) is an integer, then \( m \) always is too; if \( j \) is half-integer, then so is \( m \). The dimensionality of the space, i.e. the number of independent kets, is always \( 2j+1 \). So a particle carrying intrinsic angular momentum with two possible states is "spin-1/2", because \( 2j+1 = 2 \) requires \( j=1/2 \). Incidentally, \( j=1 \) is three-dimensional, i.e. vector-valued; we will see that the fact that light polarization can be represented as a vector is related to the fact that photons have spin 1.
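Everything in this section can be summarized by constructing the \( (2j+1) \)-dimensional matrices explicitly and checking the eigenvalue equations. A sketch in numpy, assuming the standard ladder-operator matrix elements \( \bra{j,m\pm 1} \hat{J}_{\pm} \ket{j,m} = \hbar\sqrt{j(j+1) - m(m\pm 1)} \) (a result not derived above):

```python
import numpy as np

hbar = 1.0

def angular_momentum_matrices(j):
    """J_z and J_+ in the |j, m> basis, ordered m = j, j-1, ..., -j.
    Assumes the standard matrix elements
    <j, m+1| J_+ |j, m> = hbar * sqrt(j(j+1) - m(m+1))."""
    dim = int(round(2 * j + 1))
    m = j - np.arange(dim)                 # m values, descending
    Jz = hbar * np.diag(m).astype(complex)
    Jp = np.zeros((dim, dim), dtype=complex)
    for k in range(1, dim):                # J_+ raises state k to state k-1
        Jp[k - 1, k] = hbar * np.sqrt(j * (j + 1) - m[k] * (m[k] + 1))
    return Jz, Jp

for j in [0.5, 1, 1.5, 2]:
    Jz, Jp = angular_momentum_matrices(j)
    Jm = Jp.conj().T
    Jx, Jy = (Jp + Jm) / 2, (Jp - Jm) / (2 * 1j)
    J2 = Jx @ Jx + Jy @ Jy + Jz @ Jz
    dim = int(round(2 * j + 1))
    # J^2 = hbar^2 j(j+1) times the identity on the whole multiplet
    assert np.allclose(J2, hbar**2 * j * (j + 1) * np.eye(dim))
    # J_z eigenvalues are hbar*m for m = -j, ..., j: 2j+1 states in all
    assert np.allclose(np.sort(np.diag(Jz).real), hbar * (np.arange(dim) - j))
```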