Operators and More on Hilbert Spaces

Today, we'll finish up our mathematical mystery tour with operators. First, let's finish the proof of the Cauchy-Schwarz inequality,

\[ \begin{aligned} \sprod{\alpha}{\alpha} \sprod{\beta}{\beta} \geq |\sprod{\alpha}{\beta}|^2. \end{aligned} \]

Last time, I left you with the task of proving this in general; as a hint, I showed you the proof in ordinary \( \mathbb{R}^n \) coordinate space, which relies on the angle between the two vectors. The fact that the inequality is saturated if the two vectors point in the same direction hints that even in a general Hilbert space, we should look at the projection of one vector along the other.

To complete the proof, the best place to start is by constructing the component of \( \ket{\alpha} \) which is orthogonal to \( \ket{\beta} \):

\[ \begin{aligned} \ket{\alpha_\perp} = \ket{\alpha} - \frac{\sprod{\beta}{\alpha}}{\sprod{\beta}{\beta}} \ket{\beta}. \end{aligned} \]

We can verify that the inner product with \( \ket{\beta} \) is zero using linearity:

\[ \begin{aligned} \sprod{\beta}{\alpha_\perp} = \bra{\beta} \left(\ket{\alpha} - \frac{\sprod{\beta}{\alpha}}{\sprod{\beta}{\beta}} \ket{\beta}\right) \\ = \sprod{\beta}{\alpha} - \frac{\sprod{\beta}{\alpha}}{\sprod{\beta}{\beta}} \sprod{\beta}{\beta} = 0. \end{aligned} \]

The squared norm of \( \ket{\alpha_\perp} \) is

\[ \begin{aligned} \sprod{\alpha_\perp}{\alpha_\perp} = \left(\bra{\alpha} - \frac{\sprod{\alpha}{\beta}}{\sprod{\beta}{\beta}} \bra{\beta} \right) \left(\ket{\alpha} - \frac{\sprod{\beta}{\alpha}}{\sprod{\beta}{\beta}} \ket{\beta}\right) \\ = \sprod{\alpha}{\alpha} + \frac{\sprod{\alpha}{\beta} \sprod{\beta}{\alpha}}{\sprod{\beta}{\beta}} - 2 \frac{\sprod{\alpha}{\beta} \sprod{\beta}{\alpha}}{\sprod{\beta}{\beta}} \\ = \sprod{\alpha}{\alpha} - \frac{|\sprod{\alpha}{\beta}|^2}{\sprod{\beta}{\beta}}, \end{aligned} \]

or, multiplying through by \( \sprod{\beta}{\beta} \) and reorganizing,

\[ \begin{aligned} \sprod{\alpha}{\alpha} \sprod{\beta}{\beta} = |\sprod{\alpha}{\beta}|^2 + \sprod{\alpha_\perp}{\alpha_\perp} \sprod{\beta}{\beta}. \end{aligned} \]

This completes the proof: since the squared norm \( \sprod{\alpha_\perp}{\alpha_\perp} \) is non-negative, the left-hand side is always greater than or equal to \( |\sprod{\alpha}{\beta}|^2 \). Moreover, the inequality is saturated only if \( \sprod{\alpha_\perp}{\alpha_\perp} = 0 \), which means \( \ket{\alpha_\perp} = 0 \), which by construction means that \( \ket{\alpha} \) and \( \ket{\beta} \) are proportional, i.e. they point in the same direction.
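If you'd like to see this decomposition concretely, here is a quick NumPy sketch (my own illustration, not part of the formalism; the helper `inner` is just shorthand I'm introducing for the complex inner product). It builds \( \ket{\alpha_\perp} \) for random complex vectors and checks both the orthogonality and the resulting inequality:

```python
import numpy as np

rng = np.random.default_rng(0)

# Random complex vectors standing in for |alpha> and |beta> in C^5.
alpha = rng.normal(size=5) + 1j * rng.normal(size=5)
beta = rng.normal(size=5) + 1j * rng.normal(size=5)

def inner(x, y):
    """Inner product <x|y>; np.vdot conjugates its first argument."""
    return np.vdot(x, y)

# Component of |alpha> orthogonal to |beta>.
alpha_perp = alpha - (inner(beta, alpha) / inner(beta, beta)) * beta

print(abs(inner(beta, alpha_perp)))  # ~ 0: orthogonality

lhs = inner(alpha, alpha) * inner(beta, beta)
rhs = abs(inner(alpha, beta))**2 + inner(alpha_perp, alpha_perp) * inner(beta, beta)
print(np.isclose(lhs, rhs))                     # True: the decomposition identity
print(lhs.real >= abs(inner(alpha, beta))**2)   # True: Cauchy-Schwarz
```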

Operators

Hilbert space, on its own, is in fact pretty boring from a mathematical point of view! It can be proved that the only number you really need to describe a Hilbert space is its dimension; all finite-dimensional Hilbert spaces of the same dimension are isomorphic, and so are all of the infinite-dimensional ones (roughly.) What will be interesting is what happens within Hilbert space: how vectors change into other vectors.

From undergraduate QM, we know that measurement or time-evolution involve operations which change the wavefunction. To deal with this in our present notation, we need to introduce operators, objects which map kets into different kets in the same Hilbert space,

\[ \begin{aligned} \hat{O}: \mathcal{H} \rightarrow \mathcal{H}. \end{aligned} \]

A general operator is denoted by a hat and usually with a capital letter, i.e. \( \hat{A},\hat{B},\hat{C} \); the action of an operator on a ket is denoted \( \hat{X} \ket{\alpha} \). (You won't find the hats in Sakurai, but it's standard notation in most other references.)

General properties of operators are:

  1. Equality: \( \hat{A} = \hat{B} \) if \( \hat{A}\ket{\alpha} = \hat{B}\ket{\alpha} \) for all \( \ket{\alpha} \).
  2. Null operator: There exists a special operator \( \hat{0} \) so that \( \hat{0} \ket{\alpha} = \ket{\emptyset} \) for all \( \ket{\alpha} \).
  3. Identity: There exists another special operator \( \hat{1} \), satisfying \( \hat{1} \ket{\alpha} = \ket{\alpha} \) for all \( \ket{\alpha} \).
  4. Addition: \( \hat{A}+\hat{B} \) is another operator satisfying \( \hat{A}+\hat{B}=\hat{B}+\hat{A} \) and \( \hat{A}+(\hat{B}+\hat{C}) = (\hat{A}+\hat{B})+\hat{C} \).
  5. Scalar multiplication: \( c\hat{A} \) is an operator satisfying \( (c\hat{A}) \ket{\alpha} = \hat{A} (c \ket{\alpha}) \).
  6. Multiplication: \( \hat{A}\hat{B} \) is an operator, where \( (\hat{A}\hat{B}) \ket{\alpha} = \hat{A} (\hat{B} \ket{\alpha}) \).

A linear operator can be distributed over linear combinations of kets, i.e.

\[ \begin{aligned} \hat{A} (c_\alpha \ket{\alpha} + c_\beta \ket{\beta}) = c_\alpha \hat{A} \ket{\alpha} + c_\beta \hat{A} \ket{\beta}. \end{aligned} \]

Almost every operator we will encounter in this class is linear; you can assume linearity unless told otherwise. Once again, if we specialize back to the familiar example of \( \mathbb{R}^3 \) as our Hilbert space, linear operators are nothing more than \( 3 \times 3 \) matrices.

Just like matrices, notice that in general, the order of the operators matters when we multiply them: \( \hat{X}\hat{Y} \neq \hat{Y}\hat{X} \). The difference between the two orderings has a special symbol called the commutator:

\[ \begin{aligned} [\hat{A},\hat{B}] \equiv \hat{A}\hat{B} - \hat{B}\hat{A}. \end{aligned} \]

Two operators \( \hat{A},\hat{B} \) are said to commute if \( [\hat{A},\hat{B}]=0 \) (and then we can put them in any order.) Commutators will be appearing a lot in quantum mechanics, so it's worth a short detour into some of their properties. It's not too difficult to prove the following identities:

\[ \begin{aligned} [\hat{A},\hat{A}] = 0 \\ [\hat{A},\hat{B}] = -[\hat{B},\hat{A}] \\ [\hat{A}+\hat{B}, \hat{C}] = [\hat{A},\hat{C}] + [\hat{B},\hat{C}] \\ [\hat{A}, \hat{B}\hat{C}] = [\hat{A},\hat{B}]\hat{C} + \hat{B}[\hat{A},\hat{C}] \\ [\hat{A}\hat{B}, \hat{C}] = \hat{A}[\hat{B},\hat{C}] + [\hat{A},\hat{C}]\hat{B} \\ [\hat{A},[\hat{B},\hat{C}]] + [\hat{B},[\hat{C},\hat{A}]] + [\hat{C},[\hat{A},\hat{B}]] = 0 \end{aligned} \]

This last equation is called the Jacobi identity (the rest of them are too obvious to be named after anyone, it seems.)
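If you want to convince yourself of these identities without grinding through the algebra, here's a small NumPy check (the function `comm` is just my shorthand for the commutator), using random complex matrices as stand-ins for the operators:

```python
import numpy as np

rng = np.random.default_rng(1)

def comm(X, Y):
    """Commutator [X, Y] = XY - YX."""
    return X @ Y - Y @ X

# Random complex matrices standing in for the operators A, B, C.
A, B, C = (rng.normal(size=(4, 4)) + 1j * rng.normal(size=(4, 4)) for _ in range(3))

# Product rule: [A, BC] = [A, B] C + B [A, C]
print(np.allclose(comm(A, B @ C), comm(A, B) @ C + B @ comm(A, C)))

# Jacobi identity: [A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0
jacobi = comm(A, comm(B, C)) + comm(B, comm(C, A)) + comm(C, comm(A, B))
print(np.allclose(jacobi, 0))
```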

Some (but not all!) operators \( \hat{A} \) have an associated inverse operator \( \hat{A}^{-1} \), which undoes the operation and gives back the original ket:

\[ \begin{aligned} \hat{A}{}^{-1} \hat{A} \ket{\alpha} = \hat{1} \ket{\alpha} \end{aligned} \]

It's straightforward to show that if it exists, the inverse of \( \hat{A} \) is unique, and it is both a left and right inverse, i.e. \( \hat{A} \hat{A}^{-1} = \hat{A}^{-1} \hat{A} = \hat{1} \).

Equipped with the inner product from above, we can define the adjoint (sometimes, Hermitian adjoint) of an operator, denoted with a dagger, by the statement

\[ \begin{aligned} (\ket{\alpha}, \hat{A} \ket{\beta}) = (\hat{A}{}^\dagger \ket{\alpha}, \ket{\beta}) \end{aligned} \]

for any choice of \( \ket{\alpha} \) and \( \ket{\beta} \). I've written this out using the inner product explicitly; it's a little confusing in bra-ket notation, since operators always "act to the right" as we've defined them, i.e. map kets to kets. But if we know that \( \hat{A} \ket{\alpha} = \ket{\beta} \), then the corresponding bra is \( \bra{\beta} = \bra{\alpha} \hat{A}^\dagger \), now with the adjoint operator "acting to the left" on the bra. (Once again, it may help you to think of \( \hat{A} \) as a matrix if this is confusing, with kets as right-multiplied column vectors and bras as left-multiplied conjugate-transpose row vectors.)

If we are working with the finite-dimensional Hilbert space \( \mathbb{C}^n \), where \( \hat{A} \) is an \( n \times n \) matrix, then its adjoint is exactly the conjugate transpose matrix: \( (A^\dagger)_{ij} = A_{ji}^\star \). The adjoint has the following properties when combined with some of the operations above:

\[ \begin{aligned} (c\hat{A})^\dagger = c^\star \hat{A}{}^\dagger \\ (\hat{A}+\hat{B})^\dagger = \hat{A}{}^\dagger + \hat{B}{}^\dagger \\ (\hat{A}\hat{B})^\dagger = \hat{B}{}^\dagger \hat{A}{}^\dagger \end{aligned} \]
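As a concrete check, here's a short NumPy sketch (again, just an illustration) verifying the defining relation of the adjoint and the product rule above, treating the operators as random complex matrices on \( \mathbb{C}^n \); the helper `dag` is my shorthand for the conjugate transpose:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 4

A = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
B = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
alpha = rng.normal(size=n) + 1j * rng.normal(size=n)
beta = rng.normal(size=n) + 1j * rng.normal(size=n)

def dag(M):
    """Adjoint of a matrix: the conjugate transpose."""
    return M.conj().T

# Defining relation: (|alpha>, A |beta>) = (A^dagger |alpha>, |beta>)
print(np.isclose(np.vdot(alpha, A @ beta), np.vdot(dag(A) @ alpha, beta)))

# (AB)^dagger = B^dagger A^dagger
print(np.allclose(dag(A @ B), dag(B) @ dag(A)))
```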

In quantum mechanics, an important class of operators are those that are self-adjoint, i.e.

\[ \begin{aligned} \hat{A}{}^\dagger = \hat{A}. \end{aligned} \]

These operators are also known as Hermitian operators. Another important class of operators are those whose adjoint gives the inverse operator,

\[ \begin{aligned} \hat{U}{}^\dagger = \hat{U}{}^{-1}. \end{aligned} \]

Such operators are called unitary. A unitary operator acts like a rotation in our Hilbert space, in particular preserving lengths (norms); if \( \ket{\beta} = \hat{U} \ket{\alpha} \), then \( \sprod{\alpha}{\alpha} = \sprod{\beta}{\beta} \). It won't surprise you that unitary operators will be at the center of how we change coordinate systems, but I'll defer that discussion for now.
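Here's a quick numerical illustration of that statement: a random unitary matrix (built here from a QR decomposition, one convenient trick for generating them) preserves the norm of any vector it acts on.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 4

# A random unitary: the Q factor of a QR decomposition of a complex matrix.
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
U, _ = np.linalg.qr(M)

print(np.allclose(U.conj().T @ U, np.eye(n)))   # U^dagger U = 1

alpha = rng.normal(size=n) + 1j * rng.normal(size=n)
beta = U @ alpha
print(np.isclose(np.vdot(alpha, alpha), np.vdot(beta, beta)))  # norm is preserved
```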

Eigenkets and eigenvalues

As you'll recall from linear algebra, we can gain an enormous amount of insight into a matrix by studying its eigenvectors and eigenvalues, and our present situation is no different. We identify the eigenkets of operator \( \hat{A} \) as those kets which are left invariant up to a scalar multiplication:

\[ \begin{aligned} \hat{A} \ket{a} = a \ket{a}. \end{aligned} \]

The scalar \( a \) is the eigenvalue associated with eigenket \( \ket{a} \): it's often conventional to use the eigenvalue as a label for the eigenket.

Now we're in a position to see why Hermitian operators are so interesting. Suppose \( \hat{A} \) is Hermitian, and we have identified two (non-zero) eigenstates \( \ket{a_1} \) and \( \ket{a_2} \). Then we can construct two inner products that are equal:

\[ \begin{aligned} \bra{a_1} \hat{A} \ket{a_2} = \left( \bra{a_2} \hat{A} \ket{a_1} \right)^\star \end{aligned} \]

applying conjugate symmetry of the inner product and the definition of the adjoint. Since we're working with eigenstates of \( \hat{A} \), this becomes

\[ \begin{aligned} a_2 \sprod{a_1}{a_2} = a_1^\star \sprod{a_2}{a_1}^\star \\ (a_2 - a_1^\star) \sprod{a_1}{a_2} = 0. \end{aligned} \]

First, notice that if we choose \( \ket{a_2} = \ket{a_1} \), we immediately have

\[ \begin{aligned} (a_1 - a_1^\star) \sprod{a_1}{a_1} = 2i\, \textrm{Im}(a_1) \sprod{a_1}{a_1} = 0. \end{aligned} \]

Since by assumption \( \ket{a_1} \) has non-zero norm, we immediately see that all eigenvalues of a Hermitian operator are real.

Now, we go back and assume \( a_1 \neq a_2 \); then the only way to satisfy the equation is for the inner product to vanish, which means that \( \ket{a_1} \) and \( \ket{a_2} \) are orthogonal. So we've also proved that all eigenvectors of a Hermitian operator with distinct eigenvalues are orthogonal.

(You may ask: what about operators with multiple eigenkets that yield the same eigenvalue? In this case, the states which share an eigenvalue are said to be degenerate. The proof requires a little more work, but it's still possible to construct a complete set of eigenvectors which are mutually orthogonal. We'll return to deal with degenerate systems later on.)
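As a numerical illustration of both results, here's a short NumPy sketch: a random Hermitian matrix has real eigenvalues, and `numpy.linalg.eigh` returns a complete orthonormal set of eigenvectors.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4

# A random Hermitian matrix: M + M^dagger is always Hermitian.
M = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
A = M + M.conj().T

eigvals, eigvecs = np.linalg.eigh(A)   # columns of eigvecs are the eigenkets

print(np.allclose(eigvals.imag, 0))                        # eigenvalues are real
print(np.allclose(eigvecs.conj().T @ eigvecs, np.eye(n)))  # eigenkets are orthonormal
```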

Basis kets and matrix representation

Let's return to the idea of a basis, which I introduced briefly above. A basis is a set of linearly independent kets which spans the Hilbert space; the number of basis kets equals the dimension of the space itself. (Infinite bases are a little trickier to think about, but most everything I say about bases will apply to them too.) If \( \{\ket{e_n}\} \) is a basis, then any ket in the Hilbert space can be written as a linear combination of the basis kets,

\[ \begin{aligned} \ket{\alpha} = \sum_n \alpha_n \ket{e_n}. \end{aligned} \]

In fact, for a Hilbert space we can always find an orthonormal basis, that is, a set of basis kets which are mutually orthogonal and have norm 1. If we suppose that \( \{\ket{e_n}\} \) is such a basis, then we can take the inner product of the expansion with another basis ket:

\[ \begin{aligned} \sprod{e_m}{\alpha} = \sum_n \alpha_n \sprod{e_m}{e_n} = \alpha_m. \end{aligned} \]

We can rewrite \( \ket{\alpha} \) suggestively:

\[ \begin{aligned} \ket{\alpha} = \sum_n \left( \ket{e_n} \bra{e_n}\right) \ket{\alpha}. \end{aligned} \]

In other words, the projection of \( \ket{\alpha} \) onto one of the basis vectors \( \ket{e_n} \) is given by acting on \( \ket{\alpha} \) with the object

\[ \begin{aligned} \hat{\Lambda}_n = \ket{e_n} \bra{e_n}. \end{aligned} \]

This is a special operator, known as the projection operator along ket \( \ket{e_n} \). It's also a particular example of a new type of product, called the outer product, which like the inner product is composed of a bra and a ket, but now in reverse order. For a general outer product \( \ket{\alpha} \bra{\beta} \), if we multiply by a ket \( \ket{\gamma} \) on the right, then we find another ket times a constant:

\[ \begin{aligned} \ket{\alpha} \sprod{\beta}{\gamma} = c \ket{\alpha}. \end{aligned} \]

Similarly, multiplying by a bra on the left gives us another bra (times a constant.) So we recognize that the outer product is always an operator.

Going back to our expansion of the ket \( \ket{\alpha} \) above, we can read off an extremely useful identity: summing the projection operators over the entire basis and acting on \( \ket{\alpha} \) just gives back \( \ket{\alpha} \) itself, which means that

\[ \begin{aligned} \sum_n \ket{e_n} \bra{e_n} = \hat{1}. \end{aligned} \]

This is an extremely useful relation, and you should circle it in your notes! This identity is called completeness or closure, and replacing \( \hat{1} \) with a sum over some basis states is typically called "inserting a complete set of states." The key word is complete - remember that this relation only holds if the \( \{\ket{e_n}\} \) form a basis!
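To see completeness in action numerically, here's a sketch that takes the columns of a random unitary matrix as an orthonormal basis of \( \mathbb{C}^n \), checks that the projectors sum to the identity, and rebuilds an arbitrary ket from its expansion coefficients \( \alpha_n = \sprod{e_n}{\alpha} \):

```python
import numpy as np

rng = np.random.default_rng(5)
n = 4

# An orthonormal basis of C^n: the columns of any unitary matrix will do.
Q, _ = np.linalg.qr(rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n)))
basis = [Q[:, k] for k in range(n)]

# Completeness: sum_n |e_n><e_n| is the identity operator.
resolution = sum(np.outer(e, e.conj()) for e in basis)
print(np.allclose(resolution, np.eye(n)))

# The expansion coefficients alpha_n = <e_n|alpha> rebuild the original ket.
alpha = rng.normal(size=n) + 1j * rng.normal(size=n)
coeffs = [np.vdot(e, alpha) for e in basis]
rebuilt = sum(c * e for c, e in zip(coeffs, basis))
print(np.allclose(rebuilt, alpha))
```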

Working in terms of a basis, we can take the idea of representing Hilbert space as an ordinary set of vectors and matrices more seriously. For a Hilbert space of finite dimension \( N \), we can represent any state \( \ket{\psi} \) as a column vector over a chosen set of basis vectors \( \ket{e_n} \):

\[ \begin{aligned} \ket{\psi} = \sum_n \psi_n \ket{e_n} = \left(\begin{array}{c} \psi_1 \\ \psi_2 \\ \psi_3 \\ ... \\ \psi_N \end{array} \right) \end{aligned} \]

The associated bra \( \bra{\psi} \) is just the conjugate transpose, i.e. a row vector whose entries are the complex conjugates of the \( \psi_n \)'s,

\[ \begin{aligned} \bra{\psi} = \sum_n \psi_n^\star \bra{e_n} = \left(\psi_1^\star \psi_2^\star ... \psi_N^\star \right). \end{aligned} \]

With this notation, you can verify that the ordinary vector product gives us the correct inner product:

\[ \begin{aligned} \sprod{\chi}{\psi} = \sum_{i} \chi_i^\star \psi_i = \left(\chi_1^\star \chi_2^\star ... \chi_N^\star \right) \left(\begin{array}{c} \psi_1 \\ \psi_2 \\ \psi_3 \\ ... \\ \psi_N \end{array} \right). \end{aligned} \]

(Note that \( \chi_i \) are the elements of the ket \( \ket{\chi} \), hence the complex conjugation.) Multiplying column times row in the usual vector outer product also gives us what we expect:

\[ \begin{aligned} (\ket{\psi}\bra{\chi})_{ij} = \psi_i \chi^\star_j = \left(\begin{array}{c} \psi_1 \\ \psi_2 \\ \psi_3 \\ ... \\ \psi_N \end{array} \right) \left(\chi_1^\star \chi_2^\star ... \chi_N^\star \right). \end{aligned} \]

The result of this outer product is an \( N \times N \) matrix; for example, the projector onto the first basis state is

\[ \begin{aligned} \ket{e_1}\bra{e_1} = \left(\begin{array}{c} 1 \\ 0 \\ 0 \\ ... \end{array} \right) \left( 1 0 0 ... \right) = \left(\begin{array}{cccc} 1 & 0 & ... & 0 \\ 0 & 0 & ... & 0 \\ 0 & 0 & ... & 0 \\ 0 & 0 & ... & 0 \end{array} \right) \end{aligned} \]
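In NumPy, this column-times-row product is exactly `np.outer`; here is a tiny example producing an \( N \times N \) matrix from an outer product, along with the projector onto the first basis state:

```python
import numpy as np

psi = np.array([1 + 2j, 3j, 0.5])
chi = np.array([2, 1 - 1j, 4j])

# Outer product |psi><chi|: an N x N matrix with entries psi_i * conj(chi_j).
P = np.outer(psi, chi.conj())
print(P.shape)   # (3, 3)

# Projector onto the first basis state, |e_1><e_1|.
e1 = np.array([1, 0, 0], dtype=complex)
print(np.outer(e1, e1.conj()))
```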

It won't surprise you to learn that any operator \( \hat{X} \) can be similarly represented as an \( N \times N \) matrix. To derive the form of the matrix, we "insert a complete set of states" twice:

\[ \begin{aligned} \hat{X} = \sum_m \sum_n \ket{e_m} \bra{e_m} \hat{X} \ket{e_n} \bra{e_n} \end{aligned} \]

Each combination of basis vectors \( \ket{e_m} \bra{e_n} \) will fill one of the \( N^2 \) components, with \( m \) labelling the row and \( n \) the column, so that we can represent \( \hat{X} \) as the matrix

\[ \begin{aligned} \hat{X} \rightarrow \left( \begin{array}{ccc} \bra{e_1} \hat{X} \ket{e_1} & \bra{e_1} \hat{X} \ket{e_2} & ... \\ \bra{e_2} \hat{X} \ket{e_1} & \bra{e_2} \hat{X} \ket{e_2} & ... \\ ...& ... & ... \end{array}\right) \end{aligned} \]

Notice that I used an arrow and not an equals sign, to remind us that the form of this matrix depends not only on the operator \( \hat{X} \), but on the basis \( \{\ket{e_n}\} \) that we work in. The entries of the matrix \( \bra{e_m} \hat{X} \ket{e_n} \) are known as matrix elements.

Note that if we use the eigenvectors of a Hermitian operator \( \hat{X} \) as our basis, it's easy to see that the matrix representing \( \hat{X} \) is diagonal, and its entries are the eigenvalues. When we return to coordinate rotations and changes of basis, we'll see how powerful this observation can be (diagonal matrices are nice things.)
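Here's a numerical version of that observation, as a sketch: computing the matrix elements \( \bra{e_m} \hat{X} \ket{e_n} \) of a Hermitian matrix in its own eigenbasis gives a diagonal matrix with the eigenvalues along the diagonal.

```python
import numpy as np

rng = np.random.default_rng(6)
dim = 4

# A Hermitian operator X, written first in some arbitrary basis.
M = rng.normal(size=(dim, dim)) + 1j * rng.normal(size=(dim, dim))
X = M + M.conj().T

eigvals, V = np.linalg.eigh(X)   # columns of V are the eigenkets |e_n>

# Matrix elements <e_m| X |e_n> computed in the eigenbasis.
X_eig = np.array([[np.vdot(V[:, m], X @ V[:, n]) for n in range(dim)]
                  for m in range(dim)])

# The representation is diagonal, with the eigenvalues along the diagonal.
print(np.allclose(X_eig, np.diag(eigvals)))
```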

Based on our definitions above, there is a relation between the matrix elements of \( \hat{X} \) and its adjoint \( \hat{X}^\dagger \):

\[ \begin{aligned} \bra{e_m} \hat{X} \ket{e_n} = \left( \bra{e_n} \hat{X}{}^\dagger \ket{e_m} \right)^\star. \end{aligned} \]

In other words, the matrix representing \( \hat{X}^\dagger \) is the conjugate transpose of the matrix representing \( \hat{X} \). It's also easy to show that the matrix representation of a product of operators \( \hat{X} \hat{Y} \) can be found by applying the usual matrix multiplication to the representations of \( \hat{X} \) and \( \hat{Y} \).

The two-state system, a.k.a. the qubit

Now that we've introduced a lot of abstractions, let's go back to a specific physical system, the two-state system that we considered in the context of the Stern-Gerlach experiment, and see how the machinery works. We take as our basis kets the states \( \ket{S_z = +\hbar/2} \) and \( \ket{S_z = -\hbar/2} \), which we'll just write \( \ket{+} \) and \( \ket{-} \) for short.

Before we can proceed, we need to state the postulates that define a quantum mechanical system and its behavior. You should have seen some form of the postulates in your undergrad quantum class, and they are seemingly never presented exactly the same way twice. Quantum mechanics is a difficult subject to grasp, so it's probably a good thing to have multiple perspectives! In that spirit, I'll present my own statement of the postulates.

Next time: we finish the postulates, and apply them to the two-state system.