26 Identical particles
One subtlety that we have completely ignored so far in our discussion of multi-particle quantum systems is what to do when some of our particles are identical to each other, for example, if we want to study the helium atom in which we have two orbital electrons. There is no way to tell one electron apart from another: every electron has exactly the same mass, charge, and spin, which is what we mean by “identical”.
Of course, to do a theoretical calculation, we’re always free to label our particles: in the example of helium we can label one of the electrons as “1” and the other as “2”. Although our choice of labels “1” and “2” is arbitrary, once we decide on a label we can consistently follow that particle along its entire trajectory for all time. If we were doing classical mechanics, this would be the end of the story, possibly aside from exploiting the symmetry of exchanging label 1 with label 2 when finding our solution.
For quantum mechanics, however, the idea of assigning labels breaks down, simply because we can’t follow a specific particle’s trajectory. If we observe electron 1 and electron 2 both entering some localized region of space, and then observe two electrons leaving the same region of space, there is no way to tell which one is 1 and which is 2. Both of the following paths are possible, and cannot be distinguished even in principle:
If we label the two distinct outgoing paths above as outcomes \alpha and \alpha', then the state after measurement collapses to \ket{\alpha} \ket{\alpha'} - except that we don’t know which particle is which, so there are two quantum states. In fact, measuring outcomes \alpha and \alpha' is consistent with an arbitrary linear combination of these two states, c_1 \ket{\alpha}_1 \ket{\alpha'}_2 + c_2 \ket{\alpha'}_1 \ket{\alpha}_2. This ambiguity is known as exchange degeneracy; it presents us with the puzzle that we cannot determine the state of our system completely, even if we measure every observable we can find! Fixing this will help us understand what a realistic solution for the helium atom looks like, and on top of that leads to a variety of interesting physical effects (the most famous of which is the Pauli exclusion principle which you already know from chemistry class.)
Many books use exchange degeneracy as the focus of deriving the symmetrization postulate. Here, I will instead focus on the operators rather than the states, although in the end we’ll have to return to considering the state itself anyway.
26.1 Permutation symmetry
Let’s be more concrete about what the idea of exchanging paths sketched above means mathematically. Suppose we have two identical particles, each of which carries a set of quantum numbers that we can label with the collective index \ket{\alpha}. They exist in two identical Hilbert spaces which we label as \mathcal{H}_1 and \mathcal{H}_2, so the overall Hilbert space is \mathcal{H} = \mathcal{H}_1 \otimes \mathcal{H}_2. If one of the particles (particle 1) is in state \ket{\alpha}_1, and the other (particle 2) is in state \ket{\alpha'}_2, then the overall state of the system is \ket{\alpha}_1 \ket{\alpha'}_2. This is not the same as the state \ket{\alpha'}_1 \ket{\alpha}_2 that would result from exchanging the two particles with each other; in fact, if these are basis states then those two options are orthogonal to one another, and thus perfectly distinguishable from each other.
To proceed, let’s define an operator that will exchange particle labels for us, the permutation operator (some references call this the “exchange operator” instead.) For a two-particle system, its definition is simple: we have
\hat{P}_{12} \ket{\alpha}_1 \ket{\alpha'}_2 = \ket{\alpha'}_1 \ket{\alpha}_2. There are some immediately obvious symmetry properties here. Clearly the order of the labels on the operator doesn’t matter, and applying it twice will take us back to our initial state: \hat{P}_{21} = \hat{P}_{12} \\ \hat{P}_{12}^2 = \hat{1} so that P_{12}^{-1} = P_{12}. These are all the same properties as parity, so we can recognize that this is the same symmetry group, \mathbb{Z}_2. How does the permutation operator act on observables? Since we have a bipartite Hilbert space again, the most general observable takes the form \hat{A}_1 \otimes \hat{B}_2; since we know the Hilbert spaces are identical, we know that \hat{B}_1 \otimes \hat{A}_2 is also a good operator. It’s straightforward to see that the permutation operator obeys \hat{P}_{12} (\hat{A}_1 \otimes \hat{B}_2) \hat{P}_{12} = \hat{B}_1 \otimes \hat{A}_2.
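To make this concrete, here is a minimal numerical sketch (my own illustration, in Python with numpy; the dimension d and the random matrices are arbitrary choices, not anything specific to a physical system) which builds the swap operator on \mathcal{H}_1 \otimes \mathcal{H}_2 as a matrix and verifies the two properties we just used: \hat{P}_{12}^2 = \hat{1} and \hat{P}_{12} (\hat{A}_1 \otimes \hat{B}_2) \hat{P}_{12} = \hat{B}_1 \otimes \hat{A}_2.

```python
import numpy as np

d = 3                              # dimension of each single-particle space (arbitrary choice)
rng = np.random.default_rng(0)

# Swap operator on H_1 (x) H_2: P |i>_1 |j>_2 = |j>_1 |i>_2,
# using the usual Kronecker-product ordering |i>_1 |j>_2 <-> index i*d + j.
P = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        P[j * d + i, i * d + j] = 1.0

assert np.allclose(P @ P, np.eye(d * d))        # P^2 = 1, so P^{-1} = P

# P (A (x) B) P = B (x) A for arbitrary single-particle operators A, B
A = rng.standard_normal((d, d))
B = rng.standard_normal((d, d))
assert np.allclose(P @ np.kron(A, B) @ P, np.kron(B, A))
print("permutation-operator checks passed")
```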
So far, all of the discussion is generic to any symmetry operator we might write down. But there is a key difference for the permutation operator, stemming from the observation we made above: because we can’t keep track of particle labels in quantum mechanics, it’s impossible to measure any quantity that depends on particle label. This means that we have to update one of our postulates again:
All physical observables correspond to Hermitian operators which are invariant under permutation symmetry for any identical particles.
This is simply a statement of the fact that if we can’t tell the particles apart, we can’t actually measure something like \hat{x}_1 \otimes \hat{1}_2 which would tell us the position of just particle 1, because we have no way of telling which particle that is. If we have two electrons and we try to measure the position of one of them, the operator we’re really measuring has to be instead \hat{x}_1 \otimes \hat{1}_2 + \hat{1}_1 \otimes \hat{x}_2. If we’re given a Hermitian operator \hat{O} which isn’t permutation invariant, we can always symmetrize it by using the permutation operator, so that \hat{O} + \hat{P}_{12} \hat{O} \hat{P}_{12} will be symmetric (and therefore something that we can measure.)
Another way to state this requirement is: in a system with two identical particles, when we go to make a measurement, it will be impossible to tell whether someone has switched them before we measured or not, because they are identical. If permuting before or after measurement gives the same answer, this means that we must have [\hat{O}, \hat{P}_{12}] = 0, which in turn implies that \hat{O} is invariant under the permutation symmetry. It’s also worth noting that since the Hamiltonian \hat{H} is itself a physical observable (we can measure energy), in any quantum system we must have [\hat{H}, \hat{P}_{12}] = 0, so that permutation symmetry is always a dynamical symmetry.
Because every physical observable \hat{O} commutes with \hat{P}_{12}, the two must admit a set of simultaneous eigenstates. Given the sets of other quantum numbers \alpha and \alpha', there are two different eigenstates of permutation that we can construct, \ket{\psi_S} = \frac{1}{\sqrt{2}} \left(\ket{\alpha}_1 \ket{\alpha'}_2 + \ket{\alpha'}_1 \ket{\alpha}_2 \right), \\ \ket{\psi_A} = \frac{1}{\sqrt{2}} \left(\ket{\alpha}_1 \ket{\alpha'}_2 - \ket{\alpha'}_1 \ket{\alpha}_2 \right). The symmetric combination \ket{\psi_S} has \hat{P}_{12} \ket{\psi_S} = +\ket{\psi_S}, while for the antisymmetric combination \hat{P}_{12} \ket{\psi_A} = -\ket{\psi_A}. These are the only options, since the eigenvalues of \hat{P}_{12} are \pm 1. Because the symmetry is preserved by all of the other operators we’re using to write the labels \alpha and \alpha', the total state can only be a combination of these two: \ket{\psi} = c_S \ket{\psi_S} + c_A \ket{\psi_A}, with |c_S|^2 + |c_A|^2 = 1. This has helped with the exchange degeneracy problem; if we consider the measurement of an operator \hat{O} in this state, since it must commute with \hat{P}_{12}, it’s easy to show that \bra{\psi_A} \hat{O} \ket{\psi_S} = 0, and so the expectation breaks down into two expectations, \left\langle \hat{O} \right\rangle = |c_S|^2 \left\langle \hat{O} \right\rangle_S + |c_A|^2 \left\langle \hat{O} \right\rangle_A. It is somewhat curious that this looks exactly like a mixed state, since no interference between the two pieces is possible; this observation leads into interesting philosophical discussions about the symmetrization postulate. But we don’t have to go any further with derivations, because at this point we are saved by the intervention of a deeper result.
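Before getting to that result, here is a quick continuation of the numerical sketch above (again my own Python/numpy illustration; the random Hermitian \hat{O}_0 is just a stand-in for “any operator we can write down”), checking that \ket{\psi_S} and \ket{\psi_A} are \pm 1 eigenstates of the swap and that a symmetrized observable \hat{O} = \hat{O}_0 + \hat{P}_{12} \hat{O}_0 \hat{P}_{12} has \bra{\psi_A} \hat{O} \ket{\psi_S} = 0.

```python
import numpy as np

d = 3
rng = np.random.default_rng(1)

# Swap operator, exactly as in the previous sketch
P = np.zeros((d * d, d * d))
for i in range(d):
    for j in range(d):
        P[j * d + i, i * d + j] = 1.0

def ket(i):                        # single-particle basis state |i>
    e = np.zeros(d)
    e[i] = 1.0
    return e

alpha, alphap = ket(0), ket(1)     # two distinct sets of quantum numbers
psi_S = (np.kron(alpha, alphap) + np.kron(alphap, alpha)) / np.sqrt(2)
psi_A = (np.kron(alpha, alphap) - np.kron(alphap, alpha)) / np.sqrt(2)
assert np.allclose(P @ psi_S, +psi_S)
assert np.allclose(P @ psi_A, -psi_A)

# A permutation-invariant observable O = O0 + P O0 P has <psi_A| O |psi_S> = 0
O0 = rng.standard_normal((d * d, d * d))
O0 = O0 + O0.T                     # make it Hermitian (real symmetric here)
O = O0 + P @ O0 @ P
print(psi_A @ O @ psi_S)           # zero to machine precision
```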
In (3+1)-dimensional quantum mechanics, the behavior of a given particle under permutation symmetry is determined by its spin:
- All particles with integer spin (known as bosons) are symmetric under permutation (obeying “Bose-Einstein statistics”).
- All particles with half-integer spin (known as fermions) are anti-symmetric under permutation (obeying “Fermi-Dirac statistics”).
Spin-statistics resolves the remaining exchange degeneracy by telling us that in the arbitrary state above, we must either have c_S = 1 or c_A = 1, depending on what kind of particle the state is describing. Some quantum mechanics books state this in the form of the “symmetrization postulate” and add it to the list of other postulates of quantum mechanics, phrased in a different way, that only symmetrized or anti-symmetrized wavefunctions are physical. I think five postulates is enough; as the name implies, spin-statistics can be proved, not assumed, but it requires inclusion of relativistic effects (meaning working in quantum field theory, not just quantum mechanics.) As Feynman says in his lectures, much to his own disappointment (and still true many years later), there is not a good intuitive, elementary explanation of how the spin-statistics theorem is proved, which means in his eyes (and mine) that we don’t fully understand the mechanism yet. Maybe one of you will find such an understanding!
Experimentally, there are no exceptions to the spin-statistics theorem, no matter what the actual particle in question is. Photons, helium-4 atoms, carbon-12 nuclei, and J/\psi mesons are all bosons; electrons, neutrinos, protons, and helium-3 atoms are all fermions. As I’ve emphasized with my lists here, the term “particle” in the spin-statistics theorem does not refer only to elementary particles. The carbon-12 nucleus is a composite bound state of 6 protons and 6 neutrons. All of these nucleons are themselves fermions, and the wavefunction of the overall carbon-12 nucleus is certainly antisymmetric under exchange of two protons. Nevertheless, the carbon-12 nucleus itself is a boson; if we are studying a system consisting of many carbon-12 nuclei, then the wavefunction of the overall system is symmetric under exchange of two nuclei. As long as we’re only asking questions about the system of carbon-12 nuclei that don’t depend on their internal structure, then they are identical and act as bosons.
The famous Pauli exclusion principle is an immediate consequence of the spin-statistics theorem and the permutation symmetry we’ve outlined. Two electrons cannot exist in the same state simultaneously, because that would correspond to the state \ket{\alpha}_1 \ket{\alpha}_2, which is symmetric; as fermions, the electron wavefunction must be antisymmetric under exchange.
If you’re paying very close attention, you’ll notice that I specified “3+1 dimensions” in the spin-statistics theorem above. This is because in lower-dimensional systems, in particular systems with two spatial dimensions, there is another possibility: in addition to fermions which pick up a -1 under permutation and bosons which pick up a +1, it is possible to have anyons, for which permutation yields a phase factor e^{i\theta}.
The reason that anyons are possible in two spatial dimensions has to do with topology: there are distinguishable differences between how we exchange two particles, that can be thought of in terms of “braiding” of their space-time trajectories. In other words, we don’t just have to account for “exchanged or not” - we can exchange by going around once, twice, three times, and so on. The resulting symmetry group for a pair of particles is no longer \mathbb{Z}_2, but the infinite group \mathbb{Z}. In three spatial dimensions, braiding is impossible - the braids can always be unwound, so there is no way to tell the difference in how many times we go around during an exchange.
Anyons are beyond the scope of what I want to go into in these notes, since they are quite complicated to deal with, especially when we have more than just two particles to deal with (in which case something called the “braid group” B_n pops up.) But I want to mention them at least, since they’re an exception to the rule of spin-statistics.
26.1.1 Example: two electrons in an atomic orbital
It’s important to understand that permutation symmetry requires us to exchange all of the quantum numbers on our two particles, not just a subset. This means that the requirement for (anti-)symmetry under permutation applies to the total wavefunction, and parts of it can have different properties. The canonical example is two electrons in an atomic orbital (same n) with the same l quantum number as well. The m_l and m_s quantum numbers are undetermined.
As we’ve seen before, we can treat the orbital and spin angular momentum of these electrons separately, which means that the overall wavefunction must split into spatial and spin parts, \psi(1,2) = \psi_{\textrm{space}}(1,2) \psi_{\textrm{spin}}(1,2). I’m writing the wavefunction as a completely generic function of labels 1 and 2, to remind ourselves that if we apply permutation it’s not just the position vectors \hat{\vec{r}}_1 and \hat{\vec{r}}_2 that we need to switch. (Plus, the spin part doesn’t depend on position.) Only the total wavefunction must obey the spin-statistics theorem; even though we have two fermions, they can be in a symmetric spin state, as long as the spatial state is antisymmetric, or vice-versa.
What are the possibilities for the total spin wavefunction of the system? To consider total spin we have to add the two spin-1/2 angular momenta of the two electrons, which decomposes into S=0,1 as we’ve seen before: \ket{1,1} = \ket{\uparrow \uparrow} \\ \ket{1,0} = \frac{1}{\sqrt{2}} (\ket{\uparrow \downarrow} + \ket{\downarrow \uparrow}) \\ \ket{1,-1} = \ket{\downarrow \downarrow} \\ \ket{0,0} = \frac{1}{\sqrt{2}} (\ket{\uparrow \downarrow} - \ket{\downarrow \uparrow}). Now we take special note of the symmetry properties: the three “triplet” S=1 states are all symmetric under exchange of the two spin labels, while the singlet S=0 state is antisymmetric. This isn’t a coincidence, but a simple consequence of permutation symmetry. We can write the spin-lowering operator as \hat{S}_- = \hat{S}_{1-} + \hat{S}_{2-}. This is, by construction, invariant under permutation \hat{P}_{12}. That means that if we take the state \ket{s,s}, whatever its symmetry or anti-symmetry properties are, all of the lowered states \ket{s,m_s} must have the same eigenvalue under permutation. In other words, we can check one state \ket{s,m_s} within each multiplet of fixed s to see if it’s symmetric or antisymmetric, and all other states with the same s will match.
For the spatial part of the wavefunction, since we’ve specified that n and l are the same for our two electrons, let’s take l=1 (two p electrons) as a concrete example; the possibilities are then the states given by addition of two l=1 eigenstates, which we know can give total orbital angular momentum L = 0,1,2. Now that we’ve observed that the value of m_L won’t matter as far as symmetry properties, we can just look at one wavefunction for each L: \ket{2,2} = \ket{1,1}_1 \ket{1,1}_2 \\ \ket{1,1} = \frac{1}{\sqrt{2}} (\ket{1,1}_1 \ket{1,0}_2 - \ket{1,0}_1 \ket{1,1}_2) \\ \ket{0,0} = \frac{1}{\sqrt{3}} (\ket{1,-1}_1 \ket{1,1}_2 - \ket{1,0}_1 \ket{1,0}_2 + \ket{1,1}_1 \ket{1,-1}_2) Here we can easily see that L=0,2 are totally symmetric under particle exchange, whereas L=1 is antisymmetric. It turns out that this is a very general observation: when we expand the product l \otimes l, the highest total-angular-momentum level L=2l is always symmetric, L=2l-1 is always antisymmetric, and so on. If you think about it, this is closely related to the statement that the even-L spherical harmonics are even under parity, since exchanging the two identical particles in space amounts to inverting their relative coordinate \vec{r}_1 - \vec{r}_2.
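If you want to check this pattern without grinding through Clebsch-Gordan tables, here is a short sympy sketch (my own; it uses sympy.physics.wigner.clebsch_gordan, and the helper exchange_sign is a name I made up) that builds a representative state \ket{J, M=J} out of two particles of angular momentum j and reads off its eigenvalue under exchanging the two labels. It reproduces both results above: the s=1/2 \otimes s=1/2 singlet is antisymmetric and the triplet symmetric, while for l=1 \otimes l=1 the L=0,2 states are symmetric and L=1 is antisymmetric.

```python
from sympy import Rational, simplify
from sympy.physics.wigner import clebsch_gordan

def exchange_sign(j, J):
    """Build |J, M=J> from two particles of angular momentum j and return its
    eigenvalue (+1 or -1) under exchange of the two particle labels."""
    M = J                                                  # any M in the multiplet gives the same answer
    ms = [k - j for k in range(int(2 * j) + 1)]            # m = -j, ..., +j
    coeffs = {(m1, m2): clebsch_gordan(j, j, J, m1, m2, M)
              for m1 in ms for m2 in ms}
    swapped = {(m1, m2): coeffs[(m2, m1)] for (m1, m2) in coeffs}
    if all(simplify(swapped[k] - coeffs[k]) == 0 for k in coeffs):
        return +1
    if all(simplify(swapped[k] + coeffs[k]) == 0 for k in coeffs):
        return -1
    raise ValueError("not an exchange eigenstate")

half = Rational(1, 2)
print([exchange_sign(half, S) for S in (0, 1)])      # spin 1/2 (x) 1/2: [-1, +1]
print([exchange_sign(1, L) for L in (0, 1, 2)])      # l=1 (x) l=1:     [+1, -1, +1]
```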
We now come to a very important observation: the required total antisymmetry of the wavefunction for two electrons thus restricts which combinations of spatial and spin wavefunctions are allowed. Since the spin-triplet state is symmetric, it can only exist with angular momentum L=1. Likewise, the spin-singlet state can only exist with L=0 or L=2. We find missing energy levels, compared to what we would have expected for non-identical particles (if the two electrons were in different orbitals, say.)
More generally, suppose we have a system again labeled by a single operator \hat{A}, now with only two eigenvalues a and a'. If we had two identical semi-classical particles (obeying Maxwell-Boltzmann statistics) in this system, there would be four possible joint states: \ket{a}_1 \ket{a}_2,\ \ket{a}_1 \ket{a'}_2,\ \ket{a'}_1 \ket{a}_2,\ \ket{a'}_1 \ket{a'}_2. On the other hand, two fermions described by this two-state system can only exist in a single combined state, \frac{1}{\sqrt{2}} (\ket{a}_1 \ket{a'}_2 - \ket{a'}_1 \ket{a}_2). (We haven’t described rigorously how to extend all of this to more than two particles yet, but it should nevertheless be clear that there are no allowed wavefunctions for three fermions in a two-state system.) Finally, bosons have three allowed joint states: \ket{a}_1 \ket{a}_2,\ \ket{a'}_1 \ket{a'}_2,\ \frac{1}{\sqrt{2}} (\ket{a}_1 \ket{a'}_2 + \ket{a'}_1 \ket{a}_2). So even the bosons, which will happily exist in the same state, are still “missing” one of the four states we would have expected from Maxwell-Boltzmann statistics, essentially because there is no distinction between the two middle states. I should emphasize once again that the Maxwell-Boltzmann case is not physical in quantum mechanics for identical particles; all particles are either fermions or bosons.
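To drive home the counting, here is a tiny Python enumeration (my own sketch; the labels and helper functions are purely illustrative) that lists the four Maxwell-Boltzmann configurations and then counts the linearly independent symmetrized and antisymmetrized two-particle vectors, recovering the 4 / 3 / 1 counting above.

```python
from itertools import product
import numpy as np

levels = ["a", "a'"]                           # two single-particle states
mb = list(product(levels, repeat=2))           # distinguishable labelings: 4 of them

def basis(label):
    v = np.zeros(2)
    v[levels.index(label)] = 1.0
    return v

def two_particle(l1, l2, sign):
    """(Anti)symmetrized combination |l1>_1 |l2>_2 + sign * |l2>_1 |l1>_2 (unnormalized)."""
    return np.kron(basis(l1), basis(l2)) + sign * np.kron(basis(l2), basis(l1))

bosons   = [two_particle(l1, l2, +1) for (l1, l2) in mb]
fermions = [two_particle(l1, l2, -1) for (l1, l2) in mb]
count = lambda vs: np.linalg.matrix_rank(np.array(vs))
print(len(mb), count(bosons), count(fermions))   # 4 3 1
```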
26.2 The helium atom
Now we’re ready to consider another example, a very important one: the helium atom. We will study helium first with perturbation theory, and then the variational method to compare. (We won’t do the example of Rayleigh-Ritz, but it yields the most precise results for helium energies and states.)
The helium atom has two orbital electrons, which we’ll label 1 and 2. Writing only the kinetic and Coulomb terms, the helium Hamiltonian is \hat{H} = \frac{\hat{p}_1^2}{2m} - \frac{Ze^2}{r_1} + \frac{\hat{p}_2^2}{2m} - \frac{Ze^2}{r_2} + \frac{e^2}{|\vec{r_2} - \vec{r_1}|}. This is no longer a central potential, due to the repulsive force between the two electrons given by the last term. We can’t solve exactly, so we’ll treat the electron repulsion as a perturbation, writing the above as \hat{H} = \hat{H}_1 + \hat{H}_2 + \hat{W}.
With this choice, the unperturbed wavefunction can be decomposed into a product of two hydrogenic wavefunctions. However, we have to keep in mind that with two identical electrons, spin-statistics is active, which means that we must have either a symmetric or antisymmetric combination of wavefunctions, \psi_{\textrm{space}}^{(0)}(\vec{r}_1, \vec{r}_2) = \frac{1}{\sqrt{2}} \left[ \psi_{n_1,l_1,m_1}(\vec{r}_1) \psi_{n_2,l_2,m_2} (\vec{r}_2) \pm \psi_{n_1,l_1,m_1}(\vec{r}_2) \psi_{n_2,l_2,m_2}(\vec{r_1}) \right]. Which choice we make for a given state depends on the spin state of the electrons, which itself can also be symmetric or anti-symmetric; we’ll see how this works in some specific examples. Before we consider spin, we can observe that with this setup the unperturbed energies for helium are just the sum of two hydrogenic energies: E_{n_1,n_2}^{(0)} = -Z^2 \left( \frac{1}{n_1^2} + \frac{1}{n_2^2} \right)\ \textrm{Ry} = (-54.4\ {\rm eV}) \left( \frac{1}{n_1^2} + \frac{1}{n_2^2} \right) plugging in Z=2 for helium. For the ground state n_1 = n_2 = 1, we find an energy which is 8 times the ground-state energy of hydrogen, or -108.8 eV.
It’s interesting to note that the ionization energy required to liberate one of the electrons is only half of the ground-state energy. The threshold for a free electron is at n=\infty, so we have for example E_{\infty,1}^{(0)} = -54.4\ {\rm eV}. This immediately implies that even the lowly n_1=2, n_2=2 energy level is unstable, since E_{2,2}^{(0)} = -54.4\ {\rm eV} \left( \frac{1}{4} + \frac{1}{4} \right) = -27.2\ {\rm eV}. This is well above the energy scale E_{\infty,1}^{(0)}. Because of this energy difference, the state \ket{2,2} will auto-ionize; there is a transition in which one electron goes to the ground state and the other escapes, leaving behind a {\rm He}^+ ion.
We will return to say more about auto-ionization later in these notes when we have better tools to describe the transition. For now, the key point to remember is that the stable states of an ordinary helium atom have at least one electron in the ground state. (This is all zeroth-order in perturbation theory, which isn’t quite right, but the qualitative conclusions and the existence of auto-ionizing states are robust.)
26.2.1 Helium ground-state energy: perturbative estimate
Let’s focus on just the ground state, where both electrons have n = 1. This forces the individual wavefunctions for both electrons to be in the \ket{nlm} = \ket{100} state, which in turn tells us that the spatial wavefunction must be symmetric, \psi_{g,\rm space}^{(0)}(\vec{r}_1, \vec{r}_2) = \psi_{100}(\vec{r}_1) \psi_{100} (\vec{r}_2) Fermi statistics then requires the overall wavefunction to be antisymmetric under exchange, which means that the spin wavefunction is definitely in the s=0 singlet state, \ket{\chi_g} = \frac{1}{\sqrt{2}} (\ket{\uparrow}_1 \ket{\downarrow}_2 - \ket{\downarrow}_1 \ket{\uparrow}_2).
Now let’s include the electron repulsion as a perturbation. We can evaluate the first-order energy correction for the ground state as an integral: E_0^{(1)} = \left\langle \frac{e^2}{|\vec{r}_1 - \vec{r}_2|} \right\rangle \\ = \int d^3 r_1 \int d^3r_2 \frac{|\psi_{100}(r_1)|^2 |\psi_{100}(r_2)|^2}{|\vec{r_1} - \vec{r_2}|}. The hydrogenic ground-state wavefunction is, being careful to include the spherical harmonic to get the normalization right, \psi_{100}(r) = \frac{1}{\sqrt{\pi}} \sqrt{\frac{Z^3}{a_0^3}} e^{-Zr/a_0}. Dealing with the relative position of the electrons is trickier; we can rewrite |\vec{r}_1 - \vec{r}_2| = \sqrt{(\vec{r}_1 - \vec{r}_2)^2} \\ = \sqrt{r_1^2 + r_2^2 - 2r_1 r_2 \cos \theta}, where \theta is the relative angle between the two position vectors. Substituting in above, we have E_0^{(1)} = \left(\frac{Z^3}{\pi a_0^3} \right)^2 \int dr_1 r_1^2 e^{-2Zr_1 / a_0} \int dr_2 r_2^2 e^{-2Zr_2 / a_0} \\ \int d\Omega_1 \int d\Omega_2 \frac{e^2}{\sqrt{r_1^2 + r_2^2 - 2r_1 r_2 \cos \theta}}. We haven’t specified any coordinate axes yet; if we take the z axis to point along \vec{r}_1, then the angle \theta is just \theta_2, and we have E_0^{(1)} = \frac{8 Z^6 e^2}{a_0^6} \int dr_1 \int dr_2 (...) \int d(\cos \theta_2) \frac{1}{\sqrt{r_1^2 + r_2^2 - 2r_1 r_2 \cos \theta_2}}. The last remaining angular integral isn’t too bad: \int_{-1}^1 d(\cos \theta_2) \frac{1}{\sqrt{r_1^2 + r_2^2 - 2r_1 r_2 \cos \theta_2}} \\ = -\frac{1}{r_1 r_2} \left. \sqrt{r_1^2 + r_2^2 - 2r_1 r_2 \cos \theta_2}\right|_{-1}^1 \\ = -\frac{1}{r_1 r_2} (\sqrt{(r_1 - r_2)^2} - \sqrt{(r_1 + r_2)^2}) \\ = \frac{1}{r_1 r_2} (r_1 + r_2 - |r_1 - r_2|). Recognizing that the integral is totally symmetric under exchange of r_1 and r_2, we can assume r_1 > r_2 and just double the result; with this assumption, the above expression becomes simply 4 / r_1 (including the doubling due to symmetry.) Because we made this assumption, we also have to enforce it for the radial integrals by integrating r_1 over [r_2, \infty] instead of [0, \infty]. Doing a u-substitution first with u = Zr/a_0, we have
E_0^{(1)} = \frac{32 Ze^2}{a_0} \int_0^\infty du_2 \int_{u_2}^\infty du_1 u_1 u_2^2 e^{-2(u_1+u_2)}
The radial integrals are straightforward from here: they give a numerical factor of 5/256, so that we find for the energy correction E_0^{(1)} = \frac{5}{8} \frac{Ze^2}{a_0} = \frac{5Z}{4}\ \textrm{Ry}. This is a positive shift, changing the estimated ground-state energy for helium from -108.8 eV to -74.8 eV. This is a reasonably good estimate; the experimental value is around -78.98 eV.
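If you would rather not do the nested integral by hand, here is a short sympy check (my own; I use the rounded value Ry = 13.6 eV) of the 5/256 factor and of the resulting first-order estimate for the helium ground-state energy.

```python
from sympy import symbols, exp, integrate, oo

u1, u2 = symbols("u1 u2", positive=True)

# Nested radial integral that multiplies 32 Z e^2 / a_0 in the text:
inner = integrate(u1 * exp(-2 * u1), (u1, u2, oo))
radial = integrate(u2**2 * exp(-2 * u2) * inner, (u2, 0, oo))
print(radial)                                    # 5/256

Z, Ry = 2, 13.6                                  # helium; Rydberg in eV (rounded)
E0 = -2 * Z**2 * Ry                              # two hydrogenic 1s electrons: -108.8 eV
E1 = 32 * Z * (2 * Ry) * float(radial)           # e^2/a_0 = 2 Ry, so this is (5Z/4) Ry = +34 eV
print(E0, E1, E0 + E1)                           # -108.8  34.0  -74.8  (all in eV)
```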
26.2.2 Helium ground-state energy: variational estimate
Let’s see how the variational method does for helium, and whether we can gain any additional physical insight from that approach. There are many possible choices of variational wave functions, but a relatively simple approach turns out to work quite well. We’ll continue to assume that we can write the total wavefunction as a product of two hydrogenic wavefunctions, \tilde{\psi}(r) = \frac{1}{\sqrt{\pi}} \left( \frac{Z^\star}{a_0}\right)^{3/2} e^{-Z^\star r/a_0} and then \tilde{\psi}(r_1, r_2) = \tilde{\psi}(r_1) \tilde{\psi}(r_2). We take Z^\star to be our variational parameter; we could have tried a_0 as well, but since the wavefunction only depends on the ratio of the two, the outcome would be the same. The trial energy is thus E(Z^\star) = \int d^3 r_1 \int d^3 r_2 \tilde{\psi}^\star(r_1) \tilde{\psi}^\star(r_2) \left( \frac{\hat{p}_1^2}{2m} - \frac{Ze^2}{r_1} \right. \\ \left. + \frac{\hat{p}_2^2}{2m} - \frac{Ze^2}{r_2} + \frac{e^2}{|\vec{r}_1 - \vec{r}_2|} \right) \tilde{\psi}(r_1) \tilde{\psi}(r_2). (Note that this is the Hamiltonian, so it is correct that the factors here contain Z and not Z^\star.) Rather than evaluating the integral directly this time, we can use some tricks. Notice that the \tilde{\psi} function is just a ground-state hydrogenic wavefunction but with Z^\star as the atomic number, which means that \left( \frac{\hat{p}_1^2}{2m} - \frac{Z^\star e^2}{r_1} \right) \tilde{\psi}(r_1) = E_0(Z^\star) \tilde{\psi}(r_1) where E_0(Z^\star) = -(Z^\star)^2\ \textrm{Ry}, and similarly for r_2. In other words, \tilde{\psi}(r) is a ground-state eigenfunction of the “unphysical” hydrogen Hamiltonian with Z^\star. By rewriting -\frac{Ze^2}{r} = -\frac{Z^\star e^2}{r} + \frac{(Z^\star - Z) e^2}{r}, we reduce the big integral above to E(Z^\star) = 2 E_0(Z^\star) + 2(Z^\star - Z) e^2 \left\langle \frac{1}{r} \right\rangle_{Z^\star} + e^2 \left\langle \frac{1}{|\vec{r}_1 - \vec{r}_2|} \right\rangle_{Z^\star}, where the subscript Z^\star reminds us that we’re taking the expectation value with respect to the trial wavefunction. The first expectation value is just given by the virial theorem, e^2 \left\langle \frac{1}{r} \right\rangle_{Z^\star} = \frac{Z^\star e^2}{a_0} = 2Z^\star\ \textrm{Ry}. There’s no neat trick to evaluate the second expectation value, but we just found what it was in our perturbative estimate: e^2 \left\langle \frac{1}{|\vec{r}_1 - \vec{r}_2|} \right\rangle_{Z^\star} = \frac{5}{4} Z^\star\ \textrm{Ry}. Putting everything together, we see that E(Z^\star) = \left( 2(Z^\star)^2 - 4ZZ^\star + \frac{5}{4} Z^\star \right)\ \textrm{Ry}. This function is minimized for Z^\star_{\textrm{min}} = Z - \frac{5}{16}, giving the variational ground-state energy E(Z^\star_{\textrm{min}}) = -2 \left(Z - \frac{5}{16} \right)^2\ \textrm{Ry} \approx -77.5\ \textrm{eV}. Comparing to the experimental value of -78.975 eV and our perturbative estimate of -74.8 eV, we see that the variational approach has given a significant improvement; and if we were more clever in our choice of variational function and parameters, we would be able to lower the variational bound and do even better. (As I mentioned, a Rayleigh-Ritz approach gives the best variational bounds on the helium ground-state and excited-state energies in practice.)
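Here is the corresponding sympy check for the variational calculation (again my own sketch, with Ry = 13.6 eV): it minimizes the trial energy E(Z^\star) written above and reproduces Z^\star_{\textrm{min}} = Z - 5/16 and a variational ground-state energy of about -77.5 eV.

```python
from sympy import symbols, Rational, diff, solve, simplify, factor

Zs, Z = symbols("Z_star Z", positive=True)
Ry = 13.6                                            # eV (rounded)

E = 2 * Zs**2 - 4 * Z * Zs + Rational(5, 4) * Zs     # trial energy, in Rydbergs

Zs_min = solve(diff(E, Zs), Zs)[0]
print(Zs_min)                                        # Z - 5/16
E_min = factor(simplify(E.subs(Zs, Zs_min)))
print(E_min)                                         # algebraically equal to -2*(Z - 5/16)**2
print(float(E_min.subs(Z, 2)) * Ry, "eV")            # about -77.5 eV for helium (Z = 2)
```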
One other interesting point to make is that the result coming from our variational calculation has a sensible physical interpretation. We can think of Z^\star as an effective nuclear charge seen by each electron, which has been reduced by some amount due to “screening” by the presence of the other electron. The fact that this gives a very close answer to the true ground-state energy implies that this charge screening is the dominant correction to the non-interacting picture.
26.3 Exchange interaction and excited states of helium
Now let’s extend our discussion beyond the ground state. As we noted before, for a helium atom all stable bound states require one of the electrons to be in the n=1 orbital, or else the atom will have enough energy to auto-ionize. This means that the most interesting case to study is that where one electron is in state \ket{100}, and the other in a general state \ket{nlm}. The spatial part of the two-electron wave function can be written as \psi_{\textrm{space}}(\vec{r}_1, \vec{r}_2) = \frac{1}{\sqrt{2}} \left[ \psi_{100}(\vec{r}_1) \psi_{nlm} (\vec{r}_2) \pm \psi_{100}(\vec{r}_2) \psi_{nlm}(\vec{r_1}) \right]. The sign here is dictated by what the spin state of the electrons is. There are two possibilities:
- Total spin s=0 (singlet): the spin wavefunction is antisymmetric, which means that the spatial wavefunction is symmetric. This was the case for the n_1 = 1, n_2 = 1 ground state. The s=0 states are known collectively as para-helium states.
- Total spin s=1 (triplet): the spin wavefunction is symmetric, so the spatial wavefunction is antisymmetric. The s=1 states of helium are known as ortho-helium states.
Out of all of these states, the ground state is special since it can only have a symmetric spatial wavefunction, which means ground-state helium only exists as a para-helium state. However, for any of the other excited states of this form, both the ortho- and para-helium states will be present.
Let’s have a look at the excited-state energies, once again treating the Coulomb repulsion between the electrons as a perturbation. For any choice of \ket{nlm}, we have for the unperturbed energy E^{(0)} = - Z^2 \left( 1 + \frac{1}{n^2} \right)\ \textrm{Ry}, and the energy correction is \Delta E = \left\langle \frac{e^2}{|\vec{r}_1 - \vec{r}_2|} \right\rangle. The spin part of the wavefunction has no effect on this perturbation, so the expectation value is taken purely with respect to the spatial part: \Delta E = \int d^3r_1 \int d^3r_2 |\psi_{\textrm{space}}(\vec{r}_1, \vec{r}_2)|^2 \frac{e^2}{|\vec{r}_1 - \vec{r}_2|}. If we expand out the spatial wavefunction, we find that we can write \Delta E = I \pm J, where I = \int d^3 r_1 \int d^3 r_2 |\psi_{100}(\vec{r}_1)|^2 |\psi_{nlm}(\vec{r}_2)|^2 \frac{e^2}{|\vec{r}_1 - \vec{r}_2|} is an integral of the form that we saw above for the ground state, and J = \textrm{Re} \left[ \int d^3 r_1 \int d^3 r_2 \psi_{100}^\star(\vec{r}_1) \psi_{nlm}^\star(\vec{r}_2) \frac{e^2}{|\vec{r}_1 - \vec{r}_2|} \psi_{100}(\vec{r}_2) \psi_{nlm}(\vec{r}_1) \right] is a new term, known as the exchange interaction (or “exchange energy.”)
In general, the value of J versus I will depend on what system we’re studying; it is even possible in some cases for J to be negative. For helium, it is the case that the para-helium states (singlet), which are spatially symmetric, have increased energy, while ortho-helium (triplet) states have lower energy. This makes physical sense, since we can think of the symmetrized spatial wavefunctions on average putting the electrons “closer” to each other, giving a larger energy contribution from the Coulomb repulsion between them, and vice-versa.
Since the difference between the two energies depends entirely on the spin state of the two electrons, it’s interesting to notice that we can actually rewrite the total energy shift in the form \Delta E = I - \frac{1}{2} \left(1 + \vec{\sigma}_1 \cdot \vec{\sigma}_2\right) J where as a reminder, the \sigma_i matrices are the dimensionless versions of the \vec{S}_i operators; we have \vec{\sigma}_1 \cdot \vec{\sigma}_2 = +1 for the S=1 triplet state, and -3 for the S=0 singlet. Thus, we find that even without any explicit spin dependence included in the Hamiltonian, the presence of Fermi-Dirac statistics here has induced a spin-dependent energy splitting - and a very large one, compared to the fine-structure spin effects we saw before! (This is purely electromagnetic and will be of order eV, not \alpha^2 times eV.)
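As a small consistency check (my own sketch; the helper delta_E is hypothetical, and the values of I and J plugged in at the end are arbitrary numbers rather than actual helium matrix elements), we can build \vec{\sigma}_1 \cdot \vec{\sigma}_2 out of Pauli matrices, verify the eigenvalues +1 and -3, and confirm that the formula above gives \Delta E = I - J for the triplet and I + J for the singlet.

```python
import numpy as np

# Pauli matrices; sigma = 2 S / hbar
sx = np.array([[0, 1], [1, 0]], dtype=complex)
sy = np.array([[0, -1j], [1j, 0]])
sz = np.array([[1, 0], [0, -1]], dtype=complex)

sig_dot = sum(np.kron(s, s) for s in (sx, sy, sz))   # sigma_1 . sigma_2 on the two-spin space

up, dn = np.array([1.0, 0.0]), np.array([0.0, 1.0])
triplet = np.kron(up, up)                                    # |1, 1>
singlet = (np.kron(up, dn) - np.kron(dn, up)) / np.sqrt(2)   # |0, 0>

print(np.real(triplet @ sig_dot @ triplet))    # +1
print(np.real(singlet @ sig_dot @ singlet))    # -3

def delta_E(I, J, spin_state):
    """Energy shift I - (1/2)(1 + sigma1.sigma2) J evaluated in a two-electron spin state."""
    return I - 0.5 * (1 + np.real(spin_state @ sig_dot @ spin_state)) * J

print(delta_E(1.0, 0.2, triplet), delta_E(1.0, 0.2, singlet))   # I - J = 0.8,  I + J = 1.2
```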
26.3.1 Diatomic molecules
Diatomic molecules give another classic example of the effects of identical particle statistics. As we’ve discussed before, the energy levels of a diatomic molecule are set by different ways in which the molecule can move: rotation, vibration, and internal excitations of the atoms themselves. The lowest-energy modes for a diatomic molecule are the rotational modes. If we’re interested only in these modes, we can study them by way of the rigid rotor Hamiltonian, \hat{H} = \frac{\hat{L}^2}{2I}, where \hat{\vec{L}} is the angular momentum of the molecule, and the moment of inertia I is determined entirely by the nuclei, so I = \mu a^2 with \mu the reduced mass and a the inter-nuclear distance. This is a very simple Hamiltonian to deal with: its energy eigenvalues are simply E_l = \frac{\hbar^2}{2I} l(l+1). If we have a heteronuclear molecule like OH, then all of these energy levels are present; the nuclear spin, in particular, is irrelevant since the distance between the two nuclei leads to a very small coupling.
However, for molecules composed of two identical nuclei (homonuclear molecules), the situation is more complicated. Keep in mind that the nuclei set the properties of the quantum rotor spectrum, so we will be interested in the total wavefunction of the two nuclei here. For the hydrogen molecule H_2, each nucleus is a single proton, which is spin-1/2 and therefore a fermion. This leads us to a similar situation to that for helium: when the two nuclear spins exist in one of the spin-symmetric triplet states, we have ortho-H_2 and the spatial wavefunction must be antisymmetric. For the singlet state (para-H_2), the spatial wavefunction is symmetric.
How do we relate these symmetries to the allowed energy levels? In the particular case of a diatomic molecule, exchanging the two nuclei simply inverts their relative coordinate, so acting on the rotational wavefunction the permutation operator \hat{P}_{12} is identical to the parity operator! Therefore, all of the angular momentum eigenstates of the system, which in space correspond to spherical harmonics Y_l^m(\theta, \phi), are permutation eigenstates with eigenvalue (-1)^l. Thus, ortho-H_2 states have only odd l, while for para-H_2 only even l are allowed.
Let’s do a little statistical mechanics again. Defining the “rotational temperature” \Theta \equiv \frac{\hbar^2}{2Ik_B}, the general form of the partition function for a quantum rotor is Z_{\rm rot} = \sum_l N_l e^{-\beta E_l} = \sum_l (2l+1) e^{-l(l+1) \Theta / T}. However, as we have just argued for H_2, the ortho and para states will have different allowed values, and thus the sums will be different. In addition to knocking out some values of l, the ortho states pick up a factor of 3 since they have 3 allowed spin states per l value. Thus, we have the two partition functions Z_{o} = 3 \sum_{l=1,3,5...} (2l+1) e^{-l(l+1) \Theta / T}, \\ Z_{p} = \sum_{l=0,2,4...} (2l+1) e^{-l(l+1) \Theta / T}. In the high-temperature limit, the difference between Z_o and Z_p becomes irrelevant, except for the counting factor of 3 (so \lim_{T \rightarrow \infty} Z_o / Z_p = 3.) From the partition function, the overall rotational energy stored in a given species is given by \left\langle E \right\rangle_s = -\frac{\partial}{\partial \beta} \ln Z_s, and then to get something more easily accessible in experiment, we take another derivative to get the molar heat capacity, C_{{\rm rot},s} = \frac{\partial \left\langle E \right\rangle_s}{\partial T}.
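Before looking at the results, here is a minimal numerical sketch of that evaluation (my own Python; the helper names Z_rot and heat_capacity and the crude finite-difference derivatives are my choices), computing the rotational heat capacity per molecule, in units of k_B, as a function of T/\Theta for the odd-l and even-l sums separately.

```python
import numpy as np

def Z_rot(T_over_Theta, ls):
    """Rotational partition function restricted to the given l values."""
    l = np.array(ls, dtype=float)
    return np.sum((2 * l + 1) * np.exp(-l * (l + 1) / T_over_Theta))

def heat_capacity(T_over_Theta, ls, eps=1e-4):
    """C/k_B per molecule, from <E> = -d(ln Z)/d(beta) and C = d<E>/dT,
    evaluated by finite differences in the variable x = T/Theta."""
    def energy(x):                       # <E>/(k_B*Theta) = x^2 * d(ln Z)/dx
        lo, hi = x * (1 - eps), x * (1 + eps)
        return x**2 * (np.log(Z_rot(hi, ls)) - np.log(Z_rot(lo, ls))) / (hi - lo)
    x = T_over_Theta
    return (energy(x * (1 + eps)) - energy(x * (1 - eps))) / (x * (1 + eps) - x * (1 - eps))

odd_l, even_l = range(1, 40, 2), range(0, 40, 2)     # ortho and para sums
for x in (0.5, 1.0, 2.0, 5.0):
    print(x, heat_capacity(x, odd_l), heat_capacity(x, even_l))
# Both heat capacities approach 1 (i.e. k_B per molecule) at high T,
# but they differ strongly when T is comparable to Theta.
```

Note that the overall factor of 3 in Z_o drops out of the heat capacity, since it only shifts \ln Z by a constant; it does matter, however, for the equilibrium ortho-to-para ratio mentioned below.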
We can’t really evaluate this analytically very well, but a numerical evaluation leads to the following results:
In a real H_2 gas at high temperature, the experimental curve lies somewhere in between these curves, corresponding to a roughly 3:1 mixture of ortho-H_2 to para-H_2; this is exactly the mixture we would predict from the degeneracy of the nuclear spin states.
Finally, the case where the two nuclei are bosonic, e.g. N_2, is also instructive. Each nitrogen-14 nucleus has 7 protons and 7 neutrons, so indeed the nitrogen nucleus is a boson, and the total wavefunction of the two nuclei must now be symmetric under exchange. If the nuclei were spinless, this would remove all of the odd-l states from the spectrum entirely; since N-14 actually carries nuclear spin 1, the odd-l rotational states survive but are paired with the smaller set of antisymmetric nuclear-spin states, producing a characteristic 2:1 alternation of line intensities in the spectrum.
These alternating intensities actually provided an early hint for the existence of the neutron. The rotational spectrum of nitrogen gas was known before the neutron was discovered, and so was the quantum mechanics of identical particles. Without the neutron, the natural model of the nitrogen nucleus was 14 protons plus 7 electrons (to get both the mass and the charge right), which is an odd number of fermions; the nucleus would then itself be a fermion, and the roles of the even-l and odd-l states would be interchanged. The experimental evidence to the contrary showed there must be some other physics at work!
26.4 Systems of multiple identical particles
Often in realistic systems, we’re dealing with many particles that are all identical instead of just pairs. Fortunately, it is straightforward to generalize what we’ve done so far to handle this case. If we have three identical particles 1,2,3, then we can define a set of permutation operators that swap pairwise labels: \hat{P}_{12}, \hat{P}_{13}, \hat{P}_{23}. Bosonic wavefunctions must be totally symmetric under the action of any of the permutation operators; fermionic wavefunctions are totally antisymmetric. In general, \hat{P}_{ij} \ket{\psi}_{\textrm{bosons}} = + \ket{\psi}_{\textrm{bosons}}, \\ \hat{P}_{ij} \ket{\psi}_{\textrm{fermions}} = - \ket{\psi}_{\textrm{fermions}}. For the three-particle case, we can construct the appropriately (anti-)symmetrized wavefunctions by inspection: \psi^S(1,2,3) = \frac{1}{\sqrt{6}} [\psi(1,2,3) + \psi(2,3,1) + \psi(3,1,2) \\ + \psi(1,3,2) + \psi(2,1,3) + \psi(3,2,1) ], \\ \psi^A(1,2,3) = \frac{1}{\sqrt{6}} [\psi(1,2,3) + \psi(2,3,1) + \psi(3,1,2) \\ - \psi(1,3,2) - \psi(2,1,3) - \psi(3,2,1) ]. The terms with minus signs in the fermionic wavefunction correspond to odd permutations, that is to say, permutations of \psi(1,2,3) generated by applying an odd number of pairwise permutation operators.
A useful shorthand for keeping track of the minus signs when we have a lot of fermions in a system is to use the Slater determinant. Given a set of one-particle fermion wavefunctions \chi_1(p), ..., \chi_N(p), where p labels which particle we apply the wavefunction to, the combined and totally antisymmetrized wavefunction is given by \psi^A(1,2,3,...,N) = \frac{1}{\sqrt{N!}} \left| \begin{array}{ccc} \chi_1(1) & \chi_1(2) & ... \\ \chi_2(1) & \chi_2(2) & ... \\ ... & ... & ... \end{array} \right|. The symmetry properties of the determinant match the required symmetry under permutation for a totally antisymmetric wavefunction. As an aside, we can use the Slater determinant to write down a bosonic wavefunction quickly too; we just ignore all the minus signs.
The Slater determinant is most often used for wavefunctions, but applies to any antisymmetric combination of states. For example, in the two-state example just above, we can write the antisymmetric state as a Slater determinant, \frac{1}{\sqrt{2}} (\ket{a}_1 \ket{a'}_2 - \ket{a'}_1 \ket{a}_2) = \frac{1}{\sqrt{2}} \left| \begin{array}{cc} \ket{a}_1 & \ket{a}_2 \\ \ket{a'}_1 & \ket{a'}_2 \end{array} \right|.
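As a concrete (and purely illustrative) example, here is a small Python function that assembles the Slater determinant numerically from a list of single-particle wavefunctions; the three Gaussian-type orbitals are arbitrary unnormalized choices of mine, just to exhibit the antisymmetry and the automatic Pauli exclusion.

```python
import numpy as np
from math import factorial

def slater(orbitals, coords):
    """Antisymmetrized N-particle amplitude psi^A(x_1, ..., x_N), built as a
    Slater determinant of the single-particle wavefunctions chi_k(x)."""
    N = len(orbitals)
    M = np.array([[chi(x) for x in coords] for chi in orbitals])   # M[k, p] = chi_k(x_p)
    return np.linalg.det(M) / np.sqrt(factorial(N))

# Three arbitrary (unnormalized) single-particle orbitals in one dimension:
orbitals = [lambda x: np.exp(-x**2 / 2),
            lambda x: x * np.exp(-x**2 / 2),
            lambda x: (2 * x**2 - 1) * np.exp(-x**2 / 2)]

x = [0.3, -1.1, 0.7]
print(slater(orbitals, x))
print(slater(orbitals, [x[1], x[0], x[2]]))   # exchange two particles: same magnitude, opposite sign
print(slater(orbitals, [x[0], x[0], x[2]]))   # two particles at the same point: vanishes (Pauli)
```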
As a comment in passing, when we have multiple particles involved at once, the symmetry group expands to the permutation group S_n for n particles. Bosonic wavefunctions inhabit the trivial representation of S_n, meaning that any permutation just gives +1 back, corresponding to total symmetry. Fermionic wavefunctions exist in the sign representation, which maps odd numbers of permutations to -1 instead of +1. Other representations of S_n exist, but because of spin-statistics, only these two simple representations are allowed in quantum mechanics.
26.4.1 Fermi energy and degeneracy pressure
The general case of a large number of non-interacting fermions is interesting and important (serving as a starting unperturbed system for many condensed-matter applications, for example.) If we have N such fermions and their one-particle energy eigenstates are given by \hat{H}_i u_{E_k}(i) = E_k u_{E_k}(i) where \hat{H} = \sum_{i=1}^N \hat{H}_i is just a sum of N identical non-interacting Hamiltonians, then any combined energy eigenstate can be written as a product of u_{E_k}(i). To keep track of all the signs in forming the antisymmetrized wavefunction, we use the Slater determinant as written above, u^A(1,2,3,...,N) = \frac{1}{\sqrt{N!}} \left| \begin{array}{ccc} u_{E_1}(1) & u_{E_1}(2) & ... \\ u_{E_2}(1) & u_{E_2}(2) & ... \\ ... & ... & ... \end{array} \right|. An immediate consequence of the anti-symmetry that we can read off from the determinant is that to be able to write a wavefunction at all for a system of N fermions, there must be at least N distinct states for the fermions to exist in. Up to degeneracy, this means in particular that the ground-state energy of a collection of fermions is significantly raised, relative to simply putting N copies of the particle into the single-particle ground state (which is what a collection of bosons would do)! Essentially, the system must fill all available energy states up to a certain energy level to accommodate N fermions at once. The energy of the highest filled single-particle level is known as the Fermi energy.
I won’t go through a detailed calculation of the Fermi energy; you can find some detail in Sakurai, or in any stat mech book. Besides, the value of the Fermi energy is given by a sum over energy states, and depends on what system we’re trying to study. But I do want to stress that the Fermi energy is inherently different from a zero-point energy shift; it really does have physical consequences. Let’s look at one particular example, which is a white dwarf star.
White dwarfs are stars which have burned off all of their fuel; nuclear fusion stops, and the pressure that the fusion provides against gravitational collapse ceases. As a result, the star becomes very dense. (The full astrophysical story here is more complicated than this, but we’re interested in the end result right now.) The gravitational binding energy of the white dwarf is given by a simple classical calculation, E_G = -\frac{3}{5} \frac{GM^2}{R}. The change in energy with respect to volume gives a corresponding pressure, P_G = -\partial E_G / \partial V. However, there is another effect at work here; the squeezed matter which makes up the white dwarf exists as electron-degenerate matter, in which the electrons are stripped away from the nuclei and essentially exist as an electron gas. If we have an electron gas confined to a fixed volume, then the available energy levels are discrete, since the allowed wave numbers are quantized in units set by 1/L for a box of size L. A straightforward calculation gives the result E_e = \frac{3\hbar^2}{10m_e} N_e \left(\frac{3\pi^2 N_e}{V}\right)^{2/3} for a gas of N_e electrons in volume V.
It’s more stat mech than quantum mechanics, but I’ll put a quick Fermi energy derivation here nevertheless, since it does rely on what we’ve developed about identical particles.
Our model will be the three-dimensional particle in a box: assume a zero potential for 0 \leq x, y, z \leq L and infinite potential outside. There’s no need to use our machinery for rotations in three dimensions, since this system just decouples into three one-dimensional particle-in-a-box systems. The wavefunction is the product of one-dimensional solutions \psi_{n_x,n_y,n_z}(x,y,z) = \sqrt{\frac{8}{L^3}} \sin \left( \frac{n_x \pi x}{L} \right) \sin \left( \frac{n_y \pi y}{L} \right) \sin \left( \frac{n_z \pi z}{L} \right), and the energy is the sum E_{n_x,n_y,n_z} = \frac{\pi^2 \hbar^2}{2mL^2} (n_x^2 + n_y^2 + n_z^2).
Now, we suppose that the box is filled with N_e electrons, where N_e is very large. Neglecting interactions between the electrons, the key effect is that each energy level \ket{n_x, n_y, n_z} can only support 2 electrons; one with spin up, and one with spin down. The total energy is then just the sum, E_{\rm tot} = \sum_{n_x,n_y,n_z} 2E_{n_x,n_y,n_z} = \frac{\pi^2 \hbar^2}{mL^2} \sum_{n_x,n_y,n_z} (n_x^2 + n_y^2 + n_z^2). By assumption, N_e is very large, which means we can replace the sum with an integral, E_{\rm tot} \approx \frac{\pi^2 \hbar^2}{m_eL^2} \int dn_x dn_y dn_z (n_x^2 + n_y^2 + n_z^2) \\ = \frac{\pi^2 \hbar^2}{m_eL^2} \frac{\pi}{2} \int_0^{n_{\rm max}} dn\ n^4, \\ = \frac{\pi^3 \hbar^2}{10m_eL^2} n_{\rm max}^5, switching to spherical coordinates and doing the angular integral - it only runs over the first octant since all of the n_x,n_y,n_z must be positive, so we get 4\pi/8 from the solid angle.
Now we just need to know what n_{\rm max} is. We can write this as an integral as well, since it’s just 2 for each state, so N_e = \sum_{n_x,n_y,n_z} 2 = 2 \frac{\pi}{2} \int_0^{n_{\rm max}} dn\ n^2 \\ = \frac{\pi}{3} n_{\rm max}^3, or n_{\rm max} = (3 N_e / \pi)^{1/3}. Plugging back in, E_{\rm tot} = \frac{\pi^3 \hbar^2}{10m_eL^2} (3 N_e/\pi)^{5/3} \\ = \frac{3\hbar^2}{10m_e} N_e \left(\frac{3\pi^2 N_e}{V}\right)^{2/3}. Now, this is not the Fermi energy; the Fermi energy is the level of the highest filled energy level. Appealing to some stat mech, the total energy is related to the Fermi energy as E_{\rm tot} = \frac{3}{5} N_e E_F from which we read off E_F = \frac{\hbar^2}{2m_e} \left( \frac{3 \pi^2 N_e}{V}\right)^{2/3}, matching the textbook result.
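As a sanity check on replacing the sums with integrals, here is a short numpy sketch (my own; the helper fermi_gas_exact is hypothetical) that fills the discrete particle-in-a-box levels one by one with a large but finite number of electrons and compares the exact total energy and highest filled level against the continuum formulas above. Both ratios come out close to 1, with finite-size corrections at the few-percent level.

```python
import numpy as np

def fermi_gas_exact(N_e):
    """Fill the lowest particle-in-a-box levels with N_e electrons (2 per level)
    and return (total energy, highest filled level), in units of
    pi^2*hbar^2/(2*m_e*L^2)."""
    n_cut = int((3 * N_e / np.pi)**(1 / 3)) + 5          # safely above n_max
    n = np.arange(1, n_cut + 1)
    nx, ny, nz = np.meshgrid(n, n, n, indexing="ij")
    eps = np.sort((nx**2 + ny**2 + nz**2).ravel())       # single-particle energies
    filled = np.repeat(eps, 2)[:N_e]                     # two spin states per spatial level
    return filled.sum(), filled[-1]

N_e = 200_000
E_tot, E_F = fermi_gas_exact(N_e)

# Continuum predictions from the text, in the same units:
E_F_pred = (3 * N_e / np.pi)**(2 / 3)                    # = n_max^2
E_tot_pred = 3 / 5 * N_e * E_F_pred
print(E_F / E_F_pred, E_tot / E_tot_pred)                # both close to 1 for large N_e
```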
The derivative P_e = -\partial E_e / \partial V once again gives a pressure, which is known as the degeneracy pressure. This opposes the gravitational pressure, and in fact if we rewrite the gravitational energy as E_G = -\frac{3}{5} G (N_n M_n)^2 \left( \frac{4\pi}{3} \right)^{1/3} V^{-1/3} and then balance the two pressures (equivalently, minimize the total energy with respect to the volume), we can predict that the radius of a white dwarf should be approximately 7,000 km. So a white dwarf packs an entire solar mass into a space the size of the Earth - very compact!
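Putting in numbers, here is a short Python estimate of that radius (my own sketch, assuming one solar mass and roughly two nucleons per electron, i.e. a carbon/oxygen composition, and using the non-relativistic electron-gas energy from above).

```python
import numpy as np

# Constants in SI units
hbar, G = 1.0546e-34, 6.674e-11
m_e, m_p = 9.109e-31, 1.6726e-27
M_sun = 1.989e30

M = M_sun
N_e = M / (2 * m_p)           # about two nucleons per electron for a C/O white dwarf

# Total energy E(R) = E_e + E_G = A/R^2 - B/R, using V = 4*pi*R^3/3 in the
# electron-gas formula; minimizing over R gives R = 2A/B.
A = (3 * hbar**2 / (10 * m_e)) * N_e * (9 * np.pi * N_e / 4)**(2 / 3)
B = (3 / 5) * G * M**2
R = 2 * A / B
print(R / 1e3, "km")          # roughly 7,000 km
```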
If a white dwarf can go on to acquire more mass (by accreting from a partner star, say), the electron Fermi energy keeps rising, and eventually the star can no longer be supported by electron degeneracy pressure (the Chandrasekhar limit). One possible outcome is a thermonuclear explosion - a type Ia supernova - which typically destroys the star entirely; another is collapse, in which the electrons are captured by the protons (inverse beta decay), leaving neutrons (and neutrinos) and a neutron star, which is once again supported by degeneracy pressure. The stellar radius in our degeneracy-pressure estimate scales inversely with the mass m of the fermion, so the predicted size of a neutron star would be 7000 km times m_e / m_n \sim 4 km; we shouldn’t take this number too seriously since neutrons are strongly interacting, but real neutron stars have observed radii on the order of 10 km, so it’s not too far off.