Struggles with the Continuum (Part 5)

19 September, 2016

Quantum field theory is the best method we have for describing particles and forces in a way that takes both quantum mechanics and special relativity into account. It makes many wonderfully accurate predictions. And yet, it has embroiled physics in some remarkable problems: struggles with infinities!

I want to sketch some of the key issues in the case of quantum electrodynamics, or ‘QED’. The history of QED has been nicely told here:

• Silvian Schweber, QED and the Men who Made it: Dyson, Feynman, Schwinger, and Tomonaga, Princeton U. Press, Princeton, 1994.

Instead of explaining the history, I will give a very simplified account of the current state of the subject. I hope that experts forgive me for cutting corners and trying to get across the basic ideas at the expense of many technical details. The nonexpert is encouraged to fill in the gaps with the help of some textbooks.

QED involves just one dimensionless parameter, the fine structure constant:

\displaystyle{ \alpha = \frac{1}{4 \pi \epsilon_0} \frac{e^2}{\hbar c} \approx \frac{1}{137.036} }

Here e is the electron charge, \epsilon_0 is the permittivity of the vacuum, \hbar is the reduced Planck constant and c is the speed of light. We can think of \alpha^{1/2} as a dimensionless version of the electron charge. It says how strongly electrons and photons interact.
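As a quick numerical check of this formula, here is a minimal Python sketch. The SI values of the constants are the CODATA 2018 recommended values, which are not given in the text above, so treat them as assumptions of the example.

```python
import math

# SI values (CODATA 2018). These numbers are assumptions of this sketch,
# not taken from the text.
e = 1.602176634e-19           # magnitude of the electron charge, in coulombs
epsilon_0 = 8.8541878128e-12  # permittivity of the vacuum, in farads per meter
hbar = 1.054571817e-34        # reduced Planck constant, in joule seconds
c = 299792458.0               # speed of light, in meters per second

# Fine structure constant: alpha = e^2 / (4 pi epsilon_0 hbar c)
alpha = e**2 / (4 * math.pi * epsilon_0 * hbar * c)

print(f"alpha       = {alpha:.6e}")
print(f"1/alpha     = {1 / alpha:.3f}")
print(f"alpha^(1/2) = {math.sqrt(alpha):.4f}")
```

The units cancel, so the result is a pure number, and 1/alpha comes out close to the value 137.036 quoted above.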

Nobody knows why the fine structure constant has the value it does! In computations, we are free to treat it as an adjustable parameter. If we set it to zero, quantum electrodynamics reduces to a free theory, where photons and electrons do not interact with each other. A standard strategy in QED is to take advantage of the fact that the fine structure constant is small and expand answers to physical questions as power series in \alpha^{1/2}. This is called ‘perturbation theory’, and it allows us to exploit our knowledge of free theories.

One of the main questions we try to answer in QED is this: if we start with some particles with specified energy-momenta in the distant past, what is the probability that they will turn into certain other particles with certain other energy-momenta in the distant future? As usual, we compute this probability by first computing a complex amplitude and then taking the square of its absolute value. The amplitude, in turn, is computed as a power series in \alpha^{1/2}.

The term of order \alpha^{n/2} in this power series is a sum over Feynman diagrams with n vertices. For example, suppose we are computing the amplitude for two electrons with some specified energy-momenta to interact and become two electrons with some other energy-momenta. One Feynman diagram appearing in the answer is this:

Here the electrons exchange a single photon. Since this diagram has two vertices, it contributes a term of order \alpha. The electrons could also exchange two photons:

giving a term of order \alpha^2. A more interesting term of order \alpha^2 is this:

Here the electrons exchange a photon that splits into an electron-positron pair and then recombines. There are infinitely many diagrams with two electrons coming in and two going out. However, there are only finitely many with n vertices. Each of these contributes a term proportional to \alpha^{n/2} to the amplitude.
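Schematically, the expansion described above has the following shape. The symbols \mathcal{A}, I_D and P are introduced only for this summary and are assumptions of the sketch, not notation used elsewhere in this post:

\displaystyle{ \mathcal{A} = \sum_{n} \alpha^{n/2} \sum_{D \textrm{ with } n \textrm{ vertices}} I_D , \qquad P = |\mathcal{A}|^2 }

Here \mathcal{A} is the amplitude, the inner sum runs over the finitely many Feynman diagrams D with n vertices, I_D stands for the contribution of the diagram D with its factor of \alpha^{n/2} pulled out, and P is the resulting probability.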

In general, the external edges of these diagrams correspond to the experimentally observed particles coming in and going out. The internal edges correspond to ‘virtual particles’: that is, particles that are not directly seen, but appear in intermediate steps of a process.

Each of these diagrams is actually a notation for an integral! There are systematic rules for writing down the integral starting from the Feynman diagram. To do this, we first label each edge of the Feynman diagram with an energy-momentum, a variable p \in \mathbb{R}^4. The integrand, which we shall not describe here, is a function of all these energy-momenta. In carrying out the integral, the energy-momenta of the external edges are held fixed, since these correspond to the experimentally observed particles coming in and going out. We integrate over the energy-momenta of the internal edges, which correspond to virtual particles, while requiring that energy-momentum is conserved at each vertex.

However, there is a problem: the integral typically diverges! Whenever a Feynman diagram contains a loop, the energy-momenta of the virtual particles in this loop can be arbitrarily large. Thus, we are integrating over an infinite region. In principle the integral could still converge if the integrand goes to zero fast enough. However, we rarely have such luck.

What does this mean, physically? It means that if we allow virtual particles with arbitrarily large energy-momenta in intermediate steps of a process, there are ‘too many ways for this process to occur’, so the amplitude for this process diverges.

Ultimately, the continuum nature of spacetime is to blame. In quantum mechanics, particles with large momenta are the same as waves with short wavelengths. Allowing light with arbitrarily short wavelengths created the ultraviolet catastrophe in classical electromagnetism. Quantum electromagnetism averted that catastrophe—but the problem returns in a different form as soon as we study the interaction of photons and charged particles.

Luckily, there is a strategy for tackling this problem. The integrals for Feynman diagrams become well-defined if we impose a ‘cutoff’, integrating only over energy-momenta p in some bounded region, say a ball of some large radius \Lambda. In quantum theory, a particle with momentum of magnitude greater than \Lambda is the same as a wave with wavelength less than \hbar/\Lambda. Thus, imposing the cutoff amounts to ignoring waves of short wavelength—and for the same reason, ignoring waves of high frequency. We obtain well-defined answers to physical questions when we do this. Unfortunately the answers depend on \Lambda, and if we let \Lambda \to \infty, they diverge.
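To see what a cutoff does in the simplest possible setting, here is a toy Python sketch. It is not an actual QED integrand: as an assumption made purely for illustration, it integrates 1/(p^2 + m^2)^2 over a ball of radius \Lambda in 4-dimensional Euclidean momentum space. The function names are made up for this example.

```python
import math

def cutoff_integral(cutoff, mass=1.0):
    """Integral of 1/(p^2 + m^2)^2 over the ball |p| <= cutoff in R^4.

    Closed form: pi^2 * (ln(1 + x) + 1/(1 + x) - 1), where x = (cutoff/mass)^2.
    """
    x = (cutoff / mass) ** 2
    return math.pi**2 * (math.log1p(x) + 1.0 / (1.0 + x) - 1.0)

def cutoff_integral_numeric(cutoff, mass=1.0, steps=200_000):
    """The same integral by the midpoint rule in the radial variable.

    In R^4 a sphere of radius p has 3-dimensional area 2 pi^2 p^3.
    """
    dp = cutoff / steps
    total = 0.0
    for i in range(steps):
        p = (i + 0.5) * dp
        total += 2 * math.pi**2 * p**3 / (p**2 + mass**2) ** 2 * dp
    return total

# The answer is finite for every cutoff, but keeps growing as the cutoff grows.
for cutoff in (10.0, 100.0, 1000.0, 10000.0):
    print(f"cutoff = {cutoff:8.0f}   integral = {cutoff_integral(cutoff):.3f}")
```

For large \Lambda this toy integral grows like \pi^2 \ln(\Lambda^2/m^2): each tenfold increase of the cutoff adds roughly the same amount. So it is well-defined for every finite \Lambda, yet has no finite limit if we simply let \Lambda \to \infty.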

However, this is not the correct limiting procedure. Indeed, among the quantities that we can compute using Feynman diagrams are the charge and mass of the electron! Its charge can be computed using diagrams in which an electron emits or absorbs a photon:

Similarly, its mass can be computed using a sum over Feynman diagrams where one electron comes in and one goes out.

The interesting thing is this: to do these calculations, we must start by assuming some charge and mass for the electron, but the charge and mass we get out of these calculations do not equal the charge and mass we put in!

The reason is that virtual particles affect the observed charge and mass of a particle. Heuristically, at least, we should think of an electron as surrounded by a cloud of virtual particles. These contribute to its mass and ‘shield’ its electric field, reducing its observed charge. It takes some work to translate between this heuristic story and actual Feynman diagram calculations, but it can be done.

Thus, there are two different concepts of mass and charge for the electron. The numbers we put into the QED calculations are called the ‘bare’ charge and mass, e_\mathrm{bare} and m_\mathrm{bare}. Poetically speaking, these are the charge and mass we would see if we could strip the electron of its virtual particle cloud and see it in its naked splendor. The numbers we get out of the QED calculations are called the ‘renormalized’ charge and mass, e_\mathrm{ren} and m_\mathrm{ren}. These are computed by doing a sum over Feynman diagrams. So, they take virtual particles into account. These are the charge and mass of the electron clothed in its cloud of virtual particles. It is these quantities, not the bare quantities, that should agree with experiment.

Thus, the correct limiting procedure in QED calculations is a bit subtle. For any value of \Lambda and any choice of e_\mathrm{bare} and m_\mathrm{bare}, we compute e_\mathrm{ren} and m_\mathrm{ren}. The necessary integrals all converge, thanks to the cutoff. We choose e_\mathrm{bare} and m_\mathrm{bare} so that e_\mathrm{ren} and m_\mathrm{ren} agree with the experimentally observed charge and mass of the electron. The bare charge and mass chosen this way depend on \Lambda, so call them e_\mathrm{bare}(\Lambda) and m_\mathrm{bare}(\Lambda).

Next, suppose we want to compute the answer to some other physics problem using QED. We do the calculation with a cutoff \Lambda, using e_\mathrm{bare}(\Lambda) and m_\mathrm{bare}(\Lambda) as the bare charge and mass in our calculation. Then we take the limit \Lambda \to \infty.
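To make this bookkeeping concrete, here is a toy sketch in Python. It is not the full QED calculation. It assumes the textbook one-loop, leading-logarithm relation between the bare and renormalized couplings, 1/\alpha_\mathrm{ren} = 1/\alpha_\mathrm{bare} + \frac{1}{3\pi} \ln(\Lambda^2/m^2), which does not appear in the text above, and it ignores mass renormalization and all higher-order terms. The function names are made up for this example.

```python
import math

ALPHA_OBSERVED = 1 / 137.036  # renormalized coupling, matched to experiment

def renormalized_coupling(bare, log_cutoff):
    """Toy one-loop renormalized coupling computed from a bare coupling.

    log_cutoff stands for ln(Lambda^2 / m^2). The formula is an assumed
    leading-logarithm approximation, not the full QED result.
    """
    return bare / (1 + bare * log_cutoff / (3 * math.pi))

def bare_coupling(log_cutoff, target=ALPHA_OBSERVED):
    """Bare coupling chosen so that the renormalized coupling equals target."""
    return target / (1 - target * log_cutoff / (3 * math.pi))

for log_cutoff in (10.0, 50.0, 100.0):
    bare = bare_coupling(log_cutoff)
    # The bare coupling changes with the cutoff; the renormalized one does not.
    print(log_cutoff, bare, renormalized_coupling(bare, log_cutoff))
```

In this toy model the bare coupling we need grows as the cutoff is raised, in line with the picture of virtual particles shielding the charge, while the renormalized coupling stays pinned to the observed value.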

In short, rather than simply fixing the bare charge and mass and letting \Lambda \to \infty, we cleverly adjust the bare charge and mass as we take this limit. This procedure is called ‘renormalization’, and it has a complex and fascinating history:

• Laurie M. Brown, ed., Renormalization: From Lorentz to Landau (and Beyond), Springer, Berlin, 2012.

There are many technically different ways to carry out renormalization, and our account so far neglects many important issues. Let us mention three of the simplest.

First, besides the classes of Feynman diagrams already mentioned, we must also consider those where one photon goes in and one photon goes out, such as this:

These affect properties of the photon, such as its mass. Since we want the photon to be massless in QED, we have to adjust parameters as we take \Lambda \to \infty to make sure we obtain this result. We must also consider Feynman diagrams where nothing comes in and nothing comes out—so-called ‘vacuum bubbles’—and make these behave correctly as well.

Second, the procedure just described, where we impose a ‘cutoff’ and integrate over energy-momenta p lying in a ball of radius \Lambda, is not invariant under Lorentz transformations. Indeed, any theory featuring a smallest time or smallest distance violates the principles of special relativity: thanks to time dilation and Lorentz contractions, different observers will disagree about times and distances. We could accept that Lorentz invariance is broken by the cutoff and hope that it is restored in the \Lambda \to \infty limit, but physicists prefer to maintain symmetry at every step of the calculation. This requires some new ideas: for example, replacing Minkowski spacetime with 4-dimensional Euclidean space. In 4-dimensional Euclidean space, Lorentz transformations are replaced by rotations, and a ball of radius \Lambda is a rotation-invariant concept. To do their Feynman integrals in Euclidean space, physicists often let time take imaginary values. They do their calculations in this context and then transfer the results back to Minkowski spacetime at the end. Luckily, there are theorems justifying this procedure.

Third, besides infinities that arise from waves with arbitrarily short wavelengths, there are infinities that arise from waves with arbitrarily long wavelengths. The former are called ‘ultraviolet divergences’. The latter are called ‘infrared divergences’, and they afflict theories with massless particles, like the photon. For example, in QED the collision of two electrons will emit an infinite number of photons with very long wavelengths and low energies, called ‘soft photons’. In practice this is not so bad, since any experiment can only detect photons with energies above some nonzero value. However, infrared divergences are conceptually important. It seems that in QED any electron is inextricably accompanied by a cloud of soft photons. These are real, not virtual particles. This may have remarkable consequences.

Battling these and many other subtleties, many brilliant physicists and mathematicians have worked on QED. The good news is that this theory has been proved to be ‘perturbatively renormalizable’:

• J. S. Feldman, T. R. Hurd, L. Rosen and J. D. Wright, QED: A Proof of Renormalizability, Lecture Notes in Physics 312, Springer, Berlin, 1988.

• Günter Scharf, Finite Quantum Electrodynamics: The Causal Approach, Springer, Berlin, 1995.

This means that we can indeed carry out the procedure roughly sketched above, obtaining answers to physical questions as power series in \alpha^{1/2}.

The bad news is we do not know if these power series converge. In fact, it is widely believed that they diverge! This puts us in a curious situation.

For example, consider the magnetic dipole moment of the electron. An electron, being a charged particle with spin, has a magnetic field. A classical computation says that its magnetic dipole moment is

\displaystyle{ \vec{\mu} = -\frac{e}{2m_e} \vec{S} }

where \vec{S} is its spin angular momentum. Quantum effects correct this computation, giving

\displaystyle{ \vec{\mu} = -g \frac{e}{2m_e} \vec{S} }

for some constant g called the gyromagnetic ratio, which can be computed using QED as a sum over Feynman diagrams with an electron exchanging a single photon with a massive charged particle:

The answer is a power series in \alpha^{1/2}, but since all these diagrams have an even number of vertices, it only contains integral powers of \alpha. The lowest-order term gives simply g = 2. In 1948, Julian Schwinger computed the next term and found a small correction to this simple result:

\displaystyle{ g = 2 + \frac{\alpha}{\pi} \approx 2.00232 }
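The arithmetic behind this number is easy to reproduce. Here is a minimal Python sketch, using the approximate value \alpha \approx 1/137.036 quoted earlier; the variable names are just for this example.

```python
import math

alpha = 1 / 137.036  # fine structure constant, approximate value quoted earlier

g_lowest_order = 2.0                  # lowest-order term
schwinger_term = alpha / math.pi      # Schwinger's correction of order alpha
g = g_lowest_order + schwinger_term

print(f"alpha/pi = {schwinger_term:.7f}")
print(f"g        = {g:.5f}")
```

The correction \alpha/\pi is only about 0.0023, which is why the lowest-order value g = 2 is already a decent approximation.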

By now a team led by Toichiro Kinoshita has computed g up to order \alpha^5. This requires computing over 13,000 integrals, one for each Feynman diagram of the above form with up to 10 vertices! The answer agrees very well with experiment: in fact, if we also take other Standard Model effects into account we get agreement to roughly one part in 10^{12}.

This is the most accurate prediction in all of science.

However, as mentioned, it is widely believed that this power series diverges! Next time I’ll explain why physicists think this, and what it means for a divergent series to give such a good answer when you add up the first few terms.

The Circular Electron Positron Collider

16 September, 2016

Chen-Ning Yang is perhaps China’s most famous particle physicist. Together with Tsung-Dao Lee, he won the Nobel prize in 1957 for discovering that the laws of physics know the difference between left and right. He helped create Yang–Mills theory: the theory that describes all the forces in nature except gravity. He helped find the Yang–Baxter equation, which describes what particles do when they move around on a thin sheet of matter, tracing out braids.

Right now the world of particle physics is in a shocked, somewhat demoralized state because the Large Hadron Collider has not yet found any physics beyond the Standard Model. Some Chinese scientists want to forge ahead by building an even more powerful, even more expensive accelerator.

But Yang recently came out against this. This is a big deal, because he is very prestigious, and only China has the will to pay for the next machine. The director of the Chinese institute that wants to build the next machine, Wang Yifang, issued a point-by-point rebuttal of Yang the very next day.

Over on G+, Willie Wong translated some of Wang’s rebuttal in some comments to my post on this subject. The real goal of my post here is to make this translation a bit easier to find—not because I agree with Wang, but because this discussion is important: it affects the future of particle physics.

First let me set the stage. In 2012, two months after the Large Hadron Collider found the Higgs boson, the Institute of High Energy Physics proposed a bigger machine: the Circular Electron Positron Collider, or CEPC.

This machine would be a ring 100 kilometers around. It would collide electrons and positrons at an energy of 250 GeV, about twice what you need to make a Higgs. It could make lots of Higgs bosons and study their properties. It might find something new, too! Of course that would be the hope.

It would cost $6 billion, and the plan was that China would pay for 70% of it. Nobody knows who would pay for the rest.

According to Science:

On 4 September, Yang, in an article posted on the social media platform WeChat, says that China should not build a supercollider now. He is concerned about the huge cost and says the money would be better spent on pressing societal needs. In addition, he does not believe the science justifies the cost: The LHC confirmed the existence of the Higgs boson, he notes, but it has not discovered new particles or inconsistencies in the standard model of particle physics. The prospect of an even bigger collider succeeding where the LHC has failed is “a guess on top of a guess,” he writes. Yang argues that high-energy physicists should eschew big accelerator projects for now and start blazing trails in new experimental and theoretical approaches.

That same day, IHEP’s director, Wang Yifang, posted a point-by-point rebuttal on the institute’s public WeChat account. He criticized Yang for rehashing arguments he had made in the 1970s against building the BEPC. “Thanks to comrade [Deng] Xiaoping,” who didn’t follow Yang’s advice, Wang wrote, “IHEP and the BEPC … have achieved so much today.” Wang also noted that the main task of the CEPC would not be to find new particles, but to carry out detailed studies of the Higgs boson.

Yang did not respond to request for comment. But some scientists contend that the thrust of his criticisms are against the CEPC’s anticipated upgrade, the Super Proton-Proton Collider (SPPC). “Yang’s objections are directed mostly at the SPPC,” says Li Miao, a cosmologist at Sun Yat-sen University, Guangzhou, in China, who says he is leaning toward supporting the CEPC. That’s because the cost Yang cites—$20 billion—is the estimated price tag of both the CEPC and the SPPC, Li says, and it is the SPPC that would endeavor to make discoveries beyond the standard model.

Still, opposition to the supercollider project is mounting outside the high-energy physics community. Cao Zexian, a researcher at CAS’s Institute of Physics here, contends that Chinese high-energy physicists lack the ability to steer or lead research in the field. China also lacks the industrial capacity for making advanced scientific instruments, he says, which means a supercollider would depend on foreign firms for critical components. Luo Huiqian, another researcher at the Institute of Physics, says that most big science projects in China have suffered from arbitrary cost cutting; as a result, the finished product is often a far cry from what was proposed. He doubts that the proposed CEPC would be built to specifications.

The state news agency Xinhua has lauded the debate as “progress in Chinese science” that will make big science decision-making “more transparent.” Some, however, see a call for transparency as a bad omen for the CEPC. “It means the collider may not receive the go-ahead in the near future,” asserts Institute of Physics researcher Wu Baojun. Wang acknowledged that possibility in a 7 September interview with Caijing magazine: “opposing voices naturally have an impact on future approval of the project,” he said.

Willie Wong prefaced his translation of Wang’s rebuttal with this:

Here is a translation of the essential parts of the rebuttal; some standard Chinese language disclaimers of deference etc are omitted. I tried to make the translation as true to the original as possible; the viewpoints expressed are not my own.

Here is the translation:

Today (September 4) published the article by CN Yang titled “China should not build an SSC today”. As a scientist who works on the front line of high energy physics and the current director of the high energy physics institute in the Chinese Academy of Sciences, I cannot agree with his viewpoint.

(A) The first reason to Dr. Yang’s objection is that a supercollider is a bottomless hole. His objection stemmed from the American SSC wasting 3 billion US dollars and amounted to naught. The LHC cost over 10 billion US dollars. Thus the proposed Chinese accelerator cannot cost less than 20 billion US dollars, with no guaranteed returns. [Ed: emphasis original]

Here, there are actually three problems. The first is “why did SSC fail”? The second is “how much would a Chinese collider cost?” And the third is “is the estimate reasonable and realistic?” Here I address them point by point.

(1) Why did the American SSC fail? Are all colliders bottomless pits?

The many reasons leading to the failure of the American SSC include the government deficit at the time, the fight for funding against the International Space Station, the party politics of the United States, the regional competition between Texas and other states. Additionally there are problems with poor management, bad budgeting, ballooning construction costs, failure to secure international collaboration. See references [2,3] [Ed: consult original article for references; items 1-3 are English language]. In reality, “exceeding the budget” is definitely not the primary reason for the failure of the SSC; rather, the failure should be attributed to some special and circumstantial reasons, caused mainly by political elements.

For the US, abandoning the SSC was a very incorrect decision. It lost the US the chance for discovering the Higgs Boson, as well as the foundations and opportunities for future development, and thereby also the leadership position that US has occupied internationally in high energy physics until then. This definitely had a very negative impact on big science initiatives in the US, and caused one generation of Americans to lose the courage to dream. The reasons given by the American scientific community against the SSC are very similar to what we hear today against the Chinese collider project. But actually the cancellation of the SSC did not increase funding to other scientific endeavors. Of course, activation of the SSC would not have reduced the funding to other scientific endeavors, and many people who objected to the project are not regretting it.

Since then, LHC was constructed in Europe, and achieved great success. Its construction exceeded its original budget, but not by a lot. This shows that supercollider projects do not have to be bottomless, and have a chance to succeed.

The Chinese political landscape is entirely different from that of the US. In particular, for large scale constructions, the political system is superior. China has already accomplished to date many tasks which the Americans would not, or could not do; many more will happen in the future. The failure of SSC doesn’t mean that we cannot do it. We should scientifically analyze the situation, and at the same time foster international collaboration, and properly manage the budget.

(2) How much would it cost? Our planned collider (using circumference of 100 kilometers for computations) will proceed in two steps. [Ed: details omitted. The author estimated that the electron-positron collider will cost 40 Billion Yuan, followed by the proton-proton collider which will cost 100 billion Yuan, not accounting for inflation. With approximately 10 year construction time for each phase.] The two-phase planning is to showcase the scientific longevity of the project, especially entrainment of other technical development (e.g. high energy superconductors), and that the second phase [ed: the proton-proton collider] is complementary to the scientific and technical developments of the first phase. The reason that the second phase designs are incorporated in the discussion is to prevent the scenario where design elements of the first phase inadvertently shuts off possibility of further expansion in the second phase.

(3) Is this estimate realistic? Are we going to go down the same road as the American SSC?

First, note that in the past 50 years, there were many successful colliders internationally (LEP, LHC, PEPII, KEKB/SuperKEKB etc) and many unsuccessful ones (ISABELLE, SSC, FAIR, etc). The failed ones are all proton accelerators. All electron colliders have been successful. The main reason is that proton accelerators are more complicated, and it is harder to correctly estimate the costs related to constructing machines beyond the current frontiers.

There are many successful large-scale constructions in China. In the 40 years since the founding of the high energy physics institute, we’ve built [list of high energy experiment facilities, I don’t know all their names in English], each costing over 100 million Yuan, and none are more than 5% over budget, in terms of actual costs of construction, time to completion, meeting milestones. We have a well developed expertise in budget, construction, and management.

For the CEPC (electron-positron collider) our estimates relied on two methods:

(i) Summing of the parts: separately estimating costs of individual elements and adding them up.

(ii) Comparisons: using costs for elements derived from costs of completed instruments both domestically and abroad.

At the level of the total cost and at the systems level, the two methods should produce cost estimates within 20% of each other.

After completing the initial design [ref. 1], we produced a list of more than 1000 required equipments, and based our estimates on that list. The estimates are refereed by local and international experts.

For the SPPC (the proton-proton collider; second phase) we only used the second method (comparison). This is due to the second phase not being the main mission at hand, and we are not yet sure whether we should commit to the second phase. It is therefore not very meaningful to discuss its potential cost right now. We are committed to only building the SPPC once we are sure the science and the technology are mature.

(B) The second reason given by Dr. Yang is that China is still a developing country, and there are many social-economic problems that should be solved before considering a supercollider.

Any country, especially one as big as China, must consider both the immediate and the long-term in its planning. Of course social-economic problems need to be solved, and indeed solving them is taking currently the lion’s share of our national budget. But we also need to consider the long term, including an appropriate amount of expenditures on basic research, to enable our continuous development and the potential to lead the world. The China at the end of the Qing dynasty has a rich populace with the world’s highest GDP. But even though the government has the ability to purchase armaments, the lack of scientific understanding reduced the country to always be on the losing side of wars.

In the past few hundred years, developments into understanding the structure of matter, from molecules, atoms, to the nucleus, the elementary particles, all contributed and led the scientific developments of their era. High energy physics pursue the finest structure of matter and its laws, the techniques used cover many different fields, from accelerator, detectors, to low temperature, superconducting, microwave, high frequency, vacuum, electronic, high precision instrumentation, automatic controls, computer science and networking, in many ways led to the developments in those fields and their broad adoption. This is an indicator field in basic science and technical developments. Building the supercollider can result in China occupying the leadership position in such diverse scientific fields for several decades, and also lead to the domestic production of many of the important scientific and technical instruments. Furthermore, it will allow us to attract international intellectual capital, and allow the training of thousands of world-leading specialists in our institutes. How is this not an urgent need for the country?

In fact, the impression the Chinese government and the average Chinese people create for the world at large is a populace with lots of money, and also infatuated with money. It is hard for a large country to have an international voice and influence without significant contribution to the human culture. This influence, in turn, affects the benefits China receives from other countries. In terms of current GDP, the proposed project (including also the phase 2 SPPC) does not exceed that of the Beijing positron-electron collider completed in the 80s, and is in fact lower than LEP, LHC, SSC, and ILC.

Designing and starting the construction of the next supercollider within the next 5 years is a rare opportunity to let us achieve a leadership position internationally in the field of high energy physics. The newly discovered Higgs boson has a relatively low mass, which allows us to probe it further using a circular positron-electron collider. Furthermore, such colliders have a chance to be modified into proton colliders. This facility will have over 5 decades of scientific use. Furthermore, currently Europe, US, and Japan all already have scientific items on their agenda, and within 20 years probably cannot construct similar facilities. This gives us an advantage in competitiveness. Thirdly, we already have the experience building the Beijing positron-electron collider, so such a facility is in our strengths. The window of opportunity typically lasts only 10 years, if we miss it, we don’t know when the next window will be. Furthermore, we have extensive experience in underground construction, and the Chinese economy is currently at a stage of high growth. We have the ability to do the constructions and also the scientific need. Therefore a supercollider is a very suitable item to consider.

(C) The third reason given by Dr. Yang is that constructing a supercollider necessarily excludes funding other basic sciences.

China currently spends 5% of all R&D budget on basic research; internationally 15% is more typical for developed countries. As a developing country aiming to join the ranks of developed countries, and as a large country, I believe we should aim to raise the ratio to 10% gradually and eventually to 15%. In terms of numbers, funding for basic science has a large potential for growth (around 100 billion yuan per annum) without taking away from other basic science research.

On the other hand, where should the increased funding be directed? Everyone knows that a large portion of our basic science research budgets are spent on purchasing scientific instruments, especially from international sources. If we evenly distribute the increased funding among all basic science fields, the end result is raising the GDP of US, Europe, and Japan. If we instead spend 10 years putting 30 billion Yuan into accelerator science, more than 90% of the money will remain in the country, and improve our technical development and market share of domestic companies. This will also allow us to raise many new scientists and engineers, and greatly improve the state of art in domestically produced scientific instruments.

In addition, putting emphasis into high energy physics will only bring us to the normal funding level internationally (it is a fact that particle physics and nuclear physics are severely underfunded in China). For the purposes of developing a world-leading big science project, CEPC is a very good candidate. And it does not contradict a desire to also develop other basic sciences.

(D) Dr. Yang’s fourth objection is that both supersymmetry and quantum gravity have not been verified, and the particles we hope to discover using the new collider will in fact be nonexistent.

That is of course not the goal of collider science. In [ref 1] which I gave to Dr. Yang myself, we clearly discussed the scientific purpose of the instrument. Briefly speaking, the standard model is only an effective theory in the low energy limit, and a new and deeper theory is needed. Even though there is some experimental evidence beyond the standard model, more data will be needed to indicate the correct direction to develop the theory. Of the known problems with the standard model, most are related to the Higgs Boson. Thus a deeper physical theory should have hints in a better understanding of the Higgs boson. CEPC can probe to 1% precision [ed. I am not sure what this means] Higgs bosons, 10 times better than LHC. From this we have the hope to correctly identify various properties of the Higgs boson, and test whether it in fact matches the standard model. At the same time, CEPC has the possibility of measuring the self-coupling of the Higgs boson, of understanding the Higgs contribution to vacuum phase transition, which is important for understanding the early universe. [Ed. in this previous sentence, the translations are a bit questionable since some HEP jargon is used with which I am not familiar] Therefore, regardless of whether LHC has discovered new physics, CEPC is necessary.

If there are new coupling mechanisms for Higgs, new associated particles, composite structure for Higgs boson, or other differences from the standard model, we can continue with the second phase of the proton-proton collider, to directly probe the difference. Of course this could be due to supersymmetry, but it could also be due to other particles. For us experimentalists, while we care about theoretical predictions, our experiments are not designed only for them. To predict whether a collider can or cannot discover a hypothetical particle at this moment in time seems premature, and is not the view point of the HEP community in general.

(E) The fifth objection is that in the past 70 years high energy physics have not led to tangible improvements to humanity, and in the future likely will not.

In the past 70 years, there have been many results from high energy physics, which led to techniques common to everyday life. [Ed: the list of examples includes synchrotron radiation, free electron laser, scatter neutron source, MRI, PET, radiation therapy, touch screens, smart phones, the world-wide web. I omit the prose.]

[Ed. Author proceeds to discuss hypothetical economic benefits from
a) superconductor science
b) microwave source
c) cryogenics
d) electronics
sort of the usual stuff you see in funding proposals.]

(F) The sixth reason was that the Institute of High Energy Physics of the Chinese Academy of Sciences has not produced much in the past 30 years. The major scientific contributions to the proposed collider will be directed by non-Chinese, and so the Nobel will also go to a non-Chinese.

[Ed. I’ll skip this section because it is a self-congratulatory pat on one’s back (we actually did pretty well for the amount of money invested), a promise to promote Chinese participation in the project (in accordance to the economic investment), and the required comment that “we do science for the sake of science, and not for winning the Nobel.”]

(G) The seventh reason is that the future in HEP is in developing a new technique to accelerate particles, and developing a geometric theory, not in building large accelerators.

A new method to accelerate particles is definitely an important aspect to accelerator science. In the next several decades this can prove useful for scattering experiments or for applied fields where beam confinement is not essential. For high energy colliders, in terms of beam emittance and energy efficiency, new acceleration principles have a long way to go. During this period, high energy physics cannot be simply put on hold. In terms of “geometric theory” or “string theory”, these are too far from being experimentally approachable, and are not problems we can consider currently.

People disagree on the future of high energy physics. Currently there are no Chinese winners of the Nobel prize in physics, but there are many internationally. Dr. Yang’s viewpoints are clearly out of mainstream. Not just currently, but also in the past several decades. Dr. Yang has been documented to have held a pessimistic view of higher energy physics and its future since the 60s, and that’s how he missed out on the discovery of the standard model. He is on record as being against Chinese collider science since the 70s. It is fortunate that the government supported the Institute of High Energy Physics and constructed various supporting facilities, leading to our current achievements in synchrotron radiation and neutron scattering. For the future, we should listen to the younger scientists at the forefront of current research, for that’s how we can gain international recognition for our scientific research.

It will be very interesting to see how this plays out.

Struggles with the Continuum (Part 4)

14 September, 2016

In this series we’re looking at mathematical problems that arise in physics due to treating spacetime as a continuum—basically, problems with infinities.

In Part 1 we looked at classical point particles interacting gravitationally. We saw they could convert an infinite amount of potential energy into kinetic energy in a finite time! Then we switched to electromagnetism, and went a bit beyond traditional Newtonian mechanics: in Part 2 we threw quantum mechanics into the mix, and in Part 3 we threw in special relativity. Briefly, quantum mechanics made things better, but special relativity made things worse.

Now let’s throw in both!

When we study charged particles interacting electromagnetically in a way that takes both quantum mechanics and special relativity into account, we are led to quantum field theory. The ensuing problems are vastly more complicated than in any of the physical theories we have discussed so far. They are also more consequential, since at present quantum field theory is our best description of all known forces except gravity. As a result, many of the best minds in 20th-century mathematics and physics have joined the fray, and it is impossible here to give more than a quick summary of the situation. This is especially true because the final outcome of the struggle is not yet known.

It is ironic that quantum field theory originally emerged as a solution to a problem involving the continuum nature of spacetime, now called the ‘ultraviolet catastrophe’. In classical electromagnetism, a box with mirrored walls containing only radiation acts like a collection of harmonic oscillators, one for each vibrational mode of the electromagnetic field. If we assume waves can have arbitrarily small wavelengths, there are infinitely many of these oscillators. In classical thermodynamics, a collection of harmonic oscillators in thermal equilibrium will share the available energy equally: this result is called the ‘equipartition theorem’.

Taken together, these principles lead to a dilemma worthy of Zeno. The energy in the box must be divided into an infinite number of equal parts. If the energy in each part is nonzero, the total energy in the box must be infinite. If it is zero, there can be no energy in the box.

For the logician, there is an easy way out: perhaps a box of electromagnetic radiation can only be in thermal equilibrium if it contains no energy at all! But this solution cannot satisfy the physicist, since it does not match what is actually observed. In reality, any nonnegative amount of energy is allowed in thermal equilibrium.

The way out of this dilemma was to change our concept of the harmonic oscillator. Planck did this in 1900, almost without noticing it. Classically, a harmonic oscillator can have any nonnegative amount of energy. Planck instead treated the energy

…not as a continuous, infinitely divisible quantity, but as a discrete quantity composed of an integral number of finite equal parts.

In modern notation, the allowed energies of a quantum harmonic oscillator are integer multiples of \hbar \omega, where \omega is the oscillator’s frequency and \hbar is a new constant of nature, named after Planck. When energy can only take such discrete values, the equipartition theorem no longer applies. Instead, the principles of thermodynamics imply that there is a well-defined thermal equilibrium in which vibrational modes with shorter and shorter wavelengths, and thus higher and higher energies, hold less and less of the available energy. The results agree with experiments when the constant \hbar is given the right value.
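To see how discreteness tames the short wavelengths, here is a minimal Python sketch comparing the classical equipartition value k_B T with the mean thermal energy of a quantum oscillator whose levels are integer multiples of \hbar \omega. The function names and the sample temperature are my own choices for illustration, and the formula \hbar\omega / (e^{\hbar\omega/k_B T} - 1) is the standard textbook result for such an oscillator, not something derived in this post.

```python
import math

HBAR = 1.054571817e-34  # reduced Planck constant, J*s
K_B = 1.380649e-23      # Boltzmann constant, J/K


def classical_mean_energy(temperature):
    """Equipartition: every oscillator holds k_B*T, whatever its frequency."""
    return K_B * temperature


def planck_mean_energy(omega, temperature):
    """Mean thermal energy of an oscillator with energy levels n*hbar*omega."""
    x = HBAR * omega / (K_B * temperature)
    # expm1(x) = exp(x) - 1, computed accurately even when x is tiny
    return HBAR * omega / math.expm1(x)


if __name__ == "__main__":
    T = 300.0  # an arbitrary sample temperature in kelvin
    for omega in (1e11, 1e13, 1e14, 1e15):
        ratio = planck_mean_energy(omega, T) / classical_mean_energy(T)
        print(f"omega = {omega:.0e} rad/s: quantum/classical = {ratio:.3e}")
```

At low frequencies the ratio is close to 1, so equipartition is a fine approximation there; at high frequencies it falls off exponentially, which is why the total energy stays finite.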

The full import of what Planck had done became clear only later, starting with Einstein’s 1905 paper on the photoelectric effect, for which he won the Nobel prize. Einstein proposed that the discrete energy steps actually arise because light comes in particles, now called ‘photons’, with a photon of frequency \omega carrying energy \hbar\omega. It was even later that Ehrenfest emphasized the role of the equipartition theorem in the original dilemma, and called this dilemma the ‘ultraviolet catastrophe’. As usual, the actual history is more complicated than the textbook summaries. For details, try:

• Helge Kragh, Quantum Generations: A History of Physics in the Twentieth Century, Princeton U. Press, Princeton, 1999.

The theory of the ‘free’ quantum electromagnetic field—that is, photons not interacting with charged particles—is now well-understood. It is a bit tricky to deal with an infinite collection of quantum harmonic oscillators, but since each evolves independently from all the rest, the issues are manageable. Many advances in analysis were required to tackle these issues in a rigorous way, but they were erected on a sturdy scaffolding of algebra. The reason is that the quantum harmonic oscillator is exactly solvable in terms of well-understood functions, and so is the free quantum electromagnetic field. By the 1930s, physicists knew precise formulas for the answers to more or less any problem involving the free quantum electromagnetic field. The challenge to mathematicians was then to find a coherent mathematical framework that takes us to these answers starting from clear assumptions. This challenge was taken up and completely met by the mid-1960s.

However, for physicists, the free quantum electromagnetic field is just the starting-point, since this field obeys a quantum version of Maxwell’s equations where the charge density and current density vanish. Far more interesting is ‘quantum electrodynamics’, or QED, where we also include fields describing charged particles—for example, electrons and their antiparticles, positrons—and try to impose a quantum version of the full-fledged Maxwell equations. Nobody has found a fully rigorous formulation of QED, nor has anyone proved such a thing cannot be found!

QED is part of a more complicated quantum field theory, the Standard Model, which describes the electromagnetic, weak and strong forces, quarks and leptons, and the Higgs boson. It is widely regarded as our best theory of elementary particles. Unfortunately, nobody has found a rigorous formulation of this theory either, despite decades of hard work by many smart physicists and mathematicians.

To spur progress, the Clay Mathematics Institute has offered a million-dollar prize for anyone who can prove a widely believed claim about a class of quantum field theories called ‘pure Yang–Mills theories’.

A good example is the fragment of the Standard Model that describes only the strong force—or in other words, only gluons. Unlike photons in QED, gluons interact with each other. To win the prize, one must prove that the theory describing them is mathematically consistent and that it describes a world where the lightest particle is a ‘glueball’: a blob made of gluons, with mass strictly greater than zero. This theory is considerably simpler than the Standard Model. However, it is already very challenging.

This is not the only million-dollar prize that the Clay Mathematics Institute is offering for struggles with the continuum. They are also offering one for a proof of global existence of solutions to the Navier–Stokes equations for fluid flow. However, their quantum field theory challenge is the only one for which the problem statement is not completely precise. The Navier–Stokes equations are a collection of partial differential equations for the velocity and pressure of a fluid. We know how to precisely phrase the question of whether these equations have a well-defined solution for all time given smooth initial data. Describing a quantum field theory is a trickier business!

To be sure, there are a number of axiomatic frameworks for quantum field theory:

• Ray Streater and Arthur Wightman, PCT, Spin and Statistics, and All That, Benjamin Cummings, San Francisco, 1964.

• James Glimm and Arthur Jaffe, Quantum Physics: A Functional Integral Point of View, Springer, Berlin, 1987.

• John C. Baez, Irving Segal and Zhengfang Zhou, Introduction to Algebraic and Constructive Quantum Field Theory, Princeton U. Press, Princeton, 1992.

• Rudolf Haag, Local Quantum Physics: Fields, Particles, Algebras, Springer, Berlin, 1996.

We can prove physically interesting theorems from these axioms, and also rigorously construct some quantum field theories obeying these axioms. The easiest are the free theories, which describe non-interacting particles. There are also examples of rigorously constructed quantum field theories that describe interacting particles in fewer than 4 spacetime dimensions. However, no quantum field theory that describes interacting particles in 4-dimensional spacetime has been proved to obey the usual axioms. Thus, much of the wisdom of physicists concerning quantum field theory has not been fully transformed into rigorous mathematics.

Worse, the question of whether a particular quantum field theory studied by physicists obeys the usual axioms is not completely precise—at least, not yet. The problem is that going from the physicists’ formulation to a mathematical structure that might or might not obey the axioms involves some choices.

This is not a cause for despair; it simply means that there is much work left to be done. In practice, quantum field theory is marvelously good for calculating answers to physics questions. The answers involve approximations. In practice the approximations work very well. The problem is just that we do not fully understand, in a mathematically rigorous way, what these approximations are supposed to be approximating.

How could this be? In the next part, I’ll sketch some of the key issues in the case of quantum electrodynamics. I won’t get into all the technical details: they’re too complicated to explain in one blog article. Instead, I’ll just try to give you a feel for what’s at stake.

Struggles with the Continuum (Part 3)

12 September, 2016

In these posts, we’re seeing how our favorite theories of physics deal with the idea that space and time are a continuum, with points described as lists of real numbers. We’re not asking if this idea is true: there’s no clinching evidence to answer that question, so it’s too easy to let one’s philosophical prejudices choose the answer. Instead, we’re looking to see what problems this idea causes, and how physicists have struggled to solve them.

We started with the Newtonian mechanics of point particles attracting each other with an inverse square force law. We found strange ‘runaway solutions’ where particles shoot to infinity in a finite amount of time by converting an infinite amount of potential energy into kinetic energy.

Then we added quantum mechanics, and we saw this problem went away, thanks to the uncertainty principle.

Now let’s take away quantum mechanics and add special relativity. Now our particles can’t go faster than light. Does this help?

Point particles interacting with the electromagnetic field

Special relativity prohibits instantaneous action at a distance. Thus, most physicists believe that special relativity requires that forces be carried by fields, with disturbances in these fields propagating no faster than the speed of light. The argument for this is not watertight, but we seem to actually see charged particles transmitting forces via a field, the electromagnetic field—that is, light. So, most work on relativistic interactions brings in fields.

Classically, charged point particles interacting with the electromagnetic field are described by two sets of equations: Maxwell’s equations and the Lorentz force law. The first are a set of differential equations involving:

• the electric field \vec E and magnetic field \vec B, which in special relativity are bundled together into the electromagnetic field F, and

• the electric charge density \rho and current density \vec \jmath, which are bundled into another field called the ‘four-current’ J.

By themselves, these equations are not enough to completely determine the future given initial conditions. In fact, you can choose \rho and \vec \jmath freely, subject to the conservation law

\displaystyle{ \frac{\partial \rho}{\partial t} + \nabla \cdot \vec \jmath = 0 }

For any such choice, there exists a solution of Maxwell’s equations for t \ge 0 given initial values for \vec E and \vec B that obey these equations at t = 0.

Thus, to determine the future given initial conditions, we also need equations that say what \rho and \vec{\jmath} will do. For a collection of charged point particles, they are determined by the curves in spacetime traced out by these particles. The Lorentz force law says that the force on a particle of charge e is

\vec{F} = e (\vec{E} + \vec{v} \times \vec{B})

where \vec v is the particle’s velocity and \vec{E} and \vec{B} are evaluated at the particle’s location. From this law we can compute the particle’s acceleration if we know its mass.
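As a small illustration of the Lorentz force law, here is a Python sketch that evaluates \vec{F} = e (\vec{E} + \vec{v} \times \vec{B}) for vectors given as 3-tuples. The helper names are hypothetical, and the sample values are arbitrary numbers in SI units chosen only to make the arithmetic easy to follow.

```python
def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    """Euclidean dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def lorentz_force(charge, E, v, B):
    """Force e*(E + v x B), with E and B evaluated at the particle's location."""
    v_cross_B = cross(v, B)
    return tuple(charge * (E[i] + v_cross_B[i]) for i in range(3))


if __name__ == "__main__":
    # Unit charge moving along x through a magnetic field along z, no electric field.
    v = (1.0, 0.0, 0.0)
    F = lorentz_force(1.0, (0.0, 0.0, 0.0), v, (0.0, 0.0, 1.0))
    print("force:", F)
    print("power F.v:", dot(F, v))
```

In this purely magnetic example the force is perpendicular to the velocity, so it does no work on the particle.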

The trouble starts when we try to combine Maxwell’s equations and the Lorentz force law in a consistent way, with the goal being to predict the future behavior of the \vec{E} and \vec{B} fields, together with particles’ positions and velocities, given all these quantities at t = 0. Attempts to do this began in the late 1800s. The drama continues today, with no definitive resolution! You can find good accounts, available free online, by Feynman and by Janssen and Mecklenburg. Here we can only skim the surface.

The first sign of a difficulty is this: the charge density and current associated to a charged particle are singular, vanishing off the curve it traces out in spacetime but ‘infinite’ on this curve. For example, a charged particle at rest at the origin has

\rho(t,\vec x) = e \delta(\vec{x}), \qquad \vec{\jmath}(t,\vec{x}) = \vec{0}

where \delta is the Dirac delta and e is the particle’s charge. This in turn forces the electric field to be singular at the origin. The simplest solution of Maxwell’s equations consistent with this choice of \rho and \vec\jmath is

\displaystyle{ \vec{E}(t,\vec x) = \frac{e \hat{r}}{4 \pi \epsilon_0 r^2}, \qquad \vec{B}(t,\vec x) = 0 }

where \hat{r} is a unit vector pointing away from the origin and \epsilon_0 is a constant called the permittivity of free space.

In short, the electric field is ‘infinite’, or undefined, at the particle’s location. So, it is unclear how to define the ‘self-force’ exerted by the particle’s own electric field on itself. The formula for the electric field produced by a static point charge is really just our old friend, the inverse square law. Since we had previously ignored the force of a particle on itself, we might try to continue this tactic now. However, other problems intrude.

In relativistic electrodynamics, the electric field has energy density equal to

\displaystyle{ \frac{\epsilon_0}{2} |\vec{E}|^2 }

Thus, the total energy of the electric field of a point charge at rest is proportional to

\displaystyle{ \frac{\epsilon_0}{2} \int_{\mathbb{R}^3} |\vec{E}|^2 \, d^3 x =  \frac{e^2}{8 \pi \epsilon_0} \int_0^\infty \frac{1}{r^4} \, r^2 dr. }

But this integral diverges near r = 0, so the electric field of a charged particle has an infinite energy!

How, if at all, does this cause trouble when we try to unify Maxwell’s equations and the Lorentz force law? It helps to step back in history. In 1902, the physicist Max Abraham assumed that instead of a point, an electron is a sphere of radius R with charge evenly distributed on its surface. Then the energy of its electric field becomes finite, namely:

\displaystyle{  E = \frac{e^2}{8 \pi \epsilon_0} \int_{R}^\infty \frac{1}{r^4} \, r^2 dr = \frac{1}{2} \frac{e^2}{4 \pi \epsilon_0 R} }

where e is the electron’s charge.

Abraham also computed the extra momentum a moving electron of this sort acquires due to its electromagnetic field. He got it wrong because he didn’t understand Lorentz transformations. In 1904 Lorentz did the calculation right. Using the relationship between velocity, momentum and mass, we can derive from his result a formula for the ‘electromagnetic mass’ of the electron:

\displaystyle{  m = \frac{2}{3} \frac{e^2}{4 \pi \epsilon_0 R c^2} }

where c is the speed of light. We can think of this as the extra mass an electron acquires by carrying an electromagnetic field along with it.

Putting the last two equations together, these physicists obtained a remarkable result:

\displaystyle{  E = \frac{3}{4} mc^2 }

Then, in 1905, a fellow named Einstein came along and made it clear that the only reasonable relation between energy and mass is

E = mc^2

What had gone wrong?

In 1906, Poincaré figured out the problem. It is not a computational mistake, nor a failure to properly take special relativity into account. The problem is that like charges repel, so if the electron were a sphere of charge it would explode without something to hold it together. And that something—whatever it is—might have energy. But their calculation ignored that extra energy.

In short, the picture of the electron as a tiny sphere of charge, with nothing holding it together, is incomplete. And the calculation showing E = \frac{3}{4}mc^2, together with special relativity saying E = mc^2, shows that this incomplete picture is inconsistent. At the time, some physicists hoped that all the mass of the electron could be accounted for by the electromagnetic field. Their hopes were killed by this discrepancy.
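The factor of 3/4 is easy to check numerically. The sketch below, with function names of my own invention and CODATA-style values for the constants typed in by hand, evaluates the field energy and Lorentz’s electromagnetic mass for a few radii R and prints the ratio E / (m c^2).

```python
import math

E_CHARGE = 1.602176634e-19    # electron charge magnitude, C
EPSILON_0 = 8.8541878128e-12  # permittivity of free space, F/m
C_LIGHT = 299792458.0         # speed of light, m/s

# e^2 / (4 pi epsilon_0), in joule-meters
COULOMB = E_CHARGE**2 / (4 * math.pi * EPSILON_0)


def field_energy(radius):
    """Electric field energy outside a charged shell: (1/2) e^2 / (4 pi eps0 R)."""
    return 0.5 * COULOMB / radius


def electromagnetic_mass(radius):
    """Lorentz's electromagnetic mass: (2/3) e^2 / (4 pi eps0 R c^2)."""
    return (2.0 / 3.0) * COULOMB / (radius * C_LIGHT**2)


if __name__ == "__main__":
    for radius in (1e-10, 1e-15, 1e-20):
        energy = field_energy(radius)
        mass = electromagnetic_mass(radius)
        ratio = energy / (mass * C_LIGHT**2)
        print(f"R = {radius:.0e} m: E = {energy:.3e} J, E/(m c^2) = {ratio:.4f}")
```

The ratio comes out as 0.75 whatever the radius, while the energy itself grows without bound as R shrinks.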

Nonetheless it is interesting to take the energy E computed above, set it equal to m_e c^2 where m_e is the electron’s observed mass, and solve for the radius R. The answer is

\displaystyle{ R = \frac{1}{8 \pi \epsilon_0} \frac{e^2}{m_e c^2} \approx 1.4 \times 10^{-15} \ \mathrm{meters} }

In the early 1900s, this would have been a remarkably tiny distance: 0.00003 times the Bohr radius of a hydrogen atom. By now we know this is roughly the radius of a proton. We know that electrons are not spheres of this size. So at present it makes more sense to treat the calculations so far as a prelude to some kind of limiting process where we take R \to 0. These calculations teach us two lessons.

First, the electromagnetic field energy approaches +\infty as we let R \to 0, so we will be hard pressed to take this limit and get a well-behaved physical theory. One approach is to give a charged particle its own ‘bare mass’ m_\mathrm{bare} in addition to the mass m_\mathrm{elec} arising from electromagnetic field energy, in a way that depends on R. Then as we take the R \to 0 limit we can let m_\mathrm{bare} \to -\infty in such a way that m_\mathrm{bare} + m_\mathrm{elec} approaches a chosen limit m, the physical mass of the point particle. This is an example of ‘renormalization’.

Second, it is wise to include conservation of energy-momentum as a requirement in addition to Maxwell’s equations and the Lorentz force law. Here is a more sophisticated way to phrase Poincaré’s realization. From the electromagnetic field one can compute a ‘stress-energy tensor’ T, which describes the flow of energy and momentum through spacetime. If all the energy and momentum of an object comes from its electromagnetic field, you can compute them by integrating T over the hypersurface t = 0. You can prove that the resulting 4-vector transforms correctly under Lorentz transformations if you assume the stress-energy tensor has vanishing divergence: \partial^\mu T_{\mu \nu} = 0. This equation says that energy and momentum are locally conserved. However, this equation fails to hold for a spherical shell of charge with no extra forces holding it together. In the absence of extra forces, it violates conservation of momentum for a charge to feel an electromagnetic force yet not accelerate.
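Returning to the first lesson for a moment, here is a rough Python sketch of what this kind of renormalization looks like in numbers. It assumes, purely for illustration, that m_\mathrm{elec} is given by Lorentz’s formula for a shell of radius R and that the physical mass is the observed electron mass; the function names are hypothetical.

```python
import math

E_CHARGE = 1.602176634e-19    # electron charge magnitude, C
EPSILON_0 = 8.8541878128e-12  # permittivity of free space, F/m
C_LIGHT = 299792458.0         # speed of light, m/s
M_PHYSICAL = 9.1093837015e-31 # observed electron mass, kg

COULOMB = E_CHARGE**2 / (4 * math.pi * EPSILON_0)  # e^2 / (4 pi eps0), J*m


def electromagnetic_mass(radius):
    """Assumed form of m_elec: Lorentz's (2/3) e^2 / (4 pi eps0 R c^2)."""
    return (2.0 / 3.0) * COULOMB / (radius * C_LIGHT**2)


def bare_mass(radius, physical_mass=M_PHYSICAL):
    """Bare mass needed so that m_bare + m_elec equals the physical mass."""
    return physical_mass - electromagnetic_mass(radius)


if __name__ == "__main__":
    for radius in (1e-10, 1e-14, 1e-15, 1e-18, 1e-21):
        print(f"R = {radius:.0e} m: m_elec = {electromagnetic_mass(radius):.3e} kg, "
              f"m_bare = {bare_mass(radius):.3e} kg")
```

Under these assumptions the bare mass is already negative once R drops below about 1.9 \times 10^{-15} meters, and it heads to -\infty from there. The subtraction also becomes numerically delicate for very small R, a floating-point echo of the delicacy of the limit itself.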

So far we have only discussed the simplest situation: a single charged particle at rest, or moving at a constant velocity. To go further, we can try to compute the acceleration of a small charged sphere in an arbitrary electromagnetic field. Then, by taking the limit as the radius r of the sphere goes to zero, perhaps we can obtain the law of motion for a charged point particle.

In fact this whole program is fraught with difficulties, but physicists boldly go where mathematicians fear to tread, and in a rough way this program was carried out already by Abraham in 1905. His treatment of special relativistic effects was wrong, but these were easily corrected; the real difficulties lie elsewhere. In 1938 his calculations were carried out much more carefully—though still not rigorously—by Dirac. The resulting law of motion is thus called the ‘Abraham–Lorentz–Dirac force law’.

There are three key ways in which this law differs from our earlier naive statement of the Lorentz force law:

• We must decompose the electromagnetic field in two parts, the ‘external’ electromagnetic field F_\mathrm{ext} and the field produced by the particle:

F = F_\mathrm{ext} + F_\mathrm{ret}

Here F_\mathrm{ext} is a solution of Maxwell’s equations with J = 0, while F_\mathrm{ret} is computed by convolving the particle’s 4-current J with a function called the ‘retarded Green’s function’. This breaks the time-reversal symmetry of the formalism so far, ensuring that radiation emitted by the particle moves outward as t increases. We then decree that the particle only feels a Lorentz force due to F_\mathrm{ext}, not F_\mathrm{ret}. This avoids the problem that F_\mathrm{ret} becomes infinite along the particle’s path as r \to 0.

• Maxwell’s equations say that an accelerating charged particle emits radiation, which carries energy-momentum. Conservation of energy-momentum implies that there is a compensating force on the charged particle. This is called the ‘radiation reaction’. So, in addition to the Lorentz force, there is a radiation reaction force.

• As we take the limit r \to 0, we must adjust the particle’s bare mass m_\mathrm{bare} in such a way that its physical mass m = m_\mathrm{bare} + m_\mathrm{elec} is held constant. This involves letting m_\mathrm{bare} \to -\infty as m_\mathrm{elec} \to +\infty.

It is easiest to describe the Abraham–Lorentz–Dirac force law using standard relativistic notation. So, we switch to units where c and 4 \pi \epsilon_0 equal 1, let x^\mu denote the spacetime coordinates of a point particle, and use a dot to denote the derivative with respect to proper time. Then the Abraham–Lorentz–Dirac force law says

m \ddot{x}^\mu = e F_{\mathrm{ext}}^{\mu \nu} \, \dot{x}_\nu \; - \; \frac{2}{3}e^2 \ddot{x}^\alpha \ddot{x}_\alpha \, \dot{x}^\mu \; + \; \frac{2}{3}e^2 \dddot{x}^\mu  .

The first term at right is the Lorentz force, which looks more elegant in this new notation. The second term is fairly intuitive: it acts to reduce the particle’s velocity at a rate proportional to its velocity (as one would expect from friction), but also proportional to the squared magnitude of its acceleration. This is the ‘radiation reaction’.

The last term, called the ‘Schott term’, is the most shocking. Unlike all familiar laws in classical mechanics, it involves the third derivative of the particle’s position!

This seems to shatter our original hope of predicting the electromagnetic field and the particle’s position and velocity given their initial values. Now it seems we need to specify the particle’s initial position, velocity and acceleration.

Furthermore, unlike Maxwell’s equations and the original Lorentz force law, the Abraham–Lorentz–Dirac force law is not symmetric under time reversal. If we take a solution and replace t with -t, the result is not a solution. Like the force of friction, radiation reaction acts to make a particle lose energy as it moves into the future, not the past.

The reason is that our assumptions have explicitly broken time symmetry. The splitting F = F_\mathrm{ext} + F_\mathrm{ret} says that a charged accelerating particle radiates into the future, creating the field F_\mathrm{ret}, and is affected only by the remaining electromagnetic field F_\mathrm{ext}.

Worse, the Abraham–Lorentz–Dirac force law has counterintuitive solutions. Suppose for example that F_\mathrm{ext} = 0. Besides the expected solutions where the particle’s velocity is constant, there are solutions for which the particle accelerates indefinitely, approaching the speed of light! These are called ‘runaway solutions’. In these runaway solutions, the acceleration as measured in the frame of reference of the particle grows exponentially with the passage of proper time.
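To get a feel for the time scale of a runaway, consider motion along a line with F_\mathrm{ext} = 0. If we write the 4-velocity as (\cosh w, \sinh w) and assume the metric signature (-,+,+,+), the two radiation terms combine and the force law reduces to m \dot{w} = \frac{2}{3} e^2 \ddot{w}, so the proper acceleration grows like e^{\tau/\tau_0} with \tau_0 = 2e^2/3m. The Python sketch below restores SI units and plugs in the electron’s charge and mass; the function names, the choice w(0) = 0, and the initial acceleration are all assumptions made for illustration.

```python
import math

E_CHARGE = 1.602176634e-19    # electron charge magnitude, C
EPSILON_0 = 8.8541878128e-12  # permittivity of free space, F/m
C_LIGHT = 299792458.0         # speed of light, m/s
M_ELECTRON = 9.1093837015e-31 # electron mass, kg

COULOMB = E_CHARGE**2 / (4 * math.pi * EPSILON_0)  # e^2 / (4 pi eps0), J*m

# Characteristic time (2/3) e^2 / (4 pi eps0 m c^3), in seconds
TAU_0 = (2.0 / 3.0) * COULOMB / (M_ELECTRON * C_LIGHT**3)


def runaway_proper_acceleration(a0, proper_time):
    """Proper acceleration a0 * exp(tau / tau_0) of a 1D runaway solution."""
    return a0 * math.exp(proper_time / TAU_0)


def rapidity(a0, proper_time):
    """Integral of (proper acceleration / c) from 0 to tau, taking w(0) = 0."""
    return a0 * TAU_0 / C_LIGHT * math.expm1(proper_time / TAU_0)


def speed_over_c(a0, proper_time):
    """Speed as a fraction of c: tanh of the rapidity."""
    return math.tanh(rapidity(a0, proper_time))


if __name__ == "__main__":
    a0 = 1e30  # arbitrary initial proper acceleration, m/s^2
    print(f"tau_0 = {TAU_0:.3e} s")
    for k in (0, 1, 3, 5):
        tau = k * TAU_0
        print(f"tau = {k} tau_0: a = {runaway_proper_acceleration(a0, tau):.3e} m/s^2, "
              f"v/c = {speed_over_c(a0, tau):.6f}")
```

The speed creeps up toward that of light but never reaches it, while the proper acceleration keeps growing exponentially.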

So, the notion that special relativity might help us avoid the pathologies of Newtonian point particles interacting gravitationally—five-body solutions where particles shoot to infinity in finite time—is cruelly mocked by the Abraham–Lorentz–Dirac force law. Particles cannot move faster than light, but even a single particle can extract an arbitrary amount of energy-momentum from the electromagnetic field in its immediate vicinity and use this to propel itself forward at speeds approaching that of light. The energy stored in the field near the particle is sometimes called ‘Schott energy’.

Thanks to the Schott term in the Abraham–Lorentz–Dirac force law, the Schott energy can be converted into kinetic energy for the particle. The details of how this works are nicely discussed in a paper by Øyvind Grøn, so click the link and read that if you’re interested. I’ll just show you a picture from that paper:

[Figure: Schott energy, from Grøn’s paper]

So even one particle can do crazy things! But worse, suppose we generalize the framework to include more than one particle. The arguments for the Abraham–Lorentz–Dirac force law can be generalized to this case. The result is simply that each particle obeys this law with an external field F_\mathrm{ext} that includes the fields produced by all the other particles. But a problem appears when we use this law to compute the motion of two particles of opposite charge. To simplify the calculation, suppose they are located symmetrically with respect to the origin, with equal and opposite velocities and accelerations. Suppose the external field felt by each particle is solely the field created by the other particle. Since the particles have opposite charges, they should attract each other. However, one can prove they will never collide. In fact, if at any time they are moving towards each other, they will later turn around and move away from each other at ever-increasing speed!

This fact was discovered by C. Jayaratnam Eliezer in 1943. It is so counterintuitive that several proofs were required before physicists believed it.

None of these strange phenomena have ever been seen experimentally. Faced with this problem, physicists have naturally looked for ways out. First, why not simply cross out the \dddot{x}^\mu term in the Abraham–Lorentz–Dirac force? Unfortunately the resulting simplified equation

m \ddot{x}^\mu = e F_{\mathrm{ext}}^{\mu \nu} \, \dot{x}_\nu - \frac{2}{3}e^2 \ddot{x}^\alpha \ddot{x}_\alpha \, \dot{x}^\mu

has only trivial solutions. The reason is that with the particle’s path parametrized by proper time, the vector \dot{x}^\mu has constant length, so the vector \ddot{x}^\mu is orthogonal to \dot{x}^\mu . So is the vector F_{\mathrm{ext}}^{\mu \nu} \dot{x}_\nu, because F_{\mathrm{ext}} is an antisymmetric tensor. So, the last term must be zero, which implies \ddot{x} = 0, which in turn implies that all three terms must vanish.
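The orthogonality facts used in this argument can be spot-checked numerically. The Python sketch below assumes a diagonal Minkowski metric with signature (-,+,+,+), which is my reading of the sign conventions here, and uses hypothetical helper names. It builds a random antisymmetric F^{\mu \nu} and a unit timelike 4-velocity, then contracts F^{\mu \nu} \dot{x}_\nu with \dot{x}_\mu.

```python
import math
import random

ETA = (-1.0, 1.0, 1.0, 1.0)  # diagonal Minkowski metric, signature (-,+,+,+)


def lower(v):
    """Lower the index of a 4-vector using the diagonal metric."""
    return [ETA[i] * v[i] for i in range(4)]


def minkowski_dot(a, b):
    """Contraction a^mu b_mu of two 4-vectors with upper indices."""
    return sum(ETA[i] * a[i] * b[i] for i in range(4))


def random_antisymmetric(rng):
    """A random antisymmetric 4x4 array standing in for F^{mu nu}."""
    F = [[0.0] * 4 for _ in range(4)]
    for i in range(4):
        for j in range(i + 1, 4):
            F[i][j] = rng.uniform(-1.0, 1.0)
            F[j][i] = -F[i][j]
    return F


def four_velocity(rapidity, direction):
    """Unit timelike 4-velocity (cosh w, sinh w * n) for a spatial direction n."""
    norm = math.sqrt(sum(d * d for d in direction))
    return [math.cosh(rapidity)] + [math.sinh(rapidity) * d / norm for d in direction]


def lorentz_term(F, u):
    """The vector F^{mu nu} u_nu appearing in the Lorentz force."""
    u_low = lower(u)
    return [sum(F[m][n] * u_low[n] for n in range(4)) for m in range(4)]


if __name__ == "__main__":
    rng = random.Random(0)
    F = random_antisymmetric(rng)
    u = four_velocity(1.3, (1.0, 2.0, -0.5))
    print("u.u       =", minkowski_dot(u, u))
    print("(F u).u   =", minkowski_dot(lorentz_term(F, u), u))
```

Up to rounding error, the first number is -1 and the second is 0, whichever antisymmetric F and whichever velocity we choose.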

Another possibility is that some assumption made in deriving the Abraham–Lorentz–Dirac force law is incorrect. Of course the theory is physically incorrect, in that it ignores quantum mechanics and other things, but that is not the issue. The issue here is one of mathematical physics, of trying to formulate a well-behaved classical theory that describes charged point particles interacting with the electromagnetic field. If we can prove this is impossible, we will have learned something. But perhaps there is a loophole. The original arguments for the Abraham–Lorentz–Dirac force law are by no means mathematically rigorous. They involve a delicate limiting procedure, and approximations that were believed, but not proved, to become perfectly accurate in the r \to 0 limit. Could these arguments conceal a mistake?

Calculations involving a spherical shell of charge have been improved by a series of authors, and nicely summarized by Fritz Rohrlich. In all these calculations, nonlinear powers of the acceleration and its time derivatives are neglected, and one hopes this is acceptable in the r \to 0 limit.

Dirac, struggling with renormalization in quantum field theory, took a different tack. Instead of considering a sphere of charge, he treated the electron as a point from the very start. However, he studied the flow of energy-momentum across the surface of a tube of radius r centered on the electron’s path. By computing this flow in the limit r \to 0, and using conservation of energy-momentum, he attempted to derive the force on the electron. He did not obtain a unique result, but the simplest choice gives the Abraham–Lorentz–Dirac equation. More complicated choices typically involve nonlinear powers of the acceleration and its time derivatives.

Since this work, many authors have tried to simplify Dirac’s rather complicated calculations and clarify his assumptions. This book is a good guide:

• Stephen Parrott, Relativistic Electrodynamics and Differential Geometry, Springer, Berlin, 1987.

But more recently, Jerzy Kijowski and some coauthors have made impressive progress in a series of papers that solve many of the problems we have described.

Kijowski’s key idea is to impose conditions on precisely how the electromagnetic field is allowed to behave near the path traced out by a charged point particle. He breaks the field into a ‘regular’ part and a ‘singular’ part:

F = F_\textrm{reg} + F_\textrm{sing}

Here F_\textrm{reg} is smooth everywhere, while F_\textrm{sing} is singular near the particle’s path, but only in a carefully prescribed way. Roughly, at each moment, in the particle’s instantaneous rest frame, the singular part of its electric field consists of the familiar part proportional to 1/r^2, together with a part proportional to 1/r which depends on the particle’s acceleration. No other singularities are allowed!

On the one hand, this eliminates the ambiguities mentioned earlier: in the end, there are no ‘nonlinear powers of the acceleration and its time derivatives’ in Kijowski’s force law. On the other hand, this avoids breaking time reversal symmetry, as the earlier splitting F = F_\textrm{ext} + F_\textrm{ret} did.

Next, Kijowski defines the energy-momentum of a point particle to be m \dot{x}, where m is its physical mass. He defines the energy-momentum of the electromagnetic field to be just that due to F_\textrm{reg}, not F_\textrm{sing}. This amounts to eliminating the infinite ‘electromagnetic mass’ of the charged particle. He then shows that Maxwell’s equations and conservation of total energy-momentum imply an equation of motion for the particle!

This equation is very simple:

m \ddot{x}^\mu = e F_{\textrm{reg}}^{\mu \nu} \, \dot{x}_\nu

It is just the Lorentz force law! Since the troubling Schott term is gone, this is a second-order differential equation. So we can hope to predict the future behavior of the electromagnetic field, together with the particle’s position and velocity, given all these quantities at t = 0.

And indeed this is true! In 1998, together with Gittel and Zeidler, Kijowski proved that initial data of this sort, obeying the careful restrictions on allowed singularities of the electromagnetic field, determine a unique solution of Maxwell’s equations and the Lorentz force law, at least for a short amount of time. Even better, all this remains true for any number of particles.

There are some obvious questions to ask about this new approach. In the Abraham–Lorentz–Dirac force law, the acceleration was an independent variable that needed to be specified at t = 0 along with position and momentum. This problem disappears in Kijowski’s approach. But how?

We mentioned that the singular part of the electromagnetic field, F_\textrm{sing}, depends on the particle’s acceleration. But more is true: the particle’s acceleration is completely determined by F_\textrm{sing}. So, the particle’s acceleration is not an independent variable because it is encoded into the electromagnetic field.

Another question is: where did the radiation reaction go? The answer is: we can see it if we go back and decompose the electromagnetic field as F_\textrm{ext} + F_\textrm{ret} as we had before. If we take the law

m \ddot{x}^\mu = e F_{\textrm{reg}}^{\mu \nu} \dot{x}_\nu

and rewrite it in terms of F_\textrm{ext}, we recover the original Abraham–Lorentz–Dirac law, including the radiation reaction term and Schott term.

Unfortunately, this means that ‘pathological’ solutions where particles extract arbitrary amounts of energy from the electromagnetic field are still possible. A related problem is that apparently nobody has yet proved solutions exist for all time. Perhaps a singularity worse than the allowed kind could develop in a finite amount of time—for example, when particles collide.

So, classical point particles interacting with the electromagnetic field still present serious challenges to the physicist and mathematician. When you have an infinitely small charged particle right next to its own infinitely strong electromagnetic field, trouble can break out very easily!

Particles without fields

Finally, I should also mention attempts, working within the framework of special relativity, to get rid of fields and have particles interact with each other directly. For example, in 1903 Schwarzschild introduced a framework in which charged particles exert an electromagnetic force on each other, with no mention of fields. In this setup, forces are transmitted not instantaneously but at the speed of light: the force on one particle at one spacetime point x depends on the motion of some other particle at spacetime point y only if the vector x - y is lightlike. Later Fokker and Tetrode derived this force law from a principle of least action. In 1949, Feynman and Wheeler checked that this formalism gives results compatible with the usual approach to electromagnetism using fields, except for several points:

• Each particle exerts forces only on other particles, so we avoid the thorny issue of how a point particle responds to the electromagnetic field produced by itself.

• There are no electromagnetic fields not produced by particles: for example, the theory does not describe the motion of a charged particle in an ‘external electromagnetic field’.

• The principle of least action guarantees that ‘if A affects B then B affects A’. So, if a particle at x exerts a force on a particle at a point y in its future lightcone, the particle at y exerts a force on the particle at x in its past lightcone. This raises the issue of ‘reverse causality’, which Feynman and Wheeler address.

Besides the reverse causality issue, perhaps one reason this approach has not been pursued more widely is that it does not admit a Hamiltonian formulation in terms of particle positions and momenta. Indeed, there are a number of ‘no-go theorems’ for relativistic multiparticle Hamiltonians, saying that these can only describe noninteracting particles. So, most work that takes both quantum mechanics and special relativity into account uses fields.

Indeed, in quantum electrodynamics, even the charged point particles are replaced by fields—namely quantum fields! Next time we’ll see whether that helps.

Struggles with the Continuum (Part 2)

9 September, 2016

Last time we saw that nobody yet knows if Newtonian gravity, applied to point particles, truly succeeds in predicting the future. To be precise: for five or more particles, nobody has proved that almost all initial conditions give a well-defined solution for all times!

The problem is related to the continuum nature of space: as particles get arbitrarily close to each other, an infinite amount of potential energy can be converted to kinetic energy in a finite amount of time.

I left off by asking if this problem is solved by more sophisticated theories. For example, does the ‘speed limit’ imposed by special relativity help the situation? Or might quantum mechanics help, since it describes particles as ‘probability clouds’, and puts limits on how accurately we can simultaneously know both their position and momentum?

We begin with quantum mechanics, which indeed does help.

The quantum mechanics of charged particles

Few people spend much time thinking about ‘quantum celestial mechanics’, that is, quantum particles obeying Schrödinger’s equation that attract each other gravitationally, obeying an inverse-square force law. But Newtonian gravity is a lot like the electrostatic force between charged particles. The main difference is a minus sign, which makes like masses attract, while like charges repel. In chemistry, people spend a lot of time thinking about charged particles obeying Schrödinger’s equation, attracting or repelling each other electrostatically. This approximation neglects magnetic fields, spin, and indeed anything related to the finiteness of the speed of light, but it’s good enough to explain quite a bit about atoms and molecules.

In this approximation, a collection of charged particles is described by a wavefunction \psi, which is a complex-valued function of all the particles’ positions and also of time. The basic idea is that \psi obeys Schrödinger’s equation

\displaystyle{ \frac{d \psi}{dt} = - i H \psi}

where H is an operator called the Hamiltonian, and I’m working in units where \hbar = 1.

Does this equation succeed in predicting \psi at a later time given \psi at time zero? To answer this, we must first decide what kind of function \psi should be, what concept of derivative applies to such functions, and so on. These issues were worked out by von Neumann and others starting in the late 1920s. It required a lot of new mathematics. Skimming the surface, we can say this.

At any time, we want \psi to lie in the Hilbert space consisting of square-integrable functions of all the particles’ positions. We can then formally solve Schrödinger’s equation as

\psi(t) = \exp(-i t H) \psi(0)

where \psi(t) is the solution at time t. But for this to really work, we need H to be a self-adjoint operator on the chosen Hilbert space. The correct definition of ‘self-adjoint’ is a bit subtler than what most physicists learn in a first course on quantum mechanics. In particular, an operator can be superficially self-adjoint—the actual term for this is ‘symmetric’—but not truly self-adjoint.

In 1951, based on earlier work of Rellich, Kato proved that H is indeed self-adjoint for a collection of nonrelativistic quantum particles interacting via inverse-square forces. So, this simple model of chemistry works fine. We can also conclude that ‘celestial quantum mechanics’ would dodge the nasty problems that we saw in Newtonian gravity.

The reason, simply put, is the uncertainty principle.

In the classical case, bad things happen because the energy is not bounded below. A pair of classical particles attracting each other with an inverse square force law can have arbitrarily large negative energy, simply by being very close to each other. Since energy is conserved, if you have a way to make some particles get an arbitrarily large negative energy, you can balance the books by letting others get an arbitrarily large positive energy and shoot to infinity in a finite amount of time!

When we switch to quantum mechanics, the energy of any collection of particles becomes bounded below. The reason is that to make the potential energy of two particles large and negative, they must be very close. Thus, their difference in position must be very small. In particular, this difference must be accurately known! Thus, by the uncertainty principle, their difference in momentum must be very poorly known: at least one of its components must have a large standard deviation. This in turn means that the expected value of the kinetic energy must be large.
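Here is a rough numerical sketch of this tug-of-war. It is only a heuristic scaling estimate, not Kato's proof. I am assuming a hypothetical trial wavefunction proportional to \exp(-r/a) for the relative motion of two opposite unit charges, in units where \hbar, the reduced mass and the charge are all 1. For this family the expected kinetic energy is 1/(2a^2) and the expected potential energy is -1/a, where the width a is a parameter of my own choosing:

```python
# Heuristic scaling estimate (an illustration, not a proof).
# Assumed trial family: psi_a(r) proportional to exp(-r / a), with
# hbar = reduced mass = charge = 1. For this family:
#   <K> = 1 / (2 a^2)   and   <V> = -1 / a.

def kinetic(a: float) -> float:
    """Expected kinetic energy of the trial state of width a."""
    return 0.5 / a**2

def potential(a: float) -> float:
    """Expected Coulomb potential energy of the trial state of width a."""
    return -1.0 / a

def energy(a: float) -> float:
    """Expected total energy <K> + <V>."""
    return kinetic(a) + potential(a)

# Squeezing the particles together (a -> 0) makes <V> very negative,
# but <K> grows faster, so the total energy turns around and rises.
for a in (4.0, 2.0, 1.0, 0.5, 0.1, 0.01):
    print(f"a = {a:5.2f}   <K> = {kinetic(a):9.2f}   "
          f"<V> = {potential(a):8.2f}   E = {energy(a):9.2f}")
```

Within this family the energy never drops below its value at a = 1: the 1/a^2 growth of the kinetic energy beats the 1/a growth of the potential energy at short distances.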

This must all be made quantitative, to prove that as particles get close, the uncertainty principle provides enough positive kinetic energy to counterbalance the negative potential energy. The Kato–Lax–Milgram–Nelson theorem, a refinement of the original Kato–Rellich theorem, is the key to understanding this issue. The Hamiltonian H for a collection of particles interacting by inverse square forces can be written as

H = K + V

where K is an operator for the kinetic energy and V is an operator for the potential energy. With some clever work one can prove that for any \epsilon > 0, there exists c > 0 such that if \psi is a smooth normalized wavefunction that vanishes at infinity and at points where particles collide, then

| \langle \psi , V \psi \rangle | \le \epsilon \langle \psi, K\psi \rangle + c.

Remember that \langle \psi , V \psi \rangle is the expected value of the potential energy, while \langle \psi, K \psi \rangle is the expected value of the kinetic energy. Thus, this inequality is a precise way of saying how kinetic energy triumphs over potential energy.

By taking \epsilon = 1, it follows that the Hamiltonian is bounded below on such states \psi:

\langle \psi , H \psi \rangle \ge -c
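The step from the inequality to this lower bound is short. Spelled out, using only H = K + V and the inequality above:

```latex
\langle \psi, H \psi \rangle
  = \langle \psi, K \psi \rangle + \langle \psi, V \psi \rangle
  \ge \langle \psi, K \psi \rangle - | \langle \psi, V \psi \rangle |
  \ge (1 - \epsilon) \langle \psi, K \psi \rangle - c
```

With \epsilon = 1 the kinetic term drops out and we are left with -c. For \epsilon < 1 the same conclusion follows, for the value of c belonging to that \epsilon, once we use the fact that the expected kinetic energy is nonnegative.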

But the fact that the inequality holds even for smaller values of \epsilon is the key to showing H is ‘essentially self-adjoint’. This means that while H is not self-adjoint when defined only on smooth wavefunctions that vanish at infinity and at points where particles collide, it has a unique self-adjoint extension to some larger domain. Thus, we can unambiguously take this extension to be the true Hamiltonian for this problem.

To understand what a great triumph this is, one needs to see what could have gone wrong! Suppose space had an extra dimension. In 3-dimensional space, Newtonian gravity obeys an inverse square force law because the area of a sphere is proportional to its radius squared. In 4-dimensional space, the force obeys an inverse cube law:

\displaystyle{ F = -\frac{Gm_1 m_2}{r^3} }

Using a cube instead of a square here makes the force stronger at short distances, with dramatic effects. For example, even for the classical 2-body problem, the equations of motion no longer ‘almost always’ have a well-defined solution for all times. For an open set of initial conditions, the particles spiral into each other in a finite amount of time!
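To see a finite-time spiral concretely, here is a sketch of one explicit solution, chosen because it has a closed form. The assumptions are mine: the relative motion has unit reduced mass, the force is -k/r^3 with a hypothetical coupling k standing in for G m_1 m_2, the angular momentum is L with k > L^2, and the energy is zero. Then the radial equation \ddot{r} = (L^2 - k)/r^3 gives \dot{r} = -s/r with s = \sqrt{k - L^2}, so r^2 decreases linearly in time:

```python
import math

def spiral_in(r0: float, L: float, k: float, t: float):
    """Zero-energy infalling orbit in an attractive force -k / r^3.

    Assumes unit reduced mass and k > L^2 (all names are illustrative).
    Returns (r, theta, t_star), where t_star is the collision time.
    """
    s = math.sqrt(k - L * L)
    t_star = r0 * r0 / (2.0 * s)         # r^2 = r0^2 - 2 s t hits zero here
    if not 0.0 <= t < t_star:
        raise ValueError("t must lie in [0, t_star)")
    frac = 1.0 - t / t_star
    r = r0 * math.sqrt(frac)
    # d(theta)/dt = L / r^2 integrates to a logarithm that diverges at t_star
    theta = -(L / (2.0 * s)) * math.log(frac)
    return r, theta, t_star

# Approach the collision time and watch r shrink while theta keeps growing.
_, _, t_star = spiral_in(1.0, 1.0, 2.0, 0.0)
for fraction in (0.0, 0.9, 0.99, 0.9999, 0.999999):
    r, theta, _ = spiral_in(1.0, 1.0, 2.0, fraction * t_star)
    print(f"t/t_star = {fraction:<9}  r = {r:.6f}  turns = {theta / (2 * math.pi):.3f}")
```

The distance reaches zero at the finite time t_star, while the angle grows without bound: the particles wind around each other infinitely many times on the way in.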

Hyperbolic spiral – a fairly common orbit in an inverse cube force.

The quantum version of this theory is also problematic. The uncertainty principle is not enough to save the day. The inequalities above no longer hold: kinetic energy does not triumph over potential energy. The Hamiltonian is no longer essentially self-adjoint on the set of wavefunctions that I described.

In fact, this Hamiltonian has infinitely many self-adjoint extensions! Each one describes different physics: namely, a different choice of what happens when particles collide. Moreover, when G exceeds a certain critical value, the energy is no longer bounded below.
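A scaling argument suggests why the energy can fail to be bounded below. If we shrink any fixed trial wavefunction by a factor a, the expected kinetic energy scales as 1/a^2, and so does the expected value of an inverse-square potential -g/r^2 (the potential for an inverse cube force). The sketch below uses hypothetical inputs: K1 = 1/2 and W1 = 2 are the values of the kinetic energy and of 1/r^2 for the trial function \exp(-r) in three dimensions with \hbar = m = 1, chosen only for illustration, and g stands in for a coupling such as G m_1 m_2. Other dimensions and trial functions change these numbers but not the scaling:

```python
def energy_inverse_square(a: float, g: float,
                          K1: float = 0.5, W1: float = 2.0) -> float:
    """Expected energy of a trial state rescaled to width a.

    K1 / a^2 is the kinetic energy and -g * W1 / a^2 the potential energy.
    K1 and W1 are illustrative inputs, not values from the 4d problem.
    """
    return (K1 - g * W1) / a**2

for g in (0.1, 1.0):
    row = ", ".join(f"{energy_inverse_square(a, g):12.1f}"
                    for a in (1.0, 0.1, 0.01))
    print(f"g = {g}: E at a = 1, 0.1, 0.01 -> {row}")
```

Both terms scale the same way, so the sign of K1 - g W1 decides everything: once g is large enough to make it negative, squeezing the state drives the energy to -\infty. This only shows unboundedness above the threshold set by this particular trial function. It does not pin down the true critical value.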

The same problems afflict quantum particles interacting by the electrostatic force in 4d space, as long as some of the particles have opposite charges. So, chemistry would be quite problematic in a world with four dimensions of space.

With more dimensions of space, the situation becomes even worse. In fact, this is part of a general pattern in mathematical physics: our struggles with the continuum tend to become worse in higher dimensions. String theory and M-theory may provide exceptions.

Next time we’ll look at what happens to point particles interacting electromagnetically when we take special relativity into account. After that, we’ll try to put special relativity and quantum mechanics together!

For more

For more on the inverse cube force law, see:

• John Baez, The inverse cube force law, Azimuth, 30 August 2015.

It turns out Newton made some fascinating discoveries about this law in his Principia; it has remarkable properties both classically and in quantum mechanics.

The hyperbolic spiral is one of 3 kinds of orbits possible in an inverse cube force; for the others see:

• Cotes’s spiral, Wikipedia.

The picture of a hyperbolic spiral was drawn by Anarkman and Pbroks13 and placed on Wikicommons under a Creative Commons Attribution-Share Alike 3.0 Unported license.

Struggles with the Continuum (Part 1)

8 September, 2016

Is spacetime really a continuum? That is, can points of spacetime really be described—at least locally—by lists of four real numbers (t,x,y,z)? Or is this description, though immensely successful so far, just an approximation that breaks down at short distances?

Rather than trying to answer this hard question, let’s look back at the struggles with the continuum that mathematicians and physicists have had so far.

The worries go back at least to Zeno. Among other things, he argued that an arrow can never reach its target:

That which is in locomotion must arrive at the half-way stage before it arrives at the goal. – Aristotle summarizing Zeno

and Achilles can never catch up with a tortoise:

In a race, the quickest runner can never overtake the slowest, since the pursuer must first reach the point whence the pursued started, so that the slower must always hold a lead. – Aristotle summarizing Zeno

These paradoxes can now be dismissed using our theory of real numbers. An interval of finite length can contain infinitely many points. In particular, a sum of infinitely many terms can still converge to a finite answer.
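For the arrow, the infinitely many stages have lengths 1/2, 1/4, 1/8, … of the total distance. A small sketch with exact fractions (the function name is my own) shows the partial sums closing in on 1:

```python
from fractions import Fraction

def zeno_partial_sum(n: int) -> Fraction:
    """Exact sum of the first n stages: 1/2 + 1/4 + ... + 1/2^n."""
    return sum((Fraction(1, 2**k) for k in range(1, n + 1)), Fraction(0))

# The distance still to go after n stages is exactly 1 / 2^n.
for n in (1, 2, 5, 10, 20):
    s = zeno_partial_sum(n)
    print(f"n = {n:2d}   covered = {s}   remaining = {1 - s}")
```

The remaining distance halves at every stage, so it drops below any positive number: this is what it means for the infinite sum to converge to 1.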

But the theory of real numbers is far from trivial. It became fully rigorous only considerably after the rise of Newtonian physics. At first, the practical tools of calculus seemed to require infinitesimals, which seemed logically suspect. Thanks to the work of Dedekind, Cauchy, Weierstrass, Cantor and others, a beautiful formalism was developed to handle real numbers, limits, and the concept of infinity in a precise axiomatic manner.

However, the logical problems are not gone. Gödel’s theorems hang like a dark cloud over the axioms of mathematics, assuring us that any consistent theory as strong as Peano arithmetic, or stronger, cannot prove itself consistent. Worse, it will leave some questions unsettled.

For example: how many real numbers are there? The continuum hypothesis proposes a conservative answer, but the usual axioms of set theory leave this question open: there could be vastly more real numbers than most people think. And the superficially plausible axiom of choice (which amounts to saying that the product of any collection of nonempty sets is nonempty) has scary consequences, like the existence of non-measurable subsets of the real line. This in turn leads to results like that of Banach and Tarski: one can partition a ball of unit radius into six disjoint subsets, and by rigid motions reassemble these subsets into two disjoint balls of unit radius. (Later it was shown that one can do the job with five, but no fewer.)

However, most mathematicians and physicists are inured to these logical problems. Few of us bother to learn about attempts to tackle them head-on, such as:

• nonstandard analysis and synthetic differential geometry, which let us work consistently with infinitesimals,

• constructivism, which avoids proof by contradiction: for example, one must ‘construct’ a mathematical object to prove that it exists,

• finitism, which avoids infinities altogether,

• ultrafinitism, which even denies the existence of very large numbers.

This sort of foundational work proceeds slowly, and is now deeply unfashionable. One reason is that it rarely seems to intrude in ‘real life’ (whatever that is). For example, it seems that no question about the experimental consequences of physical theories has an answer that depends on whether or not we assume the continuum hypothesis or the axiom of choice.

But even if we take a hard-headed practical attitude and leave logic to the logicians, our struggles with the continuum are not over. In fact, the infinitely divisible nature of the real line—the existence of arbitrarily small real numbers—is a serious challenge to almost all of the most widely used theories of physics.

Indeed, we have been unable to rigorously prove that most of these theories make sensible predictions in all circumstances, thanks to problems involving the continuum.

One might hope that a radical approach to the foundations of mathematics, such as those listed above, would allow us to avoid some of the problems I’ll be discussing. However, I know of no progress along these lines that would interest most physicists. Some of the ideas of constructivism have been embraced by topos theory, which also provides a foundation for calculus with infinitesimals using synthetic differential geometry. Topos theory and especially higher topos theory are becoming important in mathematical physics. They’re great! But as far as I know, they have not been used to solve the problems I want to discuss here.

Today I’ll talk about one of the first theories to use calculus: Newton’s theory of gravity.

Newtonian Gravity

In its simplest form, Newtonian gravity describes ideal point particles attracting each other with a force inversely proportional to the square of their distance. It is one of the early triumphs of modern physics. But what happens when these particles collide? Apparently the force between them becomes infinite. What does Newtonian gravity predict then?

Of course real planets are not points: when two planets come too close together, this idealization breaks down. Yet if we wish to study Newtonian gravity as a mathematical theory, we should consider this case. Part of working with a continuum is successfully dealing with such issues.

In fact, there is a well-defined ‘best way’ to continue the motion of two point masses through a collision. Their velocity becomes infinite at the moment of collision but is finite before and after. The total energy, momentum and angular momentum are unchanged by this event. So, a 2-body collision is not a serious problem. But what about a simultaneous collision of 3 or more bodies? This seems more difficult.
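To make the 2-body collision concrete, here is a sketch for the simplest case: two point masses released from rest at separation r0, falling straight toward each other. The setup and names are my own assumptions: GM stands for G times the total mass, and conservation of energy for the relative motion gives the relative speed v(r) = \sqrt{2GM(1/r - 1/r_0)}. The collision time is the integral of dr/v, which the substitution r = r_0 \sin^2 \theta turns into a harmless integral of \sin^2 \theta:

```python
import math

def speed(r: float, r0: float, GM: float = 1.0) -> float:
    """Relative speed at separation r, starting from rest at r0."""
    return math.sqrt(2.0 * GM * (1.0 / r - 1.0 / r0))

def collision_time_closed_form(r0: float, GM: float = 1.0) -> float:
    """Closed form of the fall time: (pi / 2) * r0^(3/2) / sqrt(2 GM)."""
    return (math.pi / 2.0) * r0**1.5 / math.sqrt(2.0 * GM)

def collision_time_numeric(r0: float, GM: float = 1.0,
                           steps: int = 10_000) -> float:
    """Midpoint-rule integral of dr / v after substituting r = r0 sin^2(theta)."""
    h = (math.pi / 2.0) / steps
    total = sum(math.sin((i + 0.5) * h) ** 2 for i in range(steps))
    return 2.0 * r0**1.5 / math.sqrt(2.0 * GM) * total * h

# The speed blows up as r -> 0, yet the time to get there is finite.
for r in (0.5, 1e-2, 1e-4, 1e-6):
    print(f"r = {r:<8}  relative speed = {speed(r, 1.0):.3e}")
print("collision time (numeric):    ", collision_time_numeric(1.0))
print("collision time (closed form):", collision_time_closed_form(1.0))
```

The speed grows like 1/\sqrt{r} near the collision, which is integrable, so the bodies meet after a finite time even though their speed is unbounded.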

Worse than that, Xia proved in 1992 that with 5 or more particles, there are solutions where particles shoot off to infinity in a finite amount of time!

This sounds crazy at first, but it works like this: a pair of heavy particles orbit each other, another pair of heavy particles orbit each other, and these pairs toss a lighter particle back and forth. Xia and Saari’s nice expository article has a picture of the setup:


Each time the lighter particle gets thrown back and forth, the pairs move further apart from each other, while the two particles within each pair get closer together. And each time they toss the lighter particle back and forth, the two pairs move away from each other faster!

As the time t approaches a certain value t_0, the speed of these pairs approaches infinity, so they shoot off to infinity in opposite directions in a finite amount of time, and the lighter particle bounces back and forth an infinite number of times!

Of course this crazy behavior isn’t possible in the real world, but Newtonian physics has no ‘speed limit’, and we’re idealizing the particles as points. So, if two or more of them get arbitrarily close to each other, the potential energy they liberate can give some particles enough kinetic energy to zip off to infinity in a finite amount of time! After that time, the solution is undefined.

You can think of this as a modern reincarnation of Zeno’s paradox. Suppose you take a coin and put it heads up. Flip it over after 1/2 a second, then flip it over after 1/4 of a second, and so on. After one second, which side will be up? There is no well-defined answer. That may not bother us, since this is a contrived scenario that seems physically impossible. It’s a bit more bothersome that Newtonian gravity doesn’t tell us what happens to our particles when t = t_0.

You might argue that collisions and these more exotic ‘noncollision singularities’ occur with probability zero, because they require finely tuned initial conditions. If so, perhaps we can safely ignore them!

This is a nice fallback position. But to a mathematician, this argument demands proof.

A bit more precisely, we would like to prove that the set of initial conditions for which two or more particles come arbitrarily close to each other within a finite time has ‘measure zero’. This would mean that ‘almost all’ solutions are well-defined for all times, in a very precise sense.

In 1977, Saari proved that this is true for 4 or fewer particles. However, to the best of my knowledge, the problem remains open for 5 or more particles. Thanks to previous work by Saari, we know that the set of initial conditions that lead to collisions has measure zero, regardless of the number of particles. So, the remaining problem is to prove that noncollision singularities occur with probability zero.

It is remarkable that even Newtonian gravity, often considered a prime example of determinism in physics, has not been proved to make definite predictions, not even ‘almost always’! In 1814, Laplace wrote:

We ought to regard the present state of the universe as the effect of its antecedent state and as the cause of the state that is to follow. An intelligence knowing all the forces acting in nature at a given instant, as well as the momentary positions of all things in the universe, would be able to comprehend in one single formula the motions of the largest bodies as well as the lightest atoms in the world, provided that its intellect were sufficiently powerful to subject all data to analysis; to it nothing would be uncertain, the future as well as the past would be present to its eyes. The perfection that the human mind has been able to give to astronomy affords but a feeble outline of such an intelligence. – Laplace

However, this dream has not yet been realized for Newtonian gravity.

I expect that noncollision singularities will be proved to occur with probability zero. If so, the remaining question would be why it takes so much work to prove this, and thus prove that Newtonian gravity makes definite predictions in almost all cases. Is this a weakness in the theory, or just the way things go? Clearly it has something to do with three idealizations:

• point particles whose distance can be arbitrarily small,

• potential energies that can be arbitrarily large and negative,

• velocities that can be arbitrarily large.

These are connected: as the distance between point particles approaches zero, their potential energy approaches -\infty, and conservation of energy dictates that some velocities approach +\infty.

Does the situation improve when we go to more sophisticated theories? For example, does the ‘speed limit’ imposed by special relativity help the situation? Or might quantum mechanics help, since it describes particles as ‘probability clouds’, and puts limits on how accurately we can simultaneously know both their position and momentum?

Next time I’ll talk about quantum mechanics, which indeed does help.


2 September, 2016

I’m now going to try to announce all my new writings in one place: on Twitter.

Why? Well, someone I respect said he’s been following my online writings, off and on, ever since the old days of This Week’s Finds. He wishes it were easier to find my new stuff all in one place. Right now it’s spread out over several locations:

• Azimuth: serious posts on environmental issues and applied mathematics, fairly serious popularizations of diverse scientific subjects.

• Google+: short posts of all kinds, mainly light popularizations of math, physics, and astronomy.

• The n-Category Café: posts on mathematics, leaning toward category theory and other forms of pure mathematics that seem too intimidating for the above forums.

• Visual Insight: beautiful pictures of mathematical objects, together with explanations.

• Diary: more personal stuff, and polished versions of the more interesting Google+ posts, just so I have them on my own website.

It’s absurd to expect anyone to look at all these locations to see what I’m writing. Even more absurdly, I claimed I was going to quit posting on Google+, but then didn’t. So, I’ll try to make it possible to reach everything via Twitter.

Unlike Facebook, you don’t need to join Twitter to see what people put there. Furthermore, you can see it while blocking cookies. So, I feel okay about this approach to broadcasting my stuff to a larger audience. (Some of my best friends are very concerned with privacy. In fact, I lost touch with one when he said he would only communicate with me in encrypted emails.)

I currently have 4 followers.