The Koide Formula

There are three charged leptons: the electron, the muon and the tau. Let m_e, m_\mu and m_\tau be their masses. Then the Koide formula says

\displaystyle{ \frac{m_e + m_\mu + m_\tau}{\big(\sqrt{m_e} + \sqrt{m_\mu} + \sqrt{m_\tau}\big)^2} = \frac{2}{3} }

There’s no known reason for this formula to be true! But if you plug in the experimentally measured values of the electron, muon and tau masses, it’s accurate within the current experimental error bars:

\displaystyle{ \frac{m_e + m_\mu + m_\tau}{\big(\sqrt{m_e} + \sqrt{m_\mu} + \sqrt{m_\tau}\big)^2} = 0.666661 \pm 0.000007 }
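
The ratio above is easy to recompute. Here is a minimal sketch in Python. The three mass values are assumed central values in MeV/c^2, taken from recent particle data tables rather than from this post, and the function name `koide_ratio` is just an illustrative choice. The sketch ignores the experimental uncertainties, so it reproduces only the central value, not the error bar.

```python
import math

# Assumed central values in MeV/c^2 (not given in the post itself).
M_E = 0.51099895
M_MU = 105.6583755
M_TAU = 1776.86


def koide_ratio(m1, m2, m3):
    """Return (m1 + m2 + m3) / (sqrt(m1) + sqrt(m2) + sqrt(m3))**2."""
    numerator = m1 + m2 + m3
    denominator = (math.sqrt(m1) + math.sqrt(m2) + math.sqrt(m3)) ** 2
    return numerator / denominator


q = koide_ratio(M_E, M_MU, M_TAU)
print(f"Q       = {q:.6f}")
print(f"Q - 2/3 = {q - 2 / 3:.2e}")
```

With these inputs the ratio should land within about 10^-5 of 2/3, consistent with the quoted value.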

Is this significant or just a coincidence? Will it fall apart when we measure the masses more accurately? Nobody knows.

Here’s something fun, though:

Puzzle. Show that no matter what the electron, muon and tau masses might be—that is, any positive numbers whatsoever—we must have

\displaystyle{ \frac{1}{3} \le \frac{m_e + m_\mu + m_\tau}{\big(\sqrt{m_e} + \sqrt{m_\mu} + \sqrt{m_\tau}\big)^2} \le 1}

For some reason this ratio turns out to be almost exactly halfway between the lower bound and upper bound!

Koide came up with his formula in 1982 before the tau’s mass was measured very accurately.  At the time, using the observed electron and muon masses, his formula predicted the tau’s mass was

m_\tau = 1776.97 MeV/c^2

while the observed mass was

m_\tau = 1784.2 ± 3.2 MeV/c^2

Not very good.

In 1992 the tau’s mass was measured much more accurately and found to be

m_\tau = 1776.99 ± 0.28 MeV/c^2

Much better!
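
The prediction can be reproduced by treating the Koide formula as a quadratic equation in x = \sqrt{m_\tau}. Writing s = \sqrt{m_e} + \sqrt{m_\mu} and p = m_e + m_\mu, the formula 3(p + x^2) = 2(s + x)^2 becomes x^2 - 4sx + (3p - 2s^2) = 0, whose larger root is x = 2s + \sqrt{6s^2 - 3p}. The sketch below assumes modern central values for the electron and muon masses (not the 1982 values Koide would have used), and `predict_third_mass` is a hypothetical helper name.

```python
import math

# Assumed central values in MeV/c^2 (modern values, not from the post).
M_E = 0.51099895
M_MU = 105.6583755


def predict_third_mass(m1, m2):
    """Solve the Koide formula for the third (heaviest) mass.

    With x = sqrt(m3), s = sqrt(m1) + sqrt(m2) and p = m1 + m2,
    the formula reads x^2 - 4 s x + (3 p - 2 s^2) = 0.
    The larger root corresponds to the heavy lepton.
    """
    s = math.sqrt(m1) + math.sqrt(m2)
    p = m1 + m2
    x = 2 * s + math.sqrt(6 * s * s - 3 * p)
    return x * x


m_tau_predicted = predict_third_mass(M_E, M_MU)
print(f"predicted tau mass: {m_tau_predicted:.2f} MeV/c^2")
```

The result should agree with the predicted value quoted above to within a few hundredths of an MeV/c^2.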

Koide has some more recent thoughts about his formula:

• Yoshio Koide, What physics does the charged lepton mass relation tell us?, 2018.

He points out how difficult it is to explain a formula like this, given how masses depend on an energy scale in quantum field theory.

35 Responses to The Koide Formula

  1. This reminds me of Eddington’s obsession with the number 137 and the Titius–Bode law of planetary distances.

    • John Baez says:

      Well, Eddington started out writing a long explanation of why the reciprocal of the fine structure constant was 136. Then when it was measured to be closer to 137 he said he’d left something out. But it’s even closer to 137.036.

      When I die my first question to the Devil will be: What is the meaning of the fine structure constant? — Wolfgang Pauli

      The big question is whether there’s any explanation at all for the 25 dimensionless constants in the Standard Model. But most of what people write about this is nonsense. A nontrivial relation between them, like the Koide formula, would be an enormous clue… if it were true.

      There are lots of interesting relations between masses of hadrons, but these are very different: there are underlying reasons for these. Around 1962 Gell-Mann used one to predict the mass of the Ω:

      Gell-Mann–Okubo mass formula.

      Those were the good old days!

  2. linasv says:

    Ever look into harmonic maps? I’ve always planned to dig into them more deeply, never have, but got the general impression that they’re more-or-less the same thing as the sigma model in low-energy nucleon physics, of which the chiral model is a special case. I’m throwing this out because there’s a natural “square root of the Laplacian” appearing there, which, with a lot of gymnastics and hand-waving, might be thought of as the “square root of the mass”. That is to say, one could argue that both the numerator and the denominator of the Koide formula might appear “naturally” in this setting. I wish I could clone myself, so that I had time to pursue this idea. Alas.

    • John Baez says:

      Yes, the differential equation for harmonic maps underlies so-called “nonlinear sigma-models”.

      There must be a lot of work trying to explain the Koide formula, but I haven’t read most of it, and obviously most of this work must be wrong.

      • linasv says:

        (My PhD thesis was on non-linear sigma models.) I guess I’m trying to say that “harmonic maps” is a chapter you might find in a book on Riemannian geometry. Geometry is the kind of place where one finds formulas involving integers and fractions that arise from first principles. Yet, on the other hand, the solitons that appear in non-linear sigma models are reasonable models of the low-energy properties of baryons. As such, the solitons have to be interpreted as spin-1/2 (exactly how is mysterious, at least to me). Well, your leptons are spin-1/2. The probably-just-plain-wrong idea I’m proposing is that perhaps leptons can be interpreted as solitonic excitations. Have to pull a Dirac operator out of that mess. But insofar as that’s a geometric claim, a fiddling with geometry, it strikes me as plausible that something like the Koide formula could fall out. Yes, I’ve sketched an imaginary incoherent fairy-tale to bridge a chasm that is too wide. But still, I find it entertaining, perhaps because I’m infatuated with the geometry of Dirac operators. I’ll leave it at that.

      • John Baez says:

        Okay, sorry, so I don’t have to explain harmonic maps vs. nonlinear sigma-models to you! I guess the term “harmonic map” is mainly used to mean a map from a Riemannian manifold (space) to another Riemannian manifold (the target manifold) which is a critical point of a certain action, while nonlinear sigma-models also study maps from a Lorentzian manifold (spacetime) to a Riemannian manifold (the target manifold) which are critical points of the analogous action. So the math literature on harmonic maps would probably mainly help with the static behavior of nonlinear sigma-models.

        If I were trying to dream up a soliton-based explanation of lepton masses, I’d start by reading this article and a bunch of the references:

        • nLab, Skyrmion.

        There’s been a lot of interesting work recently, which I haven’t followed at all!

        • linasv says:

          Obviously, using telepathy to communicate ideas isn’t working out! My PhD was actually about coupling free quarks to Skyrmions. At that time, I did not know that a special case of Riemannian manifolds are spin manifolds, where the frame bundle vierbeins can be arranged to allow a consistent spin bundle, and thence Dirac operators. What I don’t understand is what happens if one throws spin manifolds into harmonic maps and stirs the pot. Even simple questions: given a Skyrmion as a background field, can I consistently glue spin structures together between the spacetime manifold and the target chiral manifold? If this gluing can be accomplished, is there a corresponding bound state of the Dirac field (in the target space), non-vanishing wherever the Skyrmion field is non-vanishing? I am aware that Dan Freed at UTexas has done work on this (and that Karen Uhlenbeck sits down the hall) but I’m not familiar with it.

          Here’s where the geometry comes in: The target-space of a sigma model can be any symmetric space. This provides a rich playground of groups to play with, and all sorts of invariants necessarily precipitate. (For example, the chiral anomaly is more-or-less the odd-parity part of the Cartan involution, from what I can tell.) Again: what happens if spin structures are thrown into this mix?

          The Skyrmion has a mass that is more-or-less inversely proportional to its radius, and “magically” has the proton mass, given the proton radius (it’s off by about 20%). If one were to write down a spinor field (that lives on the target manifold, not the spacetime manifold), bound to the Skyrmion field, is its effective mass also the same? I haven’t a clue. What would excitations of this field look like? Are they bound to the Skyrmion? (Are they “confined”?) I assume that one can consistently glue a spinor in the target manifold to a spinor field on the spacetime manifold, but what does this look like and what does it imply? I dunno, I’m not convinced that the questions I asked make sense.

          My general impression is that there is very little work on the interplay of spin structures and harmonic-maps aka chiral-models. Obviously, I forgot to mention leptons up above, so let me do so now: by convention, the skyrmion is interpreted as a nucleon, so the associated isospin/flavor is interpreted as up/down, instead of electron/muon. But in a more general framework, might there be an instanton instead of a skyrmion? Could that instanton involve lepton fields? Could some prestidigitation produce the Koide formula? This is the idle daydream I wanted to communicate. And yes, as I re-read what I just wrote, it seems dubious.

          This took me under half-an-hour to type up; yet if I were to follow my own nose and investigate the possibilities I’m proposing, I know it would take me months stretching into years to gain a more refined understanding. Every time I start to think I am a wunderkind with pencil-n-paper calculations, I find a sign error that takes days to resolve. Either that, or scrambling to reference works to validate notation and identities, followed by pleasant distractions found in those references. I imagine it’s the same for everyone.

    • Can’t read it? Due to the COVID-19 pandemic, Physics Today has made its entire archive publicly accessible to those who register.

    • Toby Bartels says:

      I have a feeling that this article was originally published within 3 days of a whole number of years ago.

        • Toby Bartels says:

          Ah, too bad. Now that I look again, I see a faint grey ‘01 November’. Perhaps it was submitted on April 1. Anyway, the comments are illuminating.

        • The Physics Today paper is 1 November, but it talks about a Physical Review paper from decades ago, which appeared on 15 May.

          I don’t think that it was a joke.

        • Toby Bartels says:

          I'm sure that the 1951 letter by Lenz was serious. But the 2017 reminder implies (although without outright stating so) that the coincidence still holds for the many more digits that we now have (or at least had in 2017), when it actually doesn't hold for any additional digits (as noted in the online comments).

        • It does note that it now holds only to a good approximation, whereas in 1951 it held exactly (i.e. within the experimental accuracy). One could also take it to mean that the values for the masses were already pretty good in 1951. (Compare to the fine-structure constant, which for a time was believed to be exactly 1/136, or 1/137.)

    • John Baez says:

      Interesting! I rediscovered the formula

      \displaystyle{ \frac{m_p}{m_e} \approx 6 \pi^5}

      myself as a kid, by fiddling around on a calculator. But I didn’t take it very seriously, because it was already clear to everyone that the proton mass is a rather complicated spinoff of QCD, not something there should be a simple formula for.
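
For anyone who wants to repeat the calculator experiment, here is a minimal sketch. The measured proton-to-electron mass ratio used below is an assumed reference value (rounded from standard tables), not a number taken from this thread.

```python
import math

# Assumed measured proton-to-electron mass ratio (rounded reference value).
MP_OVER_ME = 1836.15267

approx = 6 * math.pi ** 5
rel_err = abs(approx - MP_OVER_ME) / MP_OVER_ME

print(f"6 pi^5         = {approx:.5f}")
print(f"measured ratio = {MP_OVER_ME:.5f}")
print(f"relative error = {rel_err:.2e}")
```

The two numbers agree in the first five significant digits and then part ways, which is the behaviour described in the comments above: good to the precision available in 1951, but not an exact identity.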

      • Right.

        In their magnum opus, Barrow and Tipler calculate whether such numerical coincidences are surprising or not, based on how many constants, which powers, and so on one has to select from.

        The age of the universe in Planck units is 10^{60}, which is the square root of the alleged factor of over-prediction of the value of the cosmological constant. Barrow and Shaw wrote some papers, which I don’t begin to understand, connecting the two quantities.

        Sadly, John is no longer with us (and Tipler has gone off the deep end).

  3. mitchellporter says:

    The category theorist Marni Sheppeard could have contributed a lot to this discussion, since she worked for years on explaining the Koide formula, but she was found dead a week and a half ago. It’s a terrible thing that we’re having this discussion without her.

    In particular she worked with Carl Brannen, who in 2006 found a trigonometric expression for Koide’s formula which gives the physical masses, for an angle of exactly 2/9 radians. The precision of it was enough to catch the attention of Koide himself (arXiv:0706.2534, eqns 3.9-3.10)… Brannen and Sheppeard tried to explain it in the context of circulant (cyclically symmetrical) mass matrices, a form pioneered by Harrison and Scott (hep-ph/0203209).
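
The trigonometric expression is not written out in this comment. A commonly quoted form of it, assumed here rather than taken from the thread, is \sqrt{m_n} = \mu (1 + \sqrt{2} \cos(\delta + 2\pi n/3)) for n = 0, 1, 2, with \delta = 2/9 and \mu an overall scale. Any \delta gives a Koide ratio of exactly 2/3 as long as all three square roots are positive, so the content of the 2/9 observation lies in the mass ratios. The sketch below checks both points against assumed reference values for the mass ratios.

```python
import math


def brannen_sqrt_masses(delta, mu=1.0):
    """Square roots of the three masses in the assumed parametrization."""
    return [
        mu * (1 + math.sqrt(2) * math.cos(delta + 2 * math.pi * n / 3))
        for n in range(3)
    ]


def koide_ratio(masses):
    """Sum of masses divided by the squared sum of their square roots."""
    return sum(masses) / sum(math.sqrt(m) for m in masses) ** 2


roots = brannen_sqrt_masses(2 / 9)
masses = sorted(r * r for r in roots)  # lightest to heaviest, in units of mu^2

print(f"Koide ratio : {koide_ratio(masses):.12f}")
print(f"m_mu / m_e  : {masses[1] / masses[0]:.3f}")
print(f"m_tau / m_mu: {masses[2] / masses[1]:.4f}")
```

The printed ratios can be compared with the measured values of roughly 206.77 and 16.82.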

    • nad says:

      Sad to hear this. Mountaineering is unfortunately a rather dangerous sport and very sadly hits also experienced mountaineers (see e.g. also

      I have never met her, but saw that she commented on and off at the n-Category Café. I asked John, when I met him in Oxford, whether he knew about her whereabouts, but he didn’t know much. (I think she also spent some time in Oxford.)

      Did you know her?

    • John Baez says:

      I first met Marni Sheppeard when she was working with Louis Crane on what came to be called the Crane–Sheppeard model. I later spent a long time developing the math of this model, which involves representations of something called the Poincaré 2-group. My student Derek Wise, Laurent Freidel and his student Aristide Baratin and I developed the foundations of this sort of representation theory in a book called Infinite-Dimensional Representations of 2-Groups. By the end we were too exhausted to develop the model, but recently there’s been some new work on it:

      • Seth K. Asante, Bianca Dittrich, Florian Girelli, Aldo Riello and Panagiotis Tsimiklis, Quantum geometry from higher gauge theory.

      I later was a reader for Marni Sheppeard’s PhD thesis, which unfortunately was a garbled mix of smidgens of topos theory and twistor theory, nothing that made sense.

      Once she got lost for days in some mountains in New Zealand and Louis Crane played some role in rescuing her—I forget exactly what, maybe he just noticed that she was missing. So her death, while tragic, was foreshadowed.

  4. Johan Swanljung says:

    Nitpicking, but if the masses are all positive (in the puzzle), the ratio must be strictly less than 1.

  5. allenknutson says:

    These are the running masses (when surrounded by virtual particles), not the bare masses (which I presume need renormalization), right? Is this the sort of relation that would continue to hold at high energies?

    The Wikipedia page has a comment about this:

    • John Baez says:

      Yes, in the current ‘top-down’ thinking about particle physics, where the fundamental physics appears at high energies and gets tweaked by the renormalization group as we descend to lower energies, it’s very hard to understand why an exact relation like the Koide formula would hold at low energies. I added a tiny bit about this to my blog article: mainly, a link to a paper by Koide where he discusses some theories in which symmetries would prevent the lepton masses from depending on the energy scale. So he recognizes that this issue is important.

  6. The inequality is a straightforward application of the Cauchy-Schwarz inequality:

    \sum_{i=1}^n a_i b_i \leq \Big(  \sum_{i=1}^n a_i^2  \Big)^{\frac{1}{2} } \Big(  \sum_{i=1}^n b_i^2  \Big)^{\frac{1}{2} }.

    With n=3, the a_i taken to be \sqrt{a}, \sqrt{b}, \sqrt{c}, and b_i=1, we get:

    \sqrt{a}+ \sqrt{b} + \sqrt{c} \leq (a+b +c)^{  \frac{1}{2} }  \, \sqrt{3} .

    Squaring we get the LHS of the inequality:

    \displaystyle{ \frac{1}{3} \leq \frac{a+b+c   }{ (\sqrt{a}+\sqrt{b}+\sqrt{c})^2   }    }  .

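    The RHS is obtained by noting that, since the cross terms 2\sqrt{ab}, 2\sqrt{bc} and 2\sqrt{ca} are positive:

    (\sqrt{a}+\sqrt{b}+ \sqrt{c})^2 > a+b +c .

    Dividing through gives a strict inequality, so in particular the ratio is at most 1:

    \displaystyle{ \frac{a+b+c}{ (\sqrt{a}+\sqrt{b}+\sqrt{c})^2        } < 1 } .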

    • John Baez says:

      Nice! Without loss of generality we can normalize the masses so their sum is one; by symmetry if the function assumes a minimum at a unique point on this triangle it must be the center, where the masses are equal, and there the function is 1/3. However, the trouble (as usual) with this argument is showing that there’s a unique minimum; it seems just as easy to use Lagrange multipliers to find all the critical points, discover there’s just one, and then investigate the function on the boundary of the triangle, where one or more of the masses is zero.
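
The bounds in the puzzle can also be spot-checked numerically. This is only a sanity check under assumed sampling choices (log-uniform masses over twelve orders of magnitude, a fixed seed), not a substitute for the proofs above. The helper names are illustrative.

```python
import math
import random


def koide_ratio(a, b, c):
    """(a + b + c) / (sqrt(a) + sqrt(b) + sqrt(c))**2 for positive a, b, c."""
    return (a + b + c) / (math.sqrt(a) + math.sqrt(b) + math.sqrt(c)) ** 2


def sample_extremes(trials=100_000, seed=0):
    """Return the smallest and largest ratio seen over random positive triples."""
    rng = random.Random(seed)
    lowest, highest = 1.0, 0.0
    for _ in range(trials):
        # Log-uniform sampling so that very unequal masses are well represented.
        a, b, c = (10 ** rng.uniform(-6, 6) for _ in range(3))
        q = koide_ratio(a, b, c)
        lowest = min(lowest, q)
        highest = max(highest, q)
    return lowest, highest


lowest, highest = sample_extremes()
print(f"smallest ratio seen: {lowest:.6f}")
print(f"largest ratio seen : {highest:.6f}")
print(f"equal masses       : {koide_ratio(1.0, 1.0, 1.0):.6f}")
print(f"one dominant mass  : {koide_ratio(1.0, 1e-12, 1e-12):.6f}")
```

Equal masses give the lower bound 1/3, while letting two masses shrink toward zero pushes the ratio toward 1 without reaching it, matching the behaviour at the center and at the corners of the triangle.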

  7. Concerning your math puzzle:

    The value 2/3 means that the vector h = {√m_e, √m_μ, √m_τ} makes an angle of exactly 45° with the vector {1,1,1}. Any vector h on the 45° cone gives a value of 2/3.

    For an arbitrary value Q of the ratio, the general formula for this angle θ is:

    \theta ~=~ \arccos{\sqrt{\tfrac{1}{3Q}}}

    So for Q=2/3 this formula gives:

    \arccos{\sqrt{\tfrac{1}{2}}}~=~ 45^\circ

    For Q=1/3 we get 0°, the minimum angle.

    For Q=1 we get an angle of 54.74°, which is the angle between {1,1,1} and any of {0,0,1}, {0,1,0} and {1,0,0}. If we go further then at least one of the three components becomes negative.

    So the angle of θ=45° with respect to {1,1,1} is the geometric translation of Koide’s coincidence.
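
The angle formula follows from cos θ = h·{1,1,1} / (|h| √3) = (√m_e + √m_μ + √m_τ) / √(3(m_e + m_μ + m_τ)) = √(1/(3Q)). A short sketch checks the three special cases quoted above; the function names are illustrative, and the clamp guards against rounding just above 1 before calling `acos`.

```python
import math


def angle_deg_from_q(q):
    """Angle in degrees between h and (1, 1, 1), given the ratio Q."""
    cos_theta = min(1.0, math.sqrt(1 / (3 * q)))  # clamp against rounding
    return math.degrees(math.acos(cos_theta))


def angle_deg_between(h, d=(1.0, 1.0, 1.0)):
    """Angle in degrees between two 3-vectors, computed directly."""
    dot = sum(x * y for x, y in zip(h, d))
    norm_h = math.sqrt(sum(x * x for x in h))
    norm_d = math.sqrt(sum(y * y for y in d))
    return math.degrees(math.acos(min(1.0, dot / (norm_h * norm_d))))


for q in (1 / 3, 2 / 3, 1.0):
    print(f"Q = {q:.4f} -> theta = {angle_deg_from_q(q):.4f} degrees")

print(f"angle between (0,0,1) and (1,1,1): {angle_deg_between((0.0, 0.0, 1.0)):.4f}")
```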

    (and yes, we all do know how mass renormalization and all that is done in the textbooks I may suppose)

  8. […] A related discussion (which reminded us of Koide’s equation) is taking place in John Carlos Baez’s latest post HERE. […]

  9. westy31 says:

    The Koide formula reminds me of “The kiss precise”.

    The Koide formula (rewritten a bit):

    (0 + sqrt(m_e) + sqrt(m_mu) + sqrt(m_tau))^2 = 3/2 * (0 + m_e + m_mu + m_tau)

    For 4 mutually tangent disks, the radii must satisfy:

    (1/r0 + 1/r1 + 1/r2 + 1/r3)^2 = 2 * (1/r0^2 + 1/r1^2 + 1/r2^2 + 1/r3^2)

    It turns out the factor 2 changes to a factor 3/2 if we demand the disks, instead of touching, intersect at an angle whose cosine is 2/3.
    Also, we can set 1/r0 to zero by making the circle a line (a circle of infinite radius).

    So we get:
    (0 + 1/r1 + 1/r2 + 1/r3)^2 = 3/2 * (0 + 1/r1^2 + 1/r2^2 + 1/r3^2)
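
Both relations have the same shape, (sum of curvatures)^2 = factor * (sum of squared curvatures), so one small function can test them. The sketch below checks the tangent case (factor 2) on two well-known integer configurations, and then the factor 3/2 version with the square roots of the lepton masses in the role of the curvatures. The mass values are assumed central values in MeV/c^2, not taken from this comment, and the geometric claim about intersecting disks is not verified here, only the algebraic relation.

```python
import math


def descartes_residual(curvatures, factor=2.0):
    """Return (sum k)^2 - factor * (sum k^2); zero when the relation holds."""
    total = sum(curvatures)
    return total ** 2 - factor * sum(k * k for k in curvatures)


# Tangent case: a unit circle containing circles of curvature 2, 2 and 3.
print(descartes_residual([-1, 2, 2, 3]))

# Tangent case with a line: two unit circles on a line and a small one between.
print(descartes_residual([0, 1, 1, 4]))

# Koide analogue: curvature 0 for the line, sqrt(mass) for the three "disks".
# Assumed central mass values in MeV/c^2.
lepton_masses = (0.51099895, 105.6583755, 1776.86)
lepton_curvatures = [0.0] + [math.sqrt(m) for m in lepton_masses]
residual = descartes_residual(lepton_curvatures, factor=1.5)
relative = residual / sum(lepton_curvatures) ** 2
print(f"relative residual for leptons: {relative:.2e}")
```

The first two residuals vanish exactly, while the lepton residual is small but nonzero, reflecting that the Koide relation holds only within experimental precision.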

    The Koide formula can be visualised as a configuration of disks:

    I made a little web site on this:

    I don’t know if this brings us any closer to knowing if this formula is more than pure coincidence.

    The value of 3/2 for 1/cosine is special: 1/cos(alpha) is exactly equal to the factor that appears in the generalised kiss precise formula.

    • Wolfgang says:

      When I first learned about the Koide formula it reminded me of this, but I was unable to formulate anything substantial about it. I liked the relation to geometric curvature though. Maybe this is related to your geometric explanation?

      • westy31 says:

        Well, I code-named Descartes’ theorem “Kiss precise”, so that at least is related.
        To generalize Descartes’ theorem to circles intersecting at an angle, I used a nice proof by Jerzy Kocik:
        He uses a map from disks with radius (r) and midpoint (x,y) to a (3+1)D Minkowski space. I vaguely see this as a clue. Also, I vaguely see lepton circles tiling event horizons in momentum space…
