Well Temperaments (Part 2)

Last time I ended with a question: why are certain numbers close to 1 so important in tuning systems? It helps to understand a bit about this before we plunge into the study of well temperaments. It turns out that in some sense western harmony evolved one prime at a time, so let’s look at the subject that way.

The prime 2

If all the frequency ratios in our tuning system were powers of 2:

2^i

life would be very simple. Multiplying a frequency by 2 raises its pitch by an octave, so the only chords we could play are those built out of octaves. Not much music could be made! But there’d be no difficult decisions, either.

The primes 2 and 3

In Pythagorean tuning, also called 3-limit tuning, we generate all our frequency ratios by multiplying powers of 2 and powers of 3:

2^i · 3^j

This is more exciting. While the frequency ratio of 2 is an octave, that of 3/2 is called a just perfect fifth. So now we can use octaves and fifths to build other intervals (that is, frequency ratios).

But in fact, any positive real number can be approximated arbitrarily well by numbers of the form 2^i · 3^j, so we have an embarrassment of riches: more intervals than we really want! To bring the system under control, we take some number of the form 2^i · 3^j that’s really close to 1 and act like it is 1.

I examined the options in an earlier post and got a list of ‘winners’ according to some precise criterion. A couple of early winners are

2^8 · 3^{-5} = 256/243 ≈ 1.053

and

2^{-11} · 3^7 = 2187/2048 ≈ 1.068

These would be important in scales with 5 or 7 notes, but western music holds out for a much better one, called the Pythagorean comma:

Pythagorean comma = p = 2^{-19} · 3^{12} = 531441/524288 ≈ 1.013643

This is important for a 12-tone scale, because it means that if we go up 12 fifths, multiplying the frequency by 3/2 each time, it’s almost the same as going up 7 octaves.

But not quite! There are many ways of dealing with this problem. In Pythagorean tuning we absorb the problem by dividing one of our fifths by the Pythagorean comma, turning it into an unpleasant ‘wolf fifth’:

Pythagorean wolf fifth = (3/2)/p = 2^{18} · 3^{-11} = 262144/177147 ≈ 1.479811
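The numbers above can be checked with exact fractions. This is only a minimal sketch: the helper name `fifths_mod_octaves` is hypothetical, and dividing by the nearest power of 2 is just one simple way to measure closeness to 1, not necessarily the criterion used in the earlier post.

```python
from fractions import Fraction
from math import log2

def fifths_mod_octaves(n):
    """Stack n just fifths (3/2 each), then divide by the nearest power of 2."""
    octaves = round(n * log2(3 / 2))      # nearest whole number of octaves
    return Fraction(3, 2) ** n / Fraction(2) ** octaves

pythagorean_comma = fifths_mod_octaves(12)          # 531441/524288
wolf_fifth = Fraction(3, 2) / pythagorean_comma     # 262144/177147

for n in (5, 7, 12):
    r = fifths_mod_octaves(n)
    print(n, r, float(r))
print("wolf fifth:", wolf_fifth, float(wolf_fifth))
```

For 5 fifths this gives 243/256, the inverse of 256/243; for 7 fifths it gives 2187/2048, and for 12 fifths the Pythagorean comma.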

But we can spread the inverse of the Pythagorean comma around the circle of fifths any way we like, and different ways give different tuning systems.

For example, in equal temperament we spread it completely evenly around the circle of fifths, using the equal tempered fifth everywhere:

equal tempered fifth = (3/2)/p^{1/12} = 2^{7/12} ≈ 1.498307

This is not an example of 3-limit tuning because it uses irrational numbers! But it’s an obvious way to solve the problem of the Pythagorean comma which emerges in 3-limit tuning. More interesting solutions tend to involve the next prime number.
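Here is a small numerical check of the equal tempered fifth, assuming ordinary floating-point arithmetic is accurate enough for the purpose. The variable names are illustrative only.

```python
from fractions import Fraction

p = Fraction(3**12, 2**19)                    # Pythagorean comma, exact
tempered_fifth = 1.5 / float(p) ** (1 / 12)   # just fifth divided by p^(1/12)

print(tempered_fifth)          # about 1.498307
print(2 ** (7 / 12))           # the same number, up to rounding
print(tempered_fifth ** 12)    # about 128, i.e. exactly 7 octaves
```

Since every fifth is narrowed by the same factor, twelve of them land on 2^7 and no wolf fifth is left over.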

The primes 2, 3, and 5

In 5-limit tuning we generate all our frequency ratios by multiplying powers of 2, 3 and 5:

2^i · 3^j · 5^k

Equivalently, we build them using the octave (2), the just perfect fifth (3/2) and the just major third: 5/4.

There are some new simple fractions close to 1 that you can build with 2, 3 and also 5. The most important is the syntonic comma:

syntonic comma = σ = 2^{-4} · 3^4 · 5^{-1} = 81/80 = 1.0125

This shows up when you try to reconcile the perfect fifth and the major third. If you go up four just perfect fifths, you boost the frequency by a factor of (3/2)^4 = 81/16 = 5.0625, which is a bit more than a just major third and two octaves, namely 5/4 × 2^2 = 5. The ratio is the syntonic comma.

As we’ll see in future episodes, this realization is fundamental to many well tempered tuning systems. We’ve already seen the grand-daddy of these systems: quarter-comma meantone. It’s not a well temperament itself, but fixing its main flaw leads to well tempered systems. In quarter-comma meantone, we divide most of our fifths by the fourth root of the syntonic comma, which gives lots of just major thirds.

So, this scale has many ‘quarter-comma fifths’ with a frequency ratio of (3/2)σ^{-1/4}. Going around the whole circle and multiplying 12 of these quarter-comma fifths would give 125, which is not quite the 128 we need to go up 7 octaves. So we need to take one of these quarter-comma fifths and multiply it by 128/125. The resulting ‘wolf fifth’ sounds terrible, and this is what well temperaments seek to cure.
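The arithmetic of quarter-comma meantone can be sketched in a few lines of Python. The variable names are made up for illustration, and the irrational quarter-comma fifth is handled in floating point, so the checks are approximate except where exact fractions are used.

```python
from fractions import Fraction

sigma = Fraction(81, 80)                     # syntonic comma
qc_fifth = 1.5 * float(sigma) ** (-1 / 4)    # quarter-comma fifth, about 1.495349

four_fifths = qc_fifth ** 4                  # about 5: a just major third plus two octaves
twelve_fifths = qc_fifth ** 12               # about 125, short of 128
wolf_fifth = qc_fifth * 128 / 125            # about 1.5312

# The exact version of "twelve quarter-comma fifths give 125":
assert Fraction(3, 2) ** 12 / sigma ** 3 == 125

# Eleven quarter-comma fifths and one wolf fifth close the circle at 7 octaves.
circle = qc_fifth ** 11 * wolf_fifth
print(four_fifths, twelve_fifths, wolf_fifth, circle)
```

The wolf fifth comes out noticeably wider than a just fifth of 1.5, which is the flaw the main text refers to.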

The number 128/125 is an important fraction close to 1 built from just the primes 2 and 5. It’s called the lesser diesis:

lesser diesis = δ = 2^7 · 5^{-3} = 128/125 = 1.024

It’s not only a power of 2 divided by a power of 5, but also a power of 2 divided by a power of 10. You’ve bumped into it if you’ve ever wondered why people often use ‘kilobyte’ to mean 1024 bytes, not 1000.

From the Pythagorean comma, syntonic comma and lesser diesis we can generate other fractions close to 1 built from the primes 2, 3 and 5. For example, I’ve already discussed the product of the syntonic comma and lesser diesis, and also their ratio.

But when we study well temperaments, more important will be the Pythagorean comma divided by the syntonic comma. Called the schisma, this fraction is very close to 1:

schisma = χ = p/σ = 2^{-15} · 3^8 · 5 = 32805/32768 ≈ 1.001129

I’ll talk about it more next time.

It’s also important to note that the lesser diesis is not independent from the Pythagorean comma and syntonic comma. We’ve already seen today that going up a just perfect fifth twelve times is the same as going up 7 octaves times the Pythagorean comma. Now we’re seeing that going up a quarter-comma fifth, (3/2)σ^{-1/4}, twelve times is the same as going up 7 octaves divided by the lesser diesis. Dividing the second fact by the first, we have

(syntonic comma)^{-3} · lesser diesis = (Pythagorean comma)^{-1}

or

p δ = σ^3

We’ve already seen this in a slightly different way before.
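This relation is easy to confirm with exact rational arithmetic. The sketch below also checks it at the level of exponent vectors, listing the powers of 2, 3 and 5 in that order; that bookkeeping is a convenience introduced here, not notation from the post.

```python
from fractions import Fraction

p = Fraction(3**12, 2**19)     # Pythagorean comma
sigma = Fraction(81, 80)       # syntonic comma
delta = Fraction(128, 125)     # lesser diesis

assert p * delta == sigma ** 3            # p δ = σ^3, exactly
schisma = p / sigma                       # 32805/32768

# Same relation on exponents of (2, 3, 5): multiplying fractions adds vectors.
p_vec, sigma_vec, delta_vec = (-19, 12, 0), (-4, 4, -1), (7, 0, -3)
lhs = tuple(a + b for a, b in zip(p_vec, delta_vec))
rhs = tuple(3 * s for s in sigma_vec)
print(lhs, rhs, schisma)
```

Both sides have exponent vector (-12, 12, -3), so the three commas span only a two-dimensional family of fractions.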

Due to this relation there must be other fractions close to 1, built only from powers of the primes 2, 3, and 5, that are independent from p, σ and δ. In fact, we’ve already seen four such fractions appearing as the sizes of semitones in just intonation.

The ratios of these semitones include the syntonic comma, the lesser diesis, and also their product the greater diesis and their ratio the diaschisma!

But these semitones are not extremely close to 1. The smallest, the lesser chromatic semitone, is 25/24 ≈ 1.041666. So there must be interesting examples of fractions built from 2, 3 and 5, independent of the syntonic and Pythagorean commas, and much closer to 1. On Mastodon I asked for examples built solely from the primes 3 and 5, and a bunch of people helped me out. Here are some of the first few winners:

3^{-1} · 5^1 = 1.666…
3^3 · 5^{-2} = 1.08
3^{-19} · 5^{13} ≈ 1.050283
3^{22} · 5^{-15} ≈ 1.028295
3^{-41} · 5^{28} ≈ 1.021383
3^{63} · 5^{-43} ≈ 1.006767

The main thing to notice here is that we need fractions with impractically large numerators and denominators to get closer than the large diatonic semitone, 27/25 = 1.08. These fractions won’t play a role in well temperaments.
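A list like this can be reproduced with a short search. The sketch below assumes a particular notion of ‘winner’: a fraction 3^a · 5^b that is closer to 1, on a logarithmic scale, than any such fraction with a smaller power of 3. That criterion is a guess chosen for illustration, and the function name `record_setters` is hypothetical.

```python
from fractions import Fraction
from math import log

def record_setters(max_exp3):
    """Fractions 3^a * 5^b > 1 that beat all those with a smaller |a|."""
    records = []
    best = float("inf")
    for a in range(1, max_exp3 + 1):
        b = round(a * log(3) / log(5))            # power of 5 nearest to 3^a
        distance = abs(a * log(3) - b * log(5))   # |log(3^a / 5^b)|
        if distance < best:
            best = distance
            ratio = Fraction(3) ** a / Fraction(5) ** b
            if ratio > 1:
                records.append((a, -b, ratio))
            else:                                 # flip so the fraction exceeds 1
                records.append((-a, b, 1 / ratio))
    return records

for exp3, exp5, ratio in record_setters(63):
    print(f"3^{exp3} * 5^{exp5} = {float(ratio):.6f}")
```

Under this assumption the search returns the six exponent pairs listed above, and nothing else up to the 63rd power of 3.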

Higher primes

I won’t say much about primes after 5 now. But they’ve been studied in music theory at least since Ptolemy, and the compositions of Ben Johnston really run wild with them. For a tiny bit about the virtues of the prime 7, read my post on the harmonic seventh chord.

The facts I’ve crudely laid out above must be part of an elegant general theory of approximating the number 1 by fractions built from powers of a specified set of primes, and how to build scales from these fractions. Done systematically, this could be of interest not just to music theorists but even pure mathematicians. But I will not explore this now, since my goal was merely to recall some facts needed to understand the explosion of well temperaments starting around 1690!

Next time I’ll digress slightly into Kirnberger’s discovery of a tuning system with frequency ratios built only from the primes 2, 3, and 5 that comes extremely close to equal temperament. This is not a practical system, but it relies on an utterly astounding coincidence, and more importantly it highlights the role of the schisma, which will keep showing up in other systems.

schisma = χ = p/σ = 2^{-15} · 3^8 · 5 = 32805/32768 ≈ 1.001129

For more on Pythagorean tuning, read this series:

Pythagorean tuning.

For more on just intonation, read this series:

Just intonation.

For more on quarter-comma meantone tuning, read this series:

Quarter-comma meantone.

For more on well-tempered scales, read this series:

Part 1. An introduction to well temperaments.

Part 2. How small intervals in music arise naturally from products of integral powers of primes that are close to 1. The Pythagorean comma, the syntonic comma and the lesser diesis.

Part 3. Kirnberger’s rational equal temperament. The schisma, the grad and the atom of Kirnberger.

Part 4. The music theorist Kirnberger: his life, his personality, and a brief introduction to his three well temperaments.

Part 5. Kirnberger’s three well temperaments: Kirnberger I, Kirnberger II and Kirnberger III.

For more on equal temperament, read this series:

Equal temperament.

22 Responses to Well Temperaments (Part 2)

  1. Mark Meckes says:

    There’s at least one piece of music that uses power of 2 pitch ratios almost exclusively.

  2. Steve Huntsman says:

    FWIW, Daniel Rosiak told me that he used Pythagorean tuning in a tutorial as an example of nontrivial sheaf cohomology in dimension 1.

    • John Baez says:

      Cool, I’d like to see that. I wonder if it’s another way to think about the thing Michael Weiss and I were discussing back here. Michael wrote:

      I once attended a concert of music performed with just intonation. As I recall, it was billed as non-Bachian music, in ironic tribute to Bach’s Well-Tempered Clavier. The organizers had a mathematical bent. One of them explained that the commas are a discrete analog of holonomies. So you could say this is music with curvature.

      I replied:

      If commas are really a discrete analogue of holonomies then I, as a supposed expert on gauge theory and its discrete analogues (like spin networks and lattice gauge theory), should be able to make that precise!

      So that’s an interesting gauntlet you just threw down there!

      Since the Pythagorean comma is simpler I could start with that. Think of the circle of fifths X as a graph with 12 nodes and 12 edges. As we move along each edge, suppose the frequency goes up by a factor of 3/2. The issue is that when we go all the way around we are not ‘back where we started’.

      So, we want to assign some group element to each oriented edge of X, which records the fact that the frequency gets multiplied by 3/2 as we move along that edge.

      For this we need to think about frequency ratios modulo octaves. We can do it as follows: take the multiplicative group of positive reals \mathbb{R}_+ and mod out by 2, getting the group

      G = \mathbb{R}_+/2

      Note that we’re modding out by 2 multiplicatively, not additively as is more often done! So G is the group of frequency ratios mod octaves, and it’s isomorphic to the circle group, often called \mathrm{U}(1) by physicists.

      To put a G-connection on the graph X means that we assign an element of G to each oriented edge of X. We take all these elements to be [3/2] \in G.

      This says mathematically that each time we move up a fifth on the circle of fifths, the frequency gets multiplied by 3/2, but we only care about frequencies mod octaves.

      The holonomy as we go all the way around the circle of fifths is defined to be the product of the group elements labeling all the edges. This is

      [(3/2)^{12}] \in G

      and this is not the identity, though it’s close: this is the Pythagorean comma. You can see it in this diagram made by AugPi:
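The holonomy computation described in this comment can be sketched in Python. Elements of G are represented here by the unique fraction in the interval [1, 2) in each class, which is one convenient choice of representative and an assumption of this sketch; the helper name `mod_octaves` is hypothetical.

```python
from fractions import Fraction

def mod_octaves(ratio):
    """Representative in [1, 2) of a frequency ratio modulo octaves."""
    ratio = Fraction(ratio)
    while ratio >= 2:
        ratio /= 2
    while ratio < 1:
        ratio *= 2
    return ratio

# The connection: every oriented edge of the circle of fifths carries [3/2].
edges = [Fraction(3, 2)] * 12

# Holonomy: multiply the group elements all the way around the circle.
holonomy = Fraction(1)
for g in edges:
    holonomy = mod_octaves(holonomy * g)

print(holonomy, float(holonomy))   # 531441/524288, about 1.013643
```

The result is the Pythagorean comma rather than the identity 1, which is the sense in which the connection fails to be flat.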

  3. Mark Meckes says:

    Here’s another attempt at linking to the piece I alluded to.

    • John Baez says:

      By the time your comment reached my email the link was to “http://xn--hvg/”, which seems like nonstandard syntax, and by the time it reached here it was replaced by a link back to this article. Sorry! Maybe you can just tell us the URL and let me do the rest?

      • Toby Bartels says:

        I don’t know how it ended up as xn--hvg, but I can parse that and (mostly) follow it afterwards. The prefix xn-- indicates IDNA encoding of a non-ASCII domain name, and then hvg is the IDNA encoding of “, a left double quotation mark. By the time it hit the blog, the beginning http and ending / were gone (perhaps because “ alone is not a valid domain), and the only thing in the URL is “. This gets tacked on to the end of the URL for the current page, and then WordPress silently drops that, since this page has no subpages.

    • Mark Meckes says:

      Here it is:

      • Toby Bartels says:

        Interesting, it's almost as if the point isn't the pitch but the rhythm and dynamics. Yet pitch does play a role!

      • Mark Meckes says:

        Indeed! It’s not as if the piano is being treated like an untuned percussion instrument. Definite pitch plays an important role throughout this piece.

        I like how dramatically this piece demonstrates that treating pitches that differ by an octave as “the same” is an oversimplification.

    • Mark Meckes says:

      Apparently I just don’t know the right way to include links in a reply on a WordPress blog, because I didn’t expect that to work automatically.

  4. John Baez says:

    Over on Mathstodon, Mike Battaglia left some fascinating comments including these:

    I guess I’ll also just leave a comment about this: “The facts I’ve crudely laid out above must be part of an elegant general theory of approximating the number 1 by fractions built from powers of a specified set of primes, and how to build scales from these fractions. Done systematically, this could be of interest not just to music theorists but even pure mathematicians.”

    This is basically what we in the tuning community call “regular temperament theory,” and it is super interesting. The basic idea: the intervals of 5-limit just intonation form a free abelian group, which we can think of as a Z-module; let’s call it our “universe” U. Then the torsion-free quotient groups of U are, basically, extremely interesting: they represent all of the “regular temperaments” one can derive from U. Associated to each such temperament T is a kernel from the quotient map, which represents the set of all commas “tempered out” in T.

    Each temperament has a set of characteristic chord progressions one can only play in that temperament (or some temperament strictly extending its kernel). For instance, the reason that 90s hip hop Skee Lo song (originally “Spinnin” by Bernard Wright) can’t be played in 1/4-comma meantone is because, in just intonation, that chord progression doesn’t bring you back to 1/1, but the comma 128/125 (which is tempered out in 12-TET, but not 1/4-comma meantone). So, if you play this in 12-TET, it sounds good, but there’s a weird shift from Cb to B if you try to do it in meantone. But if you play it in 15-TET, no problem!
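Whether an equal temperament tempers out a given comma can be checked with a little integer arithmetic. The sketch below assumes that n-TET sends each of the primes 2, 3 and 5 to the nearest whole number of steps, which is a simplifying assumption and not something stated in the comment. The function names are hypothetical, and 19-TET is used only as a rough stand-in for a meantone-like tuning.

```python
from math import log2

def nearest_step_map(n):
    """Steps of n-TET assigned to the primes 2, 3, 5 (nearest-step assumption)."""
    return tuple(round(n * log2(prime)) for prime in (2, 3, 5))

def steps(n, comma):
    """Number of n-TET steps spanned by a comma given as exponents of (2, 3, 5)."""
    return sum(m * e for m, e in zip(nearest_step_map(n), comma))

lesser_diesis = (7, 0, -3)      # 128/125
syntonic_comma = (-4, 4, -1)    # 81/80

for n in (12, 15, 19):
    print(n, nearest_step_map(n), steps(n, lesser_diesis), steps(n, syntonic_comma))
```

A result of 0 means the comma is tempered out. Under this assumption 12-TET and 15-TET both send 128/125 to 0 steps, while 19-TET sends it to 1 step.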

    Now, of course, we only care about temperaments in which mostly sensible things are in the kernel – there’s no point tempering out, you know, 5/4 or whatever. So, associated to any temperament are two useful invariants: its approximation error (in approximating just intonation) and its complexity (how many notes one needs to play useful chords, kind of). Given these quantities, and a free parameter denoting the trade-off between complexity and tuning accuracy, we can basically compute the “best” temperaments of whatever rank we want (according to these two criteria, anyway).

    A “tuning map” f in hom(U,ℝ) counts as a “tuning” of some regular temperament T iff the kernel of T is a subset of the kernel of f (meaning that f tempers out the same commas that T does, plus possibly some extras). For any such T, there is always a “best” tuning map w/ minimal average approximation error – with a few variations regarding what “average” means, but most sensible ways of doing this tend to yield the same sort of result. So, this is an invariant associated to any torsionfree quotient of U: the “best-case in-tuneness” of it. I’ll skip the details but the “complexity” of each quotient can be computed using similar reasoning.

    Anyway, I’ll leave it at that for now, but there is like a near-endless amount of interesting math people are doing with these simple ideas, with people using the exterior algebra and Grassmannian manifolds to parameterize a “space of temperaments” and etc. I would guess much of this is probably pretty tame for you, but, you know, pretty intense for what usually counts as “music theory!”

    • John Baez says:

      Mike Battaglia continued:

      The best place to start is probably Paul Erlich’s “A Middle Path” paper, which is here: https://sethares.engr.wisc.edu/paperspdf/Erlich-MiddlePath.pdf

      That paper is a bit dated and some of the terminology has long since changed, but it’s probably still the best place to start; it introduces the basic ideas, one of the methods of tuning optimization, and the notation we tend to use. There’s plenty beyond that, for which I would imagine you’d get up to speed pretty quickly, since most of this is really just linear algebra! I’m always happy to talk about this stuff so feel free to ping me about it.

      The stuff with Grassmannians isn’t in that paper, but is likewise pretty simple: each regular temperament can be uniquely identified by its kernel as a quotient of U. Since regular temperaments are torsionfree quotients, each such kernel is a pure subgroup. So, we want a moduli space for pure subgroups of U. Well, if we think of U as a lattice embedded in a real vector space V, then subgroups of U are also subspaces of V, and the Grassmannian parameterizes subspaces of V. The exterior algebra makes this all very straightforward.

      Super quick explanation of the exterior algebra stuff, which isn’t in that paper:

      For instance, suppose U is 5-limit just intonation, and we want to quotient by the subgroup generated by 81/80, the syntonic comma, and 128/125, the lesser diesis. So,

      81/80 = 2^{-4} \cdot 3^4 \cdot 5^{-1}

      and using the modified bra-ket notation in that paper we can represent it as the vector

      |-4 , 4 ,-1 \rangle

      with 128/125 being

      | 7 , 0 ,-3 \rangle

      If we take the wedge product of these two vectors we get the bivector

      ||-28, 19, -12\rangle\rangle

      and in fact we will get the same thing if we use any basis for the same kernel. This bivector represents the kernel of the temperament – and thus represents the temperament itself – and using the Plücker embedding we can view it as a point on the above Grassmannian.

      Then, there’s a certain “dual” operation (slightly modified Hodge star) which turns this into the covector

      \langle 12, 19, 28|

      representing 12-TET: 2/1 maps to 12 steps, 3/1 to 19 and 5/1 to 28. This tells you that 12-TET is the unique temperament tempering out that kernel. (There’s a small quirk w/ the sign here, as the wedge product is anti-commutative.)

      Now, for instance, suppose that we like tempering out 81/80, but instead of tempering out the diesis of 128/125, you had a different idea in mind: you want to temper it equal to the chromatic semitone 25/24, making them both one kind of thing. This means their quotient, 3125/3072 is tempered out. If you do the above math, you instead get

      \langle 19, 30 ,44 |

      or 19-TET – this tells you that 19-TET has different “enharmonic equivalences” than 12.

      Sometimes useful – though I often prefer the straight linear algebra view.
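In three dimensions the wedge product followed by the Hodge dual is the ordinary cross product, up to sign and ordering conventions, so the two examples above can be checked with a few lines of Python. This is a sketch of that shortcut rather than the general exterior-algebra machinery; the function name `val_from_commas` and the choice to normalize the sign and divide out common factors are assumptions made here.

```python
from math import gcd

def val_from_commas(c1, c2):
    """Cross product of two comma vectors (exponents of 2, 3, 5), normalized."""
    x = (c1[1] * c2[2] - c1[2] * c2[1],
         c1[2] * c2[0] - c1[0] * c2[2],
         c1[0] * c2[1] - c1[1] * c2[0])
    g = gcd(gcd(abs(x[0]), abs(x[1])), abs(x[2]))   # remove any common factor
    sign = -1 if x[0] < 0 else 1                    # make the octave entry positive
    return tuple(sign * v // g for v in x)

syntonic_comma = (-4, 4, -1)    # 81/80
lesser_diesis = (7, 0, -3)      # 128/125
small_diesis = (-10, -1, 5)     # 3125/3072

print(val_from_commas(syntonic_comma, lesser_diesis))   # (12, 19, 28)
print(val_from_commas(syntonic_comma, small_diesis))    # (19, 30, 44)
```

Each resulting covector gives 0 when paired with either of the two commas that produced it, which is what it means for those commas to be tempered out.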
