Gheorghe Craciun is a mathematician at the University of Wisconsin who recently proved the Global Attractor Conjecture, which had been the most famous conjecture in mathematical chemistry since 1974. This week he visited U. C. Riverside and gave a talk on this subject. But he also told me about something else, something quite remarkable.
A peptide is basically a small protein: a chain made of fewer than 50 amino acids. If you plot the number of peptides of different masses found in various organisms, you see peculiar oscillations:
These oscillations have a period of about 14 daltons, where a ‘dalton’ is roughly the mass of a hydrogen atom, or more precisely, 1/12 the mass of a carbon-12 atom.
Biologists had noticed these oscillations in databases of peptide masses. But they didn’t understand them.
Can you figure out what causes these oscillations?
It’s a math puzzle, actually.
Next I’ll give you the answer, so stop looking if you want to think about it first.
Almost all peptides are made of 20 different amino acids, which have different masses, which are almost integers. So, to a reasonably good approximation, the puzzle amounts to this: if you have 20 natural numbers $m_1, \dots, m_{20}$, how many ways can you write any natural number $N$ as a finite ordered sum of these numbers? Call it $F(N)$ and graph it. It oscillates! Why?
(We count ordered sums because the amino acids are stuck together in a linear way to form a protein.)
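There’s a well-known way to write down a formula for $F(N)$, the number of such ordered sums. Writing $m_1, \dots, m_{20}$ for the masses, it obeys a linear recurrence:
$$ F(N) = F(N - m_1) + \cdots + F(N - m_{20}) $$
and we can solve this using the ansatz
$$ F(N) = x^N $$
Then the recurrence relation will hold if
$$ x^N = x^{N - m_1} + x^{N - m_2} + \cdots + x^{N - m_{20}} $$
for all $N$. But this is fairly easy to achieve! If $m_{20}$ is the biggest mass, we just need this polynomial equation to hold:
$$ x^{m_{20}} = x^{m_{20} - m_1} + x^{m_{20} - m_2} + \cdots + 1 $$
There will be a bunch of solutions, about $m_{20}$ of them. (If there are repeated roots things get a bit more subtle, but let’s not worry about that.) To get the actual formula for $F(N)$ we need to find the right linear combination of functions $x^N$ where $x$ ranges over all the roots. That takes some work. Craciun and his collaborator Shane Hubler did that work.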
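If you'd like to graph the count yourself, here is a minimal Python sketch of the recurrence. The function name `count_ordered_sums` is my own, and I'm assuming the natural starting condition that there is exactly one way to write 0 (the empty sum) and no way to write a negative number.

```python
def count_ordered_sums(masses, n_max):
    """Return [F(0), ..., F(n_max)], where F(N) is the number of ordered
    sums of elements of `masses` that add up to N."""
    counts = [0] * (n_max + 1)
    counts[0] = 1  # assumption: the empty sum is the one way to write 0
    for n in range(1, n_max + 1):
        # The last term of the sum is some mass m; the rest sums to n - m.
        counts[n] = sum(counts[n - m] for m in masses if m <= n)
    return counts


print(count_ordered_sums([1, 2], 6))  # [1, 1, 2, 3, 5, 8, 13]
```

A mass that appears twice in the list is counted twice, which is what you want if two different amino acids happen to have the same mass.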
But we can get a pretty good understanding with a lot less work. In particular, the root $x$ with the largest magnitude will make $x^N$ grow the fastest.
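The positive real root is easy to find numerically. Dividing the polynomial equation by its highest power turns it into "the sum of $x^{-m}$ over all masses $m$ equals 1", and the left side strictly decreases for $x > 0$, so plain bisection works. This is just a sketch of one way to do it, not the method from the paper, and the names `dominant_root` and `excess` are made up for illustration.

```python
def dominant_root(masses, iterations=200):
    """Positive real root of x^M = sum of x^(M - m) over the masses,
    where M is the largest mass.

    Dividing by x^M gives sum(x^-m) = 1. The left side decreases
    strictly for x > 0, so there is exactly one positive root.
    """
    def excess(x):
        return sum(x ** -m for m in masses) - 1.0

    lo, hi = 1.0, 2.0
    while excess(hi) > 0:  # widen the bracket until the root is inside
        hi *= 2.0
    for _ in range(iterations):  # plain bisection on [lo, hi]
        mid = (lo + hi) / 2.0
        if excess(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


print(dominant_root([1, 2]))  # close to 1.618, the golden ratio
```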
If you haven’t thought about this sort of recurrence relation it’s good to look at the simplest case, where we just have two masses $m_1 = 1$ and $m_2 = 2$. Then the numbers $F(N)$ are the Fibonacci numbers. I hope you know this: the $N$th Fibonacci number (starting the count from $F(0) = F(1) = 1$) is the number of ways to write $N$ as the sum of an ordered list of 1’s and 2’s!
1+1+1, 1+2, 2+1
1+1+1+1, 1+1+2, 1+2+1, 2+1+1, 2+2
If I drew edges between these sums in the right way, forming a ‘family tree’, you’d see the connection to Fibonacci’s original rabbit puzzle.
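In this example the recurrence gives the polynomial equation
$$ x^2 = x + 1 $$
and the root with largest magnitude is the golden ratio:
$$ \Phi = \frac{1 + \sqrt{5}}{2} $$
The other root is
$$ 1 - \Phi = \frac{1 - \sqrt{5}}{2} $$
With a little more work you get an explicit formula for the Fibonacci numbers in terms of the golden ratio:
$$ F(N) = \frac{1}{\sqrt{5}} \left( \Phi^{N+1} - (1 - \Phi)^{N+1} \right) $$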
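Here is a sketch of that "little more work", in case you want to check it. I'm assuming the convention $F(0) = F(1) = 1$, so that $F(N)$ counts ordered sums of 1's and 2's; the coefficients $a$ and $b$ are names introduced just for this computation.

```latex
\begin{align*}
F(N) &= a\,\Phi^N + b\,(1-\Phi)^N, \qquad \Phi = \tfrac{1+\sqrt{5}}{2} \\
F(0) &= a + b = 1 \\
F(1) &= a\,\Phi + b\,(1-\Phi) = 1 \\
\Rightarrow\quad a &= \frac{\Phi}{2\Phi - 1} = \frac{\Phi}{\sqrt{5}},
  \qquad b = 1 - a = -\frac{1-\Phi}{\sqrt{5}} \\
F(N) &= \frac{1}{\sqrt{5}}\left(\Phi^{N+1} - (1-\Phi)^{N+1}\right)
\end{align*}
```

Since $|1 - \Phi| < 1$, the second term shrinks to zero and $F(N)$ is asymptotically proportional to $\Phi^N$.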
But right now I’m more interested in the qualitative aspects! In this example both roots are real. The example from biology is different.
Puzzle 1. For which lists of natural numbers $m_1 < \cdots < m_k$ are all the roots of
$$ x^{m_k} = x^{m_k - m_1} + x^{m_k - m_2} + \cdots + 1 $$
real?
I don’t know the answer. But apparently this kind of polynomial equation always has one root with the largest possible magnitude, which is real and has multiplicity one. I think it turns out that $F(N)$ is asymptotically proportional to $x^N$ where $x$ is this root.
But in the case that’s relevant to biology, there’s also a pair of roots with the second largest magnitude, which are not real: they’re complex conjugates of each other. And these give rise to the oscillations!
For the masses of the 20 amino acids most common in life, the roots look like this:
The aqua root at right has the largest magnitude and gives the dominant contribution to the exponential growth of $F(N)$. The red roots have the second largest magnitude. These give the main oscillations in $F(N)$, which have period 14.28.
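To see why a complex-conjugate pair of roots produces oscillations, here is a short worked computation. The symbols $c$, $r$, $\theta$ and $\varphi$ are my own notation, not the paper's. Since the count is a real number, the two conjugate roots must enter the linear combination with conjugate coefficients.

```latex
\begin{align*}
x &= r\,e^{i\theta}, \qquad c = |c|\,e^{i\varphi} \\
c\,x^N + \bar{c}\,\bar{x}^N &= 2\,\mathrm{Re}\!\left(c\,x^N\right)
  = 2\,|c|\,r^N \cos(N\theta + \varphi) \\
\text{period} &= \frac{2\pi}{\theta}, \qquad
  \text{so a period of } 14.28 \text{ corresponds to }
  \theta \approx \frac{2\pi}{14.28} \approx 0.44 \text{ radians}
\end{align*}
```

So the angle of the root sets the period of the oscillation, while its magnitude $r$, compared to the magnitude of the dominant root, sets how quickly the oscillation fades relative to the overall growth.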
For the full story, read this:
• Shane Hubler and Gheorghe Craciun, Periodic patterns in distributions of peptide masses, BioSystems 109 (2012), 179–185.
Most of the pictures here are from this paper.
My main question is this:
Puzzle 2. Suppose we take many lists of natural numbers $m_1 < \cdots < m_k$ and draw all the roots of the equations
$$ x^{m_k} = x^{m_k - m_1} + x^{m_k - m_2} + \cdots + 1 $$
What pattern do we get in the complex plane?
I suspect that this picture is an approximation to the answer you’d get to Puzzle 2:
If you stare carefully at this picture, you’ll see some patterns, and I’m guessing those are hints of something very beautiful.
Earlier on this blog we looked at roots of polynomials whose coefficients are all 1 or -1:
The pattern is very nice, and it repays deep mathematical study. Here it is, drawn by Sam Derbyshire:
But now we’re looking at polynomials where the leading coefficient is 1 and all the rest are -1 or 0. How does that change things? A lot, it seems!
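If you want to experiment, here is a small Python sketch that builds the coefficient list of such a polynomial from a list of masses. The name `polynomial_coefficients` is hypothetical. Note that the coefficients are only -1 or 0 when the masses are all different; a repeated mass gives a coefficient like -2.

```python
from collections import Counter


def polynomial_coefficients(masses):
    """Coefficients, highest power first, of
    x^M - (sum over masses m of x^(M - m)), where M = max(masses)."""
    top = max(masses)
    coeffs = [0] * (top + 1)
    coeffs[0] = 1  # leading coefficient, for x^M
    for m, multiplicity in Counter(masses).items():
        # Index m holds the coefficient of x^(M - m).
        coeffs[m] -= multiplicity
    return coeffs


print(polynomial_coefficients([1, 2]))  # [1, -1, -1], i.e. x^2 - x - 1
print(polynomial_coefficients([1, 3]))  # [1, -1, 0, -1], i.e. x^3 - x^2 - 1
```

A list in this order can be handed to a numerical root finder to draw the roots in the complex plane.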
By the way, the 20 amino acids we commonly see in biology have masses ranging between 57 and 186. It’s not really true that all their masses are different. Here are their masses:
I pretended that none of the masses are equal in Puzzle 2, and I left out the fact that only about 1/9th of the coefficients of our polynomial are nonzero. This may affect the picture you get!