Patterns That Eventually Fail

Sometimes patterns can lead you astray. For example, it’s known that

\displaystyle{ \mathrm{li}(x) = \int_0^x \frac{dt}{\ln t} }

is a good approximation to \pi(x), the number of primes less than or equal to x. (Since the integrand blows up at t = 1, the integral is taken as a Cauchy principal value.) Numerical evidence suggests that \mathrm{li}(x) is always greater than \pi(x). For example,

\mathrm{li}(10^{12}) - \pi(10^{12}) = 38,263

and

\mathrm{li}(10^{24}) - \pi(10^{24}) = 17,146,907,278
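
If you want to poke at this yourself, here is a quick sketch in Python using mpmath and sympy (my addition; the figures for 10^{12} and 10^{24} above come from much heavier computations):

```python
from mpmath import li
from sympy import primepi

# A small check of li(x) - pi(x) at a modest x; li here is the
# principal-value logarithmic integral, primepi counts primes.
x = 10**6
print(li(x) - int(primepi(x)))   # about 130, and positive, as expected
```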

But in 1914, Littlewood heroically showed that in fact, \mathrm{li}(x) - \pi(x) changes sign infinitely many times!

This raised the question: when does \pi(x) first exceed \mathrm{li}(x)? In 1933, Littlewood’s student Skewes showed, assuming the Riemann hypothesis, that it must do so for some x less than or equal to

\displaystyle{ 10^{10^{10^{34}}} }

Later, in 1955, Skewes showed without the Riemann hypothesis that \pi(x) must exceed \mathrm{li}(x) for some x smaller than

\displaystyle{ 10^{10^{10^{964}}} }

By now this bound has been improved enormously. We now know the two functions cross somewhere near 1.397 \times 10^{316}, but we don’t know if this is the first crossing!

All this math is quite deep. Here is something less deep, but still fun.

You can show that

\displaystyle{ \int_0^\infty \frac{\sin t}{t} \, dt = \frac{\pi}{2} }

\displaystyle{ \int_0^\infty \frac{\sin t}{t} \, \frac{\sin \left(\frac{t}{101}\right)}{\frac{t}{101}} \, dt = \frac{\pi}{2} }

\displaystyle{ \int_0^\infty \frac{\sin t}{t} \, \frac{\sin \left(\frac{t}{101}\right)}{\frac{t}{101}} \, \frac{\sin \left(\frac{t}{201}\right)}{\frac{t}{201}} \, dt = \frac{\pi}{2} }

\displaystyle{ \int_0^\infty \frac{\sin t}{t} \, \frac{\sin \left(\frac{t}{101}\right)}{\frac{t}{101}} \, \frac{\sin \left(\frac{t}{201}\right)}{\frac{t}{201}} \, \frac{\sin \left(\frac{t}{301}\right)}{\frac{t}{301}} \, dt = \frac{\pi}{2} }

and so on.
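
As a sanity check, sympy can verify the base case symbolically; this little sketch is my addition, and the later products are much harder to check by direct numerical integration (the integrands oscillate and decay slowly), which is part of what makes the eventual failure so sneaky:

```python
from sympy import symbols, sin, integrate, oo, pi

# Symbolic check of the classic Dirichlet integral, the first
# identity in the pattern above.
t = symbols('t', positive=True)
print(integrate(sin(t)/t, (t, 0, oo)) == pi/2)   # True
```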

It’s a nice pattern. But this pattern doesn’t go on forever! It lasts a very, very long time… but not forever.

More precisely, the identity

\displaystyle{ \int_0^\infty \frac{\sin t}{t} \, \frac{\sin \left(\frac{t}{101}\right)}{\frac{t}{101}} \, \frac{\sin \left(\frac{t}{201}\right)}{\frac{t}{201}} \cdots \, \frac{\sin \left(\frac{t}{100 n +1}\right)}{\frac{t}{100 n + 1}} \, dt = \frac{\pi}{2} }

holds when

n < 9.8 \cdot 10^{42}

but not for all n. At some point it stops working and never works again. In fact, it definitely fails for all

n > 7.4 \cdot 10^{43}
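
Granting the criterion explained below (the integral equals \pi/2 exactly as long as \frac{1}{101} + \frac{1}{201} + \cdots + \frac{1}{100n+1} \le 1), here is a small exact-arithmetic sketch in Python, my addition, showing how absurdly slowly this sum creeps toward 1:

```python
from fractions import Fraction

# Exact partial sums of 1/101 + 1/201 + 1/301 + ...; as explained
# below, the integral equals pi/2 exactly while this sum is at most 1.
s = Fraction(0)
for n in range(1, 6):
    s += Fraction(1, 100*n + 1)
    print(n, float(s))
# Prints sums around 0.0099, 0.0149, 0.0182, ...; since the sum grows
# like (1/100) ln n, it takes n of order e^100 ~ 10^43 to reach 1.
```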

The explanation

The integrals here are a variant of the Borwein integrals:

\displaystyle{ \int_0^\infty \frac{\sin(x)}{x} \, dx= \frac{\pi}{2} }

\displaystyle{ \int_0^\infty \frac{\sin(x)}{x}\frac{\sin(x/3)}{x/3} \, dx = \frac{\pi}{2} }

\displaystyle{ \int_0^\infty \frac{\sin(x)}{x}\, \frac{\sin(x/3)}{x/3} \, \frac{\sin(x/5)}{x/5} \, dx = \frac{\pi}{2} }

where the pattern continues until

\displaystyle{ \int_0^\infty \frac{\sin(x)}{x} \, \frac{\sin(x/3)}{x/3}\cdots\frac{\sin(x/13)}{x/13} \, dx = \frac{\pi}{2} }

but then fails:

\displaystyle{\int_0^\infty \frac{\sin(x)}{x} \, \frac{\sin(x/3)}{x/3}\cdots \frac{\sin(x/15)}{x/15} \, dx \approx \frac \pi 2 - 2.31\times 10^{-11} }
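
To see where that tiny deficit comes from without doing any hard integration, here is a short Python sketch, my addition, that plugs into the first-violation correction formula from the Wikipedia article on Borwein integrals (valid when the sum s = \frac{1}{3} + \cdots + \frac{1}{15} only slightly exceeds 1):

```python
from fractions import Fraction
from math import factorial, pi

# Once s = a_1 + ... + a_n first exceeds 1, the integral equals
#   (pi/2) * (1 - (s - 1)^n / (2^(n-1) n! a_1 ... a_n)).
a = [Fraction(1, 2*k + 1) for k in range(1, 8)]   # 1/3, 1/5, ..., 1/15
n = len(a)
s = sum(a)                                        # = 1.0218..., just over 1
prod = Fraction(1)
for ak in a:
    prod *= ak
deficit = (pi/2) * float((s - 1)**n / (2**(n - 1) * factorial(n) * prod))
print(deficit)                                    # about 2.31e-11
```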

I never understood this until I read Greg Egan’s explanation, based on the work of Hanspeter Schmid. It’s all about convolution and Fourier transforms:

Suppose we have a rectangular pulse, centred on the origin, with a height of 1/2 and a half-width of 1.

Now, suppose we keep taking moving averages of this function, again and again, with the average computed in a window of half-width 1/3, then 1/5, then 1/7, 1/9, and so on.

There are a couple of features of the original pulse that will persist completely unchanged for the first few stages of this process, but then they will be abruptly lost at some point.

The first feature is that F(0) = 1/2. In the original pulse, the point (0,1/2) lies on a plateau, a perfectly constant segment with a half-width of 1. The process of repeatedly taking the moving average will nibble away at this plateau, shrinking its half-width by the half-width of the averaging window. So, once the sum of the windows’ half-widths exceeds 1, at 1/3+1/5+1/7+…+1/15, F(0) will suddenly fall below 1/2, but up until that step it will remain untouched.

In the animation below, the plateau where F(x)=1/2 is marked in red.

The second feature is that F(–1)=F(1)=1/4. In the original pulse, we have a step at –1 and 1, but if we define F here as the average of the left-hand and right-hand limits we get 1/4, and once we apply the first moving average we simply have 1/4 as the function’s value.

In this case, F(–1)=F(1)=1/4 will continue to hold so long as the points (–1,1/4) and (1,1/4) are surrounded by regions where the function has a suitable symmetry: it is equal to an odd function, offset and translated from the origin to these centres. So long as that’s true for a region wider than the averaging window being applied, the average at the centre will be unchanged.

The initial half-width of each of these symmetrical slopes is 2 (stretching from the opposite end of the plateau and an equal distance away along the x-axis), and as with the plateau, this is nibbled away each time we take another moving average. And in this case, the feature persists until 1/3+1/5+1/7+…+1/113, which is when the sum first exceeds 2.

In the animation, the yellow arrows mark the extent of the symmetrical slopes.

OK, none of this is difficult to understand, but why should we care?

Because this is how Hanspeter Schmid explained the infamous Borwein integrals:

∫sin(t)/t dt = π/2
∫sin(t/3)/(t/3) × sin(t)/t dt = π/2
∫sin(t/5)/(t/5) × sin(t/3)/(t/3) × sin(t)/t dt = π/2

∫sin(t/13)/(t/13) × … × sin(t/3)/(t/3) × sin(t)/t dt = π/2

But then the pattern is broken:

∫sin(t/15)/(t/15) × … × sin(t/3)/(t/3) × sin(t)/t dt < π/2

Here these integrals are from t=0 to t=∞. And Schmid came up with an even more persistent pattern of his own:

∫2 cos(t) sin(t)/t dt = π/2
∫2 cos(t) sin(t/3)/(t/3) × sin(t)/t dt = π/2
∫2 cos(t) sin(t/5)/(t/5) × sin(t/3)/(t/3) × sin(t)/t dt = π/2

∫2 cos(t) sin(t/111)/(t/111) × … × sin(t/3)/(t/3) × sin(t)/t dt = π/2

But:

∫2 cos(t) sin(t/113)/(t/113) × … × sin(t/3)/(t/3) × sin(t)/t dt < π/2

The first set of integrals, due to Borwein, corresponds to taking the Fourier transforms of our sequence of ever-smoother pulses and then evaluating F(0). The Fourier transform of the sinc function:

sinc(w t) = sin(w t)/(w t)

is proportional to a rectangular pulse of half-width w, and the Fourier transform of a product of sinc functions is the convolution of their transforms, which in the case of a rectangular pulse just amounts to taking a moving average.

Schmid’s integrals come from adding a clever twist: the extra factor of 2 cos(t) shifts the integral from the zero-frequency Fourier component to the sum of its components at angular frequencies –1 and 1, and hence the result depends on F(–1)+F(1)=1/2, which as we have seen persists for much longer than F(0)=1/2.

• Hanspeter Schmid, Two curious integrals and a graphic proof, Elem. Math. 69 (2014) 11–17.
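
To watch the plateau get nibbled away, here is a small numpy sketch of the moving-average process; it is my own toy illustration, with invented half-widths 0.4, 0.35, 0.3 chosen so the running sum crosses 1 at the third step and the drop in F(0) is big enough to see on a grid:

```python
import numpy as np

# Start with Egan's pulse: height 1/2, half-width 1, then take moving
# averages with half-widths whose running sums are 0.4, 0.75, 1.05.
dx = 1e-3
x = np.arange(-4, 4, dx)
F = np.where(np.abs(x) <= 1, 0.5, 0.0)
centre = np.argmin(np.abs(x))            # grid index of x = 0

for w in [0.4, 0.35, 0.3]:
    m = 2 * int(round(w / dx)) + 1       # odd window spanning half-width w
    F = np.convolve(F, np.ones(m) / m, mode='same')
    print(f"half-width {w}: F(0) = {F[centre]:.6f}")
```

The first two averages leave F(0) at exactly 0.500000, and the third drops it just below 1/2. With the true half-widths 1/3, 1/5, …, 1/15 (or 1/101, 1/201, … for the variant above) the same mechanism operates, but the first drop in F(0) is of order 10^{-11} or smaller, far below what a grid like this can resolve.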

I asked Greg if we could generalize these results to give even longer sequences of identities that eventually fail, and he showed me how: you can just take the Borwein integrals and replace the numbers 1, 1/3, 1/5, 1/7, … by some sequence of positive numbers

1, a_1, a_2, a_3 \dots

The integral

\displaystyle{\int_0^\infty \frac{\sin(x)}{x} \, \frac{\sin(a_1 x)}{a_1 x} \, \frac{\sin(a_2 x)}{a_2 x} \cdots \frac{\sin(a_n x)}{a_n x} \, dx }

will then equal \pi/2 as long as a_1 + \cdots + a_n \le 1, but not when it exceeds 1. You can see a full explanation on Wikipedia:

• Wikipedia, Borwein integral: general formula.

As an example, I chose the integral

\displaystyle{ \int_0^\infty \frac{\sin t}{t} \, \frac{\sin \left(\frac{t}{101}\right)}{\frac{t}{101}} \, \frac{\sin \left(\frac{t}{201}\right)}{\frac{t}{201}} \cdots \, \frac{\sin \left(\frac{t}{100 n +1}\right)}{\frac{t}{100 n + 1}} \, dt  }

which equals \pi/2 if and only if

\displaystyle{ \sum_{k=1}^n \frac{1}{100 k + 1} \le 1  }

Thus, the identity holds if

\displaystyle{ \sum_{k=1}^n \frac{1}{100 k} \le 1  }

However,

\displaystyle{ \sum_{k=1}^n \frac{1}{k} \le 1 + \ln n }

so the identity holds if

\displaystyle{ \frac{1}{100} (1 + \ln n) \le 1 }

or

\ln n \le 99

or

n \le e^{99} \approx 9.8 \cdot 10^{42}

On the other hand, the identity fails if

\displaystyle{ \sum_{k=1}^n \frac{1}{100 k + 1} > 1  }

so it fails if

\displaystyle{ \sum_{k=1}^n \frac{1}{101 k} > 1  }

However,

\displaystyle{ \sum_{k=1}^n \frac{1}{k} \ge \ln n }

so the identity fails if

\displaystyle{ \frac{1}{101} \ln n > 1 }

or

\displaystyle{ \ln n > 101}

or

\displaystyle{n > e^{101} \approx 7.4 \cdot 10^{43} }

With a little work one could sharpen these estimates considerably, though it would take more work to find the exact value of n at which

\displaystyle{ \int_0^\infty \frac{\sin t}{t} \, \frac{\sin \left(\frac{t}{101}\right)}{\frac{t}{101}} \, \frac{\sin \left(\frac{t}{201}\right)}{\frac{t}{201}} \cdots \, \frac{\sin \left(\frac{t}{100 n +1}\right)}{\frac{t}{100 n + 1}} \, dt = \frac{\pi}{2} }

first fails.
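
If you'd like to check these bounds numerically, here is a minimal mpmath sketch, my addition rather than part of the original derivation, using the digamma identity \sum_{k=1}^n \frac{1}{100k+1} = \frac{1}{100}\left(\psi\left(n+\frac{101}{100}\right)-\psi\left(\frac{101}{100}\right)\right), which also appears in Greg Egan’s comment below:

```python
from mpmath import mp, mpf, digamma

mp.dps = 60   # generous precision

def S(n):
    """sum_{k=1}^n 1/(100k+1) = (psi(n + 101/100) - psi(101/100)) / 100."""
    a = mpf(101) / 100
    return (digamma(n + a) - digamma(a)) / 100

print(S(10**42))   # about 0.973 < 1: the identity still holds here
print(S(10**44))   # about 1.019 > 1: the identity has already failed
```

The same trick, pushed to enough digits, pins down the exact crossover; see the comments.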

22 Responses to Patterns That Eventually Fail

  1. Antonio says:

    There’s a typo in the definition of li(x)

  2. Greg Egan says:

    I believe the first n for which the pattern fails is:

    15,341,178,777,673,149,429,167,740,440,969,249,338,310,889

    The sum can be rewritten in terms of digamma functions:

    \displaystyle{\sum _{k=1}^n \frac{1}{100 k+1} = \frac{1}{100} \left(\psi\left(n+\frac{101}{100}\right)-\psi\left(\frac{101}{100}\right)\right)}

    which Mathematica can calculate with a Taylor series (or something similar), so it’s possible to evaluate it to sufficient precision to be sure that the value is greater than 1 for this n, and less than 1 for n-1.

    • Greg Egan says:

      And the Schmid integrals, with their extra factor of 2 \cos(t), will stick to the pattern until the sum first exceeds 2, at n equal to:

      412,388,856,479,291,008,968,946,990,055,780,778,304,566,005,151,152,521,580,673,941,188,473,456,899,173,626,864,612

    • John Baez says:

      Greg wrote:

      I believe the first n for which the pattern fails is:

      15,341,178,777,673,149,429,167,740,440,969,249,338,310,889

      That’s great! I should have read your comment before I tweeted about this today, so I could include the exact number. But I composed the tweet yesterday after merely working out an estimate.

      I didn’t know that the digamma function

      \displaystyle{ \psi(x)=\frac{d}{dx}\ln\big(\Gamma(x)\big)=\frac{\Gamma'(x)}{\Gamma(x)} }

      obeys

      \displaystyle{ \psi(x+N)-\psi(x)=\sum_{k=0}^{N-1} \frac{1}{x+k} }

      So, I didn’t know a good strategy to compute the exact number.

  3. The explanation by Greg and Schmid is really insightful. Rademacher functions are step functions (the Fourier transform of the sinc function gets you back to the square wave) and the integral of their product over (0,1) is zero:
    \int_0^1 r_1(t) r_2(t) \cdots r_n(t) \, dt = 0 for any n

  4. Regarding patterns, I see this in the news:

    “Famed mathematician claims proof of 160-year-old Riemann hypothesis”:

    https://www.newscientist.com/article/2180406-famed-mathematician-claims-proof-of-160-year-old-riemann-hypothesis/

    • John Baez says:

      Yes, and you can see my response on Twitter, but I don’t really want to talk about that here.

      More relevant is the question of whether numerical evidence for the Riemann hypothesis is really convincing!

      In 2004, Gourdon and Demichel showed that the first 10,000,000,000,000 nontrivial zeros of the Riemann zeta function lie on the line Re(z) = 1/2, just as the Riemann hypothesis claims. They also checked two billion much larger zeros.

      Normally this would seem pretty convincing. But some argue otherwise! According to Wikipedia:

      At first, the numerical verification that many zeros lie on the line seems strong evidence for it. However, analytic number theory has had many conjectures supported by large amounts of numerical evidence that turn out to be false. See Skewes’ number for a notorious example, where the first exception to a plausible conjecture related to the Riemann hypothesis probably occurs around 10^{316}; a counterexample to the Riemann hypothesis with imaginary part this size would be far beyond anything that can currently be computed using a direct approach. The problem is that the behavior is often influenced by very slowly increasing functions such as log log T, that tend to infinity, but do so so slowly that this cannot be detected by computation. Such functions occur in the theory of the zeta function controlling the behavior of its zeros; for example the function S(T) above has average size around (log log T)^{1/2}. As S(T) jumps by at least 2 at any counterexample to the Riemann hypothesis, one might expect any counterexamples to the Riemann hypothesis to start appearing only when S(T) becomes large. It is never much more than 3 as far as it has been calculated, but is known to be unbounded, suggesting that calculations may not have yet reached the region of typical behavior of the zeta function.

      Emphasis mine.

  5. Māris Ozols says:

    I think there is a typo in the condition on a_i. It should be 1/a_1 + \dots 1/a_n \leq 1.

  6. […] A mathematical pattern that fails after about 10^43 examples […] (pingback from Hacker News)

  7. Severin Pappadeux says:

    Well, considering the series 1/2 + 1/4 + 1/8 + \cdots, whose partial sums are always less than 1,

    \displaystyle{ \int dx \frac{\sin{x}}{x} \, \frac{\sin{x/2}}{x/2} \, \frac{\sin{x/4}}{x/4} \cdots }

    is always equal to \pi/2

  8. L Spice says:

    As things stand, I read your discussion of how you found the bounds on the range of n where your example first fails as saying, for example,

    the identity holds if \sum_{k=1}^n \frac{1}{100 k} \le 1 but \sum_{k=1}^n \frac{1}{k} \le 1 + \ln n

    (i.e., that the first statement, and the second notionally opposed statement, have to hold). I think that it would be clearer with commas (which would, perhaps unfortunately, have to be part of the displayed math):

    the identity holds if \sum_{k=1}^n \frac{1}{100 k} \le 1, but \sum_{k=1}^n \frac{1}{k} \le 1 + \ln n

    to emphasise that the ‘but’ is helping you narrow down the range of n to be considered, not directly imposing a further condition on n.

    • John Baez says:

      Good point! I’m glad someone is reading this stuff in a careful and critical way.

      Instead of making one little comma carry such a heavy burden, I think I’ll deploy that wonderful word “However”.

  9. The Borweins have been at this stuff for a while – see their 2007 paper with Robert Baillie, “Surprising Sinc Sums and Integrals”, which sets out the Fourier-theoretic background to the compelling visual explanations given by Greg Egan and Hanspeter Schmid.

    In Fourier-theoretic terms you are transforming, via convolutions, a unit box signal, which gets you to a sinc wave in the transform space. Along the way you get the “localisation–nonlocalisation effect”, i.e. if something is concentrated in an interval of length L, then in the transform space it cannot be located in an interval essentially shorter than 1/L – Heisenberg uncertainty if you like, but really just a property of Fourier transforms. It is still surprising that the pattern breaks down, and Hanspeter Schmid really nails it by referring to the erosion of the plateau of the box function in the convolution process. He has come at it from a signal processing angle, I think.

    As an aside, Jonathan Borwein died in 2016 while at Newcastle University, Australia. He was a Fellow of the Australian Academy of Science.

  10. Mark Meckes says:

    \int_0^\infty \frac{\sin(x) \cos(ax)}{x} \, dx = \frac{\pi}{2} if a is less than 1, \frac{\pi}{4} if a equals 1, and 0 if a is bigger than 1. So you can let a_n be any sequence that takes a long time to exceed 1, like n/10^{10^{10^{34}}}, to get a similar example.

    It’s just a cheap variation of what you discussed in the post, and works for a simpler version of the same reason.

  11. Blake Stacey says:

    The link for the animation is broken; I think it has an extra closing parenthesis.

  12. Ishi Crew says:

    As an irrelevant analogy, this reminds me of someone who is driving a car and has a series of drinks, each smaller than the previous one. Eventually they may go off the road and stop.
