Numerical constants are not a red flag

I do not understand Yitang Zhang’s claimed proof of an improvement to the zero-free region of a Dirichlet L-function. That will not stop me from weighing in on some of the discourse I’ve seen around the proof, because let’s be honest, nobody else commenting on the proof, other than experts in hard analytic number theory, has any chance of understanding it either. And since Peking University and Nature both have announcements of the proof that are not entirely correct, I can at least promise that what follows will be better than their discussion 🙂

Since the announcement of the proof, several comments on r/math, among other places, have observed how unusual it is that Zhang’s proof contains specific (and, in the minds of the commenters, arbitrary-looking) constants. This seems like the final form of the meme that “real mathematicians don’t work with numbers, only concepts” that one often hears undergraduate math majors repeat, but that is simply not true. Zhang claims an explicit bound on the width of the zero-free region, and while his choice of the exact size of the bound was a little bit arbitrary, this is not entirely unusual. [1] I refer to two papers that I recently read which also have explicit bounds and as such have actual numbers all over the place: Dolgopyat’s method and the fractal uncertainty principle and Efficient algorithms for solving the p-Laplacian in polynomial time. In an upcoming paper that I have with some coauthors, we currently have a bound

\displaystyle L \leq \max(23000 d^{3/2} c_N^{-3/2}, 80^3 ||\partial^2_{xy} \Phi||_{C^0}^3 c_N^{-3}, (2\theta^2)^{-3/4}, (160c_N^2)^{-3})

though before submitting for publication we may optimize this estimate to make the constants somewhat smaller. If we do, they will definitely look less round and more arbitrary.

Anyways, let’s take a look at Zhang’s paper and see where some of the constants that people seem most suspicious of are coming from. Those constants are on page 10 in the draft that’s currently on the arXiv and are

\displaystyle \iota_2 = 0.94977 - 1.38995i, ~ \iota_3 = -1.00635 - 0.22789i, ~\iota_4 = -0.68738 + 1.60688i .

Where do these “iotas” come from, and where are they going? (This will be a reformatting of a reddit comment I made.)

The key estimate is Proposition 2.5, which bounds a quantity defined on the bottom of page 9. (Don’t ask me what the significance of that quantity is.) By Cauchy’s product inequality

\displaystyle \alpha \beta \leq \frac{\alpha^2}{2\varepsilon} + \frac{\varepsilon \beta^2}{2}

he needs to prove (2.32) and (2.33), where (2.32) is an estimate of the form

\displaystyle \alpha^2 < 0.001aP

and (2.33) is of the form

\displaystyle \beta^2 < 3000aP

for some a, P that are determined elsewhere in the proof. Note that if (2.33) were stronger (that is, had a constant smaller than 3000), he would not need such a strong bound in (2.32). However, on page 100 he proves (2.33) by splitting a sum into a dominant term and a bunch of error terms, and then doing the sloppy thing and showing that each of those error terms is at most 100aP or something like that. The result is that he gets a big constant 3000 at the end.

So Zhang needs a small constant in (2.32) to pay for his sloppiness in (2.33) — in other words he needs to choose \varepsilon suitably. Messing around with Cauchy’s product inequality (I did this in desmos), you can convince yourself that the constant in (2.32) can be a little bit bigger than 0.001, but Zhang decided to round down to make the statement of (2.32) less messy.
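To make the trade-off concrete, here is a minimal Python sketch of the kind of messing around I mean. The names `product_bound` and `best_eps` are my own, not Zhang's, and I am only assuming the two estimates in the forms stated above, \alpha^2 < A \cdot aP and \beta^2 < B \cdot aP with A = 0.001 and B = 3000. I am not claiming this is how Zhang chose \varepsilon, and I have not checked what final bound on \alpha\beta the paper actually requires.

```python
import math

def product_bound(A: float, B: float, eps: float) -> float:
    """Coefficient C such that alpha*beta < C*aP, assuming
    alpha^2 < A*aP and beta^2 < B*aP, via
    alpha*beta <= alpha^2/(2*eps) + eps*beta^2/2."""
    return A / (2 * eps) + eps * B / 2

def best_eps(A: float, B: float) -> float:
    """The eps > 0 minimizing product_bound; the minimum value is sqrt(A*B)."""
    return math.sqrt(A / B)

A, B = 0.001, 3000.0  # constants appearing in (2.32) and (2.33)
eps = best_eps(A, B)
print(f"best eps = {eps:.3e}, bound = {product_bound(A, B, eps):.5f}")

# Moving eps away from the optimum only makes the bound worse.
for factor in (0.5, 2.0):
    print(factor, product_bound(A, B, factor * eps))
```

With the best choice of \varepsilon the coefficient is \sqrt{AB}, so only the product of the two constants matters: a sloppier (larger) constant in (2.33) has to be paid for by a proportionally smaller one in (2.32).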

So now we turn to page 99 to see the proof of (2.32), which comes down to an estimate of the form

\displaystyle c_1 + c_2 + 2Re(c_3) < 0.001.

Here the constant c_3 presumably follows easily from the iotas, which suggests that he made those choices so that -Re(c_3) > 0.69951 would be just barely smaller than max(c_1, c_2) = 6.9955. The constants c_1 and c_2 are calculated in Sections 8 and 9 respectively. Anyways, since Sections 8 and 9 depend on the iotas as well, it would not be unreasonable to speculate that, using some numerical analysis, he found that as long as the iotas lived in some small ball in the complex plane they would be OK, and then rounded to sufficient precision to obtain the stated values.

I suspect that computing c_1, c_2 is probably the hard part of the proof, so I’m not going to try to understand it. However, skimming Sections 8 and 9, it seems like c_1, c_2 are obtained by reducing a whole lot of analytic number theory voodoo to the computation of finitely many one-dimensional integrals and then using a computer to evaluate the integrals. This constrains what you can choose the iotas to be, and then by explicit numerical computation one obtains them as given in Section 2.

Zhang’s proof may yet be wrong. Simply because the problem in question is notorious for drawing incorrect proofs (including one due to Zhang himself), I would be suspicious of any claimed proof until the experts can validate it. However, the fact that he has numerically significant constants in his proof is not a red flag. In fact, since he claims explicit bounds, I would be far more suspicious if this were not true.

[1] One of these days someone is going to write a paper with explicit bounds 69 and 420, and the journal referee will have to decide if this is too unprofessional to publish.

