Let’s Read: Sendov’s conjecture in high degree, part 4: details of case one

In this proof we (finally!) finish the proof of case one.

As usual, we throughout fix a nonstandard natural {n} and a complex polynomial of degree {n} whose zeroes are all in {\overline{D(0, 1)}}. We assume that {a} is a zero of {f} whose standard part is {1}, and assume that {f} has no critical points in {\overline{D(a, 1)}}. Let {\lambda} be a random zero of {f} and {\zeta} a random critical point. Under these circumstances, {\lambda^{(\infty)}} is uniformly distributed on {\partial D(0, 1)} and {\zeta^{(\infty)}} is almost surely zero. In particular,

\displaystyle \mathbf E \log\frac{1}{|\lambda|}, \mathbf E \log |\zeta - a| = O(n^{-1})

and {\zeta} is infinitesimal in probability, hence infinitesimal in distribution. Let {\mu} be the expected value of {\zeta} (thus also of {\lambda}) and {\sigma^2} its variance. I think we won’t need the nonstandard-exponential bound {\varepsilon_0^n} this time, as its purpose was fulfilled last time.

Last time we reduced the proof of case one to a sequence of lemmata. We now prove them.

1. Preliminary bounds

Lemma 1 Let {K \subseteq \mathbf C} be a compact set. Then

\displaystyle f(z) - f(0), ~f'(z) = O((|z| + o(1))^n)

uniformly for {z \in K}.

Proof: It suffices to prove this for a compact exhaustion, and thus it suffices to assume

\displaystyle K = \overline{D(0, R)}.

By underspill, it suffices to show that for every standard {\varepsilon > 0} we have

\displaystyle |f(z) - f(0)|, ~|f'(z)| \leq C(|z| + \varepsilon)^n.

We first give the proof for {f'}.

First suppose that {\varepsilon < |z| \leq R}. Since {\zeta} is infinitesimal in distribution,

\displaystyle \mathbf E \log |z - \zeta| \leq \mathbf E \log \max(|z - \zeta|, \varepsilon/2) \leq \log \max(|z|, \varepsilon/2) + o(1);

here we need the {\varepsilon/2} and the {R} since {\log |z - \zeta|} is not a bounded continuous function of {\zeta}. Since {\varepsilon < |z|} we have

\displaystyle \mathbf E \log |z - \zeta| \leq \log |z| + o(1)

but we know that

\displaystyle -\frac{\log n}{n - 1} - \frac{1}{n - 1} \log |f'(z)| = U_\zeta(z) = -\mathbf E \log |z - \zeta|

so, solving for {\log |f'(z)|}, we get

\displaystyle \log |f'(z)| \leq (n - 1) \log |z| + o(n);

we absorbed a {\log n} into the {o(n)}. That gives

\displaystyle |f'(z)| \leq e^{o(n)} |z|^{n-1}.

Since {f'} is a polynomial of degee {n - 1} and {f} is monic (so the top coefficient of {f'} is {n}) this gives a bound

\displaystyle |f'(z)| \leq e^{o(n)} (|z| + \varepsilon)^{n - 1}

even for {|z| \leq \varepsilon}.

Now for {f}, we use the bound

\displaystyle |f(z) - f(0)| \leq \max_{|w| < |z|} |f'(w)|

to transfer the above argument. \Box

2. Uniform convergence of {\zeta}

Lemma 2 There is a standard compact set {S \subseteq \overline{D(0, 1)}} and a standard countable set {T \subseteq \overline{D(0, 1)} \setminus \overline{D(1, 1)}} such that

\displaystyle S = (\overline{D(0, 1)} \cap \partial D(1, 1)) \cup T,

all elements of {T} are isolated in {S}, and {||\zeta - S||_{L^\infty}} is infinitesimal.

Tao claims

\displaystyle \mathbf P(|\zeta - a| \geq \frac{1}{2m}) = O(n^{-1})

where {m} is a large standard natural, which makes no sense since the left-hand side should be large (and in particular, have positive standard part). I think this is just a typo though.

Proof: Since {\zeta} was assumed far from {a = 1 - o(1)} we have

\displaystyle \zeta \in \overline{D(0, 1)} \setminus D(1, 1 - o(1)).

We also have

\displaystyle \mathbf E \log |\zeta - a| = O(n^{-1})

so for every standard natural {m} there is a standard natural {k_m} such that

\displaystyle \mathbf P(\log |\zeta - a| \geq \frac{1}{2m}) \leq \frac{k_m}{n}.

Multiplying both sides by {n} we see that

\displaystyle \text{card } Z \cap K_m = \text{card } Z \cap \{\zeta_0 \in \overline{D(0, 1)}: \log |\zeta_0 - a| \geq \frac{1}{2m}\} \leq k_m

where {Z} is the variety of critical points {f' = 0}. Let {T_m} be the set of standard parts of zeroes in {K_m}; then {T_m} has cardinality {\leq k_m} and so is finite. For every zero {\zeta_0 \in Z}, either

  1. For every {m},

    \displaystyle |\zeta_0 - a| < \exp\left(\frac{1}{2m}\right)

    so the standard part of {|\zeta_0 - a|} is {1}, or

  2. There is an {m} such that {d(\zeta_0, T_m)} is infinitesimal.

So we may set {T = \bigcup_m T_m}; then {T} is standard and countable, and does not converge to a point in {\partial D(1, 1)}, so {S} is standard and {||\zeta - S||_{L^\infty}} is infinitesimal.

I was a little stumped on why {S} is compact; Tao doesn’t prove this. It turns out it’s obvious, I was just too clueless to see it. The construction of {T} forces that for any {\varepsilon > 0}, there are only finitely many {z \in T} with {|z - \partial D(1, 1)| \geq \varepsilon}, so if {T} clusters anywhere, then it can only cluster on {\partial D(1, 1)}. This gives the desired compactness. \Box

The above proof is basically just the proof of Ascoli’s compactness theorem adopted to this setting and rephrased to replace the diagonal argument (or 👏 KEEP 👏 PASSING 👏 TO 👏 SUBSEQUENCES 👏) with the choice of a nonstandard natural. I think the point is that, once we have chosen a nontrivial ultrafilter on {\mathbf N}, a nonstandard function is the same thing as sequence of functions, and the ultrafilter tells us which subsequences of reals to pass to.

3. Approximating {f,f'} outside of {S}

We break up the approximation lemma into multiple parts. Let {K} be a standard compact set which does not meet {S}. Given a curve {\gamma} we denote its arc length by {|\gamma|}; we always assume that an arc length does exist.

A point which stumped me for a humiliatingly long time is the following:

Lemma 3 Let {z, w \in K}. Then there is a curve {\gamma} from {z} to {w} which misses {S} and satisfies the uniform estimate

\displaystyle |z - w| \sim |\gamma|.

Proof: We use the decomposition of {S} into the arc

\displaystyle S_0 = \partial D(1, 1) \cap \overline{D(0, 1)}

and the discrete set {T}. We try to set {\gamma} to be the line segment {[z, w]} but there are two things that could go wrong. If {[z, w]} hits a point of {T} we can just perturb it slightly by an error which is negligible compared to {[z, w]}. Otherwise we might hit a point of {S_0} in which case we need to go the long way around. However, {S_0} and {K} are compact, so we have a uniform bound

\displaystyle \max(\frac{1}{|z - S_0|}, \frac{1}{|w - S_0|}) = O(1).

Therefore we can instead consider a curve {\gamma} which goes all the way around {S_0}, leaving {D(0, 1)}. This curve has length {O(1)} for {z, w} close to {S_0} (and if {z, w} are far from {S_0} we can just perturb a line segment without generating too much error). Using our uniform max bound above we see that this choice of {\gamma} is valid. \Box

Recall that the moments {\mu,\sigma} of {\zeta} are infinitesimal.

Since {||\zeta - S||_{L^\infty}} is infinitesimal, and {K} is a positive distance from any infinitesimals (since it is standard compact), we have

\displaystyle |z - \zeta|, |z - \mu| \sim 1

uniformly in {z}. Therefore {f} has no critical points near {K} and so {f''/f'} is holomorphic on {K}.

We first need a version of the fundamental theorem.

Lemma 4 Let {\gamma} be a contour in {K} of length {|\gamma|}. Then

\displaystyle f'(\gamma(1)) = f'(\gamma(0)) \left(\frac{\gamma(1) - \mu}{\gamma(0) - \mu}\right)^{n - 1} e^{O(n) |\gamma| \sigma^2}.

Proof: Our bounds on {|z - \zeta|} imply that we can take the Taylor expansion

\displaystyle \frac{1}{z - \zeta} = \frac{1}{z - \mu} + \frac{\zeta - \mu}{(z - \mu)^2} + O(|\zeta - \mu|^2)

of {\zeta} in terms of {\mu}, which is uniform in {\zeta}. Taking expectations preserves the constant term (since it doesn’t depend on {\zeta}), kills the linear term, and replaces the quadratic term with a {\sigma^2}, thus

\displaystyle s_\zeta(z) = \frac{1}{z - \mu} + O(\sigma^2).

At the start of this series we showed

\displaystyle f'(\gamma(1)) = f'(\gamma(0)) \exp\left((n-1)\int_\gamma s_\zeta(z) ~dz\right).

Plugging in the Taylor expansion of {s_\zeta} we get

\displaystyle f'(\gamma(1)) = f'(\gamma(0)) \exp\left((n-1)\int_\gamma \frac{dz}{z - \zeta}\right) e^{O(n) |\gamma| \sigma^2}.

Simplifying the integral we get

\displaystyle \exp\left((n-1)\int_\gamma \frac{dz}{z - \zeta}\right) = \left(\frac{\gamma(1) - \mu}{\gamma(0) - \mu}\right)^{n - 1}

whence the claim. \Box

Lemma 5 Uniformly for {z,w \in K} one has

\displaystyle f'(w) = (1 + O(n|z - w|\sigma^2 e^{o(n|z - w|)})) \frac{(w - \mu)^{n-1}}{(z - \mu)^{n - 1}}f'(z).

Proof: Applying the previous two lemmata we get

\displaystyle f'(w) = e^{O(n|z - w|\sigma^2)} \frac{(w - \mu)^{n-1}}{(z - \mu)^{n - 1}}f'(z).

It remains to simplify

\displaystyle e^{O(n|z - w|\sigma^2)} = 1 + O(n|z - w|\sigma^2 e^{o(n|z - w|)}).

Taylor expanding {\exp} and using the self-similarity of the Taylor expansion we get

\displaystyle e^z = 1 + O(|z| e^{|z|})

which gives that bound. \Box

Lemma 6 Let {\varepsilon > 0}. Then

\displaystyle f(z) = f(0) + \frac{1 + O(\sigma^2)}{n} f'(z) (z - \mu) + O((\varepsilon + o(1))^n).

uniformly in {z \in K}.

Proof: We may assume that {\varepsilon} is small enough depending on {K}, since the constant in the big-{O} notation can depend on {K} as well, and {\varepsilon} only appears next to implied constants. Now given {z} we can find {\gamma} from {z} to {\partial B(0, \varepsilon)} which is always moving at a speed which is uniformly bounded from below and always moving in a direction towards the origin. Indeed, we can take {\gamma} to be a line segment which has been perturbed to miss the discrete set {T}, and possibly arced to miss {S_0} (say if {z} is far from {D(0, 1)}). By compactness of {K} we can choose the bounds on {\gamma} to be not just uniform in time but also in space (i.e. in {K}), and besides that {\gamma} is a curve through a compact set {K'} which misses {S}. Indeed, one can take {K'} to be a closed ball containing {K}, and then cut out small holes in {K'} around {T} and {S_0}, whose radii are bounded below since {K} is compact. Since the moments of {\zeta} are infinitesimal one has

\displaystyle \int_\gamma (w - \mu)^{n-1} ~dw = \frac{(z - \mu)^n}{n} - \frac{\varepsilon^n e^{in\theta}}{n} = \frac{(z - \mu)^n}{n} - O((\varepsilon + o(1))^n).

Here we used {\varepsilon < 1} to enforce

\displaystyle \varepsilon^n/n = O(\varepsilon^n).

By the previous lemma,

\displaystyle f'(w) = (1 + O(n|z - w|\sigma^2 e^{o(n|z - w|)})) \frac{(w - \mu)^{n-1}}{(z - \mu)^{n - 1}}f'(z).

Integrating this result along {\gamma} we get

\displaystyle f(\gamma(0)) = f(\gamma(1)) - \frac{f'(\gamma(0))}{(\gamma(0) - \mu)^{n-1}} \left(\int_\gamma (w - \mu)^{n-1} ~dw + O\left(n\sigma^2 \int_\gamma|\gamma(0) - w| e^{o(n|\gamma(0) - w|)}|w - \mu|^{n-1}~dw \right) \right).

Applying our preliminary bound, the previous paragraph, and the fact that {|\gamma(1)| = \varepsilon}, thus

\displaystyle f(\gamma(1)) = f(0) + O((\varepsilon + o(1))^n),

we get

\displaystyle f(z) = f(0) + O((\varepsilon + o(1))^n) - \frac{f'(z)}{(z - \mu)^{n-1}} \left(\frac{(z - \mu)^n}{n} - O((\varepsilon + o(1))^n) + O\left(n\sigma^2 \int_\gamma|z - w| e^{o(n|z - w|)}|w - \mu|^{n-1}~dw \right)\right).

We treat the first term first:

\displaystyle \frac{f'(z)}{(z - \mu)^{n-1}} \frac{(z - \mu)^n}{n} = \frac{1}{n} f'(z) (z - \mu).

For the second term, {z \in K} while {\mu^{(\infty)} \in K}, so {|z - \mu|} is bounded from below, whence

\displaystyle \frac{f'(z)}{(z - \mu)^{n-1}} O((\varepsilon + o(1))^n) = O((\varepsilon + o(1))^n).

Thus we simplify

\displaystyle f(z) = f(0) + O((\varepsilon + o(1))^n) + \frac{1}{n} f'(z) (z - \mu) + \frac{f'(z)}{(z - \mu)^{n-1}} O\left(n\sigma^2 \int_\gamma|z - w| e^{o(n|z - w|)}|w - \mu|^{n-1}~dw \right).

It will be convenient to instead write this as

\displaystyle f(z) = f(0) + O((\varepsilon + o(1))^n) + \frac{1}{n} f'(z) (z - \mu) + O\left(n|f'(z)|\sigma^2 \int_\gamma|z - w| e^{o(n|z - w|)} \left|\frac{w - \mu}{z - \mu}\right|^{n-1}~dw \right).

Now we deal with the pesky integral. Since {\gamma} is moving towards {\partial B(0, \varepsilon)} at a speed which is bounded from below uniformly in “spacetime” (that is, {K \times [0, 1]}), there is a standard {c > 0} such that if {w = \gamma(t)} then

\displaystyle |w - \mu| \leq |z - \mu| - ct

since {\gamma} is going towards {\mu}. (Tao’s argument puzzles me a bit here because he claims that the real inner product {\langle z - w, z\rangle} is uniformly bounded from below in spacetime, which seems impossible if {w = z}. I agree with its conclusion though.) Exponentiating both sides we get

\displaystyle \left|\frac{w - \mu}{z - \mu}\right|^{n-1} = O(e^{-nct})

which bounds

\displaystyle f(z) = f(0) + O((\varepsilon + o(1))^n) + \frac{1}{n} f'(z) (z - \mu) + O\left(n|f'(z)|\sigma^2 \int_0^1 te^{-(c-o(1))nt} ~dt\right).

Since {c} is standard, it dominates the infinitesimal {o(1)}, so after shrinking {c} a little we get a new bound

\displaystyle f(z) = f(0) + O((\varepsilon + o(1))^n) + \frac{1}{n} f'(z) (z - \mu) + O\left(n|f'(z)|\sigma^2 \int_0^1 te^{-cnt} ~dt\right).

Since {n\int_0^1 te^{-cnt} ~dt} is exponentially small in {n}, in particular it is smaller than {O(n^{-1})}. Plugging in everything we get the claim. \Box

4. Control on zeroes away from {S}

After the gargantuan previous section, we can now show the “approximate level set” property that we discussed last time.

Lemma 7 Let {K} be a standard compact set which misses {S} and {\varepsilon > 0} standard. Then for every zero {\lambda_0 \in K} of {f},

\displaystyle U_\zeta(\lambda) = \frac{1}{n} \log \frac{1}{|f(0)|} + O(n^{-1}\sigma^2 + (\varepsilon + o(1))^n).

Last time we showed that this implies

\displaystyle U_\zeta(\lambda_0) = U_\zeta(a) + O(n^{-1}\sigma^2 + (\varepsilon + o(1))^n).

Thus all the zeroes of {f} either live in {S} or a neighborhood of a level set of {U_\zeta}. Proof: Plugging in {z = \lambda_0} in the approximation

\displaystyle f(z) = f(0) + \frac{1 + O(\sigma^2)}{n} f'(z) (z - \mu) + O((\varepsilon + o(1))^n)

we get

\displaystyle f(0) + \frac{1 + O(\sigma^2)}{n} f'(\lambda_0) (\lambda_0 - \mu) = O((\varepsilon + o(1))^n).

Several posts ago, we proved {|f(0)| \sim 1} as a consequence of Grace’s theorem, so {f(0)O((\varepsilon + o(1))^n) = O((\varepsilon + o(1))^n)}. In particular, if we solve for {f'(\lambda_0)} we get

\displaystyle \frac{|f'(\lambda_0)}{n} |\lambda_0 - \mu| = |f(0)| (1 + O(\sigma^2 + (\varepsilon + o(1))^n).

Using

\displaystyle U_\zeta(z) = -\frac{\log n}{n - 1} - \frac{1}{n - 1} \log |f'(z)|,

plugging in {z = \lambda_0}, and taking logarithms, we get

\displaystyle -\frac{n - 1}{n} U_\zeta(\lambda_0) + \frac{1}{n} \log | \lambda_0 - \mu| = \frac{1}{n} \log |f(0)| + O(n^{-1}\sigma^2 + (\varepsilon + o(1))^n).

Now {\lambda_0 \in K} and {K} misses the standard compact set {S}, so since {0 \in S} we have

\displaystyle |\lambda - \zeta|, |\lambda - \mu| \sim 1

(since {\zeta^{(\infty)} \in S} and {\mu} is infinitesimal). So we can Taylor expand in {\zeta} about {\mu}:

\displaystyle \log |\lambda_0 - \zeta| = \log |\lambda_0 - \mu| - \text{Re }\frac{\zeta - \mu}{\lambda_0 - \mu} + O(\sigma^2).

Taking expectations and using {\mathbf E \zeta - \mu},

\displaystyle -U_\zeta(\lambda_0) = \log |\lambda_0 - \mu| + O(\sigma^2).

Plugging in {\log |\lambda_0 - \mu|} we see the claim. \Box

I’m not sure who originally came up with the idea to reason like this; I think Tao credits M. J. Miller. Whoever it was had an interesting idea, I think: {f = 0} is a level set of {f}, but one that a priori doesn’t tell us much about {f'}. We have just replaced it with a level set of {U_\zeta}, a function that is explicitly closely related to {f'}, but at the price of an error term.

5. Fine control

We finish this series. If you want, you can let {\varepsilon > 0} be a standard real. I think, however, that it will be easier to think of {\varepsilon} as “infinitesimal, but not as infinitesimal as the term of the form o(1)”. In other words, {1/n} is smaller than any positive element of the ordered field {\mathbf R(\varepsilon)}; briefly, {1/n} is infinitesimal with respect to {\mathbf R(\varepsilon)}. We still reserve {o(1)} to mean an infinitesimal with respect to {\mathbf R(\varepsilon)}. Now {\varepsilon^n = o(1)} by underspill, since this is already true if {\varepsilon} is standard and {0 < \varepsilon < 1}. Underspill can also be used to transfer facts at scale {\varepsilon} to scale {1/n}. I think you can formalize this notion of “iterated infinitesimals” by taking an iterated ultrapower of {\mathbf R} in the theory of ordered rings.

Let us first bound {\log |a|}. Recall that {|a| \leq 1} so {\log |a| \leq 0} but in fact we can get a sharper bound. Since {T} is discrete we can get {e^{-i\theta}} arbitrarily close to whatever we want, say {-1} or {i}. This will give us bounds on {1 - a} when we take the Taylor expansion

\displaystyle \log|a| = -(1 - a)(1 + o(1)).

Lemma 8 Let {e^{i\theta} \in \partial D(0, 1) \setminus S} be standard. Then

\displaystyle \log |a| \leq \text{Re } ((1 - e^{-i\theta} + o(1))\mu) - O(|\mu|^2 + \sigma^2 + (\varepsilon + o(1))^n).

Proof: Let {K} be a standard compact set which misses {S} and {\lambda_0 \in K} a zero of {f}. Since {\zeta \notin K} (since {S} is close to {\zeta}) and {|a-\zeta|} has positive standard part (since {d(a, S) = 1}) we can take Taylor expansions

\displaystyle -\log |\lambda_0 - \zeta| = -\log |\lambda_0| + \text{Re } \frac{\zeta}{\lambda_0} + O(|\zeta|^2)

and

\displaystyle -\log |a - \zeta| = -\log|a| + \text{Re } \frac{\zeta}{a} + O(|\zeta|^2)

in {\zeta} about {0}. Taking expectations we have

\displaystyle U_\zeta(\lambda_0) = -\log |\lambda_0| + \text{Re } \frac{\mu}{\lambda_0} + O(\mathbf E |\zeta|^2)

and similarly for {a}. Thus

\displaystyle -\log |a| + \text{Re } \frac{\mu}{a} = -\log |\lambda_0| + \text{Re } \frac{\mu}{\lambda_0} + O(\mathbf E |\zeta|^2 + n^{-1}\sigma^2 + (\varepsilon + o(1))^n)

since

\displaystyle U_\zeta(\lambda_0) - U_\zeta(a) = O(n^{-1}\sigma^2 + (\varepsilon + o(1))^n).

Since

\displaystyle \mathbf E|\zeta|^2 = |\mu|^2 + \sigma^2

we have

\displaystyle -\log|\lambda_0| + \text{Re } \left(\frac{1}{\lambda_0} - \frac{1}{a}\right)\mu = -\log|a| + O(|\mu|^2 + \sigma^2 + (\varepsilon + o(1))^n).

Now {|\lambda_0| \leq 1} so {-\log |\lambda_0| \geq 0}, whence

\displaystyle \text{Re } \left(\frac{1}{\lambda_0} - \frac{1}{a}\right)\mu \geq -\log|a| + O(|\mu|^2 + \sigma^2 + (\varepsilon + o(1))^n).

Now recall that {\lambda^{(\infty)}} is uniformly distributed on {\partial D(0, 1)}, so we can choose {\lambda_0} so that

\displaystyle |\lambda_0 - e^{i\theta}| = o(1).

Thus

\displaystyle \frac{1}{\lambda_0} - \frac{1}{a} = 1 - e^{-i\theta} + o(1)

which we can plug in to get the claim. \Box

Now we prove the first part of the fine control lemma.

Lemma 9 One has

\displaystyle \mu, 1 - a = O(\sigma^2 + (\varepsilon + o(1))^n).

Proof: Let {\theta_+ \in [0.98\pi, 0.99\pi],\theta_- \in [1.01\pi, 1.02\pi]} be standard reals such that {e^{i\theta_\pm} \notin S}. I don’t think the constants here actually matter; we just need {0 < 0.01 < 0.02 < \pi/8} or something. Anyways, summing up two copies of the inequality from the previous lemma with {\theta = \theta_\pm} we have

\displaystyle 1.9 \text{Re } \mu \geq \text{Re } ((1 + e^{-i\theta_+} + 1 + e^{-i\theta_-} + o(1))\mu) \geq \log |a| + O(|\mu|^2 + \sigma^2 + (\varepsilon + o(1))^n)

since

\displaystyle 2 + e^{-i\theta_+} + e^{-i\theta_-} + o(1) \leq 1.9.

That is,

\displaystyle \text{Re } \mu \geq \frac{\log|a|}{1.9} + O(|\mu|^2 + \sigma^2 + (\varepsilon + o(1))^n).

Indeed,

\displaystyle -\log |a| = (1 - a)(1 + o(1)),

so

\displaystyle \text{Re }\mu \geq -\frac{1 - a}{1.9 + o(1)} + O(|\mu|^2 + \sigma^2 + (\varepsilon + o(1))^n).

If we square the tautology {|\zeta - a| \geq 1} then we get

\displaystyle |\zeta|^2 - 2a \text{Re }\zeta + a^2 \geq 1.

Taking expected values we get

\displaystyle |\mu|^2 + \sigma^2 - 2a \text{Re }\mu + a^2 \geq 1

or in other words

\displaystyle \text{Re }\mu \leq -\frac{1 - a^2}{2a} + O(|\mu|^2 + \sigma^2) = -(1 - a)(1 + o(1)) + O(|\mu|^2 + \sigma^2)

where we used the Taylor expansion

\displaystyle \frac{1 - a^2}{2a} = (1 - a)(1 + o(1))

obtained by Taylor expanding {1/a} about {1} and applying {1 - a = o(1)}. Using

\displaystyle \text{Re }\mu \geq -\frac{1 - a}{1.9 + o(1)} + O(|\mu|^2 + \sigma^2 + (\varepsilon + o(1))^n)

we get

\displaystyle -\frac{1 - a}{1.9 + o(1)} + O(|\mu|^2 + \sigma^2 + (\varepsilon + o(1))^n) \leq \text{Re }\mu \leq -(1 - a)(1 + o(1)) + O(|\mu|^2 + \sigma^2)

Thus

\displaystyle (1 - a)\left(1 + \frac{1}{1.9 + o(1)} + o(1)\right) = O(|\mu|^2 + \sigma^2 + (\varepsilon + o(1))^n).

Dividing both sides by {1 + \frac{1}{1.9 + o(1)} + o(1) \in [1, 2]} we have

\displaystyle 1 - a = O(|\mu|^2 + \sigma^2 + (\varepsilon + o(1))^n).

In particular

\displaystyle \text{Re }\mu = O(|\mu|^2 + \sigma^2 + (\varepsilon + o(1))^n)(1 + o(1)) + O(|\mu|^2 + \sigma^2) = O(|\mu|^2 + \sigma^2 + (\varepsilon + o(1))^n).

Now we treat the imaginary part of {\text{Im } \mu}. The previous lemma gave

\displaystyle \text{Re } ((1 - e^{-i\theta} + o(1))\mu) - \log |a| = O(|\mu|^2 + \sigma^2 + (\varepsilon + o(1))^n).

Writing everything in terms of real and imaginary parts we can expand out

\displaystyle \text{Re } ((1 - e^{-i\theta} + o(1))\mu) = (\sin \theta + o(1))\text{Re } \mu + (1 - \cos \theta + o(1))\text{Re }\mu.

Using the bounds

\displaystyle (1 - \cos \theta + o(1))\text{Re }\mu, ~\log |a| = O(|\mu|^2 + \sigma^2 + (\varepsilon + o(1))^n)

(Which follow from the previous paragraph and the bound {\log |a| = O(1 - a)}), we have

\displaystyle (\sin \theta + o(1))\text{Im } \mu = O(|\mu|^2 + \sigma^2 + (\varepsilon + o(1))^n).

Since {T} is discrete we can find {\theta} arbitrarily close to {\pm \pi/2} which meets the hypotheses of the above equation. Therefore

\displaystyle \text{Im } \mu = O(|\mu|^2 + \sigma^2 + (\varepsilon + o(1))^n).

Pkugging everything in, we get

\displaystyle 1 - a \sim \mu = O(|\mu|^2 + \sigma^2 + (\varepsilon + o(1))^n).

Now {|\mu|^2 = o(|\mu|)} since {\mu} is infinitesimal; therefore we can discard that term. \Box

Now we are ready to prove the second part. The point is that we are ready to dispose of the semi-infinitesimal {\varepsilon}. Doing so puts a lower bound on {U_\zeta(a)}.

Lemma 10 Let {I \subseteq \partial D(0, 1) \setminus S} be a standard compact set. Then for every {e^{i\theta} \in I},

\displaystyle U_\zeta(a) - U_\zeta(e^{i\theta}) \geq -o(\sigma^2) - o(1)^n.

Proof: Since {\lambda^{(\infty)}} is uniformly distributed on {\partial D(0, 1)}, there is a zero {\lambda_0} of {f} with {|\lambda_0 - e^{i\theta}| = o(1)}. Since {|\lambda_0| \leq 1}, we can find an infinitesimal {\eta} such that

\displaystyle \lambda_0 = e^{i\theta}(1 - \eta)

and {|1 - \eta| \leq 1}. In the previous section we proved

\displaystyle U_\zeta(a) - U_\zeta(\lambda_0) = O(n^{-1}\sigma^2) + (\varepsilon + o(1))^n).

Using {n^{-1} = o(1)} and plugging in {\lambda_0} we have

\displaystyle U_\zeta(a) - U_\zeta(e^{i\theta}(1 - \eta)) = o(\sigma^2) + O((\varepsilon + o(1))^n).

Now

\displaystyle \text{Re } \eta \int_0^1 \frac{dt}{1 - t\eta + e^{-i\theta}\zeta} = \log |1 - e^{-i\theta}\zeta| - \log|1 - \eta - e^{-i\theta}\zeta| = \log|e^{i\theta} - \zeta| - \log|e^{i\theta} - e^{i\theta}\eta - \zeta|.

Taking expectations,

\displaystyle \text{Re }\eta \mathbf E\int_0^1 \frac{dt}{1 - t\eta + e^{-i\theta}\zeta} = U_\zeta(e^{i\theta}(1 - \eta)) - U_\zeta(e^{i\theta}).

Taking a Taylor expansion,

\displaystyle \frac{1}{1 - t\eta - e^{-i\theta}\zeta} = \frac{1}{1 - t\eta} + \frac{e^{-i\theta}\zeta}{(1 - t\eta)^2} + O(|\zeta|^2)

so by Fubini’s theorem

\displaystyle \mathbf E\int_0^1 \frac{dt}{1 - t\eta + e^{-i\theta}\zeta} = \int_0^1 \left(\frac{1}{1 - t\eta} + \frac{e^{-i\theta}}{(1 - t\eta)^2}\mu + O(|\mu|^2 + \sigma^2)\right)~dt;

using the previous lemma and {\eta = o(1)} we get

\displaystyle  U_\zeta(e^{i\theta}(1 - \eta)) - U_\zeta(e^{i\theta}) = \text{Re }\eta \int_0^1 \frac{dt}{1 - t\eta} + o(\sigma^2) + O((\varepsilon + o(1))^n).

We also have

\displaystyle \text{Re } \eta \int_0^1 \frac{dt}{1 - t\eta} = -\log \frac{1}{e^{i\theta} - e^{i\theta}\eta} = U_0(1 - \eta)

since {0} is deterministic (and {U_0(e^{i\theta} z) = U_0(z)}, and {U_0(1) = 0}; very easy to check!) I think Tao makes a typo here, referring to {U_i(e^{i\theta}(1 - \eta))}, which seems irrelevant. We do have

\displaystyle U_0(1 - \eta) = -\log|1 - \eta| \geq 0

since {|1 - \eta| \leq 0}. Plugging in

\displaystyle \text{Re } \eta \int_0^1 \frac{dt}{1 - t\eta} \geq 0

we get

\displaystyle U_\zeta(e^{i\theta} - e^{i\theta}\eta) - U_\zeta(e^{i\theta}) \geq -o(\sigma^2) - O((\varepsilon + o(1))^n).

I think Tao makes another typo, dropping the Big O, but anyways,

\displaystyle U_\zeta(a) - U_\zeta(e^{i\theta} - e^{i\theta}\eta) = o(\sigma^2) - O((\varepsilon + o(1))^n)

so by the triangle inequality

\displaystyle U_\zeta(a) - U_\zeta(e^{i\theta}) \geq -o(\sigma^2) - O((\varepsilon + o(1))^n).

By underspill, then, we can take {\varepsilon \rightarrow 0}. \Box

We need a result from complex analysis called Jensen’s formula which I hadn’t heard of before.

Theorem 11 (Jensen’s formula) Let {g: D(0, 1) \rightarrow \mathbf C} be a holomorphic function with zeroes {a_1, \dots, a_n \in D(0, 1)} and {g(0) \neq 0}. Then

\displaystyle \log |g(0)| = \sum_{j=1}^n \log |a_j| + \frac{1}{2\pi} \int_0^{2\pi} \log |g(e^{i\theta})| ~d\theta.

In hindsight this is kinda trivial but I never realized it. In fact {\log |g|} is subharmonic and in fact its Laplacian is exactly a linear combination of delta functions at each of the zeroes of {g}. If you subtract those away then this is just the mean-value property

\displaystyle \log |g(0)| = \frac{1}{2\pi} \int_0^{2\pi} \log |g(e^{i\theta})| ~d\theta.

Let us finally prove the final part. In what follows, implied constants are allowed to depend on {\varphi} but not on {\delta}.

Lemma 12 For any standard {\varphi \in C^\infty(\partial D(0, 1))},

\displaystyle \int_0^{2\pi} \varphi(e^{i\theta}) U_\zeta(e^{i\theta}) ~d\theta = o(\sigma^2) + o(1)^n.

Besides,

\displaystyle U_\zeta(a) = o(\sigma^2) + o(1)^n.

Proof: Let {m} be the Haar measure on {\partial D(0, 1)}. We first prove this when {\varphi \geq 0}. Since {T} is discrete and {\partial D(0, 1)} is compact, for any standard (or semi-infinitesimal) {\delta > 0}, there is a standard compact set

\displaystyle I \subseteq \partial D(0, 1) \setminus S

such that

\displaystyle m(\partial D(0, 1) \setminus I) < \delta.

By the previous lemma, if {e^{i\theta} \in I} then

\displaystyle \varphi(e^{i\theta}) U_\zeta(a) - \varphi(e^{i\theta}) U_\zeta(e^{i\theta}) \geq -o(\sigma^2) - o(1)^n

and the same holds when we average in Haar measure:

\displaystyle  U_\zeta(a)\int_I \varphi~dm - \int_I \varphi(e^{i\theta}) U_\zeta(e^{i\theta})~dm(e^{i\theta}) \geq -o(\sigma^2) - o(1)^n.

We have

\displaystyle |\log |e^{i\theta} - \zeta| + \text{Re } e^{-i\theta}\zeta| \leq |\log|3 - \zeta| + 3\text{Re } \zeta| \in L^2(dm(e^{i\theta}))

so, using the Cauchy-Schwarz inequality, one has

\displaystyle \int_{\partial D(0, 1) \setminus I} \varphi(e^{i\theta}) (\log |e^{i\theta} - \zeta| + \text{Re } e^{-i\theta}\zeta) ~dm(e^{i\theta}) = \sqrt{\int_I |\log |e^{i\theta} - \zeta| + \text{Re } e^{-i\theta}\zeta|} = O(\delta^{1/2}).

Meanwhile, if {|\zeta| \leq 1/2} then the fact that

\displaystyle \log |e^{i\theta} - \zeta| = \text{Re }-\frac{\zeta}{e^{i\theta}} + O(|\zeta|^2)

implies

\displaystyle \log |e^{i\theta} - \zeta| + \text{Re } \frac{\zeta}{e^{i\theta}} = O(|\zeta|^2)

and hence

\displaystyle \int_{\partial D(0, 1) \setminus I} \varphi(e^{i\theta}) (\log |e^{i\theta} - \zeta| + \text{Re } e^{-i\theta}\zeta) ~dm(e^{i\theta}) = O(\delta|\zeta|^2).

We combine these into the unified estimate

\displaystyle \int_{\partial D(0, 1) \setminus I} \varphi(e^{i\theta}) (\log |e^{i\theta} - \zeta| + \text{Re } e^{-i\theta}\zeta) ~dm(e^{i\theta}) = O(\delta^{1/2}|\zeta|^2)

valid for all {|\zeta| \leq 1}, hence almost surely. Taking expected values we get

\displaystyle \int_{\partial D(0, 1) \setminus I} \varphi(e^{i\theta})U_\zeta(e^{i\theta}) + \varphi(e^{i\theta}) \text{Re }e^{-i\theta}\mu ~dm(e^{i\theta}) = O(\delta^{1/2}(|\mu|^2 + \sigma^2)) + o(\sigma^2) + o(1)^n.

In the last lemma we bounded {|\mu|} so we can absorb all the terms with {\mu} in them to get

\displaystyle \int_{\partial D(0, 1) \setminus I} \varphi(e^{i\theta})U_\zeta(e^{i\theta}) ~dm(e^{i\theta}) = O(\delta^{1/2}\sigma^2) + o(\sigma^2) + o(1)^n.

We also have

\displaystyle \int_{\partial D(0, 1) \setminus I} \varphi ~dm = O(\delta)

(here Tao refers to a mysterious undefined measure {\sigma} but I’m pretty sure he means {m}). Putting these integrals together with the integrals over {I},

\displaystyle \ U_\zeta(a)int_{\partial D(0, 1)} \varphi ~dm - \int_{\partial D(0, 1)} \varphi(e^{i\theta}) U_\zeta(e^{i\theta}) ~dm(e^{i\theta}) \geq -O(\delta^{1/2}\sigma^2) - o(\sigma^2) - o(1)^n.

By underspill we can delete {\delta}, thus

\displaystyle  U_\zeta(a)\int_{\partial D(0, 1)} \varphi ~dm - \int_{\partial D(0, 1)} \varphi(e^{i\theta}) U_\zeta(e^{i\theta}) ~dm(e^{i\theta}) \geq - o(\sigma^2) - o(1)^n.

We now consider the specific case {\varphi = 1}. Then

\displaystyle U_\zeta(a) - \int_{\partial D(0, 1)} U_\zeta ~dm \geq -o(\sigma^2) - o(1)^n.

Now Tao claims and doesn’t prove

\displaystyle \int_{\partial D(0, 1)} U_\zeta ~dm = 0.

To see this, we expand as

\displaystyle \int_{\partial D(0, 1)} U_\zeta ~dm = -\mathbf E \frac{1}{2\pi} \int_0^{2\pi} \log|\zeta - e^{i\theta}| ~d\theta

using Fubini’s theorem. Now we use Jensen’s formula with {g(z) = \zeta - z}, which has a zero exactly at {\zeta}. This seems problematic if {\zeta = 0}, but we can condition on {|\zeta| > 0}. Indeed, if {\zeta = 0} then we have

\displaystyle  \int_0^{2\pi} \log|\zeta - e^{i\theta}| ~d\theta = \int_0^{2\pi} \log 1 ~d\theta = 0

which already gives us what we want. Anyways, if {|\zeta| > 0}, then by Jensen’s formula,

\displaystyle \frac{1}{2\pi} \int_0^{2\pi} \log|\zeta - e^{i\theta}| ~d\theta = \log |\zeta| - \log |\zeta| = 0.

So that’s how it is. Thus we have

\displaystyle -U_\zeta(a) \leq o(\sigma^2) + o(1)^n.

Since {|a - \zeta| \geq 1}, {\log |a - \zeta| \geq 0}, so the same is true of its expected value {-U_\zeta(a)}. This gives the desired bound

\displaystyle U_\zeta(a) = o(\sigma^2) + o(1)^n.

We can use that bound to discard {U_\zeta(a)} from the average

\displaystyle  U_\zeta(a)\int_{\partial D(0, 1)} \varphi ~dm - \int_{\partial D(0, 1)} \varphi(e^{i\theta}) U_\zeta(e^{i\theta}) ~dm(e^{i\theta}) \geq - o(\sigma^2) - o(1)^n,

thus

\displaystyle \int_{\partial D(0, 1)} \varphi(e^{i\theta}) U_\zeta(e^{i\theta}) ~dm(e^{i\theta})= o(\sigma^2) + o(1)^n.

Repeating the Jensen’s formula argument from above we see that we can replace {\varphi} with {\varphi - k} for any {k \geq 0}. So this holds even if {\varphi} is not necessarily nonnegative. \Box

Let’s Read: Sendov’s conjecture in high degree, part 3: case zero, and a sketch of case one

In this post we’re going to complete the proof of case zero and continue the proof of case one. In the last two posts we managed to prove:

Theorem 1 Let {n} be a nonstandard natural, and let {f} be a monic polynomial of degree {n} on {\mathbf C} with all zeroes in {\overline{D(0, 1)}}. Suppose that {a} is a zero of {f} such that:

  1. Either {a} or {a - 1} is infinitesimal, and
  2. {f} has no critical points on {\overline{D(a, 1)}}.

Let {\lambda} be a random zero of {f} and {\zeta} a random critical point of {f}. Let {\mu} be the expected value of {\lambda}. Let {z} be a complex number outside some measure zero set and let {\gamma} be a contour that misses the zeroes of {f},{f'}. Then:

  1. {\zeta \in \overline{D(0, 1)} \setminus \overline{D(a, 1)}}.
  2. {\mu = \mathbf E \zeta}.
  3. One has

    \displaystyle U_\lambda(z) = -\frac{1}{n} \log |f(z)|

    and

    \displaystyle U_\zeta(z) = -\frac{\log n}{n - 1} - \frac{1}{n-1} \log |f'(z)|.

  4. One has

    \displaystyle s_\lambda(z) = \frac{1}{n} \frac{f'(z)}{f(z)}

    and

    \displaystyle s_\zeta(z) = \frac{1}{n - 1} \frac{f''(z)}{f'(z)}.

  5. One has

    \displaystyle U_\lambda(z) - \frac{n - 1}{n} U_\zeta(z) = \frac{1}{n} \log |s_\lambda(z)|

    and

    \displaystyle s_\lambda(z) - \frac{n - 1}{n} s_\zeta(z) = -\frac{1}{n} \frac{s_\lambda'(z)}{s_\lambda(z)}.

  6. One has

    \displaystyle f(\gamma(1)) = f(\gamma(0)) \exp \left(n \int_\gamma s_\lambda(z) ~dz\right)

    and

    \displaystyle f'(\gamma(1)) = f(\gamma(0)) \exp\left((n-1) \int_\gamma s_\zeta(z) ~dz\right).

Moreover,

  1. If {a} is infinitesimal (case zero), then {\lambda^{(\infty)},\zeta^{(\infty)}} are identically distributed and almost surely lie in

    \displaystyle C = \{e^{i\theta}: 2\theta \in [\pi, 3\pi]\}.

    Moreover, if {K} is any compact set which misses {C}, then

    \displaystyle \mathbf P(\lambda \in K) = O\left(a + \frac{\log n}{n^{1/3}}\right),

    so {d(\lambda, C)} is infinitesimal in probability.

  2. If {a - 1} is infinitesimal (case one), then {\lambda^{(\infty)}} is uniformly distributed on {\partial D(0, 1)} and {\zeta^{(\infty)}} is almost surely zero. Moreover,

    \displaystyle \mathbf E \log \frac{1}{|\lambda|}, \mathbf E \log |\zeta - a| = O(n^{-1}).

We also saw that Sendov’s conjecture in high degree was equivalent to the following result, that we will now prove.

Lemma 2 Let {n} be a nonstandard natural, and let {f} be a monic polynomial of degree {n} on {\mathbf C} with all zeroes in {\overline{D(0, 1)}}. Let {a} be a zero of {f} such that:

  1. Either {a \log n} is infinitesimal (case zero), or
  2. There is a standard {\varepsilon_0 > 0} such that

    \displaystyle 1 - o(1) \leq a \leq 1 - \varepsilon_0^n

    (case one).

If there are no critical points of {f} in {\overline{D(a, 1)}}, then {0 = 1}.

1. Case zero

Now we prove case zero — the easy case — of Lemma 2.

Suppose that {a \log n} is infinitesimal. In this case, {\lambda^{(\infty)}, \zeta^{(\infty)}} are identically distributed and almost surely are {\in C}.

Lemma 3 There are {0 < r_1 < r_2 < 1/2} such that for every {|z| \in [r_1, r_2]},

\displaystyle |s_{\lambda^{(\infty)}}(z)| \sim 1

uniformly.

Proof: Since {\lambda^{(\infty)}} is supported in {C}, {s_{\lambda^{(\infty)}}} is holomorphic away from {C}. Since {\lambda^{(\infty)}} is bounded, if {z} is near {\infty} then

\displaystyle s_{\lambda^{(\infty)}}(z) = \mathbf E\frac{1}{z - \lambda^{(\infty)}} \sim \mathbf E \frac{1}{z} = \frac{1}{z}

which is nonzero near {\infty}. So the variety {s_{\lambda^{(\infty)}} = 0} is discrete, so there are {0 < r_1 < r_2 < 1/2} such that {s_{\lambda^{(\infty)}}(re^{i\theta}) \neq 0} whenever {r \in [r_1, r_2]}. To see this, suppose not; then for every {r_1 < r_2} we can find {r \in [r_1, r_2]} and {\theta} with {s_{\lambda^{(\infty)}}(re^{i\theta}) = 0}, so {s_{\lambda^{(\infty)}}} has infinitely many zeroes in the compact set {\overline{D(0, 1/2)}}. Since this is definitely not true, the claim follows by continuity of {s_{\lambda^{(\infty)}}}. \Box

Let {m} be the number of zeroes of {s_{\lambda^{(\infty)}}} in {D(0, r_1)}, so {m} is a nonnegative standard natural since {s_{\lambda^{(\infty)}}} is standard and {\overline{D(0, 1/2)}} is compact. Let {\gamma(\theta) = re^{i\theta}} where {r \in (r_1, r_2)}; then

\displaystyle \int_\gamma \frac{s_{\lambda^{(\infty)}}'(z)}{s_{\lambda^{(\infty)}}(z)} = m,

by the argument principle.

We claim that in fact {m \leq -1}, which contradicts that {m} is nonnegative. This will be proven in the rest of this section.

Here something really strange happens in Tao’s paper. He proves this:

Lemma 4 One has

\displaystyle \left|\frac{1}{n} \frac{f'(z)}{f(z)} - s_{\lambda^{(\infty)}(z)}\right| = o(1)

in {L^1_{loc}}.

We now need to show that the convergence in {L^1_{loc}} above commutes with the use of the argument principle so that

\displaystyle \int_\gamma \frac{(f'/f)'(z)}{f'(z)/f(z)} = m;

this will be good because we have control on the zeroes and critical points of {f} using our contradiction assumption. What’s curious to me is that Tao seems to substitute this with convergence in {L^\infty_{loc}} on an annulus. Indeed, convergence in {L^\infty_{loc}} does commute with use of the argument principle, but at no point of the proof does it seem like he uses the convergence in {L^1_{loc}}. So I include the proof of the latter in the next section as a curiosity item, but I think it can be omitted entirely. Tell me in the comments if I’ve made a mistake here.

If {\chi} is a smooth cutoff supported on {\overline{D(0, 1/2)}} and identically one on {\overline{D(0, r_3)}} (where {r_2 < r_3 < 1/2}), one has

\displaystyle \frac{1}{n} \frac{f'(z)}{f(z)} = \mathbf E \frac{1}{z - \lambda} = \mathbf E \frac{1 - \chi(\lambda)}{z - \lambda} + \mathbf E \frac{\chi(\lambda)}{z - \lambda}.

The {1 - \chi} term is easy to deal with, since for every {z}, {(1 - \chi(\lambda))/(z - \lambda)} is a bounded continuous function of {\lambda} whenever {|z| < r_2} (so {|\lambda - z| \geq r_3 - r_2 > 0}). By the definition of being infinitesimal in distribution we have

\displaystyle \left|\mathbf E\frac{1 - \chi(\lambda)}{z - \lambda} - \frac{1}{z - \lambda^{(\infty)}}\right| = o(1).

Therefore

\displaystyle \mathbf E \frac{1 - \chi(\lambda)}{z - \lambda} - s_{\lambda^{(\infty)}}(z)

is uniformly infinitesimal.

Now we treat the {\chi} term. Interestingly, this is the main point of the argument where we use that {a \log n} is infinitesimal, and the rest of the argument seems to mainly go through with much weaker assumptions on {a}.

Lemma 5 There is an {r \in [r_1, r_2]} such that if {|z| = r} then

\displaystyle \left|\mathbf E \frac{\chi(\lambda)}{z - \lambda}\right| = o(1).

Proof: By the triangle inequality and its reverse, if {|z| = r} then

\displaystyle \left|\mathbf E \frac{\chi(\lambda)}{z - \lambda}\right| \leq \mathbf E \frac{\chi(\lambda)}{|r - |\lambda||}.

Here {r \in [r_1, r_2]} is to be chosen.

Since we have

\displaystyle \mathbf P(\lambda \in K) = O\left(a + \frac{\log n}{n^{1/3}}\right)

whenever {K} is a compact set which misses {C}, this in particular holds when {K = \overline{B(0, 1/2)}}. Since {a = o(1/\log n)} and {n^{-1/3}\log n = o(1/\log n)} it follows that

\displaystyle \mathbf P(|\lambda| \leq 1/2) = o(1/\log n).

In particular,

\displaystyle \mathbf E\chi(\lambda) = o(1/\log n).

We now claim

\displaystyle \int_{r_1}^{r_2} \frac{dr}{\max(|r - |\lambda||, n^{-10})} = O(\log n).

By splitting the integrand we first bound

\displaystyle \int_{\substack{[r_1,r_2]\\|r-|\lambda|| \leq n^{-10}}} \frac{dr}{\max(|r - |\lambda||, n^{-10})} \leq 2n^{-10}n^{10} = 2 = O(\log n)

since {\log n} is nonstandard and the domain of integration has measure at most {2n^{-10}}. On the other hand, the other term

\displaystyle \int_{\substack{[r_1,r_2]\\|r-|\lambda|| \geq n^{-10}}} \frac{dr}{\max(|r - |\lambda||, n^{-10})} \leq \int_{n^{-10}}^{r_2 - r_1} \frac{dr}{r} = \log(r_2 - r_1) - 10 \log n = O(\log n)

since {\log(r_2 - r_1)} is standard while {\log n} is nonstandard. This proves the claim.

Putting the above two paragraphs together and using Fubini’s theorem,

\displaystyle \int_{r_1}^{r_2} \mathbf E \frac{\chi(\lambda)}{\max(|r - |\lambda||, n^{-10})} ~dr = \mathbf E\chi(\lambda) \int_{r_1}^{r_2} \frac{1}{\max(|r - |\lambda||, n^{-10})} ~dr = O(\log n) \mathbf E\chi(\lambda)

is infinitesimal. So outside of a set of infinitesimal measure, {r \in [r_1, r_2]} satisfies

\displaystyle \mathbf E \frac{\chi(\lambda)}{\max(|r - |\lambda||, n^{-10})} = o(1).

If {|r - |\lambda|| \leq n^{-10}} then there is a (deterministic) zero {\lambda_0} such that {|r - |\lambda_0|| \leq n^{-10}}, thus {r} lies in a set of measure {2n^{-10}}. There are {\leq n} such sets since there are {n} zeroes of {f}, so their union has measure {2n^{-9}}, which is infinitesimal. Therefore

\displaystyle \mathbf E \frac{\chi(\lambda)}{|r - |\lambda||} = o(1)

which implies the claim. \Box

Summing up, we have

\displaystyle \frac{f'}{nf} = s_{\lambda^{(\infty)}} + o(1)

in {L^\infty(B(0, r))}, where {r} is as in the previous lemma. Pulling out the factor of {1/n}, which is harmless, we can use the argument principle to deduce that {m} is the number of zeroes minus poles of {f'/f}; that is, the number of critical points minus zeroes of {f}. Indeed, convergence in {L^\infty} does commute with the argument principle, so we can throw out the infinitesimal {o(1)}.

But {a} is infinitesimal, and we assumed that {f} had no critical points in {\overline{D(a, 1)}}, which contains {D(0, r)}. So {f} has no critical points, but has a zero {a}; therefore {m \leq -1}.

In a way this part of the proof was very easy: the only tricky bit was using the cutoff to get convergence in {L^\infty} like we needed. The hint that we could use the argument principle was the fact that {a} was infinitesimal, so we had control of the critical points near the origin.

2. Convergence in {L^1_{loc}}

Let {\nu} be the distribution of {\lambda} and {\nu^{(\infty)}} of {\lambda^{(\infty)}}. Since {\lambda - \lambda^{(\infty)}} is infinitesimal in distribution, {\nu^{(\infty)} - \nu} is infinitesimal in the weak topology of measures; that is, for every continuous function {g} and compact set {K},

\displaystyle \int_K g ~d(\nu^{(\infty)} - \nu) = o(1).

Now

\displaystyle s_{\lambda^{(\infty)}}(z) - s_\lambda(z) = \int_{D(0, 1)} \frac{d\nu^{(\infty)}(w)}{z - w} - \int_{D(0, 1)} \frac{d\nu(w)}{z - w}.

If {K} is a compact set and {\rho} is Lebesgue measure then

\displaystyle \int_K s_{\lambda^{(\infty)}} - s_\lambda ~d\rho = \int_K \int_{D(0, 1)} \frac{d(\nu^{(\infty)} - \nu)(w)}{z - w} ~d\rho(z).

By Tonelli’s theorem

\displaystyle \int_K \int_{D(0, 1)} \frac{d(\nu^{(\infty)} - \nu)(w)}{|z - w|} ~d\rho(z) = \int_{D(0, 1)} \int_K \frac{d\rho(z)}{|z - w|} d(\nu^{(\infty)} - \nu)(w)

and the inner integral is finite since {1/|z|} is Lebesgue integrable in codimension {2}. So the outer integrand is a bounded continuous function, which implies that

\displaystyle \int_K \int_{D(0, 1)} \frac{d(\nu^{(\infty)} - \nu)(w)}{|z - w|} ~d\rho(z) = o(1)

which gives what we want when we recall

\displaystyle s_\lambda(z) = \frac{1}{n} \frac{f'(z)}{f(z)}

and we plug in {s_\lambda}.

3. Case one: Outlining the proof

The proof for case one is much longer, and is motivated by the pseudo-counterexample

\displaystyle f(z) = z^n - 1.

Here {a} is an {n}th root of unity, and {f} has no critical points on {D(a, 1)}, but does have {n - 1} critical points at {0 \in \partial D(a, 1)}. Similar pseudo-counterexamples hold for

\displaystyle 1 - o(1) \leq a \leq 1 - \varepsilon_0^n

where {\varepsilon_0 > 0} is standard. We will seek to control these examples by controlling {\zeta} up to an error of size {O(\sigma^2) + o(1)^n}; here {\sigma^2} is the variance of {\zeta} and {o(1)^n} is an infinitesimal raised to the power of {n}, thus is very small, and forces us to balance out everything in terms of {a}.

As discussed in the introduction of this post, {\zeta} is infinitesimal in probability (and, in particular, its expected value {\mu} is infinitesimal); thus, with overwhelming probability, the critical points of {f} are all infinitesimals. Combining this with the fact that {\lambda^{(\infty)}} is uniformly distributed on {\partial D(0, 1)}, it follows that {f} sort of looks like {f(z) = z^n - 1}.

We start with some nice bounds:

Lemma 6 (preliminary bounds) For any standard compact set {K \subset \mathbf C}, one has

\displaystyle f(z) = f(0) + O((|z| + o(1))^n)

and

\displaystyle f'(z) = O((|z| + o(1))^n)

uniformly in {z \in K}.

In other words, {f} sort of grows like the translate of a homogeneous polynomial of degree {n}.

It would be nice if {\zeta} was infinitesimal in {L^\infty}, but this isn’t quite true; the following lemma is the best we can do.

Lemma 7 (uniform convergence of {\zeta}) There is a standard compact set

\displaystyle S = (\overline{D(0, 1)} \cap \partial D(1, 1)) \cup T

where {T} is countable, standard, does not meet {\overline{D(1, 1)}}, and consists of isolated points of {S}, such that {d(\zeta, S)} is infinitesimal in {L^\infty}.

So we think of {S} as some sort of generalization of {0}. Away from {S} we have good bounds on {f,f'}:

Lemma 8 (approximating {f,f'} outside {S}) For any standard compact set {K \subset \mathbf C \setminus S}:

  1. Uniformly in {z, w \in K},

    \displaystyle f'(w) = (1 + O(n|z - w|\sigma^2|e^{o(n|z -w|)})) \frac{f'(z)}{(z - \mu)^{n-1}} (w - \mu)^{n-1}.

  2. For every standard {\varepsilon > 0} and uniformly in {z \in K},

    \displaystyle f(z) = f(0) + \frac{1 + O(\sigma^2)}{n} f'(z) (z - \mu) + O((\varepsilon + o(1))^n).

As a consequence, we can show that every zero of {f} which is far from {S} is close to the level set

\displaystyle U_\zeta = \frac{1}{n} \log\frac{1}{|f(0)|}.

This in particular holds for {a}, since the standard part of {a} is {1}, and {T} does not come close to {1} (so neither does {S}). In fact the error term is infinitesimal:

Lemma 9 (zeroes away from {S}) For any standard compact set {K \subset \mathbf C \setminus S}, any standard {\varepsilon > 0}, and any zero {\lambda_0 \in K},

\displaystyle U_\zeta(\lambda_0) = \frac{1}{n} \log \frac{1}{|f(0)|} + O(n^{-1}\sigma^2) + O((\varepsilon + o(1))^n)

uniformly in {\lambda_0}.

Since {a} satisfies the hypotheses of the above lemma,

\displaystyle U_\zeta(\lambda) - U_\zeta(a) = O(n^{-1}\sigma^2 + (\varepsilon + o(1))^n)

is infinitesimal. This gives us some more bounds:

Lemma 10 (fine control) For every standard {\varepsilon > 0}:

  1. One has

    \displaystyle \mu, 1 - a = O(\sigma^2 + (\varepsilon + o(1))^n).

  2. For every compact set {I \subseteq \partial D(0, 1) \setminus S} and {e^{i\theta} \in I},

    \displaystyle U_\zeta(a) - U_\zeta(e^{i\theta}) -o(\sigma^2) - o(1)^n.

  3. For every standard smooth function {\varphi: \partial D(0, 1) \rightarrow \mathbf C},

    \displaystyle \int_0^{2\pi} \varphi(e^{i\theta}) U_\zeta(e^{i\theta}) ~d\theta = o(\sigma^2) + o(1)^n.

  4. One has

    \displaystyle U_\zeta(a) = o(\sigma)^2 + o(1)^n.

Here Tao claims

\displaystyle \int_0^{2\pi} e^{-2i\theta} \log \frac{1}{|e^{i\theta} - \zeta|} ~d\theta = \frac{\pi}{2}\zeta^2.

Apparently this follows from Fourier inversion but I don’t see it. In any case if we take the expected value of the left-hand side we get

\displaystyle \int_0^{2\pi} \mathbf Ee^{-2i\theta} \log \frac{1}{|e^{i\theta} - \zeta|} ~d\theta = \int_0^{2\pi} e^{-2i\theta} U_\zeta(e^{i\theta}) = o(\sigma^2) + o(1)^n

by the fine control lemma, so

\displaystyle \mathbf E \zeta^2 = o(\sigma^2) + o(1)^n.

In particular this holds for the real part of {\zeta}. Since {\sigma^2} is infinitesimal, so are the first two moments of the real part of {\zeta}.

Since {|a - \zeta| \in [1, 2]}, one has

\displaystyle |a - \zeta| - 1 \sim \log |a - \zeta|.

This is true since for any {s \in [1, 2]} one has {\log s \sim s - 1} (which follows by Taylor expansion). In particular,

\displaystyle |1 - \zeta| \leq |1 - a| + |a - \zeta| = 1 + O((1 - a) + \log |a - \zeta|).

Let {\tilde \zeta} be the best approximation of {\zeta} on the arc {\partial D(1, 1) \cap \overline{D(0, 1)}}, which exists since that arc is compact; then

\displaystyle |\zeta - \tilde \zeta| = O((1 - a) + \log|a - \zeta|).

Since {\tilde \zeta \in \partial D(1, 1)}, it has the useful property that

\displaystyle \text{arg }\zeta \in [\pi/3,\pi/2] \cup [-\pi/2, -\pi/3];

therefore

\displaystyle \text{Re } \tilde \zeta^2 \leq -\frac{1}{2} |\tilde \zeta|^2.

Plugging in the expansion for {\tilde \zeta} we have

\displaystyle \text{Re } \zeta^2 \leq -\frac{1}{2} |\zeta|^2 + O(|\zeta|((1-a) + \log|a -\zeta|) + ((1 - a) + \log|a - \zeta|)^2).

We now use the inequality {2|zw| \leq |z|^2 + |w|^2} several times. First we bound

\displaystyle \frac{1}{2} |\zeta|((1-a) + \log|a -\zeta|) \leq \frac{1}{4}|\zeta|^2 + O(\log^2 |a - \zeta|).

I had to think a bit about why this is legal; the point is that you can absorb the implied constant on {\zeta} into the implied constant on {\log |a - \zeta|} before applying the inequality. Now we bound

\displaystyle ((1 - a) + \log|a - \zeta|)^2 = (1 - a)^2 + 2(1 - a)\log|a - \zeta| + \log^2 |a - \zeta| = O((1-a)^2 + \log^2 |a - \zeta|)

by similar reasoning.

Thus we conclude the bound

\displaystyle \text{Re } \zeta^2 \leq - \frac{1}{4} |\zeta|^2 + O((1-a)^2 + \log^2 |a - \zeta|),

or in other words,

\displaystyle \mathbf E \text{Re }\zeta^2 \leq -\frac{1}{4} \mathbf E |\zeta|^2 + O((1-a)^2+ \mathbf E \log^2 |a - \zeta|).

Applying the fine control lemma, or more precisely the result

\displaystyle 1 - a = O(\sigma^2 + (\varepsilon + o(1))^n),

as well as the fact that {1 - a} is infinitesimal, we have

\displaystyle (1-a)^2 = (1 - a) O(\sigma^2 + (\varepsilon + o(1))^n) = o(\sigma^2) + o(\varepsilon + o(1))^n)

for every standard {\varepsilon > 0}, hence by underspill

\displaystyle (1-a)^2 = o(\sigma^2) + o(1)^n.

By the fine control lemma,

\displaystyle U_\zeta(a) = o(\sigma^2) + o(1)^n.

Thus we bound

\displaystyle \mathbf E \log^2 |a - \zeta| \leq -\mathbf E \log |a - \zeta| = U_\zeta(a) = o(\sigma^2) + o(1)^n

owing to the fact that {|a - \zeta| \in [1, 2]} so that {\log |a - \zeta| \in [0, 1]}.

Plugging in the above bounds,

\displaystyle \mathbf E \text{Re }\zeta^2 \leq -\frac{1}{4} \mathbf E|\zeta|^2 + o(\sigma^2) + o(1)

By definition of variance we have

\displaystyle \mathbf E |\zeta|^2 - |\mu|^2 = \sigma^2

and {\mu} is infinitesimal so we can spend the {o(\sigma^2)} term as

\displaystyle \mathbf E\text{Re }\zeta^2 \leq -\frac{1+o(1)}{4} \mathbf E |\zeta|^2 + o(1)^n.

But the fine control lemma said

\displaystyle \mathbf E\text{Re }\zeta^2 = o(\sigma^2) + o(1)^n.

So

\displaystyle |\mu|^2 + \sigma^2 = o(1)^n.

In particular,

\displaystyle o(\sigma^2) = o(1)^n

since {\mu} is infinitesimal.

We used underspill to show

\displaystyle (1 - a)^2 = o(\sigma^2) + o(1)^n = o(1)^n

so

\displaystyle 1 - \varepsilon_0^n \geq a = 1 - o(1)^n > 1 - \varepsilon_0^n

since {\varepsilon_0} was standard, which implies {0 = 1}.

Next time, we’ll go back and fill in all the lemmata that we skipped in the proof for case one. This is a tricky bit — pages 25 through 34 of Tao’s paper. (For comparison, we covered pages 19 through 21, some of the exposition in pages 24 through 34, and pages 34 through 36 this time). Next time, then.

Let’s Read: Sendov’s conjecture in high degree, part 2: distribution of random zeroes

Before we begin, I want to fuss around with model theory again. Recall that if {z} is a nonstandard complex number, then {z^{(\infty)}} denotes the standard part of {z}, if it exists. We previously defined what it meant for a nonstandard random variable to be infinitesimal in distribution. One can define something similar for any metrizable space with a notion of {0}, where {f} is infinitesimal provided that {d(f, 0)} is. For example, a nonstandard random variable {\eta} is infinitesimal in {L^1_{loc}} if for every compact set {K} that {\eta} can take values in, {||\eta||_{L^1(K)}} is infinitesimal, since {L^1_{loc}} is metrizable with

\displaystyle d(0, \eta) = \sum_{m=1}^\infty 2^{-m} \frac{||\eta||_{L^1(K_m)}}{1 + ||\eta||_{L^1(K_m)}}

whenever {(K_m)} is a compact exhaustion. If {f} is nonstandard, {|f - f^{(\infty)}|} is infinitesimal in some metrizable space, and {f^{(\infty)}} is standard, then we call {f^{(\infty)}} the standard part of {f} in {\mathcal T}; then the standard part is unique since metrizable spaces are Hausdorff.

If the metrizable space is compact, the case that we will mainly be interested in, then the standard part exists. This is a point that we will use again and again. Passing to the cheap perspective, this says that if {K} is a compact metric space and {(f^{(n)})} is a sequence in {K}, then there is a {f^{(\infty)}} which is approximates {f^{(n)}} infinitely often, but that’s just the Bolzano-Weierstrass theorem. Last time used Prokohov’s theorem to show that if {\xi} is a nonstandard tight random variable, then {\xi} has a standard part {\xi^{(\infty)}} in distribution.

We now restate and prove Proposition 9 from the previous post.

Theorem 1 (distribution of random zeroes) Let {n} be a nonstandard natural, {f} a monic polynomial of degree {n} with all zeroes in {\overline{D(0, 1)}}, and let {a \in [0, 1]} be a zero of {f}. Suppose that {f'} has no zeroes in {\overline{D(a, 1)}}. Let {\lambda} be a random zero of {f} and {\zeta} a random zero of {f'}. Then:

  1. If {a^{(\infty)} = 0} (case zero), then {\lambda^{(\infty)}} and {\zeta^{(\infty)}} are identically distributed and almost surely lie in the curve

    \displaystyle C = \{e^{i\theta}: 2\theta \in [\pi, 3\pi]\}.

    In particular, {d(\lambda, C) = o(1)} in probability. Moreover, for every compact set {K \subseteq \overline{D(0, 1)} \setminus C},

    \displaystyle \mathbf P(\lambda \in K) = O\left(a + \frac{\log n}{n^{1/3}}\right).

  2. If {a^{(\infty)} = 1} (case one), then {\lambda^{(\infty)}} is uniformly distributed on the unit circle {\partial D(0, 1)} and {\zeta^{(\infty)}} is almost surely zero. Moreover,

    \displaystyle \mathbf E \log \frac{1}{|\lambda|}, \mathbf E\log |\zeta - a| = O(n^{-1}).

1. Moment-generating functions and balayage

We first show that {\lambda^{(\infty)}} and {\zeta^{(\infty)}} have equal moment-generating functions in a suitable sense.

To do this, we first show that they have the same logarithmic potential. Let {\eta} be a random variable such that {|\eta| = O(1)} almost surely (that is, {\eta} is almost surely bounded). Then the logarithmic potential

\displaystyle U_\eta(z) = \mathbf E \log \frac{1}{|z - \eta|}

is defined almost everywhere as we discussed last time, and is harmonic outside of the essential range of {\eta}.

Lemma 2 Let {\eta} be a nonstandard, almost surely bounded, random complex number. Then the standard part of {U_\eta} is {U_{\eta^{(\infty)}}} according to the topology of {L^1_{loc}} under Lebesgue measure.

Proof: We pass to the cheap perspective. If we instead have a random sequence of {\eta_j} and {\eta_j \rightarrow \eta} in distribution, then {U_{\eta_j} \rightarrow U_\eta} in {L^1_{loc}}, since up to a small error in {L^1_{loc}} we can replace {\log} with a test function {g}; one then has

\displaystyle \lim_{j \rightarrow \infty} \iint_{K \times \mathbf C} g\left(\frac{1}{|z - w|}\right) ~d\mu_j(w) ~dz = \iint_{K \times \mathbf C} g\left(\frac{1}{|z - w|}\right) ~d\mu(w) ~dz

where {\mu_j \rightarrow \mu} in the weak topology of measures, {\mu_j} is the distribution of {\eta_j}, {\mu} is the distribution of {\eta}, and {K} is a compact set equipped with Lebesgue measure. \Box

Lemma 3 For every {1 < |z| \leq 3/2}, we have

\displaystyle U_\lambda(z) - U_\zeta(z) = O\left(\frac{1}{n} \log \frac{1}{|z| - 1}\right).

In particular, {U_{\lambda^{(\infty)}}(z) = U_{\zeta^{(\infty)}}(z)}.

Proof: By definition, {\lambda \in D(0, 1)}, so {z - \lambda \in D(z, 1)}. Now {D(z, 1)} is a disc with diameter {T([|z| - 1, |z| + 1])} where {T} is a rotation around the origin. Taking reciprocals preserves discs and preserves {T}, so {(z - \lambda)^{-1}} sits inside a disc {W} with a diameter {T[(|z|+1)^{-1}, (|z|-1)^{-1}]}. Then {W} is convex, so the expected value of {(z - \lambda)^{-1}} is also {\in W}. Therefore the Stieltjes transform

\displaystyle s_\lambda(z) = \mathbf E \frac{1}{z - \lambda}

satisfies {s_\lambda(z) \in W}. In particular,

\displaystyle \log |s_\lambda(z)| \in \left[\log \frac{1}{|z| + 1}, \log \frac{1}{|z| - 1}\right].

But we showed that

\displaystyle U_\lambda(z) - \frac{n - 1}{n} U_\zeta(z) = \frac{1}{n} \log |s_\lambda(z)|

almost everywhere last time. This implies that for almost every {z},

\displaystyle -\frac{\log(|z| + 1)}{n} \leq U_\lambda(z) - \frac{n - 1}{n}U_\zeta(z) \leq -\frac{\log(|z| - 1)}{n}

but all terms here are continuous so we can promote this to a statement that holds for every {z}. In particular,

\displaystyle U_\lambda(z) - \frac{n - 1}{n} U_\zeta(z) = O\left(\frac{1}{n} \log \frac{1}{|z|-1}\right)

hence

\displaystyle U_\lambda(z) - U_\zeta(z) = O\left(\frac{1}{n} U_\zeta(z) + \frac{1}{n} \log \frac{1}{|z|-1}\right).

Since {|\zeta| < 1} while {1 < |z| < 3/2}, {|z - \zeta|} is bounded from above and below by a constant times {|z| - 1}. Therefore the same holds of its logarithm {U_\zeta(z)}, which is bounded from above and below by a constant times {-\log(|z| - 1)}. This implies the first claim.

To derive the second claim from the first, we use the previous lemma, which implies that we must show that

\displaystyle \log \frac{1}{|z| - 1} = O(n)

in {L^1_{loc}}. But this follows since {-\log|\cdot|} is integrable in two dimensions. \Box

Lemma 4 Let {\eta} be an almost surely bounded random variable. Then

\displaystyle U_\eta(Re^{i\theta}) = -\log R + \frac{1}{2} \sum_{m \neq 0} \frac{e^{im\theta}}{|m| R^{|m|}} \mathbf E\eta^{|m|}.

Proof: One has the Taylor series

\displaystyle \log \frac{1}{|Re^{i\theta} - w|} = -\log R + \frac{1}{2} \sum_{m \neq 0} \frac{e^{im\theta} w^{|m|}}{|m| R^{|m|}}.

Indeed, by rescaling and using {\log(ab) = \log a + \log b}, we may assume {R = 1}. The summands expand as

\displaystyle \text{Re }\frac{e^{im\theta} w^{|m|}}{|m| R^{|m|}} = \frac{w^{|m|} \cos |m|\theta}{|m|}

and the imaginary parts all cancel by symmetry about {0}. Using the symmetry about {0} again we get

\displaystyle -\log R + \frac{1}{2} \sum_{m \neq 0} \frac{e^{im\theta} w^{|m|}}{|m| R^{|m|}} = \sum_{m=1}^\infty \frac{w^{|m|} \cos |m|\theta}{|m|}.

This equals the left-hand side as long as {|w| < R}. Taking expectations and commuting the expectation with the sum using Fubini’s theorem (since {\eta} is almost surely bounded), we see the claim. \Box

Lemma 5 For all {m \geq 1}, one has

\displaystyle \mathbf E\lambda^m - \mathbf E\zeta^m = O\left(\frac{m \log m}{n}\right).

In particular, {\lambda^{(\infty)}} and {\zeta^{(\infty)}} have identical moments.

Proof: If we take {1 < R \leq 3/2} then we conclude that

\displaystyle \sum_{m \neq 0} \frac{e^{im\theta}}{|m| R^{|m|}} \mathbf E\lambda^{|m|} - \sum_{m \neq 0} \frac{e^{im\theta}}{|m| R^{|m|}} \mathbf E\zeta^{|m|} = O\left(\frac{1}{n} \log \frac{1}{R - 1}\right).

The left-hand side is a Fourier series, and by uniqueness of Fourier series it holds that for every {m},

\displaystyle \frac{e^{im\theta}}{|m| R^{|m|}} \mathbf E(\lambda^{|m|} - \zeta^{|m|}) = O\left(\frac{1}{n} \log \frac{1}{R - 1}\right).

This gives a bound on the difference of moments

\displaystyle \mathbf E\lambda^m - \mathbf E\zeta^m = O\left(\frac{m R^m}{n} \log \frac{1}{R - 1}\right)

which is only possible if the moments of {\lambda^{(\infty)}} and {\zeta^{(\infty)}} are identical. The left-hand side doesn’t depend on {R}, but if {m \geq 2}, {R = 1 + 1/m}, then {R^m \leq 2} and {-\log(R - 1) = \log m} so the claim holds. On the other hand, if {m = 1} then this claim still holds, since we showed last time that

\displaystyle \mathbf E\lambda = \mathbf E\zeta

and obviously {1 \log 1 = 0}. \Box

Here I was puzzled for a bit. Surely if two random variables have the same moment-generating function then they are identically distributed! But, while we can define the moment-generating function of a random variable as a formal power series {F}, it is not true that {F} has to have a positive radius of convergence, in which case the inverse Laplace transform of {F} is ill-defined. Worse, the circle is not simply connected, and in case one, we have to look at a uniform distribution on the circle, whose moments therefore aren’t going to points on the circle, so the moment-generating function doesn’t tell us much.

2. Balayage

We recall the definition of the Poisson kernel {P}:

\displaystyle P(Re^{i\theta}, re^{i\alpha}) = \sum_{m = -\infty}^\infty \frac{r^{|m|}}{R^{|m|}} e^{im(\theta - \alpha)}

whenever {0 < r < R} is a radius. Convolving the Poisson kernel against a continuous function {g} on {\partial B(0, R)} solves the Dirichlet problem of {B(0, R)} with boundary data {g}.

Definition 6 Let {\eta \in D(0, R)} be a random variable. The balayage of {\eta} is

\displaystyle \text{Bal}(\eta)(Re^{i\theta}) = \mathbf EP(Re^{i\theta}, \eta).

Balayage is a puzzling notion. First, the name refers to a hair-care technique, which is kind of unhelpful. According to Tao, we’re supposed to interpret balayage as follows.

If {w_0 \in B(0, R)} is an initial datum for Brownian motion {w}, then {P(Re^{i\theta}, w_0)} is the probability density of the first location {Re^{i\theta}} where {w} passes through {\partial B(0, R)}. Tao asserts this without proof, but conveniently, this was a problem in my PDE class last semester. The idea is to approximate {\mathbf R^2} by the lattice {L_\varepsilon = \varepsilon \mathbf Z^2}, which we view as a graph where each vertex has degree {4}, with one edge to each of the vertices directly above, below, left, and right of it. Then the Laplacian on {\mathbf R^2} is approximated by the graph Laplacian on {L_\varepsilon}, and Brownian motion is approximated by the discrete-time stochastic process wherein a particle starts at the vertex that best approximates {w_0} and at each stage has a {1/4} chance of moving to each of the vertices adjacent to its current position.

So suppose that {w_0} and {Re^{i\theta}} are actually vertices of {L_\varepsilon}. The probability density {P_\varepsilon(Re^{i\theta}, w_0)} is harmonic in {w_0} with respect to the graph Laplacian since it is the mean of {P_\varepsilon(Re^{i\theta}, w)} as {w} ranges over the adjacent vertices to {w_0}; therefore it remains harmonic as we take {\varepsilon \rightarrow 0}. The boundary conditions follow similarly.

Now {\eta} if is a random initial datum for Brownian motion which starts in {D(0, R)}, the balayage of {\eta} is again a probability density on {\partial B(0, R)} that records where one expects the Brownian motion to escape, but this time the initial datum is also random.

I guess the point is that balayage serves as a substitute for the moment-generating function in the event that the latter is just a formal power series. We want to be able to use analytic techniques on the moment-generating function, but we can’t, so we just use balayage instead.

Let {\psi} be the balayage of {\eta}. Since {\eta} is bounded, we can use Fubini’s theorem to commute the expectation with the sum and see that

\displaystyle \psi(Re^{i\theta}) = \sum_{m-\infty}^\infty R^{-|m|} e^{im\theta} \mathbf E(r^{|m|} e^{-im\alpha}) = 1 + 2\sum_{m=1}^\infty R^{-|m|} \cos m\theta \mathbf E(r^{|m|} \cos m\alpha)

provided that {\eta = re^{i\alpha}}. It will be convenient to rewrite this in the form

\displaystyle \psi(Re^{i\theta}) = 1 + 2\text{Re} \sum_{m=1}^\infty R^{-m}e^{im\theta} \mathbf E\eta^m

so {\psi} is uniquely determined by the moment-generating function of {\eta}. In particular, {\lambda^{(\infty)}} and {\zeta^{(\infty)}} have identical balayage, and one has a bound

\displaystyle \text{Bal}(\lambda)(Re^{i\theta}) - \text{Bal}(\zeta)(Re^{i\theta}) = O\left(\frac{1}{n}\sum_{m=1}^\infty \frac{m \log m}{R^m}\right).

We claim that

\displaystyle \sum_{m=1}^\infty \frac{m \log m}{R^m} = O\left(-\frac{\log(R-1)}{(R - 1)^2}\right)

which implies the bound

\displaystyle \text{Bal}(\lambda)(Re^{i\theta}) - \text{Bal}(\zeta)(Re^{i\theta}) = O\left(\frac{1}{n}\frac{\log\frac{1}{R-1}}{(R - 1)^2}\right).

To see this, we discard the {m = 1} term since {1 \log 1 = 0}, which implies that

\displaystyle \sum_{m=1}^\infty \frac{m \log m}{R^m} = \sum_{M=1}^\infty \sum_{m=2^M}^{2^{M+1} - 1} \frac{m \log m}{R^m}.

Up to a constant factor we may assume that the logarithms are base {2} in which case we get a bound

\displaystyle \sum_{m=1}^\infty \frac{m \log m}{R^m} \leq C\sum_{M=1}^\infty \frac{M2^M}{R^{2^M}}.

The constant is absolute since {R \in (1, 3/2]}.

By the integral test, we get a bound

\displaystyle \sum_{M=1-\log(R-1)}^\infty \frac{M2^M}{R^{2^M}} \leq C\int_{-\log(R-1)}^\infty \frac{x2^x}{R^{2^x}} ~dx \leq C\int_{-\log(R-1)}^\infty \frac{2^{x^{(1+\varepsilon)}}}{R^{2^x}} ~dx.

Using the bound

\displaystyle \int_{1/(R-1)}^\infty \frac{dy}{R^y} \leq CR^{-1/(R-1)} \leq C2^{-1/(R-1)}

for any {N} and the change of variable {y = 2^x} (thus {dy = 2^x \log 2 ~dx}), we get a bound

\displaystyle \sum_{M=1-\log(R-1)}^\infty \frac{M2^M}{R^{2^M}} \leq C \int_{-\log(R-1)} \frac{dy}{R^y} \leq C2^{-1/(R-1)}

since the {\varepsilon} error in the exponent can’t affect the exponential decay of the integral in {1/(R-1)}. Since we certainly have

\displaystyle 2^{-1/(R-1)} \leq C\frac{-\log(R-1)}{(R-1)^2}

this is a suitable tail bound.

To complete the proof of the claim we need to bound the main term. To this end we bound

\displaystyle \sum_{M=1}^{-\log(R-1)} \frac{M2^M}{R^{2^M}} \leq \log\frac{1}{R-1} \sup_{x > 0} \frac{x2^x}{R^{2^x}} = \log\frac{1}{R-1} 2 \uparrow \sup_y \frac{y \log y}{R^y}.

Here {\alpha \uparrow \beta = \alpha^\beta} denotes exponentiation. Now if {R - 1} is small enough (say {R - 1 < 3/4}), this supremum will be attained when {x > 1}, thus {y \log y \leq 2y}. Therefore

\displaystyle \sum_{M=1}^{-\log(R-1)} \frac{M2^M}{R^{2^M}} \leq \left(2\uparrow \sup_{y > 0} \frac{y}{R^y}\right)^2 \log\frac{1}{R-1} .

Luckily {yR^{-y}} is easy to differentiate: its critical point is {1/y = \log R}. This gives

\displaystyle \sup_{y > 0} \frac{y}{R^y} \leq \log \frac{1}{R - 1}

so

\displaystyle \left(2\uparrow \sup_{y > 0} \frac{y}{R^y}\right)^2 \leq \frac{1}{(R-1)^2}

which was the bound we needed, and proves the claim. Maybe there’s an easier way to do this, because Tao says the claim is a trivial consequence of dyadic decomposition.

Let’s interpret the bound that we just proved. Well, if the balayage of {\eta} is supposed to describe the point on the circle {\partial B(0, R)} at which a Brownian motion with random initial datum {\eta} escapes, a bound on a difference of two balyages should describe how the trajectories diverge after escaping. In this case, the divergence is infinitesimal, but at different speeds depending on {R}. As {R \rightarrow 1}, our infinitesimal divergence gains a positive standard part, while if {R} stays close to {3/2}, the divergence remains infinitesimal. This makes sense, since if we take a bigger circle we forget more and more about the fact that {\zeta,\lambda} are not the same random variable, since Brownian motion has more time to “forget more stuff” as it just wanders around aimlessly. So in the regime where {R} is close to {3/2}, it is reasonable to take standard parts and pass to {\zeta^{(\infty)}} and {\lambda^{(\infty)}}, while in the regime where {R} is close to {1} this costs us dearly.

3. Case zero

Suppose that {a} is infinitesimal.

We showed last time that {\zeta \in \overline{D(0, 1)} \setminus \overline{D(a, 1)}}, so {d(\zeta, C) = O(a)} is infinitesimal. Therefore {\zeta^{(\infty)} \in C} almost surely.

I think there’s a typo here, because Tao lets {K} range over {D(0, 1) \setminus C} and considers points {e^{i\theta} \in D(0, 1) \setminus C}, which don’t exist since {|e^{i\theta}| = 1} while every point in {D(0, 1)} has {|\cdot| < 1}. I think this can be fixed by taking closures, which is what I do in the next lemma.

Tao proves a “qualitative” claim and then says that by repeating the argument and looking out for constants you can get a “quantitative” version which is what he actually needs. I’m just going to prove the quantitative argument straight-up. The idea is that if {K} is a compact set which misses {C} and {\lambda \in K} then a Brownian motion with initial datum {\lambda} will probably escape through an arc {J} which is close to {K}, but {J} is not close to {C} so a Brownian motion which starts at {\zeta} will probably not escape through {J}. Therefore {\lambda,\zeta} have very different balayage, even though the difference in their balayage was already shown to be infinitesimal.

I guess this shows the true power of balayage: even though the moment-generating function is “just” a formal power series, we know that the essential supports of {\lambda,\zeta} must “look like each other” up to rescaling in radius. This still holds in case one, where one of them is a circle and the other is the center of the circle. Either way, you get the same balayage, since whether you start at some point on a circle or you start in the center of the circle, if you’re a Brownian motion you will exhibit the same long-term behavior.

In the following lemmata, let {K \subset \overline{D(0, 1)} \setminus C} be a compact set. The set {\{\theta \in (-\pi/2, \pi/2): e^{i\theta} \in K\}} is compact since it is the preimage of a compact set, so contained a compact interval {I_K \subseteq (-\pi/2, \pi/2)}.

Lemma 7 One has

\displaystyle \inf_{w \in K} \int_{I_K} P(Re^{i\theta}, w) ~d\theta > 0.

Proof: Since {K} is compact the minimum is attained. Let {w} be the minimum. Since {P} is a real-valued harmonic function in {w}, thus

\displaystyle \Delta \int_{I_K} P(Re^{i\theta}, w) ~d\theta = \int_{I_K} \Delta P(Re^{i\theta}, w) ~d\theta = 0,

the maximum principle implies that the worst case is when {K} meets {\partial D(0, R)} and {w \in \partial D(0, R)}, say {w = Re^{i\alpha}}. Then

\displaystyle P(Re^{i\theta}, w) = \sum_{m=-\infty}^\infty e^{im(\theta - \alpha)}.

Of course this is just a formal power series and doesn’t make much sense. But if instead {w = re^{i\alpha}} where {r/R} is very small depending on a given {\varepsilon > 0}, then, after discarding quadratic terms in {r/R},

\displaystyle P(Re^{i\theta}, w) \leq \frac{1 + \varepsilon}{1 - 2(r/R)\cos(\theta - \alpha)}.

This follows since in general

\displaystyle P(Re^{i\theta}, w) = \frac{1 - (r/R)^2}{1 - 2(r/R) \cos(\theta - \alpha) + (r/R)^2}.

Now

\displaystyle \int_{I_K^c} \frac{d\theta}{1 - 2(r/R)\cos(\theta - \alpha)} < \pi

since the integrand is maximized when {\cos(\theta - \alpha) = 0}, in which case the integrand evaluates to the measure of {I_K^c}, which is {< \pi} since {I_K^c = (-\pi/2, \pi/2) \setminus I_K} and {I_K} has positive measure. Therefore

\displaystyle \int_{I_K^c} P(Re^{i\theta}, w) ~d\theta < \frac{3\pi}{2}.

On the other hand, for any {w} one has

\displaystyle \int_{-\pi/2}^{\pi/2} P(Re^{i\theta}, w) ~d\theta = 2\pi,

so this implies gives a lower bound on the integral over {I_K}. \Box

Lemma 8 If {1 < R \leq 3/2} then

\displaystyle \mathbf P(\lambda \in K) \leq C_K\left(a + R - 1 + \frac{\log \frac{1}{R - 1}}{n(R-1)^2} \right).

Proof: Let {w = \lambda} in the previous lemma, conditioning on the event {\lambda \in K}, to see that

\displaystyle \int_{I_K} P(Re^{i\theta}, \lambda) ~d\theta \geq \delta_K

where {\delta_K > 0}. Taking expectations and dividing by the probability that {\lambda \in K}, we can use Fubini’s theorem to deduce

\displaystyle \mathbf P(\lambda \in K) \leq C_K \int_{I_K} \text{Bal}(\lambda)(Re^{i\theta}) ~d\theta

where {C_K\delta_K = 1}. Applying the bound on {|\text{Bal}(\lambda) - \text{Bal}(\zeta)|} from the section on balayage, we deduce

\displaystyle \mathbf P(\lambda \in K) \leq C_K \int_{I_K} \text{Bal}(\zeta)(Re^{i\theta}) ~d\theta + C_K\frac{\log\frac{1}{R-1}}{n(R-1)^2}.

We already showed that {d(\zeta, C) = O(a)}. So in order to show

\displaystyle \int_{I_K} \text{Bal}(\zeta)(Re^{i\theta}) ~d\theta \leq C_K(a + R - 1),

which was the bound that we wanted, it suffices to show that for every {re^{i\alpha}} such that {d(re^{i\alpha}, C) = O(a)},

\displaystyle \int_{I_K} P(Re^{i\theta}, re^{i\alpha}) ~d\theta \leq C_K(a + R - 1).

Tao says that “one can show” this claim, but I wasn’t able to do it. I think the point is that under those cirumstances one has {r = R - O(a)} and {\cos \alpha \ll a} even as {\cos \theta \gg 0}, so we have some control on {\cos(\theta - \alpha)}. In fact I was able to compute

\displaystyle \int_{I_K} P(Re^{i\theta}, re^{i\alpha}) ~d\theta = -\sum_m (r/R)^{|m|}\frac{e^{-im(\alpha + \delta) + e^{-im(\alpha - \delta)}}}{m}

which suggests that this is the right direction, but the bounds I got never seemed to go anywhere. Someone bug me in the comments if there’s an easy way to do this that I somehow missed. \Box

Now we take {R = 1 + n^{-1/3}} to complete the proof.

4. Case one

Suppose that {1 - a} is infinitesimal. Let {\mu} be the expected value of {\lambda} (hence also of {\zeta}). Let {0 < \delta \leq 1/2} be a standard real.

We first need to go on an excursion to a paper of Dégot, who proves the following theorem:

Lemma 9 One has

\displaystyle |f'(a)| \geq cn |f(\delta)|.

Moreover,

\displaystyle |f(\delta)| \leq (1 + \delta^2 - 2\delta \text{Re }\mu)^{n/2}.

I will omit the proof since it takes some complex analysis I’m pretty unfamiliar with. It seems to need Grace’s theorem, which I guess is a variant of one of the many theorems in complex analysis that says that the polynomial image of a disk is kind of like a disk. It also uses some theorem called the Walsh contraction principle that involves polynomials on the projective plane. Curious.

In what follows we will say that an event {E} is standard-possible if the probability that {E} happens has positive standard part.

Lemma 10 For every {\varepsilon > 0}, {\mathbf P(\text{Re }\zeta \leq \varepsilon)} is standard-possible. Besides, {|f'(a)| > n}.

Proof: Since {|\zeta - a| > 1} almost surely and

\displaystyle U_\zeta(a) = -\frac{\log n}{n - 1} - \frac{1}{n - 1} \log |f'(a)|

but

\displaystyle U_\zeta(a) = -\mathbf E \log |\zeta - a| < 0,

we have

\displaystyle |f'(a)| > n.

Combining this with the lemma we see that the standard part of {|f(\delta)|} is {> 0}, so

\displaystyle 1^{1/n} \leq O(\sqrt{1 + \delta^2 + \delta\text{Re }\mu}).

On the other hand,

\displaystyle 1 - O(n^{-1}) \leq 1^{1/n}

and since {n} is nonstandard, {1/n} is infinitesimal, so the constant in {O(\sqrt{1 + \delta^2 + \text{Re }\mu})} gets eaten. In particular,

\displaystyle 1 - O(n^{-1}) \leq \sqrt{1 + \delta^2 + \delta\text{Re }\mu}

which implies that

\displaystyle 1 + o(1) \leq 1 + \delta^2 + \delta\text{Re }\mu

and hence

\displaystyle \text{Re }\mu \leq \frac{\delta}{2} + o(1).

Since this is true for arbitrary standard {\delta}, underspill implies that there is an infinitesimal {\kappa} such that

\displaystyle \text{Re }\mu \leq \kappa.

But {|\text{Re }\zeta| \leq 1} almost surely, and we just showed

\displaystyle \mathbf E\text{Re }\zeta \leq \kappa.

So the claim holds. \Box

We now allow {\delta} to take the value {0}, thus {0 \leq \delta \leq 1/2}.

Lemma 11 One has

\displaystyle |f(0)| \sim |f(\delta)| \sim 1

and

\displaystyle |f'(a)| \sim n.

Moreover, {|f(z)| \sim 1} if {|z - 1/2| < 1/100}, so {f} has no zeroes {z} in that disk.

Proof: Since

\displaystyle \mathbf E \log\frac{1}{|z - \zeta|} = -\frac{\log n}{n - 1} - \frac{1}{n - 1} \log |f'(z)|

one has

\displaystyle \log |f'(a)| - \log |f'(\delta)| = (n-1)\mathbf E \frac{|a - \zeta|}{|\delta - \zeta|}.

Now {|a - \zeta| \geq 1} and {|\zeta| \leq 1}.

Here I drew two unit circles in {\mathbf C}, one entered at the origin and one at {1} (since {|a - 1|} is infinitesimal); {\zeta} is (up to infinitesimal error) in the first circle and out of the second. The rightmost points of intersection between the two circles are on a vertical line which by the Pythagorean theorem is to the left of the vertical line {x = a/2}, which in turn is to the left of the perpendicular bisector {x = (a+\delta)/2} {[\delta, a]}. Thus {|a - \zeta| \geq |\delta - \zeta|}, and if {|\delta - \zeta| = |a - \zeta|} then the real part of {\zeta} is {(a+\delta)/2}. In particular, if the standard real part of {\zeta} is {< 1/2} then {|a - \zeta| > |\delta - \zeta|}, so {\log |a - \zeta|/|\delta - \zeta|} has positive standard part.

By the previous lemma, it is standard-possible that the standard real part of {\zeta} is {\leq 1/4 < 1/2}, so the standard real part of {\zeta} is standard-possibly positive and {\mathbf E \log|a-\zeta|/|\delta - \zeta|} is almost surely nonnegative. Plugging into the above we deduce the existence of a standard absolute constant {c > 0} such that

\displaystyle \log |f'(a)| - \log |f'(\delta)| \geq cn.

In particular,

\displaystyle f'(\delta) \leq |f'(\delta)| \leq e^{-cn} |f'(a)|.

Keeping in mind that {|f'(a)| > n} is nonstandard, this doesn’t necessarily mean that {f'(\delta)} has nonpositive standard part, but it does give a pretty tight bound. Taking a first-order Taylor approximation we get

\displaystyle f(0) = f(\delta) + O(e^{-cn|f'(a)|}).

But one has

\displaystyle |f'(a)| \geq cn |f(\delta)|

from the Dégot lemma. Clearly this term dominates {e^{-cn}|f'(a)|} so we have

\displaystyle |f(0)| \geq \frac{c}{n} |f'(a)|.

Since one has a lower bound {|f'(a)| > n} this implies {|f(0)|} is controlled from below by an absolute constant.

We also claim {|f(0)| \leq 1}. In fact, we showed last time that

\displaystyle -U_\lambda(0) = \frac{1}{n} \log |f(0)|;

we want to show that {\log |f(0)| \leq 0}, so it suffices to show that {U_\lambda(0) \geq 0}, or in other words that

\displaystyle \mathbf E \log |\lambda| \leq 0.

Since {|\lambda| \leq 1} by assumption on {f}, this is trivial. We deduce that

\displaystyle |f(0)| \sim |f(\delta)| \sim 1

and hence

\displaystyle |f'(a)| \sim n.

Now Tao claims that the proof that {|f(z)| \sim 1} is similar, if {|z - 1/2| < 1/100}. Since {\delta = 1/2} was a valid choice of {\delta} we have {|f(1/2)| \sim 1}. Since {|z - 1/2| < 1/100}, if {\text{Re }\zeta \leq 1/4} then {|a - \zeta|/|z - \zeta| \geq c > 1} where {c} is an absolute constant. Applying the fact that {\text{Re }\zeta \leq 1/4} is standard-possible and {\mathbf E \log|a-\zeta|/|z - \zeta|} is almost surely nonnegative we get

\displaystyle f'(z) \leq e^{-cn} |f'(a)|

so we indeed have the claim. \Box

We now prove the desired bound

\displaystyle \mathbf E \log \frac{1}{|\lambda|} \leq O(n^{-1}).

Actually,

\displaystyle \mathbf E \log \frac{1}{|\lambda|} = \frac{1}{n} \log \frac{1}{|f(0)|}

as we proved last time, so the bound {|f(0)| \sim 1} guarantees the claim.4

In particular

\displaystyle \mathbf E \log \frac{1}{|\lambda^{(\infty)}|} = 0

by Fatou’s lemma. So {|\lambda^{(\infty)}| = 1} almost surely. Therefore {U_{\lambda^{(\infty)}}} is harmonic on {D(0, 1)}, and we already showed that {|f(z)| \sim 1} if {|z - 1/2|} was small enough, thus

\displaystyle U_\lambda(z) = O(n^{-1})

if {|z - 1/2|} was small enough. That implies {U_{\lambda^{(\infty)}} = 0} on an open set and hence everywhere. Since

\displaystyle U_\eta(Re^{i\theta}) = \frac{1}{2} \sum_{m \neq 0} \frac{e^{im\theta}}{|m|} \mathbf E\eta^{|m|}

we can plug in {\eta = \lambda^{(\infty)}} and conclude that all moments of {\lambda^{(\infty)}} except the zeroth moment are zero. So {\lambda^{(\infty)}} is uniformly distributed on the unit circle.

By overspill, I think one can intuit that if {f} is a random polynomial of high degree which has a zero close to {1}, all zeroes in {D(0, 1)}, and no critical point close to {a}, then {f} sort of looks like

\displaystyle z \mapsto \prod_{k=0}^{n-1} z - \omega^k

where {\omega} is a primitive root of unity of the same degree as {f}. Therefore {f} looks like a cyclotomic polynomial, and therefore should have lots of zeroes close to the unit sphere, in particular close to {1}, a contradiction. This isn’t rigorous but gives some hint as to why this case might be bad.

Now one has

\displaystyle \mathbf E \log |\zeta - a| = \frac{1}{n} \log \frac{|f'(a)|}{n} = O(n^{-1})

and in particular by Fatou’s lemma

\displaystyle \mathbf E \log |\zeta^{(\infty)} - 1| = 0.

But it was almost surely true that {\zeta^{(\infty)} \notin D(1, 1)}, thus that {\log |\zeta^{(\infty)} - 1| \geq 0}. So this enforces {\zeta^{(\infty)} \in \partial D(1, 1)} almost surely. In particular, almost surely,

\displaystyle \zeta^{(\infty)} \in \partial D(1, 1) \cap \overline{D(0, 1)} = \gamma.

Since {\gamma} is a contractible curve, its complement is connected. We recall that {U_{\lambda^{(\infty)}} = U_{\zeta^{(\infty)}}} near infinity, and since we already know the distribution of {\lambda^{(\infty)}}, we can use it to compute {U_{\zeta^{(\infty)}}} near infinity. Tao says the computation of {U_{\zeta^{(\infty)}}} is a straightforward application of the Newtonian shell theorem; he’s not wrong but I figured I should write out the details.

For {\eta = \lambda^{(\infty)}} one has

\displaystyle U_\eta(z) = \mathbf E \log \frac{1}{|z - \eta|} = \frac{1}{2\pi} \int_{\partial D(0, 1)} \log \frac{1}{|z - w|} ~d|w|

where the {d|w|} denotes that this is a line integral in {\mathbf R^2} rather than in {\mathbf C}. Translating we get

\displaystyle U_\eta(z) =- \frac{1}{2\pi} \int_{\partial D(z, 1)} \log |w| ~d|w|

which is the integral of the fundamental solution of the Laplace equation over {\partial D(z, 1)}. If {|z| > 1} (reasonable since {z} is close to infinity), this implies the integrand is harmonic, so by the mean-value formula one has

\displaystyle U_\eta(z) = -\log |z|

and so this holds for both {\eta = \lambda^{(\infty)}} and {\eta = \zeta^{(\infty)}} near infinity. But then {\zeta^{(\infty)}} is harmonic away from {\gamma}, so that implies that

\displaystyle U_{\zeta^{(\infty)}} = \log \frac{1}{|z|}.

Since the distribution {\nu} of {\zeta^{(\infty)}} is the Laplacian of {U_{\zeta^{(\infty)}}} one has

\displaystyle \nu = \Delta \log \frac{1}{|z|} = \delta_0.

Therefore {\zeta^{(\infty)} = 0} almost surely. In particular, {\zeta} is infinitesimal almost surely. This completes the proof in case one.

By the way, I now wonder if when one first learns PDE it would be instructive to think of the fundamental solution of the Laplace equation and the mean-value formulae as essentially a consequence of the classical laws of gravity. Of course the arrow of causation actually points the other way, but we are humans living in a physical world and so have a pretty intuitive understanding of what gravity does, while stuff like convolution kernels seem quite abstract.

Next time we’ll prove a contradiction for case zero, and maybe start on the proof for case one. The proof for case one looks really goddamn long, so I’ll probably skip or blackbox some of it, maybe some of the earlier lemmata, in the interest of my own time.

Let’s Read: Sendov’s conjecture in high degree, part 1

Having some more free time than usual, I figured I would read a recent paper that looked interesting. However, I’m something of a noob at math, so I figure it’s worth it to take it slowly and painstakingly think through the details. This will be a sort of stream-of-conciousness post where I do just that.

The paper I’ll be reading will be Terry Tao’s “Sendov’s conjecture for sufficiently high degree polynomials.” This paper looks interesting to me because it applies “cheap nonstandard” methods to prove a result in complex analysis. In addition it uses probability-theoretic methods, which I’m learning a lot of right now. Sendov’s conjecture is the following:

Sendov’s conjecture Let {f: \mathbf C \rightarrow \mathbf C} be a polynomial of degree {n \geq 2} that has all zeroes {\leq 1}. If {a} is a zero then {f'} has a zero in {\overline{D(a, 1)}}.

Without loss of generality, we may assume that {f} is monic and that {a \in [0, 1]}. Indeed, if {f} is a polynomial in the variable {z} we divide by the argument of {z} and then rescale by the top coefficient.

Tao notes that by Tarski’s theorem, in principle it suffices to get an upper bound {n_0} on the degree of a counterexample and then use a computer-assisted proof to complete the argument. I think that for every {n \leq n_0} you’d get a formula in the theory of real closed fields that could be decided in {O(n^r)} time, where {r} is an absolute constant (which, unfortunately, is exponential in the number of variables of the formula, and so is probably quite large). Worse, Tao is going to use a compactness argument and so is going to get an astronomical bound {n_0}. Still, something to keep in mind — computer-assisted proofs seem like the future in analysis.

More precisely, Tao proves the following:

Proposition 1 For every {n} in a monotone sequence in {\mathbf N}, let {f} be a monic polynomial of degree {n} with all zeroes in {\overline{D(0, 1)}} and let {a \in [0, 1]} satisfy {f(a) = 0}. If for every {n}, {f'} has no zeroes in {\overline{D(a, 1)}}, then {0 = 1}.

It’s now pretty natural to see how “cheap nonstandard” methods apply. One can pass to a subsequence countably many times and still preserve the hypotheses of the proposition, so by diagonalization and compactness, we may assume good (but ineffective) convergence properties. For example, we can assume that {a = a^{(\infty)} + o(1)}, where {a^{(\infty)}} does not depend on {n} and {o(1)} is with respect to {n}.

Using overspill, one can view the proposition model-theoretically: it says that if {n} is a nonstandard natural number, {f} a monic polynomial of degree {n} with a zero {a \in [0, 1]}, and there are no zeroes of {f'} in {\overline{D(a, 1)}}, then {0 = 1}. Tao never fully takes this POV, but frequently appeals to results like the following:

Proposition 2 Let {P} be a first-order predicate. Then:

  1. (Overspill) If for every sufficiently small standard {\varepsilon}, {P(\varepsilon)}, then there is an infinitesimal {\delta} such that {P(\delta)}.
  2. (Underspill) If for every infinitesimal {\delta}, {P(\delta)}, then there are arbitrarily small {\varepsilon} such that {P(\varepsilon)}.
  3. ({\aleph_0}-saturation) If {P} is {\forall z \in K(f(z) = O(1))} where {K \subseteq \mathbf C} is compact, then the implied constant in the statement of {P} is independent of {z}.

Henceforth we will use asymptotic notation in the nonstandard sense; for example, a quantity is {o(1)} if it is infinitesimal. This is equivalent to the cheap nonstandard perspective where a quantity is {o(1)} iff it is with respect to {n}, where {n} is ranging over some monotone sequence of naturals. I think the model-theoretic perspective is helpful here because we are going to pass to subsequences a lot, and at least in the presence of the boolean prime ideal theorem, letting {n} be a fixed nonstandard natural number is equivalent to choosing a nonprincipal ultrafilter on {\mathbf N} that picks out the subsequences we are going to pass to, in the perspective where {n} ranges over a monotone sequence of standard naturals.

This follows because the Stone-Cech compactification {\beta \mathbf N} is exactly the space of ultrafilters on {\mathbf N}. Indeed, if {n} ranges over a monotone sequence of standard naturals, then in {\beta \mathbf N}, {n} converges to an element {U} of {\beta \mathbf N \setminus \mathbf N}, which then is a nonprincipal ultrafilter. If {\mathbf N^\omega/U} denotes the ultrapower of {\mathbf N} with respect to {U}, then I think the equivalence class of the sequence {\{1, 2, \dots\}} in {\mathbf N^\omega/U} is exactly the limit of {n}. Conversely, once a nonprincipal ultrafilter {U \in \beta \mathbf N \setminus \mathbf N} has been fixed, we have a canonical way to pass to subsequences: only pass to a subsequence which converges to {U}. This is possible since {\beta \mathbf N} is compact.

I think it will be at times convenient to go back to the “monotone sequence of standard naturals” perspective, especially when we’re doing computations, so I reserve the right to go between the two. We’ll call the monotone sequence perspective the “cheap perspective” and the model-theoretic perspective the “expensive perspective”.

I’m not familiar with the literature on Sendov’s conjecture, so I’m going to blackbox the reduction that Tao carries out. The reduction says that, due to the Gauss-Lucas theorem and previously existing partial results on Sendov’s conjecture, to prove Proposition 1, it suffices to show:

Proposition 3 Let {n} be a nonstandard natural, let {f} be a monic polynomial of degree {n} with all zeroes in {\overline{D(0, 1)}} and let {a \in [0, 1]} satisfy {f(a) = 0}. Suppose that {f'} has no zeroes in {\overline{D(0, 1)}} and

  1. (Theorem 3.1 in Tao) either {a = o(1/\log n)}, or
  2. (Theorem 5.1 in Tao) there is a standard {\varepsilon_0 > 0} such that

    \displaystyle 1 - o(1) \leq a \leq 1 - \varepsilon_0^n.

Then {0 = 1}.

In the former case we have {a = o(1)} and in the latter we have {|a - 1| = o(1)}. We’ll call the former “case zero” and the latter “case one.”

Tao gives a probabilistic proof of Proposition 3, and that’s going to be the bulk of this post and its sequels. Let {\zeta} be a random zero of {f'}, drawn uniformly from the finite set of zeroes. Le {\lambda} denote a random zero of {f} chosen independently of {\zeta}.

In the cheap perspective, {\zeta} depends on {n}, we are going to study properties of the convergence of {\zeta} as {n \rightarrow \infty}, by using our chosen ultrafilter to repeatedly pass to subsequences to make {\zeta} converge in some suitable topology. The probability spaces that {\zeta} lives in depend on {n}, but as long as we are interested in a deterministic limit {c} that does not depend on {n}, this is no problem. Indeed {\zeta} will converge to {c} uniformly (resp. in probability) provided that for every {\varepsilon > 0}, {|\zeta - c| \leq \varepsilon} almost surely (resp. {\mathbf P(|\zeta - c| \leq \varepsilon) = 1 - o(1)}), and this makes sense even though the probability space we are studying depends on {n}. The usual definition of convergence in distribution still makes sense even for random variables {\zeta} converges to a random variable {X} that deos not depend on {n} provided that their distributions {\zeta_*\mathbf P} converge vaguely to {X_*\mathbf P}.

Okay, it’s pretty obvious what being infinitesimally close to a deterministic standard real is in the uniform or probabilistic sense. Expanding out the definition of the vague topology of measures, a nonstandard measure {\mu} on a locally compact Hausdorff space {Q} is infinitesimally close to a standard measure {\nu} provided that for every continuous function {f} with compact support,

\displaystyle \left|\int_Q f~d\mu - \int_Q f~d\nu\right| = o(1).

This induces a definition of being infinitesimally close in distribution.

Okay, no more model-theoretic games, it’s time to start the actual proof.

Definition 4 Let {\eta} be a bounded complex random variable. The logarithmic potential of {\eta} is

\displaystyle U_\eta(z) = \mathbf E \log \frac{1}{|z - \eta|}.

Here {\mathbf E} denotes expected value. Tao claims but does not prove that this definition makes sense for almost every {z \in \mathbf C}. To check this, let {K} be a compact set {\subseteq \mathbf C} equipped with Lebesgue measure {\mu} and let {\nu} be the distribution of {\zeta}. Then

\displaystyle \int_K U_\eta(z) ~d\mu(z) = \int_K \int_\mathbf C \log \frac{1}{|z - \omega|} ~d\nu(\omega) ~d\mu(z)

and the integrand is singular along the set {\{z = \omega\}}, which has real codimension {2} in {K \times \text{supp} \nu}. The double integral of a logarithm makes sense almost surely provided that the logarithm blows up with real codimension {2} (to see this, check the double integral of log {1/x} on {\mathbf R^2}) so this looks good.

Definition 5 Let {\eta} be a bounded complex random variable. The Stieltjes transform of {\eta} is

\displaystyle s_\eta(z) = \mathbf E \frac{1}{z - \eta}.

Then {s_\eta} is “less singular” than {U_\eta}, so this definition is inoffensive almost everywhere.

Henceforth Tao lets {\mu_\eta} denote the distribution of {\eta}, where {\eta} is any bounded complex random variable. Then {s_\eta = -\partial U_\eta} and {\mu_\eta = \overline \partial s_\eta/2\pi} where {\partial,\overline \partial} are the complex derivative and Cauchy-Riemann operators respectively. Since {\partial \overline \partial = \Delta} we have

\displaystyle 2\pi\mu_n = -\Delta s_\eta.

This just follows straight from the definitions. Of course {\mu_\eta} might not be an absolutely continuous measure, so this only makes sense if we use the calculus of distributions.

Does this make sense if {\eta} is deterministic, say {\eta = 0} almost surely? In that case {\mu_\eta} is a Dirac measure at {0} and {s_\eta(z) = 1/z}. Everything looks good, since {U_\eta(z) = -\log |z|}.

For the next claims I need the Gauss-Lucas theorem:

Theorem 6 (Gauss-Lucas) If {P} is a polynomial on {\mathbf C}, then all zeroes of {P'} belong to the convex hull of the variety {P = 0}.

Lemma 7 (Lemma 1.6i in Tao) {\lambda} surely lies in {\overline{D(0, 1)}} and {\zeta} surely lies in {\overline{D(0, 1)} \setminus \overline{D(a, 1)}}.

Proof: The first claim is just a tautology. For the second, by assumption on {f} all zeroes of {f} lie in the convex set {\overline{D(0, 1)}}, so so does their convex hull. In particular {\zeta} lies in {\overline{D(0, 1)}} almost surely by the Gauss-Lucas theorem. Our contradiction hypothesis says that {\zeta \notin \overline{D(a, 1)}}. \Box

Lemma 8 (Lemma 1.6ii-iv in Tao) One has {\mathbf E\lambda = \mathbf E\zeta}. For almost every {z \in \mathbf C},

\displaystyle U_\lambda(z) = -\frac{1}{n} \log |f(z)|

and

\displaystyle U_\zeta(z) = -\frac{\log n}{n - 1} - \frac{1}{n - 1} \log |f'(z)|.

Moreover,

\displaystyle s_\lambda(z) = \frac{1}{n} \frac{f'(z)}{f(z)}

and

\displaystyle s_\zeta(z) = \frac{1}{n - 1} \frac{f''(z)}{f'(z)}.

Moreover,

\displaystyle U_\lambda(z) - \frac{n-1}{n}U_\zeta(z) = \frac{1}{n} \log |s_\lambda(z)|

and

\displaystyle s_\lambda(z) - \frac{n - 1}{n} s_\zeta(z) = -\frac{1}{n} \frac{s'_\lambda(z)}{s_\lambda(z)}.

Proof: We pass to the cheap perspective, so {n} is a large standard natural. Since {n} is large, in particular {n \geq 2}, if {b_{n-1}} is the coefficient of {z^{n-1}} in {f} then the roots of {f} sum to {-b_{n-1}}. The roots of {f'} sum to {-(n-1)b_{n-1}/n} by calculus. So {\mathbf E\lambda = \mathbf E\zeta}.

We write

\displaystyle f(z) = \prod_{j=1}^n z - \lambda_j

and

\displaystyle f'(z) = n\prod_{j=1}^{n-1} z - \zeta_j.

Taking {-\log|\cdot|} of both sides and then dividing by {n} we immediately get {U_\lambda} and {U_\zeta}. Then we take the complex derivative of both sides of the {U_\lambda} and {U_\zeta} formulae to get the formulae for {s_\lambda} and {s_\zeta}.

Now the formula for {\log |s_\lambda|} follows by subtracting the above formulae, as does the formula for {s_\lambda'/s_\lambda}. \Box

Since the distributions of {\lambda} and {\zeta} have bounded support (it’s contained in {\overline{D(0, 1)}}) by Prokhorov’s theorem we can find standard random variables {\lambda^{(\infty)}} and {\zeta^{(\infty)}} such that {\lambda - \lambda^{(\infty)}} is infinitesimal in distribution and similarly for {\zeta}. The point is that {\lambda^{(\infty)}} and {\zeta^{(\infty)}} give, up to an infinitesimal error, information about the behavior of {f}, {f'}, and {f''} by the above lemma and the following proposition.

Proposition 9 (Theorem 1.10 in Tao) One has:

  1. In case zero, {\lambda^{(\infty)}} and {\zeta^{(\infty)}} are identically distributed and almost surely lie in {C = \{e^{i\theta}: 2\theta \in [\pi, 3\pi]\}}, so {d(\lambda, C)} is infinitesimal in probability. Moreover, for every compact set {K \subseteq \overline{D(0, 1)} \setminus C},

    \displaystyle \mathbf P(\lambda \in K) = O\left(a + \frac{\log n}{n^{1/3}}\right).

  2. In case one, {\lambda^{(\infty)}} is uniformly distributed on {\partial D(0, 1)} and {\zeta^{(\infty)}} is almost surely zero. Moreover,

    \displaystyle \mathbf E \log \frac{1}{|\lambda|}, \mathbf E\log |\zeta - \eta| = O\left(\frac{1}{n}\right).

The proposition gives quantitative bounds that force the zeroes to all be in certain locations. Looking ahead in the paper, it follows that:

  1. In case zero, the Stieltjes transform of {\lambda^{(\infty)}} is infinitesimally close to {f'/nf}, so by a stability-of-zeroes argument to show that {f} has no zeroes near the origin, even though {a = o(1)}.
  2. In case one, if {\sigma} is the standard deviation of {\zeta}, then we have control on the zeroes of {f} up to an error of size {o(\sigma^2) + o(1)^n}, which we can then use to deduce a contradiction.

Next time I’ll start the proof of this proposition. Its proof apparently follows from the theory of Newtonian potentials, which is not too surprising since {-\Delta \log x = \delta(x)} if {\Delta} is the Laplacian of {\mathbf R^2}. It needs the following lemma:

Lemma 10 (Lemma 1.6vi in Tao) If {\gamma} is a curve in {\mathbf C} that misses the zeroes of {f} and {f'} then

\displaystyle f(\gamma(1)) = f(\gamma(0)) \exp\left(n \int_\gamma s_\lambda(z) ~dz\right)

and

\displaystyle f'(\gamma(1)) = f'(\gamma(0)) \exp\left((n-1) \int_\gamma s_\zeta(z) ~dz\right).

Proof: One has

\displaystyle ns_\lambda(z) = \frac{f'(z)}{f(z)}

by the previous lemma. Breaking up {\gamma} into finitely many parts one can assume that {\gamma} is a contractible curve in a simply connected set, in which case we have a branch of the logarithm along {\gamma}. Now apply the fundamental theorem. The case for {s_\zeta} is the same. \Box