Before we begin, I want to fuss around with model theory again. Recall that if is a nonstandard complex number, then
denotes the standard part of
, if it exists. We previously defined what it meant for a nonstandard random variable to be infinitesimal in distribution. One can define something similar for any metrizable space with a notion of
, where
is infinitesimal provided that
is. For example, a nonstandard random variable
is infinitesimal in
if for every compact set
that
can take values in,
is infinitesimal, since
is metrizable with
whenever is a compact exhaustion. If
is nonstandard,
is infinitesimal in some metrizable space, and
is standard, then we call
the standard part of
in
; then the standard part is unique since metrizable spaces are Hausdorff.
If the metrizable space is compact, the case that we will mainly be interested in, then the standard part exists. This is a point that we will use again and again. Passing to the cheap perspective, this says that if is a compact metric space and
is a sequence in
, then there is a
which approximates
infinitely often, but that’s just the Bolzano-Weierstrass theorem. Last time we used Prokhorov’s theorem to show that if
is a nonstandard tight random variable, then
has a standard part
in distribution.
We now restate and prove Proposition 9 from the previous post.
Theorem 1 (distribution of random zeroes) Let
be a nonstandard natural,
a monic polynomial of degree
with all zeroes in
, and let
be a zero of
. Suppose that
has no zeroes in
. Let
be a random zero of
and
a random zero of
. Then:
- If
(case zero), then
and
are identically distributed and almost surely lie in the curve
In particular,
in probability. Moreover, for every compact set
,
- If
(case one), then
is uniformly distributed on the unit circle
and
is almost surely zero. Moreover,
1. Moment-generating functions and balayage
We first show that and
have equal moment-generating functions in a suitable sense.
To do this, we first show that they have the same logarithmic potential. Let be a random variable such that
almost surely (that is,
is almost surely bounded). Then the logarithmic potential
is defined almost everywhere as we discussed last time, and is harmonic outside of the essential range of .
Lemma 2 Let
be a nonstandard, almost surely bounded, random complex number. Then the standard part of
is
according to the topology of
under Lebesgue measure.
Proof: We pass to the cheap perspective. If we instead have a random sequence of and
in distribution, then
in
, since up to a small error in
we can replace
with a test function
; one then has
where in the weak topology of measures,
is the distribution of
,
is the distribution of
, and
is a compact set equipped with Lebesgue measure.
Lemma 3 For every
, we have
In particular,
.
Proof: By definition, , so
. Now
is a disc with diameter
where
is a rotation around the origin. Taking reciprocals preserves discs and preserves
, so
sits inside a disc
with a diameter
. Then
is convex, so the expected value of
is also
. Therefore the Stieltjes transform
satisfies . In particular,
But we showed that
almost everywhere last time. This implies that for almost every ,
but all terms here are continuous so we can promote this to a statement that holds for every . In particular,
hence
Since while
,
is bounded from above and below by a constant times
. Therefore the same holds of its logarithm
, which is bounded from above and below by a constant times
. This implies the first claim.
To derive the second claim from the first, we use the previous lemma, which implies that we must show that
in . But this follows since
is integrable in two dimensions.
Lemma 4 Let
be an almost surely bounded random variable. Then
Proof: One has the Taylor series
Indeed, by rescaling and using , we may assume
. The summands expand as
and the imaginary parts all cancel by symmetry about . Using the symmetry about
again we get
This equals the left-hand side as long as . Taking expectations and commuting the expectation with the sum using Fubini’s theorem (since
is almost surely bounded), we see the claim.
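As a numerical sanity check on this kind of expansion, here is a minimal Python sketch. It assumes the series in question is (a rescaling of) the classical identity log(1/|1 - w|) = Re(w + w^2/2 + w^3/3 + ...) for |w| < 1. The function name, the truncation length and the test point are my own choices for illustration.

```python
import math

def log_potential_series(w, terms=200):
    """Truncated series Re(sum over m >= 1 of w^m / m), valid for |w| < 1.
    (Assumed form of the Taylor series; the truncation length is arbitrary.)"""
    return sum((w ** m / m).real for m in range(1, terms + 1))

w = 0.3 + 0.4j  # any point with |w| < 1; here |w| = 0.5
print(log_potential_series(w), -math.log(abs(1 - w)))
```

The two printed numbers should agree to many digits, because the tail of the series is geometrically small when |w| is bounded away from 1.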
Lemma 5 For all
, one has
In particular,
and
have identical moments.
Proof: If we take then we conclude that
The left-hand side is a Fourier series, and by uniqueness of Fourier series it holds that for every ,
This gives a bound on the difference of moments
which is only possible if the moments of and
are identical. The left-hand side doesn’t depend on
, but if
,
, then
and
so the claim holds. On the other hand, if
then this claim still holds, since we showed last time that
and obviously .
Here I was puzzled for a bit. Surely if two random variables have the same moment-generating function then they are identically distributed! But, while we can define the moment-generating function of a random variable as a formal power series , it is not true that
has to have a positive radius of convergence, and when it does not, the inverse Laplace transform of
is ill-defined. Worse, the circle is not simply connected, and in case one we have to look at a uniform distribution on the circle, whose moments are therefore not points on the circle (all of them except the zeroth vanish), so the moment-generating function doesn’t tell us much.
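To see how little the moments can say, here is a small Python check (my own illustration, not something from Tao's paper). It compares the moments of the uniform distribution on the unit circle, discretized by equally spaced points, with the moments of a point mass at the origin. The function names and the number of sample points are assumptions of the sketch.

```python
import cmath

def circle_moment(k, n_points=64):
    """k-th moment E[z^k] of the uniform distribution on the unit circle,
    approximated (an assumption of this sketch) by n_points equally spaced points."""
    total = 0j
    for j in range(n_points):
        z = cmath.exp(2j * cmath.pi * j / n_points)
        total += z ** k
    return total / n_points

def origin_moment(k):
    """k-th moment of the point mass at 0."""
    return 1 + 0j if k == 0 else 0j

for k in range(6):
    # Difference between the two k-th moments; it should be at rounding level.
    print(k, abs(circle_moment(k) - origin_moment(k)))
```

Both distributions have zeroth moment 1 and every moment of order 1 up to n_points - 1 equal to zero (up to rounding), even though one is a circle and the other is its center.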
2. Balayage
We recall the definition of the Poisson kernel :
whenever is a radius. Convolving the Poisson kernel against a continuous function
on
solves the Dirichlet problem of
with boundary data
.
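Here is a hedged numerical sketch of that statement in Python. It assumes the unit disc and the normalization in which the Poisson kernel is the sum over all integers m of r^|m| e^{im theta} (so its average over the circle is 1); the formula above may differ by a factor of 2 pi. The function names, the boundary datum cos(theta) and the grid size are my own choices.

```python
import math

def poisson_kernel(r, theta):
    """Closed form of the Poisson kernel of the unit disc, normalized to average 1.
    (Normalization is an assumption of this sketch.)"""
    return (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(theta) + r * r)

def poisson_series(r, theta, terms=200):
    """Same kernel as a truncated series: sum over m of r^|m| e^{i m theta}."""
    return 1.0 + 2.0 * sum(r ** m * math.cos(m * theta) for m in range(1, terms))

def solve_dirichlet(f, r, alpha, n=512):
    """Value at r * e^{i alpha} of the harmonic extension of the boundary datum f,
    computed as a Riemann sum of the Poisson integral over n boundary points."""
    total = 0.0
    for j in range(n):
        theta = 2.0 * math.pi * j / n
        total += poisson_kernel(r, alpha - theta) * f(theta)
    return total / n

r, alpha = 0.5, 0.3
# The harmonic extension of cos(theta) is Re(z), so the two numbers should agree.
print(solve_dirichlet(math.cos, r, alpha), r * math.cos(alpha))
```

The comparison works because the harmonic function with boundary values cos(theta) on the unit circle is the real part of z, which equals r cos(alpha) at the point r e^{i alpha}.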
Definition 6 Let
be a random variable. The balayage of
is
Balayage is a puzzling notion. First, the name refers to a hair-care technique, which is kind of unhelpful. According to Tao, we’re supposed to interpret balayage as follows.
If is an initial datum for Brownian motion
, then
is the probability density of the first location
where
passes through
. Tao asserts this without proof, but conveniently, this was a problem in my PDE class last semester. The idea is to approximate
by the lattice
, which we view as a graph where each vertex has degree 4, with one edge to each of the vertices directly above, below, left, and right of it. Then the Laplacian on
is approximated by the graph Laplacian on
, and Brownian motion is approximated by the discrete-time stochastic process wherein a particle starts at the vertex that best approximates
and at each stage has a 1/4 chance of moving to each of the vertices adjacent to its current position.
So suppose that and
are actually vertices of
. The probability density
is harmonic in
with respect to the graph Laplacian since it is the mean of
as
ranges over the adjacent vertices to
; therefore it remains harmonic as we take
. The boundary conditions follow similarly.
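The lattice argument can be tried out on a toy case. The Python sketch below is my own illustration: it replaces the disc by a small square grid (an assumption made purely for convenience) and computes, for every interior vertex, the probability that the simple random walk first hits the boundary on the top side. It does so by solving the discrete mean-value equation directly rather than by simulating walks. The grid size and number of sweeps are arbitrary.

```python
def exit_probability_top(n=9, sweeps=2000):
    """Probability that the simple random walk on an n x n grid, started at each
    interior vertex, first reaches the boundary on the top side.

    Solves h(v) = mean of h over the 4 neighbours of v by Gauss-Seidel sweeps,
    with h = 1 on the top side and h = 0 on the rest of the boundary."""
    h = [[0.0] * n for _ in range(n)]
    for col in range(1, n - 1):
        h[0][col] = 1.0  # top side; corners are excluded since the walk never reaches them
    for _ in range(sweeps):
        for row in range(1, n - 1):
            for col in range(1, n - 1):
                h[row][col] = 0.25 * (h[row - 1][col] + h[row + 1][col]
                                      + h[row][col - 1] + h[row][col + 1])
    return h

h = exit_probability_top()
centre = len(h) // 2
# By the symmetry of the four sides, the value at the centre should be close to 1/4.
print(h[centre][centre])
```

The function h is harmonic for the graph Laplacian by construction, which is exactly the mean-value property used in the argument above.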
Now if is a random initial datum for Brownian motion which starts in
, the balayage of
is again a probability density on
that records where one expects the Brownian motion to escape, but this time the initial datum is also random.
I guess the point is that balayage serves as a substitute for the moment-generating function in the event that the latter is just a formal power series. We want to be able to use analytic techniques on the moment-generating function, but we can’t, so we just use balayage instead.
Let be the balayage of
. Since
is bounded, we can use Fubini’s theorem to commute the expectation with the sum and see that
provided that . It will be convenient to rewrite this in the form
so is uniquely determined by the moment-generating function of
. In particular,
and
have identical balayage, and one has a bound
We claim that
which implies the bound
To see this, we discard the term since
, which implies that
Up to a constant factor we may assume that the logarithms are base in which case we get a bound
The constant is absolute since .
By the integral test, we get a bound
Using the bound
for any and the change of variable
(thus
), we get a bound
since the error in the exponent can’t affect the exponential decay of the integral in
. Since we certainly have
this is a suitable tail bound.
To complete the proof of the claim we need to bound the main term. To this end we bound
Here denotes exponentiation. Now if
is small enough (say
), this supremum will be attained when
, thus
. Therefore
Luckily is easy to differentiate: its critical point is
. This gives
so
which was the bound we needed, and proves the claim. Maybe there’s an easier way to do this, because Tao says the claim is a trivial consequence of dyadic decomposition.
Let’s interpret the bound that we just proved. Well, if the balayage of is supposed to describe the point on the circle
at which a Brownian motion with random initial datum
escapes, a bound on a difference of two balayages should describe how the trajectories diverge after escaping. In this case, the divergence is infinitesimal, but at different speeds depending on
. As
, our infinitesimal divergence gains a positive standard part, while if
stays close to
, the divergence remains infinitesimal. This makes sense, since if we take a bigger circle we forget more and more about the fact that
are not the same random variable, since Brownian motion has more time to “forget more stuff” as it just wanders around aimlessly. So in the regime where
is close to
, it is reasonable to take standard parts and pass to
and
, while in the regime where
is close to
this costs us dearly.
3. Case zero
Suppose that is infinitesimal.
We showed last time that , so
is infinitesimal. Therefore
almost surely.
I think there’s a typo here, because Tao lets range over
and considers points
, which don’t exist since
while every point in
has
. I think this can be fixed by taking closures, which is what I do in the next lemma.
Tao proves a “qualitative” claim and then says that by repeating the argument and looking out for constants you can get a “quantitative” version, which is what he actually needs. I’m just going to prove the quantitative version straight-up. The idea is that if is a compact set which misses
and
then a Brownian motion with initial datum
will probably escape through an arc
which is close to
, but
is not close to
so a Brownian motion which starts at
will probably not escape through
. Therefore
have very different balayage, even though the difference in their balayage was already shown to be infinitesimal.
I guess this shows the true power of balayage: even though the moment-generating function is “just” a formal power series, we know that the essential supports of must “look like each other” up to rescaling in radius. This still holds in case one, where one of them is a circle and the other is the center of the circle. Either way, you get the same balayage, since whether you start at some point on a circle or you start in the center of the circle, if you’re a Brownian motion you will exhibit the same long-term behavior.
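The circle-versus-center remark can be checked numerically. The Python sketch below is my own illustration and makes several assumptions: balayage is taken onto the unit circle, it is computed as the expectation of the Poisson kernel normalized to average 1, and the circle has radius 0.9 rather than 1 so that the kernel stays finite. All names and parameters are hypothetical.

```python
import math

def poisson_kernel(r, theta):
    """Poisson kernel of the unit disc, normalized to average 1 (an assumption)."""
    return (1.0 - r * r) / (1.0 - 2.0 * r * math.cos(theta) + r * r)

def balayage_density(points, theta):
    """Balayage density at e^{i theta} of the uniform distribution on a finite list
    of starting points, each given in polar form (r, alpha)."""
    return sum(poisson_kernel(r, theta - alpha) for r, alpha in points) / len(points)

n = 256
centre = [(0.0, 0.0)]                                       # point mass at the origin
circle = [(0.9, 2.0 * math.pi * j / n) for j in range(n)]   # discretized circle of radius 0.9

for theta in (0.0, 1.0, 2.5):
    print(theta, balayage_density(centre, theta), balayage_density(circle, theta))
```

Both columns should be essentially constant and equal, which is the statement that a Brownian motion started at the center and one started at a uniformly random point of a concentric circle exit through the same distribution on the boundary.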
In the following lemmata, let be a compact set. The set
is compact since it is the preimage of a compact set, so it is contained in a compact interval
.
Lemma 7 One has
Proof: Since is compact the minimum is attained. Let
be the minimum. Since
is a real-valued harmonic function in
, thus
the maximum principle implies that the worst case is when meets
and
, say
. Then
Of course this is just a formal power series and doesn’t make much sense. But if instead where
is very small depending on a given
, then, after discarding quadratic terms in
,
This follows since in general
Now
since the integrand is maximized when , in which case the integrand evaluates to the measure of
, which is
since
and
has positive measure. Therefore
On the other hand, for any one has
so this gives a lower bound on the integral over .
Lemma 8 If
then
Proof: Let in the previous lemma, conditioning on the event
, to see that
where . Taking expectations and dividing by the probability that
, we can use Fubini’s theorem to deduce
where . Applying the bound on
from the section on balayage, we deduce
We already showed that . So in order to show
which was the bound that we wanted, it suffices to show that for every such that
,
Tao says that “one can show” this claim, but I wasn’t able to do it. I think the point is that under those circumstances one has and
even as
, so we have some control on
. In fact I was able to compute
which suggests that this is the right direction, but the bounds I got never seemed to go anywhere. Someone bug me in the comments if there’s an easy way to do this that I somehow missed.
Now we take to complete the proof.
4. Case one
Suppose that is infinitesimal. Let
be the expected value of
(hence also of
). Let
be a standard real.
We first need to go on an excursion to a paper of Dégot, who proves the following theorem:
Lemma 9 One has
Moreover,
I will omit the proof since it takes some complex analysis I’m pretty unfamiliar with. It seems to need Grace’s theorem, which I guess is a variant of one of the many theorems in complex analysis that says that the polynomial image of a disk is kind of like a disk. It also uses some theorem called the Walsh contraction principle that involves polynomials on the projective plane. Curious.
In what follows we will say that an event is standard-possible if the probability that
happens has positive standard part.
Lemma 10 For every
,
is standard-possible. Besides,
.
Proof: Since almost surely and
but
we have
Combining this with the lemma we see that the standard part of is
, so
On the other hand,
and since is nonstandard,
is infinitesimal, so the constant in
gets eaten. In particular,
which implies that
and hence
Since this is true for arbitrary standard , underspill implies that there is an infinitesimal
such that
But almost surely, and we just showed
So the claim holds.
We now allow to take the value
, thus
.
Lemma 11 One has
and
Moreover,
if
, so
has no zeroes
in that disk.
Proof: Since
one has
Now and
.
Here I drew two unit circles in , one centered at the origin and one at
(since
is infinitesimal);
is (up to infinitesimal error) in the first circle and out of the second. The rightmost points of intersection between the two circles are on a vertical line which by the Pythagorean theorem is to the left of the vertical line
, which in turn is to the left of the perpendicular bisector
. Thus
, and if
then the real part of
is
. In particular, if the standard real part of
is
then
, so
has positive standard part.
By the previous lemma, it is standard-possible that the standard real part of is
, so the standard real part of
is standard-possibly positive and
is almost surely nonnegative. Plugging into the above we deduce the existence of a standard absolute constant
such that
In particular,
Keeping in mind that is nonstandard, this doesn’t necessarily mean that
has nonpositive standard part, but it does give a pretty tight bound. Taking a first-order Taylor approximation we get
But one has
from the Dégot lemma. Clearly this term dominates so we have
Since one has a lower bound this implies
is controlled from below by an absolute constant.
We also claim . In fact, we showed last time that
we want to show that , so it suffices to show that
, or in other words that
Since by assumption on
, this is trivial. We deduce that
and hence
Now Tao claims that the proof that is similar, if
. Since
was a valid choice of
we have
. Since
, if
then
where
is an absolute constant. Applying the fact that
is standard-possible and
is almost surely nonnegative we get
so we indeed have the claim.
We now prove the desired bound
Actually,
as we proved last time, so the bound guarantees the claim.
In particular
by Fatou’s lemma. So almost surely. Therefore
is harmonic on
, and we already showed that
if
was small enough, thus
if was small enough. That implies
on an open set and hence everywhere. Since
we can plug in and conclude that all moments of
except the zeroth moment are zero. So
is uniformly distributed on the unit circle.
By overspill, I think one can intuit that if is a random polynomial of high degree which has a zero close to
, all zeroes in
, and no critical point close to
, then
sort of looks like
where is a primitive root of unity of the same degree as
. Therefore
looks like a cyclotomic polynomial, and therefore should have lots of zeroes close to the unit circle, in particular close to
, a contradiction. This isn’t rigorous but gives some hint as to why this case might be bad.
Now one has
and in particular by Fatou’s lemma
But it was almost surely true that , thus that
. So this enforces
almost surely. In particular, almost surely,
Since is a contractible curve, its complement is connected. We recall that
near infinity, and since we already know the distribution of
, we can use it to compute
near infinity. Tao says the computation of
is a straightforward application of the Newtonian shell theorem; he’s not wrong but I figured I should write out the details.
For one has
where the denotes that this is a line integral in
rather than in
. Translating we get
which is the integral of the fundamental solution of the Laplace equation over . If
(reasonable since
is close to infinity), this implies the integrand is harmonic, so by the mean-value formula one has
and so this holds for both and
near infinity. But then
is harmonic away from
, so that implies that
Since the distribution of
is the Laplacian of
one has
Therefore almost surely. In particular,
is infinitesimal almost surely. This completes the proof in case one.
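For what it's worth, the shell-theorem computation is easy to test numerically. The Python sketch below is my own illustration under stated assumptions: the circle is the unit circle centered at the origin, the potential uses the sign convention log(1/|z - w|), and the integral is replaced by an average over equally spaced points. The function names and sample points are hypothetical.

```python
import cmath
import math

def circle_log_potential(z, n=4096):
    """Average of log(1/|z - w|) over n equally spaced points w on the unit circle.
    (Sign convention and discretization are assumptions of this sketch.)"""
    total = 0.0
    for j in range(n):
        w = cmath.exp(2j * cmath.pi * j / n)
        total += -math.log(abs(z - w))
    return total / n

def shell_prediction(z):
    """Two-dimensional shell theorem: outside the circle the potential matches that of
    a unit point mass at the center; inside it is constant (zero in this convention)."""
    return -math.log(max(abs(z), 1.0))

for z in (2 + 1j, -3j, 0.3 - 0.4j):
    print(z, circle_log_potential(z), shell_prediction(z))
```

Outside the circle the averaged potential is indistinguishable from that of a point mass at the center, which is the mean-value formula at work; inside, it is constant.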
By the way, I now wonder if when one first learns PDE it would be instructive to think of the fundamental solution of the Laplace equation and the mean-value formulae as essentially a consequence of the classical laws of gravity. Of course the arrow of causation actually points the other way, but we are humans living in a physical world and so have a pretty intuitive understanding of what gravity does, while stuff like convolution kernels seems quite abstract.
Next time we’ll prove a contradiction for case zero, and maybe start on the proof for case one. The proof for case one looks really goddamn long, so I’ll probably skip or blackbox some of it, maybe some of the earlier lemmata, in the interest of my own time.