# Sundry facts about pseudodifferential operators

In this blog post I will just record some things I’ve been trying to learn about lately, largely just so I can have a place to collect my thoughts. Most of this is in Hörmander’s monograph on differential operators, and is motivated by trying to understand Vasy’s method and Atiyah-Singer index theory.

Pseudodifferential operators on manifolds.

Let us recall that a symbol on an open subset X of $\mathbb R^d$ is by definition a smooth function on the cotangent bundle of X (for which certain seminorms are finite). This was curious to me: you can motivate it by saying that a symbol is an observable and the cotangent bundle is “phase space” in the sense that a point $(x, \xi) \in T^*X$ consists of a position x and a momentum $\xi$, but why should the momentum live in a cotangent space and not the fiber of some other vector bundle? When we quantize a symbol a, defining an operator a(x, D) by formally substituting the differential operator $D = -i\nabla$ in place of the momentum, we by definition obtain a pseudodifferential operator. Now let $\kappa: X \to Y$ be a diffeomorphism, and introduce the pushforward symbol $\kappa_* a(y, \eta) = e^{-iy\eta} a(\kappa^{-1}(y), D) e^{iy\eta}$, where $y = \kappa(x)$ is viewed as a function of x and D differentiates in x. This is the “right” definition in the sense that $((\kappa_*a)(y, D)u)(\kappa(x)) = a(x, D)(u \circ \kappa)(x)$.

If a is a symbol of order m, then $\kappa_* a(y, \eta) = a(\kappa^{-1}(y), \kappa'(\kappa^{-1}(y))^t \eta)$ modulo symbols of order $m - 1$. But $\kappa'(x)$ is invariantly defined as an isomorphism of tangent spaces $\kappa'(x): T_xX \to T_{\kappa(x)}Y$, so its transpose should be an isomorphism $\kappa'(x)^t: T^*_{\kappa(x)}Y \to T^*_xX$ of the dual spaces. This only makes sense if $\eta \in T^*_yY$ is a covector at y.
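As a sanity check on this transformation law, here is a minimal numerical sketch in Python for the simplest symbol $a(x, \xi) = \xi$, whose quantization is $D = -i\,d/dx$. The diffeomorphism `kappa` is a hypothetical choice made only for illustration, and the pushforward is computed by conjugating D with the oscillation $e^{i\kappa(x)\eta}$, which is my reading of the definition above. For a first-order differential operator there are no lower-order corrections, so the result should be $\kappa'(x)\eta$ up to discretization error.

```python
import cmath

def kappa(x):
    # Hypothetical diffeomorphism of the real line (assumption for illustration);
    # its derivative 1 + 3x^2 never vanishes.
    return x + x**3

def kappa_prime(x):
    return 1.0 + 3.0 * x**2

def D(f, x, h=1e-5):
    # D = -i d/dx, approximated by a central difference with step h.
    return -1j * (f(x + h) - f(x - h)) / (2.0 * h)

def pushforward_symbol(x, eta):
    # e^{-i kappa(x) eta} D e^{i kappa(x) eta}, evaluated at the point x.
    def oscillation(z):
        return cmath.exp(1j * kappa(z) * eta)
    return cmath.exp(-1j * kappa(x) * eta) * D(oscillation, x)

x0, eta0 = 0.7, 2.0
numeric = pushforward_symbol(x0, eta0)
expected = kappa_prime(x0) * eta0  # transpose of kappa'(x) applied to the covector eta
discrepancy = abs(numeric - expected)
print(discrepancy)
```

The discrepancy is only the finite-difference error; the point is that the covector $\eta$ at $y = \kappa(x)$ gets pulled back to $\kappa'(x)\eta$ at x, exactly as a cotangent vector should.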

The above paragraphs are totally obvious, and yet puzzled me for the past three years, until last week when I sat down and decided to work out the details for myself.

The consequence is that we cannot define the full symbol of a pseudodifferential operator on a manifold invariantly, since the transformation law above only holds modulo lower-order symbols. Rather, we declare that a pseudodifferential operator A has the property that for every chart $\kappa: X \to Y$ and every pair of cutoffs $\phi, \psi$ on Y, the operator $\phi \circ \kappa_* \circ A \circ \kappa^* \circ \psi$ is a pseudodifferential operator on Y (in the sense that it is the quantization of a symbol on Y; here the pushforward $\kappa_*$ is defined to be the inverse of the pullback $\kappa^*$). Since Y is an open subset of $\mathbb R^d$ this makes sense.

Previously we have discussed pseudodifferential operators on manifolds M. These can be viewed more abstractly as acting on sections of the trivial line bundle $M \times \mathbb C$. However, in geometry one frequently has to deal with sections of more general vector bundles over M. For example, a 1-form is a section of the cotangent bundle. If E, F are vector bundles over M of rank r, s respectively, one may define the Hom-bundle Hom(E, F), which locally is isomorphic to the matrix bundle $M \times \mathbb C^{s \times r}$. Then a pseudodifferential operator from sections of E to sections of F is nothing more than a linear map which, after trivialization of E and F, looks like an $s \times r$ matrix of pseudodifferential operators on M. The principal symbol of such an operator sends the cotangent bundle of M into the Hom-bundle Hom(E, F).

Wavefront sets.

In this section we will impose that all pseudodifferential operators have Schwartz kernels K such that the projections of supp K are both proper maps. Modulo the space $\Psi^{-\infty}$ of pseudodifferential operators of order $-\infty$, this assumption is no loss of generality. Under this assumption, the top-order term of a symbol — that is, the principal symbol — satisfies the pushforward formula $\kappa_* a(y, \eta) = a(\kappa^{-1}(y), \kappa'(\kappa^{-1}(y))^t \eta)$, so the principal symbol is well-defined as an element of $S^m/S^{m-1}$ (here $S^\ell$ is the $\ell$th symbol class). The principal symbol encodes important information about the nature of the operator; for example we have:

Definition. An elliptic pseudodifferential operator of order m is one whose principal symbol is $\sim |\xi|^m$ near infinity of each cotangent space.

The important property is that if A is an elliptic pseudodifferential operator, then A is invertible modulo $\Psi^{-\infty}$, the quantization of $S^{-\infty}$. For example the Laplace-Beltrami operator is elliptic on Riemannian manifolds since its principal symbol is $\xi^2$, the squared length of $\xi$ in the metric; since the quadratic form induced by a Lorentzian metric is indefinite, it follows that on Lorentzian manifolds, the Laplace-Beltrami operator is not elliptic. Since a Lorentzian Laplace-Beltrami operator is really just the d’Alembertian, whose symbol is $\xi^2 - \tau^2$, this should be no surprise.
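The dichotomy can be illustrated numerically. The sketch below samples a homogeneous symbol on the unit circle of a single cotangent space of a two-dimensional manifold and reports the smallest modulus found. A minimum bounded away from zero is consistent with ellipticity, while a minimum of zero exhibits a characteristic direction. The function names and the sample count are my own choices, and sampling is only a heuristic, not a proof.

```python
import math

def min_modulus_on_unit_circle(symbol, samples=720):
    """Smallest value of |symbol| over equally spaced points of the unit circle."""
    smallest = float("inf")
    for k in range(samples):
        angle = 2.0 * math.pi * k / samples
        value = abs(symbol(math.cos(angle), math.sin(angle)))
        smallest = min(smallest, value)
    return smallest

def laplace_symbol(xi, eta):
    # Principal symbol of the Euclidean Laplacian in two dimensions.
    return xi**2 + eta**2

def dalembert_symbol(xi, tau):
    # Principal symbol of the d'Alembertian in 1+1 dimensions.
    return xi**2 - tau**2

laplace_min = min_modulus_on_unit_circle(laplace_symbol)
dalembert_min = min_modulus_on_unit_circle(dalembert_symbol)
print(laplace_min, dalembert_min)
```

The first minimum stays at 1 up to rounding, while the second drops to zero (up to rounding) along the diagonal directions $\tau = \pm\xi$, which are exactly the lightlike covectors.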

Recall that a conic set in a vector space is a set which is closed under multiplication by positive scalars. A conic set in a vector bundle, then, is one which is conic in every fiber.

Definition. Let a be the principal symbol of a pseudodifferential operator A of order m. We say that A is noncharacteristic near $(x_0, \xi_0) \in T^*M$ if there is a conic neighborhood of $(x_0, \xi_0)$ wherein $a(x, \xi) \sim |\xi|^m$ near infinity. Otherwise, we say that $(x_0, \xi_0)$ is a characteristic point. The set of characteristic points is denoted Char A and the set of noncharacteristic points is denoted Ell A.

Thus a pseudodifferential operator A is noncharacteristic at $(x, \xi)$ if in a neighborhood of x, A is elliptic when restricted to the direction $\xi$. By definition, Char A is closed, so we may make the following definition.

Definition. Let u be a distribution. The wavefront set WF(u) is the intersection of all sets Char A, where A ranges over pseudodifferential operators such that $Au \in C^\infty$.

Then WF(u) is a closed conic subset of the cotangent bundle $T^*M$, and its projection to M is exactly the singular support ss(u). Indeed, $x \notin ss(u)$ iff for every pseudodifferential operator A supported in a sufficiently small neighborhood of x, $Au \in C^\infty$; in other words no matter how hard we try, we cannot force u to become singular near x without differentiating it away from x. The wavefront set also remembers the direction in which this singularity happens; by elliptic invertibility, if $Au \in C^\infty$ then u has no singularity in any direction in which A is noncharacteristic.

For example, the only way that $u(x, y) = \delta_{y = 0}$ can be made smooth is by cutting u off away from $\{(x, y): y = 0\}$, which can be done by pseudodifferential operators of order 0 which are elliptic in the x-direction but cannot be elliptic in the y-direction along the x-axis.

Pseudotransport equations.

Hyperbolic operators are meant to generalize the transport equation $(\partial_t - \partial_x)u(t, x) = 0$. Let us therefore begin by studying the “pseudotransport” equation $(\partial_t + a(t, x, D_x))u(t, x) = 0$.

We assume that $t \mapsto a(t, x, D_x)$ is uniformly bounded in $S^1$ and continuous in $C^\infty$, and the real part of a is uniformly bounded from below. Then we have the energy estimate

$\displaystyle \frac{1}{2} \int_0^T ||e^{-\lambda t} u(t)||_{H^s}^p \lambda~dt \leq ||u(0)||_{H^s}^p$

valid for any $s \in \mathbb R$ and $\lambda$ large enough depending on s. Applying the Hahn-Banach theorem we conclude that for all initial data in $H^s$ we can find $u \in C^0([0, \infty) \to H^s)$ which solves the pseudotransport equation. In particular, given Schwartz initial data, it follows that u is smooth.

Now fix initial data $\phi \in H^s$ and assume that the principal symbol exists and is imaginary. (This forces the transport operator to be real and of order 1.) Let q be a symbol of order 0 on space, with principal symbol $q_0$. If Q(t, x, D) is a pseudodifferential operator on spacetime such that Q(0) = q at time 0 and Q commutes with $\partial_t + a(t, x, D_x)$, then Qu solves the pseudotransport equation. (Actually, we will find Q so that $[Q(t), \partial_t + a(t, x, D_x)]$ is a pseudodifferential operator of order $-\infty$; this is good enough.) In particular if $q\phi \in C^\infty_0$ then WF(u) is contained in Char Q, and WF(u) should be the intersection of all such sets Char Q.

To compute WF(u), let $ia_0$ be the principal symbol of a(D) and suppose that $Q \sim \sum_j Q_j$, where $Q_0$ is principal, is given. Then the principal symbol of $[\partial_t + a(t, x, D_x), Q(t, x, D)]$ is the Poisson bracket

$\displaystyle \{\tau + a_0(t, x, \xi), Q_0(t, x, \xi)\} = (\partial_t + H_{a_0})Q_0$

where $H_p$ is the Hamilton vector field of a symbol p. By inducting on j, we can use this computation to compute $Q_j$ and conclude that modulo an error term of order $-\infty$, we can choose Q to be invariant along the Hamiltonian flow $\psi$ given by the Hamiltonian $a_0$. That is, if $F_tu(0) = u(t)$, then $WF \circ F_t = \psi_t \circ WF$. This result is a sort of “propagation of singularities” for the pseudotransport equation, which generalizes the fact that the transport equation acts on Dirac masses by transporting them, as expected.
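To see the propagation statement in the simplest case, here is a small Python sketch that integrates Hamilton's equations numerically. I am assuming the convention $H_p = \partial_\xi p\,\partial_x - \partial_x p\,\partial_\xi$, and for the honest transport equation $(\partial_t - \partial_x)u = 0$ I take $a_0(x, \xi) = -\xi$, since then $a(D) = -\partial_x$ has symbol $-i\xi = ia_0$. The integrator (a midpoint rule with finite-difference derivatives) and all names are my own illustrative choices.

```python
def hamilton_vector_field(a0, x, xi, h=1e-6):
    # Central differences for the partial derivatives of the Hamiltonian a0.
    dx = (a0(x, xi + h) - a0(x, xi - h)) / (2.0 * h)    # dx/dt  =  d a0 / d xi
    dxi = -(a0(x + h, xi) - a0(x - h, xi)) / (2.0 * h)  # dxi/dt = -d a0 / d x
    return dx, dxi

def hamilton_flow(a0, x, xi, t, steps=1000):
    # Explicit midpoint rule for the flow of H_{a0} up to time t.
    dt = t / steps
    for _ in range(steps):
        kx, kxi = hamilton_vector_field(a0, x, xi)
        mx, mxi = hamilton_vector_field(a0, x + 0.5 * dt * kx, xi + 0.5 * dt * kxi)
        x, xi = x + dt * mx, xi + dt * mxi
    return x, xi

def transport_hamiltonian(x, xi):
    # a0 for the transport operator d_t - d_x (an assumption about sign conventions).
    return -xi

x_end, xi_end = hamilton_flow(transport_hamiltonian, 0.0, 1.0, 2.0)
print(x_end, xi_end)
```

The covector starting at $(x, \xi) = (0, 1)$ ends up near $(-2, 1)$ at time 2. This matches the explicit solution $u(t, x) = u_0(x + t)$, which moves a Dirac mass at the origin to $x = -t$ while leaving the frequency direction alone.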

Solving the hyperbolic Cauchy problem.

Let X be a manifold that represents “spacetime”. A priori we may not have a Lorentzian metric to work with, so instead we fix a function $\phi$ that is a “time coordinate”. The level surfaces of $\phi$ can be viewed as “spacelike hypersurfaces” in X.

Throughout we will let $X_0 = \{\phi = 0\}$ and $X_+ = \{\phi > 0\}$ denote the present and future, respectively.

Definition. A hyperbolic operator is a differential operator P of principal symbol p and order m such that $p(x, d\phi(x)) \neq 0$ and for every $(x, \xi) \in T^*X$ such that $\xi$ is not in the span of $d\phi(x)$, there are m distinct $\tau \in \mathbb R$ such that $p(x, \xi + \tau d\phi(x)) = 0$.

Since P is a differential operator, $p(x, \cdot)$ is a homogeneous polynomial of degree m on each cotangent space. To make sense of the condition, let me restrict to the case that $X = \mathbb R^2$ with its usual Riemannian metric and $\phi$ is the projection onto the t-axis. Then after rotating the first coordinate so that $\xi$ is a covector dual to the x-axis, the condition says that given $(x, t, \xi)$ with $\xi \neq 0$ we can find exactly m real numbers $\tau$ such that $p(x, t, \xi, \tau) = 0$. In the case of the d’Alembertian, we have $p(x, t, \xi, \tau) = \xi^2 - \tau^2$, and indeed given $\xi \neq 0$ we can set $\tau = \pm \xi$.
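For second-order operators with constant coefficients in 1+1 dimensions, the root-counting condition reduces to the sign of a discriminant, which the following Python sketch checks. I am reading the condition on $p(x, d\phi(x))$ as a nonvanishing requirement, as in Hörmander's definition of strict hyperbolicity, and I take $\phi = t$ so that $d\phi = (0, 1)$. The parametrization $p(\xi, \tau) = A\xi^2 + B\xi\tau + C\tau^2$ and the function names are assumptions made for this illustration.

```python
def count_real_roots(A, B, C, xi):
    """Number of distinct real roots tau of A*xi^2 + B*xi*tau + C*tau^2 = 0, assuming C != 0."""
    discriminant = (B * xi) ** 2 - 4.0 * A * C * xi ** 2
    if discriminant > 0:
        return 2
    return 1 if discriminant == 0 else 0

def is_strictly_hyperbolic(A, B, C):
    # p(d phi) = p(0, 1) = C must not vanish.
    if C == 0:
        return False
    # By homogeneity in (xi, tau) it suffices to test xi = 1 and xi = -1.
    return all(count_real_roots(A, B, C, xi) == 2 for xi in (1.0, -1.0))

dalembertian = is_strictly_hyperbolic(1.0, 0.0, -1.0)  # xi^2 - tau^2
laplacian = is_strictly_hyperbolic(1.0, 0.0, 1.0)      # xi^2 + tau^2
degenerate = is_strictly_hyperbolic(1.0, 0.0, 0.0)     # xi^2 only: p(d phi) = 0
print(dalembertian, laplacian, degenerate)
```

Only the d’Alembertian passes: the Laplacian has no real roots $\tau$ at all, and the symbol $\xi^2$ fails because the time direction itself is characteristic.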

To state the initial-value problem with initial data in the “initial-time slice” $X_0$, let v be a vector field such that $v\phi = 1$, so v points “forward in time”. The action of v is “differentiating with respect to time”. Note that this hypothesis prevents $\phi$ from degenerating.

Theorem (solving the hyperbolic Cauchy problem). Let P be a hyperbolic operator of order m with smooth coefficients, Y a precompact open submanifold of X, and $s \geq 0$. Assume we are given an inhomogeneous term $f \in H^s_{loc}(X_+)$ satisfying $f|_{X_0} = 0$ and initial data $\psi_j \in H^{s + m - 1 - j}_{loc}(X_0)$, j < m. Then there is $u \in H^{s + m - 1}_{loc}(X)$ supported in $\overline X_+$ such that Pu = f in $X_+ \cap Y$ and $v^ju = \psi_j$ in $X_0 \cap Y$.

The proof is in Section 23.2 of Hörmander. The idea is to first prove uniqueness of solutions. By compactness, we may cover Y with finitely many charts U which are isomorphic to open subsets of Minkowski spacetime in which level sets of $\phi$ are spacelike hypersurfaces and orbits of v are worldlines. Since Minkowski spacetime has an honest-to-god time coordinate, the hyperbolicity hypothesis allows us to factor the principal symbol p into first-order factors, and hence factor P into pseudotransport operators on U, at least modulo a lower-order error. We may then apply the solution of the Cauchy problem for pseudotransport operators to solve the Cauchy problem for Pu = f in each chart U, and since there were only finitely many, uniqueness allows us to stitch the local solutions together into a global solution.

The proof outlined in the above paragraph is motivated by the special case when P is the d’Alembertian, which already appears in Chapter 2 of Evans. In that proof, one first observes that the Cauchy problem for the transport equation has an explicit solution. Then one reduces to the case that spacetime is two-dimensional, in which case there is an explicit factorization of P into transport operators, namely $P = (\partial_x - \partial_t)(\partial_x + \partial_t)$.
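The factorization is easy to check numerically. In the Python sketch below, derivatives are replaced by central differences with a step size I picked arbitrarily, the two first-order factors are composed, and the result is compared against $(\partial_x^2 - \partial_t^2)u$ computed by hand for a test function of my own choosing. The sketch also confirms that the right-hand factor annihilates a right-moving wave.

```python
import math

H = 1e-3  # finite-difference step (an arbitrary choice for this sketch)

def d_x(f):
    return lambda x, t: (f(x + H, t) - f(x - H, t)) / (2.0 * H)

def d_t(f):
    return lambda x, t: (f(x, t + H) - f(x, t - H)) / (2.0 * H)

def right_factor(f):
    # (d_x + d_t) f
    fx, ft = d_x(f), d_t(f)
    return lambda x, t: fx(x, t) + ft(x, t)

def left_factor(f):
    # (d_x - d_t) f
    fx, ft = d_x(f), d_t(f)
    return lambda x, t: fx(x, t) - ft(x, t)

def u(x, t):
    # Test function chosen only for illustration.
    return math.sin(x) * math.exp(t)

def box_u(x, t):
    # (d_x^2 - d_t^2) u, computed by hand: -sin(x)e^t - sin(x)e^t.
    return -2.0 * math.sin(x) * math.exp(t)

def right_mover(x, t):
    # A function of x - t, i.e. a wave travelling to the right.
    return math.sin(x - t)

factorization_error = abs(left_factor(right_factor(u))(0.3, 0.5) - box_u(0.3, 0.5))
transport_residual = abs(right_factor(right_mover)(0.3, 0.5))
print(factorization_error, transport_residual)
```

Both numbers are at the level of the discretization error, which is all one can ask of a finite-difference check.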

Propagation of singularities, part I.

To study the propagation of singularities we need to recall some symplectic geometry. Let Q be a pseudodifferential operator on X and q its principal symbol. Then the Hamilton vector field $H_q$ induces a flow on $T^*X$ which preserves q.

Definition. The bicharacteristic flow of a pseudodifferential operator Q of principal symbol q is the flow of $H_q$ on $q^{-1}(0)$. A bicharacteristic of Q is an orbit of the bicharacteristic flow.

The intuition for the bicharacteristic flow is that its projection to X is “lightlike”, at least if Q is the d’Alembertian.
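For the d’Alembertian in 1+1 dimensions this can be made completely explicit. With $q = \xi^2 - \tau^2$ and the convention $H_q = \partial_\xi q\,\partial_x + \partial_\tau q\,\partial_t - \partial_x q\,\partial_\xi - \partial_t q\,\partial_\tau$ (an assumption about signs), the Hamilton vector field is $2\xi\partial_x - 2\tau\partial_t$ and the momenta are constant along the flow. The short Python sketch below writes down the flow in closed form and checks that a bicharacteristic starting on $q = 0$ projects to a line of speed 1; the function names and the starting point are my own choices.

```python
def q(xi, tau):
    # Principal symbol of the d'Alembertian in 1+1 dimensions.
    return xi**2 - tau**2

def bicharacteristic(x, t, xi, tau, s):
    # Flow of H_q = 2*xi d/dx - 2*tau d/dt for flow time s.
    # q does not depend on (x, t), so xi and tau do not change.
    return x + 2.0 * xi * s, t - 2.0 * tau * s, xi, tau

start = (0.0, 0.0, 1.0, -1.0)  # a point of the characteristic variety: q(1, -1) = 0
x_end, t_end, xi_end, tau_end = bicharacteristic(*start, 1.5)
speed = abs(x_end - start[0]) / abs(t_end - start[1])
print(x_end, t_end, speed, q(xi_end, tau_end))
```

The projected curve is the line $x = t$, that is, a light ray, and q remains zero along it.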

Theorem (Hörmander’s propagation of singularities). Let P be a pseudodifferential operator of order m such that the Schwartz kernel of P has proper support, and the principal symbol of P is real. Then for every distribution u, writing f = Pu, WF(u) – WF(f) is invariant under the bicharacteristic flow of P.

By definition of the wavefront set, for every distribution u, WF(u) – WF(Qu) is contained in Char Q. But if Q is a differential operator, then Char Q is exactly the “characteristic variety” $q^{-1}(0)$, which is exactly the variety where the bicharacteristic flow of Q is defined. Therefore we can ask that WF(u) – WF(Qu) be invariant under the bicharacteristic flow.

If P is a hyperbolic operator of principal symbol p, then the solutions $\tau$ of the equation $p(x, \xi + \tau d\phi(x)) = 0$ are all real and distinct, and modulo lower-order terms this can be used to enforce that the coefficients of p are real. We phrase this more simply by saying that the principal symbol of every hyperbolic operator is real.

A partial converse to the reality of principal symbols of hyperbolic operators holds. If Q is a differential operator, then its principal symbol q is a homogeneous polynomial on each cotangent space. Fixing a particular cotangent space, we can write $q(\xi) = \sum_\alpha c_\alpha \xi^\alpha$ where $\alpha$ ranges over all multiindices of order m and $c_\alpha \in \mathbb R$. In order that the characteristic variety of Q have more than one real point, there must be some $c_\alpha$ positive and some negative. But this is exactly the situation of the d’Alembertian, whose principal symbol is $q(\xi, \tau) = \xi^2 - \tau^2$.

Thus, while the propagation of singularities theorem only assumes that the principal symbol is real, if the operator P is (for example) elliptic or parabolic, then the conclusion of the theorem is degenerate in the sense that the characteristic variety only has a single real point, so that WF(u) – WF(f) is invariant under EVERY group action on the characteristic variety, not just the bicharacteristic flow.

The interpretation of the propagation of singularities theorem is that P is something like the d’Alembertian, in which case p is something like a Lorentzian metric. The bicharacteristic flow is a flow on the characteristic bundle, which is the space whose points $(x, \xi)$ consist of a position x and a lightlike momentum $\xi$. Therefore the projection of any bicharacteristic to X consists of a worldline. Thus, if the initial data is something like a Dirac mass at x, then the Dirac mass travels along the worldline containing x.

To prove the propagation of singularities theorem, we need a propagation estimate. Recall that if A is a pseudodifferential operator, then WF(A) denotes the microsupport of A; that is, the complement of the largest conic set on which A has order $-\infty$.

Theorem (propagation estimate). Let U be an open conic set, and let $A, B, B_1 \in \Psi^0(X)$. Let P be a pseudodifferential operator of real principal symbol p and order m.
For every N > 0 and $s \in \mathbb R$ there is C > 0 such that for every distribution u and every inhomogeneous term f with Pu = f,

$\displaystyle ||Au||_{H^{s+m-1}} \leq C||B_1 f||_{H^s} + C||Bu||_{H^{s+m-1}} + C||u||_{H^{-N}}$

given that the following criteria are met:

1. The projection of U is precompact in X.
2. For every $(x, \xi) \in U$, if $p(x, \xi) = 0$, then $H_p$ and the radial vector field $\xi\partial_\xi$ are linearly independent at $(x, \xi)$.
3. WF(A) and WF(B) are contained in U, while $WF(1 - B_1) \cap U = \emptyset$.
4. For every trajectory $(x(t), \xi(t))$ of $H_p$ with $(x(0), \xi(0)) \in WF(A)$, there is T < 0 such that for every $T \leq t \leq 0$, $(x(t), \xi(t)) \in U$ and $(x(T), \xi(T)) \in Ell(B)$.

The term $C||u||_{H^{-N}}$ is an error term created by the use of pseudodifferential operators and is not interesting. The operator $B_1$ is a cutoff which microlocalizes the problem to a neighborhood of the conic set U. We are interested in WF(u) – WF(f), so we want $WF(B_1) \cap WF(f) = \emptyset$ and $B_1 = 1$ microlocally on U. Actually, since we only care about the complement of WF(f), we might as well take f Schwartz, in which case we can take $B_1 = 1$ and simplify the propagation estimate to

$\displaystyle ||Au||_{H^{s+m-1}} \leq C||f||_{H^s} + C||Bu||_{H^{s+m-1}} + \text{error terms}.$

The interesting point here is the relationship between the operators A and B. We can optimize the propagation estimate by assuming that WF(B) = Ell B. This is because we really desperately want B to be elliptic on its microsupport, so that it does not introduce any new singularities. Under the assumption WF(B) = Ell B, B is a microlocalization to WF(B), and if $(x, \xi) \in WF(A)$, then $(x, \xi)$ got to WF(A) after passing through WF(B). The point is that if u has a singularity at $(x, \xi) \in WF(A)$, then (if the regularity exponent s is taken large enough) $||Au||_{H^{s+m-1}} = \infty$, but we assumed f Schwartz, so this implies $||Bu||_{H^{s+m-1}} = \infty$, so that if we traveled back along the bicharacteristic flow $(x(t), \xi(t))$ from $(x, \xi)$ for long enough, we would see that u already had a singularity at the point $(x(T), \xi(T))$ for some time T < 0.

Moreover, the propagation estimate is time-reversible in the sense that we can replace T < 0 with -T > 0. Thus the bicharacteristic flow neither creates nor destroys singularities in the distribution u. This readily implies the propagation of singularities theorem.

The proof of the propagation estimate is quite technical, and this post is meant as more of a conceptual discussion, so I will omit it.