DMat0101, Notes 6: Introduction to singular integral operators; the Hilbert transform

This week we come to the study of singular integral operators, that is operators of the form

\displaystyle T(f)(x)=\int K(x,y)f(y)dy, \quad x\in {\mathbb R}^n, \ \ \ \ \ (1)

defined initially for `nice’ functions {f\in\mathcal S({\mathbb R}^n)}. Here we typically want to include the case where {K} has a singularity close to the diagonal

\displaystyle \Delta=\{(x,x):x\in{\mathbb R}^n\}\subset {\mathbb R}^{2n},

which is not locally integrable. Typical examples are

\displaystyle K(x,y)=\frac{1}{|x-y|^n},\quad x,y\in{\mathbb R}^n,

\displaystyle K(x,y)=\frac{x_j-y_j}{|x-y|^{n+1}},\quad x,y\in {\mathbb R}^n

and in one dimension

\displaystyle K(x,y)=\frac{1}{x-y},\quad x,y\in {\mathbb R},

and so on. Observe that these kernels have a non integrable singularity both at infinity as well as on the diagonal {\Delta}. It is however the local singularity close to the diagonal that is important and will lead us to characterize a kernel as a singular kernel. For example, the kernel

\displaystyle K(x-y)=\frac{1}{|x-y|^{n-\epsilon}},\quad \epsilon>0

is not a singular kernel since its singularity is locally integrable. Observe that for Schwartz functions {f\in\mathcal S({\mathbb R}^n)} it makes perfect sense to define

\displaystyle T(f)(x)=\int_{{\mathbb R}^n}\frac{f(y)}{|x-y|^{n-\epsilon}}dy,

and in fact the previous integral operator was already considered in the Hardy-Littlewood-Sobolev inequality of Exercise 12 in Notes 5 and can be treated via the standard tools we have seen so far.

Thus, if one insists on writing the representation formula (1) throughout {{\mathbb R}^n} then {K} will not be a function in general. Indeed, the discussion in Notes 4 reveals that if the operator {T} is translation invariant then the kernel {K} must necessarily be of the form {K(x-y)} for an appropriate tempered distribution {K\in {\mathcal S'(\mathbb R^n)}}:

\displaystyle T(f)=K*f.

Bearing in mind that there are tempered distributions which do not arise from functions or measures we see that (1) does not make sense in general and it should be understood in a different way. To give a more concrete example, think of the principal value distribution {K=\textnormal{p.v.}\frac{1}{y}\in \mathcal S'({\mathbb R})} and write

\displaystyle T(f)=(f*\textnormal{p.v.}\frac{1}{y})(x).

Here we would like to rewrite this in the form

\displaystyle T(f)=\int_{\mathbb R} \frac{f(y)}{x-y}dy,

but this does not make sense even for {f\in \mathcal S ( {\mathbb R} )} since the function {\frac{1}{x-y}} is not locally integrable on the diagonal {x=y}.

In fact, the representation (1) of the operator will not be true in general, but we will be satisfied with its validity for functions {f\in L^2({\mathbb R}^n)} of compact support, and whenever {x} does not lie in the support of {f}. Indeed, if {f} has compact support and {x\notin{\mathrm{supp}}(f)} then there is some {\epsilon>0} such that {|y-x|>\epsilon} for all {y\in{\mathrm{supp}}(f)} in (1), and thus we are away from the diagonal. For instance, returning to the principal value example, observe that the integral

\displaystyle \int_{{\mathbb R}}\frac{f(y)}{x-y}dy,

makes perfect sense when {f} has compact support and {x\notin {\mathrm{supp}}(f)}.

Ultimately the theory of singular integral operators does not depend on translation invariance; singular kernels of the type {K(x-y)} can be viewed as a special case of the more general class of singular kernels {K(x,y)} which satisfy appropriate growth and regularity assumptions. It is however instructive to consider the translation invariant case first. In the Calderón-Zygmund theory of singular integral operators we will start by more or less assuming that the operator {T} is well defined and bounded on {L^2({\mathbb R}^n)} and that its kernel {K} satisfies certain growth and regularity conditions. Alternatively, assumptions on {K} will allow us to show the {L^2}-boundedness. We will see that under these conditions {T} will extend to a bounded operator on {L^p({\mathbb R}^n)} for {1<p<\infty} and will be of weak type {(1,1)}.

1. The Hilbert transform

In order to illustrate the general ideas let us consider what is probably the primordial example of a singular integral operator, the Hilbert transform, given in the form

\displaystyle \begin{array}{rcl} H(f)(x)&:=&\textnormal{p.v.}\frac{1}{\pi}\int_{\mathbb R} \frac{f(y)}{x-y}dy=\textnormal{p.v.}\frac{1}{\pi}\int_{\mathbb R} \frac{f(x-y)}{y}dy\\ \\ &=&\lim_{\epsilon\rightarrow 0}\frac{1}{\pi}\int_{|y|>\epsilon}\frac{f(x-y)}{y}dy. \end{array}

Remembering the principal value distribution we can rewrite this in the form

\displaystyle H(f)(x)=(\textnormal{p.v.}\frac{1}{\pi y}*f)(x),

at least whenever {f\in \mathcal{S}({\mathbb R})}. The previous formula makes sense just because the principal value of {1/{\pi y}} is a well defined tempered distribution. Alternatively, we can repeat the argument we used for {\textnormal{p.v.}\frac{1}{\pi y}} to write for any {\epsilon>0} and a Schwartz function {f\in\mathcal {S}({\mathbb R})}

\displaystyle \int_{|y|>\epsilon}\frac{f(x-y)}{y}dy=\int_{\epsilon<|y|<1}\frac{f(x-y)-f(x)}{y}dy+\int_{1<|y|<\infty}\frac{f(x-y)}{y}dy.

Observe that we heavily rely on the fact that the kernel {\frac{1}{y}} has zero mean on symmetric intervals around (and away from) the origin:

\displaystyle \int_{a<|y|<b}\frac{1}{y}dy=0,\quad 0<a<b<+\infty.

The mean value theorem now shows that {\frac{f(x-y)-f(x)}{y}} is uniformly bounded by {\|f'\|_{L^\infty({\mathbb R})}} thus the limit of the first summand as {\epsilon\rightarrow 0} exists and we have that

\displaystyle H(f)(x)=\frac{1}{\pi}\int_{0<|y|<1}\frac{f(x-y)-f(x)}{y}dy+\frac{1}{\pi}\int_{|y|>1}\frac{f(x-y)}{y}dy, \ \ \ \ \ (2)

whenever {f\in \mathcal S({\mathbb R})}.
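As a quick numerical illustration of (2), the following sketch evaluates the two integrals by a midpoint rule, using the normalisation {1/\pi} from the definition of {H}. The function name `hilbert_via_formula_2`, the truncation radius `R` and the step `h` are assumptions of this example and not part of the notes. The test function {f(t)=1/(1+t^2)} is not a Schwartz function, but it is smooth with bounded derivative and lies in {L^2({\mathbb R})}, so both integrals in (2) still converge; its Hilbert transform is quoted here as the standard closed form {x/(1+x^2)} rather than derived.

```python
import math

def hilbert_via_formula_2(f, x, R=100.0, h=1e-3):
    """Approximate H(f)(x) by the two integrals in (2), normalised by 1/pi.

    The outer integral is truncated to 1 < |y| < R (an approximation).
    """
    def midpoint(g, a, b):
        n = max(1, round((b - a) / h))
        step = (b - a) / n
        return step * sum(g(a + (k + 0.5) * step) for k in range(n))

    def near(y):
        # Difference quotient, bounded by sup|f'| near y = 0.
        return (f(x - y) - f(x)) / y

    def far(y):
        # Away from y = 0 the kernel 1/y is harmless.
        return f(x - y) / y

    # Midpoints never hit y = 0 because 0 is an endpoint of both cells.
    inner = midpoint(near, -1.0, 0.0) + midpoint(near, 0.0, 1.0)
    outer = midpoint(far, -R, -1.0) + midpoint(far, 1.0, R)
    return (inner + outer) / math.pi

def f(t):
    return 1.0 / (1.0 + t * t)

approx = hilbert_via_formula_2(f, 1.0)
exact = 1.0 / (1.0 + 1.0 ** 2)   # closed form x / (1 + x^2) at x = 1
print(approx, exact)
```

The subtraction of {f(x)} in the inner integral is what removes the singularity at {y=0}; without it the midpoint sums would not stabilise as the step shrinks.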

Remark 1 Trying to write the Hilbert transform as an integral operator with respect to a kernel {K},

\displaystyle T(f)(x)=\int_{{\mathbb R}}K(x,y)f(y) dy,

we immediately run into the problem that the principal value distribution does not arise from a function. The previous discussion allows us however to write

\displaystyle H(f)(x)=\frac{1}{\pi}\int_{{\mathbb R}} \frac{f(y)}{x-y}dy=\frac{1}{\pi}\int_{\mathbb R} \frac{f(x-y)}{y}dy,

whenever {f} is a compactly supported function in { \mathcal S ({\mathbb R})} or {L^2({\mathbb R})} and {x\notin {\mathrm{supp}}(f)}. This is essentially equivalent to the fact that the integrals

\displaystyle \frac{1}{\pi}\int_{|y|>\epsilon} \frac{f(x-y)}{y}dy,

are absolutely convergent whenever {f\in L^2({\mathbb R})} and {\epsilon>0} is fixed.

Thus we see that the Hilbert transform is a linear operator which is at least well defined on the Schwartz class {\mathcal S({\mathbb R})}. This is quite promising since we know that {\mathcal{S}({\mathbb R})} is dense in {L^p({\mathbb R})} for {p<\infty}. Of course, in order to extend the action of {H} to say {L^2({\mathbb R})} we need to exhibit the continuity of {H} on the dense subclass {\mathcal {S}({\mathbb R})}. In our general theory this will be a `given’, that is that our operator is bounded on {L^2}. To make this general assumption meaningful we have to exhibit that it is indeed satisfied in the model case of the Hilbert transform. We begin this investigation by first showing a simple asymptotic relationship.

Lemma 1 Let {f\in \mathcal S({\mathbb R})}. Then we have

\displaystyle \lim_{|x|\rightarrow +\infty} x H(f)(x)=\frac{1}{\pi}\int_{{\mathbb R}}f(y)dy.

Before giving the proof of this Lemma let us discuss its consequences. Already the expression (2) shows that {H(f)} is a bounded function whenever {f\in \mathcal S({\mathbb R})}. Indeed, using the mean value theorem for the first term in (2) and Hölder’s inequality for the second term we have that

\displaystyle \begin{array}{rcl} |H(f)(x)|\lesssim \|f' \|_{L^\infty({\mathbb R})}+\|f\|_{L^2({\mathbb R})}. \end{array}

As a result, the integrability of {H(f)} for {f\in\mathcal S({\mathbb R})} solely depends on the behavior of {H(f)} at infinity. Now the lemma just stated shows that

\displaystyle H(f)(x)\simeq_f \frac{1}{|x|},\quad |x|\rightarrow \infty,

whenever {f\in\mathcal S({\mathbb R})} with {\int_{\mathbb R} f(y)dy\neq 0}. Thus for a general {f\in\mathcal {S}} with non-zero mean, {H(f)} fails to be in {L^1({\mathbb R})} since it doesn’t decay fast enough at infinity. It is however in {L^p({\mathbb R})} for any {p>1}. As we shall see the failure of continuity of {H} on {L^1} has a weak substitute, namely that {H} is of weak type {(1,1)} and this is the typical behavior of all singular integral operators we want to consider.

Proof of Lemma 1: The proof is a variation of the idea used in (2). For any {\epsilon>0} and {|x|} large we can write

\displaystyle \begin{array}{rcl} \lim_{\epsilon\rightarrow 0} x\int_{|y|>\epsilon}\frac{f(x-y)}{y}dy&=&x\int_{0<|y|\leq \frac{|x|}{2} }\frac{f(x-y)-f(x)}{y}dy\\ \\ && +x\int_{\frac{|x|}{2}<|y|\leq 2|x|}\frac{f(x-y)}{y}dy \\ \\ &&+ x \int_{|y|>2|x|}\frac{f(x-y)}{y}dy\\ \\ &=:&I_1+I_2+I_3. \end{array}

For {I_1} observe that {|x|/2\leq |x-y|\leq 3|x|/2} whenever {|y|\leq|x|/2} thus we have that

\displaystyle |I_1|\lesssim |x|^2\sup_{ |\xi| \simeq |x|}|f'(\xi)|\simeq \sup_{|\xi|\simeq |x|}|\xi^2f'(\xi)|\rightarrow 0

as {|x|\rightarrow \infty} since {f} is a Schwartz function. On the other hand, for {I_3} we have that {|x-y|\geq |x|} whenever {|y|>2|x|}. We get

\displaystyle |I_3|\lesssim |x|\int_{|x-y|\geq |x|}|f(x-y)|dy\leq \int_{|y|\geq |x|}|yf(y)|dy\rightarrow 0,

as {|x|\rightarrow \infty} since {yf(y)} is integrable, {f} being a Schwartz function. Now consider the expression

\displaystyle I_2- \int_{\mathbb R} f(x-y)dy=\int_{\frac{|x|}{2}<|y|\leq 2|x|}( {x}/{y}-1) f(x-y) dy - \int_{\{|y|<|x|/2\}\cup\{|y|>2|x|\}} f(x-y)dy,


\displaystyle \bigg|I_2-\int_{{\mathbb R}}f\bigg |\leq \frac{1}{|x|}\int_{\mathbb R} |yf(y)|dy+\int_{|y|>|x|/2} |f(y)|dy \rightarrow 0,

as {|x|\rightarrow \infty}. \Box
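The limit established in the proof can be observed numerically. The sketch below evaluates {x\int f(x-y)/y\,dy}, the quantity {I_1+I_2+I_3} above, for the Gaussian {f(y)=e^{-\pi y^2}}, whose integral equals {1}. The helper name `pv_integral`, the pairing of {y} with {-y} to remove the singularity, the truncation radius and the number of nodes are all assumptions of this illustration.

```python
import math

def pv_integral(f, x, radius=40.0, n=40_000):
    """Midpoint-rule approximation of p.v. int f(x-y)/y dy.

    Pairing y with -y turns the principal value into the ordinary
    integral of (f(x-y) - f(x+y))/y over (0, radius).
    """
    h = radius / n
    total = 0.0
    for k in range(n):
        y = (k + 0.5) * h
        total += (f(x - y) - f(x + y)) / y
    return total * h

def gaussian(t):
    return math.exp(-math.pi * t * t)   # integral over R equals 1

# x times the principal value integral, for a moderate and a large x.
scaled = {x: x * pv_integral(gaussian, x) for x in (5.0, 20.0)}
for x, value in scaled.items():
    print(x, value)
```

The printed values should approach {\int f=1} as {|x|} grows, with the value at {x=20} closer to {1} than the value at {x=5}.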

Exercise 1 Let {f\in\mathcal S({\mathbb R})}. Show that {H(f)\in L^1({\mathbb R})} if and only if {\int_{{\mathbb R}}f(y)dy=0}.

Hint: Examine the decay of {H(f)(x)} for {|x|\rightarrow +\infty} by using the identity {\widehat {H(f)}(\xi)=-i\,\textnormal{sgn}(\xi) \hat f(\xi)}.

1.1. The Hilbert transform on {L^2({\mathbb R})}

Having exhibited that {H(f)\in L^2({\mathbb R})} whenever {f\in \mathcal S({\mathbb R})} our next task is to show that {H} is bounded as an operator {H:\mathcal S({\mathbb R})\cap L^2({\mathbb R})\rightarrow L^2({\mathbb R})}, that is to show that

\displaystyle \|H(f)\|_{L^2({\mathbb R})}\lesssim \|f\|_{L^2({\mathbb R})},

for all {f\in\mathcal S({\mathbb R})}. Remember that since {\mathcal S({\mathbb R})} is dense in {L^2({\mathbb R})} such an estimate will allow us to extend {H} to a bounded linear operator on {L^2({\mathbb R})}. There are several different approaches to such a theorem, most of them connected to the significance of the Hilbert transform in complex analysis and in the theory of holomorphic functions. First we exhibit the connection with Cauchy integrals.

Proposition 2 Let {f} be a function on {{\mathbb R}} such that {H(f)} is well defined, say {f\in C^1({\mathbb R})} and {|f(x)|\lesssim(1+|x|)^{-1}} for {|x|} large. Then

\displaystyle \lim_{\epsilon\rightarrow 0}\frac{1}{2\pi i }\int_{\mathbb R} \frac{f(y)}{y-(x\pm i \epsilon)} dy=\frac{\pm f(x)+iH(f)(x)}{2},

for every {x\in {\mathbb R}}.

Proof: By translation invariance of {H} and taking complex conjugate in both sides of the identity it suffices to show that

\displaystyle \lim_{\epsilon\rightarrow 0}\frac {1}{2\pi i } \int_{\mathbb R} \frac{f(y)}{y-i \epsilon } dy=\frac{ f(0)+iH(f)(0)}{2}, \ \ \ \ \ (3)

which is equivalent to

\displaystyle \lim_{\epsilon\rightarrow 0} \frac {1}{2\pi i } \int_{\mathbb R} \frac{f(y)}{y-i\epsilon}dy -\frac{1}{2}f(0)-\frac{i}{2\pi}\int_{|y|>\epsilon}\frac{f(y)}{-y}dy=0.

Changing variables {y=\epsilon u} this is equivalent to

\displaystyle \lim_{\epsilon \rightarrow 0} \int_{\mathbb R} \bigg( \frac{1}{u-i}- \chi_{\{|u|>1\}}(u)\frac{1}{u} \bigg) f(\epsilon u) du =\pi i f(0).

Now let

\displaystyle h(u)= \frac{1}{u-i}- \chi_{\{|u|>1\}}(u)\frac{1}{u}.

For {|u|\leq 1} we have that

\displaystyle |h(u)| =\frac{1}{|u-i|}=\frac{1}{(1+u^2)^\frac{1}{2}}\leq 1,

while for {|u|>1} we can calculate

\displaystyle |h(u)|=\frac{1}{|u^2-iu|}=\frac{1}{(u^2+u^4)^\frac{1}{2}}\leq \frac{1}{u^2}.

The previous estimates obviously imply that {h} is absolutely integrable on {{\mathbb R}}. Furthermore

\displaystyle \int_{\mathbb R} h(u) du = \int_{\mathbb R} \bigg( \frac{1}{u-i}- \chi_{\{|u|>1\}}(u)\frac{1}{u} \bigg) du=i \pi,

as can be seen by a direct calculation. Thus by the previous calculations it suffices to show that

\displaystyle \lim_{\epsilon\rightarrow 0} \int_{\mathbb R} (f(\epsilon u )-f(0))h(u)du=0, \ \ \ \ \ (4)

which follows by dominated convergence since {h\in L^1({\mathbb R})} and {f} is bounded. \Box
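The direct calculation of the integral of {h} used in the proof above can be sketched as follows. This is one possible route, not necessarily the one intended: split {h} according to its definition and discard the odd parts of the integrands, which integrate to zero over symmetric sets.

```latex
\begin{aligned}
\int_{|u|\leq 1}\frac{du}{u-i}
  &=\int_{-1}^{1}\frac{u+i}{u^2+1}\,du
   =i\int_{-1}^{1}\frac{du}{u^2+1}=\frac{i\pi}{2},\\
\int_{|u|>1}\Big(\frac{1}{u-i}-\frac{1}{u}\Big)du
  &=\int_{|u|>1}\frac{i}{u(u-i)}\,du
   =\int_{|u|>1}\Big(\frac{i}{u^2+1}-\frac{1}{u(u^2+1)}\Big)du
   =2i\int_{1}^{\infty}\frac{du}{u^2+1}=\frac{i\pi}{2}.
\end{aligned}
```

Adding the two contributions gives {\int_{\mathbb R} h(u)du=i\pi}.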

Exercise 2 Show that for {f\in C^1({\mathbb R})} satisfying {|f(x)|\lesssim (1+|x|)^{-1}} for {|x|} large the Hilbert transform {H(f)} is indeed well defined. Furthermore, show that it suffices to prove (3) in the previous proposition; in particular, exhibit how the full statement of the proposition follows from (3).

Theorem 3 If {f\in \mathcal S({\mathbb R})} then

\displaystyle \widehat {H(f)}(\xi)=-i\,\textnormal{sgn}(\xi)\hat f(\xi).

Proof: Let us define the Cauchy-type integral

\displaystyle C_\epsilon(f)(x)=\frac{1}{2\pi i }\int_{\mathbb R} \frac{f(y)}{y-(x-i\epsilon)}dy.

Then Proposition 2 shows that

\displaystyle \lim_{\epsilon\rightarrow 0} C_\epsilon(f)(x)=\frac{-f(x)+iH(f)(x)}{2}.

Observe by the proof of the proposition applied to the function {\tau_{-x}f} that

\displaystyle C_\epsilon(f)(x)- \frac{-f(x)+iH(f)(x)}{2}=\int_{\mathbb R}( \tau_{-\epsilon u}f(x)-f(x))h(u)du

for all {x\in {\mathbb R}}. Thus by Minkowski’s integral inequality we get that

\displaystyle \bigg\|C_\epsilon(f)- \frac{-f+iH(f)}{2} \bigg\|_{L^2({\mathbb R})}\leq \int_{\mathbb R} \| \tau_{-\epsilon u}f-f\|_{L^2({\mathbb R})} |h(u)|du.

By dominated convergence we conclude that {C_\epsilon(f)} converges to {\frac{-f+iH (f)}{2}} in {L^2} as well. By Plancherel’s theorem we get that we must also have that

\displaystyle \widehat {C_\epsilon(f)}\rightarrow \frac{1}{2}(-\hat f +i\widehat {H(f) }),

in {L^2}, as {\epsilon\rightarrow 0}. Note here that the Fourier transform {\widehat{H(f)}} is well defined since {f\in\mathcal S({\mathbb R})} and in this case we have exhibited that {H(f)\in L^2({\mathbb R})}. The problem now reduces to calculating the Fourier transform of {C_\epsilon(f)} for {\epsilon>0} and see what happens in the limit. Consider the truncations {C_{\epsilon,R}(f)}

\displaystyle C_{\epsilon,R}(f)(x)=\frac{1}{2\pi i}\int_{|x-y|<R}\frac{f(y)}{y-(x-i\epsilon)}dy.

Let us write

\displaystyle g_\epsilon(t)=\frac{1}{2\pi i}\frac{1}{-t+i\epsilon },\quad g_{\epsilon,R}(t)=\frac{1}{2\pi i}\frac{1}{-t +i\epsilon }\chi_{\{|t|<R\}}.

Then {g_{\epsilon,R}(t)\rightarrow g_\epsilon} as {R\rightarrow \infty} in {L^2} by dominated convergence and thus

\displaystyle \|C_{\epsilon,R}(f)-C_\epsilon(f)\|_{L^2({\mathbb R})}=\| f*g_{\epsilon,R}-f*g_{\epsilon}\|_{L^2({\mathbb R})}\leq \|f\|_{L^1({\mathbb R})}\|g_{\epsilon,R}-g_\epsilon \|_{L^2({\mathbb R})}\rightarrow 0,

as {R\rightarrow \infty}. We now have that

\displaystyle \widehat {C_{\epsilon,R}(f)}(\xi)=\hat f(\xi) \widehat {g_{\epsilon,R}}(\xi).

However we have that

\displaystyle \widehat {g_{\epsilon,R}}(\xi)=\frac{1}{2\pi i} \int_{|x|<R}\frac{e^{-2\pi i x\xi}}{-x+i\epsilon}dx.

Now Cauchy’s theorem from complex analysis shows that {\lim_{R\rightarrow \infty} \widehat{g_{\epsilon,R}}(\xi)=0} whenever {\xi>0}.

Since {C_{\epsilon,R}(f)\rightarrow C_\epsilon(f)} in {L^2} as {R\rightarrow\infty}, the previous observations allow us to conclude that the Fourier transform

\displaystyle \widehat{C_\epsilon(f)}(\xi)=0,

whenever {\xi>0} and thus that

\displaystyle \frac{1}{2}(-\hat f(\xi)+i\widehat{H(f)}(\xi))=0

whenever {\xi>0}. We conclude that

\displaystyle \widehat{H(f)}(\xi)=-i\hat f(\xi),\quad \xi>0.

Now note that the Hilbert transform satisfies

\displaystyle H(f)(-x )=\lim_{\epsilon\rightarrow 0}\int_{|y|>\epsilon}\frac{f(-x-y)}{y}dy=-H(\tilde f)(x),

where remember that {\tilde f(x)=f(-x)}. So for {\xi>0} we can write

\displaystyle \begin{array}{rcl} \widehat {H(f) }(-\xi)&=& \int_{\mathbb R} H(f)(x)e^{2\pi i x\xi}dx=-\int_{\mathbb R} H(\tilde f)(x) e^{-2\pi i x \xi } dx\\ \\ &=& -\widehat{H(\tilde f)}(\xi)=i \hat{\tilde f}(\xi)=i\hat f(-\xi). \end{array}

In other words for {\xi\in{\mathbb R}} we get that {\widehat {H(f)}(\xi)=-i\,\textnormal{sgn}(\xi)\hat f(\xi)}. \Box

The previous theorem shows in particular that {\|H(f)\|_{L^2({\mathbb R})}=\|f\|_{L^2({\mathbb R})}} for all {f\in \mathcal S({\mathbb R})}. This allows us to extend the Hilbert transform to a bounded linear operator on {L^2({\mathbb R})}. In fact {H} is an isometry by Plancherel’s theorem and the fact that {|-i\textnormal{sgn} (\xi) |=1}. Furthermore, although at the current stage it is not clear that our original definition makes sense on {L^2({\mathbb R})}, we can directly define the Hilbert transform on {L^2({\mathbb R})} by means of

\displaystyle \widehat {H(f)}(\xi)=-i \textnormal{sgn}(\xi)\hat f(\xi),

which is a good definition whenever {f\in L^2({\mathbb R})}. In fact, recalling the discussion on multiplier transformations it is clear that the operator {H} on {L^2} is the multiplier transformation associated with the multiplier {m(\xi)=-i\textnormal{sgn}(\xi)} which is obviously a bounded function. This is automatic from the definition

\displaystyle \widehat {H(f)}(\xi)=m(\xi) \hat f(\xi),

and the fact that {m\in L^\infty({\mathbb R})}. We also have that {\|H\|_{L^2\rightarrow L^2}=\|m\|_{L^\infty}=1} which is also obvious from the fact that {H} is an isometry.

Corollary 4 The Hilbert transform extends to an isometry on {L^2({\mathbb R})}. We have that

\displaystyle \|H(f)\|_{L^2({\mathbb R})}=\|f\|_{L^2({\mathbb R})},

for all {f\in L^2({\mathbb R})}. Furthermore, for {f\in L^2({\mathbb R})} the Hilbert transform can be defined as

\displaystyle \widehat{H(f)}(\xi)=-i\textnormal{sgn}(\xi)\hat f(\xi),\quad f\in L^2({\mathbb R}).

Corollary 5 Consider the Hilbert transform {H:L^2({\mathbb R})\rightarrow L^2({\mathbb R})}. Then we have the following properties: (i) The Hilbert transform {H} commutes with translations and dilations (but not with modulations):

\displaystyle H\tau_{x_o}=\tau_{x_o} H,\quad \textnormal{Dil}_\lambda ^p H =H \textnormal{Dil}_\lambda ^p.

(ii) The Hilbert transform is skew-adjoint on {L^2({\mathbb R})}

\displaystyle \int_{\mathbb R} H(f)\bar g= - \int_{\mathbb R} f \overline{H(g)},\quad f,g\in L^2({\mathbb R}).

(iii) We have the identity {H^2=-\textnormal{id}} on {L^2({\mathbb R})}:

\displaystyle H(H(f))=-f,\quad f\in L^2({\mathbb R}).
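The multiplier description makes these properties easy to test on a computer. The sketch below uses a finite, periodic analogue: it applies the multiplier {-i\,\textnormal{sgn}(k)} to the discrete Fourier coefficients of a sampled signal. This discrete model, the naive transform `dft` and the name `discrete_hilbert` are assumptions of the illustration; it is not the operator {H} on {L^2({\mathbb R})} itself. Under this model a sampled cosine is mapped to the corresponding sine, and applying the operator twice returns minus the input, mirroring (iii) for signals with zero mean.

```python
import cmath
import math

def dft(values, sign):
    """Naive discrete Fourier transform; sign=-1 is forward,
    sign=+1 is the unnormalised inverse."""
    n = len(values)
    return [
        sum(v * cmath.exp(sign * 2j * math.pi * m * k / n)
            for m, v in enumerate(values))
        for k in range(n)
    ]

def discrete_hilbert(values):
    """Apply the multiplier -i*sgn(frequency) to the DFT coefficients."""
    n = len(values)
    spectrum = dft(values, -1)
    out = []
    for k, c in enumerate(spectrum):
        freq = k if 2 * k < n else k - n      # signed frequency
        if freq == 0 or 2 * k == n:           # zero and Nyquist modes
            sgn = 0
        else:
            sgn = 1 if freq > 0 else -1
        out.append(-1j * sgn * c)
    return [w / n for w in dft(out, +1)]

n = 64
cos3 = [math.cos(2 * math.pi * 3 * m / n) for m in range(n)]
sin3 = [math.sin(2 * math.pi * 3 * m / n) for m in range(n)]

h1 = discrete_hilbert(cos3)          # expected: the sampled sine
h2 = discrete_hilbert(h1)            # expected: minus the sampled cosine
err_first = max(abs(a - b) for a, b in zip(h1, sin3))
err_twice = max(abs(a + b) for a, b in zip(h2, cos3))
print(err_first, err_twice)
```

Both errors should be at the level of floating point rounding. The quadratic-cost transform is chosen only to keep the example free of external libraries.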

Exercise 3 Prove Corollary 5 above. Hint: Use the formula of Theorem 3.

Exercise 4 Let {f(x)=\chi_{[0,1]}(x)}. Show that

\displaystyle H(f)(x)=\frac{1}{\pi}\log\bigg|\frac{x}{x-1}\bigg|.

Conclude that the Hilbert transform is not of strong type {(1,1)} nor of strong type {(\infty,\infty)}.

1.2. The Hilbert transform on {L^p({\mathbb R})}

So far we have defined our first singular integral operator, the Hilbert transform. This is an operator that is bounded on {L^2({\mathbb R})} and that has the representation

\displaystyle H(f)(x)=\frac{1}{\pi}\int_{\mathbb R} f(y)\frac{1}{x-y}dy,

whenever {f\in L^2({\mathbb R})} has compact support and {x\notin \textnormal{supp}(f)}. The function

\displaystyle K(x,y)=\frac{1}{\pi}\frac{1}{x-y}

is the singular kernel associated with the Hilbert transform. Although we have seen that the Hilbert transform can be described for all {x\in{\mathbb R}}, at least for nice functions {f\in \mathcal S({\mathbb R})}, the restricted representation just described is all we really need to execute our program. Furthermore, this approach will serve as a good introduction to the general case of Calderón-Zygmund operators. From the previous discussion we know that the Hilbert transform is not of type {(1,1)} nor of type {(\infty,\infty)}. The following theorem is the main result of the theory.

Theorem 6 (i) The Hilbert transform is of weak type {(1,1)}; for {f\in L^1({\mathbb R})} we have that

\displaystyle |\{x\in{\mathbb R}: |H(f)(x)|>\lambda\}|\lesssim \frac{\|f\|_{L^1({\mathbb R})}}{\lambda}, \quad \lambda>0.

(ii) For {1<p<\infty}, the Hilbert transform is of strong type {(p,p)}; for {f\in L^p({\mathbb R})} we have

\displaystyle \|H(f)\|_{L^p({\mathbb R})}\lesssim_p \|f\|_{L^p({\mathbb R})}.

Proof: We will divide the proof in several steps. The most important one however is the proof of the weak type {(1,1)}. All the rest really relies on exploiting the symmetries of the Hilbert transform, interpolation and duality.

step 1; the weak {(1,1)} bound: We fix a level {\lambda>0} and a function {f\in L^1({\mathbb R})\cap L^2({\mathbb R})} and write the Calderón-Zygmund decomposition of the function {f} at level {\lambda} in the form

\displaystyle f = g+b.

Recall that the `bad part’ {b} is described as

\displaystyle b=\sum_{Q\in\mathcal B} b_Q

where {\mathcal B} is a collection of disjoint dyadic intervals (since {n=1}) and each {b_Q} is supported on {Q}. Furthermore we have that

\displaystyle \int_Q b_Q=0,


\displaystyle \frac{1}{|Q|}\int_Q|b_Q|\lesssim \lambda.

Recall also that

\displaystyle |\cup_{Q\in\mathcal B}Q|\leq\frac{\|f\|_1}{\lambda},

by the maximal theorem. On the other hand the `good part’ {g} is bounded

\displaystyle \|g\|_\infty \lesssim \lambda

and its {L^1} norm is controlled by the {L^1} norm of {f}:

\displaystyle \|g\|_1\leq\|f\|_1.

Observe that {g\in L^1\cap L^\infty} thus {g\in L^2({\mathbb R})} and by the log-convexity of the norm we have

\displaystyle \|g\|_{L^2({\mathbb R})}\leq \|g\|^\frac{1}{2} _{L^1({\mathbb R})}\|g\|^\frac{1}{2} _{L^\infty({\mathbb R})}\lesssim (\lambda \|f\|_{L^1({\mathbb R})})^\frac{1}{2}. \ \ \ \ \ (5)

Remark 2 Since {f,g\in L^2({\mathbb R})} it follows that {b\in L^2({\mathbb R})} as well. Also, by the definition of the pieces {b_Q} it is easy to see that {b_Q\in L^2(Q)} as well. However, we will not use the {L^2} bounds on {b} nor on {b_Q}, the fact that they belong to {L^2} being merely a technical assumption that allows us to define their Hilbert transforms. Overall, the hypothesis that {f\in L^2({\mathbb R})} cannot be used in any quantitative way if we ever want to extend our results to {L^p({\mathbb R})} for {p\neq 2}.

Since {f=b+g} and {H} is linear, we have the following basic estimate

\displaystyle |\{x\in{\mathbb R}: |H(f)(x)| >\lambda\}|\leq |\{x\in{\mathbb R}: |H(g)(x) |>\lambda/2\}|+|\{x\in{\mathbb R}: |H(b)(x) |>\lambda/2\}|. \ \ \ \ \ (6)

The part that corresponds to {g} is the easy one to estimate. This is not surprising since {g} is the good part. Since we already know that {H} is of strong type {(2,2)} it’s certainly of weak type {(2,2)} thus we have

\displaystyle \begin{array}{rcl} |\{x\in{\mathbb R}: |H(g)(x)|>\lambda/2\}|\lesssim \frac{\|g\|^2 _{L^2({\mathbb R})}}{\lambda^2}\lesssim \frac{\|f\|_{L^1({\mathbb R})}}{\lambda}, \end{array}

by (5). Thus this estimate for the good part is exactly what we want. Let’s move now to the estimate for the bad part. The main ingredient for the estimate of the bad part is the following statement which we formulate as a lemma for future reference.

Lemma 7 Let {I=(x_o-\epsilon,x_o+\epsilon)} be any interval in {{\mathbb R}} and denote by {I^*} the interval with the same center as {I} and twice its length. For {f\in L^1({\mathbb R})\cap L^2({\mathbb R})} supported in {I} and with zero mean on {I}, {\int_I f=0}, we have

\displaystyle |H(f)(x)|\lesssim \frac{|I|}{|x-x_o|^2}\int_I |f|,

for all {x\notin I^*}. We conclude that

\displaystyle \int_{{\mathbb R}\setminus I^*} |H(f)(x)|dx\lesssim \int_I |f|.

Remark 3 Here we require that {f} is also in {L^2({\mathbb R})} just in order to make sure that {H(f)(x)} is well defined. Note that in the case of the Hilbert transform it can be verified directly that {H(f)(x)} is well defined for {f\in L^1(I)} and {x\notin I^*}. However we prefer this formulation since for more general Calderón-Zygmund operators we will only have a formula available to us for {f\in L^2({\mathbb R})} with compact support and {x\notin \textnormal{supp}(f)}.

Proof: Using the zero mean value hypothesis for {f} we can write for {x\notin I^*}

\displaystyle \begin{array}{rcl} |H(f)(x)|&=&\big|\int_I \frac{f(y)}{x-y}dy\big|=\bigg|\int_I\bigg(\frac{1}{x-y}-\frac{1}{x-x_o}\bigg)f(y)dy\bigg|\\ \\ &\leq & \int_I \frac{|y-x_o|}{|x-x_o||x-y|}|f(y)|dy. \end{array}

Now since {x\notin I^*} we have that

\displaystyle |x-y|\geq |x-x_o|-|y-x_o|\geq |x-x_o|-\epsilon\geq |x-x_o|-|x-x_o|/2=|x-x_o|/2

so we can write

\displaystyle |H(f)(x)|\lesssim \frac{|I|}{|x-x_o|^2} \int_I |f(y)|dy,

as we wanted to show. The second claim of the lemma follows easily by integrating this estimate. \Box
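The pointwise estimate of Lemma 7 can be checked on an explicit example. The sketch below takes {I=(0,1)}, so {x_o=1/2} and {I^*=(-1/2,3/2)}, and the mean zero function {f=\chi_{[0,1/2)}-\chi_{[1/2,1)}}. For {x} outside {[a,b]} the integral {\int_a^b\frac{dy}{x-y}} equals {\log|\frac{x-a}{x-b}|}, which gives {H(f)} in closed form with the normalisation {1/\pi}. The explicit constant {1/\pi} in `lemma7_bound` is what one obtains by tracking the constants in the proof above under that normalisation; it is an assumption of this sketch, since the lemma only claims a bound up to an unspecified constant. All function names are hypothetical.

```python
import math

def hilbert_indicator(a, b, x):
    """H of the indicator of [a, b] at a point x outside [a, b]."""
    return math.log(abs((x - a) / (x - b))) / math.pi

def hilbert_f(x):
    # f = indicator of [0, 1/2) minus indicator of [1/2, 1): mean zero on I.
    return hilbert_indicator(0.0, 0.5, x) - hilbert_indicator(0.5, 1.0, x)

x0, length, l1_norm = 0.5, 1.0, 1.0   # centre of I, |I|, integral of |f|

def lemma7_bound(x):
    # |I| / |x - x0|^2 times the L^1 norm, with the constant 1/pi
    # obtained by following the proof (an assumption of this sketch).
    return length * l1_norm / (math.pi * (x - x0) ** 2)

points = [1.5, 2.0, 5.0, 10.0, -0.5, -3.0, -20.0]   # all outside I^*
ratios = [abs(hilbert_f(x)) / lemma7_bound(x) for x in points]
for x, r in zip(points, ratios):
    print(x, r)
```

Every ratio stays below {1}, and the quadratic decay of {H(f)} away from {I} is visible: without the mean zero condition the decay would only be of order {|x-x_o|^{-1}}, which is not integrable at infinity.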

We now go back to the estimate for the bad part {b}. First of all note that

\displaystyle |H(b)(x)|\leq \sum_{Q\in \mathcal B} |H(b_Q)(x)|,

for almost every {x\in {\mathbb R}}. Indeed, if we enumerate the cubes in {\mathcal B} as {Q_1,\ldots,Q_N,\ldots} then, since the cubes are disjoint, we have that {b_N(x):=\sum_{j=1} ^N b_{Q_j}(x)\rightarrow b(x)} for every {x\in{\mathbb R}} and {|b_N|\leq |b|}, thus {b_N\rightarrow b} in {L^2({\mathbb R})} by dominated convergence. Since {H} is an isometry on {L^2({\mathbb R})} it follows that {H(b_N)} converges to {H(b)} in {L^2} as well. Taking subsequences we then have that {H(b_{N_j})(x)\rightarrow H(b)(x)} almost everywhere. Thus

\displaystyle |H(b_{N_j})(x)|=|\sum_{m=1} ^{N_j}H(b_{Q_m})(x)|\leq \sum_{Q\in\mathcal B}| H(b_Q)(x)|,

almost everywhere and we get the claim by letting {j\rightarrow +\infty}.

For each {Q\in\mathcal B} let {Q^*} denote the cube with the same center and twice the side-length. We now estimate the `bad part’ as follows

\displaystyle \begin{array}{rcl} |\{x\in {\mathbb R}:|H(b)(x)|>\lambda/2\}|&\leq& |\cup_{Q\in\mathcal B} Q^*|+|\{x\notin \cup_{Q\in\mathcal B} Q^*:\sum_{Q\in\mathcal B} |H(b_Q)(x)|>\lambda/2\}|. \end{array}

By the Calderón-Zygmund decomposition we have that

\displaystyle |\cup_{Q\in\mathcal B}Q^*| \leq \sum_{Q\in\mathcal B}|Q^*|=2\sum_{Q\in\mathcal B}|Q|=2|\cup_{Q\in\mathcal B}Q| \lesssim\frac{\|f\|_1}{\lambda},

which takes care of the first summand. For the second we use Lemma 7 to write

\displaystyle \int_{{\mathbb R}\setminus Q^*}|H(b_Q)(x)|dx \lesssim \int|b_Q(x)|dx \lesssim |Q|\lambda,

again by the Calderón-Zygmund decomposition. Observe that each {b_Q\in L^1(Q)\cap L^2(Q)} and has mean zero on {Q} so the appeal to Lemma 7 is legitimate. Summing up the estimates for all the bad cubes in {\mathcal B} we get

\displaystyle \bigg\|\sum_{Q\in\mathcal B}|H(b_Q)| \bigg\|_{L^1({\mathbb R}\setminus \cup_{Q\in\mathcal B} Q^*)} \lesssim \lambda \sum_{Q\in \mathcal B}|Q| \leq\lambda\frac{\|f\|_1}{\lambda} = \|f\|_1.

By Chebyshev’s inequality we thus get

\displaystyle |\{x\in {\mathbb R}\setminus \cup_{Q\in\mathcal B} Q^*: \sum_{Q\in\mathcal B} |H(b_Q)(x)|>\lambda/2 \}|\lesssim\frac{\|f\|_1}{\lambda}.

Summing up the estimates for the bad part we conclude that

\displaystyle |\{x\in {\mathbb R}:|H(b)(x)|>\lambda/2\}|\lesssim\frac{\|f\|_1}{\lambda}.

By (6) now we conclude that

\displaystyle |\{x\in {\mathbb R}:|H(f)(x)|>\lambda\}|\lesssim \frac{\|f\|_1}{\lambda},

whenever {f\in L^1({\mathbb R})\cap L^2({\mathbb R})}.

We have a priori assumed that {f\in L^2({\mathbb R})\cap L^1({\mathbb R})} in order to have a good definition of {H}. However, the weak {(1,1)} inequality on {L^1\cap L^2} allows us to extend the Hilbert transform to a linear operator on {L^1({\mathbb R})} which is also of weak type {(1,1)}. The details are left as an exercise.

Exercise 5 Let {T:L^1({\mathbb R}^n )\cap {\mathcal S(\mathbb R^n)} \rightarrow L^1({\mathbb R}^n) } be a linear operator which is of weak type {(1,1)}. Show that {T} extends to a linear operator on {L^1({\mathbb R}^n)} which is of weak type {(1,1)}, with the same {(1,1)} constant.

step 2; the strong {(p,p)} bound: As promised, the difficult part of the proof was the weak {(1,1)} bound. The rest is routine. First of all observe that since {H} is of weak type {(1,1)} and strong type {(2,2)}, the Marcinkiewicz interpolation theorem allows us to show that {H} is of strong type {(p,p)} for any {1<p<2}. To treat the interval {2<p<\infty} we argue by duality, exploiting the fact that {H} is almost self-adjoint (in fact it is skew-adjoint as we have seen in Corollary 5). Indeed, let {f\in\mathcal S({\mathbb R})} and {2<p<\infty}. Now for any {g\in L^{p'}({\mathbb R})} we have

\displaystyle \begin{array}{rcl} \big| \int_{{\mathbb R}} H(f) \bar g \big|=\big|\int_{\mathbb R} f \overline {H(g)}\big|\leq \|f\|_{L^p({\mathbb R})} \|H(g)\|_{L^{p'}({\mathbb R})} \lesssim_p\|g\|_{L^{p'}({\mathbb R})}\|f\|_{L^p({\mathbb R})}, \end{array}

using the fact that {H} is of strong type {(p',p')} since {1<p'<2}. Taking the supremum over all {g\in L^{p'}({\mathbb R})} with {\|g\|_{L^{p'}}\leq 1} we get

\displaystyle \|H(f)\|_{L^p({\mathbb R})}\lesssim_p \|f\|_{L^p({\mathbb R})},

for {2<p<\infty} as well, whenever {f\in\mathcal S({\mathbb R})}. Using standard arguments again this shows that {H} extends to a bounded linear operator on {L^p({\mathbb R})}, {1<p<\infty}. \Box

Remark 4 In fact, tracking the constants in the previous argument we see that

\displaystyle \|H\|_{L^p\rightarrow L^p}\lesssim\frac{1}{p-1} \quad \mbox{as}\quad p\rightarrow 1

and

\displaystyle \|H\|_{L^p\rightarrow L^p}\lesssim \frac{1}{p'-1}=\frac{p}{p'}\simeq p \quad \mbox{as}\quad p\rightarrow\infty.

Overall we have proved that {H} is of strong type {(p,p)} with a norm bound of the order

\displaystyle \|H\|_{L^p\rightarrow L^p}\lesssim \max( (p-1)^{-1},p),\quad 1<p<\infty.

Remark 5 We have exhibited that {H} extends to a bounded linear operator on {L^p} for {1<p<\infty} and that it is of weak type {(1,1)}. However, for a general {f\in L^p({\mathbb R})}, {1\leq p \leq 2}, there is no reason why {H(f)} should be given by the same formula by which it was initially defined; remember that

\displaystyle H(f)=\lim_{\epsilon \rightarrow 0}\int_{|y|>\epsilon} \frac{f(x-y)}{y}dy=:\lim_{\epsilon\rightarrow 0} H_\epsilon(f),\quad f\in \mathcal S({\mathbb R}).

Thus the question whether {H_\epsilon(f)(x)\rightarrow H(f)(x)} a.e., for {f\in L^p({\mathbb R})}, is very natural. Since we know this convergence is true for the dense subset {\mathcal S({\mathbb R})}, the study of the pointwise convergence amounts to studying the boundedness properties of the corresponding maximal operator

\displaystyle H^*(f)(x):=\sup_{\epsilon>0}\bigg|\int_{|y|>\epsilon}\frac{f(x-y)}{y}dy\bigg|.

Thus if one can show that {H^*} is of weak type {(1,1)} for example, the pointwise convergence of {H_\epsilon(f)} to {H(f)} would follow by Proposition 1 of Notes 5. Such an estimate is actually true and thus this formula extends to all {L^p} functions for {1\leq p < \infty}. We will however see this in the general theory of Calderón-Zygmund operators of which the Hilbert transform is a special case and so we postpone the proof until then.

1.3. The Hilbert transform and the boundary values of holomorphic functions

In this section we briefly discuss the connection of the Hilbert transform with the boundary values of holomorphic functions in the upper half plane. Let us write

\displaystyle {\mathbb R}_+^2={\mathbb C}_+=\{(x,y):x\in {\mathbb R},y>0\}=\{x+iy:x\in{\mathbb R},y>0\},

for the upper half plane. Two functions {u,v} on {{\mathbb R}^2_+} are called conjugate harmonic functions if they are the real and imaginary part respectively of a holomorphic function {F(z)} in the upper half plane, where {z=x+iy}. Thus we have that

\displaystyle F(z)=F(x+iy)=u(x,y)+iv(x,y).

By definition both {u,v} are real and harmonic. Moreover, they satisfy the Cauchy-Riemann equations (since {F} is holomorphic). Now assume that {F} has a boundary value {F_o(x)=u_o(x)+iv_o(x)} on the real line {x\in {\mathbb R}}. Then

\displaystyle v_o(x)=H(u_o)(x),\quad \mbox{and} \quad u_o(x)=-H(v_o)(x).

Of course, some technical assumptions are needed to make all these claims rigorous, for example assuming that the holomorphic function {F} has some decay of the form {|F(z)|\lesssim(1+|z|)^{-1}} in the upper half plane.

Conversely, Let {f\in L^p({\mathbb R})} be a real function and {P_y(x)} be the Poisson kernel for the upper half plane

\displaystyle P_y(x)=\frac{1}{\pi}\frac{y}{y^2+x^2}.

As we have seen, the convolution {u(x,y)=(f*P_y)(x)} is a harmonic function in the upper half plane {{\mathbb R}_+^2=\{(x,y):x\in {\mathbb R},y>0\}}. Observe that

\displaystyle u(x,y)=\frac{y}{\pi}\int_{\mathbb R} \frac{f(t)}{y^2+(x-t)^2}dt.

Consider now the conjugate Poisson kernel

\displaystyle Q_y(x)=\frac{1}{\pi}\frac{x}{y^2+x^2}.

The name comes from the fact that {P_y,Q_y} are both real harmonic functions and, writing {z=x+iy}, we have

\displaystyle P_y(x)+iQ_y(x)=\frac{1}{\pi}\frac{ix+y}{x^2+y^2}=\frac{i}{\pi}\frac{x-iy}{x^2+y^2}=\frac{i}{\pi z},

which is holomorphic in the upper half plane. Thus {P_y}, {Q_y} are conjugate harmonic functions, and this carries over to their convolutions with the real function {f}. We conclude that the function

\displaystyle v(x,y)=(f*Q_y)(x)=\frac{1}{\pi}\int_{\mathbb R} \frac{f(t)(x-t)}{y^2+(x-t)^2}dt,

is harmonic in the upper half plane and that

\displaystyle F(z)=u(x,y)+iv(x,y),\quad z=x+iy\in {\mathbb C}_+,

is holomorphic in the upper half plane.

Finally observe that according to the previous formulae we have

\displaystyle F(z)=u(x,y)+iv(x,y)=\frac{1}{\pi}\int_{{\mathbb R}} \frac{f(t)[y+i(x-t)]}{y^2+(x-t)^2}dt=\frac{1}{\pi i}\int_{\mathbb R} \frac{f(t)}{t-x-iy}dt.

In this language, Proposition 2 just states that {F(x+iy)} converges to its boundary value {f+iH(f)} as {y\rightarrow 0}. We also see that the imaginary part of {F} converges to the Hilbert transform:

\displaystyle \lim_{y\rightarrow 0} (f*Q_y)(x)=H(f)(x),

both in {L^p({\mathbb R}) } and almost everywhere.
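The Cauchy-type formula for {F} can be checked numerically in a simple case. The Python sketch below takes {f(t)=\frac{1}{\pi}\frac{1}{1+t^2}} as boundary datum, an illustrative choice for which the identity {P_y(x)+iQ_y(x)=\frac{i}{\pi z}} together with the semigroup property of the Poisson kernel (assumed here) gives {F(z)=\frac{i}{\pi(z+i)}}. The function names, the cutoff and the step count are hypothetical, and the quadrature is only meant as a rough check.

```python
import math

def f(t):
    """Boundary datum: (1/pi) / (1 + t^2), an illustrative choice."""
    return 1.0 / (math.pi * (1.0 + t * t))

def cauchy_integral(z, cutoff=200.0, steps=80_000):
    """Midpoint rule for (1/(pi*i)) * int f(t) / (t - z) dt over [-cutoff, cutoff]."""
    h = 2.0 * cutoff / steps
    total = 0j
    for k in range(steps):
        t = -cutoff + (k + 0.5) * h
        total += f(t) / (t - z)
    return total * h / (math.pi * 1j)

def closed_form(z):
    """Expected value i / (pi * (z + i)) for this particular f."""
    return 1j / (math.pi * (z + 1j))

x = 0.5
hilbert_of_f = x / (math.pi * (1.0 + x * x))  # H(f)(x) for this f
for y in (1.0, 0.5, 0.1):
    F = cauchy_integral(complex(x, y))
    print(f"y={y:<4} Im F={F.imag:.5f} H(f)(x)={hilbert_of_f:.5f} "
          f"|F - closed form|={abs(F - closed_form(complex(x, y))):.1e}")
```

As {y} decreases, the imaginary part of {F(x+iy)} approaches {H(f)(x)}, in line with the limit stated above. The step size must stay well below {y} for the midpoint rule to resolve the kernel, which is why the sketch stops at {y=0.1}.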

1.4. Frequency cut-off multipliers and partial Fourier integrals

Remember that for a bounded function {m\in L^\infty({\mathbb R})} the operator

\displaystyle T:L^2({\mathbb R})\rightarrow L^2({\mathbb R}),\quad \widehat{T(f)}(\xi)=m(\xi)\hat f(\xi)

is a multiplier operator (associated to the multiplier {m}) and that {\|T\|_{L^2\rightarrow L^2}=\|m\|_{L^\infty({\mathbb R})}}. We also say that {m} is a multiplier on {L^p} if {T} extends to a bounded linear operator {T:L^p({\mathbb R})\rightarrow L^p({\mathbb R})}. Thus we see that the Hilbert transform is a multiplier operator on {L^p({\mathbb R})} associated with the multiplier

\displaystyle m(\xi)=-i \textnormal{sgn} (\xi),\quad \xi \in {\mathbb R},

which is obviously a bounded function with {\|m\|_{L^\infty({\mathbb R})}=1}. A very closely related multiplier is the frequency cutoff multiplier. Given an interval {(a,b)} in the frequency space, where {a<b}, we define the operator {S_{(a,b)}:L^2({\mathbb R})\rightarrow L^2({\mathbb R})} by means of the formula

\displaystyle \widehat{S_{(a,b)}f}(\xi)=\chi_{(a,b)}(\xi)\hat f(\xi).

Thus the operator {S_{(a,b)}} applied to {f} localizes the function {f} in frequency, in the interval {(a,b)}. Such operators as well as their multidimensional analogues turn out to be very important in harmonic analysis as well as in the theory of partial differential operators. Obviously {S_{(a,b)}} is bounded on {L^2({\mathbb R})}, since {\|S_{(a,b)}\|_{L^2\rightarrow L^2}=\|\chi_{(a,b)}\|_{L^\infty({\mathbb R})}=1}. However, the corresponding estimate in {L^p({\mathbb R})} is far from obvious. After all the work we have done for the Hilbert transform though, we can get the {L^p} bounds for {S_{(a,b)}} as a simple corollary. This is based on the observation that

\displaystyle S_{(a,b)} =\frac{i}{2}(\textnormal{Mod}_a H\textnormal{Mod}_{-a} - \textnormal{Mod}_b H\textnormal{Mod}_{-b} ), \ \ \ \ \ (7)

where the equality should be understood as an equality of operators on {L^2({\mathbb R})}. Here remember that

\displaystyle \textnormal{Mod}_{x_o}(f)(x)=e^{2\pi i x_o x}f(x).

The verification of this formula is left as an exercise. Formula (7) is also true when {a=-\infty} or {b=+\infty } with obvious modifications.

Exercise 6 Prove formula (7).
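Before attempting the proof, formula (7) can be sanity-checked on the Fourier side. The Python sketch below assumes that conjugating a multiplier operator by modulations, {\textnormal{Mod}_c T\textnormal{Mod}_{-c}}, shifts its multiplier from {m(\xi)} to {m(\xi-c)}; this is the step one still has to justify in the exercise. The helper names are hypothetical, and the check covers only finitely many sample frequencies, so it is no substitute for a proof.

```python
def sgn(t):
    """Sign function with sgn(0) = 0."""
    return (t > 0) - (t < 0)

def hilbert_multiplier(xi):
    """Multiplier of the Hilbert transform, m(xi) = -i sgn(xi)."""
    return -1j * sgn(xi)

def cutoff_via_hilbert(xi, a, b):
    """Multiplier of (i/2)(Mod_a H Mod_{-a} - Mod_b H Mod_{-b}).

    Assumes that conjugation by Mod_c replaces m(xi) by m(xi - c).
    """
    return 0.5j * (hilbert_multiplier(xi - a) - hilbert_multiplier(xi - b))

def indicator(xi, a, b):
    """Characteristic function of the open interval (a, b)."""
    return 1.0 if a < xi < b else 0.0

a, b = -1.0, 2.0
for xi in (-3.0, -0.5, 0.2, 1.9, 2.5, 7.0):
    lhs = cutoff_via_hilbert(xi, a, b)
    print(f"xi={xi:5.1f} via H: {lhs.real:.1f} indicator: {indicator(xi, a, b):.1f}")
```

At the endpoints {\xi=a} and {\xi=b} the two sides differ (the left side gives {\frac{1}{2}}), which is harmless since the endpoints form a set of measure zero.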

A simple corollary of the {L^p} boundedness of the Hilbert transform is the corresponding statement for {S_{(a,b)}}.

Lemma 8 The operator {S_{(a,b)}} is of strong type {(p,p)} for {1<p<\infty}:

\displaystyle \|S_{(a,b)}(f)\|_{L^p({\mathbb R})}\lesssim_p \|f\|_{L^p({\mathbb R})}.

Note that the operator norm of {S_{(a,b)}} does not depend on {a,b}.

Now for {N>0} and {f\in\mathcal S({\mathbb R})} define the partial Fourier integral operator

\displaystyle S_N(f)(x)=\int_{-N} ^N \hat f(\xi) e^{2\pi i x \xi }d\xi=\int_{{\mathbb R}}\chi_{(-N,N)}(\xi)\hat f(\xi)e^{2\pi i x\xi}d\xi,\quad x\in {\mathbb R} .

Observe that these integrals are the {\chi_{(-N,N)}}-means of the integral {\int\hat f(\xi )e^{2\pi i x \xi}d\xi}. We have seen that the Gauss-Weierstrass or Abel means of this integral converge to {f}, both almost everywhere as well as in the {L^p} sense. However the function {\chi_{(-N,N)}} is much rougher. We still have the following theorem as a consequence of the {(p,p)} bound for the Hilbert transform.

Theorem 9 For {1<p<\infty} the operator {S_N} has a unique extension to a bounded linear operator on {L^p({\mathbb R})}. Since {S_N=S_{(-N,N)}}, Lemma 8 shows that its operator norm is bounded independently of {N}.

Moreover, the uniform {L^p} boundedness of the operators {S_N} controls the {L^p} convergence of partial Fourier integrals.

Lemma 10 The partial Fourier integrals {S_N(f)} converge to {f} in the {L^p} norm for {1<p<\infty} if and only if {S_N} is of strong type {(p,p)} uniformly in {N}.

Now Theorem 9 and Lemma 10 immediately imply:

Corollary 11 For {1<p<\infty } the partial Fourier integrals {S_N(f)} converge to {f} in the {L^p} norm.
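To make the definition of {S_N} concrete, the following Python sketch evaluates the partial Fourier integrals of the Gaussian {f(x)=e^{-\pi x^2}}, which equals its own Fourier transform under the normalization used here. For such a Schwartz function the convergence {S_N(f)\rightarrow f} is elementary, so this only illustrates the objects involved and not the depth of Corollary 11. The function names and the number of quadrature steps are hypothetical choices.

```python
import math

def gaussian(x):
    """f(x) = exp(-pi x^2); its Fourier transform is the same function."""
    return math.exp(-math.pi * x * x)

def partial_fourier_integral(x, N, steps=4000):
    """Midpoint rule for S_N(f)(x) = int_{-N}^{N} fhat(xi) e^{2 pi i x xi} d xi.

    fhat is real and even, so only the cosine part of the exponential survives.
    """
    h = 2.0 * N / steps
    total = 0.0
    for k in range(steps):
        xi = -N + (k + 0.5) * h
        total += gaussian(xi) * math.cos(2.0 * math.pi * x * xi)
    return total * h

x = 0.3
for N in (0.25, 0.5, 1.0, 2.0):
    error = abs(partial_fourier_integral(x, N) - gaussian(x))
    # The discarded frequencies contribute at most int_{|xi|>N} fhat = erfc(sqrt(pi) N).
    bound = math.erfc(math.sqrt(math.pi) * N)
    print(f"N={N:<5} error={error:.2e} tail bound={bound:.2e}")
```

The printed bound is the {L^1} mass of {\hat f} outside {(-N,N)}, which dominates {|S_N(f)(x)-f(x)|} uniformly in {x} for this example.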

The question whether {S_N(f)} converges to {f} almost everywhere is much harder. For {f\in L^p({\mathbb R})}, {1<p<\infty}, the answer is positive and this is the content of the famous Carleson-Hunt theorem. This theorem was first proved by Carleson for {L^2} and then extended to {L^p} by Hunt. A counterexample by Kolmogorov shows that both the {L^1} and the almost everywhere convergence of the partial Fourier integrals fail for {L^1}.

Exercise 7 Show that {S_N} extends to an operator of weak type {(1,1)} on {L^1({\mathbb R})} and that the partial Fourier integrals converge to {f} in measure for {f\in L^1({\mathbb R})}. Conclude that for almost every {x\in {\mathbb R}} there is a subsequence {\{N_k\}} such that {S_{N_k}(f)(x)\rightarrow f(x)}.

[Update 15th May 2011: Equation (7) moved to the right place, Exercise 1 slightly changed.]


About ioannis parissis

I'm a postdoc researcher at the Center for mathematical analysis, geometry and dynamical systems at IST, Lisbon, Portugal.
This entry was posted in Dmat0101 - Harmonic Analysis, math.CA, Mathematics, Teaching.

4 Responses to DMat0101, Notes 6: Introduction to singular integral operators; the Hilbert transform

  1. K says:

    Great article – I like the motivation you provided! In particular, Lemma 1 was a point I was unaware of when I first learned singular integrals, but it’s good to know because it tells you the most you can expect for L^p estimates. Also, it’s the same estimate Mf(x) \sim |x|^{-1} for the Hardy–Littlewood maximal inequality in one dimension as you remarked in the previous lecture.

    By the way, the link to Carleson’s theorem at the end doesn’t work. [Corrected thanks! Y.]

  2. Florian says:

    Thank you for the article, very pleasant to read.
    I have one additional question : I read somewhere that the Hilbert transform of a Hölder continuous function, say on [-a ; a], is still Hölder continuous on [-a/2 ; a/2].
    Do you have any idea how to prove it ?

    • Dear Florian,

      to be honest I don’t know of this specific result off the top of my head. I would try to argue via the Fourier transform of H(f) though. I will try to come back with a more precise answer. To clarify one thing, do you mean that your function is also compactly supported on [-a,a]?


      • Florian says:

        Thank you Yannis for this answer. I also had the idea of using the Fourier transform but I have not done the calculation so far (I have tried the naive way to show it, with some refinements, and it did not work…)
        My question was general, I did not assume that the function is compactly supported, only that it is square integrable on R.
        Yet, if it is simpler, let us assume this hypothesis at first.