DMat0101, Notes 2: Convolution, Dense subspaces and interpolation of operators

1. Convolutions and approximations to the identity

We restrict our attention to the Euclidean case {({\mathbb R}^n,\mathcal L,dx)}. As we have seen the space {L^1({\mathbb R}^n)} is a vector space; linear combinations of functions in {L^1({\mathbb R}^n)} remain in the space. There is however a `product’ defined between elements of {L^1({\mathbb R}^n)} that turns {L^1} into a Banach algebra. For {f,g\in L^1({\mathbb R}^n)} we define the convolution {f*g} to be the function

\displaystyle (f*g)(x)=\int_{{\mathbb R}^n} f(y)g(x-y)dy = \int_{{\mathbb R}^n} g(y)f(x-y)dy.

Furthermore, using Fubini’s theorem to change the order of integration we can easily see that

\displaystyle  \begin{array}{rcl}  \|f*g\|_{L^1({\mathbb R}^n)}\leq \|f\|_{L^1({\mathbb R}^n)}\|g\|_{L^1({\mathbb R}^n)}. \end{array}

Thus for {f,g\in L^1({\mathbb R}^n)} we have that their convolution {f*g} is again an element of {L^1({\mathbb R}^n)}. Note that the previous estimate is the main difficulty in showing that {(L^1({\mathbb R}^n),*)} is a Banach algebra.

More generally, the convolution of {f\in L^p({\mathbb R}^n)}, {1\leq p \leq +\infty}, and {g\in L^1({\mathbb R}^n)}, is a well defined element of {L^p({\mathbb R}^n)} and we have that

\displaystyle  \|f*g\|_{L^p({\mathbb R}^n)}\leq \|f\|_{L^p({\mathbb R}^n)}\|g\|_{L^1({\mathbb R}^n)}. \ \ \ \ \ (1)

Exercise 1 Use the integral version of Minkowski’s inequality to prove estimate (1) above.
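Estimate (1) can be sanity-checked numerically in a discrete model. The sketch below is only an illustration under an assumption not made in these notes: it replaces {{\mathbb R}^n} with the integers equipped with counting measure, where the same inequality holds for finitely supported sequences. The helper names `convolve` and `lp_norm` are hypothetical.

```python
def convolve(f, g):
    """Full discrete convolution of two finitely supported sequences."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, fi in enumerate(f):
        for j, gj in enumerate(g):
            out[i + j] += fi * gj
    return out


def lp_norm(seq, p):
    """l^p norm with respect to counting measure (finite p only)."""
    return sum(abs(v) ** p for v in seq) ** (1.0 / p)


# Arbitrary test sequences (assumed data, chosen only for illustration).
f = [1.0, -2.0, 0.5, 3.0]
g = [0.25, 0.5, -0.25]
h = convolve(f, g)

for p in (1, 2, 3):
    lhs = lp_norm(h, p)                   # ||f * g||_p
    rhs = lp_norm(f, p) * lp_norm(g, 1)   # ||f||_p ||g||_1
    print(p, lhs, rhs, lhs <= rhs + 1e-12)
```

Such a check proves nothing, of course, but it is a quick way to catch a misplaced exponent when working through Exercise 1.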

Let us summarize some properties of convolution in the following proposition. We take the chance to give two definitions here that we will use throughout these notes.

Definition 1 Let {X} be a topological space and {f:X\rightarrow {\mathbb C}} be a continuous function, {f\in C(X)}. The support of {f}, denoted by {{\mathrm{supp}}(f)}, is the set

\displaystyle {\mathrm{supp}}(f)=\overline{\{x\in X:f(x)\neq 0\}}=\overline{f^{-1}({\mathbb C}\setminus\{0\})}.

This is the smallest closed set in {X} outside which {f=0}.

Observe that we gave the definition of the support of a function for continuous functions. This is mostly a technical issue. It is easily understood that, in general, the support of a measurable function can only be defined up to sets of measure zero. The precise definition is as follows.

Definition 2 Let {\mu} be a regular Borel measure on a topological space {X} and {f:X\rightarrow {\mathbb C}} be a Borel measurable function. A point {x\in X} is called a support point for {f} if

\displaystyle \mu(\{y\in U_x:f(y)\neq 0\})>0 ,

for every open neighborhood {U_x} of {x}. The set

\displaystyle {\mathrm{supp}}(f)=\{x\in X: x \mbox{ is a support point for } f\}

is called the support of {f}.

Exercise 2 Assume that the measure {\mu} in the previous definition has the additional property that {\mu(U)>0} for every open set {U\subset X}. Use exercise 1 of notes 1 to prove that for any continuous function {f:X\rightarrow {\mathbb C}} the two definitions of {{\mathrm{supp}}(f)}, that is Definition 1 and Definition 2, coincide.

Proposition 3 Let {f,g,h:{\mathbb R}^n\rightarrow {\mathbb C}} be such that the convolutions below are well defined.

(i) (commutative) {f*g=g*f.}

(ii) (associative) {(f*g)*h=f*(g*h).}

(iii) (translations) For {x,y\in {\mathbb R}^n} and {f:{\mathbb R}^n\rightarrow{\mathbb C}} we define the translation operator

\displaystyle \tau_y(f)(x)=f(x-y).

For {y\in{\mathbb R}^n} we have

\displaystyle \tau_y(f*g)=(\tau_y f)*g=f*(\tau_yg)


(iv) (support) If {f,g\in C({\mathbb R}^n)} then

\displaystyle {\mathrm{supp}} (f*g)\subset \overline{\{x+y:x\in{\mathrm{supp}}(f),y\in{\mathrm{supp}}(g)\}}.

Proof: Statements (i), (ii) and (iii) are trivial consequences of changes of variables and Fubini’s theorem. For (iv) observe that if {z\notin \overline{\{x+y:x\in {{\mathrm{supp}}}(f),y\in{{\mathrm{supp}}}(g)\}}} then for any {y\in{\mathrm{supp}}(g)} we have {z-y\notin {\mathrm{supp}} (f)}. Thus {g(y)f(z-y)=0} for all {y\in{\mathbb R}^n}, so {(f*g)(z)=0}. \Box

A very useful property of the convolution of two functions is that it adopts the smoothness of the `nicest’ function. Formally this is because any differentiation operator applied to {f*g} can be transferred to either {f} or {g}:

\displaystyle \partial^\alpha (f*g)=(\partial^\alpha f)*g=f*(\partial^\alpha g) .

Here we use the standard multi-index notation: for {\alpha=(\alpha_1,\ldots,\alpha_n)\in{\mathbb N}^n} and {f:{\mathbb R}^n \rightarrow {\mathbb C}} we write as usual {\partial^\alpha f=\frac{\partial^{\alpha_1}}{\partial x_1 ^{\alpha_1}}\cdots \frac{\partial^{\alpha_n}}{\partial x_n ^{\alpha_n}} f}. We also write {|\alpha|=\alpha_1+\cdots+\alpha_n}. In practice we need one of the functions to have some regularity and some mild conditions on the second function to do this rigorously. For example we have the following:

Proposition 4 Let {f\in L^1({\mathbb R}^n)} and suppose that {g} has continuous partial derivatives up to {k}-th order, that is {g\in C^k({\mathbb R} ^n)}. Suppose also that {\partial ^\alpha g} is bounded for all {|\alpha|\leq k}. Then {f*g} has continuous derivatives up to {k}-th order, i.e. {f*g\in C^k({\mathbb R}^n)}, and {\partial^\alpha (f*g)=f*(\partial^\alpha g)} for all {|\alpha|\leq k}.

Proof: Let’s just see the special case {n=1} and {k=1}. The proof in the general case is identical. Call {d\mu(y)=f(y)dy}. Since {f\in L^1({\mathbb R})}, {d\mu} is a finite, absolutely continuous measure. We then need to show that

\displaystyle  \frac{d}{dx}\int_{{\mathbb R}} g(x-y)d\mu(y)=\int_{\mathbb R} g'(x-y)d\mu(y).

Fix some sequence {x_n\rightarrow x} with {x_n\neq x} and set {h_n(y)=\frac{g(x_n-y)-g(x-y)}{x_n-x}}, so that {g'(x-y)=\lim_n h_n(y)}. By the mean value theorem we have that

\displaystyle  |h_n(y)|\leq \|g'\|_{\infty}.

Using Lebesgue’s dominated convergence theorem we get

\displaystyle  \begin{array}{rcl} (f*g)'(x) &=&\lim_n \int_{{\mathbb R}}\frac{g(x_n-y)-g(x-y)}{x_n-x}d\mu(y)= \int_{{\mathbb R} }\lim_n h_n(y)d\mu(y)\\ \\&=& \int_{\mathbb R} g'(x-y)d\mu(y).\end{array}

Observe that the hypothesis on the boundedness of the higher order derivatives will be used to show the uniform boundedness of (the analogues of) the difference quotients {h_n(y)} in the general case. \Box

It is instructive to fix one function {g} to be an indicator function, say {g_1(x)=\frac{1}{2}\chi_{(-1,1)}(x)} where the constant {1/2} is there just in order to normalize the total {L^1}-mass of the function {g_1} to {1}. Usually we consider smooth versions of {g_1} but let’s just stick to the case of the characteristic function for the sake of simplicity. Consider the `reflection’ of {g_1} given by {\tilde g_1(t)=g_1(-t)}. Since we have started with an even function this makes no difference, so that {g_1=\tilde g_1}. Observe that we can write

\displaystyle  (f*g_1)(x)=\int f(y)g_1(x-y)dy=\int f(y)\tilde g_1(y-x) dy=\int f(y)(\tau_x\tilde g_1)(y)dy.

For some fixed {x\in{\mathbb R}}, the translation {\tau_x\tilde g_1} of {\tilde g_1} by {x} centers the function {\tilde g_1} at the point {x}. So {\tau_x\tilde g_1} is (a multiple of) the indicator function of an interval of length {2}, centered at {x}. Integrating against {f(y)} essentially averages the function {f} around the point {x} with `weight’ the function {\tilde g_1}. In this averaging process, our choice of {g_1} implies that only the values of {f} at a scale {1} around {x} will be important. Thus the convolution of {f} and {g_1} replaces the value of {f} at a point with the average of the values of {f} at a scale {1} around that point.

One can take this process one step further and consider sequences of functions that are more and more concentrated around the origin, but have the same {L^1} mass, say {1}. For example the second function in this sequence would be {g_2=\chi_{(-\frac{1}{2},\frac{1}{2})}}, the third could be {g_3(x)=2\chi_{(-\frac{1}{4},\frac{1}{4})}}, and so on. Taking convolutions of the function {f} with the functions {g_1,g_2,g_3,\ldots} amounts to averaging the function {f} around every point, in smaller and smaller scales around the point. Intuitively one thinks that in the limit, one should recover the function itself, at least in some weak sense. This turns out to be indeed the case.

But what’s the gain? We just saw that taking convolutions of an integrable (say) function with a smooth bounded function gives us again a smooth function. Thus the previous process allows us to approximate (in some sense) any reasonable function by a sequence of very smooth functions. This has many technical advantages as one can think of any function as a limit, in the appropriate sense, of smooth approximations. This also gives a heuristic explanation of why the convolution of two functions behaves at least as well as the `nicest’ function in the convolution; averaging is a smoothing process.
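The averaging heuristic can be observed numerically. The following is a minimal sketch under assumptions of our own choosing: we take {f} to be the indicator function of {[0,1]} and average it against the normalized indicator {\frac{1}{2h}\chi_{(-h,h)}} for a few scales {h}; the function names and the integration window are hypothetical. For this particular {f} and {h\leq 1/2}, a direct computation shows that the {L^1} distance between the average and {f} equals {h}, and the Riemann sums below should be close to that.

```python
def f(x):
    """Indicator function of the interval [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def average(x, h):
    """Mean of f over (x - h, x + h), i.e. f convolved with (1/(2h)) * indicator of (-h, h)."""
    overlap = min(x + h, 1.0) - max(x - h, 0.0)
    return max(overlap, 0.0) / (2.0 * h)


def l1_error(h, a=-1.0, b=2.0, steps=30000):
    """Midpoint Riemann sum for the L^1 distance between the average and f on [a, b]."""
    dx = (b - a) / steps
    total = 0.0
    for k in range(steps):
        x = a + (k + 0.5) * dx
        total += abs(average(x, h) - f(x)) * dx
    return total


scales = (0.5, 0.25, 0.125)
errors = [l1_error(h) for h in scales]
for h, err in zip(scales, errors):
    print(h, err)
```

The error shrinks with the scale, which is the {L^1} convergence one expects from an approximation to the identity, even though {f} itself is discontinuous.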

We will now make the previous heuristic discussion precise. Let {\phi} be a function on {{\mathbb R}^n} and {t>0}. We define the dilations of the function {\phi} to be

\displaystyle \phi_t(x)=\frac{1}{t^n}\phi(\frac{x}{t}),\quad x\in \mathbb R^n.

Usually we will have a lot of freedom in choosing the function {\phi} and we will require at least that {\phi \in L^1({\mathbb R}^n)}. Observe that dilating the function {\phi} by {t>0} doesn’t change the integral:

\displaystyle  \int_{{\mathbb R}^n} \phi_t (x) dx=\frac{1}{t^n} \int_{{\mathbb R}^n} \phi\big(\frac{x}{t}\big)dx =\int_{{\mathbb R}^n} \phi(x)dx.

You should think of the function {\phi} as a function concentrated around a point as was for example {g_1} in the previous discussion or, even better, as a smooth approximation of it (a bump function). Thus for example {\phi } could be a smooth function with compact support around the origin. Observe that as {t\rightarrow 0}, the mass of the function {\phi_t}, which is constant, becomes more and more concentrated around the origin. We will refer to this construction as `an approximation to the identity’. The reason is that, as was mentioned before, one can recover any reasonable function {f} by convolving with {\phi_t} and taking the limit as {t\rightarrow 0}, at least in the {L^p} sense. A more rigorous explanation is that {\phi_t} converges (in a weak sense) to a Dirac mass at {0}.

Theorem 5 Let {\phi\in L^1({\mathbb R}^n)} with {\int \phi(x) dx =1}. For {t>0} define the dilations of {\phi} as before, {\phi_t(x)=t^{-n}\phi(x/t)}. Then, for any {1\leq p<\infty} and any {f\in L^p({\mathbb R}^n)}, we have that {f*\phi_t \rightarrow f} in {L^p} as {t\rightarrow 0}:

\displaystyle 	\|f*\phi_t -f \|_{L^p({\mathbb R}^n)}\rightarrow 0\quad\mbox{as}\quad t\rightarrow 0.

Proof: For {y\in {\mathbb R}^n} we use the notation

\displaystyle  (\tau_yf)(x)=f(x-y),

for the translation operator. Using the fact that {\phi_t} has integral {1} we can write

\displaystyle  \begin{array}{rcl}  	(f*\phi_t)(x)-f(x)&=&\int_{{\mathbb R}^n} [f(x-y)-f(x)]\phi_t(y)dy\\ \\ 	&=&\int_{{\mathbb R}^n}[f(x-tu)-f(x)]\phi(u)du\\ \\ 	&=&\int_{{\mathbb R}^n}[(\tau_{tu}f)(x)-f(x)]\phi(u)du. \end{array}

By Minkowski’s integral inequality we get that

\displaystyle  \begin{array}{rcl}  	\| f*\phi_t -f\|_{L^p({\mathbb R}^n)}& =&\bigg\| \int_{{\mathbb R}^n}[(\tau_{tu}f)(x)-f(x)]\phi(u)du\bigg\|_{L^p({\mathbb R}^n)}\\ \\ 	 &\leq& \int_{{\mathbb R}^n}\|(\tau_{tu}f)(x)-f(x)\|_{L^p({\mathbb R}^n)}|\phi(u)|du. \end{array}

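Now, for every fixed {u\in{\mathbb R}^n}, we have {\|\tau_{tu}f-f\|_{L^p({\mathbb R}^n)}\rightarrow 0} as {t\rightarrow 0} (see remark below). Moreover {\|\tau_{tu}f-f\|_{L^p({\mathbb R}^n)}\leq 2 \|f\|_{L^p({\mathbb R}^n)}}, so the integrand is dominated by the integrable function {2\|f\|_{L^p({\mathbb R}^n)}|\phi(u)|} and the dominated convergence theorem gives the result. \Box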

Remark 1 The translation operator is continuous in {L^p} for all {1\leq p <\infty}, that is

\displaystyle \|\tau_y f -f\|_{L^p({\mathbb R}^n)}\rightarrow 0\quad\mbox{as}\quad y\rightarrow 0,

for all {f\in L^p({\mathbb R}^n)}, {1\leq p <\infty}.

Observe that for {p=\infty}, {\|\tau_yf -f\|_{L^\infty({\mathbb R}^n)}\rightarrow 0} as {y\rightarrow 0} means that {f} is uniformly continuous. This shows why the previous theorem breaks down in {L^\infty}.

Exercise 3 Show that the translation operator is continuous in {L^p({\mathbb R}^n)} for {1\leq p <\infty}. Use the fact that continuous functions with compact support are dense in {L^p} for {1\leq p <\infty}. See also section 2.

Exercise 4 (i) Show that

\displaystyle 	\|f*\phi_t-f\|_{\infty}\rightarrow 0,

as {t\rightarrow 0}, for all {f} which are bounded and uniformly continuous.

(ii) If {f} is bounded and continuous on {{\mathbb R}^n} show that

\displaystyle f*\phi_t\rightarrow f

as {t\rightarrow 0}, uniformly on compact subsets of {{\mathbb R}^n}.

Remark 2 There is a slight abuse of notation here. We use {\|\cdot\|_\infty} for the norm in the space {L^\infty} defined in terms of the essential supremum of a function. However, the right norm in spaces of continuous functions should be defined in terms of the actual supremum of the function. Note however that for a continuous function, the two notions are identical so this should create no confusion.

Exercise 5 Let {1\leq p\leq \infty} and {p'} be its dual exponent. Suppose that {f\in L^p({\mathbb R}^n)} and {g\in L^{p'}({\mathbb R}^n)}. Show that {(f*g)(x)} exists for every {x\in {\mathbb R}^n} and that {f*g} is a continuous function which, when {1<p<\infty}, decays to zero at infinity. Also show the estimate

\displaystyle  \|f*g\|_{L^\infty({\mathbb R}^n)} \leq \|f\|_{L^p({\mathbb R}^n)}\|g\|_{L^{p'}({\mathbb R}^n)}.

Remark 3 If {\mu} is a finite Borel measure on {{\mathbb R}^n} and {f\in L^p({\mathbb R}^n)} it makes perfect sense to define the convolution of {f} with {\mu} to be the function

\displaystyle (f*\mu)(x)=\int_{{\mathbb R}^n}f(x-y)d\mu (y).

We then have

\displaystyle  \|f*\mu\|_{L^p({\mathbb R}^n)}\leq \|\mu\| \|f\|_{L^p({\mathbb R}^n)},

where {\|\mu\|} is the total variation of the measure {\mu}.

2. Some dense classes of functions

In this paragraph we will discuss some dense sub-classes of functions inside the {L^p} spaces. These will prove to be very useful as many estimates will be easier to establish for these special sub-classes. Also, many times, working with a dense class in {L^p} helps us avoid several technical difficulties or even define operators that are not obviously defined directly on some {L^p} space. We will state some of the results here in the generality of a Hausdorff (or locally compact Hausdorff) space, noting that everything goes through for {{\mathbb R}^n} equipped with the Lebesgue measure.

Simple functions: Let {S} be the class of all simple functions {s:X\rightarrow{\mathbb C}} such that

\displaystyle \mu(\{x\in X:s(x)\neq 0\})<\infty,

that is, all simple complex valued functions that have support of finite measure. For {1\leq p<\infty} the space {S} is dense in {L^p(X,\mu)}. The space of all simple functions (not necessarily with support of finite measure) is dense in {L^p} for {1\leq p \leq \infty}.

Continuous functions with compact support: Let {(X,\mathcal X,\mu)} be a measure space, where {X} is a locally compact Hausdorff space, {\mathcal X} is a {\sigma}-algebra that contains all compact subsets of {X}, and the measure {\mu} has the following properties:

(i) locally finite: {\mu(K)<+\infty} for all compact sets {K\subset X}.

(ii) {\mu} is inner regular, meaning {\mu(A)=\sup\{\mu(K):K\subset A, K\mbox{ is compact.}\}}

(iii) {\mu} is outer regular, meaning {\mu(A)=\inf\{\mu(U): A\subset U, U\in\mathcal X\mbox{ and }U\mbox{ is open.}\}}

We denote by {C_c(X)} the space of continuous functions {f:X\rightarrow {\mathbb C}} with compact support. Then, for every {1\leq p < \infty}, {C_c(X)} is dense in {L^p(X,\mu)}.

Remark here that whenever we embed {C_c(X)} into {L^p(X,\mu)}, {C_c(X)} automatically inherits the topology induced by the larger space, that is, the one defined by the norm {\|\cdot\|_{L^p(X,\mu)}}. Since {L^p} spaces are complete under our hypotheses, this says that {L^p(X,\mu)} is the completion of {C_c(X)} with respect to the norm of {L^p(X,\mu)} for {p<\infty}. For {p=\infty}, the completion of {C_c(X)} with respect to the {\|\cdot\|_{L^\infty(X,\mu)}} is not {L^\infty(X,\mu)} but the space of continuous functions on {X} that vanish at infinity.

Continuous functions that vanish at infinity: Let {X} be a locally compact Hausdorff space (a Hausdorff space where every point has a compact neighborhood). A function {f:X\rightarrow {\mathbb C}} is said to vanish at infinity if for every {\epsilon>0} there exists a compact set {K\subset X} such that {|f(x)|<\epsilon } for all {x\notin K}. We denote by {C_o(X)} the space of all complex valued continuous functions on {X} that vanish at infinity.

It is clear that {C_c(X)\subset C_o(X)}, and actually the two spaces coincide whenever {X} is compact. We can equip the space {C_o(X)} with the norm

\displaystyle \|f\|_{\infty}=\sup_{x\in X} |f(x)|.

Theorem 6 If {X} is a locally compact Hausdorff space, then {C_o(X)} is the completion of {C_c(X)} with respect to the supremum norm defined above.

For the proofs of the previous classical results see for example [F] or [R].

All the previous results apply to the Euclidean setup {({\mathbb R}^n,\mathcal L,dx)}. Of course simple functions with support of finite measure are dense in {L^p({\mathbb R}^n)} whenever {1\leq p <+\infty}. A bit more can be said as we can choose our simple functions to be linear combinations of indicator functions of ({n}-dimensional) bounded intervals, and these are still dense in {L^p({\mathbb R}^n)}. Continuous functions with compact support are also dense in {L^p({\mathbb R}^n)} for all {1\leq p <\infty}. We can also restrict to a smaller class of more regular functions:

Infinitely differentiable functions with compact support: Let us consider the space of functions {f:{\mathbb R}^n\rightarrow {\mathbb C}} which are infinitely differentiable and have compact support. We denote this space by {\mathcal{D}({\mathbb R}^n)=C_c ^\infty ({\mathbb R}^n)}. First of all it is not totally trivial that this space contains anything other than the zero function.

Lemma 7 There exists a function {\phi_1\in \mathcal D({\mathbb R})} which is not identically zero. From this we easily conclude that there is a non-zero {\phi\in \mathcal D({\mathbb R}^n)}.

Exercise 6 Consider the function

\displaystyle g(t)=\begin{cases} e^{-\frac{1}{t}}\quad t>0,\\ 0,\quad\mbox{otherwise}.\end{cases}

(i) Show that {g} is infinitely differentiable and that {g}, together with its derivatives of any order, is bounded.

(ii) Consider the function {\phi_1(t)=g(1+t)g(1-t)}. Show that {\phi_1(t)=e^{-2/(1-t^2)}} if {|t|<1} and {\phi_1(t)=0} otherwise. It is obvious then that {\phi_1\in\mathcal D({\mathbb R})}.

(iii) For {x=(x_1,\ldots,x_n)\in{\mathbb R}^n} show that the function {\phi(x)=\phi_1(x_1)\cdots\phi_1(x_n)} belongs to {\mathcal D({\mathbb R}^n)}.

(iv) For {x=(x_1,\ldots,x_n)\in{\mathbb R}^n} consider the function

\displaystyle \psi(x)=\begin{cases}e^{-2/(1-|x|^2)},\quad |x|<1,\\ 0,\quad\mbox{otherwise}.\end{cases}.

Show that {\psi\in\mathcal D({\mathbb R}^n)}.
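The closed form in part (ii) of the exercise is easy to test numerically before proving it. The sketch below is only an illustration: the Python names `g`, `phi1` and `phi` mirror the functions of the exercise, while the sample points are our own choice.

```python
import math


def g(t):
    """g(t) = exp(-1/t) for t > 0 and 0 otherwise."""
    return math.exp(-1.0 / t) if t > 0 else 0.0


def phi1(t):
    """One-dimensional bump phi_1(t) = g(1 + t) * g(1 - t); it vanishes for |t| >= 1."""
    return g(1.0 + t) * g(1.0 - t)


def phi(x):
    """Product bump on R^n; x is a sequence of coordinates."""
    return math.prod(phi1(xi) for xi in x)


# Compare phi_1 with the closed form exp(-2 / (1 - t^2)) on a few sample points.
for t in (-1.5, -1.0, -0.5, 0.0, 0.5, 1.0):
    closed_form = math.exp(-2.0 / (1.0 - t * t)) if abs(t) < 1 else 0.0
    print(t, phi1(t), closed_form)
```

Note that the product bump `phi` is supported in the cube {[-1,1]^n}, whereas the radial function {\psi} of part (iv) is supported in the closed unit ball.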

Obviously {\mathcal D ({\mathbb R}^n)\subset C_c ({\mathbb R}^n)}. However, it is not hard to see the space {\mathcal D ({\mathbb R}^n)} is still dense in {L^p({\mathbb R}^n)} for {1\leq p <\infty}. It will however be easier to show that once we’ve introduced some more tools from real analysis and, in particular, convolution.

Schwartz functions: Here we introduce the space of Schwartz functions {\mathcal S({\mathbb R}^n)}, which will turn out to be extremely useful in what follows. So let {\mathcal S ({\mathbb R}^n)} be the space of all infinitely differentiable ({C^\infty}) functions {f:{\mathbb R}^n\rightarrow {\mathbb C}} such that

\displaystyle  \sup_{x\in{\mathbb R}^n}| x^\alpha D^\beta f(x)|<\infty,

for all multi-indices {\alpha=(\alpha_1,\ldots,\alpha_n),\beta=(\beta_1,\ldots,\beta_n)}, of nonnegative integers. In other words, Schwartz functions are smooth functions that, together with their partial derivatives of every order, decay faster than any negative power of {|x|} at infinity. Of course every function in the class {\mathcal D({\mathbb R}^n)} is trivially a Schwartz function since it vanishes identically outside a compact set together with its derivatives of every order. A more interesting example of a Schwartz function is the Gaussian function {\phi:{\mathbb R}^n\rightarrow {\mathbb R}}:

\displaystyle \phi(x)=e^{-\delta |x|^2}, \quad \delta>0.

The space {\mathcal S({\mathbb R}^n)} is also dense in all {L^p({\mathbb R}^n)} spaces for {1\leq p<\infty}. Of course this is immediate once one shows that {\mathcal D({\mathbb R}^n)} is dense in {L^p({\mathbb R}^n)}.

Schematically we have the following inclusions

\displaystyle  \begin{array}{rcl}  	\mathcal D({\mathbb R}^n) &\subset& \mathcal S({\mathbb R}^n) \subset L^p({\mathbb R}^n),\\ \\ 	\mathcal D({\mathbb R}^n) &\subset& C_c({\mathbb R}^n) \subset L^p({\mathbb R}^n). \end{array}

and each space in this chain is dense in {L^p({\mathbb R}^n)} with the topology induced by {L^p({\mathbb R}^n)} for {1\leq p <\infty}. We will discuss the space of Schwartz functions in much more detail in what follows. For now you can think of it as another nice class of functions that is dense in all the spaces {L^p({\mathbb R}^n)} for {1\leq p <\infty}.

In the following proposition we use convolutions to show the previous denseness properties:

Proposition 8 The space {\mathcal D({\mathbb R}^n)}, and thus also the space {\mathcal S({\mathbb R}^n)}, is dense in {L^p({\mathbb R}^n)} for all {1\leq p<\infty}. Also the space {\mathcal D({\mathbb R}^n)} is dense in {C_o({\mathbb R}^n)} in the supremum norm.

Proof: Let {f\in L^p({\mathbb R}^n)} and {\epsilon>0}. Since the space {C_c({\mathbb R}^n)} is dense in {L^p({\mathbb R}^n)}, there is a {g\in C_c({\mathbb R}^n)} such that

\displaystyle \|f-g\|_{L^p({\mathbb R}^n)}<\frac{\epsilon}{2}.

Let {\phi\in \mathcal D({\mathbb R}^n)} with {\int \phi =1}. By Theorem 5 we have that {g*\phi_t\rightarrow g} in {L^p({\mathbb R}^n)} as {t\rightarrow 0}. Thus for {t} small enough we have that

\displaystyle \|g*\phi_t-g\|_{L^p({\mathbb R}^n)}<\frac{\epsilon}{2}.

We conclude that

\displaystyle \|g*\phi_t-f\|_{L^p({\mathbb R}^n)}<\epsilon.

It remains to verify that {g*\phi_t} is in {\mathcal D({\mathbb R}^n)} for every {t>0}. Note however that {g*\phi_t} is smooth by Proposition 4. Also, since both {g} and {\phi_t} have compact support, Proposition 3 shows that {g*\phi_t} also has compact support and we are done. Observe that the same argument applies if we start with an {f\in C_o({\mathbb R}^n)}. Using the fact that {C_c({\mathbb R}^n)} is dense in {C_o({\mathbb R}^n)} in the supremum norm, it suffices to approximate a function {g\in C_c({\mathbb R}^n)}. However, functions in {C_c({\mathbb R}^n)} are bounded and uniformly continuous, so Exercise 4 completes the proof in this case as well. \Box

Let us go back to approximations of the identity and justify their name.

Exercise 7 (convergence of approximations to the identity in the sense of distributions) For {a\in{\mathbb R}^n} we denote by {\delta_a} the Dirac measure at the point {a}:

\displaystyle 	\int_E d\delta_a=\begin{cases} 1, \quad a\in E,\\ 0,\quad a\notin E. \end{cases}

Let {\phi\in L^1 ({\mathbb R}^n)} with {\int_{{\mathbb R}^n}\phi=1 } and consider the approximation to the identity {\phi_t(x)=t^{-n}\phi(x/t)}, {t>0}. Show that

\displaystyle \lim_{t\rightarrow 0}\int_{{\mathbb R}^n}\phi_t(x)\psi(x)dx=\int_{{\mathbb R}^n} \psi (x)d\delta_0(x),

for every {\psi\in \mathcal D({\mathbb R}^n)}. We say that {\phi_t(x)} (considered as a sequence of finite measures) converges in the sense of distributions to the measure {d\delta_0}. We will come back to that point later on in the course.

3. Operators on {L^p} spaces; boundedness and interpolation

Having set up our main environment, the spaces {L^p(X,\mathcal X,\mu)}, we come to the core of this introduction: operators acting on these spaces and their properties. In general, we will consider operators {T} taking functions on some measure space {(X,\mathcal X,\mu)} to functions on some other measure space {(Y,\mathcal Y,\nu)}. Many times our operators will be initially defined on `nice functions’ such as smooth functions with compact support or Schwartz functions. The goal would then be to extend the operator to a standard normed vector space such as {L^p(X,\mu)}.

Suppose that {(Z,\| \cdot \| _Z )} and {(W,\|\cdot\|_W)} are two normed vector spaces (usually Banach spaces of functions) and that {T:Z\rightarrow W} is a linear operator, that is, we have

\displaystyle T(a x+by)=aTx+bTy,

for all {x,y\in Z} and complex numbers {a,b}. We will say that {T} is bounded if there is a constant {c>0} such that {\|Tz\|_W \leq c \|z\|_Z} for every {z\in Z}. The norm of the operator {T}, denoted by {\|T\|_{Z\rightarrow W}} or just {\|T\|}, is the smallest constant {c\geq 0} so that such an inequality is true. We thus have

\displaystyle \|T\|=\sup_{z\in Z,\, z\neq 0}\frac{\|Tz\|_W}{\|z\|_Z}=\sup_{\|z\|_Z\leq 1}\|Tz\|_W.

Continuity is equivalent to boundedness for linear operators:

Lemma 9 Let {T:Z\rightarrow W} be a linear operator. The following are equivalent:

(i) The operator {T} is continuous.

(ii) The operator {T} is continuous at {0}.

(iii) The operator {T} is bounded.

Suppose that we want to show that a linear operator {T:Z\rightarrow W} is a well defined bounded linear operator, where {Z,W} are Banach spaces. Many times however we can only define the operator on some dense subspace {Z_o\subset Z}. Suppose we have then that {T:Z_o\rightarrow W}. When can we extend {T} to the whole space {Z}? Given {z\in Z}, the obvious thing to do is to consider some sequence {\{z_n\}\subset Z_o} such that {z_n\rightarrow z}. We then need to examine whether the limit of {\{Tz_n\}} exists. Suppose that {T} is bounded on the dense sub-class, that is,

\displaystyle \|Tz\|_W\leq \|T\| \|z\|_{Z},

for all {z\in Z_o}. Using the boundedness of {T} on the dense class and linearity (essential) we can conclude that

\displaystyle \|Tz_m-Tz_n\|_W \leq \|T\| \|z_m-z_n\|_Z,

so the sequence {\{Tz_n\}} is a Cauchy sequence. The completeness of {W} then implies that the limit of {\{Tz_n\}} does indeed exist, so we can define

\displaystyle Tz:=\lim_n Tz_n.

Observe also that for any other sequence {y_n\rightarrow z} we must have

\displaystyle  \begin{array}{rcl}  	\|\lim_n Tz_n-\lim_n Ty_n\|_W &\leq& \|\lim_n Tz_n- Tz_{n_o}\|_W+ \|\lim_n Ty_n-Ty_{n_o}\|_W \\&+ &\|T\|\|z_{n_o}-y_{n_o}\|_Z, \end{array}

for any {n_o\in \mathbb N}. From this we conclude that {\lim_nTz_n=\lim_nTy_n} thus the extension is unique. Many times we will only define the operator {T} on the dense class and show its continuity on the dense sub-class. We will then say that {T} is densely defined.

We will use this device many times in trying to show that some linear operator {T:L^p\rightarrow L^q} is well defined and bounded, by examining the continuity of {T} on one of the dense classes that we have considered before (depending on what is more convenient).

A more general class of operators we will come across quite often is that of sublinear operators. Suppose that {T} is an operator acting on a vector space of measurable functions. Then {T} is called sublinear if {|T(af)|=|a||Tf|} for all complex constants {a} and

\displaystyle |T(f+g)(x)|\leq |T(f)(x)|+|T(g)(x)|,

for all {f,g} in the vector space. Of course all linear operators are sublinear. However, the most typical example of a sublinear operator we will come across is a maximal type operator. Such an operator has the form

\displaystyle Tf=\sup_{t\in \Lambda} |T_t f|,

where {\{T_t\}_{t\in\Lambda}} is a family of linear operators acting on some vector space of measurable functions, {\Lambda} is an infinite countable or uncountable index set, and the function {t\rightarrow T_t f} is a measurable function of {t}. Such operators are called maximal operators and the linearity of each {T_t} guarantees that {T} is sublinear.
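To make the definition concrete, here is a minimal sketch of a maximal operator in a discrete model. Everything in it is an assumption made for illustration: functions are finite lists, the linear operators {T_r} are averages over windows of radius {r} clipped to the index range, and the names `averages` and `maximal` are hypothetical. The code then checks the two defining properties of sublinearity on sample data.

```python
def averages(f, i):
    """Yield the averages of f over the windows [i - r, i + r], clipped to the index range."""
    n = len(f)
    for r in range(n):
        lo, hi = max(i - r, 0), min(i + r, n - 1)
        yield sum(f[lo:hi + 1]) / (hi - lo + 1)


def maximal(f):
    """Tf(i) = max over r of |T_r f(i)|, where T_r is the linear averaging operator at radius r."""
    return [max(abs(a) for a in averages(f, i)) for i in range(len(f))]


# Sample data (assumed, for illustration only).
f = [1.0, -3.0, 2.0, 0.0, 4.0]
g = [-2.0, 1.0, 1.0, 5.0, -1.0]
Tf, Tg = maximal(f), maximal(g)
Tsum = maximal([a + b for a, b in zip(f, g)])

# Pointwise subadditivity: T(f + g) <= Tf + Tg.
print(all(s <= a + b + 1e-12 for s, a, b in zip(Tsum, Tf, Tg)))
# Homogeneity: T(c f) = |c| Tf.
c = -2.5
print(all(abs(s - abs(c) * a) < 1e-12 for s, a in zip(maximal([c * v for v in f]), Tf)))
```

The windows depend only on the position and the radius, not on the function, so each averaging operator is linear; taking the supremum of absolute values is what destroys linearity while keeping sublinearity.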

Definition 10 (i) Let {0<p,q\leq \infty} and {T} be a sublinear operator on {L^p(X,\mu)}. We will say that {T} is of strong type {(p,q)} if

\displaystyle \|Tf\|_{L^q(Y)} \lesssim_{p,q,T}\|f\|_{L^p(X)} ,

for all {f\in L^p(X)}, where the implied constant depends only on {p,q} and {T}. In this case we write {\|T\|_{L^p\rightarrow L^q}} for the norm of the operator {T:L^p\rightarrow L^q}.

(ii) We will say that {T} is of weak type {(p,q)} if

\displaystyle  \|Tf\|_{L^{q,\infty}(Y,\nu)}\lesssim_{p,q,T} \|f\| _{L^p(X,\mu)},

for all {f\in L^p(X,\mu)}. We will write {\|T\|_{L^p\rightarrow L^{q,\infty}}} for the norm of the operator {T:L^p \rightarrow L^{q,\infty}}.

Observe that for fixed {(p,q)}, the strong type {(p,q)} property of {T} trivially implies that {T} is of weak type {(p,q)}. The opposite, of course, is not true. However, we will see that in many cases the strong type bound can be deduced by interpolating between suitable endpoint weak type bounds. The first such result is the Marcinkiewicz interpolation theorem.

Theorem 11 (Marcinkiewicz interpolation theorem) Let {(X,\mu)} and {(Y,\nu)} be measure spaces, {1\leq p_1<p_2 \leq \infty}, and let {T} be a sublinear operator defined on {L^{p_1}(X,\mu)+L^{p_2}(X,\mu)} and taking values in the space of measurable functions on {(Y,\nu)}. Suppose that {T} is of weak type {(p_1,p_1)} and of weak type {(p_2,p_2)}. Then {T} is of strong type {(p,p)} for any {p_1<p<p_2}.

Remark 4 Before going into the proof of this theorem let us discuss a bit its hypothesis. Given a function {f\in L^p(X,\mu)} we first need to show that {T(f)} is well defined. Having the information that {T} is well defined on {L^{p_1}+L^{p_2}} we essentially need to see that {L^p\subset L^{p_1}+L^{p_2}} whenever {p_1<p<p_2}. To see this, fix a positive constant {\beta>0}, to be defined later, and consider the functions

\displaystyle f_1(x)=f(x)\chi_{\{x\in X: |f(x)|>\beta\}},

\displaystyle f_2(x)=f(x)\chi_{\{x\in X: |f(x)|\leq \beta\}}.

Obviously we have {f(x)=f_1(x)+f_2(x)}. Moreover,

\displaystyle \int_X |f_1(x)|^{p_1}dx =\int_X |f_1(x)|^{p}|f_1(x)|^{p_1-p} dx \leq \beta ^{p_1-p}\int_X |f(x)|^p dx.

Similarly we can estimate

\displaystyle \int_X |f_2(x)|^{p_2}dx =\int_X |f_2(x)|^{p}|f_2(x)|^{p_2-p} dx \leq \beta ^{p_2-p}\int_X |f(x)|^p dx.

This shows that we can decompose any function {f\in L^p(X,\mu)} to a sum of two functions {f_1\in L^{p_1}(X,\mu)} and {f_2\in L^{p_2}(X,\mu)}, whenever {p_1<p<p_2}, thus {L^p\subset L^{p_1}+L^{p_2}}. In particular, {T(f)} is well defined for any {f\in L^p(X,\mu)}.
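The two estimates of this remark can be checked on a toy example. The sketch below assumes, purely for illustration, a finite set with counting measure, so that integrals become finite sums and both inequalities hold exactly; the helper names and the sample values are hypothetical.

```python
def split_at_height(f, beta):
    """Split f into f1 (the part where |f| > beta) and f2 (the part where |f| <= beta)."""
    f1 = [v if abs(v) > beta else 0.0 for v in f]
    f2 = [v if abs(v) <= beta else 0.0 for v in f]
    return f1, f2


def power_sum(f, p):
    """Sum of |f|^p: the p-th power of the L^p norm for counting measure."""
    return sum(abs(v) ** p for v in f)


# Sample data and exponents with p1 < p < p2 (assumed, for illustration only).
f = [0.1, -0.5, 2.0, 3.5, -0.9, 1.2]
p1, p, p2 = 1.0, 1.5, 2.0
beta = 1.0
f1, f2 = split_at_height(f, beta)

# f = f1 + f2 pointwise.
print(all(a + b == v for a, b, v in zip(f1, f2, f)))
# The large part is controlled in L^{p1}, the small part in L^{p2}.
print(power_sum(f1, p1), "<=", beta ** (p1 - p) * power_sum(f, p))
print(power_sum(f2, p2), "<=", beta ** (p2 - p) * power_sum(f, p))
```

The cut at height {\beta} sends the large values of {f} to the space with the smaller exponent and the small values to the space with the larger exponent, which is exactly the mechanism used in the proof of the interpolation theorem.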

Proof: We first prove the theorem when {p_2<\infty}. Since our hypothesis involves the distribution sets of {T(f)} it is convenient to recall the representation of the {L^p} norm of a function in terms of its distribution set. Indeed, from Proposition 9 of notes 1 we have

\displaystyle \|Tf\|_{L^p(X,\mu)} ^p=\int_X |T(f)(x)|^p d\mu(x)=p\int_0 ^\infty \lambda^{p-1}\mu(\{x\in X:|T(f)(x)|>\lambda\})d\lambda.

The measure of the set {\{x\in X:|T(f)(x)|>\lambda\}} will appear many times in the proof so it is convenient to give it a shorter notation:

\displaystyle \rho(\lambda)=\mu(\{x\in X: |T(f)(x)|>\lambda\}),\quad \lambda>0.

With this notation we have

\displaystyle  	\|T(f)\|_{L^p(X,\mu)} ^p=p\int_0 ^\infty \lambda^{p-1} \rho(\lambda)d\lambda. \ \ \ \ \ (2)
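Formula (2) is an instance of the general distribution-function identity for the {L^p} norm, and it can be verified exactly for a function taking finitely many values. The sketch below assumes counting measure on a finite set, so that the distribution function is a step function and the integral in {\lambda} can be computed in closed form on each step; the function name `layer_cake` is hypothetical.

```python
def layer_cake(values, p):
    """Compute p * integral_0^inf lambda^(p-1) * #{ |v| > lambda } d lambda exactly,
    by integrating over the intervals on which the count is constant."""
    heights = sorted(abs(v) for v in values)
    n = len(heights)
    total, prev = 0.0, 0.0
    for k, h in enumerate(heights):
        # For lambda in (prev, h) exactly n - k of the values exceed lambda,
        # and the integral of p * lambda^(p-1) over (prev, h) is h^p - prev^p.
        total += (n - k) * (h ** p - prev ** p)
        prev = h
    return total


# Sample data (assumed, for illustration only).
values = [0.5, -2.0, 1.5, 0.0, 3.0]
p = 2.5
print(layer_cake(values, p), sum(abs(v) ** p for v in values))
```

The two printed numbers agree up to rounding, which is the summation-by-parts identity behind the formula.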

Fix {\lambda>0} for a moment and consider the decomposition of the function {f=f_1+f_2} at level {\lambda} as in the remark before:

\displaystyle  \begin{array}{rcl}  	f_1(x)&=&f(x)\chi_{\{x\in X:|f(x)|>\lambda\}},\\ \\ 	f_2(x)&=&f(x)\chi_{\{x\in X:|f(x)|\leq \lambda\}}. \end{array}

The sublinearity of {T} allows us to write

\displaystyle |T(f)(x)|\leq |T(f_1)(x)|+|T(f_2)(x)|,

for any {x\in X}. Thus,

\displaystyle  \begin{array}{rcl}  \{|Tf|>\lambda\} \subset \{|Tf_1|>\lambda/2\}\cup \{|Tf_2|>\lambda/2\}, \end{array}

so that

\displaystyle  \begin{array}{rcl}  	\rho(\lambda) \leq \mu(\{x\in X: |T(f_1)(x)|>\lambda/2\}) + \mu(\{x\in X: |T(f_2)(x)|>\lambda/2\}). \end{array}

Since {f_1\in L^{p_1}(X,\mu)} and {T} is of weak type {(p_1,p_1)} we can estimate the first summand as

\displaystyle  \begin{array}{rcl}  	\mu(\{x\in X:|T(f_1)(x)|>\lambda/2\})&\leq& (2A_1) ^{p_1}\frac{\|f_1\|^{p_1}_{L^{p_1}(X,\mu)}}{\lambda^{p_1}}. 	\end{array}

Similarly, since {f_2\in L^{p_2}(X,\mu)} and {T} is of weak type {(p_2,p_2)} we have

\displaystyle  \begin{array}{rcl}  	\mu(\{x\in X:|T(f_2)(x)|>\lambda/2\})&\leq& (2A_2) ^{p_2}\frac{\|f_2\|^{p_2}_{L^{p_2}(X,\mu)}}{\lambda^{p_2}}, \end{array}

where {A_1, A_2} are two numerical constants depending only on {p_1,p_2} respectively and on {T} and {X}. For simplicity we suppress the dependence of the constants {A_1,A_2} on these parameters. Combining the previous estimates we can write

\displaystyle  	\rho(\lambda)\leq \bigg(\frac{2A_1\|f_1\|_{L^{p_1}(X,\mu) }}{\lambda}\bigg)^{p_1}+ \bigg(\frac{2A_2\|f_2\|_{L^{p_2}(X,\mu) }}{\lambda}\bigg)^{p_2} \ \ \ \ \ (3)

Unravelling the definitions of {f_1,f_2} the previous estimate yields

\displaystyle  		\rho(\lambda)\leq \bigg(\frac{2A_1}{\lambda}\bigg)^{p_1}\int_{\{x\in X:|f(x)|>\lambda \}} |f(x)|^{p_1}dx + \bigg(\frac{2A_2}{\lambda}\bigg)^{p_2} \int_{\{x\in X:|f(x)|\leq \lambda \}}|f(x)|^{p_2}dx. 	\ \ \ \ \ (4)

In order to recover the {L^p} norm of {T(f)} observe by (2) that it’s enough to multiply {\rho(\lambda)} by {p\lambda^{p-1}} and integrate in {\lambda\in(0,\infty)}.

Multiplying the first summand on the right hand side of (4) by {p\lambda^{p-1}} and integrating we get

\displaystyle  \begin{array}{rcl}  	&&(2A_1)^{p_1} p\int_0 ^\infty \lambda^{p-p_1-1} \int_{\{x\in X:|f(x)|>\lambda \}} |f(x)|^{p_1}dx\ d\lambda\\ \\ &&=	(2A_1)^{p_1} p \int_X|f(x)|^{p_1}\int_0 ^{|f(x)|} \lambda^{p-p_1-1}d\lambda\ dx\ =p\frac{(2A_1)^{p_1}}{p-p_1}\|f\|^p _{L^p(X,\mu)}. \end{array}

Similarly, multiplying the second summand in (4) by {p\lambda^{p-1}} and integrating we have

\displaystyle  \begin{array}{rcl}  	&&(2A_2)^{p_2} p\int_0 ^\infty \lambda^{p-p_2-1} \int_{\{x\in X:|f(x)|\leq \lambda \}} |f(x)|^{p_2}dx\ d\lambda\\ \\ &&=	(2A_2)^{p_2} p \int_X|f(x)|^{p_2}\int_{|f(x)|} ^\infty \lambda^{p-p_2-1}d\lambda\ dx\ =p\frac{(2A_2)^{p_2}}{p_2-p}\|f\|^p _{L^p(X,\mu)}. \end{array}

Summing up the previous two estimates we conclude that

\displaystyle  \begin{array}{rcl}  	\|T(f)\|^p _{L^p(X,\mu)}\leq p \bigg(\frac{(2A_1)^{p_1}}{p-p_1}+\frac{(2A_2)^{p_2}}{p_2-p}\bigg)\|f\|_{L^p(X,\mu)} ^p, \end{array}

which shows that {T} is of strong type {(p,p)} with

\displaystyle \|T\|_{L^p\rightarrow L^p}\leq p^\frac{1}{p} \bigg(\frac{(2A_1)^{p_1}}{p-p_1}+\frac{(2A_2)^{p_2}}{p_2-p}\bigg)^\frac{1}{p}.

Observe that there is no claim here that this quantitative estimate on the norm of {T} is optimal in general.

The proof in the case {p_2=\infty} is very similar. Now the hypothesis that {T} is of weak type {(p_2,p_2)} is replaced by the hypothesis that {T} maps {L^\infty} to {L^\infty}. That is, there exists some constant {A_2>0}, depending only on {T} and {X}, such that

\displaystyle \|T(g)\|_{L^\infty(X,\mu)}\leq A_2 \|g\|_{L^\infty(X,\mu)},

for all {g\in L^\infty (X,\mu)}. We fix some level {\lambda>0} and we split the function {f} as {f=f_1+f_2} where {f_2(x)=f(x)\chi_{\{x\in X:|f(x)|\leq\lambda/(2A_2)\}}}. Obviously {f_2\in L^\infty(X,\mu)} so by the hypothesis we have that {\|Tf_2\|_{L^\infty(X,\mu)}\leq A_2\|f_2\|_{L^\infty(X,\mu)}\leq \lambda/2}. Arguing as in the case {p_2<\infty} we can write

\displaystyle  \begin{array}{rcl}  	\rho(\lambda)&\leq& \mu(\{x\in X: |T(f_1)(x)|>\lambda/2\}) + \mu(\{x\in X: |T(f_2)(x)|>\lambda/2\}). \end{array}

Since {\|T(f_2)\|_{L^\infty(X,\mu)}\leq \lambda/2}, the second summand in the previous estimate vanishes identically. We conclude that

\displaystyle  \begin{array}{rcl}  	\|T(f)\|^p _{L^p(X,\mu)}&=&p\int_0 ^\infty \lambda^{p-1}\rho(\lambda)d\lambda\leq (2A_1) ^{p_1} p\int_0 ^\infty \lambda^{p-1-p_1}\int_X |f_1(x)|^{p_1}dx \ d\lambda\\ \\ 	&=&(2A_1) ^{p_1}p\int_0 ^\infty \lambda^{p-p_1-1}\int_{\{x\in X:|f(x)|>\lambda/(2A_2)\}}|f(x)|^{p_1}dx \ d\lambda\\ \\ 	&=&(2A_1) ^{p_1} p\int_X |f(x)|^{p_1} \int_0 ^{2A_2|f(x)|	}\lambda^{p-p_1-1}d\lambda \ dx\\ \\ 	&=& p\frac{(2A_1) ^{p_1} (2A_2)^{p-p_1}}{p-p_1}\|f\|_{L^p(X,\mu)} ^p. \end{array}

This concludes the proof in the case {p_2=\infty} as well as providing the quantitative estimate {\|T\|_{L^p\rightarrow L^p}\leq 2\big(\frac{p A_1 ^{p_1} A_2^{p-p_1}}{p-p_1}\big)^\frac{1}{p}}. \Box

Exercise 8 Modify the proof above to show that under the hypotheses of the Marcinkiewicz interpolation theorem we can conclude that

\displaystyle \|T\|_{L^p\rightarrow L^p}\leq 2 p^\frac{1}{p}\bigg(\frac{1}{p-p_1}+\frac{1}{p_2-p}\bigg)^\frac{1}{p} A_1 ^{1-\theta} A_2 ^\theta,

where {\frac{1}{p}:=\frac{1-\theta}{p_1}+\frac{\theta}{p_2}} for some {0<\theta<1}.

Hint: This is already the constant appearing in the case {p_2=\infty}. For the case {p_2<\infty} split the function {f} at the level {c\lambda} (instead of {\lambda}), for some {c>0}, and optimize in the parameter {c>0} at the end of the proof. For this, use the heuristic that a sum is optimized when the terms in the sum are roughly equal in size.

Exercise 9 Let {0<p_1<p_2\leq \infty} and suppose that {f\in L^{p_1,\infty}(X,\mu)\cap L^{p_2,\infty}(X,\mu)}. Show that {f\in L^p(X,\mu)} for all {p_1<p<p_2}. Hint: The proof is very similar to the proof of the Marcinkiewicz interpolation theorem, only simpler. Use again the fact that

\displaystyle \|f\|_{L^p(X,\mu)} ^p=p\int_0 ^\infty \lambda^{p-1} \mu(\{x\in X:|f(x)|>\lambda\})d\lambda,

and split the range of {\lambda\in(0,\infty)} as {(0,\infty)=(0,\beta)\cup (\beta,\infty)}, at an appropriate level {\beta>0}. Use the weak integrability conditions for {f} in the appropriate intervals of {\lambda}.

Exercise 10 Let {X} be a finite set equipped with counting measure and let {f:X\rightarrow{\mathbb C}} be a function. Show that for any {0<p<\infty} we have that

\displaystyle \|f\|_{L^{p,\infty}(X)}\leq \|f\|_{L^p(X)}\lesssim_p \log(1+|X|) \|f\|_{L^{p,\infty}(X)}.

Thus on finite sets, the spaces {L^p} and {L^{p,\infty}} are equivalent. Here {|X|} denotes the cardinality of {X}.

Hint: Observe that {|\{x\in X:|f(x)|>\lambda\}|\leq \min({\|f\|^p _{L^{p,\infty}}}/{\lambda^p},|X|)} and use Proposition 9 of notes 1.
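To get a feeling for the logarithmic loss in Exercise 10, here is a small numerical sketch. It assumes the convention {\|f\|_{L^{p,\infty}}=\sup_{\lambda>0}\lambda\,\mu(\{|f|>\lambda\})^{1/p}} for the weak quasi-norm; the helper names and the test function {f(k)=k^{-1/p}} are hypothetical choices made for illustration. For counting measure the supremum can be read off from the decreasing rearrangement {a_1\geq a_2\geq\cdots} of {|f|} as {\max_k a_k k^{1/p}}.

```python
import numpy as np

def weak_lp_quasinorm(values, p):
    """sup_{lam>0} lam * #{|f| > lam}^(1/p) for counting measure,
    computed as max_k a_k * k^(1/p) over the decreasing rearrangement a_k."""
    a = np.sort(np.abs(values))[::-1]
    k = np.arange(1, len(a) + 1)
    return np.max(a * k ** (1.0 / p))

def strong_lp_norm(values, p):
    """Usual L^p norm with respect to counting measure."""
    return np.sum(np.abs(values) ** p) ** (1.0 / p)

p = 2.0
n = 1000
k = np.arange(1, n + 1)
f = k ** (-1.0 / p)                 # hypothetical test function on a set with n points

weak = weak_lp_quasinorm(f, p)      # every term a_k * k^(1/p) equals 1 here
strong = strong_lp_norm(f, p)       # strong**p is the harmonic sum 1 + 1/2 + ... + 1/n
print(weak, strong ** p, np.log(1 + n))
```

For this choice of {f} the weak quasi-norm stays equal to {1} while {\|f\|_{L^p}^p} grows like {\log |X|}, which suggests that some logarithmic factor cannot be avoided.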

Exercise 11 (Dual formulation of {L^{p,\infty}}) Let {1<p\leq \infty}. Show that for every {f\in L^{p,\infty}(X,\mu)}, we have

\displaystyle  \|f\|_ {L^{p,\infty}(X,\mu)} \simeq_p \sup \big\{ \mu(E) ^{-\frac{1}{p'}} \int _E |f(x)| d\mu(x) :0<\mu (E)<\infty\big\},

where {\frac{1}{p}+\frac{1}{p'}=1}.

Hint: As in the previous exercise, write

\displaystyle  \int_E |f(x)| d\mu(x) = \int_0 ^\infty \mu(\{x\in E: |f(x)|>\lambda\}) d\lambda.

Since the set {E} has finite measure one can estimate further the measure of the level set by

\displaystyle  \mu(\{x\in E: |f(x)|>\lambda \})\leq \min \bigg(\mu(E), \|f\|^p _{L^{p,\infty} } /\lambda^p \bigg).

Now split the integral we want to estimate accordingly in order to take advantage of this estimate. See also the hint in the previous exercise. This will give you one direction of the estimate, the other direction being trivial.

While the Marcinkiewicz interpolation theorem is the prototype of real interpolation, complex methods can be used to derive similar conclusions. An example of such a method has already been used via the three lines lemma applied to exhibit the log convexity of the {L^p} norms (which is also a form of interpolation). We will now describe the prototype of complex interpolation.

The following theorem has some differences compared to the Marcinkiewicz interpolation theorem. First of all we assume that {T} is linear rather than sublinear. Note as well that our hypotheses concern strong type bounds for the operator {T} rather than weak endpoint bounds. On the other hand, the conclusion gives a good estimate for the norm of the operator when interpolating between the endpoints and allows more freedom in the choice of the exponents at the endpoints.

Theorem 12 (Riesz-Thorin interpolation theorem) Let {1\leq p_0,p_1\leq \infty} and {1\leq q_0,q_1\leq \infty}. Let

\displaystyle T:L^{p_0}(X,\mu)+L^{p_1}(X,\mu)\rightarrow L^{q_0}(Y,\nu)+L^{q_1}(Y,\nu),

be a linear operator that is of strong type {(p_0,q_0)} with norm {k_0} and of strong type {(p_1,q_1)} with norm {k_1}. That is we have that

\displaystyle \|Tf\|_{L^{q_0}(Y,\nu)}\leq k_0\|f\|_{L^{p_0}(X,\mu)},

for all {f\in L^{p_0}(X,\mu) } and

\displaystyle \|Tf\|_{L^{q_1}(Y,\nu)}\leq k_1\|f\|_{L^{p_1}(X,\mu)},

for all {f\in L^{p_1}(X,\mu)}. Then {T} is of strong type {(p_\theta,q_\theta)} with norm at most {k_\theta=k_0^{1-\theta} k_1 ^\theta}:

\displaystyle  \begin{array}{rcl}  	\|Tf\|_{L^{q_\theta}(Y,\nu)}\leq k_\theta \|f\|_{L^{p_\theta}(X,\mu)}, \end{array}

for all {f\in L^{p_\theta}(X,\mu)}, where {\frac{1}{p_\theta}=\frac{1-\theta}{p_0}+\frac{\theta}{p_1}} and {\frac{1}{q_\theta}=\frac{1-\theta}{q_0}+\frac{\theta}{q_1}}, with {0\leq \theta \leq 1}.
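A concrete instance of the theorem can be checked numerically. The sketch below assumes {X} and {Y} are finite sets with counting measure and {T} is multiplication by a matrix {A} (a random matrix chosen only for illustration). Taking {p_0=q_0=1}, {p_1=q_1=\infty} and {\theta=1/2} gives {p_\theta=q_\theta=2}, so the theorem predicts {\|A\|_{\ell^2\rightarrow\ell^2}\leq \|A\|_{\ell^1\rightarrow\ell^1}^{1/2}\|A\|_{\ell^\infty\rightarrow\ell^\infty}^{1/2}}.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((6, 4))     # hypothetical operator T(x) = A x

k0 = np.linalg.norm(A, 1)           # l^1 -> l^1 operator norm (maximal column sum)
k1 = np.linalg.norm(A, np.inf)      # l^inf -> l^inf operator norm (maximal row sum)
k_half = np.linalg.norm(A, 2)       # l^2 -> l^2 operator norm (largest singular value)

theta = 0.5
bound = k0 ** (1 - theta) * k1 ** theta   # k_theta from the theorem
print(k_half, bound, k_half <= bound)
```

This only tests one interpolated exponent on one matrix; it is a sanity check of the statement and not a substitute for the proof.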

Proof: Let us first consider the case {p_0=p_1=p_\theta}. Then by the log-convexity of the {L^p} norm we get directly that

\displaystyle \|Tf\|_{L^{q_\theta}}\leq \|Tf\|_{L^{q_0}} ^{1-\theta} \|Tf\|_{L^{q_1}} ^\theta\leq k_0^{1-\theta} k_1^{\theta}\|f\|_{L^{p_\theta}},

as desired. We can therefore focus on the case {p_0\neq p_1} so that {p_\theta <\infty}. Without loss of generality we can assume that {p_0<p_1}.

We divide the proof in several steps:

step 1: It is enough to prove the theorem for {k_0=k_1=k_\theta=1}. To see this just observe that we can always replace the measures {\mu,\nu} by {c_\mu \mu}, {c_\nu \nu} respectively, for appropriate constants {c_\mu,c_\nu>0}. We can choose these constants so that {k_0=k_1=1} and then we also have {k_\theta=1}. Doing the calculations you will see that we need to define the constants {c_\mu,c_\nu} by means of the equations

\displaystyle c_\nu ^\frac{1}{q_0}c_\mu ^{-\frac{1}{p_0}}k_0=1\quad\mbox{and}\quad c_\nu ^\frac{1}{q_1}c_\mu ^{-\frac{1}{p_1}}k_1=1.

In what follows we will therefore assume that {k_0=k_1=k_\theta=1} in the statement of the theorem.

step 2: We have that

\displaystyle   \big|	\int_Y (Tf)g d\nu\big| \leq \|f\|_{L^{p_\theta}}\|g\|_{L^{q' _\theta}}, \ \ \ \ \ (5)

for all simple functions of finite measure support {f,g}. Here {q_\theta '} is the dual exponent of {q_\theta}.

First of all, since {T} is of strong type {(p_0,q_0)}, Hölder’s inequality shows that

\displaystyle  	\big| \int_Y (Tf)g d\nu \big|\leq \|f\|_{L^{p_0}}\|g\|_{L^{q' _0}},	 \ \ \ \ \ (6)

and, similarly, by the {(p_1,q_1)} type of {T} we get that

\displaystyle  	\big| \int_Y (Tf)g d\nu \big|\leq \|f\|_{L^{p_1} } \|g\|_{L^{q' _1} }. \ \ \ \ \ (7)

Thus, estimate (5) is true for {\theta=0,1}. It is obvious that we need to interpolate between the two endpoint estimates above. We will do that by means of the three lines convexity lemma. First we define the map

\displaystyle {\mathbb C}\ni z \mapsto F(z)=\int_Y \big(T\big[|f|^{(1-z)p_\theta/p_0+zp_\theta/p_1}\textnormal{sgn}(f)\big]\big) |g|^{(1-z)q' _\theta/{q' _0}+zq' _\theta/{q' _1}}\textnormal{sgn}(g) d\nu,

where {\textnormal{sgn}(h)=h/|h|}. Here there is a problem in the case {q_0 =q_1=q_\theta=1} since the dual exponents are equal to {\infty}. In this case the definition of {F} should be understood as

\displaystyle F(z)=\int_Y \big(T\big[|f|^{(1-z)p_\theta/p_0+zp_\theta/p_1}\textnormal{sgn}(f)\big]\big)|g|\textnormal{sgn}(g) d\nu.

The function {F} is a holomorphic function of {z}. Furthermore, since {f,g} are simple functions of finite measure support, it is not hard to see that {F} is actually bounded on the strip {S=\{z=x+iy:y\in{\mathbb R},0\leq x\leq 1\}}. Furthermore, for {z=\theta+0i} we see that {F(\theta)=\int_Y (Tf)g}. Now, on the boundary of the strip we have that

\displaystyle  \begin{array}{rcl}  	|F(0+iy)|\leq \|f\|_{L^{p_\theta}} ^{{p_\theta}\over{p_0}} \|g\|_{L^{q' _\theta}} ^\frac{{q_\theta}'}{{q_0}'} \end{array}

from (6) and similarly

\displaystyle  \begin{array}{rcl}  	|F(1+iy)|\leq \|f\|_{L^{p_\theta}} ^{{p_\theta}\over{p_1}} \|g\|_{L^{q' _\theta}} ^\frac{{q_\theta}'}{{q_1}'} \end{array}

from (7). Using the three lines lemma we get that

\displaystyle  |F(x+iy)|\leq \|f\|_{L^{p_\theta}} ^{{(1-x)p_\theta}\over{p_0}} \|g\|_{L^{q' _\theta}} ^\frac{{(1-x) q_\theta}'}{{q_0}'} \|f\|_{L^{p_\theta}} ^{{xp_\theta}\over{p_1}} \|g\|_{L^{q' _\theta}} ^\frac{{xq_\theta}'}{{q_1}'}.

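At {x=\theta} the right hand side is equal to {\|f\|_{L^{p_\theta}}\|g\|_{L^{q' _\theta}}}, since {\frac{1-\theta}{p_0}+\frac{\theta}{p_1}=\frac{1}{p_\theta}} and {\frac{1-\theta}{q_0 '}+\frac{\theta}{q_1 '}=\frac{1}{q_\theta '}}. Applying the result for {x=\theta} and {y=0} we get the claim of step 2. Observe that nothing really changes in the case {q_0=q_1=q_\theta=1}.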

step 3: We have that

\displaystyle \big|	\int_Y (Tf)g d\nu \big| \leq \|f\|_{L^{p_\theta}}\|g\|_{L^{q' _\theta}},

for all {f\in L^{p_\theta}} and all simple functions {g} of finite measure support.

To see this let {f\in L^{p_\theta}} and {g} be a simple function with finite measure support. We write {f=f\chi_{\{|f|>1\}}+f\chi_{\{|f|\leq 1\}}=f_1+f_2}. Observe that {f_1\in L^{p_0}\cap L^{p_\theta}} and {f_2\in L^{p_1}\cap L^{p_\theta}}. Now let {\phi_j,\psi_j} be sequences of simple functions of finite measure support such that

\displaystyle \|\phi_j-f_1\|_{L^{p_0}},\|\phi_j-f_1\|_{L^{p_\theta}}\rightarrow 0,

\displaystyle \|\psi_j- f_2 \|_{L^{p_1}} , \|\psi_j- f_2\|_{L^{p_\theta}}\rightarrow 0,

as {j\rightarrow \infty}. We write {\tilde f_j=\phi_j+\psi_j}. By step 2 and (6) and (7) we have that

\displaystyle  \begin{array}{rcl}  	\big|\int_Y (Tf)g d\nu \big|&\leq& \big|\int_Y (T\tilde f_j)g d\nu \big|+\int_Y |T(f_1-\phi_j)g|d\nu+\int_Y |T(f_2-\psi_j)g|d\nu\\ \\ 	&\leq& \|\tilde f_j\|_{L^{p_\theta}}\|g\|_{L^{q' _\theta}}+\|f_1-\phi_j\|_{L^{p_0}}\|g\|_{L^{q' _0}}+\|f_2-\psi_j\|_{L^{p_1}}\|g\|_{L^{q' _1}}. \end{array}

Letting {j\rightarrow \infty} and observing that { \|\tilde f_j\|_{L^{p_\theta}}\rightarrow \|f\|_{L^{p_\theta}}} as {j\rightarrow \infty} we get the claim of this step as well.

step 4: We have that

\displaystyle \big|	\int_Y (Tf)g d\nu \big| \leq \|f\|_{L^{p_\theta}}\|g\|_{L^{q' _\theta}},

for all {f\in L^{p_\theta}} and {g\in L^{q_\theta '}}.

First of all observe that from step 3 we can actually conclude that

\displaystyle \int_Y |(Tf)g| d\nu \leq \|f\|_{L^{p_\theta}}\|g\|_{L^{q' _\theta}},

for all {f\in L^{p_\theta}} and simple functions {g} of finite measure support. In order to see this let {g} be any simple function that vanishes outside a set {E} of finite measure and define {h=\overline{\textnormal{sgn}(g T(f))} g}, so that {(Tf)h=|(Tf)g|}. Consider a sequence of simple functions {h_n} such that {h_n\rightarrow h} and {|h_n|\leq |h|}. In particular {h_n} vanishes outside the set {E}. We thus have the estimate {|h_n|\leq \|h\|_\infty \chi_E }. Also observe that {\int |Tf| \chi_E<\infty} since {Tf} is a function in {L^{q_0}+L^{q_1}} by our hypothesis. Lebesgue’s dominated convergence theorem now shows that

\displaystyle \int_Y |(Tf)g| d\nu =\lim_{n\rightarrow\infty} \int_Y (Tf)h_n d\nu\leq \|f\|_{L^{p_\theta}} \lim_{n\rightarrow\infty} \|h_n\|_{L^{q' _\theta}}\leq \|f\|_{L^{p_\theta}} \|g\|_{L^{q' _\theta}}.

Now for any {f\in L^{p_\theta}} and {g\in L^{q' _\theta}}, let { g_j} be a sequence of simple functions with finite measure support such that {g_j\rightarrow g} pointwise and {|g_1|\leq |g_2|\leq \cdots\leq |g|}. Fatou’s lemma now gives

\displaystyle  \begin{array}{rcl}  \int_Y |(Tf)g|d\nu &\leq& \liminf_{j\rightarrow\infty} \int_Y |(Tf)g_j|d\nu \\ \\ &\leq& \|f \|_{L^{p_\theta}} \liminf_{j\rightarrow\infty} \|g_j\|_{L^{q' _\theta}}\leq \|f \|_{L^{p_\theta}} \|g\|_{L^{q' _\theta}}.	 \end{array}

This proves the claim of this step as well. Duality between {L^{q_\theta}} and {L^{q_\theta '}} now completes the proof of the theorem. \Box

As a first application of the Riesz-Thorin interpolation theorem we will now prove Young’s inequality on convolutions of functions.

Proposition 13 (Young’s inequality) Let {f,g:{\mathbb R}^n\rightarrow{\mathbb C}}. Let {1\leq p,q,r\leq \infty} be such that {\frac{1}{p}+\frac{1}{q}=\frac{1}{r}+1}. If {f\in L^p({\mathbb R}^n)} and {g\in L^q({\mathbb R}^n)}, then {f*g} is a well defined function in {L^r({\mathbb R}^n)} and we have the estimate

\displaystyle \|f*g\|_{L^r({\mathbb R}^n)}\leq \|f\|_{L^p({\mathbb R}^n)}\|g\|_{L^q({\mathbb R}^n)}.

Proof: For {1\leq q\leq \infty} and {g\in L^q({\mathbb R}^n)} fixed we define the operator

\displaystyle T(f)=g*f.

As we have already seen (see Exercise 1) we have the bound {\|T(f)\|_{L^q}\leq \|g\|_{L^q}\|f\|_{L^1}}, that is, {T} is of strong type {(1,q)}. It is also very easy to see that if {q'} is the conjugate exponent of {q} then we have

\displaystyle |(f*g)(x)|=\big|\int f(x-y)g(y) dy\big| \leq \|f\|_{L^{q'}}\|g\|_{L^q},

that is {\|T(f)\|_{L^\infty}\leq \|g\|_{L^q}\|f\|_{L^{q'}}} and {T} is of strong type {(q',\infty)}. Letting {\frac{1}{q_\theta}=\frac{1}{r}=\frac{1-\theta}{q}+\frac{\theta}{\infty}} and {\frac{1}{p_\theta}=\frac{1-\theta}{1}+\frac{\theta}{q'}}, the Riesz-Thorin interpolation theorem shows that {T} is of strong type {(p_\theta,q_\theta)}. Substituting {1-\theta=q/r} and using the hypothesis {1/p+1/q=1/r+1} we get that {p_\theta=p}. Thus we conclude that {T} is of strong type {(p,r)} with norm at most {\|g\|^{1-\theta} _{L^q} \|g\|^{\theta} _{L^q}=\|g\| _{L^q}}. That is we have {\|f*g\|_{L^r}\leq \|g\|_{L^q}\|f\|_{L^p}} as we wanted to show. \Box
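Young’s inequality can be tested numerically in a discrete setting. The sketch below replaces {{\mathbb R}^n} by the integers with counting measure, where the analogous inequality for finitely supported sequences also holds; this substitution, the random data and the helper `lp_norm` are assumptions made for illustration only. The exponents {p=3/2}, {q=6/5} give {r=2}.

```python
import numpy as np

def lp_norm(a, p):
    """l^p norm of a finitely supported sequence (counting measure)."""
    if np.isinf(p):
        return np.max(np.abs(a))
    return np.sum(np.abs(a) ** p) ** (1.0 / p)

rng = np.random.default_rng(1)
f = rng.standard_normal(50)          # hypothetical sequence in l^p
g = rng.standard_normal(30)          # hypothetical sequence in l^q

p, q = 1.5, 1.2
r = 1.0 / (1.0 / p + 1.0 / q - 1.0)  # solves 1/p + 1/q = 1/r + 1

conv = np.convolve(f, g)             # full discrete convolution f * g
lhs = lp_norm(conv, r)
rhs = lp_norm(f, p) * lp_norm(g, q)
print(lhs, rhs, lhs <= rhs)
```

Changing {p} and {q} (subject to {\frac{1}{p}+\frac{1}{q}\geq 1}) lets one probe other exponent triples, including the endpoint {r=\infty}.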

Exercise 12 (Schur’s test) Let {1\leq p_1,q_0\leq \infty} and {B_0,B_1>0}. Let {K:X\times Y\rightarrow {\mathbb C}} be a {\mathcal X \otimes \mathcal Y}-measurable function such that (i) For almost every {x\in X} we have that

\displaystyle \|K(x,\cdot)\|_{L^{q_0}(Y)}\leq B_0.

(ii) For almost every {y\in Y} we have that

\displaystyle \|K(\cdot,y)\|_{L^{p' _1}(X)}\leq B_1 .

We consider the operator

\displaystyle  T(f)(y)=\int_X K(x,y)f(x)d\mu(x) ,

for suitable functions {f: X \rightarrow {\mathbb C}}. Define {p_0=1} and {q_1=\infty}.

Show that {T} is of strong type {(p_\theta,q_\theta)} with norm at most {B_0^{1-\theta} B_1^{\theta}} where {p_\theta} and {q_\theta} are as in the Riesz-Thorin interpolation theorem.

Hint: First consider the sublinear operator

\displaystyle  |T|(f)(y)=\int_X |K(x,y)||f(x)|d\mu(x) ,

which is always well defined (though maybe infinite) and controls {T(f)}. Use Minkowski’s integral inequality and Hölder’s inequality to show that {|T|}, and thus {T}, is of strong type {(1,q_0)} and of strong type {(p_1,\infty)}. Use the Riesz-Thorin interpolation theorem to conclude the proof.

[Update 24 Feb 2011: Omission in Exercise 12 corrected and a solution hint added.]

[Update 15 Mar 2011: Typo in Exercise 8 corrected, Exercise 4 edited.]


About ioannis parissis

I'm a postdoc researcher at the Center for mathematical analysis, geometry and dynamical systems at IST, Lisbon, Portugal.