DMat0101, Notes 4: The Fourier transform of the Schwartz class and tempered distributions

In this section we go back to the space of Schwartz functions {\mathcal S({\mathbb R}^n)} and we define the Fourier transform in this setup. This will turn out to be extremely useful and flexible. One reason is that Schwartz functions are much `nicer’ than functions that are merely integrable. Another is that Schwartz functions are dense in all {L^p} spaces, {p<\infty}, so many statements established initially for Schwartz functions go through in the more general setup of {L^p} spaces. A third reason is that the dual of the space {\mathcal S({\mathbb R}^n)}, the space of tempered distributions, is rich enough to allow us to define the Fourier transform of much rougher objects than integrable functions.

1. The space of Schwartz functions as a Fréchet space

We recall that the space of Schwartz functions {\mathcal S({\mathbb R}^n)} consists of all smooth (i.e. infinitely differentiable) functions {f:{\mathbb R}^n\rightarrow {\mathbb C}} such that the function itself together with all its derivatives decay faster than any polynomial at infinity. To make this more precise it is useful to introduce the seminorms {p_N} defined for any non-negative integer {N} as

\displaystyle p_N(f)=\sup_{|\alpha|\leq N,|\beta|\leq N}\sup_{x\in{\mathbb R}^n}|x^\alpha \partial^\beta f(x)|,

where {\alpha,\beta\in\mathbb N^n _o} are multi-indices and as usual we write {|\alpha|=\alpha_1+\cdots+\alpha_n}. Thus {f\in \mathcal S({\mathbb R}^n)} if and only if {f\in C^\infty({\mathbb R}^n)} and {p_N(f)<+\infty} for all {N\in{\mathbb N}_o}.

It is clear that {\mathcal S({\mathbb R}^n)} is a vector space. We have already seen that a basic example of a function in {\mathcal S({\mathbb R}^n)} is the Gaussian {f(x)=e^{-\pi|x|^2}} and it is not hard to check that the more general Gaussian function {f(x)=e^{-\langle Ax,x\rangle}}, where {A} is a positive definite real matrix, is also in {\mathcal S({\mathbb R}^n)}. Furthermore, the product of two Schwartz functions is again a Schwartz function and the space {\mathcal S({\mathbb R}^n)} is closed under taking partial derivatives or multiplying by complex polynomials of any degree. As we have already seen (and it’s obvious by the definitions) the space of infinitely differentiable functions with compact support is contained in {\mathcal S({\mathbb R}^n)}, {\mathcal D({\mathbb R}^n)=C^\infty _c({\mathbb R}^n)\subset \mathcal S({\mathbb R}^n)}, and each one of these spaces is a dense subspace of {L^p({\mathbb R}^n)} for any {1\leq p<\infty} and also in {C_o({\mathbb R}^n)}, in the corresponding topologies.

The seminorms defined above define a topology in {\mathcal S({\mathbb R}^n)}. In order to study this topology we need the following definition:

Definition 1 A Fréchet space is a locally convex topological vector space whose topology is induced by a complete translation invariant metric.

A translation invariant metric on {\mathcal{S}({\mathbb R}^n)}. It is not hard to actually define a metric on {\mathcal S({\mathbb R}^n)} which induces the topology. Indeed for two functions {f,g\in \mathcal S({\mathbb R}^n)} we set

\displaystyle  \rho(f,g)=\sum_{N=0} ^\infty \frac{1}{2^N}\frac{p_N(f-g)}{1+p_N(f-g)}.

The function {\rho:\mathcal S({\mathbb R}^n)\times \mathcal S({\mathbb R}^n) \rightarrow [0,+\infty)} is translation invariant and symmetric, and it separates the elements of {\mathcal{S}({\mathbb R}^n)}. The metric {\rho} induces a topology in {\mathcal S({\mathbb R}^n)}; a set {U\subset \mathcal S({\mathbb R}^n)} is open if and only if for every {f\in U} there exists {\epsilon>0} such that

\displaystyle B_\rho(f,\epsilon):=\{g\in\mathcal S({\mathbb R}^n):\rho(f,g)<\epsilon\}\subset U.

Convergence in {\mathcal{S}({\mathbb R}^n)}. By definition, a sequence {\{\phi_k\}_{k\in{\mathbb N}}\subset \mathcal S({\mathbb R}^n)} converges to {0} if {\rho(\phi_k,0)\rightarrow 0} as {k\rightarrow \infty}. A more handy description of converging sequences in {\mathcal S({\mathbb R}^n)} is given by the following lemma.

Lemma 2 A sequence {\{\phi_k\}_{k\in{\mathbb N}}\subset \mathcal S({\mathbb R}^n)} converges to {0} if and only if

\displaystyle p_N(\phi_k)\rightarrow 0 \quad\mbox{ as } \quad k\rightarrow \infty,

for all {N\in {\mathbb N}_o}.

Proof: First assume that {\rho(\phi_k,0)\rightarrow 0 } as {k\rightarrow \infty}. Then, since

\displaystyle \sum_{N=0} ^\infty \frac{1}{2^N}\frac{p_N(\phi_k)}{1+p_N(\phi_k)}

converges to zero as {k\rightarrow \infty} and all summands are non-negative, we conclude that for every {N} we have that

\displaystyle \frac{p_N(\phi_k)}{1+p_N(\phi_k)}\rightarrow 0,

as {k\rightarrow \infty}. However, this easily implies that {p_N(\phi_k)\rightarrow 0 } as {k\rightarrow \infty}, for every {N\in{\mathbb N}_o}.
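
To spell out the last implication (a short sketch; the symbols {s_k,t_k} are introduced here only for illustration), write {t_k=p_N(\phi_k)\geq 0} and {s_k=t_k/(1+t_k)\in[0,1)}. Solving for {t_k} gives

\displaystyle t_k=\frac{s_k}{1-s_k},

and the right hand side tends to {0} whenever {s_k\rightarrow 0}.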

Assume now that {p_N(\phi_k)\rightarrow 0} as {k\rightarrow \infty} for every {N\in{\mathbb N}_o} and let {\epsilon>0}. We choose a positive integer {M} such that {2^{-M}<\frac{\epsilon}{2}}. Then

\displaystyle  \begin{array}{rcl} \rho(\phi_k,0)&=& \sum_{N=0} ^M \frac{1}{2^N}\frac{p_N(\phi_k)}{1+p_N(\phi_k)}+\sum_{N=M+1} ^\infty \frac{1}{2^N}\frac{p_N(\phi_k)}{1+p_N(\phi_k)}\\ \\ &\leq & \sum_{N=0} ^M \frac{1}{2^N}\frac{p_N(\phi_k)}{1+p_N(\phi_k)}+\frac{\epsilon}{2}. \end{array}

Now, every term in the finite sum converges to {0} as {k\rightarrow \infty}, so {\rho(\phi_k,0)<\epsilon} for all large {k}. Since {\epsilon>0} was arbitrary, {\rho(\phi_k,0)\rightarrow 0} as {k\rightarrow \infty}. \Box

{\mathcal S({\mathbb R}^n)} is a topological vector space. The topology induced by {\rho} turns {\mathcal S({\mathbb R}^n)} into a topological vector space. To see this we need to check that addition of elements in {\mathcal S({\mathbb R}^n)} and multiplication by complex constants are continuous with respect to {\rho}. This is very easy to check.

Local convexity. For {\epsilon>0} and {N\in{\mathbb N}_o} consider the family of sets

\displaystyle U_{\epsilon,N}:=\{f\in\mathcal S({\mathbb R}^n): p_N(f)<\epsilon\}.

We claim that {\{U_{\epsilon,N}\}_{\epsilon>0,N\in{\mathbb N}_o}} is a neighborhood basis of the point {0} for the topology induced by {\rho}. Indeed, the system {B_\rho(0,\epsilon)} defines a neighborhood basis of {0}. On the other hand it is implicit in the proof of Lemma 2 that for every {\epsilon>0} there is some {\epsilon'>0} and some {N>0} such that {U_{\epsilon',N}\subset B_\rho(0,\epsilon)}. This proves the claim.
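
One possible explicit choice, given as a sketch, uses the fact that the seminorms are increasing, {p_N\leq p_M} for {N\leq M}. Given {\epsilon>0} pick {M} with {2^{-M}<\frac{\epsilon}{2}}. Then for every {f\in\mathcal S({\mathbb R}^n)}

\displaystyle \rho(f,0)\leq \sum_{N=0} ^M \frac{1}{2^N}p_N(f)+\sum_{N=M+1} ^\infty \frac{1}{2^N}\leq 2p_M(f)+2^{-M},

so that {U_{\epsilon/4,M}\subset B_\rho(0,\epsilon)}.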

Now, in order to show that {\mathcal S({\mathbb R}^n)} endowed with the topology induced by {\rho} is locally convex it suffices (by translation invariance) to show that the point {0} has a neighborhood basis which consists of convex sets. This is clear for the neighborhood basis {U_{\epsilon,N}} defined above since the seminorms {p_N} are subadditive and positively homogeneous. Observe however that the balls {B_\rho(0,\epsilon)} are not convex.
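
For completeness, the convexity of the sets {U_{\epsilon,N}} can be checked in one line: if {f,g\in U_{\epsilon,N}} and {0\leq t\leq 1} then, since {p_N} is a seminorm,

\displaystyle p_N(tf+(1-t)g)\leq t\,p_N(f)+(1-t)\,p_N(g)<\epsilon.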

Exercise 1 Show that the balls {B_\rho(0,\epsilon)}, {\epsilon>0}, are not convex sets.

Completeness. The space {\mathcal S({\mathbb R}^n)} is a complete topological vector space with the topology induced by {\rho}. If {\phi_k} is a Cauchy sequence in {\mathcal S({\mathbb R}^n)} then for every {\alpha,\beta\in{\mathbb N}_o ^n}, the sequence

\displaystyle  x^\alpha\partial ^\beta \phi_k

is a Cauchy sequence in the space {C_o({\mathbb R}^n)}, with the topology induced by the supremum norm. Since this space is complete we conclude that {\phi_k} converges uniformly to some {\phi\in C_o({\mathbb R}^n)}. A standard uniform convergence argument shows now that {\phi \in \mathcal S({\mathbb R}^n)}.

Remark 1 In general, a sequence {\{\phi_k\}} in a topological vector space is called a Cauchy sequence if for every open neighborhood of zero {U}, there exists some positive integer {N} so that {\phi_k-\phi_{k'}\in U} for all {k,k'>N}. If the topology is induced by a translation invariant metric, this definition coincides with the more familiar one, that is: for every {\epsilon>0} there exists {N>0} such that {\rho(\phi_k,\phi_{k'})<\epsilon} whenever {k,k'>N}.

The discussion above gives the following:

Theorem 3 The space {\mathcal S({\mathbb R}^n)}, endowed with the metric {\rho} and the topology induced by {\rho}, is a Fréchet space.

We now give a general lemma that provides a simple description of the continuity of linear operators acting on {\mathcal{S}({\mathbb R}^n)}.

Lemma 4 (i) Let {(X,\|\cdot\|_X)} be a Banach space and {T :\mathcal S({\mathbb R}^n)\rightarrow X} be a linear operator. Then {T} is continuous if and only if there exists {N\geq 0} and {C>0} such that

\displaystyle  	\|T(\phi)\|_X \leq C p_N(\phi), 		 \ \ \ \ \ (1)

for all {\phi \in\mathcal S({\mathbb R}^n)}.

(ii) Let {T:\mathcal S({\mathbb R}^n)\rightarrow \mathcal S({\mathbb R}^n)} be a linear operator. Then {T} is continuous if and only if for each {N\geq 0} there exists {N'\geq 0} and {C>0} such that

\displaystyle  p_N(T(\phi))\leq C p_{N'}(\phi),	 \ \ \ \ \ (2)

for all {\phi \in \mathcal S({\mathbb R}^n)}.

Proof: For (i) it is clear that {T} is continuous if (1) holds. On the other hand, suppose that {T:\mathcal{S}({\mathbb R}^n)\rightarrow X} is continuous and let {B_X(0,1)} be the open ball of center {0} and radius {1} in {X}. Then {T^{-1}(B_X(0,1))} is a neighborhood of {0} in {\mathcal{S}({\mathbb R}^n)} and hence it contains some {U_{\epsilon,N}}. Thus {p_N(\phi)<\epsilon} implies that {\|T(\phi)\|_X<1}. Now for {\phi\neq 0} we have {p_N(\phi)>0} and, by linearity,

\displaystyle \|T(\phi)\|_X =\frac{2}{\epsilon} p_N(\phi)\bigg\| T\bigg(\frac{\epsilon}{2p_N(\phi)} \phi\bigg)\bigg\|_X\leq \frac{2}{\epsilon} p_N(\phi),

since {p_N\big(\frac{\epsilon}{2p_N(\phi)}\phi\big)=\frac{\epsilon}{2}<\epsilon}.

Similarly, if {T:\mathcal S({\mathbb R}^n)\rightarrow \mathcal S({\mathbb R}^n)} is continuous then for every {N,\epsilon} there is {N',\epsilon'} so that

\displaystyle  T^{-1}(U_{\epsilon,N})\supset U_{\epsilon',N'}.

This implies (2) using the same trick we used to deduce (1). \Box

It is obvious that for every {0<p\leq \infty}, {{\mathcal S(\mathbb R^n)}\subset L^p({\mathbb R}^n)}. Let us show however that this embedding is also continuous:

Proposition 5 Let {0<p\leq \infty}. Then the identity map {\textnormal{Id}:{\mathcal S(\mathbb R^n)}\rightarrow L^p({\mathbb R}^n)} is continuous, that is, there exists {N} so that

\displaystyle \|f\|_{L^p({\mathbb R}^n)}\lesssim_{p,n} p_N(f),

for all {f\in {\mathcal S(\mathbb R^n)}}.

Proof: Let {f\in {\mathcal S(\mathbb R^n)}}. For {p<\infty} and {N>n/p} we have that

\displaystyle  \begin{array}{rcl} \|f\|_{L^p({\mathbb R}^n)} &\lesssim_p& \bigg(\int_{|x|\leq 1} |f(x)|^p dx\bigg)^\frac{1}{p}+\bigg(\int_{|x|> 1} |f(x)|^p dx\bigg)^\frac{1}{p}\\ \\ &\leq &\|f\|_{L^\infty}|B(0,1)|^\frac{1}{p} + \sup_{x\in{\mathbb R}^n} |x|^{N}|f(x)| \bigg(\int_{|x|>1} |x|^{-N p} dx\bigg)^\frac{1}{p} \\ \\ & \lesssim _{n,p}&p_N(f). \end{array}

If {p=\infty } observe that {\|f\|_\infty=p_0(f)} so there is nothing to prove. \Box
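
The integral over {|x|>1} in the proof above can be evaluated in polar coordinates. As a sketch, writing {\omega_{n-1}} for the surface measure of the unit sphere {S^{n-1}} (a symbol introduced here only for illustration), for {Np>n} we have

\displaystyle \int_{|x|>1}|x|^{-Np}dx=\omega_{n-1}\int_1 ^\infty r^{n-1-Np}dr=\frac{\omega_{n-1}}{Np-n},

which is finite exactly because {N>n/p}.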

2. The Fourier transform on the Schwartz class

Since {\mathcal S({\mathbb R}^n)\subset L^1({\mathbb R}^n)} there is no difficulty in defining the Fourier transform on {\mathcal S({\mathbb R}^n)} by means of the formula

\displaystyle \mathcal{F}(f)(\xi)=\hat f(\xi)=\int_{{\mathbb R}^n}f(x) e^{-2\pi i x\cdot \xi} dx, \quad f\in\mathcal{S}({\mathbb R}^n), \ \xi \in {\mathbb R}^n.

All the properties of {\mathcal F} that we have seen in the previous week’s notes are of course valid for the Fourier transform on {\mathcal S({\mathbb R}^n)}. As we shall now see, there is much more we can say for the Fourier transform on {\mathcal S({\mathbb R}^n)}.

For {f\in \mathcal S({\mathbb R}^n)} and every polynomial {P} we have that {P(-2\pi i x)f,P(\partial) f \in\mathcal S({\mathbb R}^n)}. Using the commutation relations

\displaystyle  \begin{array}{rcl} \mathcal F(P(-2\pi i x)f)(\xi)&=& P(\partial)\hat f(\xi),\\ \\ \mathcal F (P(\partial) f)(\xi)&=& P(2\pi i \xi) \hat f(\xi), \end{array}

we see that {\hat f \in {\mathcal S(\mathbb R^n)}}. Furthermore, since {\hat f\in{\mathcal S(\mathbb R^n)}\subset L^1({\mathbb R}^n)} we can use the inversion formula to write

\displaystyle f(x)=\int_{{\mathbb R}^n}\hat f(\xi) e^{2\pi i x\cdot \xi} d\xi=\mathcal F^{-1}(\hat f)=\mathcal F^{-1}\mathcal F f.

This shows that {\mathcal F:{\mathcal S(\mathbb R^n)}\rightarrow {\mathcal S(\mathbb R^n)}} is onto and of course it is a one to one operator as we have already seen. Finally let us see that it is also a continuous map. To see this observe that

\displaystyle  \begin{array}{rcl} p_N(\hat f)&=&\sup_{|\alpha|,|\beta|\leq N} \|\xi^\alpha \partial^\beta \hat f\|_{L^\infty({\mathbb R}^n)}= \sup_{|\alpha|,|\beta|\leq N}|2\pi|^{-|\alpha|} \|\mathcal F( \partial^\alpha ((-2\pi i x)^\beta f))\|_{L^\infty({\mathbb R}^n)} \\ \\ &\leq & \sup_{|\alpha|,|\beta|\leq N} |2\pi|^{|\beta|-|\alpha|} \|\partial^\alpha (x^\beta f)\|_{L^1({\mathbb R}^n)}\lesssim_{N} \sup_{|\alpha|,|\beta|\leq N}\|x^\beta \partial^\alpha f\|_{L^1({\mathbb R}^n)} \\ \\ &\lesssim_{n} & \sup_{|\alpha|,|\beta|\leq N} p_{M}(x^\beta \partial^\alpha f), \end{array}

for every integer {M>n}, by Proposition 5. However, by the Leibniz rule, {\sup_{|\alpha|,|\beta|\leq N} p_{M}(x^\beta \partial^\alpha f)\lesssim_{N,M} p_{M+N}(f)} so we get that

\displaystyle  p_N(\hat f )\lesssim_{N,M,n} p_{M+N}(f),

for every integer {M>n}, which by Lemma 4 shows that {\mathcal F:{\mathcal S(\mathbb R^n)} \rightarrow {\mathcal S(\mathbb R^n)}} is continuous.

We have thus proved the following:

Theorem 6 The Fourier transform is a homeomorphism of {{\mathcal S(\mathbb R^n)}} onto itself. The operator

\displaystyle \mathcal F^{-1}:{\mathcal S(\mathbb R^n)} \rightarrow {\mathcal S(\mathbb R^n)},\quad g\mapsto \mathcal F^{-1}(g)(x)=\int_{{\mathbb R}^n} g(\xi) e^{2\pi i x\cdot \xi} d\xi ,

is the continuous inverse of {\mathcal F} on {{\mathcal S(\mathbb R^n)}}:

\displaystyle  \mathcal F \mathcal F^{-1} = \mathcal F^{-1} \mathcal F= \textnormal{Id},

on {{\mathcal S(\mathbb R^n)}}.

We immediately get Plancherel’s identities:

Corollary 7 Let {f,g\in{\mathcal S(\mathbb R^n)}}. We have that

\displaystyle \int_{{\mathbb R}^n} f(x) \overline{g(x)}dx=\int_{{\mathbb R}^n} \hat f(\xi) \overline{ \hat g(\xi)} d\xi.

In particular, for every {f\in {\mathcal S(\mathbb R^n)} } we have that

\displaystyle \|\hat f\|_{L^2({\mathbb R}^n)}=\|f\|_{L^2({\mathbb R}^n)}.

Proof: The multiplication formula of the previous week’s notes reads

\displaystyle \int f \hat g=\int \hat f g,

for {f,g\in L^1({\mathbb R}^n)} and thus for {f,g\in{\mathcal S(\mathbb R^n)}}. Now let {f,g\in {\mathcal S(\mathbb R^n)}} and apply this formula to the functions {f,h\in {\mathcal S(\mathbb R^n)}} where {h=\bar{ \hat g}}. Observing that {\hat {\bar{\hat g}}=\bar g} we get the first of the identities in the corollary. Applying this identity to the functions {f} and {g=f} we also get the second. \Box
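
The identity {\hat {\bar{\hat g}}=\bar g} used above follows from the inversion formula. Indeed, with {h=\bar{\hat g}} we have

\displaystyle \hat h(x)=\int_{{\mathbb R}^n}\overline{\hat g(\xi)}e^{-2\pi i x\cdot \xi}d\xi=\overline{\int_{{\mathbb R}^n}\hat g(\xi)e^{2\pi i x\cdot \xi}d\xi}=\overline{g(x)},

so the multiplication formula {\int f\hat h=\int \hat f h} becomes {\int f\bar g=\int \hat f\,\overline{\hat g}}.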

We also get a nice proof of the fact that the convolution of Schwartz functions is again a Schwartz function.

Corollary 8 Let {f,g\in{\mathcal S(\mathbb R^n)}}. Then {f*g\in{\mathcal S(\mathbb R^n)}}.

Proof: For {f,g\in{\mathcal S(\mathbb R^n)}} we have that {\widehat{f*g}=\hat f \hat g}. Since {\hat f,\hat g \in{\mathcal S(\mathbb R^n)}} and the product of two Schwartz functions is again a Schwartz function, we conclude that {\widehat{f*g}\in{\mathcal S(\mathbb R^n)}} and thus, by Theorem 6, that {f*g\in{\mathcal S(\mathbb R^n)}}. \Box

3. The Fourier transform on {L^2({\mathbb R}^n)}

We have already seen that the Fourier transform is defined for functions {f\in L^1({\mathbb R}^n)} by means of the formula

\displaystyle \hat f(\xi)=\int_{{\mathbb R}^n}f(x) e^{-2\pi i x\cdot \xi} dx .

While this integral converges absolutely for {f\in L^1({\mathbb R}^n)}, this is not the case in general for {f\in L^2({\mathbb R}^n)}. However, Corollary 7 says that the Fourier transform is a linear operator on {{\mathcal S(\mathbb R^n)}}, a dense subset of {L^2({\mathbb R}^n)}, which is bounded with respect to the {L^2} norm; in fact we have that

\displaystyle  \|f\|_{L^2({\mathbb R}^n)}=\|\hat f\|_{L^2({\mathbb R}^n)}	 \ \ \ \ \ (3)

for every {f\in{\mathcal S(\mathbb R^n)}}. As we have seen several times already, this means that the Fourier transform has a unique bounded extension, which we will still denote by {\mathcal F}, throughout {L^2({\mathbb R}^n)}. In fact the Fourier transform {\mathcal F} is an isometry on {L^2({\mathbb R}^n)} as identity (3) shows.

Definition 9 A linear operator {S:L^2({\mathbb R}^n)\rightarrow L^2({\mathbb R}^n)} which is an isometry and maps onto {L^2({\mathbb R}^n)} is called a unitary operator.

Corollary 10 The Fourier transform is a unitary operator on {L^2({\mathbb R}^n)}.

The definition of the Fourier transform on {L^2} given above suggests that given {f\in L^2({\mathbb R}^n)}, one should find a sequence {\{h_k\}\subset {\mathcal S(\mathbb R^n)}} such that {h_k\rightarrow f} in {L^2} and define

\displaystyle ( \mathcal F f )(\xi)=\hat f(\xi)= L^2-\lim_{k\rightarrow \infty} \int_{{\mathbb R}^n} h_k(x) e^{-2\pi i x\cdot \xi} dx.

This, however, is a bit too abstract. The following lemma gives us an alternative way to calculate the Fourier transform on {L^2({\mathbb R}^n)}.

Lemma 11 Let {f\in L^2({\mathbb R}^n)}. The following formulas are valid

\displaystyle  \begin{array}{rcl}  		\hat f(\xi)&=&L^2-\lim_{R\rightarrow +\infty} \int_{|x|\leq R}f(x)e^{-2\pi i x\cdot \xi} dx,\\ \\ 	 f(x)&=&L^2-\lim_{R\rightarrow +\infty} \int_{|\xi|\leq R} \hat f(\xi) e^{2\pi i x\cdot \xi} d\xi, 	\end{array}

where the notation above means that the limits are considered in the {L^2} norm.

Proof: Given {f\in L^2({\mathbb R}^n)} let us define the functions

\displaystyle  \begin{array}{rcl}  	f_R(x)=\begin{cases}f(x),\quad &\mbox{if}\quad |x|\leq R,\\ 0,\quad &\mbox{if}\quad |x|>R.\end{cases} \end{array}

Then on the one hand we have that {\lim _{R\rightarrow +\infty} f_R=f} in {L^2({\mathbb R}^n)}. On the other hand the functions {f_R} belong to {L^1({\mathbb R}^n)} for all {R>0} so we can write

\displaystyle  \widehat {f_R}(\xi)=\int_{|x|\leq R} f(x) e^{-2\pi i x\cdot \xi} dx,\quad \xi\in{\mathbb R}^n.

Since the Fourier transform is an isometry on {L^2({\mathbb R}^n)} we also have that {\widehat{f_R}\rightarrow \hat f} as {R\rightarrow +\infty} in {L^2({\mathbb R}^n)}. The proof of the second formula is similar. \Box
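
The convergence used in the proof can be made explicit. As a sketch, by linearity and the isometry property of {\mathcal F} on {L^2({\mathbb R}^n)},

\displaystyle \|\widehat{f_R}-\hat f\|_{L^2({\mathbb R}^n)}=\|f_R-f\|_{L^2({\mathbb R}^n)}=\bigg(\int_{|x|>R}|f(x)|^2dx\bigg)^\frac{1}{2},

and the last quantity tends to {0} as {R\rightarrow+\infty} by dominated convergence, since {|f|^2\in L^1({\mathbb R}^n)}.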

4. The Fourier transform on {L^p({\mathbb R}^n)} and Hausdorff-Young

Having defined the Fourier transform on {L^1({\mathbb R}^n)} and on {L^2({\mathbb R}^n)} we are now in position to interpolate between these two spaces. Indeed, we have established that

\displaystyle \mathcal F:L^1+L^2 \rightarrow L^2+L^\infty,

and that {\mathcal F} is of strong type {(1,\infty)} and of strong type {(2,2)} both with norm {1}. We have also seen that it is well defined on the simple functions with finite measure support and on the Schwartz class, both dense subsets of all {L^p} spaces for {p<\infty}. Setting {\frac{1}{p}=\frac{1-\theta}{1}+\frac{\theta}{2}} with {0\leq \theta\leq 1} we get {\theta=\frac{2}{p'}} where {p'} is the dual exponent of {p}. This shows that {\frac{1}{q}=\frac{1-\theta}{\infty}+\frac{\theta}{2}=\frac{1}{p'}}. The Riesz-Thorin interpolation theorem now applies to show the following:

Theorem 12 (Hausdorff-Young Theorem) For {1\leq p \leq 2 } the Fourier transform extends to a bounded linear operator

\displaystyle \mathcal F:L^p({\mathbb R}^n)\rightarrow L^{p'}({\mathbb R}^n),

of norm at most {1}, that is we have

\displaystyle  \|\mathcal F f\|_{L^{p'}({\mathbb R}^n)}\leq \|f\|_{L^p({\mathbb R}^n)}, \quad f\in L^p({\mathbb R}^n),\quad 1\leq p \leq 2.

Remark 2 This is one instance where the Riesz-Thorin interpolation theorem fails to give the sharp norm, although the endpoint norms are sharp. Indeed, the actual norm of the Fourier transform is

\displaystyle \|\mathcal F\|_{L^p\rightarrow L^{p'}}=\bigg(\frac{p^\frac{1}{p}}{{p'}^\frac{1}{p'}}\bigg)^\frac{n}{2} ,\quad 1\leq p \leq 2,

which is strictly less than {1} for {1<p<2}. This is a deep theorem that was first proved by K.I. Babenko in the special case that {p'} is an even integer and then by W. Beckner in the general case.

Exercise 2 Let {f} be a general Gaussian function of the form

\displaystyle f(x)=ce^{2\pi i x\cdot \xi_o}e^{-\langle A(x-x_o),x-x_o\rangle},

for some positive definite real matrix {A:{\mathbb R}^n\rightarrow {\mathbb R}^n}. Show that

\displaystyle \|\mathcal F f\|_{L^{p'}({\mathbb R}^n)}=\bigg(\frac{p^\frac{1}{p}}{{p'}^\frac{1}{p'}}\bigg)^\frac{n}{2} \|f\|_{L^p({\mathbb R}^n)}.

Observe that this gives a lower bound on the norm {\|\mathcal F\|_{L^p\rightarrow L^{p'}}}.

Hint: Write {f} as a composition of translations, modulations and generalized dilations of the basic Gaussian function {e^{-\pi |x|^2}}.
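
As a starting point for the hint, here is a sketch of the computation for the basic Gaussian {g(x)=e^{-\pi|x|^2}}, which satisfies {\hat g=g}. For {0<q<\infty} a change of variables gives

\displaystyle \|g\|_{L^q({\mathbb R}^n)}=\bigg(\int_{{\mathbb R}^n}e^{-\pi q|x|^2}dx\bigg)^\frac{1}{q}=q^{-\frac{n}{2q}},

and therefore, for {1<p\leq 2},

\displaystyle \frac{\|\hat g\|_{L^{p'}({\mathbb R}^n)}}{\|g\|_{L^p({\mathbb R}^n)}}=\frac{p^\frac{n}{2p}}{{p'}^\frac{n}{2p'}}=\bigg(\frac{p^\frac{1}{p}}{{p'}^\frac{1}{p'}}\bigg)^\frac{n}{2}.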

Remark 3 The inversion problem for {L^p}, {1<p<2}, has a solution similar to that of the {L^1} case. One can easily see that the {\Phi} means of {\check f} converge to {f} in {L^p} as well as at every Lebesgue point of {f} if {\Phi} is appropriately chosen. In particular this is the case for the Abel or Gauss means of {\check f}.

We also have the following extension on the action of the Fourier transform on convolutions.

Proposition 13 Let {f\in L^1({\mathbb R}^n)} and {g\in L^p({\mathbb R}^n)} for some {1\leq p \leq 2}. Then, as we know, the function {f*g} belongs to {L^p({\mathbb R}^n)}. We have that

\displaystyle \widehat {(f*g)}(x)=\hat f(x) \hat g(x),

for almost every {x\in{\mathbb R}^n}.

We close this section by discussing the possibility of other mapping properties of the Fourier transform, besides the ones given by the Hausdorff-Young theorem. In particular we have seen that the Fourier transform is of strong type {(p,p')} for all {1\leq p \leq 2}. But are there any other pairs {(p,q)} for which the Fourier transform is of strong, or even weak type {(p,q)}?

The easiest thing to see is that whenever {\mathcal F} is of type {(p,q)} we must have that {q=p'}.

Exercise 3 Suppose that {\mathcal F} is of weak type {(p,q)}. Show that we must necessarily have {q=p'}.

Hint: Exploit the scale invariance of the Fourier transform; in particular remember the symmetry {\mathcal F \textnormal{Dil}_\lambda ^p =\textnormal{Dil}_{\lambda^{-1}} ^{p'} \mathcal F}.

The previous exercise thus shows that the only possible type for {\mathcal F} is of the form {(p,p')}. The Hausdorff-Young theorem shows that this is actually true whenever {1\leq p \leq 2}. It turns out however that the bound {(p,p')} fails for {p>2}. The following exercise describes one way to prove this.

Exercise 4 Show that {\mathcal F} cannot be of strong type {(p,p')} when {p>2}. (i) Let {N} be a large positive integer and {g(x)=e^{-\pi |x|^2}}. For {y\in {\mathbb R}^n} consider the function

\displaystyle f(x)=\sum_{j=1} ^N e^{2\pi i x\cdot{jy}}g(x-jy),\quad x\in{\mathbb R}^n.

Show that

\displaystyle \hat f(\xi)=\sum_{j=1} ^N e^{-2\pi i (\xi-jy) \cdot {jy}} \hat g(\xi-jy).

(ii) For any {1\leq p \leq \infty} show that

\displaystyle \|f\|_{L^p({\mathbb R}^n)}\simeq _p N^\frac{1}{p},

if {N} and {|y|} are large enough. For this show first the endpoint bounds for {p=1} and {p=\infty}. This will also give you the intermediate upper bounds by log-convexity. For the lower bounds, consider the values of {f} close to integer multiples of {y}.

(iii) The previous steps show that

\displaystyle  \| \hat f\|_{L^{p'}({\mathbb R}^n)}\simeq_p N^{\frac{1}{p'}-\frac{1}{p}}\|f\|_{L^p({\mathbb R}^n)},

which allows you to conclude the proof.

5. The space of tempered distributions

The purpose of this paragraph is to introduce a space of `generalized functions’ that is much larger than all the spaces we have seen so far, namely the space of tempered distributions. Let us begin with an informal discussion, drawing some analogies with some more classical (though not so classical) function spaces.

We have seen already that whenever {1\leq p<\infty} and the underlying measure is {\sigma}-finite, the space {L^{p'}({\mathbb R}^n)} can be identified with the dual {(L^p({\mathbb R}^n))^*}, by means of the pairing:

\displaystyle  g\in L^{p'}\mapsto g^*:L^p\rightarrow {\mathbb C},\quad g^*(f)=\int_{{\mathbb R}^n} f(x)\overline{g(x)}dx.

This is already quite interesting. A function in {L^p} is already a generalized object in the sense that it is only defined up to sets of measure zero; so in fact it represents an equivalence class. Furthermore, it can be identified with a linear functional acting on another function space.

We have seen that the space {{\mathcal S(\mathbb R^n)}} is contained in every {L^p} space and furthermore that it is dense in {L^p({\mathbb R}^n)} for all {p<\infty}. Restricting our attention to a smaller class of functions, the space {{\mathcal S(\mathbb R^n)}}, we get a larger dual space:

\displaystyle  {\mathcal S(\mathbb R^n)} \subset L^p({\mathbb R}^n) \implies L^{p'}({\mathbb R}^n) =(L^p({\mathbb R}^n))^*\subset ({\mathcal S(\mathbb R^n)})^*.

We thus obtain a space of generalized functions that contains the `classical’ {L^p} spaces. As we shall see, this space is much bigger and in particular it allows us to differentiate (in the appropriate sense) and remain in this class of generalized functions and, most notably, consider the Fourier transform of these objects and still remain in the class. These operations are often not even available on {L^p} spaces; for example we cannot even define the Fourier transform on {L^p({\mathbb R}^n)} for {p>2}. Furthermore, even when there is a way to define these operations on {L^p} functions we don’t necessarily stay in the given class of functions. For example, while it is perfectly legitimate to define the Fourier transform of an {L^1} function, the resulting function {\hat f} is not in general an integrable function. We shall see that, because {{\mathcal S(\mathbb R^n)}} is closed under taking partial derivatives, multiplying by polynomials and taking the Fourier transform of its elements, its dual space is also closed under the corresponding operations.

In what follows we will often write {\mathcal S'} for the dual {({\mathcal S(\mathbb R^n)})^*} and {\langle f,g\rangle} for the pairing {\int f \bar g}.

Definition 14 A linear functional {\lambda: {\mathcal S(\mathbb R^n)}\rightarrow {\mathbb C}} will be called a tempered distribution if it is continuous on {{\mathcal S(\mathbb R^n)}} with respect to the topology on {{\mathcal S(\mathbb R^n)}} described in the previous sections.

That is, the linear functional {\lambda:{\mathcal S(\mathbb R^n)}\rightarrow{\mathbb C}} is a tempered distribution if and only if there exists some {N\in{\mathbb N}_o} and {C>0} such that

\displaystyle |\lambda(\phi)|\leq C p_N(\phi),

for all {\phi \in {\mathcal S(\mathbb R^n)}}.

We equip the space {({\mathcal S(\mathbb R^n)})^*} with the weak-* topology; a sequence of tempered distributions {\lambda_k} converges to a limit {\lambda } if one has {\lambda_k(\phi)\rightarrow \lambda(\phi)} for all {\phi\in{\mathcal S(\mathbb R^n)}}. This is the weakest topology such that for each {f\in{\mathcal S(\mathbb R^n)}} the functional

\displaystyle  f^*:({\mathcal S(\mathbb R^n)})^*\rightarrow {\mathbb C},\quad f^*(\lambda)=\lambda(f)

is continuous. The space {({\mathcal S(\mathbb R^n)})^*} equipped with this topology will also be denoted by {\mathcal S'({\mathbb R}^n)}.

In what follows we will also use the notation {(f,\lambda)=(\lambda,f)} for {\lambda(f)} whenever {\lambda\in \mathcal S'({\mathbb R}^n)} and {f\in{\mathcal S(\mathbb R^n)}}. Be careful not to confuse this pairing with {\langle f,g\rangle=\int f\bar g}.

6. Examples of tempered distributions

We now describe several examples of classes of tempered distributions. We begin by showing how we can identify some known function classes with tempered distributions.

(i) Any element {f\in L^p({\mathbb R}^n)}, {1\leq p \leq \infty} can be identified with an element {\lambda_f\in \mathcal S'({\mathbb R}^n)} by means of the formula

\displaystyle  \lambda_f(\phi)=\int_{{\mathbb R}^n} f(x)\phi(x) dx,\quad \phi\in{\mathcal S(\mathbb R^n)},

and the map {L^p \ni f\mapsto \lambda_f} is continuous. We will say in this case that the tempered distribution {\lambda_f} is an {L^p} function.

It is clear that {\lambda_f} is linear. Furthermore we have that

\displaystyle |\lambda_f(\phi)| \leq \|f\|_{L^p({\mathbb R}^n)}\|\phi\|_{L^{p'}({\mathbb R}^n)}\lesssim_{p,n} \|f\|_{L^p({\mathbb R}^n)}p_N(\phi),

for some non-negative integer {N}, by Proposition 5, which shows that {\lambda_f\in {\mathcal S'(\mathbb R^n)} } by Lemma 4. Furthermore, the mapping {f\mapsto \lambda_f} is continuous. Indeed, if {f_k\rightarrow f} in {L^p({\mathbb R}^n)} we set {\lambda_k=\lambda_{f_k}}. We need to show that {\lambda_k\rightarrow \lambda_f} in the weak-* topology, that is, that {\lambda_k(\phi)-\lambda_f(\phi)\rightarrow 0} for every {\phi\in{\mathcal S(\mathbb R^n)}}. However this is a consequence of the previous estimate.
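
In more detail, applying the previous estimate to the difference {f_k-f} gives

\displaystyle |\lambda_k(\phi)-\lambda_f(\phi)|=|\lambda_{f_k-f}(\phi)|\lesssim_{p,n}\|f_k-f\|_{L^p({\mathbb R}^n)}\,p_N(\phi)\rightarrow 0\quad\mbox{ as } \quad k\rightarrow\infty,

for each fixed {\phi\in{\mathcal S(\mathbb R^n)}}.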

(ii) Any element {\psi\in {\mathcal S(\mathbb R^n)}} can be identified with an element {\lambda_\psi\in \mathcal S'({\mathbb R}^n)} by means of the formula

\displaystyle  \lambda_\psi(\phi)=\int_{{\mathbb R}^n} \psi(x)\phi(x) dx,\quad \phi\in{\mathcal S(\mathbb R^n)},

and the map {{\mathcal S(\mathbb R^n)} \ni \psi \mapsto \lambda_\psi} is continuous. We will say in this case that the tempered distribution {\lambda_\psi} is a Schwartz function. The proof is very similar to that of (i).

(iii) Let {\mu\in \mathcal M({\mathbb R}^n)} be a finite Borel measure. Then {\mu} can be identified with a tempered distribution {\lambda_\mu\in {\mathcal S'(\mathbb R^n)}} by means of the formula

\displaystyle  \lambda_\mu(\phi)=\int_{{\mathbb R}^n}\phi(x)d\mu(x),

and the map {\mathcal M({\mathbb R}^n)\ni\mu\mapsto \lambda_\mu} is continuous. We will say in this case that the tempered distribution {\lambda_\mu} is a (finite Borel) measure. The proof is the same as that of the preceding cases.

(iv) Let {1\leq p\leq \infty}. A measurable function {f} such that {f(x)(1+|x|^2)^{-k}\in L^p({\mathbb R}^n)} for some non-negative integer {k} is called a tempered {L^p} function. Again the functional {\lambda_f} is an element of {{\mathcal S'(\mathbb R^n)}}. For {p=\infty} such a function is often called a slowly increasing function. Similarly a Borel measure {\mu} such that

\displaystyle \int_{{\mathbb R}^n} (1+|x|^2)^{-k}d|\mu|(x)<+\infty,

is called a tempered Borel measure and it defines an element of {{\mathcal S'(\mathbb R^n)}} by setting

\displaystyle \lambda_\mu(\phi)=\int_{{\mathbb R}^n} \phi(x)d\mu(x).

We will say that the tempered distribution {\lambda_\mu} is a tempered Borel measure.

Exercise 5 Show that if {\mu} is a tempered Borel measure then {\lambda_\mu\in {\mathcal S'(\mathbb R^n)}} and the map {\mu\mapsto \lambda_\mu} is continuous. Conclude the corresponding statement if {f} is a tempered {L^p} function. Observe that {f(x)dx} defines a tempered measure.

Exercise 6 Show that a Borel measure {\mu} is a tempered measure if and only if it is of polynomial growth: there exists a positive integer {k} such that

\displaystyle |\mu|(B(0,R))\lesssim R^k,

for all {R\geq 1}. In particular, {\mu} is locally finite.

Remark 4 From the previous definitions one gets the impression that the term `tempered’ is closely connected to `of at most polynomial growth’. This is in some sense correct since all functions or measures of at most polynomial growth define tempered distributions. On the other hand, the opposite claim is not true. Indeed, observe that the function {\sin(e^x)} is a slowly increasing function (actually it is bounded) and thus defines a tempered distribution. Thus, the derivative of this function, {e^x\cos(e^x)}, is also a tempered distribution although it grows exponentially fast.
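
To see concretely why {e^x\cos(e^x)} pairs with Schwartz functions on {{\mathbb R}}, one can integrate by parts on {[-R,R]}; the boundary terms vanish in the limit because {\sin(e^x)} is bounded and {\phi} decays. As a sketch, for {\phi\in\mathcal S({\mathbb R})},

\displaystyle \lim_{R\rightarrow+\infty}\int_{-R} ^R e^x\cos(e^x)\phi(x)dx=-\int_{{\mathbb R}}\sin(e^x)\phi'(x)dx,

and the right hand side is bounded in absolute value by {\|\phi'\|_{L^1({\mathbb R})}\leq \pi\sup_{x}(1+x^2)|\phi'(x)|\leq 2\pi p_2(\phi)}.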

All the previous examples identify functions and measures (of moderate growth) with tempered distributions and the embeddings are continuous. However the space {{\mathcal S'(\mathbb R^n)}} also contains `rougher’ objects which are neither functions nor measures.

Exercise 7 Show that the functional {\delta' _0:\phi\mapsto -\phi'(0)}, defined for all {\phi\in\mathcal S({\mathbb R})}, is a tempered distribution which does not arise from a tempered measure (and thus it does not arise from a tempered function either).

Example 1 (The principal value distribution) We define the functional {\textnormal{p.v.}\frac{1}{x}} as

\displaystyle (\textnormal{p.v.}\frac{1}{x},\phi):=\lim_{\epsilon\rightarrow 0}\int_{|x|>\epsilon} \frac{\phi(x)}{x}dx.

Then {\textnormal{p.v.}\frac{1}{x}\in\mathcal S'({\mathbb R}) }. To see this, let us fix some {0<\epsilon<1} and {\phi\in\mathcal S({\mathbb R})}. Since {1/x} is odd we have {\int_{\epsilon<|x|<1}\frac{\phi(0)}{x}dx=0}, so we can write

\displaystyle  \begin{array}{rcl}  	\int_{|x|>\epsilon} \frac{\phi(x)}{x}dx=\int_{\epsilon<|x|<1}\frac{\phi(x)-\phi(0)}{x}dx+\int_{|x|>1}\frac{\phi(x)}{x}dx. \end{array}

Now observe that, by the mean value theorem, {\big|\frac{\phi(x)-\phi(0)}{x}\big|\leq \|\phi'\|_\infty}, thus the limit of the first summand as {\epsilon\rightarrow 0 } exists and

\displaystyle (\textnormal{p.v.}\frac{1}{x},\phi)=\int_{|x|<1}\frac{\phi(x)-\phi(0)}{x}dx+\int_{|x|>1}\frac{\phi(x)}{x}dx.

Moreover we have that

\displaystyle |(\textnormal{p.v.}\frac{1}{x},\phi)|\lesssim \|\phi '\|_{\infty}+\|x\phi\|_\infty \lesssim p_1(\phi).

Furthermore, it is easy to see that this tempered distribution does not arise from any locally finite Borel measure. For this consider a Schwartz function {\phi} adapted to an interval of the form {(\delta,1)} and let {\delta\rightarrow 0}.

Exercise 8 (The principal value distribution in many dimensions) Let {K:{\mathbb R}^n\setminus\{0\}\rightarrow {\mathbb C}} be a homogeneous function of degree {-n}. This means that

\displaystyle K(\lambda x)=\lambda^{-n} K(x),\quad \lambda>0.

(i) Show that there exists a function {\Omega:S^{n-1}\rightarrow {\mathbb C}} such that {K(x)=\Omega(x')/|x|^n} where {x'=x/|x|\in S^{n-1}}.

(ii) Assume that {\int_{S^{n-1}}\Omega(x') d\sigma_{n-1}(x')=0}. For {\phi\in {\mathcal S(\mathbb R^n)}} we define

\displaystyle  \textnormal{p.v.}_K (\phi)=\lim_{\epsilon\rightarrow 0}\int_{|x|>\epsilon} K(x)\phi(x) dx.

Show that the limit in the previous definition exists and that {\textnormal{p.v.}_K} defines a tempered distribution.

7. Basic operations on the space of tempered distributions

We have already seen that the space {{\mathcal S(\mathbb R^n)}} is closed under several basic operations: differentiation, multiplying by polynomials, multiplication between elements of the Schwartz space and, most notably, the Fourier transform. The space of tempered distributions has very similar properties:

Derivatives in {{\mathcal S'(\mathbb R^n)}}: We begin the discussion by considering {\phi,\psi \in{\mathcal S(\mathbb R^n)}} and writing down the integration by parts formula

\displaystyle \int_{{\mathbb R}^n} (\partial^\beta \psi)(x)\phi(x)dx=(-1)^{|\beta|} \int_{{\mathbb R}^n} \psi(x)(\partial^\beta\phi)(x)dx.

According to the previous definitions we can rewrite the previous formula as

\displaystyle  (\partial^\beta\psi,\phi)=(-1)^{|\beta|}(\psi,\partial ^\beta \phi),

or equivalently

\displaystyle \lambda_{\partial^\beta\psi}(\phi)=(-1)^{|\beta|}\lambda_\psi(\partial^\beta \phi).

The right hand side of the previous identity however makes sense for any {\lambda\in{\mathcal S'(\mathbb R^n)}} in place of {\lambda_\psi}, whenever {\phi\in{\mathcal S(\mathbb R^n)}}. Also, for {\lambda \in{\mathcal S'(\mathbb R^n)}} the mapping {\phi\mapsto \lambda (\partial^\beta \phi)} is continuous since {\lambda } is continuous and the map {\phi\mapsto \partial ^\beta \phi} is continuous. We thus define the partial derivative {\partial^\beta \lambda} of any {\lambda \in {\mathcal S'(\mathbb R^n)}} by means of

\displaystyle  (\partial^\beta \lambda)(\phi)=(-1)^{|\beta|}\lambda(\partial^\beta \phi).

The previous discussion implies that {\partial^\beta \lambda \in {\mathcal S'(\mathbb R^n)}}.

Example 2 Let {f} be the tempered {L^\infty} function defined as

\displaystyle  \begin{array}{rcl}  		f(x)=\begin{cases} 			0, \quad x< 0,\\ 1,\quad x\geq 0. 		\end{cases}		 	\end{array}

The function {f} is often called the Heaviside step function. Clearly {f} defines a tempered distribution {\lambda_f} in the usual way

\displaystyle \lambda_f(\phi)=\int_{{\mathbb R}} f(x)\phi(x)dx,\quad \phi\in\mathcal S({\mathbb R}).

For every {\phi\in\mathcal S({\mathbb R})} we then have

\displaystyle  \begin{array}{rcl}  	\lambda_f '(\phi)=-\lambda_f(\phi')=-\int_{\mathbb R} f(x)\phi'(x)dx=-\int_0 ^\infty \phi'(x)dx=\phi(0)=\int_{\mathbb R}\phi(x)d\delta_0(x). \end{array}

That is {\lambda_f ' =\delta_0.}

Remark 5 The fact that the distributional derivative of the Heaviside step function is the Dirac mass at {0} is intuitively clear. The function {f} is differentiable everywhere except at {0} and {f'(x)=0} whenever {x\neq 0}. On the other hand there is a jump discontinuity of weight equal to {1} at {0} which, roughly speaking, requires an infinite derivative to be realized. In general, a jump discontinuity of weight {a} at a point {x_o} contributes a Dirac mass of weight {a} at the point {x_o} to the distributional derivative.
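
The general statement about jumps can be illustrated as follows. This is only a sketch under stated assumptions: {g} is a hypothetical {C^1} function on {{\mathbb R}} such that {g} and {g'} have at most polynomial growth, and {H} denotes the Heaviside step function of Example 2.

```latex
% f has a single jump of weight a at x_o on top of a smooth part g (assumption).
\[
f(x)=g(x)+a\,H(x-x_o),\qquad x\in\mathbb R .
\]
% Integrate by parts in the smooth part; the boundary terms vanish because
% \phi is Schwartz and g has at most polynomial growth.
\[
\lambda_f'(\phi)=-\int_{\mathbb R} g(x)\phi'(x)\,dx-a\int_{x_o}^{\infty}\phi'(x)\,dx
 =\int_{\mathbb R} g'(x)\phi(x)\,dx+a\,\phi(x_o),
\]
\[
\text{that is,}\qquad \lambda_f'=\lambda_{g'}+a\,\delta_{x_o}.
\]
```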

Example 3 Let {\delta_0} be a Dirac mass at {0}. We then have

\displaystyle (\partial^\beta \delta_0)(\phi)=(-1)^{|\beta|}\delta_0(\partial^\beta \phi)=(-1)^{|\beta|}\partial^\beta\phi(0).

This also explains the minus sign in Exercise 7.

Exercise 9 In dimension {n=1} show that:

(i) The distributional derivative of the signum function {\textnormal{sgn}(x)} is {2\delta_0}.

(ii) The distributional derivative of the locally integrable function {\log|x|} is equal to {\textnormal{p.v.}\frac{1}{x}}.

(iii) The distributional derivative of the locally integrable function {|x|} is equal to {\textnormal{sgn}(x)}.

Translations, Modulations, Dilations and Reflections in {{\mathcal S'(\mathbb R^n)}}: We have seen that the translation operator {\tau_h} maps a measurable function {f} to the function {f(\cdot-h)}, where {h\in{\mathbb R}^n}. A trivial change of variables shows that whenever {f\phi\in L^1({\mathbb R}^n)} we have that

\displaystyle \int_{{\mathbb R}^n} (\tau_hf)(x)\phi(x)dx=\int_{{\mathbb R}^n}f(x)(\tau_{-h}\phi)(x)dx.

Now assume that {f} is a tempered {L^p} function (say). In the language of distributions we can rewrite the previous identity as

\displaystyle  \lambda_{\tau_h f}(\phi)=\lambda_f(\tau_{-h}\phi),

for all {\phi \in {\mathcal S(\mathbb R^n)}}. Again, the right hand side of this identity is well defined for any {\lambda\in{\mathcal S'(\mathbb R^n)}} and we define the translation of any distribution {\lambda\in{\mathcal S'(\mathbb R^n)}} as

\displaystyle (\tau_h\lambda)(\phi)=\lambda (\tau_{-h}\phi), \quad \phi\in {\mathcal S(\mathbb R^n)}.

It is easy to see that {\tau_h\lambda\in{\mathcal S'(\mathbb R^n)}}.

Similarly we define for {\lambda\in{\mathcal S'(\mathbb R^n)}} and {\phi\in{\mathcal S(\mathbb R^n)}} the tempered distributions

\displaystyle  \begin{array}{rcl}  	\tilde \lambda (\phi) &=&\lambda(\tilde \phi),\\ \\ (\textnormal{Mod}_y\lambda)(\phi)&=& \lambda ({\textnormal{Mod}_y \phi}),\\ \\ (\textnormal{Dil}^p _t \lambda)(\phi) &=&\lambda(\textnormal{Dil}^{p'} _{t^{-1}}\phi),\quad t>0.	 \end{array}
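
The dilation rule is the least obvious of the three, so here is a short consistency check in the case where {\lambda=\lambda_f} for a tempered function {f}. It assumes the normalization {(\textnormal{Dil}^p _t f)(x)=t^{-n/p}f(x/t)} and the usual dual exponent {1/p+1/p'=1}; if the earlier notes use a different normalization the exponents below change accordingly.

```latex
% Assumed normalization: (Dil^p_t f)(x) = t^{-n/p} f(x/t),  1/p + 1/p' = 1.
% Substitute x = t y, dx = t^n dy.
\[
\int_{\mathbb R^n}(\mathrm{Dil}^p_t f)(x)\,\phi(x)\,dx
 = t^{-n/p}\int_{\mathbb R^n} f(x/t)\,\phi(x)\,dx
 = t^{\,n-n/p}\int_{\mathbb R^n} f(y)\,\phi(ty)\,dy .
\]
% Since n - n/p = n/p' and (Dil^{p'}_{1/t}\phi)(y) = t^{n/p'} \phi(ty):
\[
\int_{\mathbb R^n}(\mathrm{Dil}^p_t f)(x)\,\phi(x)\,dx
 = \int_{\mathbb R^n} f(y)\,(\mathrm{Dil}^{p'}_{t^{-1}}\phi)(y)\,dy,
\qquad\text{i.e.}\qquad
\lambda_{\mathrm{Dil}^p_t f}(\phi)=\lambda_f(\mathrm{Dil}^{p'}_{t^{-1}}\phi).
\]
```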

Convolution in {{\mathcal S'(\mathbb R^n)}}: Let {f,g,h\in{\mathcal S(\mathbb R^n)}}. Then it is an easy application of Fubini’s theorem that

\displaystyle \int_{{\mathbb R}^n} (f*g)(x)h(x)dx=\int_{{\mathbb R}^n}f(x)(\tilde g*h)(x)dx,

where {\tilde g(x)=g(-x)} is the reflection of {g}. In the language of distributions the previous identity reads

\displaystyle \lambda_{f*g}(h)=\lambda_f(\tilde g*h),\quad h\in{\mathcal S(\mathbb R^n)}.

Now the right hand side of the previous identity is well defined for any {\lambda\in{\mathcal S'(\mathbb R^n)}} in place of {\lambda_f}, as long as {\tilde g*h \in {\mathcal S(\mathbb R^n)}} whenever {h \in {\mathcal S(\mathbb R^n)}}. So assume that { g } is a function such that {\tilde g * \phi \in {\mathcal S(\mathbb R^n)}} for all {\phi \in {\mathcal S(\mathbb R^n)}}. This is obviously the case if {g \in {\mathcal S(\mathbb R^n)}}. Thus we can define the convolution of any {\lambda \in {\mathcal S'(\mathbb R^n)}} with a function {g\in{\mathcal S(\mathbb R^n)}} by means of the formula

\displaystyle  (\lambda * g )(\phi)=\lambda(\tilde g* \phi),\quad \phi \in {\mathcal S(\mathbb R^n)}.

It is easy to see that the linear functional {{\lambda*g}} is continuous as a composition of the continuous maps {\phi\mapsto\tilde g*\phi} and {\psi\mapsto \lambda(\psi)}, thus {\lambda*g\in{\mathcal S'(\mathbb R^n)}} for every {\lambda\in{\mathcal S'(\mathbb R^n)}} and {g\in{\mathcal S(\mathbb R^n)}}.

Exercise 10 Actually, the condition {g\in{\mathcal S(\mathbb R^n)}} is a bit too much to ask if one just wants to define the convolution {\lambda *g}. As we have observed, the only requirement is that {\tilde g*\phi\in{\mathcal S(\mathbb R^n)}} whenever {\phi\in {\mathcal S(\mathbb R^n)}}. Suppose that {g} is a rapidly decreasing function, that is {|x|^k g(x)\in L^\infty({\mathbb R}^n)} for all {k=0,1,2,\ldots}. Show that the convolution of {\lambda \in{\mathcal S'(\mathbb R^n)}} and {g} can be defined and that it is again an element of {{\mathcal S'(\mathbb R^n)}}.

It turns out that the convolution of a tempered distribution with a Schwartz function is a function:

Theorem 15 Let {\lambda\in{\mathcal S'(\mathbb R^n)}} and {h\in{\mathcal S(\mathbb R^n)}}. Then the convolution {\lambda*h} is the function {f} given by the formula

\displaystyle (\lambda*h)(x)=\lambda(\tau_x\tilde h),\quad x\in{\mathbb R}^n.

Moreover, {f\in C^\infty({\mathbb R}^n)} and for all multi-indices {\alpha} the function {\partial^\alpha f} is slowly increasing.

For the proof of this theorem see [SW].
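
Two special cases may help in reading the formula of Theorem 15. They are offered as a sketch, using only the conventions {(\tau_x g)(y)=g(y-x)} and {\tilde h(y)=h(-y)} from the text; in the second case {f} stands for an arbitrary tempered function (an assumption made for illustration).

```latex
% Case 1: \lambda = \delta_0 acts as the identity for convolution.
\[
(\delta_0*h)(x)=\delta_0(\tau_x\tilde h)=(\tau_x\tilde h)(0)=\tilde h(-x)=h(x).
\]
% Case 2: \lambda = \lambda_f recovers the classical convolution.
\[
(\lambda_f*h)(x)=\int_{\mathbb R^n} f(y)\,(\tau_x\tilde h)(y)\,dy
 =\int_{\mathbb R^n} f(y)\,h(x-y)\,dy=(f*h)(x).
\]
```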

The Fourier transform on {{\mathcal S'(\mathbb R^n)}}: We now come to the definition and action of the Fourier transform of a tempered distribution. As in all the other definitions, first we investigate what happens in the case the tempered distribution is a Schwartz function. So, letting {\phi,f \in{\mathcal S(\mathbb R^n)}} the multiplication formula implies that

\displaystyle  \int_{{\mathbb R}^n} \phi(x) \hat f(x)dx=\int_{{\mathbb R}^n}\hat \phi(x) f (x) dx.

In the language of tempered distributions we have that

\displaystyle  \lambda_{\hat f}(\phi) = \lambda_f (\hat \phi).

Observing once more that the right hand side is well defined for any {\lambda\in{\mathcal S'(\mathbb R^n)}} in place of {\lambda_f}, and that the map {{\mathcal S(\mathbb R^n)}\ni \phi\mapsto \lambda(\hat \phi)} is linear and continuous, we define the Fourier transform of any tempered distribution {\lambda \in {\mathcal S'(\mathbb R^n)}} as

\displaystyle  \mathcal F (\lambda)(\phi)=\hat \lambda (\phi) =\lambda (\hat \phi),\quad \phi \in {\mathcal S(\mathbb R^n)}.

We have that {\hat \lambda\in{\mathcal S'(\mathbb R^n)}} whenever {\lambda \in {\mathcal S'(\mathbb R^n)}}. It is also trivial to define the inverse Fourier transform of a tempered distribution as

\displaystyle  \mathcal F^{-1}(\lambda)(\phi)=\check\lambda(\phi)=\lambda(\check \phi),

and to show that {\mathcal F} is a homeomorphism of {{\mathcal S'(\mathbb R^n)}} onto itself. Also the operator {\mathcal F:{\mathcal S'(\mathbb R^n)}\rightarrow {\mathcal S'(\mathbb R^n)}} satisfies all the symmetry properties that the classical Fourier transform satisfies and commutes with derivatives in the same way.
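
The simplest instances of this definition are the Dirac mass and the constant function. The computation below is a sketch assuming the normalization {\hat \phi(\xi)=\int_{{\mathbb R}^n}\phi(x)e^{-2\pi i x\cdot \xi}dx}, which is the one under which the Gaussian {e^{-\pi|x|^2}} is its own Fourier transform.

```latex
% Assumed convention: \hat\phi(\xi) = \int \phi(x) e^{-2\pi i x\cdot\xi} dx.
% Fourier transform of a Dirac mass at a point a:
\[
\hat\delta_a(\phi)=\delta_a(\hat\phi)=\hat\phi(a)
 =\int_{\mathbb R^n} e^{-2\pi i a\cdot x}\,\phi(x)\,dx,
\qquad\text{so}\qquad \hat\delta_a=\lambda_{e^{-2\pi i a\cdot x}},\quad \hat\delta_0=\lambda_1 .
\]
% Fourier transform of the constant function 1, via Fourier inversion at 0:
\[
\hat\lambda_1(\phi)=\lambda_1(\hat\phi)=\int_{\mathbb R^n}\hat\phi(\xi)\,d\xi=\phi(0)=\delta_0(\phi),
\qquad\text{so}\qquad \hat\lambda_1=\delta_0 .
\]
```

Neither {\delta_0} nor the constant {1} has a Fourier transform in the classical {L^1} sense, yet both are handled by the same one-line definition.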

Example 4 (The Fourier transform of {|x|^{-2}} in {{\mathbb R}^3}) We consider the function

\displaystyle f(x)=\frac{1}{|x|^2},\quad x\in{\mathbb R}^3.

Note that {f} is locally integrable in {{\mathbb R}^3} and it decays at infinity thus it can be identified with a tempered distribution which we will still call {f}. On the other hand {f} is not in any {L^p} space so we can’t consider its Fourier transform in the classical sense. We claim that the Fourier transform of {f} in the sense of distributions is given as

\displaystyle \widehat{\frac{1}{|x|^2}}(\xi)=\frac{\pi}{|\xi|}.

First of all observe that it suffices to show that

\displaystyle  \int_{{\mathbb R}^3} \frac{1}{|x|^2}\hat \phi(x)dx=\int_{{\mathbb R}^3} \frac{\pi}{|\xi|}\phi(\xi)d\xi,	 \ \ \ \ \ (4)

for all {\phi\in\mathcal S({\mathbb R}^3)}. Here it is convenient to express the function {1/|x|^2} as an average of functions with known Fourier transforms. Indeed, this can be done by means of the identity

\displaystyle 	\frac{1}{2\pi|x|^2}=\int_0 ^\infty t e^{-\pi t^2 |x|^2}dt,

which follows from the change of variables {u=\pi t^2|x|^2}. Now fix a function {\phi\in \mathcal S({\mathbb R}^3)}. We have that

\displaystyle  \begin{array}{rcl}  	\int_{{\mathbb R}^3} \frac{1}{|x|^2}\hat \phi(x)dx &=& 2\pi \int_{{\mathbb R}^3} \int_0 ^\infty t e^{-\pi t^2 |x|^2} dt \ \hat \phi(x)dx\\ \\ 	&=& 2\pi \int_0 ^\infty t \bigg(\int_{{\mathbb R}^3} e^{-\pi t^2|x|^2}\hat \phi(x)dx \bigg) dt, \end{array}

by an application of Fubini’s theorem since the function {te^{-\pi t^2|x|^2}\hat \phi(x)} is an integrable function on {(0,\infty)\times {\mathbb R}^3}. The inner integral can be calculated now by using the multiplication formula and the (known) Fourier transform of a Gaussian. Indeed we have

\displaystyle \int_{{\mathbb R}^3} e^{-\pi t^2|x|^2}\hat \phi(x)dx=\int_{{\mathbb R}^3}\frac{1}{t^3}e^{-\pi\frac{|x|^2}{t^2}}\phi(x)dx.

Putting the last two identities together we get

\displaystyle  \begin{array}{rcl}  	\int_{{\mathbb R}^3}\frac{1}{|x|^2}\hat \phi(x)dx =2\pi \int_0 ^\infty \bigg(\int_{{\mathbb R}^3}\frac{1}{t^2} \phi(x) e^{-\pi|x|^2 /t^2}dx \bigg) dt. \end{array}

Now observe that by changing variables {s=|x|/t} we have

\displaystyle \int_0 ^\infty \frac{1}{t^2}e^{-\pi |x|^2/t^2}dt =\frac{1}{|x|} \int_0 ^\infty e^{-\pi s^2}ds=\frac{1}{2|x|},

and thus

\displaystyle \int_0 ^\infty \int_{{\mathbb R}^3} |\phi(x)|\frac{1}{t^2}e^{-\pi |x|^2/t^2}dx\,dt=\frac{1}{2}\int_{{\mathbb R}^3}|\phi(x)|\frac{1}{|x|}dx<\infty,

since {|x|^{-1}} is locally integrable in {{\mathbb R}^3} and {\phi\in\mathcal S({\mathbb R}^3)}. A second application of Fubini’s theorem, together with the previous computation of the integral in {t}, then gives (4) and proves the claim.
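
For readers who want to track the constants in Example 4, the two key steps can be written out as follows. This only restates the computation of the example; the substitution variable {u} is introduced here for illustration.

```latex
% Step 1: Gaussian average of |x|^{-2}, via u = \pi t^2 |x|^2, du = 2\pi |x|^2 t\,dt.
\[
\int_0^\infty t\,e^{-\pi t^2|x|^2}\,dt
  = \frac{1}{2\pi|x|^2}\int_0^\infty e^{-u}\,du
  = \frac{1}{2\pi|x|^2}.
\]
% Step 2: collect the constants after exchanging the order of integration.
\[
\int_{\mathbb R^3}\frac{\hat\phi(x)}{|x|^2}\,dx
  = 2\pi\int_{\mathbb R^3}\phi(x)\Big(\int_0^\infty \frac{1}{t^2}e^{-\pi|x|^2/t^2}\,dt\Big)dx
  = 2\pi\int_{\mathbb R^3}\phi(x)\,\frac{1}{2|x|}\,dx
  = \int_{\mathbb R^3}\frac{\pi}{|x|}\,\phi(x)\,dx .
\]
```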

Exercise 11 (i) Let {f} be a smooth function such that for all multi-indices {\alpha} the partial derivatives {\partial^\alpha f} have at most polynomial growth: {|\partial ^\alpha f(x)|\lesssim_\alpha (1+|x|^2)^{k_\alpha},} for some {k_\alpha\geq 0}. Show that the product of a tempered distribution {\lambda\in{\mathcal S'(\mathbb R^n)}} with {f} is well defined by means of the formula

\displaystyle (\lambda f)(\phi)=\lambda(f\phi), \quad \phi\in{\mathcal S(\mathbb R^n)},

and that {\lambda f\in{\mathcal S'(\mathbb R^n)}}.

(ii) If {\lambda\in{\mathcal S'(\mathbb R^n)}} and {f\in{\mathcal S(\mathbb R^n)}} then show that

\displaystyle  \widehat {\lambda*f}=\hat \lambda \hat f\quad\mbox{in}\quad {\mathcal S'(\mathbb R^n)}.

Remark 6 The definition of the Fourier transform on {{\mathcal S'(\mathbb R^n)}} implies that whenever {f\in L^p({\mathbb R}^n)}, {1\leq p \leq 2} we have that

\displaystyle  \hat \lambda_{f}=\lambda_{\hat f}.

Thus the Fourier transform on tempered distributions is an extension of the classical definition of the Fourier transform. If on the other hand {f\in L^p({\mathbb R}^n)} for some {2<p\leq \infty} then {f} is a tempered {L^p} function and thus {\lambda_f} is a tempered distribution. This allows us to define the Fourier transform of {f} by looking at {f} as a tempered distribution. The discussion that followed the Hausdorff-Young theorem however suggests that {\hat \lambda_f} will not be a function in general.

Exercise 12 (Poisson summation formula) For {f\in{\mathcal S(\mathbb R^n)}} we define

\displaystyle \Lambda(f)=\sum_{k=(k_1,\ldots,k_n)\in{\mathbb Z}^n}f(k).

Note that {\Lambda} can be identified with a sum of unit masses positioned at every point of the integer lattice {{\mathbb Z}^n}

\displaystyle \Lambda = \sum_{k\in{\mathbb Z}^n}\delta_k.

Show that {\Lambda \in {\mathcal S'(\mathbb R^n)}} and that {\mathcal F \Lambda = \Lambda}.

Hints: (a) First prove the case of dimension {n=1} by proving the following intermediate statements.

(i) Show that {\Lambda} satisfies the invariances {\tau_1\Lambda=\Lambda} and {\textnormal{Mod}_1\Lambda =\Lambda}.

(ii) Consider a Schwartz function {g\in\mathcal S({\mathbb R})} with support in the interval {(-\frac{1}{4},\frac{1}{4})} and {g(0)=1}. If {f\in\mathcal S({\mathbb R})} has compact support show that the function

\displaystyle h(x)=\frac{f(x)-\sum_{m\in{\mathbb Z}}f(m)\tau_mg(x)}{1-e^{2\pi i x}}

is a smooth function with compact support.

(iii) Let {\Lambda'} be a tempered distribution which satisfies the invariances {\tau_1\Lambda'=\Lambda'} and {\textnormal{Mod}_1\Lambda'=\Lambda'}. Show that

\displaystyle \Lambda'(f-\sum_{k\in{\mathbb Z}}f(k)\tau_k(g))=0

whenever {f,g} are as in step (ii). Conclude that

\displaystyle \Lambda'(f)=c\Lambda(f)

for some {c\in{\mathbb C}}, whenever {f} is a Schwartz function with compact support. Extend this equality to all {f\in\mathcal S({\mathbb R})} by a density argument.

(iv) Step (iii) essentially shows that any tempered distribution that has the symmetries in (i) must agree with {\Lambda} up to a multiplicative constant. Observe that {\mathcal F \Lambda} satisfies the same invariances. Conclude that {\Lambda=c\hat \Lambda} by step (iii). Determine the numerical constant {c\in{\mathbb C}} by testing against the Schwartz function {f(x)=e^{-\pi x^2}}. This concludes the proof for the one dimensional case.

(b) For general {n} use Fubini’s theorem to show that

\displaystyle \mathcal F = \mathcal F_{x_1}\mathcal F_{x_2} \cdots \mathcal F_{x_n},

where {\mathcal F_{x_j}} denotes the (one-dimensional) Fourier transform in the {j}-th direction. Thus step (a) implies that

\displaystyle  \mathcal F_{x_j} \Lambda =\Lambda,

for every {j=1,2,\ldots,n}. Conclude the proof by iterating this identity.

Exercise 13 (Equivalent form of Poisson summation formula) Show that if {f\in\mathcal S({\mathbb R}^n)} and {x\in{\mathbb R}^n} then we have that

\displaystyle  \sum_{k\in\mathbb Z^n} f(x+k)=\sum_{k\in{\mathbb Z}^n} \hat f (k) e^{2\pi i x\cdot k}.
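
A classical application of this identity, included here only as an illustration, is obtained in dimension {n=1} by taking {x=0} and a dilated Gaussian. The parameter {s>0} is introduced for this example, and the Fourier transform of the Gaussian is computed under the assumed normalization {\hat f(\xi)=\int f(x)e^{-2\pi i x \xi}dx}.

```latex
% Dilated Gaussian and its Fourier transform (s > 0 is a free parameter).
\[
f(x)=e^{-\pi s x^2},\qquad \hat f(\xi)=s^{-1/2}\,e^{-\pi \xi^2/s}.
\]
% Poisson summation at x = 0 gives the functional equation of the theta function.
\[
\sum_{k\in\mathbb Z} e^{-\pi s k^2}
  \;=\; \frac{1}{\sqrt{s}}\sum_{k\in\mathbb Z} e^{-\pi k^2/s},\qquad s>0 .
\]
```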

8. Translation invariant operators

Let {V,W} be vector spaces of functions on {{\mathbb R}^n} and suppose that {T} is an operator that maps {V} into {W}. We will say that {T} commutes with translations or that {T} is translation invariant if {T \tau_y=\tau_y T } for all {y\in{\mathbb R}^n}. To see an example of such an operator, consider {K\in L^1({\mathbb R}^n)} and define {T_K(f)(x)=(f*K)(x)} for all {f\in L^p({\mathbb R}^n)}, {1\leq p\leq \infty}. We have seen that {T_K} is well defined and furthermore that

\displaystyle \|T_K(f)\|_{L^p({\mathbb R}^n)}\leq \|K\|_{L^1({\mathbb R}^n)}\|f\|_{L^p({\mathbb R}^n)},

that is, {T_K} is of strong type {(p,p)}. We have seen that the convolution commutes with translations which implies that {T_K} commutes with translations. It is quite interesting that, in some sense, all translation invariant operators are given by a convolution with an appropriate `kernel’ {K} (which might not be a function).

Theorem 16 Let {T:L^p({\mathbb R}^n)\rightarrow L^q({\mathbb R}^n)}, {1\leq p,q\leq \infty}, be a bounded linear operator that commutes with translations. Then there exists a unique tempered distribution {K} such that

\displaystyle T(f)=f*K,\quad\mbox{for all}\quad f\in\mathcal S({\mathbb R}^n).

Thus translation invariant bounded linear operators of strong type {(p,q)} are in a one to one correspondence with the subclass of tempered distributions {K} such that

\displaystyle \|T_K(f)\|_q:=\|K*f\|_q\lesssim \|f\|_p,

for all {f\in {\mathcal S(\mathbb R^n)}.} In this case we will slightly abuse language and say that the tempered distribution {K} is of type {(p,q)}. It would be desirable to characterize this class of tempered distributions for all {1\leq p,q\leq \infty} but such a characterization is not known in general and probably does not exist. Here we gather some partial results in this direction:

Proposition 17 (`The high exponents are on the right’) Suppose that {T} is a non-zero linear operator which is translation invariant and of strong type {(p,q)}, with {p<\infty}. Then we must have that {p\leq q}. In particular, for {p<\infty} the class of tempered distributions of type {(p,q)} contains only the zero distribution whenever {p>q}.

Exercise 14 Prove Proposition 17 above.

Hint: Suppose that {T} is translation invariant and of strong type {(p,q)} with {p<\infty}. Let {f\in L^p({\mathbb R}^n)} with {Tf\neq 0} and consider the function

\displaystyle g(x)=\sum_{k=1} ^N f(x-x_k),

for some large positive integer {N} and points {x_1,\ldots,x_N} that will be chosen appropriately. Show that by choosing the points {x_1,\ldots ,x_N} to be far apart from each other (how far depends on {f}) we have that {\|g\|_p\simeq_{f,p} N^\frac{1}{p}\|f\|_p } while {\|Tg\|_q\simeq_{q,f} N^\frac{1}{q}\|Tf\|_q} for {N} large. However, if {T} is of strong type {(p,q)} this is only possible if {q\geq p}.

We also have a characterization of translation invariant operators in the following two special cases.

Theorem 18 ({p=q=2}) A distribution {K} is of type {(2,2)} if and only if there exists {m\in L^\infty({\mathbb R}^n)} such that {\hat K=m}. In this case, the norm of the operator

\displaystyle T_K:L^2({\mathbb R}^n)\cap {\mathcal S(\mathbb R^n)}\rightarrow L^2({\mathbb R}^n)

defined on {{\mathcal S(\mathbb R^n)}} as

\displaystyle T_K(f)=f*K,\quad f\in {\mathcal S(\mathbb R^n)},

is equal to {\|m\|_{L^{\infty}({\mathbb R}^n)}}. Moreover, {\widehat{T_K(f)}=\hat K \hat f}.

Theorem 19 ({p=q=1}) A distribution {K} is of type {(1,1)} if and only if it is a finite Borel measure. In this case, the norm of the operator

\displaystyle T_K:L^1({\mathbb R}^n)\cap {\mathcal S(\mathbb R^n)}\rightarrow L^1({\mathbb R}^n),

defined on {{\mathcal S(\mathbb R^n)}} as

\displaystyle T_K(f)=f*K,\quad f\in {\mathcal S(\mathbb R^n)},

is equal to the total variation {\|K\|} of the measure {K}.

For the proofs of these theorems and more details see [SW].

In this course we will not actually need that every translation invariant operator is a convolution operator since we will mostly consider specific examples where this is obvious. We will focus instead on the following case.

8.1. Multiplier Operators

Let {m\in L^\infty({\mathbb R}^n)}. For {f\in L^2({\mathbb R}^n)} we define

\displaystyle  \widehat {T_m(f)}(\xi)=m(\xi) \hat f(\xi),\quad \xi \in{\mathbb R}^n.

We will say that {T_m} is a multiplier operator associated to the multiplier {m}.

Observe that {T_m} is a well defined linear operator on {L^2({\mathbb R}^n)} and in fact it is bounded. Rather than relying on Theorem 18 let us see this directly:

\displaystyle  \begin{array}{rcl}  	\|T_m(f)\|_{L^2({\mathbb R}^n)}&=&\|\widehat{T_m(f)}\|_{L^2({\mathbb R}^n)}= \|m\hat f\|_{L^2({\mathbb R}^n)} \\ \\ &\leq& \|m\|_{L^\infty({\mathbb R}^n)}\|\hat f\|_{L^2({\mathbb R}^n)}=\|m\|_{L^\infty({\mathbb R}^n)}\|f\|_{L^2({\mathbb R}^n)}.	 \end{array}

In fact it is not hard to check that the opposite inequality is true so that {\|T_m\|_{L^2\rightarrow L^2}= \|m\|_{L^\infty({\mathbb R}^n)}}.

Exercise 15 If {T_m} is a multiplier operator associated to the multiplier {m\in L^\infty({\mathbb R}^n)} show that

\displaystyle \|T_m\|_{L^2\rightarrow L^2}\geq \|m\|_{L^\infty({\mathbb R}^n)}.

Thus {T_m} is a linear operator of type {(2,2)}. If {T_m} extends to a linear operator of type {(p,p)}, that is if there is an estimate of the form

\displaystyle  \|T_mf\|_{L^p({\mathbb R}^n)}\leq c_{p,m}\|f\|_{L^p({\mathbb R}^n)},

for all {f\in {\mathcal S(\mathbb R^n)}}, then we will say that {m} is a multiplier on {L^p}.

Remark 7 The previous discussion and in particular Theorem 18 shows that {T_m} is in fact given in the form

\displaystyle T_m(f)=f*K,

for some {K\in {\mathcal S'(\mathbb R^n)}}. In fact {K} will be the inverse Fourier transform of {m} in the sense of distributions.
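
As a concrete instance of this remark, consider the translation operator itself. The sketch below assumes the normalization {\hat f(\xi)=\int f(x)e^{-2\pi i x\cdot \xi}dx} and uses the formula {(\lambda*h)(x)=\lambda(\tau_x\tilde h)} of Theorem 15; the fixed vector {h\in{\mathbb R}^n} is introduced for illustration.

```latex
% Assumed convention: \hat f(\xi) = \int f(x) e^{-2\pi i x\cdot\xi} dx, (\tau_h f)(x) = f(x-h).
% The unimodular multiplier m corresponds to translation by h.
\[
m(\xi)=e^{-2\pi i h\cdot\xi}
\quad\Longrightarrow\quad
\widehat{T_m(f)}(\xi)=e^{-2\pi i h\cdot \xi}\hat f(\xi)=\widehat{\tau_h f}(\xi),
\qquad\text{so}\qquad T_m=\tau_h .
\]
% The kernel is the Dirac mass at h, whose Fourier transform is exactly m.
\[
K=\delta_h:\qquad (f*\delta_h)(x)=\delta_h(\tau_x\tilde f)=\tilde f(h-x)=f(x-h),
\qquad \hat\delta_h(\xi)=e^{-2\pi i h\cdot\xi}=m(\xi).
\]
```

Here {\|m\|_{L^\infty}=1}, in agreement with the fact that translations are isometries of {L^2({\mathbb R}^n)}, and the kernel {K=\delta_h} is a measure rather than a function.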


About ioannis parissis

I'm a postdoc researcher at the Center for mathematical analysis, geometry and dynamical systems at IST, Lisbon, Portugal.
This entry was posted in Dmat0101 - Harmonic Analysis, math.CA, Teaching.

11 Responses to DMat0101, Notes 4: The Fourier transform of the Schwartz class and tempered distributions

  1. Alex Yuffa says:

    In “For the proof of this theorem see [SW],” which book/paper is SW referring to? I would be very interested in taking a look at it. BTW, great post!!!

  2. Alex, The references are defined in the introductory post here (clicky). The book [SW] is “Stein and Weiss, An introduction to Fourier Analysis on Euclidean spaces”.

    • Alex Yuffa says:

      Thank you for such a quick response!!! Just noticed that email notifications come from “do not reply” email address.

  4. I stumbled across this while looking for information on band-limited signals. Perhaps this question is fodder for an article.

    If I have a function in t, f(t), that has a Fourier transform, F(w), and the Fourier transform is band-limited (i.e there is a finite value, w0, such that F(w)=0 for all |w| > w0), is the signal, f(t), continuous?

    My intuitive sense is that the “sharpest edge” that f(t) can have (the supremum of the derivative of f(t)) is the maximum of the slope available in the maximum frequency component of f(t); i.e. F(w0).
    This frequency component would be e^{2 \pi i t * w0} and this maximum slope would be w0*F(w0).

    Is there a way to make this intuition rigorous?

    • Dear John

      It's quite important to clarify in what sense you take the Fourier transform of some function f. For example, in the simplest case that f\in L^1(\mathbb R) say and F=\hat f has compact support, then clearly F\in L^1(\mathbb R) and the inversion formula is true. Thus f is equal to the inverse Fourier transform of F almost everywhere, and thus it agrees with a continuous function (in fact it is much smoother than that because of the Paley-Wiener theorem).

      One can answer this question in much greater generality, that is, without the a priori assumption that f \in L^1 (you will still however need some mild assumption on f to guarantee that you can embed it in the class of tempered distributions and take the Fourier transform in that weaker sense). Then all the Fourier transforms involved should be considered in the sense of distributions and your assumption would be something like “F is a distribution of compact support”. This case is also covered by a version of the Paley-Wiener theorem.

      Hope this helps.


      • johnwashburn says:

        It may. I thank you for the new direction (the Schwartz-Paley-Wiener Theorem). My Fourier transform is of an almost periodic function from number theory. The transform is thus limited to a spectrum of +-1, but consists of Dirac delta “functions” located at the rational points within the interval. I am not sure the transform is well behaved enough for Schwartz-Paley-Wiener, but perhaps it is sufficient for the Hormander generalization mentioned in the Wikipedia article. It is fortunate for me that the only property I desire to be proved for the underlying signal is that it be continuous for all t greater than some interval about zero. Hopefully the modesty of this goal is enough to overcome the difficulties presented by a Fourier transform that is a distribution and not continuous.

        Thank you very much for your consideration on this matter. After all a new direction is a new adventure in analysis, eh?

  5. I am sorry that last bit should have been the inverse transform of spectrum, wF(w), which is the Fourier transform of the derivative of f(t). But that very statement, “transform of the derivative of f(t)”, already makes presumptions on the continuity of f(t) I was hoping to avoid with the band limited question. Are there spectral only conditions of F(w) which are sufficient to demonstrate f(t) is continuous?

  6. Pablo says:

    The Fourier transform of a finite Borel measure is, by definition,
    \displaystyle \int_{\mathbb{R}^n} \exp{(i \xi \cdot x)} d\mu(x) \,.
    I can’t see the connection with its transform viewed as a tempered distribution. Should it be the same or the definition is made by analogy to L^1 functions thinking f(x)dx as the measure’s density?
    Great post btw.

    • ioannis parissis says:

      Dear Pablo, thanks for your comment.

      You’re right, the Fourier transform of a finite Borel measure can be defined directly, without appealing to the theory of distributions. As you commented, one thinks of an L^1 function as the density of the Borel measure f(x)dx. Replace this by d\mu(x) and you have a perfectly meaningful definition of the Fourier transform on the class of finite Borel measures. One can quite easily check that this definition coincides with the definition given by distribution theory. Indeed we have (by definition)
      \displaystyle \langle \widehat{d\mu},\phi \rangle := \langle \mu , \hat \phi \rangle = \int \big(\int e^{-2\pi i x\cdot \xi} \phi(x)dx\big) d\mu(\xi)= \int \big(\int e^{-2\pi i x\cdot \xi} d\mu(\xi)\big)\phi(x) dx.
