## DMat0101, Notes 1: Quick review of measure theory

The notes that will follow are meant to be a companion to the Harmonic Analysis course that I’m giving this semester at IST. These notes are inspired, influenced and sometimes shamelessly copied from books, lecture notes of other people, research papers and online material. The whole idea and structure of the course and, in particular, the use of the blog as a general platform of communication and interaction around the course, is highly influenced by similar efforts of other people and, especially, by Terence Tao’s blog as well as his lecture notes. Be sure to check the originals!

1. Introduction and notations

As mentioned in the overview of the course, we will be mainly concerned with operators acting on certain function spaces, or even spaces of rougher objects such as measures or distributions. Typically we will want to study the mapping properties of such an operator, that is, whether it maps one function space into another and so on. A typical estimate in this context is of the form

$\displaystyle \|Tf\|_Y\leq C \|f\|_X,$

where ${X,Y}$ are certain, usually Banach, spaces of functions, measures or distributions and ${\|\cdot\|_X, \|\cdot\|_Y}$ are norms, semi-norms or, in general, norm-like quantities. Thus such an estimate states that the operator ${T}$ takes functions (or ‘objects’) from the space ${X}$ to the space ${Y}$ in a continuous way. Already such an estimate can reveal quite a lot about the nature and the properties of the operator ${T}$.

When studying the mapping properties of an operator, it is oftentimes useful to restrict attention to a ‘nice’ subclass ${V}$ inside ${X}$. If ${X}$ for example consists of integrable functions, a good idea is to first consider the action of ${T}$ on the class of smooth functions with compact support, or on the class of simple functions. These subclasses are nice or explicit enough to allow us to overcome many technical difficulties in trying to define ${T(f)}$ for a general object ${f\in X}$. Furthermore, when these classes are dense in the original space, there is a very natural candidate for the extension of ${T}$ to the whole class. It turns out that this works whenever ${T}$ is bounded on the dense subclass. Another useful technique is to decompose a general function ${f\in X}$ into different pieces. Since ${T}$ is usually linear, we can then examine the effect of ${T}$ on each piece and sum the pieces together. Likewise, we can decompose the operator ${T}$ into different components, each component being easier to control than the ‘whole’ operator ${T}$. Finally, we can combine these two ideas and decompose both the function and the operator into different pieces. Usually good control on the different pieces is expected to imply good control on the original operator and/or function. There are however technical difficulties in putting the pieces together, justifying how each individual estimate sums up to a ‘global’ estimate.

Overall this course is all about estimates: Estimating the norm of a function, the norm of an operator, the norms of the different pieces of a decomposition of a function and so on. It is very useful to introduce some notation:

The Hardy notation. Here ${c>0}$ denotes a constant with an unspecified value. Such a constant, written ${c,c_1,c_2,\ldots}$, or ${C, A, B}$ and so on, usually represents a numerical constant that does not depend on any of the parameters of the problem. Using this notation, we will many times use a letter, ${c}$ for example, to denote a generic numerical constant. Different appearances of the letter ${c}$ will not necessarily denote the same numerical constant, even in the same line of text. For example a very useful estimate is the following

$\displaystyle \frac{2}{\pi}|x|\leq |\sin(x)|\leq |x|,\quad |x|\leq \frac{\pi}{2}.$

We will write estimates like this in the form

$\displaystyle c_1|x|\leq |\sin x|\leq c_2 |x|,\quad |x|\leq \frac{\pi}{2},$

which is just the statement that the function ${\sin x}$ behaves linearly close to ${0}$. The precise values of the constants, that is, the precise slopes of the linear functions appearing in the estimate, are rarely of any importance and they do not depend on anything interesting. Taking this one step further we would write for example

$\displaystyle |2\sin(x)/(1+x)|\leq c|x/(1+x)|\leq c|x|$

when ${x}$ is close to ${0}$ and

$\displaystyle |2\sin(x)/(1+x)|\leq c/|1+x|,$

when ${|x|\rightarrow \infty}$.
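Estimates like ${\frac{2}{\pi}|x|\leq |\sin x|\leq |x|}$ are also easy to sanity-check numerically. The following Python snippet, included purely as an illustration, verifies the two-sided bound on a grid of points in ${[-\pi/2,\pi/2]}$:

```python
import math

# Verify (2/pi)|x| <= |sin x| <= |x| on a grid of points in [-pi/2, pi/2].
for i in range(-1000, 1001):
    x = (math.pi / 2) * i / 1000
    s = abs(math.sin(x))
    # A tiny tolerance guards against floating point rounding at the
    # endpoints x = +-pi/2, where the lower bound is attained with equality.
    assert (2 / math.pi) * abs(x) <= s + 1e-12
    assert s <= abs(x) + 1e-12
```

Of course such a check is no substitute for the one-line proof via the concavity of ${\sin}$ on ${[0,\pi/2]}$.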

A variation of this notation is useful when a constant actually depends on one of the parameters of the problem. Thus we could write

$\displaystyle \|Tf\|_{Y}\leq c_{X,Y,T}\|f\|_X,$

which means that the constant ${c_{X,Y,T}}$ may depend on ${X,Y}$ and ${T}$ but not on the function ${f}$. One should be careful with estimates like this. For example, the notation

$\displaystyle 2^n\leq c_n,$

is correct though not very useful, as the notation ${c_n}$ ‘hides’ the dependence of the constant ${c_n}$ on ${n}$ (for example whether it is bounded in ${n}$, whether it grows to infinity in ${n}$ and so on). On the other hand, the notation

$\displaystyle 2^n\leq c$

is wrong though the estimate is actually true for fixed ${n}$. Such a notation would imply that the sequence ${2^n}$ is uniformly bounded in ${n}$ which is of course not true. Such a notation is true for example in the case

$\displaystyle |\sin(2\pi n)|\leq c.$

The Vinogradov notation. Suppose that we have an estimate of the form ${Y\leq c X}$ where ${X,Y}$ could be norms of functions, or operators and so on. We will write this estimate in the form

$\displaystyle Y\lesssim X.$

Similarly we write ${Y\gtrsim X}$ whenever ${Y\geq c X}$. If we have both ${Y\lesssim X}$ and ${Y\gtrsim X}$ then we will use the notation ${X\simeq Y}$. This latter notation states that the quantities ${X}$, ${Y}$ are equivalent up to constants.

For example, we could write ${2\sin(2\pi n)\lesssim 1}$ and also ${\sin x \simeq x}$ for ${x}$ close to ${0}$.

If we want to state a dependence on a parameter we use a subscript. For example we write

$\displaystyle \|Tf\|_Y\lesssim_{X,Y,T}\|f\|_X,$

to denote the dependence of the implied constant on ${X,Y}$ and ${T}$.

Some care is needed when iterating this notation. While this is legitimate for a finite number of steps, an infinite number of steps can create many problems. Beware of this situation especially in inductive arguments. Never hide the dependence on the induction parameter in the Vinogradov notation!

The Landau – big ${O}$ – notation. In this notation, writing ${Y=O(X)}$ means that there exists a numerical constant ${C>0}$ such that ${|Y|\leq C X}$. The big ${O}$ notation however is mostly useful when we want to denote a main term and an error term, and keep track of everything in a nice way. Imagine for example that we want to study the function ${\sin x}$ for ${x}$ close to zero, say ${|x|<\frac{\pi}{2}}$. The Taylor expansion of ${\sin x}$ around zero is of the form

$\displaystyle \sin x= x-\frac{x^3}{3!}+\frac{x^5}{5!}-\frac{x^7}{7!}+\cdots.$

While it is correct that ${\sin x =O(|x|)}$ as ${x\rightarrow 0}$, what happens if we want to keep track of lower order terms? Well, we could use the big-${O}$ notation to write

$\displaystyle \sin x = x+ O({|x|^3}).$

Note that this is correct since the higher order terms ${x^5, x^7}$ and so on are always controlled by ${x^3}$ when ${x\rightarrow 0}$. This is a very useful device if we want to ‘carry’ the lower order terms in our calculations. For example, since

$\displaystyle \sin x=x+O(|x|^3),\quad \cos x=1-\frac{x^2}{2}+O(x^4),\quad x\rightarrow 0,$

we can write

$\displaystyle \sin x \cos x=(x+O(|x|^3))\Big(1-\frac{x^2}{2}+O(|x|^4)\Big)=x+O(|x|^3).$

If we want to state the dependence on some parameter we use subscripts again. Thus we could write ${Y=O_n(X)}$ meaning that ${|Y|\leq c_n X}$. Also note that the bound ${\|Tf\|_Y\lesssim_{X,Y,T}\|f\|_X}$ can be written in the form ${\|Tf\|_Y=O_{X,Y,T} (\|f\|_X)}$.
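To make the ${O(|x|^3)}$ term in the expansion of ${\sin x}$ concrete, note that for ${|x|\leq 1}$ the alternating Taylor series gives the explicit remainder bound ${|\sin x - x|\leq |x|^3/6}$. A short Python check of this bound:

```python
import math

# For |x| <= 1 the Taylor series of sin is alternating with decreasing
# terms, so the remainder in sin x = x + O(|x|^3) is at most |x|^3 / 6.
for i in range(-100, 101):
    x = i / 100
    assert abs(math.sin(x) - x) <= abs(x) ** 3 / 6 + 1e-15
```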

2. Recalling notions from measure theory

We begin this introductory lecture by recalling some basic facts from measure theory. As mentioned in the description of the course, our first task will be to recall all the basic notions and tools from integration theory and ${L^p}$ spaces, thus defining our main setup. Our basic environment is a measure space ${(X,\mathcal X,\mu)}$, that is a set ${X}$ together with a ${\sigma}$-algebra ${\mathcal X}$ of sets in ${X}$ and a non-negative measure ${\mu}$ on ${X}$. The measure ${\mu}$ will always be assumed to be ${\sigma}$-finite (${X}$ can be decomposed as a countable union of sets of finite ${\mu}$-measure). Recall that our subject is Euclidean harmonic analysis so, in most cases, the underlying space ${X}$ will be the ${n}$-dimensional Euclidean space, ${\mu}$ will be the Lebesgue measure on ${\mathbb R^n}$ and ${\mathcal X}$ will be either the ${\sigma}$-algebra of Lebesgue-measurable sets, or the ${\sigma}$-algebra of Borel-measurable sets in ${\mathbb R^n}$.

Typically we will consider measurable functions ${f:(X,\mathcal X,\mu)\rightarrow (Z,\mathcal Z,\nu)}$; recall here that measurable means that the pre-image of every measurable set (thus of every set in ${\mathcal Z}$) is a measurable set (that is, a set in ${\mathcal X}$). However, we will mostly consider functions ${f:X\rightarrow {\mathbb C}}$, where it is understood that ${{\mathbb C}}$ is equipped with the Borel ${\sigma}$-algebra. Again, the special case of Lebesgue-measurable complex valued functions on ${\mathbb R^n}$ is of particular importance. Thus the main example to keep in mind is a Lebesgue-measurable, complex valued function

$\displaystyle f:\mathbb R^n\rightarrow \mathbb C,$

where ${\mathbb R^n}$ is equipped with the Lebesgue ${\sigma}$-algebra and ${{\mathbb C}}$ is equipped with the Borel ${\sigma}$-algebra of sets in ${{\mathbb C}}$. Note these definitions and conventions here since we won’t repeat them every time we consider measurable functions.

Let us go back to the case of a general measure space ${(X,\mathcal X, \mu)}$. If not otherwise stated, a set in ${X}$ will mean a measurable set in ${\mathcal X}$ and a function ${f}$ will mean a measurable complex valued function. For a set ${E}$ in ${X}$, the indicator function of ${E}$ will be denoted by ${\mathbf 1_E(x)}$ or ${\chi_E(x)}$:

$\displaystyle \begin{array}{rcl} \mathbf 1_E (x)=\chi_E(x)=\begin{cases}1,\quad\mbox{if}\quad x\in E,\\ 0,\quad \mbox{if}\quad x\notin E.\end{cases} \end{array}$

A simple function is then a finite linear combination of indicator functions, that is a function ${g(x)}$ defined as

$\displaystyle g(x)=\sum_{j=1} ^N c_j \chi_{E_j}(x),$

where ${c_1,\ldots,c_N\in{\mathbb C}}$ and ${E_1,\ldots, E_N}$ are (measurable) sets. A standard way to identify a set in ${X}$ with a measurable function on ${X}$ is via the map ${E\mapsto \chi_E}$.

Two functions (or sets) in a measure space will be considered one and the same object if they agree almost everywhere. For example, consider a set ${E}$ in ${X}$ and a subset ${E'\subset E}$ with ${\mu(E\setminus E')=0}$. For the purposes of this course, the functions ${\chi_E}$ and ${\chi_{E'}}$ are one and the same function. If you want to be more rigorous, you have to think of a measurable function as an equivalence class of functions, where two measurable functions ${f,f'}$ are equivalent if and only if ${f=f'}$, ${\mu}$-almost everywhere (${\mu}$-a.e.). That is, ${f=f'}$ everywhere on ${X}$ except maybe on a set of measure zero. We will however abuse language a bit and just refer to ${f}$ as a function, arbitrarily choosing a representative from every equivalence class. Moreover, we can choose the member of the class that is most convenient for our purposes. To give an example of the usefulness of this principle, think of the equivalence class of functions ${f}$, say on ${\mathbb R}$, that agree with ${0}$ almost everywhere. One can think of functions that behave very erratically and are equal to ${0}$ almost everywhere. However, the function ${f}$ that is identically equal to ${0}$ everywhere still belongs to the same equivalence class and is continuous, thus it qualifies as a ‘nice’ representative of this equivalence class.

For continuous functions however, there is no ambiguity.

Exercise 1 Let ${X,Y}$ be two topological spaces and suppose that ${Y}$ is Hausdorff. Assume that ${\mu}$ is a Borel measure on ${X}$ such that ${\mu(U)>0}$ for every open set ${U\subset X}$. Prove that if ${f,g:X\rightarrow Y}$ are continuous and ${f=g}$ ${\mu}$-a.e. on ${X}$, then ${f=g}$ on ${X}$.

Hint: Since the space ${Y}$ is Hausdorff, open sets ‘separate points’: for every ${y_1,y_2\in Y}$ with ${y_1\neq y_2}$ there exist disjoint open neighborhoods ${V_{y_1},V_{y_2}}$ of ${y_1,y_2}$, respectively.

3. ${L^p}$ spaces

Let us begin by fixing a measure space ${(X,\mathcal X,\mu)}$. We assume as usual that ${\mu}$ is a non-negative ${\sigma}$-finite measure on ${X}$. The most important spaces of functions in this course will be the spaces ${L^p=L^p(X,\mu)}$, ${0 < p < \infty}$, defined as the spaces of measurable functions ${f:X \rightarrow {\mathbb C}}$ such that

$\displaystyle \| f\|_{L^p(X,\mu)}= \bigg(\int_X|f(x)|^p d\mu(x) \bigg)^\frac{1}{p}<+\infty.$

For ${p=\infty}$ we define the space of essentially bounded functions ${f:X \rightarrow {\mathbb C}}$, that is the space of measurable functions ${f}$ such that

$\displaystyle \| f\|_{ L^\infty(X,\mu)}= \mathop{\mathrm{ess}\sup}_{x\in X}|f(x)|<+\infty.$

Recall here that the essential supremum of a function ${f}$ is the smallest positive number ${c}$ such that ${|f(x)|\leq c}$, ${\mu}$-almost everywhere:

$\displaystyle \mathop{\mathrm{ess}\sup}_{x\in X}|f(x)|= \inf \big \{c>0: \mu (\{ x\in X: |f(x)|> c \})= 0 \big\}.$

We will alternatively use the notations ${\|f\|_{L^p}}$ or even ${\|f\|_p}$ whenever the underlying measure space is clear from context or not relevant for a statement.

Exercise 2 Let ${f}$ be a simple function of finite measure support, that is, a finite linear combination of indicator functions of sets of finite measure. Show that

$\displaystyle \lim_{p\rightarrow \infty}\|f\|_p=\|f\|_{\infty},$

and that

$\displaystyle \lim_{p\rightarrow 0}\|f\|_p ^p = \mu({\mathrm{supp}}(f)),$

where

$\displaystyle {\mathrm{supp}}(f)=\{x\in X:f(x)\neq 0\}.$
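As a concrete instance of Exercise 2, take the simple function ${f=3\chi_{[0,1]}+\chi_{[1,2]}}$ on ${\mathbb R}$ (a choice made up purely for illustration), for which ${\|f\|_p ^p=3^p+1}$. Both limits can then be observed numerically, e.g. in Python:

```python
# For f = 3*1_[0,1] + 1*1_[1,2] on (R, Lebesgue) we have ||f||_p^p = 3^p + 1.
def norm_p(p):
    return (3.0 ** p + 1.0) ** (1.0 / p)

# As p -> infinity, ||f||_p tends to ||f||_infty = 3.
assert abs(norm_p(200.0) - 3.0) < 1e-2
# As p -> 0+, ||f||_p^p = 3^p + 1 tends to the measure of the support,
# which is |[0,2]| = 2.
assert abs((3.0 ** 1e-3 + 1.0) - 2.0) < 1e-2
```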

As we shall shortly see, for ${1\leq p \leq \infty}$, the quantities ${\|\cdot\|_{L^p(X,\mu)}}$ are norms. In order to show this, the only difficulty is the triangle (or Minkowski, in this case) inequality. For ${0<p<1}$ these quantities are not norms any more but we have a quasi-triangle inequality, that is a triangle inequality with a constant different than ${1}$, and the spaces ${L^p(X,\mu)}$ are still complete vector spaces.

Lemma 1 Let ${(X,\mathcal X,\mu)}$ be a measure space. For ${1\leq p < \infty}$, the quantity ${\|\cdot\|_{L^p(X,\mu)}}$ is a norm. In particular we have the following, for all functions ${f,g\in L^p(X,\mu)}$:

(i) (Point Separation)

$\displaystyle \|f\|_{L^p(X,\mu)}=0\iff f=0.$

(ii) (Positive Homogeneity) For all ${c\in{\mathbb C}}$ we have

$\displaystyle \|c f\|_{L^p(X,\mu)}=|c|\|f\|_{L^p(X,\mu)}.$

(iii) (Triangle inequality)

$\displaystyle \|f+g\|_{L^p(X,\mu)}\leq \|f\|_{L^p(X,\mu)}+\|g\|_{L^p(X,\mu)}.$

For ${0<p<1}$, (i) and (ii) still hold true. The triangle inequality is replaced by

(iii’)(Quasi-triangle inequality)

$\displaystyle \|f+g\|_{L^p(X,\mu)}\lesssim_p \|f\|_{L^p(X,\mu)}+\|g\|_{L^p(X,\mu)}.$

Proof: The statements (i) and (ii) are trivial, given the fact that we identify functions that agree ${\mu}$-a.e. For (iii) we can assume that ${f,g}$ are non-zero because of (i), otherwise there is nothing to prove. The case ${p=\infty}$ of (iii) is trivial so we assume that ${1\leq p<\infty}$. Because of the homogeneity property (ii), it is enough to prove that ${\|f+g\|_p\leq 1}$ whenever ${\|f\|_p+\|g\|_p=1}$. Since ${f,g}$ are non-zero this means that there exists ${\theta}$ with ${0<\theta<1}$ such that ${\|f\|_p=\theta}$ and ${\|g\|_p=1-\theta}$. Setting ${F=f/\theta}$ and ${G=g/(1-\theta)}$ the problem reduces to showing that

$\displaystyle \int |\theta F(x)+(1-\theta) G(x)|^p d\mu(x) \leq 1, \ \ \ \ \ (1)$

whenever

$\displaystyle \|F\|_p=\|G\|_p=1.$

We will show (1) by using a basic convexity estimate. For ${s\in (0,\infty)}$ we consider the function given by the formula ${h(s)=s^p}$ where ${1\leq p<\infty}$. Then the function ${h}$ is convex. This means in particular that for ${s_1,s_2>0}$ and ${0<\theta<1}$ we have ${h(\theta s_1+(1-\theta)s_2)\leq \theta h(s_1)+(1-\theta)h(s_2)}$. Using the complex triangle inequality and the convexity of ${h}$ we can thus write

$\displaystyle \begin{array}{rcl} \int |\theta F(x)+(1-\theta) G(x)|^p d\mu(x)&\leq& \int (\theta|F(x)| +(1-\theta)|G(x)|)^pd\mu(x)\\ \\ &\leq & \theta \int |F(x)|^pd\mu(x)\\ \\ && +(1-\theta)\int|G(x)|^pd\mu(x)\\ \\ &=& \theta+(1-\theta)=1, \end{array}$

because of the normalization ${\|F\|_p=\|G\|_p=1}$.

The quasi-triangle inequality is an easy consequence of the basic estimate ${(a+b)^p\leq a^p+b^p,}$ for ${a,b>0}$ and ${0<p<1}$, and is left as an exercise. $\Box$
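The basic estimate behind (iii’), ${(a+b)^p\leq a^p+b^p}$ for ${a,b>0}$ and ${0<p<1}$, can also be tested on random inputs; the following Python snippet is such a sanity check:

```python
import random

random.seed(0)
# Check (a + b)^p <= a^p + b^p for 0 < p < 1 on random positive inputs.
for _ in range(10_000):
    a = random.uniform(1e-6, 10.0)
    b = random.uniform(1e-6, 10.0)
    p = random.uniform(0.01, 0.99)
    # Small tolerance for floating point rounding.
    assert (a + b) ** p <= a ** p + b ** p + 1e-9
```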

Exercise 3 Show that the triangle inequality is an equality if and only if ${f=g=0}$ or ${f=cg}$ for some ${c>0}$.

Hint: Check carefully when the inequalities in the previous proof become equalities. Use the fact that for ${f\geq 0}$ we have ${\int f=0\iff f=0}$ a.e.

For ${1\leq p \leq \infty}$, the spaces ${L^p(X,\mu)}$ are Banach spaces, that is they are normed vector spaces which are complete with respect to the corresponding norm ${\| \cdot\|_{L^p(X,\mu)}}$. For ${0<p<1}$ we don’t have a triangle inequality. However, the quasi-triangle inequality allows us to show that the spaces ${L^p(X,\mu)}$ are (quasi-normed) complete vector spaces.

Proposition 2 For ${1\leq p\leq\infty}$ the space ${L^p(X,\mu)}$ is a Banach space. For ${0<p<1}$ the space ${L^p(X,\mu)}$ is a complete quasi-normed vector space. Furthermore, for ${0<p<\infty}$ the preceding spaces are separable. Separability fails however for ${p=\infty}$.

A very useful variation of Minkowski’s inequality is one where we ‘replace’ the sum by an integral (which, in a way, is also a sum!). Roughly speaking this inequality states that the norm of an integral is always smaller than or equal to the integral of the norm.

Proposition 3 (Minkowski’s integral inequality) Let ${(X,\mathcal {X},\mu)}$ and ${(Y,\mathcal {Y},\nu)}$ be two measure spaces where the measures ${\mu,\nu}$ are ${\sigma}$-finite non-negative measures. Let ${f}$ be a ${\mathcal X\otimes \mathcal Y}$-measurable function on the product space ${X\times Y}$.

(i) If ${f\geq 0}$ and ${1\leq p <\infty}$, then

$\displaystyle \bigg(\int_X \bigg | \int_Y f(x,y) d\nu(y) \bigg |^p d\mu(x) \bigg)^\frac{1}{p} \leq \int_Y \bigg(\int_X |f(x,y)|^pd\mu(x)\bigg)^\frac{1}{p}d\nu(y).$

(ii) If ${1\leq p \leq \infty}$, ${f(\cdot,y)\in L^p(X,\mu)}$ for ${\nu}$-a.e. ${y\in Y}$, and the function ${y\mapsto\|f(\cdot,y)\|_{L^p(X,\mu)}}$ is in ${L^1(Y,\nu)}$, then ${f(x,\cdot)\in L^1(Y,\nu)}$ for ${\mu}$-a.e. ${x\in X}$, the function ${x\mapsto \int_Y f(x,y)d\nu(y)}$ is in ${L^p(X,\mu)}$ and

$\displaystyle \bigg\| \int_Y f(\cdot,y)d\nu(y)\bigg\|_{L^p(X,\mu)} \leq \int_Y \| f(\cdot,y)\|_{L^p(X,\mu)} d\nu(y).$

Written as in (ii), Minkowski’s integral inequality also highlights the similarity to the classical triangle inequality, where one just has to think of the integral as a ‘generalized sum’. This is also a good trick to help you memorize the inequality. Observe that the triangle inequality is just a special case of the integral version of Minkowski’s inequality where the measure ${\nu}$ is the counting measure. You can find the proof of this inequality in most textbooks of real analysis. See for example [F].
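When both ${\mu}$ and ${\nu}$ are counting measures on finite sets, ${f}$ is just a finite matrix ${(f_{xy})}$ and (ii) reduces to ${\big(\sum_x \big|\sum_y f_{xy}\big|^p\big)^{1/p}\leq \sum_y \big(\sum_x |f_{xy}|^p\big)^{1/p}}$. This discrete case is easy to test numerically; an illustrative Python check on random data:

```python
import random

random.seed(1)
p = 2.5
# A random 7x5 matrix f_{xy}; rows are indexed by x, columns by y.
f = [[random.uniform(-1.0, 1.0) for _ in range(5)] for _ in range(7)]

# Left-hand side: the l^p norm in x of the function x -> sum_y f_{xy}.
lhs = sum(abs(sum(row)) ** p for row in f) ** (1 / p)
# Right-hand side: the sum over y of the l^p norms in x of the columns.
rhs = sum(
    sum(abs(f[x][y]) ** p for x in range(7)) ** (1 / p) for y in range(5)
)
assert lhs <= rhs + 1e-9
```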

After the triangle inequality, the next most important inequality in the spaces ${L^p(X,\mu)}$, is Hölder’s inequality.

Lemma 4 Let ${f\in L^p(X,\mu)}$ and ${g\in L^q(X,\mu)}$ for some ${0<p,q\leq \infty}$. Define the exponent ${r}$ by means of the ‘Hölder relationship’

$\displaystyle \frac{1}{r}=\frac{1}{p}+\frac{1}{q}.$

Then ${fg\in L^r(X,\mu)}$ and we have the norm estimate

$\displaystyle \|fg\|_{L^r(X,\mu)}\leq \|f\|_{L^p(X,\mu)}\|g\|_{L^q(X,\mu)}.$
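In the sequence setting of Example 2 below (counting measure on ${\mathbb Z}$), Lemma 4 reads ${\big(\sum_k|a_kb_k|^r\big)^{1/r}\leq \big(\sum_k|a_k|^p\big)^{1/p}\big(\sum_k|b_k|^q\big)^{1/q}}$ whenever ${1/r=1/p+1/q}$. A quick numerical illustration in Python, with the exponents chosen arbitrarily:

```python
import random

random.seed(2)
p, q = 3.0, 1.5
r = 1.0 / (1.0 / p + 1.0 / q)  # the Hoelder relationship 1/r = 1/p + 1/q

a = [random.uniform(-2.0, 2.0) for _ in range(50)]
b = [random.uniform(-2.0, 2.0) for _ in range(50)]

def norm(seq, s):
    # The l^s norm of a finite sequence.
    return sum(abs(t) ** s for t in seq) ** (1.0 / s)

lhs = norm([x * y for x, y in zip(a, b)], r)
assert lhs <= norm(a, p) * norm(b, q) + 1e-9
```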

Exercise 4 Prove Lemma 4 above.

Hint: Note that the case ${p=q=r=\infty}$ is trivial. Assuming that ${p,q,r<\infty}$, homogeneity allows us to normalize ${\|f\|_{L^p(X,\mu)}=\|g\|_{L^q(X,\mu)}=1}$, the case ${f=0}$ or ${g=0}$ being trivial. Normalizing and setting ${F(x)=|f(x)|^p, G(x)=|g(x)|^q}$, it is enough to prove that ${\int_X F^x G^{1-x} \leq 1}$, whenever ${\int G=\int F =1}$, for a suitable ${x}$. Complete the proof using the fact that the function ${x\mapsto \alpha^x \beta^{1-x}}$ is convex, where ${\alpha,\beta}$ are positive real numbers. To show this you can use the convexity of the function ${x\mapsto e^x}$.

Remark 1 Observe that Hölder’s inequality is invariant under the transformations ${f\mapsto c_1f}$ and ${g\mapsto c_2g}$ for any constants ${c_1,c_2>0}$. Note also that this inequality refers to a general measure space ${(X,\mu)}$. Replacing the measure ${\mu}$ by the measure ${\tilde \mu=\lambda\mu}$ for some constant ${\lambda>0}$, observe that ${f\in L^p(\mu)\iff f\in L^p(\tilde \mu)}$. Using this invariance and applying Hölder’s inequality to ${f=g=\chi_A}$ with ${\mu(A)=1}$, we get

$\displaystyle \lambda ^\frac{1}{r} \leq \lambda^{\frac{1}{p}+\frac{1}{q}},$

for all ${\lambda >0}$. We conclude that we must have the Hölder relation between the exponents ${r,p,q}$,

$\displaystyle \frac{1}{r}=\frac{1}{p}+\frac{1}{q},$

whenever Hölder’s inequality holds true.

3.1. Log-convexity of ${L^p}$-norms

We will now discuss a characteristic of ${L^p}$ norms that is implicit in many parts of the discussion on ${L^p}$ spaces, especially interpolation which we will discuss at the end of this first set of notes. It is already hidden in the proof of Hölder’s inequality above.

Let us start with a function ${F: {\mathbb R}\rightarrow{\mathbb R}}$. The function ${F}$ is called convex if for every ${x,y\in {\mathbb R}}$ and any ${0\leq \theta\leq 1}$ we have that

$\displaystyle F((1-\theta) x+ \theta y )\leq (1-\theta )F(x)+\theta F(y).$

The same definition makes perfect sense whenever the function ${F}$ is defined on some interval of the real line or, in fact, on any convex subset of a vector space. Observe that the definition states that the chord connecting the points ${(x,F(x))}$ and ${(y,F(y))}$ on the graph of ${F}$ always lies above the graph of the function itself. Now if a function ${F}$ is positive, we will say that ${F}$ is log-convex if the function ${x\mapsto \log F(x)}$ is convex. In this case we must have

$\displaystyle F((1-\theta) x +\theta y)\leq F(x)^{1-\theta} F(y)^\theta.$

Proposition 5 (Log-convexity of ${L^p}$-norms) Let ${0<p_1<p_3\leq \infty}$ and define ${p_2}$, ${p_1\leq p_2 \leq p_3}$, as

$\displaystyle \frac{1}{p_2}=\frac{1-\theta}{p_1}+\frac{\theta}{p_3},$

where ${0<\theta<1}$. Thus ${{1}/{p_2}}$ is a convex combination of ${1/p_1}$ and ${1/p_3}$. Then, if ${f\in L^{p_1}\cap L^{p_3}}$, we have

$\displaystyle \|f\|_{L^{p_2}(X,\mu)}\leq \|f\|^{1-\theta} _{L^{p_1}(X,\mu)}\|f\|^\theta _{L^{p_3}(X,\mu)}.$

Note that this means that the function ${\frac{1}{p}\mapsto \|f\|_{L^p(X,\mu)}}$ is log-convex.

Proof: Observing that ${\frac{(1-\theta )p_2}{ p_1}+\frac{\theta p_2}{p_3}=1}$, we apply Hölder’s inequality to ${|f|^{p_2}=|f|^{(1-\theta )p_2+ \theta p_2}}$ to get

$\displaystyle \begin{array}{rcl} \int |f|^{p_2}=\int |f|^{ (1-\theta) p_2 }|f|^{\theta p_2 }\leq \bigg(\int |f|^{p_1}\bigg)^\frac{(1-\theta) p_2}{p_1}\bigg(\int |f|^{p_3}\bigg)^\frac{\theta p_2}{p_3}, \end{array}$

which proves the desired estimate. $\Box$
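Proposition 5 can likewise be tested in the discrete setting. With the (arbitrary) choices ${p_1=1}$, ${p_3=4}$ and ${\theta=1/2}$, the relation above gives ${p_2=8/5}$, and the log-convexity bound should hold for any finite sequence; in Python:

```python
import random

random.seed(3)
p1, p3, theta = 1.0, 4.0, 0.5
p2 = 1.0 / ((1.0 - theta) / p1 + theta / p3)  # p2 = 8/5 here

a = [random.uniform(-1.0, 1.0) for _ in range(100)]

def norm(seq, p):
    # The l^p norm of a finite sequence.
    return sum(abs(t) ** p for t in seq) ** (1.0 / p)

# Log-convexity: ||a||_{p2} <= ||a||_{p1}^(1-theta) * ||a||_{p3}^theta.
assert norm(a, p2) <= norm(a, p1) ** (1 - theta) * norm(a, p3) ** theta + 1e-9
```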

We will give another proof that employs a notion of convexity in complex analysis and, in particular, the maximum principle. We state the following lemma which will also be useful in the rest of the notes.

Lemma 6 (Three lines lemma) Suppose that ${F}$ is a bounded continuous complex-valued function on the closed strip ${S=\{x+iy=z\in{\mathbb C}:0\leq x\leq 1\}}$, that is analytic in the interior of ${S}$. Suppose that ${F}$ obeys the bounds ${|F(iy)|\leq A}$ and ${|F(1+iy)|\leq B}$ for all ${y\in{\mathbb R}}$. Then we have that ${|F(x+iy)|\leq A^{1-x}B^x}$ for all ${z=x+iy\in S}$.

Proof: First of all we can assume that ${A,B>0}$ otherwise there is nothing to prove. Now, consider the function ${G(z)=F(z)/A^{1-z}B^z}$ for ${z\in S}$. Thus it suffices to show that ${|G(z)|\leq 1}$ for all ${z\in S}$, whenever ${|G(iy)|\leq 1}$ and ${|G(1+iy)|\leq 1}$. First consider the case that ${\lim_{|y|\rightarrow + \infty}|G(x+iy)|=0}$ uniformly in ${0\leq x \leq 1}$. Then the result follows from the maximum principle. Indeed, there is some ${y_o>0}$ such that ${|G(x+iy)|\leq 1}$ for all ${|y|\geq y_o}$. Now ${G}$ is bounded by ${1}$ on the boundary of the rectangle ${[0,1]\times[-y_o,y_o]}$ and the maximum principle implies that ${G}$ is bounded by ${1}$ in the interior of the rectangle as well. Thus, in this case, ${G}$ is bounded by ${1}$ throughout the strip ${S}$. To get rid of the condition ${\lim_{|y|\rightarrow + \infty}|G(x+iy)|=0}$ consider the sequence of functions ${G_n(z)=G(z)e^{(z^2-1)/n}}$, for ${n\in {\mathbb N}}$. Since ${G}$ is bounded, say ${|G(z)|\leq M}$, we have that

$\displaystyle \begin{array}{rcl} |G_n(z)|=|G(z)|e^{-y^2/n}e^{(x^2-1)/n}\leq M e^{-y^2/n}\rightarrow 0, \end{array}$

as ${|y|\rightarrow+\infty}$, uniformly in ${0\leq x \leq 1}$. Observe that we still have the bounds ${|G_n(iy)|\leq 1}$ and ${|G_n(1+iy)|\leq 1}$ for ${y\in {\mathbb R}}$, uniformly in ${n\in{\mathbb N}}$. Thus we conclude that ${|G_n(z)|\leq 1}$ for all ${z\in S}$ and all ${n}$. Letting ${n\rightarrow+\infty}$ we get that ${|G(z)|\leq 1}$. $\Box$

Remark 2 Observe that if we define the function ${\phi(x)=\sup \{|F(x+iy)|:y\in{\mathbb R} \}}$, then under the hypotheses of the three lines lemma, we get that ${\phi}$ is log-convex. Another point to observe here is that the hypothesis we have stated is not quite optimal. Indeed, we can actually relax the condition that ${F}$ is bounded to the growth condition ${|F(z)|\lesssim_F e^{O_F(e^{(\pi-\delta)|z|})}}$ for some ${\delta>0}$ when ${z\in S}$. The idea of the proof is exactly the same. One first proves the result in the case that ${\lim_{|y|\rightarrow+\infty}F(x+iy)=0}$. Then we apply this to the sequence of functions ${F_n(z)=e^{\frac{1}{n}e^{i[(\pi-\frac{1}{n})z+\frac{1}{2n}]}} F(z)}$.

Proof: We begin by making some reductions. Observe that the inequality we want to prove is invariant under the transformations ${f\mapsto cf}$ and ${\mu \mapsto \lambda \mu}$ for any constants ${c,\lambda >0}$. Using these invariances it is enough to show that if ${\|f\|_{L^{p_1}}= \|f\|_{L^{p_3}}=1}$ then we have that ${\int |f|^{p_2}\leq 1}$, for all ${p_2}$ with ${p_1\leq p_2\leq p_3}$. To do this, consider the entire function

$\displaystyle z\mapsto F(z)=\int_X |f|^{(1-z)p_1+z p_3} d\mu.$

Assuming that ${f}$ is a simple function with finite measure support, it is easy to see that the map ${F}$ is bounded throughout the strip ${S=\{x+iy:0\leq x \leq 1, y\in{\mathbb R} \}}$. Observe also that we have the bounds ${|F(iy)|\leq \|f\|_{p_1} ^{p_1}=1}$ and ${|F(1+iy)|\leq \|f\|_{p_3} ^{p_3}=1}$. Using the three lines lemma we conclude that

${|F(x+iy)|\leq 1 }$

for all ${y\in {\mathbb R}}$ and ${0\leq x \leq 1}$. Applying this bound for ${y=0}$ gives

$\displaystyle \int_X |f|^{(1-x)p_1+x p_3} d\mu \leq 1$

for all simple functions ${f}$ of finite measure support and all ${0\leq x\leq 1}$. But this means that

$\displaystyle \int_X |f|^{p_2} d\mu \leq 1$

for all ${p_1\leq p_2\leq p_3}$ and for simple functions of finite measure support, as we wanted to show. A limiting argument gives the log-convexity for general functions. $\Box$

Remark 3 In fact, one can go in the other direction and prove Hölder’s inequality by means of the log-convexity of the ${L^p}$-norm. Also, as in the case of Hölder’s inequality, it is not hard to verify that whenever such an estimate is true, the indices ${p_1,p_2,p_3}$ must be related as

$\displaystyle \frac{1}{p_2}=\frac{1-\theta}{p_1}+\frac{\theta}{p_3}.$

To see this, apply the inequality replacing the measure ${\mu}$ by ${\lambda \mu}$, where ${\lambda>0}$.

Exercise 5 Use the three lines lemma to give a different proof of Hölder’s inequality.

Hint: Show Hölder’s inequality initially for simple functions with finite measure support. For this, apply the three lines lemma to the function

$\displaystyle F(z)=\int_X |f|^{p(1-z)}|g|^{qz} d\mu ,$

for ${z\in S=\{x+iy=z\in{\mathbb C}: 0\leq x \leq 1, y\in{\mathbb R}\}}$. Use the fact that simple functions with finite measure support are dense in ${L^p}$, ${1\leq p<\infty}$. Fill in the details of the limiting argument (omitted in the previous proof).

3.2. Heuristic discussion and examples of ${L^p}$ spaces

Let us now see a couple of specific examples of ${L^p}$ spaces which will come up often in this course.

Example 1 The most common setting for this course will be the Euclidean setting, that is the measure space ${(\mathbb R^n,\mathcal L, dx)}$. A typical point in ${\mathbb R^n}$ will be denoted by ${x=(x_1,\ldots,x_n)}$ and ${dx=dx_1\cdots dx_n}$ denotes the ${n}$-dimensional Lebesgue measure. For a set ${E}$ in ${\mathbb R^n}$ we will many times write ${|E|}$ for its Lebesgue measure. Here, ${\mathcal L}$ is the Lebesgue ${\sigma}$-algebra of subsets of ${\mathbb R^n}$ and we will oftentimes omit it from the notation.

Example 2 Consider the measure space ${(\mathbb Z, \mathcal D,\nu)}$, where ${\mathcal D}$ is the ${\sigma}$-algebra of all subsets of ${\mathbb Z}$. Here ${\nu}$ is the counting measure. Recall that for ${E\subset\mathbb Z}$, the counting measure of ${E}$ is the cardinality of ${E}$, typically denoted by ${|E|}$, if ${E}$ is finite, and ${\nu(E)}$ is defined to be ${+\infty}$ if ${E}$ is infinite. Every subset of ${\mathbb Z}$ is clearly measurable with respect to ${\nu}$. With these definitions taken as understood observe that, for example, the space ${L^p(\mathbb Z,\mathcal D,\nu)}$ is just the space of sequences on ${\mathbb Z}$ whose ${p}$-th powers are summable, that is, the space of all sequences ${a=\{a_k\}_{k\in\mathbb Z}}$ such that

$\displaystyle \|a\|_p= \bigg(\sum_{k\in\mathbb Z}|a_k|^p\bigg)^\frac{1}{p}<+\infty.$

These spaces come up so often in analysis that they deserve to have a special notation; we usually denote them by ${\ell^p(\mathbb Z)}$. Maybe this seems like an unnecessary complication to state a very simple definition. Observe, however, that once we put things in this language we automatically have all the tools from measure theory at our disposal.

Exercise 6 Let ${\{a^{(n)}\}_{n\in\mathbb N}}$ be a sequence of functions on ${(\mathbb Z,\mathcal D,\nu)}$, that is, a sequence of sequences. For each positive integer ${n\in\mathbb N}$ we write ${a^{(n)}=\{a_k ^{(n)}\}_{k\in\mathbb Z}}$. Assume that for each fixed ${k\in\mathbb Z}$, there is a complex number ${a_k}$ such that ${\lim_{n\rightarrow+\infty}a_k ^{(n)}=a_k}$, that is, the sequence ${\{a^{(n)}\}_{n\in\mathbb N}}$ converges pointwise to some sequence ${a=\{a_k\}_{k\in\mathbb Z}}$. State Lebesgue’s dominated convergence theorem in this setup. When can we interchange the limit with summation?

Example 3 We denote by ${{\mathbb T}}$ the torus, that is the quotient space ${{\mathbb R} / 2\pi{\mathbb Z}}$ where ${2\pi {\mathbb Z}}$ is the group of integral multiples of ${2\pi}$. Thus two points of ${{\mathbb R}}$ are identified if they differ by an integral multiple of ${2\pi}$. There is a natural identification of functions on ${{\mathbb T}}$ and ${2\pi}$-periodic functions on ${{\mathbb R}}$. The Lebesgue measure ${dt}$ on ${{\mathbb T}}$ can also be identified with the restriction of the Lebesgue measure of ${{\mathbb R}}$ to the interval ${[0,2\pi)}$, or in fact, any interval in ${{\mathbb R}}$ of length ${2\pi}$. Remember that the Lebesgue measure on ${{\mathbb R}}$ is translation invariant. We equip ${{\mathbb T}}$ with the Lebesgue ${\sigma}$-algebra. The integral of a function ${f:{\mathbb T}\rightarrow {\mathbb C}}$ can thus be written as

$\displaystyle \int_{\mathbb T} f(t)dt=\int_0 ^{2\pi} f(x)dx,$

where ${f}$ is considered as a ${2\pi}$-periodic function on ${{\mathbb R}}$. The preceding definitions imply that the measure ${dt}$ on ${{\mathbb T}}$ is translation invariant. The Lebesgue spaces ${L^p({\mathbb T})}$, ${1\leq p \leq \infty}$, are defined in the obvious way. Since the total measure of ${{\mathbb T}}$ is finite, an important feature of the spaces ${L^p({\mathbb T})}$ is that they are nested; for ${1\leq p_1\leq p_2\leq \infty}$ we have that ${L^{p_2}({\mathbb T})\subset L^{p_1}({\mathbb T})}$, ${L^\infty({\mathbb T})}$ being the `smallest' space. Furthermore this embedding is continuous. See also Exercise 8.

We now briefly discuss why a function may fail to belong to ${L^p(\mathbb R^n)}$ for ${1\leq p <\infty}$. For simplicity, let us focus on the real line and consider the obstructions to membership in the spaces ${L^p(\mathbb R)}$. Very similar conclusions hold in the ${n}$-dimensional Euclidean space. Roughly speaking, there are two main obstructions:

The decay of the function at infinity. Simply put, the function might not decay fast enough as ${|x|\rightarrow +\infty}$ for the integral of ${|f(x)|^p}$ to be finite. The most naive example one can think of is a constant function, e.g. ${f(x)=c\chi_{\{x\geq 1\}}(x)}$ for some complex number ${c\in{\mathbb C}}$. Obviously this function raised to any power cannot be integrable close to infinity. A slightly more subtle example is the function ${f}$ which agrees with ${1/x}$ for ${x\rightarrow+\infty}$, i.e. ${f(x)=\frac{1}{x}\chi_{\{x\geq 1\}}(x)}$. This function fails logarithmically to be in ${L^1(\mathbb R)}$ but belongs to ${L^p({\mathbb R})}$ for any ${p>1}$. Of course we can similarly construct functions that decay even slower at infinity so that they fail to be in ${L^p}$ for ${p>1}$ as well. Thus, whenever a function ${f}$ belongs to some ${L^p}$ space for some ${1\leq p< +\infty}$, this imposes a control on the decay of ${f}$ at infinity. Increasing ${p}$ will only make things better at infinity, provided that the function already has some decay. Observe that this obstruction does not exist on a finite measure space. This is the case for the spaces ${L^p({\mathbb T})}$ for example.
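The dichotomy between ${p=1}$ and ${p>1}$ at infinity is easy to observe numerically. The following Python sketch (an illustration, not part of the notes; `tail_integral` is an ad hoc midpoint-rule quadrature) compares ${\int_1 ^N x^{-p}\,dx}$ for ${p=1}$, where the integral grows like ${\log N}$, and ${p=2}$, where it stays bounded:

```python
def tail_integral(p, N, steps=100_000):
    """Midpoint-rule approximation of int_1^N x^(-p) dx."""
    h = (N - 1) / steps
    return sum((1 + (i + 0.5) * h) ** (-p) * h for i in range(steps))

# p = 1: the integral equals log N, unbounded as N grows.
# p = 2: the integral equals 1 - 1/N, bounded uniformly in N.
print(tail_integral(1, 1000))   # ~ log(1000) ~ 6.91
print(tail_integral(2, 1000))   # ~ 0.999
```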

Blow up at local singularities. Here it is enough to consider any compact set and study the behavior of the function locally. If the function is bounded on compact sets, i.e. if it is locally bounded, then the local behavior will not be an obstruction for the function to belong to some ${L^p}$ space. Things become more interesting when there is a local singularity around a point. Here we can consider again the function ${f(x)=\frac{1}{x}\chi_{\{|x|\leq 1\}}(x)}$, close to zero this time. The integral of ${|f|}$ diverges logarithmically at zero, and thus ${f}$ does not belong to ${L^1(\mathbb R)}$. Observe here that we have forced the function to be zero away from the origin in order to isolate the obstruction. As ${p}$ increases to values ${p>1}$, this function fails more and more dramatically to belong to ${L^p(\mathbb R)}$ since we raise this singularity to higher powers, thus ${|f|^p}$ presents a more severe singularity. The `solution' here would be to consider the ${L^p}$ spaces for ${p<1}$. Thus local singularities may also prevent a function from belonging to some ${L^p}$ space. Unlike the behavior at infinity, the local behavior of ${|f|^p}$ improves as we decrease ${p}$. For example, the function ${f(x)=\frac{1}{\sqrt{|x|}}\chi_{\{|x|\leq 1\}}}$ fails to be in ${L^2({\mathbb R})}$ but clearly belongs to all ${L^p(\mathbb R)}$ spaces for ${p<2}$.

Remark 4 A function ${f}$ is in some ${L^p}$ space if and only if the function ${|f|}$ belongs to the ${L^p}$ space. Thus, there is no cancellation involved in the ${L^p}$ integrability of a function. This is an essential difference between the Lebesgue integral and the Riemann integral. The typical example here is to consider the function

$\displaystyle f(x)=\sum_{n=1} ^\infty \frac{1}{n} (-1)^n\chi_{[n,n+1)}(x).$

Since ${\int |f|}$ is the harmonic series, ${f}$ is not Lebesgue integrable. However, ${f}$ is improperly Riemann integrable since ${\int f=\sum_{n=1} ^\infty \frac{(-1)^n}{n}}$, which converges. Thus, whenever a function oscillates, we expect some cancellation in its integral that will not be reflected in the Lebesgue integrability of the function.
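This contrast between conditional and absolute convergence is easy to see numerically; the following Python snippet (an illustration, not part of the notes) compares the partial sums of ${\sum_{n\geq 1} (-1)^n/n}$, which converge to ${-\log 2}$, with the harmonic partial sums, which grow like ${\log N}$:

```python
import math

def alternating_partial(N):
    """Partial sum of sum_{n=1}^{N} (-1)^n / n, converging to -log 2."""
    return sum((-1) ** n / n for n in range(1, N + 1))

def harmonic_partial(N):
    """Partial sum of the harmonic series, diverging like log N."""
    return sum(1.0 / n for n in range(1, N + 1))

print(alternating_partial(10**5))   # close to -log 2 ~ -0.6931
print(harmonic_partial(10**5))      # ~ 12.1 and still growing with N
```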

Exercise 7 Based on the previous discussion, answer the following questions (it’s a simple calculation):

(i) Let ${q\in(0,+\infty)}$ be a given number. Construct a function that belongs to ${L^p({\mathbb R})}$ for all ${p<q}$ but does not belong to ${L^q({\mathbb R})}$. For example, for ${q=1}$, a possible answer is the function ${f(x)=\frac{1}{x}\chi_{\{|x|\leq 100\}}}$. Also, construct a function that belongs to ${L^p({\mathbb R})}$ for all ${p>q}$ but does not belong to ${L^q({\mathbb R})}$.

(ii) For ${x\in{\mathbb R}^n}$ and ${\delta>0}$, consider the function ${f(x)=\frac{1}{|x|^\delta}\chi_{\{|x|\leq 1\}}}$. Characterize the values of ${\delta>0}$, as a function of ${n,p}$, so that the function ${f}$ belongs to ${L^p(\mathbb R^n)}$. Consider all the range ${0<p<\infty}$ and calculate the ${L^p}$ norm of the function, whenever it is finite.

(iii) For ${x\in{\mathbb R}^n}$ and ${\delta>0}$, consider the function ${f(x)=\frac{1}{|x|^\delta}\chi_{\{|x|> 1\}}}$. Characterize the values of ${\delta>0}$, as a function of ${n,p}$, so that the function ${f}$ belongs to ${L^p(\mathbb R^n)}$. Consider all the range ${0<p<\infty}$ and calculate the ${L^p}$ norm of the function, whenever it is finite.

Remark 5 An interesting notion that is implicit in the previous discussion is that of local integrability of a function. A function ${f:\mathbb R^n\rightarrow{\mathbb C}}$ is called locally integrable if for every compact set ${K\subset \mathbb R^n}$ we have that

$\displaystyle \int_K |f(x)|dx<+\infty.$

We then write ${f\in L^1 _\textnormal{loc}(\mathbb R^n)}$. Local integrability ignores the behavior of a function at infinity. We are thus left with only one obstruction: the possibility that ${f}$ has local singularities. Observe that if ${f\in L^p(\mathbb R^n)}$ for some ${p\geq 1}$ then ${f}$ will be locally integrable. Similarly we can define the space ${L^p _\textnormal{loc}({\mathbb R}^n)}$.

Exercise 8 Give a heuristic explanation of the fact that if ${f\in L^p(\mathbb R^n)}$ for some ${p\geq 1}$ then ${f\in L^1 _\textnormal{loc}(\mathbb R^n)}$ (hint: what is the only obstruction for a function to be locally integrable?). Give a rigorous proof by means of Hölder’s inequality. Show also (which is the same) that on a finite measure space ${(X,\mu)}$, we have that ${L^q(X,\mu)}$ is continuously embedded in ${ L^p(X,\mu)}$ whenever ${0<p\leq q\leq\infty}$. For this it is enough to show that

$\displaystyle \|f\|_{L^p(X,\mu)}\lesssim_{p,q,\mu(X)} \|f\|_{L^q(X,\mu)}.$

What is the best value of the implied constant in the previous inequality?

Exercise 9 For ${0< p \leq \infty}$ consider the spaces ${\ell^p({\mathbb N})}$ of all complex sequences ${a=\{a_n\}_{n\in{\mathbb N}}}$ such that

$\displaystyle \|a\|_p=\bigg(\sum_{n=1} ^\infty |a_n|^p\bigg)^\frac{1}{p}<\infty.$

Show that if ${0<p_1\leq p_2\leq\infty}$ we have that ${\ell^{p_1}\subset \ell^{p_2}}$ (with ${\|a\|_\infty=\sup_n|a_n|}$ when ${p=\infty}$) and the embedding is continuous

$\displaystyle \|a\|_{p_2}\leq \|a\|_{p_1}.$

A space ${(X,\mu)}$ is called granular if there is a constant ${c_o>0}$ such that ${\mu(E)>c_o}$ for all measurable sets ${E}$ of positive measure. Show that for a granular space ${(X,\mu)}$ with constant ${c_o>0}$ we have that ${L^{p_1}\subset L^{p_2}}$ whenever ${0<p_1\leq p_2\leq\infty}$, with

$\displaystyle \|f\|_{L^{p_2}(X,\mu)}\lesssim_{p_1,p_2,c_o} \|f\|_{L^{p_1}(X,\mu)},$

whenever ${0<p_1\leq p_2\leq\infty}$ and ${(X,\mu)}$ is granular with constant ${c_o}$. What is the best value of the implied constant?
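Before attempting the proof, a numerical sanity check (a Python sketch, not part of the notes, and of course not a proof) of the claimed inequality ${\|a\|_{p_2}\leq \|a\|_{p_1}}$ on an arbitrary finitely supported sequence may be reassuring; note that equality holds for a sequence supported on a single point, which hints at the best constant:

```python
def seq_norm(a, p):
    """(sum |a_k|^p)^(1/p) for a finitely supported sequence."""
    return sum(abs(x) ** p for x in a) ** (1.0 / p)

a = [0.9, -0.5, 0.25, 0.1, -0.05]    # arbitrary finitely supported sequence
for p1, p2 in [(0.5, 1.0), (1.0, 2.0), (2.0, 4.0)]:
    assert seq_norm(a, p2) <= seq_norm(a, p1) + 1e-12

# Equality for a one-point sequence: both norms are just |a_0|.
print(seq_norm([7.0], 1.0), seq_norm([7.0], 2.0))
```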

Remark 6 Note that the opposite embedding is true for ${L^p(X,\mu)}$ spaces with ${\mu(X)<\infty}$. The explanation for this is quite simple. Sequences on ${{\mathbb N}}$ (or ${{\mathbb Z}}$) cannot have local singularities so the only deciding factor for membership in some ${\ell^p}$ space is decay at infinity. This also explains the embedding in this exercise. If a sequence belongs to some ${\ell^p}$ space, this means there is already sufficient decay at infinity for the series ${\sum |a_n|^p}$ to be summable. Raising the exponent ${p}$ only improves the decay of ${|a_n|^p}$ as ${n\rightarrow \infty}$. A similar phenomenon occurs in general in granular spaces.

Exercise 10 (i) Let ${0<p_0<\infty}$ and suppose that ${f\in L^{p_0}\cap L^\infty}$. Show that ${\|f\|_p\rightarrow\|f\|_\infty}$ as ${p\rightarrow \infty}$.

(ii) If ${f \notin L^\infty}$ show that ${\|f\|_p\rightarrow\infty}$ as ${p\rightarrow \infty}$.
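For part (i), a discrete sanity check is easy to set up. The following Python sketch (not part of the notes; the atoms and masses are chosen arbitrarily) shows ${\|f\|_p}$ creeping up toward ${\|f\|_\infty}$ on a finite measure space:

```python
# f takes four values on four atoms of an assumed finite measure space.
vals = [0.3, 0.7, 1.0, 0.2]        # values of f; max = ||f||_inf = 1.0
mass = [0.25, 0.25, 0.25, 0.25]    # masses of the atoms (total measure 1)

def Lp(p):
    """(int |f|^p dmu)^(1/p) on this discrete space."""
    return sum(abs(v) ** p * m for v, m in zip(vals, mass)) ** (1.0 / p)

for p in (2, 10, 100, 400):
    print(p, Lp(p))                # increases toward max(vals) = 1.0
```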

4. The dual space of ${L^p}$

Remember that for a Banach space ${X}$ over ${{\mathbb C}}$, its dual ${X^*}$ is the space of all bounded linear functionals ${x^*:X\rightarrow{\mathbb C}}$. Let ${1\leq p < \infty}$ and define ${p'}$ by the duality relation ${1/p+1/p'=1}$. For any ${g\in L^{p'}(X,\mu)}$ we define the functional

$\displaystyle g^*:L^p(X,\mu)\rightarrow {\mathbb C} ,$

by means of the formula

$\displaystyle g^*(f)=\int_X f(x)\overline {g(x)}d\mu(x).$

It is obvious that ${g^*}$ is linear and Hölder’s inequality shows that ${g^*}$ is continuous since

$\displaystyle | g^*(f)| \leq \|g\|_{L^{p'}(X,\mu)}\|f\|_{L^p(X,\mu)},$

for all ${f\in L^p(X,\mu)}$. Thus ${g^*\in (L^p(X,\mu))^*}$. Actually, in most cases the converse is also true, that is, every functional in ${(L^p(X,\mu))^*}$ is uniquely determined by a function in ${L^{p'}}$, whenever ${1<p<+\infty}$, or when ${p=1}$ and the measure ${\mu}$ is ${\sigma}$-finite.

Theorem 7 Let ${1< p<\infty}$ and ${x^*\in (L^p(X,\mu))^*}$. There exists a unique ${g\in L^{p'}(X,\mu)}$ such that ${x^*=g^*}$. The same is true when ${p=1}$ and the measure ${\mu}$ is ${\sigma}$-finite.

Remark 7 Theorem 7 fails (in most cases) when ${p=\infty}$. In fact the dual of ${L^\infty}$ can be characterized as a space of measures but we will not pursue that here. However, for all ${1\leq p \leq \infty}$ and ${f\in L^p(X,\mu)}$, with ${\mu}$ ${\sigma}$-finite, we have that

$\displaystyle \|f\|_{L^p(X,\mu)}= \sup\bigg\{\bigg|\int_X f(x) \overline {g(x)} d\mu(x)\bigg|: \|g\|_{L^{p'}(X,\mu)}\leq 1\bigg\}. \ \ \ \ \ (2)$

Observe however that for this we need to know a priori that ${f\in L^p(X,\mu)}$. A way to bypass this problem is to work with a dense subclass of functions. This is essentially a duality relation but the small point just mentioned doesn’t allow one to show that the dual of ${L^\infty}$ is ${L^1}$ (luckily, since it’s not true!). It is however a very useful device since it very often allows one to `linearize' ${L^p}$ norms. Furthermore this duality relationship shows that the norm of the functional ${g^*\in (L^p)^*}$ is ${\|g\|_{L^{p'}}}$. Thus ${(L^p)^*}$ is isometrically isomorphic to ${L^{p'}}$, ${p'}$ being the dual exponent of ${p}$, for ${1<p<\infty}$ and also for ${p=1}$ whenever the measure ${\mu}$ is ${\sigma}$-finite.

Exercise 11 Show the duality relation (2) in the previous remark. This is essentially a consequence of Hölder’s inequality. Using this duality relation give an alternative proof of the triangle inequality.

Remark 8 Density arguments allow us to restrict ${g}$ in (2) to belong to any dense subclass of ${L^{p'}(X,\mu)}$.
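In the discrete setting the extremizing ${g}$ in (2) can be written down explicitly. Here is a Python sketch (not part of the notes; counting measure on three points, ${f\geq 0}$ so the complex conjugate is harmless) verifying that ${g_k=f_k ^{p-1}/\|f\|_p ^{p-1}}$ satisfies ${\|g\|_{p'}=1}$ and attains the supremum:

```python
f = [2.0, 1.0, 3.0]                   # f >= 0 on three atoms, counting measure
p = 3.0
p_dual = p / (p - 1)                  # p' with 1/p + 1/p' = 1

norm_f = sum(x ** p for x in f) ** (1 / p)
g = [x ** (p - 1) / norm_f ** (p - 1) for x in f]   # the extremal g

norm_g = sum(x ** p_dual for x in g) ** (1 / p_dual)
pairing = sum(x * y for x, y in zip(f, g))          # sum f_k * conj(g_k)

print(norm_g)    # 1 up to rounding: g is admissible in (2)
print(pairing)   # equals ||f||_p, so the sup in (2) is attained
```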

5. Weak ${L^p}$ spaces

Going back to the example of the function ${h(x)=1/x}$, ${x\in\mathbb R}$, recall that this function does not belong to ${L^1({\mathbb R})}$. For ${\lambda>0}$ the following estimate is obvious

$\displaystyle |\{ x\in {\mathbb R}: |h(x)|>\lambda\}| \leq \frac{2}{\lambda}.$

On the other hand observe that for every function ${f\in L^1({\mathbb R})}$ we have that

$\displaystyle \|f\|_{L^1({\mathbb R})}= \int_{\mathbb R} |f(x)| dx \geq \lambda |\{x\in{\mathbb R}:|f(x)|>\lambda\}|.$

That is, for every ${L^1}$ function ${f}$, the measure of the set ${\{x\in{\mathbb R}:|f(x)|>\lambda\}}$ decays at least as fast as ${\frac{1}{\lambda}}$.

In general, for any measure space ${(X,\mu)}$ we define for ${0< p<\infty}$ the space weak-${L^p(X,\mu)}$ or ${L^{p,\infty}(X,\mu)}$ to be the space of all functions ${f}$ such that

$\displaystyle \mu(\{x\in X:|f(x)|>\lambda\})\leq \frac{c^p }{\lambda ^p}, \quad \lambda>0, \ \ \ \ \ (3)$

for some constant ${c>0}$. We define the weak-${L^p(X,\mu)}$ or the ${L^{p,\infty}(X,\mu)}$ norm of a function ${f}$ to be the smallest constant ${c>0}$ such that (3) is true. Equivalently

$\displaystyle \|f\|_{L^{p,\infty}(X,\mu)}= \sup_{\lambda>0} \lambda \mu(\{x\in X:|f(x)|>\lambda\})^\frac{1}{p}.$

Note that ${\|\cdot\|_{L^{p,\infty}}}$ is not a norm since the triangle inequality fails. It is however a quasi-norm (the triangle inequality holds with a constant).
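To make the definition concrete, return to ${h(x)=1/x}$: since ${|\{|h|>\lambda\}|=2/\lambda}$ exactly, the quantity ${\lambda\,|\{|h|>\lambda\}|}$ is constant in ${\lambda}$ and ${\|h\|_{L^{1,\infty}}=2}$, even though ${h\notin L^1({\mathbb R})}$. A quick Python check (not part of the notes; a crude grid approximation of the measure of the superlevel sets):

```python
# Weak-L^1 membership of h(x) = 1/x: the superlevel set {|h| > lam} is
# {0 < |x| < 1/lam}, of Lebesgue measure exactly 2/lam.
def superlevel_measure(lam, M=10.0, n=200_000):
    """Grid approximation of |{x in [-M, M] : |1/x| > lam}| (needs 1/lam < M)."""
    h = 2 * M / n
    return sum(h for i in range(n)
               if abs(1.0 / (-M + (i + 0.5) * h)) > lam)

for lam in (0.5, 1.0, 4.0):
    print(lam, lam * superlevel_measure(lam))   # hovers near 2 for every lam
```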

Exercise 12 Show that for ${0<p<\infty}$ and ${f,g\in L^{p,\infty}}$ we have the quasi-triangle inequality

$\displaystyle \|f+g\|_{L^{p,\infty}(X,\mu)}\lesssim_p \|f\|_{L^{p,\infty}(X,\mu)}+\|g\|_{L^{p,\infty}(X,\mu)} .$

Proposition 8 Let ${0<p<\infty}$. The space ${L^{p}(X,\mu)}$ is continuously embedded in ${L^{p,\infty}(X,\mu)}$:

$\displaystyle \|f \|_{L^{p,\infty}(X,\mu)} \leq \|f \|_{L^{p}(X,\mu)}.$

Proof: We just use Chebyshev’s inequality to write

$\displaystyle \begin{array}{rcl} \|f\|_{L^p} ^p &=& \int_X |f(x)|^p d\mu(x) \geq \int_{\{x\in X:|f(x)|>\lambda \}}|f(x)|^p d\mu(x) \\ \\ &\geq & \lambda^p \mu(\{x\in X:|f(x)|>\lambda \}), \end{array}$

for every ${\lambda>0}$. Taking ${p}$-th roots and the supremum over ${\lambda>0}$ finishes the proof. $\Box$

Let us also recall how we can write the ${L^p}$ norm of a function in terms of the distribution function of ${f}$:

Proposition 9 For ${0<p<\infty}$ we have that

$\displaystyle \|f\|_{L^p(X,\mu)} ^p = p\int_0 ^\infty \lambda^{p-1} \mu(\{x\in X:|f(x)|>\lambda \}) d\lambda.$

Exercise 13 Prove Proposition 9 above. Hint: It is elementary to see that

$\displaystyle |f(x)|^p=p \int_0 ^\infty \chi_{\{|f(x)|\geq \lambda\}}\lambda^p \frac{d\lambda}{\lambda}.$

Use Fubini’s theorem to complete the proof.
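Proposition 9 can also be tested numerically. The following Python sketch (an illustration, not a proof; the simple function and the quadrature are ad hoc) compares ${\|f\|_p ^p}$ computed directly with the layer-cake integral on a finite set with counting measure:

```python
# Numerical check of ||f||_p^p = p * int_0^inf lam^(p-1) mu({|f| > lam}) dlam
# for a simple function on a finite set with counting measure.
vals = [0.5, 1.5, 2.0, 2.0, 3.0]    # values of f on five atoms of mass 1
p = 2.5

lhs = sum(v ** p for v in vals)     # ||f||_p^p computed directly

def dist(lam):
    """Distribution function mu({x : |f(x)| > lam})."""
    return sum(1 for v in vals if v > lam)

# Midpoint-rule approximation; the integrand vanishes for lam >= max|f|,
# so it suffices to integrate up to that point.
N, top = 200_000, max(vals)
h = top / N
rhs = p * sum(((i + 0.5) * h) ** (p - 1) * dist((i + 0.5) * h) * h
              for i in range(N))

print(lhs, rhs)   # the two quantities agree to several decimal digits
```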

Exercise 14 For ${0<p<\infty}$ and ${f\in L^p(X,\mu)}$ show that

$\displaystyle \int_X|f(x)|^p d\mu(x) \simeq_p \sum_{n\in \mathbb Z} 2^{np} \mu(\{x\in X:|f(x)| \geq 2^{n}\}) .$

[Update 18 Feb, 2011: Error corrected in the proof of the log-convexity of the $L^p$ norm via the three lines lemma.]

[Update 28 Feb, 2011: Error corrected in the hypothesis of Exercise 2. Also typo in the Hint of Exercise 5 corrected]