Rating: 8.0/10.
This is a book on measure theory and integration, and it serves as a good complement to the book “Measure, Integral, Probability.” I liked it because, rather than being technical all the time, it justifies the theory from a historical context. Nobody really set out to invent measure-theoretic integration; these ideas were first developed intuitively, but in the mid-19th century, mathematicians kept finding functions and sets that behaved in strange ways, and the intuitive arguments broke down when attempting infinite summations or limits. Thus there was a need to formalize the foundations of mathematics to understand in which situations different theorems could be used (eg: representing a function by a sum of infinitely many functions). This book uses a lot of counterexamples to motivate the formal theory, which mirrors how the theory was motivated historically. One difference from that book is that this one focuses primarily on measure theory applied to integration, with no mention of the probability applications of measure theory.
Chapter 1. The story begins in 1850. At this time, mathematicians were aware of Fourier’s method to approximate a function using trigonometric series, but when it involved infinite sums, it was unclear under what conditions this actually worked. Additionally, the concept of integration was inconsistently defined: it was sometimes defined as the inverse of differentiation, but this approach used a lot of infinitesimals, making it vague what was actually happening. Alternatively, integration was sometimes defined as the area under the curve, and the fundamental theorem of calculus is only meaningful if integration is defined this way; it becomes trivial when integration is defined as the inverse of the derivative. Consequently, there were many disagreements about which results were fundamental versus trivial. There were also a lot of heuristic justifications of conditions for differentiability and infinite summation, and many surprises emerged.
Chapter 2. Riemann’s thesis in 1854 focused on representing a function by trigonometric series, but it had a lot to unpack, as the mathematician Darboux discovered. The Darboux integral involves taking a partition of the interval and finding the supremum and infimum of the function within each subinterval, which gives an upper sum and a lower sum. A function is Riemann integrable if the resulting upper and lower integrals are equal (so for any epsilon, we can take a fine enough partition such that the difference between the upper and lower sums is less than epsilon). The improper integral is defined as a limit of integrals over intervals on which the function is bounded, and if the limit does not exist, then the improper integral is undefined. Darboux and Weierstrass found examples of functions that were continuous but nowhere differentiable, which contradicted the previously held belief that continuous functions are differentiable at most points. In the 1870s, many investigations by Hankel and others focused on which functions are Riemann integrable: Hankel thought that if the points of continuity are dense, then the function is Riemann integrable, but Cantor proved that this is incorrect.
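To make the upper and lower sums concrete, here is a minimal sketch of my own (not from the book). The function name `darboux_sums` and the `samples` parameter are hypothetical, and the supremum and infimum on each subinterval are only estimated by sampling, which is an assumption that is fine for tame functions like x^2 but would fail for wild ones.

```python
def darboux_sums(f, a, b, n, samples=50):
    """Estimate the lower and upper Darboux sums of f on [a, b].

    Uses n equal subintervals. The inf and sup on each subinterval are
    approximated by sampling (an assumption, not an exact computation).
    """
    width = (b - a) / n
    lower = 0.0
    upper = 0.0
    for i in range(n):
        left = a + i * width
        values = [f(left + width * k / samples) for k in range(samples + 1)]
        lower += min(values) * width
        upper += max(values) * width
    return lower, upper


# For x^2 on [0, 1] the two sums squeeze the true value 1/3.
lower, upper = darboux_sums(lambda x: x * x, 0.0, 1.0, 1000)
print(lower, upper, upper - lower)
```

With 1000 subintervals the gap between the two sums is about 1/1000, and refining the partition shrinks it further, which is the epsilon condition described above.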
Chapter 3. The real numbers seem like an intuitive concept, but in many ways, the definition of the reals is a human construct that involves some decisions about its properties, eg: the Archimedean principle states that for any two positive reals, some finite multiple of one exceeds the other. The alternative is the existence of infinitesimals that are infinitely small, and this can produce a system of non-standard analysis that is still self-consistent. The topology of the reals, in terms of open and closed sets, can be defined in several ways. An open set is one where every point has a neighborhood contained in the set, and a set is closed if all the accumulation points of the set are also in the set; open and closed are not mutually exclusive. The Bolzano-Weierstrass theorem states that every bounded sequence has a convergent subsequence, and this leads to Cantor’s definition of the reals using sequences of rationals, as every real is identified by a Cauchy sequence of rationals that converges to it (eg: the decimal expansion is one such Cauchy sequence). The property that every Cauchy sequence converges to a point inside the set is called completeness and is often taken as an axiom, as it is the main difference that separates the reals from the rationals.
The Heine-Borel theorem states that if an infinite collection of open sets covers an interval [a, b], then a finite subcollection also covers [a, b]. This is useful, eg for showing that no collection of open intervals with a total length less than 1 can cover [0, 1], and the Bolzano-Weierstrass theorem is a corollary of the Heine-Borel theorem. This property holds not just for closed intervals: a compact set is defined as a set for which this property holds, and the compact sets of reals are exactly those that are closed and bounded. Cantor realized that the rationals and the algebraic numbers (the roots of integer polynomials) are countable, but the reals are not. The cardinality of the naturals and of all countably infinite sets is called aleph-null, while the cardinality of the reals is called c. It is also equal to the cardinality of the power set of the naturals, via a mapping from infinite binary sequences to the reals. The continuum hypothesis states that there is no set with cardinality between those of the naturals and the reals, and Gödel and Cohen proved that it can be neither proved nor disproved from the usual axioms: assuming such a set exists and assuming it does not are both consistent. Also, c is not the largest possible cardinality, since the power set of any set is always larger than the set itself by a diagonalization argument.
Chapter 4. The Cantor set is constructed by taking the interval [0, 1] and recursively removing the open middle third of every remaining interval. This is a special case of a Smith-Volterra-Cantor (SVC) set and provides many counterexamples. Its members are exactly the numbers that have a base 3 expansion using no digit 1, and replacing each digit 2 with a 1 maps them onto the binary expansions of all the reals in [0, 1], so the Cantor set is uncountable; yet at the same time it is nowhere dense and does not contain any open intervals. The Cantor set can be used to define the “Devil’s Staircase” function: a continuous, monotonically increasing function that has a derivative of zero almost everywhere but connects (0, 0) to (1, 1).
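The base 3 description translates almost directly into code. The following is my own rough sketch (the name `cantor_function` and the `digits` cutoff are hypothetical, not from the book): read off the ternary digits of x, stop at the first digit 1, and reinterpret each digit 2 as a binary digit 1. Exact fractions are used so that points like 1/3 are handled without rounding.

```python
from fractions import Fraction


def cantor_function(x, digits=40):
    """Approximate the Devil's Staircase at x in [0, 1].

    Walks through the base 3 digits of x. A digit 1 means x lies in a
    removed middle third, where the function is constant, so we stop.
    Otherwise each ternary digit 2 contributes a binary digit 1.
    Truncating after `digits` digits is an approximation.
    """
    x = Fraction(x)
    if x == 1:
        return Fraction(1)
    result = Fraction(0)
    scale = Fraction(1, 2)  # value of the current binary digit
    for _ in range(digits):
        x *= 3
        digit = int(x)  # next ternary digit: 0, 1 or 2
        x -= digit
        if digit == 1:
            return result + scale
        result += scale * (digit // 2)
        scale /= 2
    return result


for point in (Fraction(1, 4), Fraction(1, 3), Fraction(2, 3), Fraction(7, 9)):
    print(point, float(cantor_function(point)))
```

The function takes the same value at 1/3 and 2/3, which is the flat step over the first removed interval, while 1/4 (a member of the Cantor set with expansion 0.0202...) is sent to approximately 1/3.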
Volterra’s function is a construction of a function whose derivative exists and is bounded everywhere, but the derivative is not Riemann integrable. The construction starts from SVC(4), or the “fat Cantor set”, which removes a middle interval of length 1/4, then an interval of length 1/16 from the middle of each remaining piece, and so on, instead of removing thirds. This set is also nowhere dense but has a positive outer measure; in fact, you can construct nowhere dense sets that have an outer measure arbitrarily close to one, which is useful for constructing many other types of counterexamples.
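A quick way to see why the fat Cantor set keeps positive measure is to add up the removed lengths. This sketch is mine and assumes the convention that stage k of SVC(n) removes 2^(k-1) open intervals, each of length 1/n^k; the function name `svc_removed_length` is hypothetical.

```python
from fractions import Fraction


def svc_removed_length(n, steps):
    """Total length removed from [0, 1] after `steps` stages of SVC(n).

    Assumed convention: stage k removes 2**(k-1) intervals of length 1/n**k.
    """
    return sum(Fraction(2 ** (k - 1), n ** k) for k in range(1, steps + 1))


for n in (3, 4, 10):
    remaining = 1 - svc_removed_length(n, 50)
    print(n, float(remaining))
```

Under that convention the removed lengths form a geometric series summing to 1/(n - 2), so SVC(3), the ordinary Cantor set, is left with measure 0, SVC(4) with 1/2, and larger n leaves a measure closer and closer to one.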
The next question is when you can integrate term by term, turning the integral of an infinite sum into an infinite sum of integrals, or equivalently, take the integral of a limit as the limit of the integrals. Uniform convergence is a sufficient criterion, but some sequences do not converge uniformly and the order can still be switched. For other sequences, switching the order gives an incorrect result. The Arzelà-Osgood theorem (not to be confused with the Arzelà-Ascoli theorem on equicontinuity) states that if a sequence of Riemann integrable functions is uniformly bounded and converges pointwise to a Riemann integrable function, then we can swap the order of the limit and the integral.
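Here is a standard example, chosen by me rather than taken from the book, where swapping the limit and the integral goes wrong: f_n(x) = 2 n^2 x exp(-n^2 x^2) on [0, 1]. The helper `midpoint_integral` is a hypothetical name for a plain midpoint rule, so the numbers are numerical approximations.

```python
import math


def f(n, x):
    """A spike near 0 that gets taller and narrower as n grows."""
    return 2 * n * n * x * math.exp(-((n * x) ** 2))


def midpoint_integral(g, a, b, steps=100_000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (i + 0.5) * h) for i in range(steps))


ns = (1, 5, 25)
integrals = [midpoint_integral(lambda x, n=n: f(n, x), 0.0, 1.0) for n in ns]
pointwise = [f(n, 0.3) for n in ns]
print(integrals)  # approaches 1
print(pointwise)  # approaches 0 at the fixed point x = 0.3
```

The exact integral is 1 - exp(-n^2), which tends to 1, while f_n(x) tends to 0 at every fixed x, so the integral of the limit is 0. The sequence is not uniformly bounded (the peak height grows with n), so this does not contradict the bounded convergence result above.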
Baire’s category theorem says that a countable union of nowhere dense sets cannot cover an open interval. Sets can be divided into two categories: a set of the first category is a countable union of nowhere dense sets, and a set of the second category is one that cannot be written this way; in particular, any set that contains an open interval is of the second category. Baire also classified functions into classes based on how discontinuous they are. Class 0 consists of the continuous functions; class 1 consists of pointwise limits of continuous functions, which are at worst pointwise discontinuous (ie, the points of continuity are dense); class 2 consists of limits of class 1 functions, and so on.
Chapter 5. In Jordan measure, the inner content of a set is the supremum of the total length of finite collections of disjoint intervals contained in the set, and the outer content is defined similarly as the infimum of the total length of finite collections of intervals that cover the set. A set is Jordan measurable if the inner and outer contents are equal, and content is finitely additive on Jordan measurable sets; the outer content is well-defined for every bounded set, but it is not always additive for sets that are not Jordan measurable, like SVC(4) or the rationals in [0, 1] (the ordinary Cantor set has outer content zero, so it is Jordan measurable). Borel measure has nicer properties than Jordan content: it is countably additive instead of just finitely additive, and the Borel sets are constructed by taking countable unions, differences, and intersections of intervals. The cardinality of the collection of Borel sets (which forms a sigma algebra) is c, so it is smaller than the collection of all Jordan measurable sets. Lebesgue outer measure is defined similarly to Jordan outer content, but it is the infimum of the total length over countable collections of intervals that cover the set, whereas Jordan content only allows finite collections; this makes it more inclusive, so sets that are not Jordan measurable, like the rationals in [0, 1], are Lebesgue measurable. Caratheodory’s condition says that a set E is Lebesgue measurable if it splits every set X cleanly: the outer measure of the part of X inside E plus the outer measure of the part of X outside E equals the outer measure of X. This provides an easier way to prove statements like countable unions and intersections of measurable sets are Lebesgue measurable.
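The difference between finite and countable covers shows up clearly with the rationals in [0, 1]: any finite cover by intervals must have total length at least 1, but a countable cover can be made as short as we like by giving the k-th rational an interval of length eps/2^k. Below is a small sketch of that standard argument; the function names and the particular enumeration of the rationals are my own choices, and of course only finitely many intervals are actually built.

```python
from fractions import Fraction


def rationals_in_unit_interval(count):
    """First `count` rationals in [0, 1], listed by increasing denominator."""
    seen = set()
    out = []
    q = 1
    while len(out) < count:
        for p in range(q + 1):
            r = Fraction(p, q)
            if r not in seen:
                seen.add(r)
                out.append(r)
                if len(out) == count:
                    break
        q += 1
    return out


def shrinking_cover(count, eps):
    """Cover the k-th rational by an open interval of length eps / 2**k."""
    eps = Fraction(eps)
    cover = []
    for k, r in enumerate(rationals_in_unit_interval(count), start=1):
        half = eps / 2 ** (k + 1)
        cover.append((r - half, r + half))
    return cover


cover = shrinking_cover(1000, Fraction(1, 100))
total_length = sum(b - a for a, b in cover)
print(float(total_length))  # stays below eps = 0.01 however many we take
```

Since the total length stays below eps for every number of intervals, the Lebesgue outer measure of the rationals in [0, 1] is zero, even though their Jordan outer content is 1.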
The Vitali set is an example of a non-measurable set, but it relies on the axiom of choice: the interval [0, 1] is partitioned into uncountably many equivalence classes, and one representative is chosen from each. The axiom of choice became controversial when Zermelo used it to prove that every set, including the reals, can be well-ordered, without being able to explicitly construct such an ordering (a deeply unintuitive claim), and non-measurable sets cannot be shown to exist without some form of the axiom of choice. However, we generally accept the axiom of choice because it is so useful in mathematics, even though it enables some paradoxes like the Banach-Tarski paradox.
Chapter 6. A function f is called measurable if, for each value c, the set of points in the domain where f(x) > c is a measurable set. Simple combinations and limits of measurable functions are also measurable. A simple function is one that takes finitely many values, and the part of the domain corresponding to each value is measurable; a function is measurable if and only if it is a limit of simple functions. The Lebesgue integral is more powerful than the Riemann integral because all Riemann integrable functions are also Lebesgue integrable, and the Riemann integrable functions are exactly the bounded functions that are continuous almost everywhere. One property of the Lebesgue integral is that the integral over any null set is zero. The monotone convergence theorem states that if a sequence of nonnegative measurable functions is increasing, then the integrals converge to the integral of the limit (which may be infinite). The dominated convergence theorem states that if fn converges pointwise almost everywhere and each |fn| is bounded by the same integrable function g, then the integral of the limit of fn is equal to the limit of the integrals: this is a useful condition to determine when term-by-term integration is permitted, although it is a sufficient but not necessary condition.
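To illustrate how simple functions approximate a measurable function from below, here is a sketch of my own for the specific case f(x) = x^2 on [0, 1]. It uses the usual staircase floor(2^n f)/2^n and relies on the fact that, for this particular f, the set where the staircase equals k/2^n is an interval whose length is known in closed form; the function name is hypothetical.

```python
import math


def simple_function_integral(n):
    """Integral over [0, 1] of the simple function floor(2**n * x**2) / 2**n.

    The staircase equals k / 2**n on [sqrt(k / 2**n), sqrt((k + 1) / 2**n)),
    so the integral is the sum of (value) * (measure of that level set).
    """
    m = 2 ** n
    total = 0.0
    for k in range(m):
        measure = math.sqrt((k + 1) / m) - math.sqrt(k / m)
        total += (k / m) * measure
    return total


values = [simple_function_integral(n) for n in (2, 6, 10)]
print(values)  # increases toward 1/3
```

The integrals increase with n and approach 1/3, which is the monotone convergence theorem in miniature: the staircases increase to f, so their integrals increase to the integral of f. Note that the partition is made on the range of f rather than on the domain, which is the main idea that separates Lebesgue's approach from Riemann's.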
Chapter 7. The four Dini derivatives are the lim sup and lim inf of the difference quotient, taken from the left and from the right, and all four always exist if the values plus and minus infinity are allowed; a function is differentiable at a point if all four are finite and equal there. Dini’s theorem states that if any of the four Dini derivatives is integrable, then they are all integrable and have the same integral. A function has bounded variation if its total variation (ie, the supremum over all partitions of the sum of the absolute differences of output values between consecutive partition points) is finite. Jordan’s decomposition theorem says that a function can be written as a difference of two monotonically increasing functions if and only if it has bounded variation.
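The definition of variation is easy to experiment with. In this sketch of mine (function and variable names are hypothetical), the variation sum over a partition is computed for sin on [0, 2*pi], which has total variation 4, and for x*sin(1/x), a standard example of a continuous function whose variation sums grow without bound when the partition points are placed at its peaks.

```python
import math


def variation_over_partition(f, points):
    """Sum of |f(b) - f(a)| over consecutive points of an increasing partition."""
    return sum(abs(f(b) - f(a)) for a, b in zip(points, points[1:]))


# sin on [0, 2*pi]: every partition gives at most 4.
uniform = [2 * math.pi * i / 1000 for i in range(1001)]
v_sin = variation_over_partition(math.sin, uniform)


def wiggle(x):
    return x * math.sin(1 / x)


def peak_partition(n):
    """Points 1 / ((k + 1/2) * pi), where wiggle(x) equals +x or -x."""
    return sorted(1 / ((k + 0.5) * math.pi) for k in range(n + 1))


v_small = variation_over_partition(wiggle, peak_partition(10))
v_large = variation_over_partition(wiggle, peak_partition(1000))
print(v_sin, v_small, v_large)
```

The sums for x*sin(1/x) behave like a harmonic series, so they keep growing as more peaks are included; this function is continuous on (0, 1] and extends continuously to 0, but it is not of bounded variation and so cannot be written as a difference of increasing functions.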
The Faber-Chisholm-Young theorem states that a continuous function of bounded variation is differentiable almost everywhere; the proof is quite complex and involves showing that the Dini derivatives are equal almost everywhere. When does integrating the derivative give back the original function? It turns out that being continuous and of bounded variation is not enough, as the devil’s staircase function has a derivative of zero almost everywhere yet is not constant. A stronger condition of absolute continuity is needed, and this is the necessary and sufficient condition for the fundamental theorem of calculus to hold.
Chapter 8. Various mathematicians, including Dirichlet, Lipschitz, and Jordan, each proved conditions under which the Fourier series of a function converges. Lebesgue proved the strongest version: for any Lebesgue integrable function, the Fourier series converges to the function almost everywhere, but only in the sense of the Cesàro limit (the limit of the averages of the partial sums). This is a weaker notion than the usual limit, and the weakening is necessary because not all Lebesgue integrable functions have Fourier series that converge in the usual sense.
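The Cesàro limit is easiest to understand on a series that does not converge at all. This toy example is mine and has nothing to do with Fourier series specifically: the partial sums of 1 - 1 + 1 - 1 + ... bounce between 1 and 0 forever, but their running averages settle down to 1/2. The function name `cesaro_means` is hypothetical.

```python
from itertools import accumulate


def cesaro_means(terms):
    """Averages of the first 1, 2, 3, ... partial sums of a series."""
    partial_sums = list(accumulate(terms))
    running_totals = accumulate(partial_sums)
    return [total / (i + 1) for i, total in enumerate(running_totals)]


terms = [(-1) ** k for k in range(1000)]
partial_sums = list(accumulate(terms))
means = cesaro_means(terms)
print(partial_sums[-4:])  # keeps alternating between 1 and 0
print(means[-4:])         # close to 0.5
```

Any series that converges in the usual sense has the same Cesàro limit, so nothing is lost by averaging; the averaging just rescues some series that oscillate.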
The inner product has similar properties to the dot product but is defined for functions, where the inner product of two functions is the integral of their product. In this way, the norm of a function and the distance between functions can also be defined. An L^p space is a function space where |f|^p is integrable, and L^inf consists of functions that are bounded almost everywhere; all of these are vector spaces. Minkowski’s inequality establishes that the L^p norms satisfy the triangle inequality; therefore, the L^p spaces are metric spaces, and on a bounded interval, as p increases, the space L^p gets strictly smaller.
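As a worked example of the spaces shrinking (my own choice of function, not taken from the book), take f(x) = x^(-1/2) on the interval (0, 1] and compute the integral of |f|^p directly:

```latex
\int_0^1 |f(x)|^p \, dx
  = \int_0^1 x^{-p/2} \, dx
  = \begin{cases}
      \dfrac{1}{1 - p/2} & \text{if } p < 2, \\[1ex]
      \infty & \text{if } p \ge 2 .
    \end{cases}
```

So this f belongs to L^1 (where the integral equals 2) and to every L^p with p < 2, but not to L^2, which shows that L^2 on [0, 1] is strictly smaller than L^1.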
A Banach space is a normed vector space where all Cauchy sequences converge to something within the space, and all L^p spaces with p >= 1 are Banach spaces. The Riesz-Fischer theorem states that the Fourier series of every L^2 function converges to that function in the L^2 norm, so every L^2 function has a unique representation by its Fourier series. This is stronger than Lebesgue’s result involving Cesàro limits. It was later proved that the same holds in the L^p norm for every finite p > 1, with p = 1 being the exception. The book ends with a final section that proves the Riesz-Fischer theorem.