This is a follow-up to a comment I made on Tim Gowers’s post concerning the use of Zorn’s lemma (which I encourage students, in particular, to read if they have not done so yet). The issue was whether it is possible to write down concretely an unbounded (or in other words, not continuous) linear operator on a Banach space. I mentioned that it had been explained to me a few years ago that this is not in fact possible, in the same sense that it is not possible to “write down” a non-measurable function: indeed, any measurable linear map
$latex T\ :\ U\rightarrow V$
where U is a Banach space and V is a normed vector space, is automatically continuous.
Incidentally, I had put this question to my colleague É. Matheron in Bordeaux because it had arisen while translating W. Appel’s book of mathematics for physicists: in the chapters on unbounded linear operators (which are important in Quantum Mechanics), he had observed that those operators could often be written down, but only in the sense of a partially defined operator, defined on a dense subspace, and we wondered if the dichotomy “either unbounded and not everywhere defined, or everywhere defined and continuous” was a real theorem or not. In the sense that measurability is much weaker than the (not well defined) notion of “concretely given”, it is indeed a theorem.
Not only did Matheron tell me of this automatic continuity result, he gave me a copy of a short note of his (“A useful lemma concerning subseries convergence”, Bull. Austral. Math. Soc. 63 (2001), no. 2, 273–277), where this result is proved very quickly, as a consequence of a simple lemma which also implies a number of other well-known facts of functional analysis (the Banach-Steinhaus theorem, Schur’s theorem on the coincidence of weak and norm convergence for series in $latex \ell^1$, and a few others). On the other hand, I don’t know who first proved the continuity result (Matheron says it is well-known but gives no reference).
The proof is short enough that I will present it; it is a nice source of exercises for a first course in functional analysis, provided some integration theory has been seen before (which I guess is always the case).
Here is the main lemma, due to Matheron, in a probabilistic rephrasing and in a slightly weaker version:
Main Lemma: Let G be a topological abelian group, let $latex (A_n)$ be an arbitrary sequence of measurable (Borel) subsets of G, and let $latex (g_n)$ be a sequence of elements of G. Assume that for every n and every g in G, either g is in $latex A_n$, or $latex g-g_n$ is in $latex A_n$.
Let moreover be given a sequence $latex (X_n)$ of independent Bernoulli random variables, defined on some auxiliary probability space.
Then, under the condition that the series
$latex \sum_{n\geq 1}{X_n g_n}$
converges almost surely in G, there exists a subsequence $latex (h_n)$ of $latex (g_n)$ such that
$latex \sum_{n\geq 1}{h_n}$
converges and belongs to infinitely many $latex A_n$.
This is probably not easy to assimilate immediately, so let’s give the application to automatic continuity before sketching the proof. First, we recall that Bernoulli random variables are such that
$latex \mathbf{P}(X_n=0)=\mathbf{P}(X_n=1)=1/2.$
Now, let T be as above, measurable. We argue by contradiction, assuming that T is not continuous. This implies in particular that for all n, T is not bounded on the ball with radius $latex 2^{-n}$, so there exists a sequence $latex (x_n)$ in U such that
$latex \|x_n\|<2^{-n},\ \text{and}\ \|T(x_n)\|>n.$
We apply the lemma with
$latex G=U,\text{ written additively},\ g_n=x_n,\ A_n=\{x\in U\ |\ \|T(x)\|>n/2\}.$
The sets $latex A_n$ are measurable, because T is assumed to be so. If x is not in $latex A_n$, then $latex \|T(x)\|\leq n/2$, and the triangle inequality shows that
$latex \|T(x-x_n)\|=\|T(x_n)-T(x)\|\geq\|T(x_n)\|-\|T(x)\|>n-n/2=n/2$
so that $latex x-x_n$ is in $latex A_n$. (This shows where sequences of sets satisfying the condition of the Lemma arise naturally).
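The covering property just derived can be checked numerically on a toy example. The sketch below is only an illustration under stated assumptions: the map `T` is a hypothetical 2×2 matrix acting on the plane with the sup norm, and the points `x_n = (n, 0)` are chosen only so that $latex \|T(x_n)\|>n$. Since this `T` is bounded, the extra condition $latex \|x_n\|<2^{-n}$ cannot hold here, and it plays no role in the covering property.

```python
import random


def T(v):
    # Hypothetical linear map on R^2 (an assumption made for illustration).
    return (2 * v[0] + v[1], 3 * v[1] - v[0])


def norm(v):
    # Sup norm on R^2.
    return max(abs(v[0]), abs(v[1]))


def in_A(n, v):
    # Membership in A_n = {x : ||T(x)|| > n/2}.
    return norm(T(v)) > n / 2


def covering_holds(n, x_n, x):
    # The condition of the lemma: x is in A_n, or x - x_n is in A_n.
    shifted = (x[0] - x_n[0], x[1] - x_n[1])
    return in_A(n, x) or in_A(n, shifted)


rng = random.Random(1)
checks = []
for n in range(1, 21):
    x_n = (n, 0)  # T(x_n) = (2n, -n), so ||T(x_n)|| = 2n > n
    for _ in range(200):
        # Integer sample points near the origin, so that both branches occur.
        x = (rng.randint(-n, n), rng.randint(-n, n))
        checks.append(covering_holds(n, x_n, x))

all_ok = all(checks)
print("covering property holds on all samples:", all_ok)
```

Only the inequality $latex \|T(x_n)\|>n$ is used, exactly as in the triangle inequality argument.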
Finally, the series formed with the $latex x_n$ is absolutely convergent by construction, so the series “twisted” with Bernoulli coefficients are also absolutely convergent, and hence convergent since U is complete. Hence, all the conditions of the Main Lemma are satisfied, and we can conclude that there is a subsequence $latex (y_n)$ of $latex (x_n)$ such that
$latex y=\sum_{n\geq 1}{y_n}$
exists, and is in $latex A_n$ for infinitely many n; this means that
$latex \|T(y)\|>n/2$
for infinitely many n. But this is impossible, since T is defined everywhere and so $latex \|T(y)\|$ is a finite real number!
Now here is the proof of the lemma. Consider the series
$latex Y=\sum_{n\geq 1}{X_ng_n}$
as a random variable, which is defined almost surely by assumption. Note that any value of Y is nothing but the sum of a subseries of the original series with terms $latex g_n$. Let
$latex B_n=\{Y\in A_n\}$
so that the previous observation means that the desired conclusion is certainly implied by the condition
$latex \mathbf{P}(Y\text{ in infinitely many } A_n)>0.$
The event to study is
$latex I=\bigcap_{N\geq 1}{C_N}\ \text{with}\ C_N=\bigcup_{n\geq N}{B_n}.$
The events $latex C_N$ are decreasing, so the probability of their intersection I is the limit of the probabilities of the $latex C_N$, and each $latex C_N$ contains (hence has probability at least equal to that of) $latex B_N$. So if we can show that
$latex \mathbf{P}(B_n)\geq 1/4\ \ \ \ \ \ \ \ \ \ \ \ \ (*)$
(or any other positive constant) for all n, we will get
$latex \mathbf{P}(C_N)\geq 1/4,\ \text{and hence}\ \mathbf{P}(I)\geq 1/4>0,$
which gives the desired result. (In other words, we argue from a particularly simple case of the “difficult” direction in the Borel-Cantelli lemma).
Now let’s prove (*). We start with the identity
$latex \{\sum_{m}{X_mg_m}\in g_n+A_n\ \text{and}\ X_n=1\}=\{\sum_{m\not=n}{X_mg_m}\in A_n\ \text{and}\ X_n=1\}$
(for any n), which is a tautology. From the yet-unused assumption
$latex A_n\cup (g_n+A_n)=G,$
we then conclude that
$latex \{X_n=1\}\subset \{\sum_{m}{X_mg_m}\in A_n\}\cup \{\sum_{m\not=n}{X_mg_m}\in A_n\ \text{and}\ X_n=1\}=B_n\cup S_n,$
say. Therefore
$latex 1/2=\mathbf{P}(X_n=1)\leq \mathbf{P}(B_n)+\mathbf{P}(S_n)$.
But we claim that
$latex \mathbf{P}(S_n)\leq\mathbf{P}(B_n).$
Indeed, consider the random variables defined by
$latex Z_m=X_m\ \text{if}\ m\not=n,\ \ \ Z_n=1-X_n.$
Then we obtain
$latex S_n=\{\sum_{m}{Z_mg_m}\in A_n\ \text{and}\ Z_n=0\}$
but clearly the sequence $latex (Z_m)$ is also a sequence of independent Bernoulli random variables, so that
$latex \mathbf{P}(\sum_{m}{Z_mg_m}\in A_n\ \text{and}\ Z_n=0)=\mathbf{P}(\sum_{m}{X_mg_m}\in A_n\ \text{and}\ X_n=0)\leq\mathbf{P}(Y\in A_n)=\mathbf{P}(B_n)$
as desired. We are now done, since we have found that
$latex 1/2\leq 2\mathbf{P}(B_n)$
which is (*).
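The two inequalities used above can be verified exactly on a finite toy model. The following is a minimal sketch under assumptions that are not in Matheron's note: the group is the cyclic group Z/12Z, only 8 terms $latex g_n$ are used (so Y is a finite sum and all 256 coin outcomes can be enumerated), and the sets $latex A_n$ are random subsets enlarged until $latex A_n\cup(g_n+A_n)=G$ holds. The helper names `make_covering_set` and `exact_probabilities` are hypothetical, and indices in the code start at 0.

```python
from fractions import Fraction
from itertools import product
import random


def make_covering_set(modulus, g, rng):
    """Random subset A of Z/modulus, enlarged so that A u (g + A) is everything."""
    a = {x for x in range(modulus) if rng.random() < 0.3}
    for x in range(modulus):
        # Condition of the lemma: x in A or x - g in A.
        # Adding x to A never breaks the condition at another point.
        if x not in a and (x - g) % modulus not in a:
            a.add(x)
    return a


def exact_probabilities(modulus, gs, sets):
    """Return exact values of P(B_n) and P(S_n) for fair independent 0/1 coins."""
    k = len(gs)
    b_hits = [0] * k
    s_hits = [0] * k
    for xs in product((0, 1), repeat=k):
        y = sum(x * g for x, g in zip(xs, gs)) % modulus
        for n in range(k):
            # B_n: the full sum Y lies in A_n.
            if y in sets[n]:
                b_hits[n] += 1
            # S_n: X_n = 1 and the sum without its n-th term lies in A_n.
            if xs[n] == 1 and (y - gs[n]) % modulus in sets[n]:
                s_hits[n] += 1
    total = 2 ** k
    p_b = [Fraction(h, total) for h in b_hits]
    p_s = [Fraction(h, total) for h in s_hits]
    return p_b, p_s


rng = random.Random(0)
modulus, k = 12, 8
gs = [rng.randrange(modulus) for _ in range(k)]
sets = [make_covering_set(modulus, g, rng) for g in gs]
p_b, p_s = exact_probabilities(modulus, gs, sets)

for n in range(k):
    print(n, "P(B_n) =", p_b[n], " P(S_n) =", p_s[n])
```

Exact rational arithmetic is used so that the comparisons with 1/4 are not affected by rounding; the argument for (*) applies verbatim to finite sums, so every `p_b[n]` should be at least 1/4 and every `p_s[n]` at most `p_b[n]`.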
(In probabilistic terms, I think the trick of using $latex Z_m$ has something to do with “exchangeable pairs”, but I’m not entirely sure; in analytic terms, it translates to an instance of the invariance of Haar measure under translation on the compact group $latex (\mathbf{Z}/2\mathbf{Z})^{\mathbf{N}}$, as can be seen in the original write-up of Matheron).