Application of Approximate Diagonalization of Commutative C*-Algebras to Invariants

The premise of my dissertation is founded on the idea that matrices over C*-algebras are important and that approximate diagonalization would make dealing with such matrices easier. I have yet to find any useful application of my own result, though I have found some mildly amusing applications of results on matrices over commutative C*-algebras.

So we start with the main definition:

Definition. Let A be a C*-algebra and let n be a positive integer. A normal matrix a\in M_n(A) is approximately diagonalizable if for every \varepsilon > 0, there exist elements a_1, \dotsc, a_n\in A and a unitary u\in M_n(A) such that

\lVert uau^{*} - \mathrm{diag}(a_1,\dotsc,a_n) \rVert < \varepsilon.

Next, we present the theorem that we will be applying:

Theorem (Xue, [3]). Let X be a compact metrizable space. For every positive integer n, every self-adjoint matrix in M_n(C(X)) is approximately diagonalizable if and only if \dim(X) \leq 2 and \check{H}^2(X) = 0.
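To get a feel for what the theorem is saying, here is a numerical sketch for X = [0,1], where the hypotheses certainly hold. At each point t, numpy's `eigh` produces a unitary that diagonalizes the self-adjoint matrix a(t) exactly; the content of the theorem is that these pointwise unitaries can be patched into a single (approximately) diagonalizing unitary in M_2(C([0,1])), which the sketch does not prove — it only checks that the sorted eigenvalue functions vary continuously, so a continuous diagonal candidate exists. The particular matrix a(t) is my own illustrative choice.

```python
import numpy as np

def a(t):
    # A self-adjoint element of M_2(C([0,1])), sampled pointwise.
    return np.array([[t, 1.0], [1.0, -t]])

ts = np.linspace(0.0, 1.0, 101)
max_err = 0.0    # worst pointwise diagonalization error
max_jump = 0.0   # largest jump of the sorted eigenvalue functions
prev_w = None
for t in ts:
    w, v = np.linalg.eigh(a(t))   # a(t) = v @ diag(w) @ v^T, w ascending
    u = v.conj().T                # unitary with u a(t) u* = diag(w)
    err = np.linalg.norm(u @ a(t) @ u.conj().T - np.diag(w))
    max_err = max(max_err, err)
    if prev_w is not None:
        max_jump = max(max_jump, np.max(np.abs(w - prev_w)))
    prev_w = w

print(max_err)   # essentially zero: exact diagonalization at each sample
print(max_jump)  # small: the eigenvalue functions are continuous in t
```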

The first mildly interesting application is that on the connection between K-theory and approximate diagonalization of matrices over commutative C*-algebras.

Theorem. (Theorems 3.4 and 4.1 of [3]) Let X be a compact metrizable space. If for every positive integer n, every projection in M_n(C(X)) is approximately diagonalizable, then K_0(C(X)) \cong C(X,\mathbb{Z}).

Proof. First, note that the K_0 class of any diagonal projection is an element of C(X,\mathbb{Z}). This is because projections in C(X) are characteristic (indicator) functions of clopen subsets, and so the K_0-class of a diagonal projection is a sum of indicator functions of clopen subsets, which is a continuous, non-negative integer-valued function.

Given an approximately diagonalizable projection p, there exist projections p_1, p_2, \dotsc, p_n \in C(X) and a unitary u\in M_n(C(X)) such that

\lVert upu^{*} - \mathrm{diag}(p_1, p_2, \dotsc, p_n)\rVert < 1.

Note that p_1, p_2, \dotsc, p_n can be chosen to be projections by the stability of the projection relations (i.e. being self-adjoint and idempotent; see Lemma 2.5.4 of [1] for details). Since close projections (norm distance strictly less than 1) are unitarily equivalent (see Lemma 2.5.1 of [1]), their K_0-classes are the same. So every projection has the same K_0-class as a diagonal projection, and thus its class belongs to C(X, \mathbb{Z}). \Box

The observant reader can see that we actually proved that every approximately diagonalizable projection is in fact diagonalizable. Combining this with Xue’s theorem, we get the corollary:

Corollary. Let X be a compact metrizable space such that \dim(X) \leq 2 and \check{H}^2(X) = 0. Then K_0(C(X)) \cong C(X,\mathbb{Z}).

Perhaps of slightly more interest is the fact that we can use this method to prove the following:

Theorem. Let X be a compact metrizable space. If for every positive integer n, every positive matrix in M_n(C(X)) is approximately diagonalizable, then W(C(X)) \cong \mathrm{lsc}(X,\mathbb{N}\cup \{0\}).

Here W(A) = (A\otimes M_{\infty})_{+}/\sim denotes the Cuntz semigroup, where \sim denotes Cuntz equivalence of positive elements. The proof of this theorem falls out in the same way as the previous one: the Cuntz classes of positive elements in C(X) correspond to characteristic functions of open sets, which sum to non-negative integer-valued lower semicontinuous functions, and it is clear that every approximately diagonalizable matrix is Cuntz equivalent to a diagonal matrix.

When we combine this theorem with Xue’s theorem, we have the following corollary:

Corollary. Let X be a compact metrizable space such that \dim(X) \leq 2 and \check{H}^2(X) = 0. Then W(C(X)) \cong \mathrm{lsc}(X,\mathbb{N}\cup\{0\}).

This is a finite and unital version of Theorem 1.3 of [2]. Perhaps the only real upside is that this proof deals directly with positive elements, in contrast to Robert’s proof using Hilbert modules.

I’m still looking for some applications of approximate diagonalization, especially of my dissertation results. Any help in this respect would be appreciated.

[1] Lin, Huaxin. An Introduction to the Classification of Amenable C*-Algebras. World Scientific, 2001.

[2] Robert, Leonel. “The Cuntz semigroup of some spaces of dimension at most two.” C. R. Math. Acad. Sci. Soc. R. Can. 35:1 (2013) pp. 22–32.

[3] Xue, Yifeng. “Approximate Diagonalization of Self-Adjoint Matrices over C(M).” Funct. Anal. Approx. Comput. 2:1 (2010) pp. 53–65.

A Probabilistic Look at the Geometric Series

So talking about the St. Petersburg paradox has reminded me of my undergraduate years, including a certain quick (but kind of handwavy) proof of the formula of the geometric series shown to me by Dr. Vitaly Bergelson.

One takes a coin which has probability 0<p<1 of landing heads and probability q=1-p of landing tails, and plays a familiar game: flip the coin until it lands on tails. The possibilities are exhausted as follows: tails first; heads then tails; two heads then tails; three heads then tails; etc.; and, finally, heads forever. The last option has probability zero (not to be confused with impossible): for every n, the probability of always getting heads is at most the probability of getting n heads in the first n tosses, which is p^n, and p^n converges to 0 as n\to\infty; since the probability of always getting heads is less than all of these, it must be 0.

So we have to have one of these mutually exclusive options, and so their probabilities must add up to 1. The probability of n heads followed by 1 tails is p^n\cdot q. So we have that \sum_{n=0}^{\infty} p^nq=1, or if we divide by q, we get: \sum_{n=0}^{\infty} p^n= 1/q=1/(1-p).
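The bookkeeping above can be checked numerically (a sketch; the value of p is an arbitrary choice):

```python
# Check that the probabilities p^n * q of "n heads then tails" sum to 1,
# and hence that sum p^n = 1/(1-p).
p = 0.7
q = 1.0 - p

total = sum(p**n * q for n in range(200))   # tail beyond n=200 is negligible
geom  = sum(p**n for n in range(200))

print(total)             # close to 1
print(geom, 1 / (1 - p)) # partial geometric sum vs. the closed form
```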

So the geometric series comes out of the wash from playing a coin game. The question Dr. Bergelson asked next was what happens when instead of stopping at 1 tails, you stop at 2? What new formula do you get? I admit that I never got the answer to that question despite trying for some time, though I haven't thought about it since my undergraduate days.
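For anyone who wants to poke at Dr. Bergelson's follow-up question numerically: if the game stops at the second tails, a run with exactly n heads must end in tails, and the earlier tails can sit in any of the remaining n+1 positions, so the probability of seeing exactly n heads should be (n+1)p^n q^2. A quick check that these sum to 1, which would yield \sum_{n=0}^{\infty}(n+1)p^n = 1/(1-p)^2:

```python
# Explore the "stop at the second tails" game: with probability p of heads,
# a run with exactly n heads has n+1 slots for the earlier tails, giving
# probability (n+1) * p^n * q^2.
p = 0.6
q = 1.0 - p

total  = sum((n + 1) * p**n * q**2 for n in range(400))
series = sum((n + 1) * p**n for n in range(400))

print(total)                   # close to 1
print(series, 1 / (1 - p)**2)  # suggests sum (n+1) p^n = 1/(1-p)^2
```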

I Still Don’t Understand the St Petersburg Paradox

Let’s consider the following game: You pay a certain amount to play the game. I have a fair coin in my hand. At the beginning of each turn, I flip the coin. If it turns up tails, the next turn begins. If it turns up heads, I give you 2^n dollars, where n is the turn. So if I get heads on the first turn, you get $2. If I flip the coin and get tails, then another tails, and then heads, then I give you $8. The question is how much would you pay to play this game?

A reasonable measure for this sort of thing is the expected value. When you have a game with the probability of different cash values at the end, the expected value tells you what the average will be if you consider what happens over many plays of the same game. So it seems like a good choice for how much you should pay, but in this case the expected value is: \sum_{n=1}^{\infty} (\text{probability of the game ending on turn } n) \cdot 2^n = \sum_{n=1}^{\infty} (1/2^n) \cdot 2^n = \sum_{n=1}^{\infty} 1 = \infty.* So the expected value here is infinite! Okay, so that means we should spend any amount of money on this game, because no matter what, it’s a good deal.
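Both halves of this claim can be seen in a few lines (a sketch): capping the game at N turns gives an expected payout of exactly N, since each turn contributes exactly 1, and a simulation shows the same slow drift upward in the sample average.

```python
import random

def play(rng):
    # Flip until heads; payout is 2^n where n is the turn the heads appears.
    n = 1
    while rng.random() < 0.5:   # tails with probability 1/2
        n += 1
    return 2**n

# Exact: if the game is capped at N turns, the expectation is
# sum_{n=1}^{N} (1/2)^n * 2^n = N, which grows without bound.
def truncated_expectation(N):
    return sum((0.5**n) * (2**n) for n in range(1, N + 1))

print(truncated_expectation(10), truncated_expectation(100))

# Simulated: the empirical average keeps creeping up with more games.
rng = random.Random(0)
for games in (100, 10_000, 100_000):
    avg = sum(play(rng) for _ in range(games)) / games
    print(games, avg)
```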

So I claim, but no one would actually do such a thing. Hence, the paradox. Now the resolution I’ve heard again and again, due to Daniel Bernoulli, is the idea of utility and diminishing marginal value. The idea is that your first trillion dollars is much more valuable to you than your next, and money matters less to you the more of it you have. Now my problem with this solution is that I can modify the game based on your utility function so that the first couple of turns still give a small amount, but the higher turns give an absurd amount of money.

I’ve also heard that martingales and stopping times give a solution to the paradox, but I don’t know enough probability to understand it. So if someone does, would you be willing to give an explanation?

—–
*Notice we can compute the probability because we know if the game ended on the nth turn, then the coin tosses had to have gone n-1 tails and then heads. So the probability of getting that specific combination is 1/2^{n-1}\cdot 1/2=1/2^n.

Math in Plain English: What is a circle?

This is a continuation of yesterday’s post, but this is not part of the main story of getting towards my area of research.

I’m sure everyone roughly knows what a circle is, but let’s consult a dictionary for the word “circle.” Here is Merriam-Webster’s definition (with my own corrections): “a closed plane curve every point of which is equidistant from a fixed point (called the center) within the curve.” So there’s a center, and a circle is all the points that have the same distance from that center. Everything else is superfluous, even the bit about the “curve”. So ignoring the word “curve” for now, what’s striking is that nothing about Euclidean geometry is necessary here. You have a point (which we give the name center) and a distance (which is called the radius), and you look at all the points that have that distance from the center. So all we need is a notion of “point” and “distance.”

What a coincidence! That’s exactly what a metric space is! It’s, I don’t know, as though we tailor-made that definition so we can talk about circles or something. So we see that our usual picture of a circle, as a nicely round figure is what it is, because of our sense of distance. Just imagine spinning around with your arms stretched out, the figure traced out by your hands is, of course, a circle.

Before we go into examples, we have to clarify some language. In everyday English, we use the same word circle for both the outline of the shape and the shape including all the points inside. In addition, there’s the question of whether we include just the inside or both the inside and the outline. These things have widely different purposes in math. So to clear up the ambiguity we give them separate names: a circle is the outline, the points whose distance from the center equals the radius; an open disk is the inside, the points whose distance from the center is less than the radius; and the closed disk is both the inside and the outline, the points whose distance is less than or equal to the radius. “Inside,” as we will see, is not very useful, so it’s better to think in terms of distances. It should be noted that the words “open ball” and “closed ball” are often used instead of “open disk” and “closed disk.”

So looking back at our examples from last time:

  1. If we consider the major cities of the world, a circle centered at New York City with radius 300 miles consists of all the major cities exactly 300 miles away from New York, which is probably empty. But the disks are a bit more interesting: the open disk centered at New York City with radius 300 miles consists of all the major cities less than 300 miles away; for example, Philadelphia would be in the disk.
  2. In the second example, where we look at every point of the globe, circles are exactly the same as the image we have in mind. Disks are slightly different, since the globe isn’t flat, the disks look like domes rather than our flat disks.
  3. When we consider the distance to be the least time it takes to get between places, we get interesting results due to politics. For a disk centered at a city near a border, it may take more time to get across the border than to reach places that are physically farther away, so the disk might be shaped by the political boundaries. A more physical constraint occurs if one lives on a large island: it may take more time to board a boat and travel to a nearby city off the island than to travel to the other side of the island, even though the off-island city is physically closer. In this case it is the water that shapes the disk. So we see that we won’t get nice uniform shapes all the time. The center plays a much larger role in shaping the circles and disks than it did in the ordinary geometry case.
  4. If you live in Paris, then it’s straightforward to take a train to your favorite point. But if you don’t live in Paris and your radius is smaller than your distance to Paris, then you can’t make any transfers. So you can only travel along the one route that goes through Paris. Your circle is exactly two points: the two places along that one route exactly that distance away from the center. The disk is also strange: it’s the points along that one route that have distance less than the radius. But if your radius is bigger than the distance to Paris, you’re in luck. You can make any transfer you like, and so you can take any train, provided that you have enough radius left after getting to Paris to reach the point of your choice. Here, even in this generalized mathematical case (made symmetric and easier to deal with), the circles and disks not only depend on whether you are centered in Paris or not; different radii also fail to give scaled versions of the same circles and disks. The shape fundamentally changes, from a puny part of one train line to a part of one train line plus a little bit of every train line out of Paris, depending on whether the radius is large enough to let you reach Paris or not!
  5. So if you look at English words with the distance we described last time, then the open disk of radius 1/3 centered at the word “circle” would be all the words that begin with “cir-“, the closed disk of radius 1/3 centered at “circle” would be the words that begin with “ci-”, and the circle of radius 1/3 centered at “circle” would be the words that begin with “ci-” but whose third letter is not “r”.
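The last example can be made concrete in a few lines (a sketch; the tiny word list is my own, for illustration):

```python
def word_dist(a, b):
    # 1 divided by the first position where the words differ; 0 if equal.
    # A word that ends is treated as differing at the next position.
    if a == b:
        return 0.0
    k = 1
    while k <= min(len(a), len(b)) and a[k - 1] == b[k - 1]:
        k += 1
    return 1.0 / k

words = ["circle", "circus", "cider", "city", "center", "cat", "dog"]
c, r = "circle", 1 / 3

open_disk   = [w for w in words if word_dist(c, w) < r]
closed_disk = [w for w in words if word_dist(c, w) <= r]
the_circle  = [w for w in words if word_dist(c, w) == r]

print(open_disk)    # words agreeing with "circle" on "cir"
print(closed_disk)  # words beginning with "ci"
print(the_circle)   # words beginning "ci" whose third letter is not "r"
```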

The main lesson, if blog posts need lessons, is that the concept of circle depends on the notion of distance. So if you allow yourself to expand to a variety of notions of distance, you can expand to a variety of types of circles.

Math in Plain English: Topology I – Metric Spaces I

The series of posts that I am starting with this is an attempt to explain so-called higher-level mathematics to people who don’t have strong mathematical backgrounds. The ultimate goal will be to explain my area of study, operator algebras, which seems like an extraordinary task since the subject requires so much just to get started with what they are. In addition to the technical challenges, the goal here isn’t to build up superficial knowledge, like what the definition of a C^{\ast}-algebra is without knowing the context for why it is what it is. When trying to motivate mathematics, it seems as though the best option is to look at why things were first studied. So I will begin not quite at the very beginning, but with an important historical result, and for that I need to first talk about topology.

Topology, in oversimplifying terms, is the subject of qualitative features of geometric objects. Here there are lots of problems already. What do I mean by qualitative? What do I mean by geometric object? But what I mean is a bit too abstract in the fullest generality that topology often begins. So instead I will start, like most analysts, with a more quantitative and concrete notion of metric space.

A metric space, in short, is a list of the possible places (called points) to be and the distances between each pair of places. To put this in a bit more formal language:

Definition. A metric space is a collection of points along with a notion of distance from one point to the other satisfying the following conditions:

  1. The distance between any two points always remains the same. For example, the Earth and Mars have different distances from each other depending on the time, so we’re ruling out this sort of behavior. Our points are in that sense static, they don’t move around.
  2. The distance from one point to another is always a positive number or zero.
  3. The distance from a point to itself is zero and if the distance from one point to another is zero, then the two points are the same.
  4. There are no one-way streets, where it might be easier to go one direction than it is in the other. The distance from a starting point to an ending point is the same as the distance from end to start. So we can drop the “from” and “to” language and talk about the distance between two points.
  5. Pit stops won’t make the distance any shorter. If I go by someplace else first, the distance will be at least as long as compared to going straight to my destination.
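Conditions 2 through 5 can be checked mechanically for any finite table of distances. A sketch, with a made-up three-point space:

```python
import itertools

def is_metric(points, d):
    # Check conditions 2-5 on a finite set: non-negativity, distance zero
    # exactly for identical points, symmetry, and the "pit stop" inequality.
    for x, y in itertools.product(points, repeat=2):
        if d(x, y) < 0:
            return False                  # condition 2
        if (d(x, y) == 0) != (x == y):
            return False                  # condition 3
        if d(x, y) != d(y, x):
            return False                  # condition 4 (no one-way streets)
    for x, y, z in itertools.product(points, repeat=3):
        if d(x, z) > d(x, y) + d(y, z):
            return False                  # condition 5 (pit stops don't help)
    return True

# A made-up three-city example with symmetric distances.
table = {("A", "B"): 3, ("B", "C"): 4, ("A", "C"): 5}
def d(x, y):
    if x == y:
        return 0
    return table.get((x, y), table.get((y, x)))

print(is_metric(["A", "B", "C"], d))  # True
```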

So this definition might feel cumbersome, but it has an unambiguity that’s quite nice. For now, our class of geometric objects consists of (metric) spaces, and we can tell when we have one such space: all these conditions must hold. So let’s take some examples:

  1. Take a globe. Consider the points to be all the major cities that are labeled and the distance to be their distance as we measure it: the length of the shortest line we draw between the cities.
  2. Take the globe again. This time take every possible position on the globe with the same distance as before. So we see that we can get different spaces by considering more or fewer points.
  3. Take the globe once again. Consider the points to be all the major cities. The distance this time is the shortest time it would take to travel between the two cities. Here we might run into the problem that going from one city to another might not take the same time as the other way around. So here our points did not change, but our sense of distance changed. So none of these three spaces are considered the same, even though the points in this example are the same as in the first example.
  4. Consider a country with a major railway. The points this time are the stops, and the distance is the shortest distance you have to travel along the rails to get from one stop to the other. Examples like this inspire a mathematical abstraction known as the “British railway” or “French railway” metric.
  5. So despite usually associating these with places and distances, one of the things about mathematics is that these notions can generalize to ideas that have nothing to do with physical distance. So consider the points to be English words, and the distance to be 1 divided by the first place where the two words differ, with distance 0 if they are the same word. So “about” and “abacus” have a distance of 1/3, since they agree on “ab” but disagree on the third letter. For another example, “demon” and “demonstration” have a distance of 1/6, since the word “demon” ends before the sixth letter of “demonstration”.
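This last distance takes only a few lines to compute (a sketch):

```python
def word_dist(a, b):
    # 1 divided by the first position where the words differ; 0 if equal.
    # A word that ends is treated as differing at the next position.
    if a == b:
        return 0.0
    k = 1
    while k <= min(len(a), len(b)) and a[k - 1] == b[k - 1]:
        k += 1
    return 1.0 / k

print(word_dist("about", "abacus"))         # 1/3
print(word_dist("demon", "demonstration"))  # 1/6
print(word_dist("demon", "demon"))          # 0
```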

So the notion of a metric space gives us a starting point for talking about this qualitative study of geometry, which I will go into in more detail later.

Academia and Me

Ever since I can remember, I’ve liked school. And not in the way that normal people like school; I didn’t much like recess or sports or lunch breaks. I liked classes. I liked learning. Some part of this is nostalgia speaking; I probably didn’t like school as much as I am claiming now, and I certainly didn’t like homework as much as I claim. But it is true that I enjoyed school and schoolwork. Lectures suit my personality, reading is one of my favorite things to do, and the pursuit of knowledge feels like the primary purpose of my life. When I was a child, I once said that I wanted to be a teacher, because I thought it would be the closest thing to what I wanted to do. Then I realized what I really wanted was to be a professor, teaching and researching. Academia isn’t some higher calling. Academia is like my home. So I have a visceral reaction when people criticize it.

That’s not to say that academia doesn’t deserve criticism. There’s a lot wrong with the system right now, and there’s exciting movement to change things for the better: both in researching and in teaching. I never really criticized academia when I was a student, but when the roles were switched, it became apparent how many things are done haphazardly. Yet knowing this, I still have this gut reaction whenever someone says schools or teachers are corrupt, greedy, or obsolete. It sends shivers down my back when someone says the entire school model will be going down in a matter of decades.

Their point, often, is that schools don’t function to educate, especially not in practical matters, nor do they teach people to act creatively. If we were to shift paradigms, then the new education that would form would be better for everyone; people would learn willingly and properly. I would gladly join them, too. More than the bureaucracy or the buildings, what academia represents is the enterprise of learning new ideas and finding creative, deep thoughts.

But if I let go of what I have in the academia now, will the new one catch me as I fall?

The Trouble with Pedagogy involving the Infinite

There’s a problem I have with teaching certain mathematical concepts, especially when they involve the word “infinity.” In algebra, one of the ways this pops up is in interval notation, where if you have an unbounded interval, you use the symbols +\infty or -\infty to denote that it is not bounded on whichever side. One of the problems with this is that it now seems as though we’re treating “infinity” as though it’s a number. People have some conceptual notion of infinity prior to any formal knowledge of how mathematicians use the idea. Childhood number-topping games, trumped by crying out “Infinity,” only to be refuted with an “Infinity + 1,” come to mind. Encouraging this seems like a bad idea: if one ever gets to the point where questions about infinite sets or real numbers arise, the naive inclusion of infinity as a number causes trouble. The usual way of explaining it away is the unsatisfying answer that “infinity is a concept, not a number.” But that’s strange: what makes something a “concept,” and what, for that matter, is a “number”?

Then again, I have tried the opposite approach of introducing the extended real line in a calculus course. Of course, this often fails, because of the subtleties involving not having all arithmetic operations available. So I ask non-rhetorically, is it better for people to treat \pm \infty as part of an “extended” real number line, or to say “Infinity is a concept, but not a number”? If the latter, what are we trying to say with that?

As one final thought: yesterday, I talked about infinite sums. What is probably obvious to everyone who has dealt with analysis is that we do not, in fact, add infinitely many numbers together. The entire point is that we can find finite sums that are within any margin of error of our value, and therefore the infinite sum is that value. This is difficult to understand at first exposure, since it’s a loaded concept with so much behind it. But the intuitive idea of adding infinitely many things together is probably how most calculus students think of the idea. Isn’t this why people have so much trouble with the equation 0.\bar{9}=1?

Infinite Summation

Infinite summation over arbitrary index sets can be described using nets, but there is an elementary way of talking about these sums which is equivalent to using nets. Its elegance also shows that there is nothing mysterious or terribly subtle about infinite summation, as talk of nets might imply. The material in today’s post is taken nearly wholesale from Paul Halmos’s excellent book Introduction to Hilbert Space and the Theory of Spectral Multiplicity, which as far as I can tell is out of print.

Definition. Let X be a Banach space. We say that a family of vectors \{x_i\} for i\in I is summable, if there exists a vector x\in X such that for all \varepsilon>0 there exists a finite set F\subseteq I such that for any finite set G \supseteq F, we have \lVert \sum_{i\in G} x_i-x\rVert <\varepsilon. We write \sum_{i\in I}x_i=x.

Of course \mathbb{R} is a (real) Banach space and \mathbb{N} is a perfectly good indexing set. We get our usual definition of summation if we restrict our attention to G of the form \{1,2,\dotsc,n\}. This means that our definition of summation is more restrictive than our usual definition for sequences, but we notice that it is equivalent to having \sum_{i=1}^{\infty}x_{\sigma(i)} be a convergent series for all permutations \sigma. Equivalently, \{x_i\}_{i\in \mathbb{N}} is summable (in our sense) if and only if \sum_{i=1}^{\infty} x_i is absolutely convergent. Considering the advantages of absolute convergence over mere convergence, and the fact that convergence without absolute convergence makes little intuitive sense, it is tempting to prefer this definition over our traditional one. Alas, there are too many practical difficulties with this definition to introduce it in a calculus course.
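The contrast with ordinary convergence is easy to see numerically: the alternating harmonic series converges to \ln 2, but rearranging its terms (two positive terms for every negative one) changes the limit to (3/2)\ln 2, so the family \{(-1)^{n+1}/n\} is not summable in the unordered sense. A sketch:

```python
import math

N = 200_000

# Alternating harmonic series 1 - 1/2 + 1/3 - ... -> ln 2.
alt = sum((-1)**(n + 1) / n for n in range(1, N + 1))

# Rearrangement: two positive (odd-denominator) terms, then one negative
# (even-denominator) term; the same terms in a different order.
total, pos, neg = 0.0, 1, 2
for _ in range(N // 3):
    total += 1 / pos + 1 / (pos + 2) - 1 / neg
    pos += 4
    neg += 2

print(alt, math.log(2))            # close to ln 2
print(total, 1.5 * math.log(2))    # rearranged sum tends to (3/2) ln 2
```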

We immediately have the fact that linearity is preserved: \alpha\sum_{i\in I}x_i=\sum_{i\in I}\alpha\cdot x_i, \sum_{i\in I}x_i+\sum_{i\in I}y_i=\sum_{i\in I}(x_i+y_i), which passes from the fact that these are true when I is finite. (Though technically, one must show that the sums are unique.)

One cannot go without some analogue of a Cauchy criterion, and this one is quite slick:

Theorem (Cauchy Criterion). A family of vectors \{x_i\} is summable if and only if for all \varepsilon>0, there exists a finite set F\subseteq I such that for any finite set G\subseteq I such that F\cap G=\varnothing, we have \lVert \sum_{i\in G} x_i \rVert < \varepsilon.

The proof of this is not so bad, but enlightening. One direction is straightforward. Suppose \{x_i\} is summable with sum x. Then take a finite F\subseteq I such that for any finite F'\supseteq F one has \lVert\sum_{i\in F'}x_i-x\rVert <\varepsilon/2. Take a finite set G disjoint from F and consider F'=F\cup G; we then have \lVert\sum_{i\in G}x_i\rVert=\lVert\sum_{i\in F'}x_i-\sum_{i\in F}x_i\rVert\leq \lVert\sum_{i\in F'}x_i-x\rVert+\lVert x-\sum_{i\in F}x_i\rVert<\varepsilon/2+\varepsilon/2=\varepsilon. So that proves one direction. Now suppose the Cauchy condition holds. Let F_n\subseteq I be a finite set such that if G is finite and G\cap F_n=\varnothing, then \lVert\sum_{i\in G} x_i\rVert<1/n. By replacing F_n with F_1\cup F_2\cup \dotsb \cup F_n, we may assume that F_{n-1}\subseteq F_{n} for all n. (Halmos makes the observation here that \bigcup_{n\geq 1} F_n is countable, and if i\not\in \bigcup_{n\geq 1} F_n, then \lVert x_i\rVert<1/n for every n, so x_i=0; hence only countably many terms can be nonzero.) The partial sums \sum_{i\in F_n} x_i form a Cauchy sequence: if m>n, then \lVert \sum_{i\in F_m} x_i - \sum_{i\in F_n} x_i\rVert = \lVert \sum_{i\in F_m\setminus F_n} x_i \rVert<1/n. So the sequence converges to some x, and we see that if we take any finite G\supseteq F_n, then \lVert \sum_{i\in G} x_i -x\rVert\leq \lVert \sum_{i\in F_n} x_i-x\rVert + \lVert \sum_{i\in G\setminus F_n} x_i\rVert. The former converges to zero as n\to\infty and the latter is less than 1/n. \Box

So I quickly admit that I added nothing new from what Halmos had written, but I feel that this material is not as widely known as it should be. Certainly, one can generalize this by using nets, but the simplicity here works well and requires no exposition of directed sets and any difficulties that arise from dealing with more general nets.

My Favorite College Algebra Lecture

Well, it’s not an entire lecture, not even quite half a lecture. But my favorite part of a lecture in college algebra is the properties of exponentiation. Sounds strange, right? I basically list off some algebraic identities involving exponents, and I’m done. The value does not lie in the content, but in the presentation of the material, and how it comes the closest to what I enjoy about math.

We start with b > 0 and m,n natural numbers. We notice that b^{n}\cdot b^{m} is first multiplying b a total of n times, and then multiplying b another m times. So this is the same as multiplying b a total of m+n times, or b^{m+n}. Here I might take specific m,n, like m=4 and n=3, and write it out. So we get the equation b^{m+n}=b^{m}\cdot b^{n}.

Next, if we take a look at (b^m)^n, we can expand it out as multiplying b^m a total of n times. So b^m\cdot b^m\dotsb b^m; then, being a little suggestive, I expand each b^m out vertically, creating an array of b's. Then it becomes clear, by appealing to the area of a rectangle, that this quantity is b^{mn}, the result of multiplying b a total of mn times.

So here’s the part I love. So far, we have appealed to the nature of exponents as repeated multiplication to deduce some properties about it. I then say that these are nice algebraic properties that will make our lives easier when we have to deal with exponentials, wouldn’t it be nice if they were still true even after we consider more general m,n? So let’s pretend they do still hold and see what happens.

We first take a look at m=0, n=1, and by looking at the addition property we get that b^{1+0}=b^{1}\cdot b^{0}, and since 1+0=1 and b^1=b, we see that b=b\cdot b^0. If we divide both sides by b, we see that b^0=1. So we get a very specific value for b^0. It wasn’t something handed down from the heavens, it wasn’t some convoluted aspect of 0, or strange combinatorics. We just want the addition property to be true, and for it to be so, b^0 must be 1.

Then we take a look at negative numbers. So we let m be any positive whole number and use the addition property and notice that b^{m}\cdot b^{-m}=b^{m+(-m)}=b^{0}=1. So erasing the middlemen, we have b^{m}\cdot b^{-m}=1, or by dividing both sides by b^{m}, we have b^{-m}=1/b^{m}. Now we have that negative exponents correspond to the reciprocals. Again something that pops out, just because we wanted to keep that addition property.

Finally we look at rational numbers. First we deal with 1/n, and the other property finally comes into play: (b^{1/n})^n=b^{(1/n)\cdot n}=b^{1}=b. Cutting out the middlemen again, we see that (b^{1/n})^n=b. But there is a special name for a number whose nth power is equal to b (when b>0), and that is the nth root of b, written \sqrt[n]{b}. As for the arbitrary case, since m/n=m\cdot(1/n), we have b^{m/n}=(b^{m})^{1/n}=\sqrt[n]{b^m}.
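The consequences we derived by "pretending" can be spot-checked numerically (a sketch; the particular values of b, m, n are arbitrary):

```python
b = 5.0

# b^0 = 1 is forced by the addition property b^(1+0) = b^1 * b^0.
print(b**0)  # 1.0

# Negative exponents give reciprocals: b^m * b^(-m) = b^0 = 1.
m = 3
print(b**m * b**(-m))          # 1 (up to rounding)
print(b**(-m), 1.0 / b**m)     # the two sides of b^(-m) = 1/b^m

# Fractional exponents give roots: (b^(1/n))^n = b.
n = 4
print((b**(1.0 / n))**n, b)    # agree up to rounding
```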

Now, I may not have persuaded you that this is a good lecture, let alone one deserving of being called a favorite. The first thing to notice is that there are few hard computations involved; it’s mostly algebra. When these classes are taught, the first thing a (pure) mathematician notices is that they are deprived of any interesting abstract mathematical ideas. The teacher is forbidden (for good reason!) from straying too far into his own domain of general theory, and must focus on the particular, lest the students become confused and bewildered. The more headstrong mathematician-teacher charges ahead anyway, and is met with sighing, head-scratching, and yawning. Proving the quadratic formula by completing the square of an arbitrary quadratic function sounds like a great idea, but it’s a quick way to lose a room full of freshmen. This lecture has a wonderful balance: easy enough that the algebra clarifies rather than obfuscates, yet complicated enough that it says something actually interesting and not at first obvious.

Also take note that the progression of our ideas follows the same construction of the rationals from the natural numbers. We start with a counting device; consider repeated counting, also called addition, and then repeated addition, which we call multiplication. We consider 0, the neutral element of addition. Then we follow up with negative numbers or, in more algebraic terms, the additive inverses we get from extending our monoid of natural numbers to the group of integers, which turns out to be a ring. We then form the ring of fractions, which gives us multiplicative inverses. We follow this chain of constructions without mentioning that we are doing so, and so we implicitly form a picture of the way numbers work without having to explicitly talk about a great deal of abstract algebra.

But what seems like the most important thing I love about this is the premise of the lecture. We begin with an observation that exponents work in a certain way, when we consider natural numbers. In particular, it’s very natural, almost completely obvious. Now rather than making rules for the other numbers and noticing that we get the same properties, making them more complicated by having so many cases, we say “wouldn’t it be nice if this still held?” And the point isn’t “This is how things are,” but “we want this, and we accept the consequences of our choice.” Too often, math is considered a bunch of strict rules, like a code of law. “You are allowed this action, but not this one.” And I admit I encourage this view, especially when dealing with things like dividing fractions, or adding fractions or… pretty much anything involving fractions. But this isn’t how math is, not the kind I love so much. It allows for so much creativity and invention. The only truly mathematical law is “Be consistent,” which is admittedly more restrictive than at first appears. But we allow every twist and turn of every possible mathematical rule or system and allow a complete freedom provided one thing: we must accept the logical consequences of our choices. “Everything not forbidden is permitted.”

Thoughts on Talks: Pólya’s Theorem and Fourier Transforms

Pólya’s theorem states, colloquially speaking, that “a drunk man will eventually return home, but a drunk bird might not”: the simple random walk is recurrent in dimensions one and two, but transient in dimension three and higher. Though an interesting theorem in its own right, the novelty of today’s talk by resident probabilist Aaron Montgomery was that by using Fourier transforms (and copious applications of Fubini’s theorem), the theorem can be proved with relative ease and without any messy combinatorics. This explains the latter half of his title, “Local Central Limit Theorem for Z^d or: How I Learned to Stop Worrying and Love the Fourier Transform,” though as Aaron himself stated, the first part of the title did not actually make an appearance.

During the talk, I began to remember the rich connection between integrals and our respective fields. In functional analysis, we associate integrals with algebraic objects: linear functionals and inner products. In probability, the association is with expected values. But this is more than a surface-level association; it gives a depth to the subjects involved, and the seemingly straightforward yet stubbornly difficult integral becomes something alluring yet austere.

As a contribution, I wanted to talk about how functional analysis fits into this talk, but first I have to retell the beginning.

The basic setup is that one has a sequence \{X_i\} of independent, identically distributed random vectors in \mathbb{Z}^d. These vectors correspond to the steps that our random walker (or flyer, as the case may be) is taking. The example of interest to us is where the possible steps are the standard basis vectors and their negatives, each chosen with equal probability 1/(2d). We will call this the simple random walk. So in the case of d=2, we have a 1/4 chance of going up, down, left, or right. To keep track of the walker’s position, we just need to add up the vectors: S_n=\sum_{i=1}^{n}X_i.
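The setup above is easy to play with numerically. Here is a minimal sketch of the simple random walk (the function name and signature are my own invention, not from the talk):

```python
import random

def simple_random_walk(d, n, seed=0):
    """Simulate n steps of the simple random walk on Z^d.

    Each step is +/- a standard basis vector; each of the 2d
    possible steps is chosen with probability 1/(2d).
    Returns the position S_n as a list of d integers.
    """
    rng = random.Random(seed)
    position = [0] * d
    for _ in range(n):
        k = rng.randrange(d)        # pick a coordinate axis
        sign = rng.choice((-1, 1))  # pick a direction along that axis
        position[k] += sign
    return position

# In d = 2: up, down, left, right, each with probability 1/4.
print(simple_random_walk(2, 1000))
```

Pólya’s theorem is then a statement about how often such a simulated walker revisits the origin as n grows.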

We consider the characteristic functions \Phi_{X}(t)=\mathbb{E}(e^{it\cdot X}), and we have \Phi_{S_n}=\Phi_{X_1}^n; this follows quickly from the fact that the X_i are independent and identically distributed. But we pause here briefly to note that this rare numerical equality, of the integral of a product with the product of the integrals, is the very condition of independence, about which we usually rely on our intuitive notion. This quickly reminded me of Mark Kac’s monograph on the subject, but maybe more on that later this month.

The single-step characteristic function is computable; in the simple case we have \Phi_{X_1}(t)=\frac{1}{2d}\sum_{k=1}^{d}\left(e^{it_k}+e^{-it_k}\right)=\frac{1}{d}\sum_{k=1}^{d}\cos(t_k).
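This closed form can be checked directly against the defining expectation by averaging over the 2d equally likely steps; a small sketch for d=2 (function names are assumptions of mine):

```python
import cmath
import math

def phi_from_definition(t):
    # E[e^{i t·X}] for the d = 2 simple walk: average over the
    # 2d = 4 equally likely steps +/- e_1, +/- e_2.
    steps = [(1, 0), (-1, 0), (0, 1), (0, -1)]
    return sum(cmath.exp(1j * (t[0] * s[0] + t[1] * s[1]))
               for s in steps) / len(steps)

def phi_closed_form(t):
    # (1/d) * sum_k cos(t_k) with d = 2.
    return (math.cos(t[0]) + math.cos(t[1])) / 2

t = (0.3, -1.1)
print(abs(phi_from_definition(t) - phi_closed_form(t)))  # ~0
```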

It is at this point in the talk that we see functional analysis taking a peek, bidding us to invite it in. We notice that, taking the proper scaling of the usual Lebesgue measure on [-\pi,\pi]^d (normalized to total mass one), the functions f_x(t)=e^{it\cdot x} for x\in\mathbb{Z}^d form an orthonormal basis, and that by the definition of expected value, one has \Phi_{S_n}(t)=\sum_{x\in\mathbb{Z}^d}\mathbb{P}(S_n=x)e^{it\cdot x}. So our characteristic function can be written as a linear combination of our orthonormal basis; but more to the point, we can obtain numerical values for the coefficients of that sum by taking inner products!

So with a generalization of Parseval’s identity, we have that \langle \Phi_{X_1}^n, f_x \rangle = \mathbb{P}(S_n=x), or to put this in integral terms: \int_{[-\pi,\pi]^d}\!\Phi_{X_1}^n(t)\,e^{-it\cdot x}\,dm(t)=\mathbb{P}(S_n=x), where m is the normalized Lebesgue measure.
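This inversion formula can also be tried out numerically. The sketch below approximates the integral for the d=1 simple walk with a Riemann sum and recovers probabilities we know exactly, e.g. \mathbb{P}(S_2=0)=1/2 (the function name and discretization are my own choices, not part of the talk):

```python
import cmath
import math

def prob_Sn_equals_x(n, x, num_points=2000):
    """P(S_n = x) for the d = 1 simple walk via Fourier inversion:
    (1/2pi) * integral over [-pi, pi] of cos(t)^n * e^{-itx} dt,
    approximated by a midpoint Riemann sum (cos(t) = Phi_{X_1}(t))."""
    total = 0.0
    for j in range(num_points):
        t = -math.pi + (j + 0.5) * 2 * math.pi / num_points
        total += (math.cos(t) ** n * cmath.exp(-1j * t * x)).real
    # the factor 1/(2*pi) and the step width 2*pi/num_points cancel
    # down to a plain average
    return total / num_points

print(round(prob_Sn_equals_x(2, 0), 6))  # exact value is 1/2
```

Since the integrand is a trigonometric polynomial, the midpoint rule here is accurate to machine precision.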

The characteristic function lines up the information we need and sorts it out, and then the Fourier transform allows us to take it back out.

Of course, we still have computations to do, and they are not easy; but with this association between the probabilities of S_n and the characteristic function, we have something we can estimate, and from those estimates the theorem follows.