Unappreciated Exposition

During my time in China, I often talked with various people about the completion of the Elliott classification program for nuclear C*-algebras. Often, the main question was: Who’s going to write the book on the topic? The general answer seemed to be no one. Despite the usefulness of a good exposition on the subject, it seems that there is no incentive for writing such a book. And there is no incentive because no math department will consider such a book to be a worthwhile research accomplishment. And my chain of whys comes to a sudden halt here. I don’t know why expository work wouldn’t be rewarded (or even if this is always true). As things stand, the effort to read the proofs and to write a text for a slightly more general audience (by which I mean, more than the handful of people responsible for the proof) is judged not worth the opportunity cost of forgoing original research.

It may be the case that this is a particular problem in my field. I recall in my undergraduate days there was an attempt to write an exposition of the classification of finite simple groups (a much more difficult project, I presume). This may have more to do with where C*-algebras are right now, rather than being a permanent feature. Honestly, I don’t know, but it is disappointing to me that exposition of “known” results is less significant than even marginal original research.

Dropping Conditions on the Stone-Weierstrass Theorem

It’s a good exercise, in general, to take well-known theorems, drop one of their hypotheses, and see what happens. Can you find counterexamples? What is salvageable? As a C*-algebraist, one of my favorite theorems from baby Rudin to do this with is the Stone-Weierstrass theorem.

First, we recall the theorem as it’s presented in Rudin.

Theorem. Let K be a compact metric space. Let A be a self-adjoint subalgebra of C(K), the algebra of complex-valued continuous functions from K. If

  1. A vanishes nowhere (i.e. for every point x\in K, there exists a function f \in A such that f(x) \neq 0) and
  2. A separates points on K (i.e. for every x,y\in K such that x\neq y, there exists a function f\in A such that f(x) \neq f(y)),

then A is uniformly dense in C(K).

It’s well-known that K can be replaced by a compact Hausdorff space, but since baby Rudin doesn’t deal with abstract topological spaces, it’s stated for metric spaces. The proof first establishes the theorem for sublattices (in the real case), that is, for families of functions closed under max and min, and then one realizes that closed subalgebras (with the usual addition and multiplication) are closed sublattices, since the absolute value of a function can be used to recover the max and min operations.

The first generalization is the fact that if one replaces K with a locally compact space and the algebra C(K) with C_0(K) of continuous functions vanishing at infinity, then a similar theorem holds. This is clear from the fact that the subalgebra C_c(K) of continuous functions with compact support is dense in C_0(K).

The way I wrote the theorem lends itself especially well to the following exercise:

Exercise. What does the closure of A correspond to when condition (1) is dropped? What about condition (2)?

Before proceeding, I’d like to give a spoiler alert. I found this exercise to be fun, even though the answer probably should have been obvious from the start. It might be more fun to think about the question for a little bit before reading on, though those of you smarter than I am may find the exercise trivial.

Now if we drop the first condition, then A vanishes somewhere. If we denote Z = \{ x\in K\mid f(x) = 0 \text{ for all } f\in A \}, then we can regard C_0(K\setminus Z) as a self-adjoint subalgebra of C(K) by extending each function by 0 on Z. With this identification in mind, we see that A is a subalgebra of C_0(K\setminus Z) that satisfies conditions (1) (for the subspace K\setminus Z) and (2). So A is dense in C_0(K\setminus Z). Since Z is closed, A corresponds to dense subalgebras of C_0(U) where U is an open subset of K.

Dropping the second condition means that A can’t separate some points. So we define an equivalence relation on K by x \sim y if f(x) = f(y) for all f\in A. Now, I’m not sure if K/\sim is a compact metric space, but if we use the topological version of the theorem, we can show that the quotient space is a compact Hausdorff space by using the fact that the quotient is Hausdorff if and only if the relation \sim, as a subspace of K\times K, is closed in the product topology. Then we see that A is a subalgebra of C(K/\sim) satisfying conditions (1) and (2). So A corresponds to dense subalgebras of C(Q), where Q is a quotient space of K.

Of course, I was being a little sneaky here by not listing self-adjointness as a condition. But it seems clear that we can’t say much about non-self-adjoint subalgebras of C(K). In particular, if we take the classic example where K is the closed unit disk and A is the closed subalgebra generated by the constant function 1 and the map z\mapsto z, then we get the disk algebra, a proper subalgebra of C(K), where all the functions are holomorphic in the interior of the disk.

Life Lessons from the Mathematical Notion of Ordering

It is a well-known exercise to show that the field of complex numbers cannot be made into an ordered field. It’s a fun little exercise that showcases one of the unique abilities in math: being able to prove that certain things can’t exist. But this exercise doesn’t get at the whole truth; though the complex numbers don’t have a linear (or total) ordering that’s compatible with their algebraic structure, they do have plenty of partial orderings that are. This is not just a curiosity, either. The partial ordering of the complex numbers obtained by saying z \leq w if w - z is real and nonnegative is quite natural and used quite often, even when it’s not outright stated.

In fact, the further I got into my mathematical studies, the more pervasive partial orders were in comparison to linear orders. In my area, C*-algebras have partial orders and partially ordered abelian groups are important to their study with nary a linear order in sight. They are so pervasive in this area that one story goes that during a lecture, when the lecturer used the phrase “partially ordered group”, a well-known mathematician piped in with a “What do you mean ‘partially ordered group’? You mean it’s not ‘linearly ordered’? That’s called an ‘ordered group’.” Quite the contrast from the exercise in the beginning that presumed that an ordered field meant a linear order!

Since at this point, I’m probably losing most of my non-mathematician readers, let’s have a brief introduction. An ordering is the notion of “less than” or “greater than” between objects. You are probably familiar with the order on the real numbers, but that order has a special property: every two numbers are comparable. In other words, for any two numbers x,y, either: x \leq y or y\leq x, and the only time both statements are true is when x and y are the same number. This is what we’ll call a linear order, just as the real numbers fall into a line based on their order. But there are orderings where not every two objects are comparable.

To take an example, among the movies you have watched, you have a preference ordering, where one movie is “greater than” another if you like it more. I dare say that not every two movies are comparable. I know that I like both Star Wars and 2001: A Space Odyssey, but I don’t like one better than the other. But there are comparisons that can be made; Star Wars: A New Hope is better than Star Wars: The Phantom Menace.

Starting from childhood, I had an aversion to the concept of “favorites”. To be sociable, I would say something, but it always felt dishonest. I didn’t understand what it meant to have a “favorite” food, color, movie, TV show, or whatever. It wasn’t until I learned Zorn’s lemma that I could articulate what my problem was with the concept. When learning Zorn’s lemma, two distinct terms are defined for an ordering: a greatest element is an object that is greater than every other object, and a maximal element is an object such that no other object is greater than it. When the ordering is linear, these notions are the same. But when you allow incomparability, they’re different ideas. Notice that a greatest element is a Highlander notion: there can only be one. The concept is dominating and competitive. On the other hand, multiple maximal elements can coexist with each other. It’s a more peaceful notion. So it became clear at last: my preferences for movies are not linear, and while there are many maximal movies (which I choose among as my “favorite”), there is no greatest movie.
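
For readers who like to see such things spelled out, here is a minimal Python sketch of the difference between maximal and greatest elements. The three titles and the single comparison are only an illustration based on the movies mentioned above, and the function names are my own.

```python
# Illustrative preference data: a pair (a, b) means "b is strictly preferred to a".
movies = {"A New Hope", "The Phantom Menace", "2001: A Space Odyssey"}
strictly_better = {("The Phantom Menace", "A New Hope")}


def is_less(a, b):
    """True when b is strictly preferred to a."""
    return (a, b) in strictly_better


def maximal_elements(items):
    """Items that no other item beats."""
    return {x for x in items if not any(is_less(x, y) for y in items)}


def greatest_elements(items):
    """Items that beat every other item; at most one can exist."""
    return {x for x in items if all(is_less(y, x) for y in items if y != x)}


maximal = maximal_elements(movies)
greatest = greatest_elements(movies)
print("maximal:", sorted(maximal))
print("greatest:", sorted(greatest))
```

With this data there are two maximal movies and no greatest one, since A New Hope and 2001 are simply incomparable.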

But it doesn’t stop at preferences, and I’m certainly not unique in noticing this distinction. Many math books now have chapter flow charts showing which chapters you need for a given chapter. The reader can chart a path through a book that doesn’t follow a linear order. To a lesser extent, this is precisely what footnotes are: the main text of the book is linearly ordered, but every now and then there are partially ordered footnotes that split off from the linear text. While time flows in a linear fashion, ideas often do not. From one idea, many arise, a concept that is illustrated by the hyperlinked web.

Before I decide to chase the consequences of this idea down every rabbit hole, I ought to finish this blog post. To sum up, there’s a life lesson to be gained from studying order structures: there are more partial orderings than linear orderings; and maximal elements abound, while greatest elements are scarce.

Irrationality of Aggregate Preferences

I was an economics major back in my undergraduate days and one of the common complaints about the subject (well, at least before the crash) was something along the lines of: “Economists assume people are rational when they’re not.” A fair enough point, but usually the trouble with this assessment is that it is like complaining that someone assumes all numbers are rational when they’re not. It’s certainly true, but that word probably doesn’t mean what you think it does. One assumption that’s considered part of the assumption that people are rational is that their preferences form what mathematicians call a “preorder,” which basically consists of: (1) everything is preferable to itself and (2) if thing A is preferable to thing B and thing B is preferable to thing C, then thing A is preferable to thing C. Here “preferable” means “at least as good as.” Now this assumption, in my view, seems fair provided that we’re only talking about any given moment, which we generally are. Preferences change over time, but we’re just talking about a single moment, not long periods of time.

Now what’s interesting is that this notion of rationality fails once you consider groups of people. Consider three people choosing among three options: A, B and C. Person 1 prefers A to B to C; person 2 prefers C to A to B; and person 3 prefers B to C to A. Now they take a vote on what the preferences of the group should be: we see that 1 and 2 prefer A to B, so A is preferable to B; 1 and 3 prefer B to C, so B is preferable to C; and lastly, 2 and 3 prefer C to A. So A is preferable to B, which is preferable to C, but A is not preferable to C.
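
The vote above is easy to tally by hand, but here is a short Python sketch that does the counting, in case you want to try other rankings. The function name `majority_prefers` and the variable names are just my choices for illustration.

```python
from itertools import permutations

# Each ranking lists the options from most preferred to least preferred.
rankings = [
    ["A", "B", "C"],  # person 1
    ["C", "A", "B"],  # person 2
    ["B", "C", "A"],  # person 3
]


def majority_prefers(x, y, rankings):
    """True when more than half of the voters rank x above y."""
    votes = sum(r.index(x) < r.index(y) for r in rankings)
    return votes > len(rankings) / 2


# The group's strict preferences, as decided by pairwise majority vote.
group = {
    (x, y) for x, y in permutations("ABC", 2) if majority_prefers(x, y, rankings)
}

# Transitivity: whenever x beats y and y beats z, x should also beat z.
transitive = all(
    (x, z) in group
    for (x, y) in group
    for (y2, z) in group
    if y == y2 and x != z
)

print(sorted(group))
print("transitive:", transitive)
```

The group ends up with A over B, B over C, and C over A, so the transitivity check fails even though each individual ranking is perfectly consistent.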

Aggregating preferences leads to inconsistent preferences, which is why the public demands inconsistent things like lower taxes and more government programs, or why fans complain about some feature of a film and then complain when it changes in the sequel. Some people might actually be irrational in this sense, but that’s not why we witness this behavior.

I Still Don’t Understand the St Petersburg Paradox

Let’s consider the following game: You pay a certain amount to play the game. I have a fair coin in my hand. At the beginning of each turn, I flip the coin. If it turns up tails, the next turn begins. If it turns up heads, I give you 2^n dollars, where n is the turn. So if I get heads on the first turn, you get $2. If I flip the coin and get tails, then another tails, and then heads, then I give you $8. The question is how much would you pay to play this game?

A reasonable measure for this sort of thing is the expected value. When you have a game with probabilities of different cash values at the end, the expected value tells you what the average payout will be over many plays of the same game. So it seems like a good choice for how much you should pay, but in this case the expected value is: \sum_{n=1}^{\infty} P(\text{game ends on turn } n) \cdot 2^n = \sum_{n=1}^{\infty} \frac{1}{2^n} \cdot 2^n = \sum_{n=1}^{\infty} 1 = \infty.* So the expected value here is infinite! Okay, so that means we should spend any amount of money on this game, because no matter what it’s a good deal.

So I claim, but no one will actually do such a thing. Hence, the paradox. Now the resolution to this that I’ve heard, usually credited to Daniel Bernoulli rather than Euler, is the idea of utility and diminishing marginal value. The idea is that your first trillion dollars is much more valuable to you than your next, and you value additional money less the more you have. Now my problem with this solution is that I can modify the game based on your utility function so that the first couple of turns still give a small amount, but the higher turns give an absurd amount of money.

I’ve also heard that martingales and stopping times give a solution to the paradox, but I don’t know enough probability to understand it. So if someone does, would you be willing to give an explanation?

*Notice we can compute the probability because we know if the game ended on the nth turn, then the coin tosses had to have gone n-1 tails and then heads. So the probability of getting that specific combination is 1/2^{n-1}\cdot 1/2=1/2^n.
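
To watch the sum from the expected value computation grow, here is a small Python sketch. It assumes a capped version of the game that is called off (with no payout) if no heads has shown up after `max_turns` flips; the cap and the function names are my own additions, not part of the game as described.

```python
from fractions import Fraction


def probability_of_ending_on(turn):
    """Chance of turn - 1 tails followed by one heads with a fair coin."""
    return Fraction(1, 2**turn)


def payout(turn):
    """Dollars paid when the first heads shows up on the given turn."""
    return 2**turn


def truncated_expected_value(max_turns):
    """Expected payout when the game is called off after max_turns flips."""
    return sum(
        probability_of_ending_on(n) * payout(n) for n in range(1, max_turns + 1)
    )


for cap in (1, 10, 100):
    print(cap, truncated_expected_value(cap))
```

Every turn contributes exactly one dollar to the sum, so a game capped at N turns has expected value N dollars, and removing the cap lets the total grow without bound.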

Math in Plain English: What is a circle?

This is a continuation of yesterday’s post, but this is not part of the main story of getting towards my area of research.

I’m sure everyone knows roughly what a circle is, but let’s consult a dictionary for the word “circle.” Here’s Merriam-Webster’s definition (with my own corrections): “a closed plane curve every point of which is equidistant from a fixed point (called the center) within the curve.” So there’s a center and a circle is all the points that have the same distance from that center. Everything else is superfluous, even the bit about the “curve”. So ignoring the word “curve” for now, what’s striking is that nothing about Euclidean geometry is necessary here. You have a point (which we give the name center) and a distance (which is called the radius), and you look at all the points that have that distance from the center. So all we need is a notion of “point” and “distance.”

What a coincidence! That’s exactly what a metric space is! It’s, I don’t know, as though we tailor-made that definition so we can talk about circles or something. So we see that our usual picture of a circle, as a nicely round figure is what it is, because of our sense of distance. Just imagine spinning around with your arms stretched out, the figure traced out by your hands is, of course, a circle.

Before we go into examples, we have to clarify some language. In everyday English, we use the same word circle for both the outline of the shape and the shape including all the points inside. In addition, there’s a problem of whether we just include those insides or have both insides and outline. These things have widely different purposes in math. So to clear up the ambiguity we give them separate names: a circle is the outline, so the points with the same distance; an open disk is the inside, so the points with distance within the radius from the center (but not equal to it); and the closed disk is both the inside and outline, the points with distance within or the same as the radius. “Inside” as we will see is not very useful, so it’s better to think in terms of distances. It should be noted that the words “open ball” and “closed ball” are often used instead of “open disk” and “closed disk.”

So looking back at our examples from last time:

  1. Consider the major cities of the world. A circle centered in New York City with radius 300 miles consists of all the major cities exactly 300 miles away from New York, which is probably nothing. But the disks are a bit more interesting: the open disk centered in New York City with radius 300 miles consists of all the major cities less than 300 miles away; for example, Philadelphia would be in the disk.
  2. In the second example, where we look at every point of the globe, circles are exactly the same as the image we have in mind. Disks are slightly different, since the globe isn’t flat, the disks look like domes rather than our flat disks.
  3. When we take the distance to be the least time it takes to get between places, we get interesting results due to politics. For a disk centered at a city near a border, it may take more time to get across the border to a city that is physically closer than to reach a farther city on the same side. So the disk might be shaped by the political boundaries. A more physical constraint occurs if one lives on a large island: it may take more time to get on a boat and travel to a nearby city off the island than to travel to the other side of the island, despite the nearby city being closer physically. In this case it is the water that is shaping the disk. So we see that we won’t get nice uniform shapes all the time. The center plays a much larger role in shaping the circles and disks than it did in the ordinary geometry case.
  4. If you live in Paris, then it’s straightforward to take a train to your favorite point. But if you don’t live in Paris and your radius is smaller than your distance to Paris, then you can’t make any transfers. So you can only travel along the one route that goes through Paris. Your circle is exactly two points: the two places along that one route exactly that distance away from the center. The disk is also strange: it’s the points along that one route that have distance less than the radius. But if your radius is bigger than the distance to Paris, you’re in luck. You can make any transfer you like, so you can take any train, provided that you have enough radius left after getting to Paris to go on to the point of your choice. Here, even in the idealized mathematical version, where everything is made symmetric and easier to deal with, the circles and disks depend on whether you are centered in Paris or not, and different radii don’t just give you scaled versions of the circles and disks. The shape fundamentally changes from a puny part of one train line to part of one train line plus a little bit of every train line out of Paris, depending on whether the radius is large enough to let you get to Paris or not!
  5. If you look at English words with the distance we described last time, then the open disk of radius 1/3 centered at the word “circle” would be all the words that begin with “cir-”, the closed disk of radius 1/3 centered at “circle” would be the words that begin with “ci-”, and the circle of radius 1/3 centered at “circle” would be the words that begin with “ci-” but whose third letter is not “r”.
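
The last example is concrete enough to try on a computer. Below is a rough Python sketch that computes the open disk, closed disk, and circle of radius 1/3 centered at “circle”. The six-word list is a made-up stand-in for the English dictionary, and `word_distance` is my own name for the distance on words from last time (1 divided by the first position where two words differ).

```python
from fractions import Fraction


def word_distance(a, b):
    """1 divided by the first position where the words differ; 0 if equal."""
    if a == b:
        return Fraction(0)
    for position, (x, y) in enumerate(zip(a, b), start=1):
        if x != y:
            return Fraction(1, position)
    # One word ended before the other: they differ right after the shorter one.
    return Fraction(1, min(len(a), len(b)) + 1)


# A tiny made-up dictionary, just for illustration.
words = ["circle", "circus", "cinema", "city", "cat", "dog"]
center, radius = "circle", Fraction(1, 3)

open_disk = [w for w in words if word_distance(center, w) < radius]
closed_disk = [w for w in words if word_distance(center, w) <= radius]
circle = [w for w in words if word_distance(center, w) == radius]

print("open disk:", open_disk)
print("closed disk:", closed_disk)
print("circle:", circle)
```

In this small list the open disk catches the “cir-” words, the closed disk catches the “ci-” words, and the circle is what is left of the closed disk once the open disk is removed.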

The main lesson, if blog posts need lessons, is that the concept of circle depends on the notion of distance. So if you allow yourself to expand to a variety of notions of distance, you can expand to a variety of types of circles.

Math in Plain English: Topology I – Metric Spaces I

The series of posts that I am starting with this is an attempt to explain so-called higher-level mathematics to people who don’t have strong mathematical backgrounds. The ultimate goal will be to explain my area of study, operator algebras, which seems like an extraordinary task since the subject requires so much just to get started with what they are. In addition to the technical challenges, the goal here isn’t to build up superficial knowledge, like what the definition of a C^{\ast}-algebra is without knowing the context for why it is what it is. When trying to motivate a piece of mathematics, it seems as though the best option is to look at why it was first studied. So I will begin not quite at the very beginning, but at an important historical result, and for that I need to first talk about topology.

Topology, in oversimplified terms, is the study of qualitative features of geometric objects. Here there are lots of problems already. What do I mean by qualitative? What do I mean by geometric object? But what I mean is a bit too abstract in the full generality in which topology often begins. So instead I will start, like most analysts, with the more quantitative and concrete notion of a metric space.

A metric space, in short, is a list of the possible places (called points) to be and the distances between each pair of places. To put this in a bit more formal language:

Definition. A metric space is a collection of points along with a notion of distance from one point to the other satisfying the following conditions:

  1. The distance between any two points always remains the same. For example, the Earth and Mars have different distances from each other depending on the time, so we’re ruling out this sort of behavior. Our points are in that sense static, they don’t move around.
  2. The distance from one point to another is always a positive number or zero.
  3. The distance from a point to itself is zero and if the distance from one point to another is zero, then the two points are the same.
  4. There are no one-way streets, where it might be easier to go one direction than it is in the other. The distance from a starting point to an ending point is the same as the distance from end to start. So we can drop the “from” and “to” language and talk about the distance between two points.
  5. Pit stops won’t make the distance any shorter. If I go by someplace else first, the distance will be at least as long as compared to going straight to my destination.
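
If you have only finitely many points, conditions 2 through 5 can be checked mechanically. Here is a minimal Python sketch of such a check; the function name `metric_violations`, the labels, and the two small examples are my own inventions for illustration. Condition 1 can’t be tested this way, since it is really a statement that the distance is a fixed function.

```python
from itertools import product


def metric_violations(points, dist):
    """List the ways dist breaks conditions 2 to 5 on a finite set of points."""
    problems = []
    for p, q in product(points, repeat=2):
        if dist(p, q) < 0:
            problems.append(("negative distance", p, q))
        if (dist(p, q) == 0) != (p == q):
            problems.append(("zero distance rule", p, q))
        if dist(p, q) != dist(q, p):
            problems.append(("one-way street", p, q))
    for p, q, r in product(points, repeat=3):
        # Going from p to r via the pit stop q must not be shorter.
        if dist(p, r) > dist(p, q) + dist(q, r):
            problems.append(("shorter pit stop", p, q, r))
    return problems


# Three points on a number line with the usual distance.
number_line = metric_violations([0, 1, 3], lambda a, b: abs(a - b))

# Hypothetical travel times in hours between two made-up towns X and Y,
# where the trip takes longer in one direction.
hours = {("X", "Y"): 1, ("Y", "X"): 2}
travel_time = metric_violations(
    ["X", "Y"], lambda a, b: 0 if a == b else hours[(a, b)]
)

print(number_line)
print(travel_time)
```

The number line passes every check, while the travel times are flagged as a one-way street in both directions.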

So this definition might feel cumbersome, but it has an unambiguity that’s quite nice. For now, our class of geometric objects are (metric) spaces, and we can tell when we have one such space when all these conditions hold. So let’s take some examples:

  1. Take a globe. Consider the points to be all the major cities that are labeled and the distance to be their distance as we measure it: the length of the shortest line we draw between the cities.
  2. Take the globe again. This time take every possible position on the globe with the same distance as before. So we see that we can get different spaces by considering more or fewer points.
  3. Take the globe once again. Consider the points to be all the major cities. The distance this time is the shortest time it would take to travel between the two cities. Here we might run into the problem that going from one city to another might not take the same time as the other way around. So here our points did not change, but our sense of distance changed. So none of these three spaces are considered the same, even though the points in this example are the same as in the first example.
  4. Consider a country with a major railway. The points this time are the stops and the distance is the shortest distance you have to travel to get from one stop to the other. An example of this inspires a mathematical abstraction known as the “British railway” or “French railway” metric.
  5. Although we usually associate these notions with places and distances, one of the nice things about mathematics is that they can generalize to ideas that have nothing to do with physical distance. So consider the points to be English words and the distance to be 1 divided by the position of the first place where the two words differ, and 0 if they are the same word. So “about” and “abacus” have a distance of 1/3 since they agree on “ab” but disagree on the third letter. For another example, “demon” and “demonstration” have a distance of 1/6, since the word “demon” ends before “demonstration” does, so they first differ at the sixth position.
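
The distance on words in the last example fits in a few lines of Python. This is only a sketch: the function name `word_distance` is mine, and I’m treating a word that ends early as differing at the position right after its last letter, which matches the “demon” example.

```python
from fractions import Fraction


def word_distance(a, b):
    """1 divided by the first position where the words differ; 0 if equal."""
    if a == b:
        return Fraction(0)
    for position, (x, y) in enumerate(zip(a, b), start=1):
        if x != y:
            return Fraction(1, position)
    # One word is a beginning of the other, so they differ where it ends.
    return Fraction(1, min(len(a), len(b)) + 1)


print(word_distance("about", "abacus"))
print(word_distance("demon", "demonstration"))
print(word_distance("globe", "globe"))
```

Using exact fractions rather than decimals keeps values like 1/3 and 1/6 looking the way they do in the text.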

So the notion of a metric space gives us a starting point for talking about this qualitative study of geometry, which I will go into in more detail later.