# Stone Weierstrass Theorem

Let $X$ be a topological space and let $C(X,\mathbb{R})$ denote the set of all continuous functions $f:X\to\mathbb{R}$.

A family $\mathscr{F}$ of functions is an **algebra** if for every $f,g\in\mathscr{F}$ and any $c\in \mathbb{R}$, we have $f+g,fg, cf\in\mathscr{F}$.

- ex: $C([a,b],\mathbb{R})$ is an algebra for any closed interval $[a,b]\subset\mathbb{R}$.

An algebra $\mathscr{A}$ of functions $f:X\to \mathbb{R}$ **separates points** if for every pair of distinct points $x,y\in X$ there is a function $f\in \mathscr{A}$ such that $f(x)\neq f(y)$.

- ex: Let $\mathscr{P}\subset C([a,b],\mathbb{R})$ denote the set of polynomials on $[a,b]$. Then $\mathscr{P}$ is a *subalgebra* which separates points since it contains the polynomial $f(x)=x$.

An algebra $\mathscr{A}$ of functions $f:X\to \mathbb{R}$ **vanishes nowhere** if for every $x\in X$ there is a function $f\in\mathscr{A}$ such that $f(x)\neq 0$.

- ex: Let $\mathscr{P}\subset C([a,b], \mathbb{R})$ denote the set of polynomials on $[a,b]$. Then $\mathscr{P}$ is a *subalgebra* that vanishes nowhere since it contains the constant polynomial $f(x)=1$.

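These definitions come together in today's theorem:

**Stone Weierstrass Theorem.** Let $X$ be a compact Hausdorff space (for example, a compact metric space such as $[a,b]$) and let $\mathscr{A}$ be a subalgebra of $C(X,\mathbb{R})$ which separates points and vanishes nowhere. Then $\mathscr{A}$ is dense in $C(X,\mathbb{R})$ with respect to the supremum norm.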

In English, this just says that any continuous, real-valued function on a compact set can be approximated (with respect to the supremum norm) by some function in your algebra $\mathscr{A}$. In other words, choose any continuous function $f:X\to \mathbb{R}$ where $X$ is compact. Then for every $\epsilon >0$, we can find a function $g\in \mathscr{A}$ such that $$\|f-g\|=\sup_{x\in X}\{|f(x)-g(x)|\}<\epsilon.$$
Here's an example: take $X=[a,b]$ to be a closed interval in $\mathbb{R}$ and let $\mathscr{A}$ be the set of all polynomials on $[a,b]$. Then for any continuous $f:[a,b]\to\mathbb{R}$ and any $\epsilon>0$, we can find a polynomial $p:[a,b]\to\mathbb{R}$ such that $\|f-p\|< \epsilon$. This is the familiar Weierstrass Approximation Theorem! *Any continuous function on a closed interval can be approximated, as closely as you want, by a polynomial.* The Stone Weierstrass Theorem says that this result is still true if we replace $[a,b]$ by any compact set $X$, and if we replace the set of polynomials by any subalgebra of $C(X,\mathbb{R})$ which separates points and vanishes nowhere.
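To watch the Weierstrass Approximation Theorem in action, here is a minimal Python sketch. It uses Bernstein polynomials, one classical way of building approximating polynomials on $[0,1]$; this particular construction is an assumption of the sketch and is not needed anywhere else in this post. The helper names `bernstein` and `sup_error` are hypothetical, and the supremum norm is only estimated by sampling on a grid.

```python
from math import comb


def bernstein(f, n):
    """Return the degree-n Bernstein polynomial of f on [0, 1] as a callable."""
    # Sample f at the n + 1 equally spaced nodes k / n.
    coeffs = [f(k / n) for k in range(n + 1)]

    def p(x):
        return sum(
            c * comb(n, k) * x**k * (1 - x) ** (n - k)
            for k, c in enumerate(coeffs)
        )

    return p


def sup_error(f, g, a=0.0, b=1.0, samples=401):
    """Estimate sup |f - g| over [a, b] by sampling on a uniform grid."""
    xs = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    return max(abs(f(x) - g(x)) for x in xs)


def f(x):
    # A continuous target with a corner at x = 1/2 (so it is not a polynomial).
    return abs(x - 0.5)


# Estimated sup-norm error for increasing polynomial degree.
errors = {n: sup_error(f, bernstein(f, n)) for n in (4, 16, 64, 256)}
for n, err in errors.items():
    print(f"degree {n:3d}: estimated sup error = {err:.4f}")
```

The estimated error shrinks as the degree grows, which is exactly the statement "for every $\epsilon>0$ there is a polynomial $p$ with $\|f-p\|<\epsilon$" seen numerically.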

Now you might ask, "Why do we need $X$ to be compact?" and "Why must $\mathscr{A}$ separate points and vanish nowhere?" Today we'll see why these hypotheses are necessary. Next time we'll work through an exercise from Rudin's *Principles of Mathematical Analysis* (a.k.a. "Baby Rudin") to see the theorem in action.

## Why do we need compactness?

The best way to answer this question is to look at a counterexample. So let's consider $X=\mathbb{R}$ and let $\mathscr{A}$ be the subalgebra of all polynomials on $\mathbb{R}$. Then $\mathscr{A}$ separates points and vanishes nowhere, but $X$ is *not* compact. In this case, the theorem fails since the function $f(x)=e^x$ cannot be uniformly approximated on $\mathbb{R}$ by any polynomial! This is because any polynomial $p$ is dominated by its leading term, a constant multiple of some $x^n$, and $e^x$ tends to $\infty$ much faster than does $x^n$ as $x\to\infty$ (even if $n$ is very large). As a result, $|e^x-p(x)|$ is unbounded, so the distance $\|e^x-p\|$ cannot be made arbitrarily small as $x$ ranges over all of $\mathbb{R}$.

But if we restrict ourselves to a closed and bounded interval, the smaller terms in $p$ have more weight, and this allows us to approximate $e^x$ by $p(x)$ with as much accuracy as we want. And this is exactly what we've all done in undergraduate calculus! You remember those problems. The $n$th degree Taylor polynomial of $e^x$ centered at 0 is $$e^x\approx \sum_{k=0}^n\frac{x^k}{k!},$$ and a typical homework question might've been something like, "For what values of $x$ is this approximation accurate to within 0.00001?". The answer would be $|x|<\delta$ for some constant $\delta$ which you could find using Taylor's Inequality. This $[-\delta,\delta]$ is precisely the compact set we need to restrict to in order to obtain a good approximation.
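The contrast between a bounded interval and all of $\mathbb{R}$ can be checked numerically. The sketch below is a rough illustration, not a proof: the helper names `taylor_exp`, `max_error` and `taylor_bound` are hypothetical, the maximum is only estimated on a grid, and the bound used is the usual form of Taylor's Inequality for $e^x$ on $[-\delta,\delta]$, namely $e^{\delta}\delta^{n+1}/(n+1)!$.

```python
from math import exp, factorial


def taylor_exp(x, n):
    """Degree-n Taylor polynomial of e^x centered at 0."""
    return sum(x**k / factorial(k) for k in range(n + 1))


def max_error(n, delta, samples=2001):
    """Estimate max |e^x - T_n(x)| over [-delta, delta] on a uniform grid."""
    xs = [-delta + 2 * delta * i / (samples - 1) for i in range(samples)]
    return max(abs(exp(x) - taylor_exp(x, n)) for x in xs)


def taylor_bound(n, delta):
    """Taylor's Inequality bound on [-delta, delta]: e^delta * delta^(n+1) / (n+1)!."""
    return exp(delta) * delta ** (n + 1) / factorial(n + 1)


# Fix the degree and widen the interval: the error is tiny on a small
# interval but grows quickly as delta increases.
n = 5
for delta in (0.5, 1.0, 2.0, 4.0, 8.0):
    err = max_error(n, delta)
    bound = taylor_bound(n, delta)
    print(f"delta = {delta:4.1f}: error ~ {err:.3e}, Taylor bound = {bound:.3e}")
```

For a fixed degree the error blows up as the interval widens, while on any fixed interval $[-\delta,\delta]$ raising the degree drives the error down. No single polynomial works on all of $\mathbb{R}$.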

## Why must the algebra separate points?

Again we'll consider a counterexample. This time let $X=[a,b]\subset\mathbb{R}$ and take $\mathscr{A}$ to be the collection of all polynomials $p:[a,b]\to\mathbb{R}$ such that $p(a)=p(b)$. It's easy to check that this forms an algebra, and it clearly does not separate points, since no $p\in\mathscr{A}$ distinguishes $a$ from $b$. To see where the Stone Weierstrass Theorem fails, simply choose any continuous function $f:[a,b]\to\mathbb{R}$ such that $f(a)\neq f(b)$. Then we *cannot* approximate $f$ by *any* polynomial $p\in\mathscr{A}$ because there is an $\epsilon>0$ such that $\|f-p\|\geq \epsilon$ for every $p\in\mathscr{A}$. In fact, $\epsilon=|f(b)-f(a)|/2$ does the job.

This isn't too hard to show. Let $M=\max\{|f(a)-p(a)|,|f(b)-p(b)|\}$ and observe from the picture above that $\|f-p\|\geq M$, since the supremum over $[a,b]$ is at least the value at either endpoint. We want to show* $$\|f-p\|\geq M\geq \frac{|f(b)-f(a)|}{2}.$$ To see this, let $m$ denote the common value $p(a)=p(b)$, assume WLOG $f(a)< f(b)$, and suppose $f(a)< m < f(b)$. If, for instance, $m=(f(a)+f(b))/2$, then $M=(f(b)-f(a))/2$ and the claim is true.

Otherwise, if, say, $m$ lies in-between $(f(a)+f(b))/2$ and $f(b)$ (see insert on the left), then $M=|f(a)-m|$ which is greater than $|f(a)-f(b)|/2$ as claimed. The case where $m$ lies between $f(a)$ and $(f(a)+f(b))/2$ is symmetric, with $M=|f(b)-m|$. And if $m\leq f(a)$ or $m\geq f(b)$, then $M\geq |f(b)-f(a)|$ is even larger and again the claim holds.
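For readers who prefer to skip the case analysis, the same bound also follows from one application of the triangle inequality. Here is a short derivation, with $m$ the common value $p(a)=p(b)$ and $M$ the larger of the two endpoint errors, as before:

$$|f(b)-f(a)|\leq |f(b)-m|+|m-f(a)| = |f(b)-p(b)|+|p(a)-f(a)|\leq 2M.$$

Dividing by 2 gives $M\geq |f(b)-f(a)|/2$, and the right-hand side does not depend on $p$.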

Alternatively, we could also choose $\mathscr{A}$ to be the set of constant functions on $[a,b]$ (this definitely does not separate points). Then, for example, the function $f(x)=e^x$ can't be approximated by any constant $c$ since $\|e^x-c\|$ is bounded below by $\frac{|e^b-e^a|}{2}$ (using the same argument as above).
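The lower bound for constants is easy to sanity-check numerically. The sketch below is only an illustration on the assumed interval $[a,b]=[0,1]$: the helper name `sup_dist_to_constant` is hypothetical, the supremum is estimated on a grid that includes both endpoints, and only constants between $e^a$ and $e^b$ are scanned.

```python
from math import exp


def sup_dist_to_constant(f, c, a, b, samples=1001):
    """Estimate sup |f(x) - c| over [a, b] on a uniform grid (endpoints included)."""
    xs = [a + (b - a) * i / (samples - 1) for i in range(samples)]
    return max(abs(f(x) - c) for x in xs)


a, b = 0.0, 1.0  # assumed interval for the illustration
lower_bound = (exp(b) - exp(a)) / 2

# Scan constants c between e^a and e^b; j = 100 gives the midpoint value.
candidates = [exp(a) + (exp(b) - exp(a)) * j / 200 for j in range(201)]
errors_by_c = {c: sup_dist_to_constant(exp, c, a, b) for c in candidates}

best_c = min(errors_by_c, key=errors_by_c.get)
best_err = errors_by_c[best_c]

print(f"lower bound (e^b - e^a)/2 = {lower_bound:.6f}")
print(f"best constant c = {best_c:.6f} with estimated sup error {best_err:.6f}")
```

No constant does better than the bound, and the best one sits halfway between $e^a$ and $e^b$, which matches the case $m=(f(a)+f(b))/2$ in the argument above.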

## Why must the algebra vanish nowhere?

Suppose $X=[0,1]$ and let's take $\mathscr{A}$ to be the set of all continuous functions $p:[0,1]\to\mathbb{R}$ such that $p(0)=0$ (one easily checks that this is an algebra). Then any continuous function $f$ which is not zero at zero can't be approximated by any $p\in\mathscr{A}$! The supremum of $|f(x)-p(x)|$ for $x$ in $[0,1]$ is bounded below by $|f(0)-p(0)|=|f(0)|.$ For instance take $f(x)=x+3$. Then $\|f-p\|$ is *at least* 3.

So there you go! Each of the conditions in the Stone Weierstrass Theorem is indeed necessary. Next week we'll use the theorem to solve this exercise from *Baby Rudin*:

- (Rudin, *PMA* #7.20) If $f$ is continuous on $[0,1]$ and if $\int_0^1f(x)x^n\;dx=0$ for all $n=0,1,2,\ldots,$ prove that $f(x)=0$ on $[0,1]$.

*Footnote:*

* We don't want to let $M=\max\{|f(a)-p(a)|,|f(b)-p(b)|\}$ be our $\epsilon$ since we need $\epsilon$ to be independent of the polynomial $p$. (For the conclusion of the Stone-Weierstrass Theorem to fail, there must exist a function $f\in C(X,\mathbb{R})$ and an $\epsilon>0$ such that $\|f-p\|\geq \epsilon$ for all $p\in\mathscr{A}$. So $\epsilon$ may depend on $f$, but not on $p$.)