At the Interface of Algebra and Statistics
My thesis uses basic tools from quantum physics to investigate mathematical structure that is both algebraic and statistical.
What do I mean? Well, the dissertation is about 130 pages long, which I realize is a lot to chew on. So I made a 10-minute introductory video! It gives a brief tour of the paper and describes what I think is the quickest way to get a feel for what's inside.
Now, let me highlight an important point that I make in the video:
I wrote my dissertation with a wide audience in mind.
In particular, there is a great deal of exposition woven into the mathematics that provides intuition and motivation for the ideas. I’ve also sprinkled several “behind the scenes” snippets throughout, and alongside the propositions, lemmas, and corollaries there are Takeaways that summarize key ideas. Several of these key ideas are introduced through simple examples that are placed before—not after—the theory they're meant to illustrate. And in a happy turn of events, there is a low entrance fee for following the mathematics. The main tools are linear algebra and basic probability theory. And yes, there is some category theory, too!
Without further ado, here is the table of contents.
And here is the official abstract.
This thesis takes inspiration from quantum physics to investigate mathematical structure that lies at the interface of algebra and statistics. The starting point is a passage from classical probability theory to quantum probability theory. The quantum version of a probability distribution is a density operator, the quantum version of marginalizing is an operation called the partial trace, and the quantum version of a marginal probability distribution is a reduced density operator. Every joint probability distribution on a finite set can be modeled as a rank one density operator. By applying the partial trace, we obtain reduced density operators whose diagonals recover classical marginal probabilities.
In general, these reduced densities will have rank higher than one, and their eigenvalues and eigenvectors will contain extra information that encodes subsystem interactions governed by statistics. We decode this information—and show it is akin to conditional probability—and then investigate the extent to which the eigenvectors capture "concepts" inherent in the original joint distribution.
The theory is then illustrated with an experiment. In particular, we show how to reconstruct a joint probability distribution on a set of data by glueing together the spectral information of reduced densities operating on small subsystems. The algorithm naturally leads to a tensor network model, which we test on the even-parity dataset. Turning to a more theoretical application, we also discuss a preliminary framework for modeling entailment and concept hierarchy in natural language—namely, by representing expressions in the language as densities.
Finally, initial inspiration for this thesis comes from formal concept analysis, which finds many striking parallels with the linear algebra. The parallels are not coincidental, and a common blueprint is found in category theory. We close with an exposition on free (co)completions and how the free-forgetful adjunctions in which they arise strongly suggest that in certain categorical contexts, the "fixed points" of a morphism with its adjoint encode interesting information.
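To make the opening of the abstract concrete, here is a minimal NumPy sketch of the passage from a joint probability distribution to a rank one density operator and its reduced densities. The 2 x 3 distribution `p` and all variable names are made up for illustration and are not taken from the thesis. The construction assumed here is the standard one: take the unit vector whose entries are the square roots of the joint probabilities, and form the projection onto it.

```python
import numpy as np

# Hypothetical joint distribution p(x, y) on a 2 x 3 set (rows: x, columns: y).
p = np.array([[0.10, 0.20, 0.10],
              [0.30, 0.05, 0.25]])
nx, ny = p.shape

# Vector with entries sqrt(p(x, y)); it has unit norm because p sums to 1.
psi = np.sqrt(p).reshape(-1)

# Rank one density operator: the orthogonal projection onto psi.
rho = np.outer(psi, psi)

# View rho with four indices (x, y, x', y') so the partial traces are easy to write.
rho4 = rho.reshape(nx, ny, nx, ny)

# Partial trace over Y (sum over y = y') gives the reduced density on X.
rho_X = np.einsum("ayby->ab", rho4)
# Partial trace over X (sum over x = x') gives the reduced density on Y.
rho_Y = np.einsum("xaxb->ab", rho4)

# Classical marginals, computed directly from p for comparison.
marginal_X = p.sum(axis=1)
marginal_Y = p.sum(axis=0)

print("diag(rho_X):", np.diag(rho_X), "marginal on X:", marginal_X)
print("diag(rho_Y):", np.diag(rho_Y), "marginal on Y:", marginal_Y)
print("rank(rho):", np.linalg.matrix_rank(rho))
print("rank(rho_X):", np.linalg.matrix_rank(rho_X))
print("eigenvalues of rho_X:", np.linalg.eigvalsh(rho_X))
```

The diagonals of `rho_X` and `rho_Y` agree with the classical marginals, while the off-diagonal entries, and hence the eigenvalues and eigenvectors, carry the extra information the abstract refers to. For this particular `p`, the joint density has rank one but the reduced density on X has rank two.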
I think the mathematics being investigated here is wonderfully interesting, and I hope you'll find it interesting, too. In a later blog post, I'll share my plans for what's coming next.
In the meantime, enjoy the math!
A note on the art.
I made the video on an iPad Pro with an Apple Pencil 2 using Notability and Enlight Videoleap. The images in my dissertation were hand drawn on the iPad using Procreate. The images on this website are all hand drawn using a pen and paper, then uploaded to my computer for light editing with Adobe Photoshop.