Language, Statistics, & Category Theory, Part 3

Welcome to the final installment of our mini-series on the new preprint "An Enriched Category Theory of Language," joint work with John Terilla and Yiannis Vlassopoulos. In Part 2 of this series, we discussed a way to assign sets to expressions in language — words like "red" or "blue" – which served as a first approximation to the meanings of those expressions. Motivated by elementary logic, we then found ways to represent combinations of expressions — "red or blue" and "red and blue" and "red implies blue" — using basic constructions from category theory.

I like to think of Part 2 as a commercial advertising the benefits of a category theoretical approach to language, rather than a merely algebraic one. But as we observed in Part 1, algebraic structure is not all there is to language. There's also statistics! And far from being an afterthought, those statistics play an essential role as evidenced by today's large language models discussed in Part 0.

Happily, category theory already has an established set of tools that allow one to incorporate statistics in a way that's compatible with the considerations of logic discussed last time. In fact, the entire story outlined in Part 2 has a statistical analogue that can be repeated almost verbatim. In today's short post, I'll give a lightning-quick summary.

It all begins with a small, yet crucial, twist.

Read More →

Language, Statistics, & Category Theory, Part 2

Part 1 of this mini-series opened with the observation that language is an algebraic structure. But we also mentioned that thinking merely algebraically doesn't get us very far. The algebraic perspective, for instance, is not sufficient to describe the passage from probability distributions on corpora of text to syntactic and semantic information in language that we see in today's large language models. This motivated the category theoretical framework presented in a new paper I shared last time. But even before we bring statistics into the picture, there are some immediate advantages to using tools from category theory rather than algebra. One example comes from elementary considerations of logic, and that's where we'll pick up today.

Let's start with a brief recap.

Read More →

Language, Statistics, & Category Theory, Part 1

In the previous post I mentioned a new preprint that John Terilla, Yiannis Vlassopoulos, and I recently posted on the arXiv. In it, we ask a question motivated by the recent successes of the world's best large language models:

What's a nice mathematical framework in which to explain the passage from probability distributions on text to syntactic and semantic information in language?

To understand the motivation behind this question, and to recall what a "large language model" is, I'll encourage you to read the opening article from last time. In the next few blog posts, I'll give a tour of mathematical ideas presented in the paper towards answering the question above. I like the narrative we give, so I'll follow it closely here on the blog. You might think of the next few posts as an informal tour through the formal ideas found in the paper.

Now, where shall we begin? What math are we talking about?

Let's start with a simple fact about language.

Language is algebraic.

By "algebraic," I mean the basic sense in which things combine to form a new thing. We learn about algebra at a young age: given two numbers $x$ and $y$ we can multiply them to get a new number $xy$. We can do something similar in language. Numbers combine to give new numbers, and words and phrases in a language combine to give new expressions. Take the words red and firetruck, for example. They can be "multiplied" together to get a new phrase: red firetruck.

Here, the "multiplication" is just concatenation: sticking things side by side. This is a simple algebraic structure, and it's inherent to language. I'm concatenating words together as I type this sentence. That's algebra! Another word for this kind of structure is compositionality, where things compose together to form something larger.
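As a rough illustration of that algebraic structure, here is a minimal Python sketch (not taken from the paper) that models expressions as tuples of words and "multiplication" as concatenation. The names `Phrase`, `concat`, and `EMPTY` are hypothetical, chosen only for this example. It checks the two properties that make concatenation behave like a multiplication: it is associative, and the empty phrase acts as a unit.

```python
from typing import Tuple

Phrase = Tuple[str, ...]  # an expression is modeled as a tuple of words

EMPTY: Phrase = ()  # the empty phrase plays the role of the unit


def concat(x: Phrase, y: Phrase) -> Phrase:
    """'Multiply' two expressions by sticking them side by side."""
    return x + y


red: Phrase = ("red",)
firetruck: Phrase = ("firetruck",)
big: Phrase = ("big",)

# red * firetruck = red firetruck
red_firetruck = concat(red, firetruck)

# Associativity: (big * red) * firetruck == big * (red * firetruck)
left = concat(concat(big, red), firetruck)
right = concat(big, concat(red, firetruck))

# Unit laws: concatenating with the empty phrase changes nothing
unit_ok = concat(EMPTY, red) == red == concat(red, EMPTY)
```

Note that concatenation is not commutative: "red firetruck" and "firetruck red" are different expressions, so word order is part of the structure.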

So language is algebraic or compositional.

Read More →

Warming Up to Enriched Category Theory, Part 2

Let's jump right in to where we left off in part 1 of our warm-up to enriched category theory. If you'll recall from last time, we saw that the set of truth values $\{0, 1\}$ and the unit interval $[0,1]$ and the nonnegative extended reals $[0,\infty]$ were not just sets but actually preorders and hence categories. We also hinted at the idea that a "category enriched over" one of these preorders (whatever that means; we hadn't defined it yet!) looks something like a collection of objects $X,Y,\ldots$ where there is at most one arrow between any pair $X$ and $Y$, and where that arrow can further be "decorated with," or simply replaced by, a number from one of those three exemplary preorders.

With that background in mind, my goal in today's article is to say exactly what a category enriched over a preorder is. The formal definition — and the intuition behind it — will then pave the way for the notion of a category enriched over an arbitrary (and sufficiently nice) category, not just a preorder.

En route to this goal, it will help to make a couple of opening remarks.

Two things to think about.

First, take a closer look at the picture on the right. I've written "$\text{hom}(X,Y)$" in quotation marks because the notation $\text{hom}(-,-)$ is often used for a set of morphisms in ordinary category theory. But the point of this discussion is that we're not just interested in sets! So we should use better notation: let's refer to the number associated to a pair of objects $X$ and $Y$ as $\mathcal{C}(X,Y)$, where the letter "$\mathcal{C}$" reminds us there's an (enriched) $\mathcal{C}$ategory being investigated.

Second, for the theory to work out nicely, it turns out that preorders need a little more added to them.

Read More →

Warming Up to Enriched Category Theory, Part 1

It's no secret that I like category theory. It's a common theme on this blog, and it provides a nice lens through which to view old ideas in new ways — and to view new ideas in new ways! Speaking of new ideas, my coauthors and I are planning to upload a new paper on the arXiv soon. I've really enjoyed the work and can't wait to share it with you. But first, you'll have to know a little something about enriched category theory. (And before that, you'll have to know something about ordinary category theory... here's an intro!) So that's what I'd like to introduce today.

A warm up, if you will.

What is enriched category theory?

As the name suggests, it's like a "richer" version of category theory, and it all starts with a simple observation. (Get your category theory hats on, people. We're jumping right in!)

In a category, you have some objects and some arrows between them, thought of as relationships between those objects. Now in the formal definition of a category, we usually ask for a set's worth of morphisms between any two objects, say $X$ and $Y$. You'll typically hear something like, "The hom set $\text{hom}(X,Y)$ bla bla...."

Now here's the thing. Quite often in mathematics, the set $\text{hom}(X,Y)$ may not just be a set. It could, for instance, be a set equipped with extra structure. You already know lots of examples. Let's think about linear algebra, for a moment.

Read More →

The Fibonacci Sequence as a Functor

Over the years, the articles on this blog have spanned a wide range of audiences, from fun facts (Multiplying Non-Numbers), to undergraduate level (The First Isomorphism Theorem, Intuitively), to graduate level (What is an Operad?), to research level. Today's article is more on the fun-fact side of things, along with—like most articles here—an eye towards category theory.

So here's a fun fact about greatest common divisors (GCDs) and the Fibonacci sequence $F_1,F_2,F_3,\ldots$, where $F_1=F_2=1$ and $F_n:=F_{n-1} + F_{n-2}$ for $n>2$. For all $n,m\geq 1$,

$$\gcd(F_n,F_m) = F_{\gcd(n,m)}.$$

In words, the greatest common divisor of the $n$th and $m$th Fibonacci numbers is the Fibonacci number whose index is the greatest common divisor of $n$ and $m$. (Here's a proof.) Upon seeing this, your "spidey senses" might be tingling. Surely there's some structure-preserving map $F$ lurking in the background, and this identity means it has a certain nice property. But what is that map? And what structure does it preserve? And what's the formal way to describe the nice property it has?

The short answer is that the natural numbers $\mathbb{N}=\{1,2,3,\ldots\}$ form a partially ordered set (poset) under division, and the function $F\colon \mathbb{N}\to\mathbb{N}$ defined by $n\mapsto F_n:=F(n)$ preserves meets: $F_n\wedge F_m = F(n\wedge m)$.
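If you'd like to see the identity in action before reading a proof, here is a small Python sketch that checks it numerically. In the divisibility poset the meet of two numbers is their greatest common divisor, so "preserves meets" amounts to $\gcd(F_n,F_m)=F_{\gcd(n,m)}$. The function names `fib` and `preserves_meets` are my own, and a finite check like this is only a sanity test, not a proof.

```python
from math import gcd


def fib(n: int) -> int:
    """Return the n-th Fibonacci number, with F_1 = F_2 = 1."""
    a, b = 0, 1  # (F_0, F_1)
    for _ in range(n):
        a, b = b, a + b
    return a


def preserves_meets(limit: int) -> bool:
    """Check gcd(F_n, F_m) == F_gcd(n, m) for all 1 <= n, m <= limit."""
    return all(
        gcd(fib(n), fib(m)) == fib(gcd(n, m))
        for n in range(1, limit + 1)
        for m in range(1, limit + 1)
    )


# Example: n = 12, m = 18, so gcd(n, m) = 6
example = (gcd(fib(12), fib(18)), fib(6))
```

For instance, $F_{12}=144$ and $F_{18}=2584$ have greatest common divisor $8$, which is exactly $F_6$.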

Read More →

Language Modeling with Reduced Densities

Today I'd like to share with you a new paper on the arXiv—my latest project in collaboration with mathematician Yiannis Vlassopoulos (Tunnel, IHES). To whet your appetite, let me first set the stage. A few months ago I made a 10-minute introductory video to my PhD thesis, which was an investigation into mathematical structure that is both algebraic and statistical. In the video, I noted that natural language is an example of where such mathematical structure can be found.

Language is algebraic, since words can be concatenated to form longer expressions. Language is also statistical, since some expressions occur more frequently than others.
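To make the statistical half a little more concrete, here is a toy Python sketch. The corpus below is entirely made up for illustration, and the helper name `continuation_probability` is hypothetical. It simply counts how often one word follows another, which is the crudest possible way of saying that some expressions occur more frequently than others; it is not the method used in the paper.

```python
from collections import Counter

# A tiny, invented corpus; "." is treated as an ordinary token.
toy_corpus = (
    "orange fruit is sweet . orange fruit is ripe . "
    "orange juice is cold . orange idea"
).split()

# Count adjacent word pairs (bigrams).
bigrams = Counter(zip(toy_corpus, toy_corpus[1:]))


def continuation_probability(first: str, second: str) -> float:
    """Estimate how likely `second` is to follow `first` in the toy corpus."""
    total = sum(count for (w, _), count in bigrams.items() if w == first)
    return bigrams[(first, second)] / total if total else 0.0


p_fruit = continuation_probability("orange", "fruit")
p_idea = continuation_probability("orange", "idea")
```

In this invented corpus, "fruit" follows "orange" twice as often as "idea" does. Real corpora are vastly larger, but the principle of reading off statistics from counts is the same.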

As a simple example, take the words "orange" and "fruit." We can stick them together to get a new phrase, "orange fruit." Or we could put "orange" together with "idea" to get "orange idea." That might sound silly to us, since the phrase "orange idea" occurs less frequently in English than "orange fruit." But that's the point. These frequencies contribute something to the meanings of these expressions. So what is this kind of mathematical structure? As I mention in the video, it's helpful to have a set of tools to start exploring it, and basic ideas from quantum physics are one source of inspiration. I won't get into this now—you can watch the video or read the thesis! But I do want to emphasize the following: In certain contexts, these tools provide a way to see that statistics can serve as a proxy for meaning. I didn't explain how in the video. I left it as a cliffhanger.

But I'll tell you the rest of the story now.

Read More →

What is an Adjunction? Part 3 (Examples)

Welcome to the last installment in our mini-series on adjunctions in category theory. We motivated the discussion in Part 1 and walked through formal definitions in Part 2. Today I'll share some examples. In Mac Lane's well-known words, "adjoint functors arise everywhere," so this post contains only a tiny subset of examples. Even so, I hope they'll help give you an eye for adjunctions and enhance your vision to spot them elsewhere.

An adjunction, you'll recall, consists of a pair of functors $F\dashv G$ between categories $\mathsf{C}$ and $\mathsf{D}$ together with a bijection of sets, as below, for all objects $X$ in $\mathsf{C}$ and $Y$ in $\mathsf{D}$.

$$\text{hom}_{\mathsf{D}}(FX,Y)\cong \text{hom}_{\mathsf{C}}(X,GY)$$

In Part 2, we illustrated this bijection using a free-forgetful adjunction in linear algebra as our guide. So let's put "free-forgetful adjunctions" first on today's list of examples.

Read More →

What is an Adjunction? Part 2 (Definition)

Last time I shared a light introduction to adjunctions in category theory. As we saw then, an adjunction consists of a pair of opposing functors $F$ and $G$ together with natural transformations $\text{id}\to GF$ and $FG\to\text{id}$. We compared this to two stricter scenarios: one where the composite functors equal the identities, and one where they are naturally isomorphic to the identities. The first scenario defines an isomorphism of categories. The second defines an equivalence of categories. An adjunction is third on the list.

In the case of an adjunction, we also ask that the natural transformations—called the unit and counit—somewhat behave as inverses of each other. This explains why the ${\color{red}\text{arrows}}$ point in opposite directions. (It also explains the "co.") Except, they can't literally be inverses since they're not composable: one involves morphisms in $\mathsf{C}$ and the other involves morphisms in $\mathsf{D}$. That is, their (co)domains don't match. But we can fix this by applying $F$ and $G$ so that (a modified version of) the unit and counit can indeed be composed. This brings us to the formal definition of an adjunction.
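For readers who want a preview, the standard way to make "behave as inverses" precise is via the so-called triangle identities. The notation here is my own shorthand, assuming $F\colon\mathsf{C}\to\mathsf{D}$ and $G\colon\mathsf{D}\to\mathsf{C}$, with unit $\eta\colon\text{id}_{\mathsf{C}}\to GF$ and counit $\varepsilon\colon FG\to\text{id}_{\mathsf{D}}$. For every object $X$ in $\mathsf{C}$ and $Y$ in $\mathsf{D}$,

$$\varepsilon_{FX}\circ F(\eta_X)=\text{id}_{FX} \qquad\text{and}\qquad G(\varepsilon_Y)\circ \eta_{GY}=\text{id}_{GY}.$$

The first composite lives entirely in $\mathsf{D}$ and the second entirely in $\mathsf{C}$, which is how applying $F$ and $G$ repairs the mismatch of (co)domains.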

Read More →

What is an Adjunction? Part 1 (Motivation)

Some time ago, I started a "What is...?" series introducing the basics of category theory:

Today, we'll add adjunctions to the list. An adjunction is a pair of functors that interact in a particularly nice way. There's more to it, of course, so I'd like to share some motivation first. And rather than squeezing the motivation, the formal definition, and some examples into a single post, it will be good to take our time: Today, the motivation. Next time, the formal definition. Afterwards, I'll share examples.

Indeed, I will make the admittedly provocative claim that adjointness is a concept of fundamental logical and mathematical importance that is not captured elsewhere in mathematics.
- Steve Awodey (in Category Theory, Oxford Logic Guides)

Read More →

Limits and Colimits, Part 3 (Examples)

Once upon a time, we embarked on a mini-series about limits and colimits in category theory. Part 1 was a non-technical introduction that highlighted two ways mathematicians often make new mathematical objects from existing ones: by taking a subcollection of things, or by gluing things together. The first route leads to a construction called a limit, the second to a construction called a colimit.

The formal definitions of limits and colimits were given in Part 2. There we noted that one speaks of "the (co)limit of [something]." As we've seen previously, that "something" is a diagram—a functor from an indexing category to your category of interest. Moreover, the shape of that indexing category determines the name of the (co)limit: product, coproduct, pullback, pushout, etc.

In today's post, I'd like to solidify these ideas by sharing some examples of limits. Next time we'll look at examples of colimits. What's nice is that all of these examples are likely familiar to you—you've seen (co)limits many times before, perhaps without knowing it! The newness is in viewing them through a categorical lens. 

Read More →

Announcing Applied Category Theory 2019

Hi everyone. Here's a quick announcement: the Applied Category Theory 2019 school is now accepting applications! As you may know, I participated in ACT2018, had a great time, and later wrote a mini-book based on it. This year, it's happening again with new math and new people! As before, it consists of a five-month-long online school that culminates in a week-long conference (July 15-19) and a week-long research workshop (July 22-26, described below). Last year we met at the Lorentz Center in the Netherlands; this year it'll be at Oxford.

Daniel Cicala and Jules Hedges are organizing the ACT2019 school, and they've spelled out all the details in the official announcement, which I've copied and pasted below. Read on for more! And please feel free to spread the word. Do it quickly, though. The deadline is soon!


Read More →

Notes on Applied Category Theory

Have you heard the buzz? Applied category theory is gaining ground! But, you ask, what is applied category theory? Upon first seeing those words, I suspect many folks might think one of two thoughts:

  1. Applied category theory? Isn't that an oxymoron?
  2. Applied category theory? What's the hoopla? Hasn't category theory always been applied?

For those thinking thought #1, I'd like to convince you the answer is No way! It's true that category theory sometimes goes by the name of general abstract nonsense, which might incline you to think that category theory is too pie-in-the-sky to have any impact on the "real world." My hope is to convince you that that's far from the truth.

For those thinking thought #2, yes, it's true that ideas and results from category theory have found applications in computer science and quantum physics (not to mention pure mathematics itself), but these are not the only applications to which the word applied in applied category theory is being applied.

So what is applied category theory?

Read More →

Limits and Colimits, Part 2 (Definitions)

Welcome back to our mini-series on categorical limits and colimits! In Part 1 we gave an intuitive answer to the question, "What are limits and colimits?" As we saw then, there are two main ways that mathematicians construct new objects from a collection of given objects: 1) take a "sub-collection," contingent on some condition or 2) "glue" things together. The first construction is usually a limit, the second is usually a colimit. Of course, this might've left the reader wondering, "Okay... but what are we taking the (co)limit of?" The answer? A diagram. And as we saw a couple of weeks ago, a diagram is really a functor.

Read More →

A Diagram is a Functor

Last week was the start of a mini-series on limits and colimits in category theory. We began by answering a few basic questions, including, "What ARE (co)limits?" In short, they are a way to construct new mathematical objects from old ones. For more on this non-technical answer, be sure to check out Limits and Colimits, Part 1. Towards the end of that post, I mentioned that (co)limits aren't really related to limits of sequences in topology and analysis (but see here). There is however one similarity. In analysis, we ask for the limit of a sequence. In category theory, we also ask for the (co)limit OF something. But if that "something" is not a sequence, then what is it?

Answer: a diagram.

Read More →

Limits and Colimits, Part 1 (Introduction)

I'd like to embark on yet another mini-series here on the blog. The topic this time? Limits and colimits in category theory! But even if you're not familiar with category theory, I do hope you'll keep reading. Today's post is just an informal, non-technical introduction. And regardless of your categorical background, you've certainly come across many examples of limits and colimits, perhaps without knowing it! They appear everywhere--in topology, set theory, group theory, ring theory, linear algebra, differential geometry, number theory, algebraic geometry. The list goes on. But before diving in, I'd like to start off by answering a few basic questions.

Read More →

The Yoneda Lemma

Welcome to our third and final installment on the Yoneda lemma! In the past couple of weeks, we've slowly unraveled the mathematics behind the Yoneda perspective, i.e. the categorical maxim that an object is completely determined by its relationships to other objects. Last week we divided this maxim into two points...

Read More →

The Yoneda Embedding

Last week we began a discussion about the Yoneda lemma. Though rather than stating the lemma (sans motivation), we took a leisurely stroll through an implication of its corollaries - the Yoneda perspective, as we called it:

An object is completely determined by its relationships to other objects,
by what the object "looks like" from the vantage point of each object in the category.

But this left us wondering, What are the mathematics behind this idea? And what are the actual corollaries? In this post, we'll work to discover the answers.

Read More →

The Yoneda Perspective

In the words of Dan Piponi, it "is the hardest trivial thing in mathematics." The nLab catalogues it as "elementary but deep and central," while Emily Riehl nominates it as "arguably the most important result in category theory." Yet as Tom Leinster has pointed out, "many people find it quite bewildering."

And what are they referring to?

The Yoneda lemma.

"But," you ask, "what is the Yoneda lemma? And if it's just a lemma then - my gosh - what's the theorem?"

Read More →

Naming Functors

Mathematicians are a creative bunch, especially when it comes to naming things. And category theorists are no exception. So here's a little spin on this xkcd comic. It's inspired by a recent conversation I had on Twitter and, well, every category theory book ever.

Read More →

Group Elements, Categorically

On Monday we concluded our mini-series on basic category theory with a discussion on natural transformations and functors. This led us to make the simple observation that the elements of any set are really just functions from the single-point set {✳︎} to that set. But what if we replace "set" by "group"? Can we view group elements categorically as well? The answer to that question is the topic for today's post, written by guest-author Arthur Parzygnat.

Read More →

What is a Natural Transformation? Definition and Examples, Part 2

Continuing our list of examples of natural transformations, here is Example #2 (double dual space of a vector space) and Example #3 (representability and Yoneda's lemma).

Read More →

What is a Natural Transformation? Definition and Examples

I hope you have enjoyed our little series on basic category theory. (I know I have!) This week we'll close out by chatting about natural transformations which are, in short, a nice way of moving from one functor to another. If you're new to this mini-series, be sure to check out the very first post, What is Category Theory Anyway? as well as What is a Category? and last week's What is a Functor?

Read More →

What is a Functor? Definitions and Examples, Part 2

Continuing yesterday's list of examples of functors, here is Example #3 (the chain rule from multivariable calculus), Example #4 (contravariant functors), and Example #5 (representable functors).

Read More →

What is a Functor? Definition and Examples, Part 1

Next up in our mini-series on basic category theory: functors! We began this series by asking What is category theory, anyway? and last week walked through the precise definition of a category along with some examples. As we saw in example #3 in that post, a functor can be viewed as an arrow/morphism between two categories.

Read More →

What is a Category? Definition and Examples

As promised, here is the first in our triad of posts on basic category theory definitions: categories, functors, and natural transformations. If you're just now tuning in and are wondering what is category theory, anyway? be sure to follow the link to find out!

A category $\mathsf{C}$ consists of some data that satisfy certain properties...

Read More →

What is Category Theory Anyway?

A quick browse through my Twitter or Instagram accounts, and you might guess that I've had category theory on my mind. You'd be right, too! So I have a few category-theory themed posts lined up for this semester, and to start off, I'd like to (attempt to) answer the question, What is category theory, anyway? for anyone who may not be familiar with the subject.

Now rather than give you a list of definitions--which are easy enough to find and may feel a bit unmotivated at first--I thought it would be nice to tell you what category theory is in the grand scheme of (mathematical) things. You see, it's very different than other branches of math....

Read More →

The Most Obvious Secret in Mathematics

Yes, I agree. The title for this post is a little pretentious. It's certainly possible that there are other mathematical secrets that are more obvious than this one, but hey, I got your attention, right? Good. Because I'd like to tell you about an overarching theme in mathematics - a mathematical mantra, if you will. A technique that mathematicians use all the time to, well, do math. 

Read More →