What is a Good Quantum Encoding? Part 1

Over the past couple of years, I've been learning a little about the world of quantum machine learning (QML) and the sorts of things people are thinking about there. I recently gave a high-level talk on some of these ideas in connection with a December 2024 preprint called "Towards Structure-Preserving Quantum Encodings", coauthored with collaborators at Deloitte (Andrew Vlasic and Anh Pham) and MIT (Arthur Parzygnat). I spoke on this at the AWM Research Symposium this past May and have decided to write it up in a series of blog posts, as well.

In short, our preprint translates one aspect of an open problem in QML into the language of category theory, in hopes that casting the problem in a more mathematically formal light might lead us to new tools and techniques that could help. I'll explain all of this in this series of articles.

Of course, the real question is: Did this help the QML problem? Right now, it's too early to know. Our preprint was written for a QML audience, and we assumed no familiarity with category theory. Still, it takes a while for ideas to spread. So, one goal for writing this series is just to get these ideas "out there" and see if they're of interest to anybody.

Another goal is just to share with everyone some topics that I personally think are interesting.

So here we go!

What is Quantum Machine Learning?

Since around 2013, QML has explored, among other things, whether quantum computers offer advantages for solving "real world" machine learning tasks. In a nutshell, quantum computers store, process, and share information in ways that are fundamentally different from everyday computers. If you're not familiar with this kind of hardware, there are lots of explainer videos on YouTube — there's a nice one by MKBHD and Cleo Abrams — but here's one way I try to explain the concept:

I think most people know, at least vaguely, that computers like to think in terms of 0s and 1s, called bits. Physically, these are instantiated by current flowing or not flowing through a little piece of material, like silicon, called a transistor, and your computer chips are composed of billions of these little transistors. They are very tiny!

But now imagine taking individual atoms of silicon and storing information on them.

Then the analogues of current flowing or not flowing (which are represented by the numbers 0 or 1) are now, for example, two different energy states of the atom. That's basically what a quantum computer is. Except, people prefer to use other kinds of atoms instead of silicon, like ytterbium or rubidium. Or sometimes people prefer to work with atoms that have been ionized. Or they prefer to use other quantum objects that behave like atoms, like electrons, or photons, or "artificial atoms" like superconducting qubits. As that last one suggests, these basic quantum units for processing information are called quantum bits, or qubits, for short.
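If it helps to see this in symbols, here's a minimal sketch of how a qubit's state is usually modeled mathematically: as a unit vector in $\mathbb{C}^2$, with the two energy states playing the role of basis vectors. (Python with NumPy is just my choice for illustration here; no actual hardware is involved.)

```python
import numpy as np

# The two basis states of a qubit, modeled as unit vectors in C^2.
ket0 = np.array([1, 0], dtype=complex)  # one energy state, playing the role of 0
ket1 = np.array([0, 1], dtype=complex)  # another energy state, playing the role of 1

# Unlike a classical bit, a qubit can sit in a superposition:
# any unit-length complex combination of the two basis states.
psi = (ket0 + ket1) / np.sqrt(2)
print(np.linalg.norm(psi))  # 1.0 — qubit states are always unit vectors
```

The punchline is the last line: whereas a bit is either 0 or 1, a qubit's state can be any unit vector in that two-dimensional space.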

Regardless of the type of qubit, the point is that physics at the quantum level behaves differently from physics at the everyday level, so there is hope that you might be able to do some things there that you can't do otherwise. Some folks in QML are exploring whether this might be the case for machine learning with classical data — tasks like image classification, for instance.

To determine whether this is the case, you'd first need to upload your dataset (of images or whatever) onto a quantum computer before you could even start running experiments. That is, you'd have to encode your data onto individual atoms, or ions, or photons, or some other kind of qubit. But how in the world would you do that? It turns out there are a number of ways this is done (here's a quick taste below, with more examples later in the series), but there is no real consensus on which encoding scheme is the best one. That is, given a particular dataset and learning task, what is the best method for encoding classical data onto a quantum computer?
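As a quick taste, here's a toy sketch of one popular scheme, often called angle encoding, in which each real-valued feature becomes a rotation angle applied to a qubit. This is a simplified NumPy simulation; the function `angle_encode` and its details are my own illustration, not a canonical implementation.

```python
import numpy as np

def ry(theta):
    """Single-qubit rotation about the y-axis: a 2x2 unitary matrix."""
    return np.array([[np.cos(theta / 2), -np.sin(theta / 2)],
                     [np.sin(theta / 2),  np.cos(theta / 2)]])

def angle_encode(x):
    """Encode a real feature vector x by rotating one qubit per feature.

    Each qubit starts in the state |0> and is rotated by an angle given
    by the corresponding feature. The joint state of all the qubits is
    the tensor (Kronecker) product of the individual states.
    """
    ket0 = np.array([1.0, 0.0])
    state = np.array([1.0])
    for xi in x:
        state = np.kron(state, ry(xi) @ ket0)
    return state  # a unit vector of length 2**len(x)

print(angle_encode([0.1, 2.3]))
```

Notice a design choice already lurking here: nearby feature values get mapped to nearby quantum states, so this scheme implicitly preserves some structure in the data. Other schemes make different choices.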

That's the open problem I mentioned earlier. And it explains the title of this series: What is a good quantum encoding?

Spoiler alert: We do not propose an answer, nor are we claiming the answer will be found in category theory. Instead, we are simply offering the perspective that category theory may be a useful framework in which to think about the question.

This reminds me of something our coauthor, Arthur Parzygnat, once said: Category theory helps him know which are the right questions to ask, though it doesn't necessarily provide the answers to those questions.

Likewise, it's possible that adopting a more categorical viewpoint could shed light on the "right" questions to ask in the data encoding problem. We're just exploring this possibility. That's what research is, after all!

Why category theory?

But why category theory as opposed to something else? Well, as I hope to show you later, it's just sort of inevitable when you look at the situation from a zoomed-out perspective.

Also, there are advantages to recasting an engineering problem as a more mathematical one — namely, 1) it puts the problem on a firmer, more rigorous footing, and 2) it might get mathematicians interested, and maybe that'll open doors to new collaborations and insights. We'll see!

I should say — if you're reading this article and are not familiar with category theory, I have lots of introductory articles on this site. To put it simply, though, it's a modern branch of mathematics that gives us a unifying language in which to talk about many themes we see across the mathematical landscape. And indeed, we'll see that one aspect of the data encoding problem of QML has a recurring theme, which can be neatly packaged in a categorical way.

But before we get there, let me get back to the actual machine learning problem, so you can see where we're headed. I told you what a quantum computer is. But what does it mean to do machine learning on one?

How do you do machine learning on a quantum computer?

By way of motivation, let's think about how deep learning works from a simplistic point of view. And let's suppose our learning task is to correctly label some data, just for simplicity. So the goal of this task is to find a function that correctly assigns to each input (an image, say) an output (like a label for that image). That means we're looking for a function $f:X\to Y$ for some sets $X$ and $Y$. Oftentimes those sets are just Euclidean spaces $\mathbb{R}^n$ and $\mathbb{R}^m$, which simplifies things nicely.

Of course, there's no convenient quadratic-formula sort of thing for $f$, so instead, you try to approximate it as a composition of smaller functions $f_i$, each of which involves a weight matrix $M_i$, a bias vector $b_i$, and an activation function $a$. And the goal of the "learning" process is to choose the right $a$ and to find the right numbers that form the entries of the matrices and bias vectors, so that when you compose the functions $f_i$ together, you get an overall function that does the job well.
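In code, that whole story fits in a few lines. Here's a minimal NumPy sketch of a two-layer network approximating a function $f:\mathbb{R}^3\to\mathbb{R}^2$; the random weights below are just stand-ins for the numbers that training would actually find.

```python
import numpy as np

def relu(z):
    """A common choice of activation function a."""
    return np.maximum(0, z)

def layer(M, b):
    """One of the small functions f_i(x) = a(Mx + b)."""
    return lambda x: relu(M @ x + b)

# A tiny network: compose two layers to approximate f: R^3 -> R^2.
rng = np.random.default_rng(0)
f1 = layer(rng.normal(size=(4, 3)), rng.normal(size=4))
f2 = layer(rng.normal(size=(2, 4)), rng.normal(size=2))

x = np.array([1.0, -0.5, 2.0])
print(f2(f1(x)))  # the composite f = f_2 ∘ f_1 applied to an input
```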

That's (classical) deep learning. Quantum machine learning is very similar when you're working with classical data. (You can also do machine learning on a quantum computer with quantum data, but we're just focusing on classical data.) There the goal is likewise to find a function that, for instance, correctly labels images. And this function is also a composition of other functions. But the main difference is in what those functions are.

In a common QML architecture called parameterized quantum circuits, you first must encode your data onto your choice of qubits, illustrated by a vertical arrow on the left in the image below.

Those qubits are initially set up to be in a particular "fiducial" state, sort of like when you get a new piece of luggage or a keypad for a safe, and the digits are all set to 0000. The main goal of a variational quantum algorithm, then, is to manipulate the states of those qubits — for example, by using laser pulses if the qubits are ions — so that when you "measure" them, their final states mirror the kind of output you're looking for. (I realize this last sentence could sound confusing since it uses a word I haven't defined yet [measure], but I will give a concrete example later on.)

The machine learning challenge is to find the right laser parameters, for instance (representing frequency, pulse duration, etc.), that accomplish this.
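To make this concrete, here's a toy one-qubit version of the whole pipeline, simulated in NumPy. The names (`circuit`, `ry`) are mine, just for illustration, and `theta` plays the role of the tunable laser settings.

```python
import numpy as np

def ry(theta):
    """Single-qubit rotation about the y-axis: a 2x2 unitary matrix."""
    return np.array([[np.cos(theta / 2), -np.sin(theta / 2)],
                     [np.sin(theta / 2),  np.cos(theta / 2)]])

def circuit(x, theta):
    """A one-qubit parameterized circuit: encode x, apply a trainable
    rotation, then measure."""
    ket0 = np.array([1.0, 0.0])      # the fiducial state |0>
    psi = ry(theta) @ ry(x) @ ket0   # encode the data, then rotate by the parameter
    return np.abs(psi) ** 2          # probabilities of measuring 0 or 1

print(circuit(x=0.7, theta=1.2))
```

Training would then adjust `theta` to push these output probabilities toward the desired labels, structurally just like tuning the weights of a neural network.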

Notice that the goal in both QML and in deep learning is to find something simple: a function between sets. Although it's simple to state, the function itself may, of course, be very complicated. In QML, the function involves compositions of tensor products of unitary operators; such a composite is called a quantum circuit. In classical ML, the function is a neural network.
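If you'd like to see what "compositions of tensor products of unitary operators" looks like concretely, here's a small NumPy sketch using two standard gates (the particular gates are arbitrary choices on my part):

```python
import numpy as np

# Two single-qubit unitaries...
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)  # Hadamard gate
X = np.array([[0, 1], [1, 0]])                # Pauli-X (quantum NOT)

# ...acting on different qubits at the same time: a tensor product,
U1 = np.kron(H, X)       # a 4x4 unitary on two qubits

# ...and one circuit step after another: ordinary composition.
U2 = np.kron(X, H) @ U1  # compose two layers of the circuit
print(np.allclose(U2.conj().T @ U2, np.eye(4)))  # True: still unitary
```

Tensor products (`np.kron`) combine gates acting on different qubits in parallel, while matrix multiplication composes one layer of the circuit after another.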

Another thing to notice is that the open problem I mentioned earlier — i.e. identifying the best way to encode classical data onto a quantum computer — is precisely the vertical arrow on the left, labeled "encode" in the illustration above. That function is often called a quantum feature map, or quantum embedding, or quantum encoding.

The source of that arrow (i.e. the function's domain) is a set, $X$, but my drawing doesn't explicitly show the arrow's target (i.e. the function's codomain). I've just drawn four dots. So to clarify this, I need to make this picture a little more mathematical. And we'll do that next time! I'll also give some examples of quantum encodings, I'll walk us through a recurring theme that appears in encoding schemes (hint: it has to do with mathematical structure in data), and I'll show how the theme can be neatly expressed in the language of category theory.

Until then.
