Boundaries and Good Regulators
===
This is my write-up of the Mathematical Boundaries workshop in April. I'm not writing it as a public document because I don't feel I have a natural place to post it, but I don't mind if some or all of it gets made public. (Maybe that will happen anyway when I press "publish"? I have no idea.)
There was a lot of discussion at the workshop about some ideas that I've been working on with Manuel Baltieri, Martin Biehl and Matteo Capucci, and so I want to write a bit about those ideas. I'm writing in first person, but this is their work as well as mine. I'm writing this after the workshop, and although we had been working on these ideas for a long time before that, this is very much the post-workshop version.
I was surprised by quite how much interest there was in this; I wasn't expecting the discussion to focus on it in the way it did. But I think it was a helpful contribution to the workshop, and I certainly learned a lot from the workshop, both mathematically and conceptually. Some of those ideas will make it into the paper we're preparing.
The ideas I want to talk about involve boundaries in two different ways, because the word "boundary" has two different meanings in relation to agents. The first kind of boundary is the physical boundary that separates an agent from its environment. But the workshop's goal was also to understand another kind of boundary, which we might call the "viability boundary," or the set of "good states" for an agent to be in. This boundary doesn't exist in physical space at all but in phase space, so it's of a very different nature from the physical boundary. But nevertheless I think the two are closely related, and this work is partly about trying to get at what that relationship is.
It's also about the relationship between *both* types of boundary and an agent's *model* of its environment. But we'll get to that.
As [Sophie Libkind has said](https://topos.site/blog/2024-04-25-ontological-commitments-for-boundaries/), the following is pretty loose on ontological commitments. This is because the ideas aren't finished yet; I don't know how to generalise everything to general systems in the way I would like, and I feel like the process of figuring out how to do that and the process of figuring out what ontological commitments to make are intertwined. I think that ended up being confusing at the workshop, and I really wished I'd been able to present it in a more concretely formal way. But for sure when I say systems and interfaces I mean something very much like what Sophie said in her post.
The basic idea is pretty simple, but it has a bunch of moving parts. First, we have to talk about coupled systems. This is what the [David Jaz Myers formalism](http://davidjaz.com/Papers/DynamicalBook.pdf) in terms of monoidal (double) categories or (double) operad algebras achieves. (One thing I got from the workshop, and Sophie specifically, is that the operad version is super nice; it brings the maths closer to my intuitions in a lot of ways.)
But in this post I will treat it informally. We have *systems* which have *interfaces*, and when two systems share an interface we can couple them to get a single system. A system might have more than one interface, so coupling two systems might not result in a closed system (i.e. one without any interface) but it can do.
An interface is kind of like a system's physical boundary - it's (informally speaking) the part of a system that can interact with the rest of the world. But this isn't exactly the point I want to make about physical boundaries, and I'll say more about that shortly.
We start with a pair of coupled systems, one that we call the "agent" and one that we call the "environment". For now, we assume they each have a single interface that they share, so that coupling them produces a closed system.
It's important to note that we're making a value judgment here: in reality the world is composed of many interacting systems, so if we partition it into just two then what we're really doing, in practice, is to lump a whole bunch of them together into one system, and all the rest into the other. We could have done that in many different ways, and we're choosing just one. It is *us*, as observers, who choose a way of partitioning the world into agent and environment, at least at this point in the story. Later we'll consider whether there are better or worse places to draw that boundary.
Next I want to talk about the viability boundary. To do that I have to talk about state spaces. For now, I will assume that these are sets. I will write $V$ (for "viscera") for the state of the agent and $E$ for the state of the environment. Furthermore, I will assume that the state space of the coupled system $W$ is the cartesian product of the two systems, $W = V\times E$.
Sophie wrote in her post that these symbols "$W = V\times E$" are really just intuition, and this is correct. It isn't true that state spaces are always sets (they could be vector spaces, or topological spaces, or...), and it's also not true that the state space of the coupled system is always a product of the state spaces of the components. (Sophie taught me about resource sharing machines, where it's a pullback instead, and that made me realise I understood things less well than I thought I did. This is a very good thing!) Since for now I don't know the right way to do all of the following in a more abstract way, I will just talk about sets and products.
Now that we can talk about state spaces, I want to talk about the viability boundary. I will just do the dumbest, simplest thing possible here, and say that there is a subset $G\subseteq W$ of "good states"; if the system is inside $G$ then things are fine, otherwise some kind of failure has occurred. (I think we can formulate it much more abstractly than this, but I want to tell the simplest version of the story, which is also the one I'm the most sure about.)
Perhaps more obviously than the choice of partition into agent and environment, the choice of the good set is also a value judgment. It is us, as observers, who are declaring some states to be good and others to be bad. I mention this because I think making value judgments is unavoidable when talking about agents, and I think we should strive to make it clear when we invoke them.
In control theory, the good set is usually of the form $G = V\times K$, for some $K\subseteq E$. The agent (or controller) is meant to keep the environment (or plant) in some set of desirable states. In this case we don't care directly what state the controller is in, but only how its dynamics affect the dynamics of the environment.
Conversely, in biology, we would expect the good set to be of the form $L\times E$ for some subset $L\subseteq V$: the agent is trying to stay alive and it ultimately only cares that its own states stay in the set of alive states; but it nevertheless might have to care about the environment states because of the effect they have on its states. Our current formalism doesn't care which, if any, of these are the case.
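To make the two shapes of good set concrete, here is a minimal sketch for finite state spaces. The particular states `v0`, `e0`, and so on are hypothetical placeholders of mine, as are the choices of $K$ and $L$; nothing depends on them.

```python
from itertools import product

# Hypothetical finite state spaces, chosen only for illustration.
V = {"v0", "v1"}           # agent ("viscera") states
E = {"e0", "e1", "e2"}     # environment states
W = set(product(V, E))     # coupled state space, W = V x E

# Control-theory style: G = V x K. Only the environment is constrained.
K = {"e1"}
G_control = {(v, e) for (v, e) in W if e in K}

# Biology style: G = L x E. Only the agent is constrained.
L = {"v0"}
G_biology = {(v, e) for (v, e) in W if v in L}
```

In both cases $G$ is just a subset of $W$, which is all the formalism asks for; a good set that constrains both components at once would work equally well.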
Again telling only the simplest version of the story, I will say that (with these value judgments in place) the agent $V$ is a "good regulator" if it is possible for the state of the system to stay inside $G$ indefinitely.
This means that there is a non-empty subset $R\subseteq G$ that is *forward-closed* (or *forward-invariant*), meaning that if the state of the combined system is in $R$ at time $t$ then it will still be inside $R$ at any future time. I'm not going to define that mathematically, because although I *do* have a fairly good idea how to do that abstractly, it's a bit of a long story, involving fibrations and modal logic.
So we have $R\subseteq G\subseteq W$. If we can exhibit such an $R$ (and it's non-empty) we can call $V$ a good regulator. There is actually a third value judgment being made here, though it's a bit less obvious than the other two: there might be more than one forward-invariant subset of $G$, and we're singling one of them out. You can always take $R$ to be the *largest* forward-invariant subset of $G$, but you don't have to.
Since $W = V\times E$ we can think of $R\subseteq V\times E$ as a relation between $V$ and $E$. If we have $(v,e)\in R$ we say "$v$ regulates $e$", meaning that if the agent starts in state $v$ and the environment starts in state $e$, then everything will continue to be "good" indefinitely.
Note that we might have $(v,e)\in G$ but $(v,e)\not\in R$. If the system starts in such a state then things are good initially, but the agent is not able to perform regulation, and hence the system might leave $G$ eventually.
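For intuition only, here is a sketch of how the largest such $R$ could be computed in one very restricted setting. I am assuming a finite state space, discrete time, and a closed system given by a possibilistic step function that returns the set of possible next states; this is not the abstract definition alluded to above. The heater example, the function name `largest_forward_closed`, and the dynamics are all hypothetical.

```python
from itertools import product

def largest_forward_closed(G, step):
    """Largest R inside G with step(w) contained in R for every w in R.

    Repeatedly discards states that can leave the current candidate set;
    this terminates because G is finite and the candidate only shrinks.
    """
    R = set(G)
    while True:
        keep = {w for w in R if step(w) <= R}
        if keep == R:
            return R
        R = keep

# Hypothetical toy system: a heater (agent) and a temperature (environment).
V = {"on", "off"}
E = {0, 1, 2, 3}
W = set(product(V, E))

def step(w):
    """One time step of the coupled system, as a set of possible next states."""
    v, e = w
    e_next = min(e + 1, 3) if v == "on" else max(e - 1, 0)  # heat up or cool down
    v_next = "off" if v == "on" else "on"                   # heater simply alternates
    return {(v_next, e_next)}

# Good set of the control-theory form V x K, with K = {1, 2}.
G = {(v, e) for (v, e) in W if e in {1, 2}}
R = largest_forward_closed(G, step)
# R is {("on", 1), ("off", 2)}: the system cycles between these two states.
# ("on", 2) is in G but not in R: it is good now, but the next state is ("off", 3).
```

If the returned set were empty, then under these assumptions there would be no way for the system to stay inside $G$ indefinitely, so the agent would not count as a good regulator for this choice of $G$.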
When you have a relation $R\subseteq V\times E$, one thing you can do is turn it into a function $\psi:V\to \mathcal{P}(E)$, defined as
$$
\psi(v) = \{e\in E\mid (v,e)\in R\}.
$$
We call this "the agent's model of its environment". We think of $\psi(v)$ as *the set of states that the agent thinks the environment might be in*, when the agent is in state $v$.
The reason this makes sense is that, at least in the case when our systems are "possibilistic Moore machines", these beliefs end up updating in a way that resembles a [possibilistic version of Bayes' rule](https://www.localcharts.org/t/lowbrow-bayesian-updates/20154). I won't go into details about that now though. I've given talks about it [here](https://youtu.be/6RSMLUhP6CY?si=GPuUDuqEcVKakDPG) and [here](https://youtu.be/wKmJibZw5Is?si=w2ZRoIMJxYz7cYfR), and it will be in our paper.
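Computing $\psi$ from $R$ is a one-liner when everything is finite. The sketch below does only that; it says nothing about the Bayes-like updating. The relation `R` and the state names are hypothetical, and I simply assume that `R` is forward-closed rather than checking it.

```python
def model(R, V):
    """psi(v) = {e | (v, e) in R}, returned as a dict from agent states to sets."""
    return {v: frozenset(e for (v2, e) in R if v2 == v) for v in V}

# Hypothetical agent states and regulation relation, for illustration only.
V = {"on", "off", "broken"}
R = {("on", 1), ("off", 2), ("off", 3)}

psi = model(R, V)
# psi["on"] is {1}, psi["off"] is {2, 3}, and psi["broken"] is empty.
```

An empty $\psi(v)$ means there is no environment state that $v$ regulates: once the agent is in that state, no belief about the environment is consistent with staying in $R$.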
All of the above is preliminary material for the point I *really* want to make, which is this:
In the above we made, essentially, two different value judgments: (i) the partition of the world $W$ into an agent $V$ and an environment $E$; and (ii) the choice of a good set $G\subseteq W$ and a forward-invariant set $R\subseteq G$. These two things together gave us a notion of agent, and then a notion of "the agent's model". These two value judgments are different, and they are independent, in that we can change either choice without affecting the other.
However, although the two judgments are independent choices, the notion of 'model' that we get in the end depends on them both, in quite an intricate way.
In particular, suppose we have two different choices for the partition of the world, $W = V\times E$ and $W = V'\times E'$. For example, consider Otto, a character used to motivate Clark and Chalmers' [extended mind thesis](https://en.wikipedia.org/wiki/Extended_mind_thesis). Otto can't remember the address he's trying to get to but keeps a notebook with him that contains that information. $V\times E$ might represent a partition of the world that puts everything inside Otto's body into $V$, and everything else into $E$. But $V'\times E'$ might put Otto's body and also the notebook into $V'$ and everything else into $E'$. It's a value judgment, so there's no right or wrong answer (at least in my opinion), and both of these are valid choices.
Let's keep the other value judgment (the choice of $G$ and $R$) fixed; say the good set $G$ consists of those states where Otto gets where he's trying to go. Then we have $R\subseteq V\times E$ and also $R\subseteq V'\times E'$. This gives us two different notions of model:
$$
\psi : V\to \mathcal{P}(E) \qquad\text{defined by}\qquad\psi(v) = \{e\in E\mid (v,e)\in R\}
$$
and
$$
\psi' : V'\to \mathcal{P}(E') \qquad\text{defined by}\qquad\psi'(v') = \{e'\in E'\mid (v',e')\in R\}.
$$
These are different functions, with different domains and co-domains. Both of them talk about an agent having a model of a world, but one of them will talk about an agent that doesn't know the address (but does know where to look it up), while the other models an agent that does know the address. These are both valid, but it might be that in some circumstances one of them is easier to work with, or just makes more sense, than the other.
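Here is a toy version of the Otto example, as a sketch under heavy assumptions. A world state is a hypothetical triple (Otto's body, notebook contents, true address), and I simply declare $R$ to be the states where Otto consults the notebook and the notebook is right; I do not derive $R$ from any dynamics. The names `BODY`, `NOTEBOOK`, `ADDRESS` and `model` are mine.

```python
from itertools import product

# Hypothetical components of the world state.
BODY = {"consults", "guesses"}   # what Otto's body is disposed to do
NOTEBOOK = {"A", "B"}            # the address written in the notebook
ADDRESS = {"A", "B"}             # where the destination actually is

# Assumed regulation set: Otto consults the notebook and the notebook is right.
R = {(b, n, a) for b, n, a in product(BODY, NOTEBOOK, ADDRESS)
     if b == "consults" and n == a}

def model(R, split):
    """Build psi from R, where split maps a world state to a pair (v, e).

    Agent states that do not appear as keys have an empty model.
    """
    psi = {}
    for w in R:
        v, e = split(w)
        psi.setdefault(v, set()).add(e)
    return psi

# Partition 1: V is Otto's body; E is the notebook together with the address.
psi = model(R, lambda w: (w[0], (w[1], w[2])))
# psi["consults"] is {("A", "A"), ("B", "B")}: the agent does not know the
# address, only that the notebook agrees with it.

# Partition 2: V' is body plus notebook; E' is the address alone.
psi_prime = model(R, lambda w: ((w[0], w[1]), w[2]))
# psi_prime[("consults", "A")] is {"A"}: this agent does know the address.
```

The set `R` is identical in both calls; only the way a world state is split into an agent part and an environment part changes, and that alone changes what the model says the agent knows.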
To a further extreme, we could keep $G$ and $R$ the same, but draw the physical boundary around some random lamp-post somewhere, instead of anything related to Otto. This is also a valid choice, but it's not a very informative one. It will model an agent that (according to the observer's judgment) cares about whether Otto gets where he wants to go, but which doesn't have any knowledge of where Otto is or any way to take actions that affect it; this agent has no choice but to trust that Otto will get to his destination on his own.
So, finally, the point is that although the choice of physical boundary and the choice of viability boundary are independent and both are to some extent arbitrary, some combinations will *just make more sense* than others, and my hope is that ultimately this concept of "making sense" can be understood mathematically. Once we can do that we can start to reason much more systematically about both kinds of boundary. My sincere hope is that this can have a positive impact on the kinds of design decision the workshop was designed to address.
Where does that leave us with respect to the workshop? As I mentioned, I learned a great deal, and the discussions greatly helped in focusing these ideas. I think the ideas here contributed a lot to the workshop overall, both as something people were thinking about quite directly, and as an indirect influence on other ideas. I think there is still a lot of work to be done though, especially in understanding how to make design decisions on this kind of basis.
I think the workshop was most concerned with the question of how to make an AI system understand agents' physical and viability boundaries in order to respect them, but there is also the very interesting question (to me) of where we should draw an AI system's boundary. We tend to think of an AI system as living entirely "inside" the computer, but any system that is deployed can also be seen as extending into the world, and I think that might be important.