**Towards a more general law of requisite variety**
by Martin Biehl and developed in conversation with Owen Lynch
What is the question?
We are looking for a modern formulation of Ashby's law of requisite variety. The purpose of such a law is to provide a lower bound on the number of internal states, or more generally on the complexity, of any system that can solve a given problem.
How does it relate to boundaries?
In our setting we are given the interface of a system we call the environment. This interface also defines a boundary in the sense of Critch's boundary sequence. The environment provides perceptions $p \in P$ as outputs, takes actions $a \in A$ as inputs, and has an internal state space $E$.
<!-- If we consider a closed system $X$ there may be multiple ways to factorize it into two interacting Moore machines called environment $E$ and viscera $V$ such that we have $X \cong E \otimes V$. Each such factorization can be seen as a choice of a boundary. A different perspective may be to take an environment Moore machine $V$ as given and thereby also define an interface or boundary and think about the systems that can interact with this environment and solve some problem within it. -->
What can you do with a solution?
We might want to construct systems that can solve specific problems but provably cannot solve other ones. If the problems we want the systems to be unable to solve have a strictly higher requisite variety or requisite complexity than those we want them to solve, then a law of requisite variety can provide such proofs based only on the number of states of the system.
How do the existing solutions fail?
The existing literature consists of multiple results stated separately for different scenarios (e.g. in Ashby's "An Introduction to Cybernetics", pp. 202-219).
<!-- In Ashby's original formulation there are multiple versions of the law of requisite variety each for a different situation. -->
Applying the law of requisite variety therefore requires carefully checking the multiple existing versions in order to see if any one of them fits the case of interest. The language these versions are written in is also rather dated, which makes this process even harder. There are also newer results in the spirit of the law of requisite variety, such as the existence of a unique minimal deterministic finite automaton (DFA) accepting any given regular language, which is part of the Myhill-Nerode theorem.
We believe that there is an abstract and rigorous statement of the law of requisite variety that generalizes its existing versions. We also believe that the availability of such a statement would allow working generally, and therefore efficiently, on methods for establishing the requisite variety of various classes of problems.
<!-- The case we are considering involves a system that only partially observes the environment. -->
<!-- The simplest version of the law of requisite variety is formulated for an environment $E$ and a system $V$ both choosing a move from their respective sets of available moves $M_E$ and $M_V$. There is also a function $T:M_E \times M_V \to O$ where $O$ is a set of outcomes which is chosen to have the same number of elements as $M_E$. The environment goes first and the system tries to restrict the set $O^* \subseteq O$ of occurring outcomes by taking actions in response. The law of requisite variety then says that says that the number of occurring outcomes is at least as large as the number of environment states divided by the number of system states: $$|O^*|\geq \frac{|M_E|}{|M_V|}$$. -->
<!-- We expect further generalization to be possible but the first kind of problems we consider are those that can be defined in the following way: -->
What is at least one promising solution direction?
Categorical systems theory (as found in the works of David Jaz Myers, Nelson Niu and David Spivak, and Matteo Capucci) provides a general notion of systems, how they are related to each other and how they can be composed to form new systems. It therefore provides a promising setting for a general and rigorous formulation of the law of requisite variety.
One way to make progress in this direction is to reformulate the scenarios for which a law of requisite variety is already known within categorical systems theory. This may reveal that, for example, the minimal DFA of the Myhill-Nerode theorem corresponds to some kind of universal construction. It is then possible to check whether the same universal construction works in the other cases as well.
<!-- We believe this is mainly due to a lack of tools available at the time for a single general formulation of this law. A formulation in terms of categorical systems theory would allow the expression of all the versions proposed by Ashby and potentially many more currently unknown versions.
This is due to the generality of categorical system theory (as found in the works of David Jaz Myers, Nelson Niu and David Spivak, and Matteo Capucci) which includes
-->
Progress made during the workshop:
During the workshop we considered the following scenario:
There is
* an *environment* given by a deterministic Moore machine $$f:Ey^E \to Py^A$$
* a subset $\bar E \subseteq E$ (more generally a predicate on $E$) called the set of *good environment states*
* a subset $S \subseteq \bar E$ called the set of *starting states*
A solution to this problem is a Moore machine $$g:Vy^V \to Ay^P$$ such that, when it is coupled to the environment, the environment state remains within the set $\bar E$ of good environment states when started from any environment state in $S$.
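To make this concrete, here is a small sketch in Python of the data defining such a problem. The class and the toy environment below are our own illustration (all names such as `MooreEnvironment`, `update`, `readout`, `good` and `start` are ours, not part of the formal development): a finite deterministic Moore machine together with a set of good states and a set of starting states.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Hashable

State = Hashable
Action = Hashable
Perception = Hashable

@dataclass(frozen=True)
class MooreEnvironment:
    """A finite deterministic Moore machine f : Ey^E -> Py^A.

    `update(e, a)` plays the role of the update map f#_a : E -> E and
    `readout(e)` is the output map E -> P producing the perceptions.
    """
    states: FrozenSet[State]                   # E
    actions: FrozenSet[Action]                 # A
    update: Callable[[State, Action], State]   # f#
    readout: Callable[[State], Perception]     # perceptions

# Toy instance: a counter modulo 3; "inc" increments, "dec" decrements.
# The good states are {0, 1}, we start in 0, and the system only perceives
# whether the counter is at zero.
env = MooreEnvironment(
    states=frozenset({0, 1, 2}),
    actions=frozenset({"inc", "dec"}),
    update=lambda e, a: (e + 1) % 3 if a == "inc" else (e - 1) % 3,
    readout=lambda e: e == 0,
)
good = frozenset({0, 1})    # the set Ē of good environment states
start = frozenset({0})      # the set S ⊆ Ē of starting states
```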
The following is tentative:
Our approach first defines, for any subset $B \subseteq \bar E$, a set of *good actions*: the actions that lead to a set of next environment states $f^\#_a(B)$ inside the good set $\bar E$ of environment states and for which this set of next environment states again has a nonempty set of good actions. Formally: $$GA(B) := \{a \in A: f^\#_a(B) \subseteq \bar E \text{ and } GA(f^\#_a(B)) \neq \emptyset\}$$
where $f^\#_a:E \to E$ for each $a \in A$ are the update functions of the environment Moore machine and $f^\#_a(B)$ denotes the image of the subset $B$ under $f^\#_a$.
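Since this definition of $GA$ refers to itself without an explicit base case, one natural reading (our interpretation) is as a greatest fixed point: start by allowing, for every subset $B \subseteq \bar E$, all actions whose image stays in $\bar E$, and then repeatedly discard actions whose successor set has no good actions left. A brute-force Python sketch of this reading, reusing the hypothetical `env` and `good` from above:

```python
from itertools import combinations

def image(env, B, a):
    """The image f#_a(B) of a subset B of environment states under action a."""
    return frozenset(env.update(e, a) for e in B)

def subsets(s):
    s = list(s)
    return (frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r))

def good_actions(env, good):
    """Tabulate GA(B) for every B ⊆ Ē as a greatest fixed point.

    Start from all actions whose image stays inside Ē and repeatedly remove
    actions a for which GA(f#_a(B)) has become empty, until nothing changes.
    """
    GA = {B: {a for a in env.actions if image(env, B, a) <= good}
          for B in subsets(good)}
    changed = True
    while changed:
        changed = False
        for B, acts in GA.items():
            kept = {a for a in acts if GA[image(env, B, a)]}
            if kept != acts:
                GA[B] = kept
                changed = True
    return GA

GA = good_actions(env, good)
print(GA[frozenset({0})])   # {'inc'} for the toy environment above
```

This tabulates $GA$ over all $2^{|\bar E|}$ subsets, so it is only meant to illustrate the definition on small examples.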
Then we define the (minimal) set $MA(e)$ of actions required to keep the environment state within $\bar E$ when starting in $e \in \bar E \subseteq E$. Assume that when the environment state is $e \in \bar E$ we already know, for each good action $a \in GA(\{e\})$ (for which we already know that it is possible to keep the environment in $\bar E$ forever), the minimal set $MA(f^\#_a(e))$ of actions necessary to keep it in $\bar E$ starting in the successor state $f^\#_a(e) \in \bar E$. Then for each $a \in GA(\{e\})$ there are two possible cases:
* either that action is already an element of $MA(f^\#_a(e))$, so that $\{a\} \cup MA(f^\#_a(e)) = MA(f^\#_a(e))$ is already sufficient to keep the environment in $\bar E$ when starting in $e \in \bar E$, or
* the action has to be added to $MA(f^\#_a(e))$ to get a strictly larger set of actions $\{a\} \cup MA(f^\#_a(e))$ that can keep the environment in $\bar E$ when starting in $e \in \bar E$.
To find the smallest set of actions $MA(e)$ we then compare the cardinalities $|\{a\} \cup MA(f^\#_a(e))|$ across all good actions $a \in GA(\{e\})$ and pick the smallest.
This suggests that the function mapping any $e \in \bar E$ to the minimal set of actions necessary in order to keep the environment state in $\bar E$ is a solution of
$$MA(e) := \underset{X \in \{\{a\} \cup MA(f^\#_a(e)) \,\mid\, a \in GA(\{e\})\}}{\operatorname{argmin}} |X|.$$
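Implementing this recursion directly is delicate because $MA$ may depend on itself through cycles of environment states. As a sanity check, here is a brute-force Python sketch (our own, not part of the development above) of the quantity $MA(e)$ is meant to capture: a smallest set of actions $W \subseteq A$ such that the environment can be kept inside $\bar E$ forever from $e$ using only actions from $W$. It again reuses the hypothetical `env` and `good` from above.

```python
from itertools import combinations

def safe_states(env, good, allowed):
    """States in Ē from which the environment can be kept in Ē forever using
    only actions in `allowed` (computed as a greatest fixed point: repeatedly
    remove states with no allowed action leading back into the safe set)."""
    safe = set(good)
    changed = True
    while changed:
        changed = False
        for e in list(safe):
            if not any(env.update(e, a) in safe for a in allowed):
                safe.discard(e)
                changed = True
    return safe

def minimal_action_set(env, good, e):
    """A smallest set of actions sufficient to keep the environment in Ē when
    starting from e (ties broken arbitrarily); None if no such set exists."""
    for size in range(len(env.actions) + 1):
        for W in combinations(sorted(env.actions, key=str), size):
            if e in safe_states(env, good, set(W)):
                return set(W)
    return None

print(minimal_action_set(env, good, 0))   # {'inc', 'dec'} for the toy environment
```

Note that this checks whether a candidate action set suffices via a standard safety argument rather than via the recursion above, so it should be read as a reference computation of the intended quantity, not as an implementation of the tentative definition itself.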
For a deterministic Moore machine the set of internal states has to be at least as large as the set of different outputs it needs to produce. The cardinality of the minimal set of actions required to keep the environment in $\bar E$ is therefore also a lower bound on the number of internal states that a Moore machine needs in order to keep the environment state in $\bar E$.
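In the sketch above this bound would, for example, be read off as follows (taking the maximum over the starting states is our own choice of aggregation; the text here only discusses a single starting state $e$):

```python
# Lower bound on the number of internal states of any Moore machine that keeps
# the environment in Ē from every starting state in S (ignoring, as noted
# below, that the machine only sees perceptions rather than e itself).
bound = max(len(minimal_action_set(env, good, e)) for e in start)
print(bound)   # 2 for the toy environment above
```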
Note that all of this ignores that the machine trying to keep the state in $\bar E$ does not observe $e$ directly and thus will probably need additional internal states in order to infer enough about the environment state to keep it in $\bar E$. There should therefore be a tighter lower bound that also takes this into account.