Notes on DeepMind’s ‘Open Problems in Cooperative AI’ paper

Jimmy Tidey
Jul 25, 2024


I’ve been thinking about how RAG-based ‘chatbots’ might work in built environment planning, specifically looking at the planning system in England. Planning is an area I’m interested in, but it’s also a ‘sandbox’ for thinking about how AI might function in organisations and institutions more generally.

I’ve realised that focusing on individual chatbots misses a much bigger systemic change that AI could bring.

Planning documents, specifically Local & Neighbourhood plans, are key parts of the planning system, setting out communities’ long-term ambitions for planning. They are incredibly hard to write, sometimes taking years. So once you’ve written them, they need to be valid for a long time, typically at least a decade.

If an AI tool makes planning documents a bit quicker to write, that would be helpful. But… if AIs make planning documents really quick to write, like weeks or days or hours, perhaps you can take a totally different approach? Taking the thought experiment to the extreme, we could imagine a system of AI agents that dynamically articulates and negotiates participants’ interests without any need to author monolithic decade-long documents.

You’d be right to be skeptical about how plausible or desirable this is. But it might not be as implausible as it first seems, and it feels like taking a long view might also help us see the shorter-term impacts with more clarity.

This thought led me to investigate Cooperative AI, which led me in turn to DeepMind’s Open Problems in Cooperative AI paper from 2020.

The paper sets out a whole programme of research, and I felt it was worth a post discussing it.

The paper itself attempts to define what a Cooperative AI is, and then lists the capabilities cooperative AIs will need.

According to the paper, Cooperative AI is a subject that concerns:

AI research trying to help individuals, humans and machines, to find ways to improve their joint welfare.

Do machines have welfare? Doesn’t this definition potentially encompass all AI? I’m not sure. Definitions are always hard and often too ‘greedy’.

Much more usefully, the paper provides a list of four capabilities that AIs will need in order to cooperate:

Understanding
A key feature of a Cooperative AI will be: “(1) predictions of the consequences of actions, (2) predictions of another’s behaviour, or (3) contributing factors to behaviour such as another’s beliefs and preferences.” The gist is that AIs reasoning about cooperation will have to understand context, particularly the objectives of other agents in the system.
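
To make that concrete, here’s a toy sketch of my own (not from the paper): an agent predicting another agent’s action from an assumed model of its preferences. The action names and utility numbers are invented purely for illustration.

```python
# Toy model: predict another agent's action from its (assumed) preferences.
# The actions and utilities below are made up for illustration.
other_agent_utility = {"cooperate": 3, "defect": 2}

def predict_action(utility: dict[str, float]) -> str:
    """Predict that the agent takes whichever action maximises its utility."""
    return max(utility, key=utility.get)

print(predict_action(other_agent_utility))  # -> "cooperate"
```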

Communication
In many cooperative situations, the most effective way to reach an optimal outcome is for agents to communicate their intentions to one another.
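
A tiny illustration of my own (again, not the paper’s): a pure coordination game where two agents score only if they pick the same meeting point. Guessing independently fails half the time; broadcasting an intention succeeds every time.

```python
import random

OPTIONS = ["cafe", "park"]

def payoff(a: str, b: str) -> int:
    """Both agents score 1 only when their choices match."""
    return 1 if a == b else 0

# Without communication: each agent guesses independently (matches ~50% of the time).
no_comm = payoff(random.choice(OPTIONS), random.choice(OPTIONS))

# With communication: agent A announces an intention and agent B follows it.
signal = random.choice(OPTIONS)
with_comm = payoff(signal, signal)  # always 1

print(no_comm, with_comm)
```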

Commitment
When AI agents communicate to solve a cooperative problem, it’s helpful for them to be able to demonstrate genuine commitment to an action, along the lines of signing a binding contract.
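
One concrete mechanism for this, sketched below as my own illustration rather than anything the paper specifies, is a commit-reveal scheme: an agent publishes a hash of its chosen action up front, so quietly backing out later becomes detectable.

```python
import hashlib

def commit(action: str, nonce: str) -> str:
    """Publish a hash of the chosen action: binding, but hidden until revealed."""
    return hashlib.sha256(f"{action}:{nonce}".encode()).hexdigest()

def verify(commitment: str, action: str, nonce: str) -> bool:
    """Anyone can later check a revealed action against the commitment."""
    return commit(action, nonce) == commitment

c = commit("cooperate", nonce="s3cret")
print(verify(c, "cooperate", "s3cret"))  # True: commitment honoured
print(verify(c, "defect", "s3cret"))     # False: reneging is detectable
```

Real commitment between agents would need enforcement and incentives on top, but the hashing trick captures the ‘can’t quietly back out’ property.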

Institutions
In the paper, institutions roughly mean durable structures within which AIs can understand one another, communicate, and signal commitment.

These capabilities might sound simple, but they are far from it. The paper highlights, among other ideas, the importance of ‘recursive beliefs’ for understanding: that is, an AI has to understand what another AI believes, or understand what another AI believes a third AI believes.
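
A toy way to picture recursive beliefs (my own sketch) is as nested data: agent A’s model of the world contains B’s beliefs, which in turn contain B’s model of a third agent C’s beliefs.

```python
# Nested beliefs: A models the world, B's beliefs, and B's model of C's beliefs.
agent_a_beliefs = {
    "world": {"resource_is_scarce": True},
    "b_believes": {
        "world": {"resource_is_scarce": False},  # A thinks B is mistaken
        "c_believes": {
            "world": {"resource_is_scarce": True},
        },
    },
}

print(agent_a_beliefs["b_believes"]["world"]["resource_is_scarce"])                 # False
print(agent_a_beliefs["b_believes"]["c_believes"]["world"]["resource_is_scarce"])  # True
```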

The paper also highlights the importance of teaching to Cooperative AI: that is, humans and machines creating systems that allow teaching from one (human or machine) agent to another. Both of these ideas demonstrate the complexity that may be required in a cooperative AI-based system.

Downsides
This post is already too long, so I’ll be brief: the authors note exclusion and coercion as downsides. These are incredibly important to mitigate, but they are fairly intuitive notions. I note another related issue below: transparency and accountability.

Some reflections on the paper

  1. Although the authors argue that Cooperative AI is an interdisciplinary project, much of their language draws on game theory and quantitative reasoning. I wonder how much linguistic approaches using LLMs might support building institutions for cooperation between AIs. Some research seems to be underway in this field.
  2. When reasoning about Cooperative AI, it would be helpful to have some consensus on how it might be tested and measured. To me, this seems like a priority not just for demonstrating the effectiveness of the tech, but also for communicating ideas between disciplines and agreeing research norms.
  3. While the paper mentions the idea of a central planner as a ‘lens’, and acknowledges all kinds of cooperation topologies between humans and machines, the focus of discussion is on interactions between AIs. This sidelines the idea of an actual AI ‘central planner’ that all participants trust to equitably resolve competing claims. One central AI could be the simplest approach to coordination within a system, if it could earn participants’ trust.
  4. One very valuable application of Cooperative AI might be understanding cooperation: that is, Cooperative AI as a laboratory for understanding human activity. Even if Cooperative AI is never applied ‘in the wild’, this could be a huge contribution.
  5. When I think about how Cooperative AI might play out in the planning domain, transparency and responsibility are huge areas that would need to be addressed.

My predisposition is to look at academic literature; the next stop is to see how industry is exploring cooperative AI.


Jimmy Tidey

Civic stuff, network analysis, AI & agents, deliberation, design research, UXR.