Talk of ‘subagents’ is getting pretty popular in certain circles. Given your delightful use of ‘stilts’—which I take to be a pretty elegant metaphor for the way that an oft-used frame of ‘friends as supports’ really works from the inside—I think it might be valuable for us to untangle some of the other metaphors that this intersects with. I’m hoping this will make future discussion a little easier.

As the title of this letter suggests, there are four terms I want to gesture towards: subagents, supports, stilts, and shims. The first two are ‘in the air’. The third is yours. The last is mine. Let’s take them in turn.

First, there’s subagents. I take the subagents view to be (basically) rooted in some variation on the following general structure of insights:

  1. We tend to think of individual humans as unified/coherent ‘agents’. This is an imperfect model (or ‘leaky abstraction’ if you prefer).
  2. In reality, most individual humans are (to at least some extent) fragmented/incoherent in the sense that they exhibit (and usually report having) internally conflicting desires, motivations, and needs. As a result, most individual humans take not-totally-coherent actions. It’s possible to ‘let yourself get the better of your better self’.
  3. Given (2), a better model is that a given individual human is ‘composed of [sub]agents’, where each of those subagents has its own (more coherent) wants and needs.
  4. The action of an individual human arises out of (some kind of) process of negotiation between that individual human’s subagents.

There are various approaches which make radically different claims on top of this basic structure. If you don’t mind a bit of woo with your crumpets, there are therapeutic frameworks such as ‘Internal Family Systems’ (IFS) which suggest techniques by which ‘you’ can access, name, and stage productive conversations between your subagents. If instead you’re more into “trying desperately to make actual progress towards being able to guarantee robustly safe & aligned AI before it’s too late”, then you can use this subagent model as a way into more general work on the possibility of ensuring that any truly powerful systems we build are both ‘inner’ and ‘outer’ aligned in various important senses of those terms. And then, of course, as always, there’s a universe of stuff that builds on that trauma book, and points out the extent to which symptoms of incoherence & conflict between these ‘subagents’—including, perhaps, ‘akrasia’—overlap with certain kinds of trauma symptoms.

You asked:

What are the consequences of letting go of the brain as a machine, and favouring the brain as a parliament? By introducing politics into our internal decision making models?

I take discussions of subagents, and the negotiated resolution of disagreements between subagents, to be at least in part discussions of that.

Second, there’s talk of (social) supports. In my mind, this kind of discourse shades all the way from vanilla ‘self-care’ to social justice ‘solidarity’ to anarchist infrastructures of ‘community’. Whereas the universe of ‘subagents’ discourse focuses on a rethinking of individual agents, my sense is that the ‘social support’ discourse is instead rooted in a desire to reorient outwards, towards a line of claims that’s something like:

  1. We think of ourselves as atomic, self-sufficient individuals.
  2. In reality, we’re all intensely embedded in networks of relations to other individuals.
  3. Given (2), we should consciously build and maintain strong, positive relations to other individuals so that we can ‘support’ them, and ‘be supported’ by them, when bad things happen.
  4. Most people don’t do (3) enough.

This is all fine and good. I bounce off most self-care-is-solidarity-and-community talk, as you know. But this is all fine. (Honestly, I dream of one day stumbling onto a peer-reviewed paper that suggests ‘community’ is in fact unimportant to self-reported wellbeing, or that strong/positive social ties have absolutely no impact on any measured physiological stress markers. At least then the little Welsh troll at the base of my skull that whispers phrases like ‘file drawer effect’ would shut up.)

Third, there’s your addition: stilts. I see this as something like ‘supports, but with awareness of subagents’. I take the thing you’re saying (indirectly) to be something like:

  1. If you introspect, you notice that neither ‘individual identity’ nor ‘interpersonal relations’ are particularly well-defined concepts.
  2. In practice, any serious attempt to be a coherent agent in the world entails some amount of ‘being able to rely on interpersonal relationships to hold the-you-that-you-are up’, and some amount of ‘recognising that those interpersonal relationships also alter, direct, and transform the-you-that-you-are (and the direction ’you’ go in)’.
  3. There are inherent tensions in (2). In particular: the people at the other end of the interpersonal relationships are themselves constantly shifting.
  4. ‘Balancing’ in light of (3) is less of an identifiable ‘state of affairs’ and more of a ‘constant process’.

I’m reminded of the famous line from Clifford Geertz, in ‘Thick Description’:

The concept of culture I espouse, and whose utility the essays below attempt to demonstrate, is essentially a semiotic one. Believing, with Max Weber, that man is an animal suspended in webs of significance he himself has spun, I take culture to be those webs, and the analysis of it to be therefore not an experimental science in search of law but an interpretive one in search of meaning. It is explication I am after, construing social expressions on their surface enigmatical. But this pronouncement, a doctrine in a clause, demands itself some explication. (Geertz 5)

In the stilts account, you come to understand (and tame?) the “solipsistic self” which (inevitably?) intrudes by tracing the webs of significance in which you find yourself suspended. In a coarse-grained, first-pass, quick-and-dirty sort of way, self is world. In a more careful analysis, ‘self’ and ‘world’ are phenomenologically suspect terms. Insofar as there’s a habit to be developed, here, or a reflex to be overcome, it’s something like ‘learning to stop unconsciously replacing the phrase “to help understand the world around me” with the phrase “to help understand myself”’. Here, relations are just too entangled with selves—‘supports’ too entangled with ‘subagents’—for us to have any hope of explicating one without simultaneous explication/construal of the other. No sitting in a darkened cave, alone, and hoping for deep understanding that will generalise. Be pragmatic.

This brings us, finally, to a new metaphor I want to introduce: shims.

In general, a ‘shim’ is a small, thin, tapered or wedged piece of material. A thing you use to close gaps, modify spaces, make things level, or provide a smoother interface. A shim is that little piece of metal that you slide underneath the foot of your lathe when you install it so that you can be sure that it’s perfectly level. A shim is the bit of wood you use to align a gap between two large chunks of timber when you’re building the frame of a cottage wall. In a pinch, when you’re picking a padlock, a shim is the thin piece of a Red Bull can that you cut with your Leatherman and then slide into the space between the shackle and the lock body to bypass the catch mechanism. And in computing, ‘shims’ are libraries that intercept calls to one API and transparently reformulate the call (or redirect it entirely) so that it can be handled by a different API. Writing a ‘shim’ is the thing you do when you need older code to be compatible with newer code; as Axel Rauschmayer puts it, “a shim is a library that brings a new API to an older environment, using only the means of that environment”. It’s solid material as lubricant, as glue, as improvised spacer.
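The computing sense is easy to sketch concretely. Here’s a minimal, hypothetical example in JavaScript (the `oldLog` and `log` names are invented for illustration, not taken from any real library): a newer calling convention implemented entirely on top of an older one.

```javascript
// Hypothetical old API: takes positional (level, message) arguments.
function oldLog(level, message) {
  return "[" + level.toUpperCase() + "] " + message;
}

// The shim: exposes a newer options-object API, but works by transparently
// reformulating each call into the old API's terms, using only the means
// of the old environment.
function log({ level = "info", message }) {
  return oldLog(level, message);
}

console.log(log({ message: "lathe is level" })); // "[INFO] lathe is level"
```

The callers of `log` never see the old interface; the gap between the two conventions is filled by a few lines of solid material sitting quietly in between.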

In this metaphor, I claim, humans are often trying to create (or discover, or repair, or modify, or share) shims of various sorts. At the level of individual humans interacting only with themselves, ‘shims’ are the habits we form (and masts we decide to bind ourselves to) so that we can fill some of the gaps between our subagents and reduce the damage they would otherwise do to each other when they rattle about. And at the level of the ‘interpersonal’, shims are the things we use to make our relationships more manageable and robustly positive-sum (or our stilts easier to balance on). They can be rules, or habits, or psychological technologies, or physical technologies. The key is that a metaphorical shim, here, is

  1. small,
  2. often transparent once installed, and
  3. either (a) used to fill an otherwise-damaging gap between two systems, or (b) used to create a useful gap where otherwise a damaging, friction-filled interface would exist.

When I decide to answer the phone if and only if I reflexively smile when I see the name of the contact that’s calling me, I’m (unilaterally) acting to increase the energy I put into the vibrantly positive-sum relationships in my life. That blanket decision rule is a shim.

When Zvi describes having lost 100 pounds using Timeless Decision Theory, I claim that—in the moment of actually internalising the insight that “sticking to the rules I’d decided to follow meant I would stick to rules I’d decided to follow”—Zvi was (in effect) wedging a generalised shim into the spaces between some of the-subagents-that-exist-within-the-human-agent-we-designate-as-Zvi.

And when you and I commit to writing these letters to one another, in public, as a continuation of conversations that began in private over a decade ago, this blog is a shim.