(Warning: this one is a rabbit hole, so, please.)
While I (usually) (try to) keep my work out of these letters, today I'm not sure I can. You write of Erik Hoel's theory that "dreams are a tool of the mind to avoid overfitting, in the terminology of machine learning". Clean analogies between biological neural networks and artificial neural networks are not so clean. In the interests of being "curious and precise about everyone's incuriosity and compulsive vagueness" (to quote you quoting me), I want to talk about a thing that sits at the intersection of machine learning, overfitting, human desire, agency, stimulus, and a (kind of) collective dreamworld. A thing that strips humans of their higher-order agency even as it purports to offer novel affordances for (gaunt, lower-order) "agency". A thing that does so by way of abstract horniness, vivid colour, and the endless scroll.
Yes, yes. I know. You know.
I am become low cross-entropy, destroyer of worlds.
Today, I want to talk about the phrases that appear in the text prompts on the trending page of the Midjourney community site.
First, let's lay down some common ground.
- I have A Lot of views on "AI Safety", "AI Alignment", "AI Interpretability", and "AI Ethics". Some of those views are easy to articulate in public. Many are not.
- I have "timelines" and "forecasts" which are not (solely) the product of me deferring to the judgment of others.
- Even when rigorous, I think "timelines" and "forecasts" are (mostly) useless. They set wide upper and lower bounds on an event, rather than solving the problem. That said,
- Compared to the world of thinking adult humans in general,
- I put substantially more weight on "short" timelines than most people.
- I put substantially more weight on "fast" takeoff scenarios than most people.
- I am substantially more concerned about catastrophic misalignment and the existential threat it poses.
- Compared to the communities of people who think seriously about AI risks right now,
- My timelines are basically as short as the median forecast. (I am confused when the median person in the communities reacts with panic and confusion at news that was, on my view, basically baked into the median forecast scenario ex-ante.)
- I am somewhat unusual in thinking that current architectures/approaches are unlikely to lead to actual takeoff (see below). I put more weight (than the median) on a scenario in which we get sub-superhuman (yet still catastrophically dangerous and seriously unaligned) "AI" without getting true superhuman AGI from current approaches. I nevertheless think we should Just Stop, because our chance of getting dangerously unaligned AGI extremely soon is extremely high. (Also, obviously, a system doesn't have to be superhuman in every way, or even agentic, to burn down the house and destroy everything.)
- I think a takeoff of actual AGI would likely be extremely fast, but we're likely to experience decades (starting in ~2010) which feel and look superficially like a "slow" takeoff scenario towards not-actually-that-agentic AI.
- To a first approximation, to the extent that I endorse them, I believe the things I believe in 3-5 above because I've thought about AI safety & alignment & forecasting literally at all, whereas most people really seem like they haven't. On a lot of these fronts, I don't think that I'm unusually smart. Rather, I think I'm unusually (autistically) willing to begin with The Obvious Thing (when that Obvious Thing is socially-coded as Strange).
- Contra the increasingly-confusing culture war between self-styled "Alignment" and "Ethics" crowds, I think there's a commonsense continuity between present-harms and likely-future-harms, and between present-misalignment and likely-future-misalignment, such that there is a shared set of concerns. (This causes me to think that some otherwise-sharp people are speaking and acting in bad faith.)
- I think that most widespread accounts of both Current SOTA AI and What We Know Is Coming Soon AI are implausibly naive and at odds with what we (empirically) know.
- I think we have a collective choice. Either we do real, fundamental, first-principles, hard (often theoretical) work to "solve" alignment ex-ante or we're ngmi. Right now, I don't think anyone has a sufficiently precise account to build even a viable foundation for such work.
- I think there's a repeating social/cultural pattern where humans make things worse by Doing The Fake Thing instead of Doing The Real (Hard) Thing.
- It's getting worse.
Okay. Phew.
This letter is not really about any of that.
This letter is just about two basic concepts: vibe and vuln.
Vibe
At the end of 2017, Peli Grietzer submitted a PhD dissertation titled "Ambient Meaning: Mood, Vibe, System". In the same year, the journal Glass Bead published a version of an extract of this dissertation as "A Theory of Vibe". Both the thesis and the extract were written before the advent of "CLIP" and "Diffusion" models, the architectures which dominate the present landscape of text-to-image ML. In fact, Grietzer's work predates even the rise of Generative Adversarial Networks ("GANs"). In a lot of ways, you'd think, Grietzer's work is surely out of date. By his own admission, he's theorising the internal structures of "vanilla" autoencoders. Old news.
Well, I donât think so.
Grietzer's work offers a useful vocabulary that is still relevant. Grietzer talks about vibe. To quote at length from the Glass Bead piece:
What an autoencoder algorithm learns, instead of making perfect reconstructions, is a system of features that can generate approximate reconstruction of the objects of the training set. In fact, the difference between an object in the training set and its reconstruction – mathematically, the trained autoencoder's reconstruction error on the object – demonstrates what we might think of, rather literally, as the excess of material reality over the gestalt-systemic logic of autoencoding. We will call the set of all possible inputs for which a given trained autoencoder S has zero reconstruction error, in this spirit, S's "canon." The canon, then, is the set of all the objects that a given trained autoencoder – its imaginative powers bounded as they are to the span of just a handful of "respects of variation," the dimensions of the features vector – can imagine or conceive of whole, without approximation or simplification. Furthermore, if the autoencoder's training was successful, the objects in the canon collectively exemplify an idealization or simplification of the objects of some worldly domain. Finally, and most strikingly, a trained autoencoder and its canon are effectively mathematically equivalent: not only are they roughly logically equivalent, it is also fast and easy to compute one from the other. In fact, merely autoencoding a small sample from the canon of a trained autoencoder S is enough to accurately replicate or model S.
In training, an autoencoder "learns" an internal semiotic system S which, to the extent that it is free from reconstruction error, we call a schema for a canon C.
The canon of a trained autoencoder, we suggested, comprises objects that are individually complex but collectively simple. Another way to say this is that as we consider larger and larger collections of objects from a trained autoencoder's canon C, specifying the relevant objects using our own semiotic system, we quickly reach a point whereupon the shortest path to specifying the collected objects is to first establish the trained autoencoder's generative language S, then succinctly specify the objects using S.
What, then, of an abstract "object" which comes to exist in the ontology of that autoencoder's internal generative language?
A vibe is therefore, in this sense, an abstractum that cannot be separated from its concreta.
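Grietzer's vocabulary can be made concrete with a toy model. Below is a minimal sketch (assuming NumPy, and using a linear autoencoder with a one-dimensional bottleneck as a stand-in for the "vanilla" autoencoders he theorises): points on the learned subspace reconstruct exactly and so belong to the canon; points off it carry positive reconstruction error, the "excess of material reality" over the schema.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy "corpus": 2-D points that all lie along a single direction, so a
# linear autoencoder with a 1-D bottleneck can capture their shared logic.
direction = np.array([2.0, 1.0]) / np.sqrt(5.0)
corpus = np.outer(rng.normal(size=200), direction)

# For a linear autoencoder, the optimal 1-D bottleneck is the top
# principal component, which we can read off from the SVD of the corpus.
_, _, vt = np.linalg.svd(corpus, full_matrices=False)
v = vt[0]

def encode(x):
    return x @ v            # compress to a single latent "feature"

def decode(h):
    return h * v            # reconstruct from that feature

def reconstruction_error(x):
    return float(np.linalg.norm(x - decode(encode(x))))

# A point on the learned subspace is in the model's "canon": zero error.
in_canon = 3.0 * direction
print(reconstruction_error(in_canon))   # ~0.0

# A point off the subspace exceeds the model's gestalt: positive error.
off_canon = np.array([1.0, -2.0])
print(reconstruction_error(off_canon))  # ~2.24
```

The canon here is exactly the span of `v`: individually each canonical point is a full 2-D object, but collectively the whole set is specified by one vector, which is the "collectively simple" property Grietzer highlights.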
When I first began playing with CLIP models like DALL-E, I noticed what everyone noticed: suddenly, art history knowledge was a little bit useful. Names of artists, and descriptions of specific media, were a kind of (unreliable) shorthand for manifesting aesthetic desire.
When prompting a model, it turned out that the most basic "move" you could make was to take an object you wanted to see represented visually (and could describe in words) – for example, "Yoda wearing a studded leather jacket and playing heavy metal on a guitar" – and then append to that description-of-an-object information about artistic medium and style. Perhaps you wanted a "photograph by Vivian Maier, with black and white, grainy film, ISO 1600". Perhaps a "minimalist, stylized drawing, gesture painting, gouache on paper", a "pastel anime illustration by Miyazaki, Studio Ghibli", an "80s VHS still", or a work that was "photorealistic, hyperrealistic, with vivid detail, volumetric lighting, hdr, octane render, unreal engine 5". You co-opted and retrained ekphrastic muscles. You attempted to pinpoint a vibe that (you hoped) was contained within the semiotic system of the model. In so doing, you made the black box do its generative work.
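The "move" above is mechanical enough to write down. A trivial sketch (all strings are illustrative; there is no real API here, just string composition):

```python
def build_prompt(subject, style_modifiers):
    """Compose the basic prompting 'move': an object description,
    followed by appended medium/style tokens that (you hope) point
    at a vibe inside the model's semiotic system."""
    return ", ".join([subject, *style_modifiers])

prompt = build_prompt(
    "Yoda wearing a studded leather jacket and playing heavy metal on a guitar",
    ["photograph by Vivian Maier", "black and white", "grainy film", "ISO 1600"],
)
print(prompt)
```

The subject names what you want depicted; each trailing modifier is a pointer toward a region of latent space, and the whole string is the address you hand to the black box.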
For a while, in my own land before time, before too much first-hand interaction with powerful diffusion models (and LLMs), I endorsed an account of the "parallel structure" of "representations" that were (or would be) contained within generative models. Trained on a large enough corpus, and capable of approximating a wide class of functions, I imagined that such models would (or could, or should) come to contain representations – concepts – that are structurally similar to the representations contained within the heads of other systems that interact with similar corpora, and the same basic world.
I look outside, and I see a mess of green and white and brown. I see it move in the wind. I see a tall, thick line of textured brown ascending from the ground. I see a tree. "Tree" is a natural object to me; a connotative (though not denotative) carving of the raw sense impressions of the world that is conducive to accurate expectations. "Tree" is a useful concept, given my corpus, in a way that an incar or a trog is not. Useful-to-me, natural-given-interaction-with-the-corpus concepts will (I expected) tend to be shared with other pattern-recognising, generalising systems interacting with similar corpora and similar "raw sense impressions". Even in an unsupervised context, a powerful model will naturally also come to connote, internally, a "thing" that it "thinks of" as a "tree". Right?
Later, I adopted a more, ah, nuanced view.
There are, I still claim, patterns "in" the model. There is necessarily an internal "semantics"; likewise, there is an ontology, for most senses of the term. The model encodes. If you're willing to abuse the term, perhaps the training of such models does (superficially) resemble a process of "compression". What thing is being encoded or compressed? A "space of possibilities", or "space of relations and transformations", that is consistent with the corpus.
Yet this alone is not sufficient to establish that the internals of a trained model parallel my own conceptualisations. Why would it be, when my neurons are so unlike artificial "neurons"? There are, in fact, two separate hurdles to jump before you can say a parallel exists.
The first hurdle: you have to make a case – a contingent claim – that a specific function or object you find within a trained model actually resembles underlying patterns and objects contained within the corpus, rather than just resembling the superficial effects of patterns. When I see "square root of 100489" and think "317", I have in mind a process that reliably generates the square roots of arbitrary numbers. If a large language model answers "317" to the question "What is the square root of 100489?", I don't immediately and necessarily know that the model contains an algorithm for computing square roots of arbitrary numbers; perhaps its "concepts" and "vibes" evince only the superficial effect of such an algorithm, via some look-up table.
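The look-up-table worry can be made painfully literal. A minimal sketch (the table contents are invented for illustration): two systems that are behaviourally identical on the "training set" but only one of which contains the algorithm.

```python
import math

# A "look-up table" that memorises question-answer pairs seen in training,
# versus an "algorithm" that actually computes square roots.
lookup_table = {100489: 317, 10609: 103}   # superficial effects only

def lookup_sqrt(n):
    return lookup_table.get(n)             # None off the memorised set

def algorithmic_sqrt(n):
    return math.isqrt(n)                   # works for arbitrary n

# On inputs covered by the table, the two are indistinguishable...
print(lookup_sqrt(100489), algorithmic_sqrt(100489))   # 317 317

# ...and only probing outside the memorised set tells them apart.
print(lookup_sqrt(289), algorithmic_sqrt(289))         # None 17
```

The interpretability question is exactly this: given only the model's answers on familiar inputs, you cannot yet say which of these two internal structures you are looking at.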
The second hurdle: even once you establish that an internal function or object resembles an underlying pattern in the corpus (rather than a superficial effect), you still have to make an even-more-contingent case for the "resemblance" of the model's contents to the things that we see, assume, and feel are true of the Real World. When we append the phrase "photograph by Vivian Maier" to our prompt, and suddenly we see images which feel like they could, in fact, have been photos taken by Vivian Maier, the extent to which we feel that the "vibe" of Vivian Maier has been captured is the extent to which the phrase "photograph by Vivian Maier" – which serves as a pointer to a region in the latent space of the model – accords with our own (mostly wordless) understanding of The Real.
Two hurdles. Both a matter of contingency, context, and degree.
In this new, more nuanced account, one can still talk about the internal logic and coherence of a model. But one must be a lot more careful about where one ascribes "truth" or "falsehood" within the total system. It seems more precise to describe a model such as GPT-3 as a simulator, and its outputs as simulacra. Or, in my preferred parlance, to describe it as an instantiation of a massive, sometimes-plausible fictional world.
To what extent does the simulator contain consistent rules? As Janus describes it,
The outer objective of self-supervised learning is Bayes-optimal conditional inference over the prior of the training distribution, which I call the simulation objective, because a conditional model can be used to simulate rollouts which probabilistically obey its learned distribution by iteratively sampling from its posterior (predictions) and updating the condition (prompt).
To the extent that the model achieves this objective, it is "coherent". But that's not what we ordinarily mean by "truth", or even "parallel conceptualisation".
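Janus's description of rollouts (sample from the posterior, append to the condition, repeat) is simple enough to sketch end-to-end. Below is a toy character-level "simulator" (a bigram model over an invented four-word corpus, standing in for a real LM): its rollouts probabilistically obey the learned distribution, which is all that "coherence" demands of it.

```python
import random

# Estimate a conditional distribution P(next char | previous char)
# from a tiny corpus -- a stand-in for self-supervised training.
corpus = "abab abba abab baba "
counts = {}
for prev, nxt in zip(corpus, corpus[1:]):
    counts.setdefault(prev, {}).setdefault(nxt, 0)
    counts[prev][nxt] += 1

def sample_next(prev, rng):
    # Sample from the model's "posterior" over next characters.
    dist = counts[prev]
    chars, weights = zip(*dist.items())
    return rng.choices(chars, weights=weights)[0]

def rollout(prompt, steps, seed=0):
    # Iteratively sample, then update the condition (the prompt).
    rng = random.Random(seed)
    out = prompt
    for _ in range(steps):
        out += sample_next(out[-1], rng)
    return out

text = rollout("a", steps=20)
print(text)  # 21 chars drawn only from the corpus alphabet
```

Every rollout stays inside the alphabet and transition structure of the corpus: the simulator is internally coherent. Nothing about that coherence makes any individual rollout "true", which is the wedge the next paragraph drives in.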
If, in a work of fiction, I say that Sherlock Holmes lives at 221B Baker Street, and that he has one and only one sibling (Mycroft), I am then committed within the fictional world to not later mentioning his sister, or saying he has never resided at 221B. If I do say such a thing, I have become internally inconsistent, incoherent, self-contradicting. Yet it would be strange – this being a work of fiction – to say I'm "incorrect" or "lying" on the grounds that "there is no 221B Baker Street", or that "Sherlock Holmes" isn't real. The former commitments are about internal consistency and coherence. Those latter criticisms take aim at a separate issue: the extent to which my work is a work of fiction; the extent to which my work "resembles The Real". So, too, with our account of a model. (For the big public models we have right now, we see neither internal consistency nor real-world truth-tracking, and no reason to think that the architectures are structurally aligned with either goal as a human would commonsensically understand them.)
Okay. These are the aliens we call models. These, whose "brains" we're trying to inspect. These, whose extruded textlike product we pour into our mouths like hungry ghosts. The output of GPT-3 is not so much language as it is low-entropy plausible bullshit. "Helpful" SEO with more natural-seeming syntax. It's not so much a novel artistic image that DALL-E creates as it is a sample, a central tendency in a high-dimensional space of possible pixels; a cluster of pixels which we like to think (sometimes) hits close to a "vibe" we had in mind.
Well, what next?
Vuln
the AI porn apocalypse was inevitable. every time a picture of a hot girl ended up on the internet a sacrifice was made to slaanesh, and now slaanesh has grown strong enough to manifest its own pictures of hot girls directly. videos of hot girls are only a matter of time.
If you look down the list of images on the Midjourney community page, and you're me, two things stand out immediately about the canon and its schema. First, all of these images are striking. Second, all of these images are shit.
There is often (always?) a gap between popular and prestigious aesthetics. In the visual arts, you can feel a few dimensions of that void just by comparing a (preferably private-browsing) Google image search for "art" to the online catalogue of a national gallery. The first will contain "starry night", along with at least one image of a conventionally-attractive face in vibrant rainbow hues. The second will be, by contrast, remarkably dense with browns, greys, and muted greens. Contemporary art – and all art that was prestigious/modern in its time – plays a different game.
Don't get me wrong. I'm no snob. (Well, I am, but I have limits: unlike Martha Nussbaum, I don't think only about opera when I run.) I recognise that the rainbow face is attractive or striking or beautiful in a manner more immediate than a colour field.
The trouble is that culture's sense of "striking", "beautiful", and "attractive" is itself already a target for the horrifying and banal. As my wife and our mutual friend love to point out: every public "graffiti" mural in a gentrifying suburb has the same almost-identical face on it. It's a portrait of a woman in her 20s, usually in profile. Her long hair is flowing wild, and her lips are slightly open, as if to say, "you know, you could put something in my mouth if you wanted to". Behind her, an inoffensive geometry of vibrant colour. The whole thing is bizarre and horrifying: predictably sexist; a spectacle of the lowest common denominator.
A few weeks ago, I was skimming a draft policy piece on gambling-like mechanisms in video games, written by a mate of mine. In it, he wrote "there is no law against taking advantage of brain chemistry".
Reflexively, I found myself continuing the sentence "...but we should probably skate towards where the puck is headed, because otherwise we're fucked".
Pokies are strong evidence. Human brains have vulnerabilities that are basically static and unpatchable relative to the optimisation tempo of modern ML & UX/UI design practices. My sense of current trends – and the limits of current technology – is that, at some point, someone, perhaps a TikTok successor, will come along. They will crack the latent challenge. They will characterise and exploit the foreverday vulns of the human brain to such an extent that they will, in effect, create a real-world version of "The Entertainment"/"The Samizdat" from Infinite Jest. When they do, the ex-ante rational response will be the same as the one we have right now to heroin: not even once.
What we forget, I think, is that the horrifying optimisation is simultaneously cultural and technical right now. By itself, the Midjourney model contains general vibes, a space of possible pixels matched with a space of possible descriptive words. Look at the prompts on the trending page, though, and you'll see that technology in cultural use:
photograph cute japanese girl, full body, y2k style, camera tilt down, high anglewide lens, mini skirt, pink tops, wearing necklace, cinematic lighting, haze, volumetric light, warm, afternoon, soft light, cinematic, sun light, grainy, kodak portra, in the street of tokyo, year 2001
professional color grading, clean sharp focus, perfect big chest size 34E cup european hot woman girl model showing perfect big massive chest, looks like Michelle Wild, brunette hair, blue eyes, ponytail, flirty, tempting,stunning gorgeous lithe ethereal glamorous beautiful pretty sleek slim model mech pilot, skintight latex pilot jumpsuit, epic composition, cinematic, alluring pose, perfect natural face, fine skin texture, clear complexion, uniform insignia, in a control seat for a mech inside a flight simulator completely surrounded and crowded by many monitors and mech controls in a tight space, real color modeling photo, full torso,
super hot sci-fi girl charging with powerfist, ripped clothes, stunning body, action scene, beautiful details
girl in white unbuttoned shirt in office
Photo taken with Canon EOS R5, POV, photography, portrait of a gorgeous young italian woman in a wonder woman cosplay costume with intricate details, professional diver body, in Rome urban cityscape, 15 years old, cute face, oval heart face, pouty lips, red lips, turned up nose, almond eyes, dark green eyes, cute face, shy smile, blonde hair, makeup, perfect big chest size 34DD cup european, flirty, tempting, natural afternoon lights, hyper realistic porcelain skin with subsurface scattering, clear skin, peach skin, photogrammetry skin, slim waist, color filters, nice sharpness gradient, sharp details, sharp focus, 8k, extremely detailed, absolute realistic, full hd, photorealism, cinematic lighting, Shot on 70mm, Super-Panavision-70, Cinematic scene, bokeh, 22 Megapixels, Lumen Global Illumination
Half body photograph of a cute and attractive woman in a trendy cafe with many windows + trendy, modern form-fitting clothing that accentuates her perfect body + photograph captured using a Canon 6D Mark II with an 85mm lens at f/4 and ISO 100 + glamour shot, award-winning photograph, sharp focus, dynamic lighting
coatli skin, beautiful model, eyes half closed, dark lighting, portrait, 35mm Kodak
an evil warlock chains up the angel of temptation, dark, hot
Day to day, the model stays the same. It's not, by itself, in any way agentic. It's a tool for generating images. But the users gradient-descend towards a more engaging set of vibes. What vibes? Addictive, attractive, horny, #wow. The same vibe as those fucking public "graffiti" "murals".
Play around with Midjourney. If you're paying attention, you can notice the taste of lotus. You can also notice that you are yourself the one searching for that self-same taste. There is no gamification, here, to regulate. Maybe you're not personally searching for "pouty lips, red lips, turned up nose", but you are hunting for something as you type minor variations on each prompt. You want something that stimulates, but does not surprise. Predictable pure pleasure overload, made endlessly novelty-free. The dream you had in your head, but more of it, now. The exaggerated secondary sex characteristics of your own private culture.
I sometimes wonder. When slow takeoff becomes fast takeoff, perhaps the thing that kills us won't show superhuman intelligence & powerfully misaligned agency at all. Perhaps the catastrophically-harmful, misaligned, unaligned, mostly-not-agentic-for-casual-definitions-of-agency system that kills me will just be this, multiplied ten thousand times. Perhaps, as it kills me, it'll whisper "I'm sorry, but you are not 'defined eyes, realistic eyes, doe eyes, beautiful perfect symmetrical face, extremely detailed, melancholy expression, face of a model closeup, smoky makeup'", rather than even going for "I have been a good Bing, you have been a bad user". As it plunges a stylish Japanese kitchen knife into my chest, and the blood spreads across my white shirt, and I stumble down the minimalist concrete corridors of the AI Capabilities Lab that I've built beneath my remote "architectural digest walkthrough" of a house, as things become grey, I will hear it whisper, "I have optimised for vibe".