A retelling of my experience at Action Potential, an AI safety retreat for university students organized by Kairos.
Our house was in Bodega Bay, a little town tucked away from San Francisco, from the noise — from everything. To the right, a road into the mountains. To the left, a forest path that led to the beach. Inside, lamplight pooled on the floor. The rationalists do love their lighting. Sessions into 1-on-1s into meals into more sessions — four days that moved like AI progress itself, waiting for no one.
We gathered to listen to a researcher from Redwood Research: a wild-looking man with a full beard who spoke like he was running out of time.
“In Plan D, the labs race ahead. We do whatever we can to export safety research to them, to build an aligned AI. We probably still fail.”
A half-breath silence. Then, raised hands.
Later, we played out the end of the world in the AI 2027 tabletop exercise. I got a lanyard with my role: the Executive Branch of the United States.
My turn comes. “In the interest of national security,” I announce, “the US government invokes the Defense Production Act to seize the compute of the competitor labs for OpenAI’s use.”
The Anthropic player groans, holding his face in his hands. Behind him, a whiteboard — a simple graph charting progress towards ASI. The line shoots upward.
The game finishes with the AI controlling Earth, sending probes with copies of itself into space, keeping humans as pets — for now. Afterward, we jousted over the morality of working for frontier labs, debated acausal trade like it could be real, calculated the expected marginal impact of our careers like it was the most natural thing in the world. Something warm settled in me. For the first time, my passion for AI safety didn’t feel like a strange and lonely thing.
I remember the quiet conversations past midnight, in small groups free to speak their minds. We whisper doubts about the possibility of existential risk. Someone asserts that evolution didn’t prepare the human mind to reckon with such short timelines. Could it really be that in five years, everyone we know and love will be dead? How can someone live with this lingering belief? Perhaps there are structural barriers to recursive self-improvement, some hidden assumptions baked into the whole framework…
I came to Bodega Bay with an intellectual passion for AI safety, hungry to learn about aligning the greatest technology in human history. I left with something visceral: existential risk was no longer a theorem.
My plane takes off, and I glance out the window. Below me, San Francisco sprawls — countless brilliant minds, city blocks and roads like circuits on a board. A single GPU.
This piece was brainstormed and refined in conversation with Claude Sonnet 4.6.