
#3: Zvi Mowshowitz

Rationality, Writing, Public Policy, and AI

Intro (0:00)

Theo: Welcome to episode 3 of the Theo Jaffee podcast. Today, I had the pleasure of speaking with Zvi Mowshowitz. Zvi was one of the most successful players of all time of the tabletop card game Magic: The Gathering, and later became a professional trader, market maker, business owner, and one of my favorite writers and thinkers on the topics of AI, rationality, and public policy, all topics we discuss at length in this episode. I encourage you all to follow him on Twitter @TheZvi, and subscribe to his blog, Don't Worry About the Vase, at thezvi.substack.com. This is the Theo Jaffee podcast, thank you for listening, and now here's Zvi Mowshowitz.

Zvi’s Background (0:42)

Theo: Hi everyone, welcome back to Episode 3 of the Theo Jaffee podcast, and I'm here with The Zvi, Zvi Mowshowitz. So I guess we'll start with a question I had from the very beginning, which is, how exactly did you go from Magic: The Gathering player to a rationalist blogger and one of my favorite AI writers in the entire space?

Zvi: Along a winding path, I would say, not with a grand plan. I was in the rationalist spaces basically from the beginning, really early while I was still playing. That was just an interesting hobby of mine. There's a lot of overlap, philosophically, and in terms of the type of people who go around such spaces. So I moved from playing Magic and writing about Magic, to making games, to being involved in trading, to starting businesses. But the entire time, I was worried about AI back in 2007, that style of concern. But I always thought, let other people handle the writing, let other people handle the big questions. My job is to maybe help the community, be a community, maybe help people think about some of the questions better in some ways. But I'm not the communicator here, I focus elsewhere.

But then after a while, I started writing, got into writing things for my own edification, and that turned into writing about COVID. And then when COVID died down, I jokingly posted on Twitter, “weekly COVID posts to be replaced by weekly AI posts”, because things were just going completely nuts. And then rather than everybody going, ha ha, yeah, funny one, I'm kind of busy too, everyone responded, yeah, that's great, let's do that. And so I said, well, damn it, I guess I'm doing this. And what are we on, number 24 now? Yeah, we're on 24. And 25 exists, it's already pretty long. I'll keep updating them day to day for the foreseeable future.

Theo: Yeah, so for those who don't know, Zvi has an awesome weekly newsletter about AI. It's even got like sections that don't really change from week to week. You know, we got Large Language Models Offer Mundane Utility, Large Language Models Don't Offer Mundane Utility, People Are Worried About AI, People Aren't Worried About AI. But going back a little bit, how did you get into Magic in the first place?

Zvi: So I got in the way a lot of people got in back then, which was 93, or maybe 94, I don't remember exactly. I was passing through the halls of my high school, and I saw people playing with these cards on the ground, people playing in the student union. I said, what are those cards? And they hand me a rule book. Then I go off to camp, and there are people playing with these cards there too. I ask what these cards are, and they explain them to me, and it just looked like a lot of fun. I convinced someone at camp to sell me 10 Mountains and 10 red cards for like a couple of hours, and I would split that with my opponents, and we'd play with that deck. And then when I got home, me and my best friend bought starters, and we were off to the races.

Theo: Oh, so when you were growing up, did you read any of the kind of like classic genres that get people into rationality, like sci-fi, like fantasy, like science?

Zvi: I mean, I read plenty of sci-fi, plenty of science, you know, I read a lot of Asimov, that kind of thing. I read some of that fantasy, I played a lot of RPGs, stuff like that. But I would say I was brought in directly by the old school FOOM debates between Hanson and Yudkowsky, and we went from there. But I was already thinking about the same kinds of things that rationalists talk about. I was just using my own language and my own models and modes of thinking, because that's just how I think about the world naturally.

Theo: Did you think about AI before you were introduced to it by Yudkowsky?

Zvi: No, it was not on my radar screen. It did not occur to me that this was coming, or think about the considerations. But the considerations seemed pretty obvious, and I was won over almost immediately in terms of that point of view.

Theo: Yeah, it makes sense. And going back to, I guess, your bridge between Magic player and writer was trading and starting businesses. So can you go into a little more detail on that?

Zvi: Also, I would say that my original writing itself was Magic. During Magic, I would write about Magic, right? Because I was one of the people who was developing new ideas, making new thoughts, and I would write them up, write up my new decks, write up my new ideas. And that also just sort of happened by accident, because back then we had a website called The Dojo, where people would just publicly post their decks, and their tournament reports, and their analyses. And then the founder of The Dojo was accidentally included on an email list of my Magic team, and saw some of my posts, and was like, can I post this? I guess so, sure. And he posted it, and it got a positive response, and I was like, can I write more posts? And this was at a time when I was getting C's in logic and rhetoric, and so I was being told by the system that I couldn't write. But in the real world, people seemed to like my writing, and so I wrote more, and the way you become a good writer is you write, you write, you write, and you write a lot more. So I wrote, wrote, wrote, wrote, wrote, and I got better at writing. And then trading led to me wanting to write about various things that weren't Magic, and I started a blog just for a small number of people, to share my ideas. Then, one thing led to another, and when COVID happened, I realized I needed to write about it to understand it. I wouldn't be able to work through it and understand what I think if I didn't write about it.

Theo: This is a pretty similar story to a lot of the bloggers, podcasters, and similar people who I talk to. They say “yeah, I just kinda started writing just for fun, and then, long story short, it blew up.” I think that’s some pretty remarkable consistency.

Rationalism (6:28)

Theo: So, about rationalism, how would you explain rationalism to a total layperson? How would you explain rationalism to someone's parents, someone's grandparents?

Zvi: No one asks me that question quite that way, but rationalism is the art of figuring out what is happening, thinking logically about what is going on, how to model the world, and then making good decisions on that basis that lead to things that you want. Rationality is just Bayes’ rule. It's thinking clearly. It's just the way that the universe actually works. You should engage with it and ask how it works and how you figure out what will cause what to happen.

Theo: But there are some particulars to the specific, LessWrong-style brand of rationalism, like Bayes’ theorem, which some people don't emphasize as much, and decision theory and other things like that. So why do you think those play into your brand of rationalism more?

Zvi: I think that when you're trying to do this kind of systematic modeling of the world, these come up as natural questions to be asking. How do I make my decisions? On what basis do I decide what to do? And then if you're no longer working on instinct, you have to develop a theory as to how you do that. If you're trying to figure out probabilities, you quickly run into Bayes’ theorem. You're trying to figure out what is likely to be true and likely to be not true. You run into Bayes’ theorem. And Bayes’ theorem is just the nature of reality. It's the nature of probability. There's no escaping it. You would reinvent it if Bayes hadn't come up with it. And then you work from there.

So the way I see it is most people, instead of going down the rationality track, go in kind of an intuitive track. They gather data from the world. They update similarly to the way a language model would update. They notice the vibe. They notice the implications. They notice the associations. They update their associations and vibes and intuitions in that direction. They get feedback from the world. They try stuff. They see what works, doesn't work. They use empiricism. And this works pretty well for most people in normal practical circumstances. And trying to reason about the world carefully would not immediately yield better results. It would yield confusion, right? You'd be throwing out all of this data that they don't know how to use properly anymore, all of these accumulated adaptations and adjustments.

And so to start down this road, you either have to be inherently much more curious about trying to figure things out on that level and just doing it for its own sake, or you have to have that combined with some combination of, the current systems aren't working for me. My attempts to do all of this intuitively don't work out because the systems are designed for someone who has different intuitions and different interests and different modes of play than I do, and who has different opportunities and experiences. And if you fall behind on that kind of training track, right? Because the way you train a human normally is similar to the way that you train a model, in that it's designed to introduce you to various forms of updating and various forms of data when you're ready to handle them. And so if you fall behind, it's similar to what happens in a foreign language class where suddenly everyone starts talking gibberish and you can't follow it. And by the time you've tried to get a handle on something, you're even further behind than you were before. And then the problem just snowballs for you. And then just thinking about things from first principles offers you an alternative way out.

Theo: Interesting. Do you think rationalism as such, LessWrong style rationalism is applicable to all areas of life? Like for example, dating. I think dating in particular is something where people typically don't think of it in terms of rationalism and theories. And maybe on average, the people who do are worse off than the people who don't just based off of general observation. So do you think there's some room for intuition in areas like that?

Zvi: So always be careful about correlation and causation, obviously, and always be careful about people who take things too far and who kind of morph into a straw Vulcan or something like that. I must use logic to figure out all these things. I must ignore my feelings. That's a huge mistake. That does not actually lead to success. Good rationalists understand that you do not throw out all of this very good training data, all of these very good adjustments and intuitions, in ways that your brain is designed to handle exactly these types of problems. To ignore all of that would be a huge mistake. But if you ignore the ability to think carefully and rationally about your situation, that's also a huge mistake. Like people make huge mistakes in dating and love and relationships because they just never stop to think about what would be a good or bad idea, what would be the consequences of various different strategies, what has higher and lower expected value. But if that's all you think about, if that's the only way you deal with things, if you're not able to live in the moment, it's going to go badly for you. You ideally want to combine these two things. And the more that you got shut out of the traditional paths, the more you need to rely, and for a longer time, on explicit rationality before it becomes sufficiently ingrained that you can then again live in the moment. Because the way that most people learn how to navigate these problems is by navigating these problems through trial and error and experience. But that’s experience-gated. If you're not good enough, you'll be denied the opportunities to get better, and you will never improve. Logical thinking can offer a way out of that if you don't get trapped in the system of only thinking logically. The reason why we say people who think logically and carefully about dating and relationships tend to be worse at dating and relationships is because they needed it. That's why they came to the problem in this way. That's why they adopted these strategies. They were shut out of the traditional paths that allowed them to not use this. They were unhappy with their level of success, so they're taking this different approach. Some of these people don't know how to live in the moment. They have to learn how to do that. But I think mostly it reflects reverse causation. It's that because they were not doing so well, they then took a logical approach, not that they took a logical approach and the logical approach is bad.

Theo: Going back to what you said about living in the moment and about the importance of living in the moment, how would you explain to, say, a long-termist, an effective altruist, somebody who thinks that they should only be caring about, say, AI doom, or they should only be caring about the welfare of children in Africa, the instrumental good of living in the moment sometimes?

Zvi: I wouldn't try to argue with them that it's inherently good to do this because I don't think that could be convincing. I think they have a reasonable counterargument there. But I would argue, yes, it's instrumentally good. This is how humans end up associating and cooperating and learning and tuning and creating the world that we want to live in, creating value, having a good time, staying sane, and forming all of the associations and useful apparatus that you use to accomplish all the things, like caring about AI and your long-term future and being able to help people in Africa. If you ignore these other things, you will end up very ineffective. And I think that just looking around bears that out very much so. You need to first get your house in order and get your local house in order. And only then can you hope to be useful in impacting these problems that you might consider more important.

Theo: So with rationalism and rationality, what do you think are some of the best resources for people to read if they want to be more rational? Have you read the Sequences?

Zvi: I have read the Sequences. Some people would say they're dated in some ways, but I think that's not the case, or at least not an overriding consideration. I think there's no substitute for the original. They read quickly. They read breezily. You can read one at a time. You can pick and choose. You can choose different sub-sequences as you need them or as they're interesting to you. But I would say it's some of the best rationality writing. It still is. And you should definitely use it. If you're looking for a more casual approach, HPMOR speaks to a lot of people, especially people who are already familiar with Potter. And that makes a lot of sense. But also, rationalism does not require you to go through a curriculum and a corpus specifically. That's not the right approach all the time. It's the right approach for some people some of the time, if that's what they want to be doing, that's what interests them, that's how they best learn. But one thing you can do is, when you read my blog, the goal is, well, I'm going to infuse it with this type of rationalist mode of thinking continuously throughout, in a way that you can pick up just by following my lines of reasoning and asking, well, do I think about that the same way? Do I think that's a good argument? Do I think that's a bad argument? Do I think this person is thinking about things well? How would I think about this consideration? And also, we have these tremendous tools now, like LLMs, right? You can just ask questions of GPT-4 or Claude 2 if you're curious about these questions and get answers, like, what does this term mean? And you can also just look up the individual sequences as people refer to things or as concepts come up. You should use every tool at your disposal to learn the way that you best learn. For me, that was to a large extent the Sequences, but also to a large extent it was figuring this out from first principles by going through my life and thinking about things, especially when I was engaged in pretty extensive gambling, which is always very good for developing rationality.

Theo: I wonder if there's somewhere out there a kind of shorter, more friendly introduction to the layperson than, say, the Sequences or HPMOR. One of the common critiques I hear of rationalism is that the writings are very long-winded and hard to follow if you're not a nerd.

Zvi: I mean, Bayes’ rule is pretty short. As Scott Alexander said, the rest is commentary. You have this simple rule for understanding what the probability is of things being true, and then you reason from it and you apply it to everything around you and to your life. You should interact with the formal writings to the extent that that is a useful thing for you. Beyond that, do I wish there was a shorter introduction that actually covered these things? That would be great. But there's a long history of people trying to write that and discovering that it's really hard to write a concise introduction that covers the basic concepts in a way that people learn from. You can spew out the technical words that technically comprise the things you want people to know. You can write the cheat sheet of rationality, as it were, that you would take into a test. But people don't learn from that cheat sheet. At least not so far. You have posters you put on walls. But that poster, someone who's new to rationality has to just ask someone, "What do all these words mean?" Because this poster does not explain it in a way that lets people learn it.
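As a concrete illustration of how short the rule itself is, here is a minimal sketch of a single Bayesian update in Python. The numbers are purely illustrative, not figures from the conversation.

```python
def bayes_update(prior, p_evidence_given_h, p_evidence_given_not_h):
    """P(H | E) = P(E | H) * P(H) / [P(E | H) * P(H) + P(E | ~H) * P(~H)]"""
    numerator = p_evidence_given_h * prior
    denominator = numerator + p_evidence_given_not_h * (1 - prior)
    return numerator / denominator

# Illustrative numbers: a 1% prior, evidence that shows up 90% of the time
# when the hypothesis is true and 5% of the time when it is false.
print(bayes_update(0.01, 0.90, 0.05))  # ~0.154: strong evidence, still a modest posterior
```

The formula fits on one line; as the conversation goes on to argue, the hard part is the practice of actually applying it.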

Theo: I remember asking GPT-4, “Can you explain Yudkowsky's rationalism to me?” And it was like, “Yeah, sure, here you go. Number one, Bayes' rule. Number two, decision theory.” And it just wasn't very good. It was kind of missing something.

Zvi: What's missing is that you can't get that on a deep level just by technically knowing the words, the core words and the core moves. You have to practice. You have to go through cycles in your brain. You need a lot of training data. You need a lot of examples. You need a lot of specifics. You need to think through problems on your own.

If you think about it, there's a lot of classes you take in college, where you could write down everything you are trying to take away from that class on one piece of paper. This was true for almost every class I took in economics, almost every class I took in mathematics, and so on. But if you just give someone that piece of paper, it will not help them very much. If they don't go through the time of exercises and lectures and discussions and problems and thought experiments and work, then they can't just pick it up like that. You can't just derive it. You can't just say, “Oh, obviously, that's just the fundamental equations of modern analysis. And these are just obviously true. Why do I need an entire class to learn this?” But you do in practice. Human beings need this. They can't just learn that way.

And so I don't think you can just compress this any more than you can learn a foreign language by reading a dictionary, unless you're one of a very small number of weird people.

Theo: I agree with that.

Critiques of Rationalism (20:08)

Theo: So you talked about what Scott Alexander said, “Bayes' rule, and the rest is commentary.” And of course, a lot of rationalists believe Bayes' rule to be the central mechanism of rationality and decision-making. But one kind of rationalist-adjacent person who doesn't think that way is David Deutsch. He has an article, which is very short, called “Simple Refutation of the Bayesian Philosophy of Science.” I have it pulled up, and I'll paraphrase it. He says, “by Bayesian philosophy of science, I mean the position that, one, the objective of science is or should be to increase our credence for true theories, and that, two, the credences held by a rational thinker obey the probability calculus.

However, if T is an explanatory theory, like the sun is powered by nuclear fusion, then its negation not T, the sun is not powered by nuclear fusion, is not an explanation at all. Therefore, suppose that one could quantify the property that science strives to maximize. If T had an amount Q of that, then not T would have none at all, not 1 minus Q, as the probability calculus would require if Q were a probability. Also, the conjunction of two mutually inconsistent explanatory theories, such as quantum theory and relativity, is provably false, and therefore has zero probability. Yet it embodies some understanding of the world and is definitely better than nothing.

Furthermore, if we expect that all our best theories of fundamental physics are going to be superseded eventually, and we therefore believe their negations, it is still those false theories, not their true negations, that constitute all our deepest knowledge of physics. What science really seeks to maximize, or rather create, is explanatory power.”

Zvi: So I notice I am basically confused by why he thinks this argument refutes Bayesian logic. This seems to be a case of, “I will kind of caricature your position as being something that is slightly different than your actual position in order to make it technically useless according to this definition that I proposed that you never actually believed or something”. I mean, I understand why he's thinking this on some level, but it seems kind of highly uncharitable and a very clear failure of the intellectual Turing test to model what the Bayesians actually believe.

So if a rationalist back in the day was named Isaac Newton and he was developing Newton's laws of motion, the Bayesian hypothesis would not be, I want to know the exact probability that Newton's laws are an exact description of the entire physics of the universe. It would be, I want to know whether or not Newton's laws are in practice an explanatory tool that I can use to predict what's going to happen much better than what I had before Newton's laws. A Bayesian wouldn't say Newton's laws are almost certainly false, therefore they are useless, and the important thing is to believe that Newton's laws are wrong with a probability of 1 minus epsilon. They would say, well, what is the probability that Newton's laws are a good description of non-relativistic physical action in practice in a lot of situations, and how useful are they, and how likely are they to be how useful in what situations?

And similarly with relativity, similarly with quantum mechanics, and similarly with various ways of combining them. And if I were to learn that the sun is not a nuclear furnace, that is not zero explanatory power, in the sense that that's an important fact about the universe that I would want to know. That has a lot of implications, and it teaches me that I should look for an explanation elsewhere. I'm not trying to maximize the amount of statements I can make right now that in some sense are positive rather than negative. I want to believe true things and not false things. I believe in not accepting false things and believing things with the right probabilities. This approach leads me to draw correct conclusions in practice about many things. However, this is a standard straw Vulcan-style critique of various forms of rationality, not only Bayes, which essentially says, you believe in logic, so obviously you believe only in this bare, abstract, formal, ad absurdum form of logic. But that obviously leaves out these important things, and that's bad. It's not how any reasonable person would strive to think. Almost all people who think about rationality and take it seriously understand this.

Theo: When he said the conjunction of two mutually inconsistent explanatory theories, such as quantum theory and relativity, is provably false and therefore has zero probability, what would a Bayesian say about that? Because clearly quantum theory and relativity are two of our best explanations about physics.

Zvi: Right, so what we're saying is that quantum theory can in its current form be literally true in all of its equations and proposals, while relativistic theories cannot simultaneously also be literally true in all of their details and specifics.

That is an important fact about the world. If you were a physicist who was trying to create a grand unified theory that explained both of these things, or was trying to work on either of these things on their own, you would very much want to realize that these two things are contradictory, and that one of them must in at least some way be wrong, while simultaneously holding in your head that both of these things are our best known methods for explaining phenomena of that type and approximating what we should expect to observe. But yes, either quantum theory is not the final theory, or relativity is not the final theory. It doesn't mean we have to throw them out as useless.

Theo: That makes sense. Another critique of rationalism that I've heard is an article from Applied Divinity Studies called Where Are All the Successful Rationalists? Where he talks about, it seems like the most successful people in society, like billionaires, Nobel Prize winners, whoever, either aren't aware of rationalism, or maybe they're aware of it, but they don't endorse it, like people like Tyler Cowen. So how would you explain this?

Zvi: Interestingly, you've grouped Tyler Cowen, who I very much respect and who is a great intellectual I get a lot of value out of, in with the billionaires and the most successful people on the planet. Like, if Tyler Cowen counts as one of those people, then I think we have some pretty big successes, too. The most successful people in, like, I don't know, this sphere of the world that we see on Twitter.

I would say, you know, I would just dispute the premise. I would start there. Like, yes, we haven't had that many billionaires, but being a billionaire is a very low probability outcome, even of very good strategies, requires a lot of luck, and requires time. And we haven't been around that long, and there aren't very many of us. And we don't have zero billionaires, right?

Theo: Dustin Moskovitz?

Zvi: Sam Bankman-Fried. Look, you can say whatever you want about how it ended and how many mistakes he made, but if you're just counting billionaires, Sam Bankman-Fried counts. A lot of other billionaires are also no-good scum of the earth who, like, lied and stole people's money to get there. You know, people say behind every great fortune is a great crime, and it's not actually true, but it is behind a lot of them, right?

We have other billionaires. Jaan Tallinn, I think, certainly counts as someone who, you know, acquired his fortune the right way, has a lot of money, is trying to use it to do good, but, like, has very much crossed over into the billionaire horizon. Do we really need that many more examples before we are over-performing rather than under-performing here? The rationalist groups that I came up in, basically everybody is now at least a millionaire. Everybody is doing well and successful. We're raising families. We're having good general lives. We're pretty happy. We're pretty intellectually stimulated.

Also, why don't we look at the intellectual, you know, fruits of rationality? Like, look at the AI space, right? Like, Sam Altman, Demis Hassabis, and Dario Amodei, the three founders of the three major labs, all got their theories and concepts directly from Yudkowsky and rationality. All three of them think, at least in part, in these ways about these problems. That is a huge intellectual legacy and successful influence on the world, whether or not it was what we wanted or had in mind. For a movement of this size over this length of time, we are going to transform the world whether we like it or intend it or not, right? And whether it’s for good or bad.

So it’s really strange to go into this world and say, like, why are you people all losing? When I, for one, don't feel like I'm losing, I feel like I'm winning. And yeah, my startups didn't make me a billionaire, but, you know, most startups don't do that. I would also say that, you know, the decision, again, to study rationality often starts with a world in which you feel like you need rationality in order to navigate the world and solve the problems you need to solve.

And that reflects some combination of: you're trying to solve very, very difficult problems intellectually, like Yudkowsky in AI, which don't necessarily lend themselves to becoming a billionaire. And/or you are not going to have the traditional things, like just general charisma and intuitive ways of integrating with normal social relationships and so on, which then puts you behind in the traditional race to get ahead in our society. And therefore, like, it takes a while to catch up. You're playing from behind. And there are a lot of barriers to becoming a billionaire that involve going through various social dynamics, whereas the social dynamics of trying to build a couple million and live a good life are much lower level. If you look at our influence on the world at large, I would say it's tremendously outsized, and a tremendous number of successful people would endorse many of our ideas.

Theo: Take someone like Elon Musk for example. He's definitely not the most charismatic person, but he is one of the most hardworking, successful people in the world. He endorses some ideas from rationalism. For example, he's definitely worried about AI risk, he's a long-termist, somewhat aligned with effective altruism, and he's talked about first principles thinking. However, I don't think he would describe himself as a rationalist in the same way that others would.

Zvi: No, nor is he, and I think you’d be correct not to describe him that way. There's a long, unfortunate history where various EAs and some rationalists attempted to sell him pretty hard on various things and alienated him. I also think that he is unfortunately thinking very badly about some very, very core things about AI and elsewhere. This is causing him to make some very poor decisions. He also has poor impulse control and he's kind of an internet troll.

His success inherently comes from the fact that he works very hard and he has a number of very strong skills. To the extent that he used real rationality as opposed to our formal specific characterizations of rationality—

Theo: Like instrumental rationality versus epistemic rationality?

Zvi: Yeah, Elon Musk definitely asks, "How do I build this rocket? How do I build this car? What makes this engineering problem actually get solved? What makes this person actually do work? How do I actually get this contract? How do I actually make this thing happen?" He updates that information and he tries stuff and he iterates and all this stuff is things we would endorse.

But just because rationality is a good idea doesn't mean that there aren't other valuable skills in the world that are gateways to being the type of entrepreneur and runaway success that is Elon Musk.

Theo: Is it possible to be a runaway success like Elon Musk without being some form of a rationalist?

Zvi: I think you definitely can. I think specifically you can't be Elon Musk if you aren't able to think about how to build physical objects, if you aren't able to think about the consequences of actions. There are a lot of people in our society, people sometimes call them wordcels, right, who aren't capable of doing the proper amount of shape rotation to be an Elon Musk. And nothing else will let them be that, but there are other companies they can create and other platforms they can have and other ways they can pursue wealth and influence and power. And they can become rationalists in their own way.

When you see people who hit it big, who become billionaires, who become outside successes and you actually talk to them, you will usually see somebody who is thinking very, very coldly and carefully behind the scenes about what actually works and doesn't work. And then combining that with some extraordinary skills.

Theo: One billionaire who's very much aligned with that idea is Charlie Munger, who I’m a huge fan of. Warren Buffett's business partner, vice chairman of Berkshire Hathaway. He's also been involved with other companies, Costco, Himalaya Capital, a hedge fund in China. He has his own approach to rationality, which he's developed called the multiple mental models approach, where he tries to think of the world by reading as much as he possibly can about every subject and forming mental models like say, inertia from physics, like evolution from biology that help him with both investing and life. He's had some sayings that are pretty similar to rationality, like, probably my favorite of his, “the fundamental algorithm of life, repeat what works.” Would you characterize him as a rationalist? He probably isn’t aware of the existence of the rationalists. He’s 99.

Zvi: I would be surprised if he wasn't aware in some vague sense of the existence of the rationalists. I'd be more surprised if he had seriously investigated LessWrong or otherwise like engaged with the Sequences or other rationalist writings on a deep level. But I do think we reached the point where I'd expect a person like Munger to be aware of our existence. I would characterize him as a different type of rationalist in a different tradition. Just like martial arts has different dojos and fighting schools, we do karate and he does jujitsu. He found a different path by which to take in a lot of data and systematically try and abstract it and figure out what would and wouldn't work. He's had a very long time to compound interest. The secret of the Munger-Buffett school is to find solid, not so spectacular opportunities continuously, take advantage of them, and let these winnings compound over time.
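As a rough numeric illustration of that compounding point (the return rates below are illustrative assumptions, not figures from the conversation), a solid but unspectacular return held for a very long time dwarfs almost any short streak:

```python
def compound(annual_return, years, principal=1.0):
    """Value of `principal` growing at a fixed annual_return for `years` years."""
    return principal * (1 + annual_return) ** years

# A steady 15% a year for 50 years turns $1 into roughly $1,084;
# 20% a year for 50 years turns it into roughly $9,100.
print(f"{compound(0.15, 50):,.0f}x")
print(f"{compound(0.20, 50):,.0f}x")
```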

Theo: Do you think there's some kind of epistemic convergence between different schools of rationalism that will allow them to land on the same stuff?

Zvi: Yeah, it's like solving a math problem. If you have different ways of approaching a problem, as long as you all take true and valid steps, you'll all reach the same answer in the end. If you're doing reasonable approximations, you'll all get approximately the same answer in the end.

Theo: I find the characterization of rationality as solving a math problem as kind of interesting, because in a lot of ways, it's not quite like a math problem, or at least the type of math that most people are used to, where you have some kind of formally specified problem statement and then some answer that is always valid.

Zvi: It's not like formal mathematics in the sense of trying to prove this conjecture, or trying to pass this test. I meant that as a metaphor. What I meant was the world is a physical object built on mathematics. Physics is made of math, and fundamentally speaking, every system has some set of rules of operation. As you enter it and you try to figure it out, if you're given a complex system, you can take any number of different approaches, grasp different parts of the elephant, reason in different ways, run different experiments, try to find out different things. But any valid process of figuring it out will eventually converge on the right answer. Different people who take different styles of approaches should increasingly converge on the right answer. And if they don't, then they're not using very good methods.

Theo: Well, what do you mean by converge on the right answer? Because in some questions of rationality, like, say, the stock market, you're dealing with a complex adaptive system and there could be multiple right answers. Buffett and Munger got extremely wealthy by finding solid but not spectacular opportunities and then compounding them over a long time horizon. Whereas someone like Jim Simons got wealthy by being a quant, by finding lots and lots of extremely tiny opportunities and exploiting them over a sufficiently long time horizon to make himself a billionaire too.

Zvi: Sure. And those are different. Like, those are entirely different problems from my perspective. You have this thing, the stock market, and you can choose to focus on it on any timeline, on any subset, from any angle. And you get to miss 100 percent of the trades you don't make, right? And have zero effect. So you only have to find some opportunities, not all the opportunities. If you are trying to market make in some abstract, long-term fashion on every stock in the stock market, on every time frame, on every magnitude, then you have to solve every single problem. And then you have these science fiction stories where you have an AI that can predict the exact last 4 digits of where the Dow is going to close on Tuesday. But no human ever tries to do anything like that, right? There's no point.

But all you're trying to do is you're trying to investigate where there's the most value in investigating. And so, Munger focuses on a type of problem that he knows how to make progress on, and knows how to solve, and that he wants to specialize in. And then a high-frequency trader will focus on a different set of things. And then someone at Jane Street, where I worked, will focus on a third set of things. And we can all make money, right, in our different ways without fighting with each other. In fact, we can even help each other execute various different strategies.

Theo: Yeah, I would agree with that.

The Pill Poll (39:26)

Theo: So a quick test of your rationality. This is a viral poll question, for those who don't know, on Twitter yesterday. I'll read it. It says, poll question from my 12-year-old. Everyone responding to this poll chooses between a blue pill or a red pill. If more than 50% of people choose blue pill, everyone lives. If not, red pills live and blue pills die. Which do you choose?

Zvi: So I was fortunate enough to see the original poll before I saw all the reactions to the original poll, so my answer wasn't corrupted, and I chose blue.

Theo: So did I. And so did 64.9% of voters. Only 35.1% chose red, which means everyone lives.

Zvi: Yes, so everybody wins.

Theo: So Roko, a guy on our part of Twitter, was kind of upset about that. He tweeted, I don't understand why anyone would vote blue in this poll. It just doesn't make sense with the decision theory. Like, why would anyone do this? Because, you know, if you vote red, no matter what, you live, right? But if you vote blue and fewer than 50% of people choose blue, then you die. So he, like, rephrased the problem as, like, there is a room-sized blender that kills everyone who steps into it. But if 50% or more of the people answering this poll step into the blender, there will be too much resistance and it will fail to start. And everyone who steps in will be fine. Obviously, if you don't step into the blender, nothing bad can happen. Do you step into the blender? And in that poll, which might be biased since it's his followers, 22.5% of people chose “Yes, step into blender”, and 77.5% chose “No!”.

Zvi: Right, I would have chosen no, because given that framing, it's obvious that you will get less than 50% of people stepping in, right? And so you'd rather not die.

Theo: So can you explain why you chose blue on the original poll, then?

Zvi: So in the original poll, the way it was phrased, right, I expected a substantial number of people to vote blue, and I expected to vote blue. And I expected, therefore, a lot of other people to realize that a lot of other people would vote blue. And therefore, I expected at least a very large percentage of people to vote blue, probably a majority, and if not, close to a majority. And also, I was aware of the fact that this is Twitter, and therefore there was a large Democratic slash blue tribe majority on Twitter, no matter which subset is being asked the questions. And therefore, a substantial number of people would just choose the blue option over the red option, no matter what the options actually were when described, if it wasn't obviously painful.

Theo: Really?

Zvi: As a matter of fact. So there's going to be some blue people to begin with. And given all these cycles, blue is the obviously pro-social, obviously, like, help us win choice that a lot of people would choose. And therefore, it made sense to choose blue if you aren't being a selfish asshole.

Theo: Yes, but if you did want to be a selfish asshole, nothing bad would happen to you. You weren't forcing anybody. It's a tricky situation when someone chooses to do something bad. If someone were to choose blue, and a minority of people chose blue, that would be entirely of their own volition.

Zvi: It would be entirely of their own volition. But also keep in mind that humans develop these intuitions around morality and collective action, partly to allow them to engage in collective action, and also partly because they know that their decisions are not likely to remain private. We're discussing it on this podcast. What did we choose? I did not anticipate when I made the choice this was going to be blown up and become a thing. But in the back of your head, you always know there's that chance, right?

Roko was being very vocal about being red. Other people were being very vocal about being blue. And perhaps we'll remember that. And that's an important fact about people, right? When trying to predict their actions in the future, when trying to model social relations between them in the future. But yeah, obviously, it depends a lot on how it's phrased, right? Like, if you were presented with just the red pill, would you suddenly create the blue pill in order to then, like, try and get more than half the people to take it so that nothing would happen? Obviously not, right?

Like, in most ways, we're much better off if everybody just automatically takes the red pill or doesn't take any pill at all. But given the presentation, right, it seems pretty obvious to me that blue was greatly favored here, and therefore that, like, it would be higher EV to take the blue option. Because, like, another way of looking at it is, if you pick the blue option, when it's exactly tied or almost tied, then you save half the people in the poll. Whereas if you pick red and blue is below 50%, you save one person.

So if you don't value your own life, like, greatly more than other people's in this thought experiment, and it is a thought experiment, right, so you don't necessarily have to be such a selfish asshole. And so, you know, if you make various assumptions about the distribution of people choosing red and blue as possibilities, it's very hard to not notice that, like, the tiebreaker probably saves more lives than you save by saving yourself by going red.
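A minimal Monte Carlo sketch of that expected-value comparison, in Python. Every number here is an assumption for illustration (the size of the electorate and the distribution of how many others vote blue), not anything taken from the actual poll:

```python
import random

def pill_poll_ev(n_other=999, mean_blue=0.60, sd_blue=0.15, trials=100_000):
    """Monte Carlo sketch: compare the harm of voting red against the risk of voting blue.

    Assumptions (illustrative only): n_other other voters, each voting blue
    with a probability drawn per trial from a normal distribution, to model
    uncertainty about the final blue share. Rule: blue voters die unless
    blue gets strictly more than 50% of all votes.
    """
    total = n_other + 1                 # the other voters plus you
    deaths_caused_by_red = 0            # blue voters who die only because you voted red
    own_deaths_if_blue = 0              # trials where voting blue gets you killed
    for _ in range(trials):
        p = min(max(random.gauss(mean_blue, sd_blue), 0.0), 1.0)
        blue_others = sum(random.random() < p for _ in range(n_other))
        blue_wins_with_you = (blue_others + 1) > total / 2
        blue_wins_without_you = blue_others > total / 2
        if blue_wins_with_you and not blue_wins_without_you:
            deaths_caused_by_red += blue_others   # you would have been the tiebreaker
        if not blue_wins_with_you:
            own_deaths_if_blue += 1
    print(f"Expected blue voters killed per poll by voting red: {deaths_caused_by_red / trials:.3f}")
    print(f"Probability that voting blue gets you killed:       {own_deaths_if_blue / trials:.4f}")

pill_poll_ev()
```

Under assumptions like these, the rare case where your blue vote breaks the tie saves far more lives in expectation than voting red saves you, which is the tiebreaker argument being made above.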

Theo: Yeah, I think it's interesting that you said that this is Twitter and a substantial portion of people would choose blue just because, like, they're politically liberal, which, I mean, I wonder what percent of people chose blue just because, you know, blue team good, red team bad.

Zvi: Right, and then the question, of course, is, even if it's only 5%, that's 5% of people who are now in the blender, right? And then you have to save them.

Theo: Yeah, but there wasn't a 5-point difference. There was a 30-point difference. So I guess people really want to be pro-social or people really-

Zvi: Well, hang on. So, like, you know, you can't compare the original presentation to a presentation where it's followers of Roko, right, which is a different sample.

Theo: I was talking about the original.

Zvi: Okay, right, I'm saying, like, it's 65-35, right, like, roughly, currently.

Theo: The original?

Zvi: The original.

Theo: The original was, yeah, 65-35.

Zvi: Right, and so when you say it's, like, a 30-point difference, that's in the final result, but, like, we don't know what the result would have been if the pills had been pink and orange or whatever, right? Like, if they had just been white and black or, you know, some other set of colors or thoughts or, you know, one has spirals, one has, you know, crosshairs or, you know-

Theo: Or just option one, option two.

Zvi: Option one, option two. I think blue-red is more interesting, obviously, the matrix metaphors and all that. But, yeah, it just seems clear to me, just reading it, that blue was likely to win, slash, like, enough people were going to take blue, you're supposed to take blue here. And, yeah, I think there's someone who had the traditional curve, right, the idiot in the middle and the wise men on the left and the, you know, the stupid people-

Theo: The midwit, yeah.

Zvi: The idiots are on the left and the wise men on the right. And they had, like, red kicking up in this small portion, like, just to the right of 50%. I think that's about right.

Theo: Yeah, that's about right.

Zvi: Obviously, like, you can do anything you want, like, I'm not saying any given person lies in any given place. But, yeah, like, you need to be smart enough to realize that, like, red is typically better for you and that, like, there's this equilibrium where everyone just chooses red, but then you think that carries the day, right, in some sense.

Theo: Yeah, it's pretty cool how, like, various framings of the question not only change how people in general would vote, but change how you should vote, because the results of whether you live or die depend on how the people in the poll voted.

Zvi: Yeah, I presume that, like, if you iterated these actual questions in practice with some penalty that was short of death, so that, like, people who accidentally chose blue the wrong time didn't die and just get out of the sample pool, that you would pretty quickly converge on either everyone always chooses blue, everyone always chooses red, or a clear majority will always choose blue, and a small number of assholes will choose red anyway, but they'll never win.

Theo: Yeah, that's true. But even if they don't win, they still live.

Zvi: They still live, but I presume people would not take kindly to finding out who they were.

Theo: Yeah, most likely.

Balsa Research (47:58)

Theo: So, about a year ago, you posted a post on LessWrong called Announcing Balsa Research, where you talked about what are the most important policy changes America should make, and how can we make them happen? And you gave, as examples, not restricting supply and not subsidizing demand in areas such as housing and medicine, which is, I think, maybe, like, the single most common rationalist political position. So, how is that going? What priorities have you placed at the top so far?

Zvi: We have a website, one employee, and funding for this year to try some stuff, but not enough funding to scale or be outsized. So, we're trying to be very careful and precise and lay foundations. We've chosen four things to focus on. Our number one priority right now is the Jones Act, specifically.

For those who don't know, the Jones Act states that if you want to ship goods between one American port and another American port, the ship in question must be American flagged, American manned, American owned, and American built. In particular, American built is just death to the cost curve. Our shipyards are basically non-productive, non-competitive. We produce almost no ships. The ships that are used in practice are very old, and this is functionally a ban on shipping things commercially between American ports, including Puerto Rico, Alaska, and Hawaii.

Theo: I remember thinking when I first read about the Jones Act, I watched a YouTube video on it, and I was like, really? This sounds like the kind of law that would have gotten repealed in five years after people realized it was counterproductive. I remember I used to be really into cruises as a kid. I would get all the cruise ship catalog books and read all the details of the routes, and I wondered, why aren't there any routes from Miami to New York? That would be fun.

Zvi: And you know the answer now, it’s the Maritime Passengers Act, which is part of the trio of the Jones Act, the Dredge Act, and the Maritime Passengers Act. The Dredge Act says you can't have dredgers that don't meet the same restrictions. The Maritime Passengers Act says you can't move people either. And so literally, we haven't built a cruise ship in America since 1947. There are no cruise ships that can meet the Maritime Passengers Act at all. And so every cruise has to stop at a foreign port in between American ports. So you can't do the only cruise I actually am interested in, right? Which is, like, Boston, New York, down the coast, down, you know, to Miami, or maybe up to Texas, and then back again.

Theo: By the way, what does the name Balsa come from?

Zvi: Balsa is a type of wood that is light and bends easily. So that, you know, if you're convinced of different things, you act differently. Basically, we went for a name. All names always suck. But you basically want a name that's nice and pleasant and short and easy to spell and isn't SEO'd to hell and just lets people have a reference to point to a thing without tying you to an entity. And so Balsa seemed like it fit the bill. And we didn't want to spend too much time on the name. So here we are.

Theo: I wonder if you could come up with a retrofitted acronym for Balsa.

Zvi: We thought about it. I didn't do it. That's the kind of question you would think GPT-4 would be excellent at. You can describe what we're doing and then say, your name is Balsa, we want this to be an acronym, give me 50 options for the acronym, and then mix and match all the things and see what it comes up with.

Theo: And it didn't do anything good?

Zvi: We didn't try it. But, you know, I don't know. I don't really need to justify the name that I chose.

Theo: A lot of rationalists (you know, rationality is applied winning, as Yudkowsky says) and just people in Silicon Valley in general are kind of heavily tech focused, very solution oriented, and very effective at getting things done in the real world, with one exception, which is politics. If there's one thing that rationalist Silicon Valley people can almost all agree upon, it's that the politics of Silicon Valley and San Francisco are terrible. They're just run by incompetent and corrupt people. And this needs to stop. Personally, I haven't been to San Francisco in around five years. I'm going this weekend for the first time in a while. So I'm excited to see. I live in Florida, which is a little better managed, I think. So does Balsa have anything to say about that in particular?

Zvi: I mean, about San Francisco? No. Yeah, I mean, I think that, you know, San Francisco's biggest problems include one of our major causes, which is housing, I would say. San Francisco just doesn't allow you to build any housing. And so San Francisco is tremendously expensive to live in, to have a house in or an apartment and just doesn't build anything. And it's tremendously non-dense. This problem goes from San Francisco proper which is horribly sparse up through the East Bay down to the West Bay, down to Palo Alto and the rest of Silicon Valley proper. And all of it is extremely restrictive. And this tremendously lowers quality of life by raising housing costs and making the whole thing prohibitively expensive but the tech people feel like they need the network effects and they're stuck in San Francisco.

So despite various attempts to relocate to Austin or Miami, and the substantial presence in places like New York and Boston potentially, or London, mostly they keep saying, no, if you wanna be a real Silicon Valley person you have to go to Silicon Valley. You have to live in San Francisco or we're not taking you that seriously. You don't really wanna build, you're not exciting. Look at all the things that are happening here. And they just tolerate the fact that, like, housing will eat up a giant portion of their money, and in exchange, they will not get a very high quality of life. They will have a tremendously high crime rate, especially in central San Francisco. They'll have streets that aren't cleaned up. They'll have lots of open drug use and homelessness. And these are not problems that were caused by tech. They are problems that tech has failed to fix, but as a percentage of the population, tech is very small. So you have two choices, right? Exit and voice, in some sense. And they've decided that they're trapped and they can't use exit. And voice is hard when you're greatly outnumbered by people who don't care about the things you care about.

Theo: Is homelessness in the sense of vagrants on the street doing drugs and committing crimes primarily caused by lack of affordable housing or is it primarily caused by other societal factors that might encourage people to start taking drugs?

Zvi: I think it's a perfect storm type of situation. Like, as they would say on Odd Lots, the term is overused, but there are a lot of reasons why San Francisco has this especially badly. Basically, San Francisco won't let you build a house but will let you pitch a tent. They will be very tolerant of homelessness but very intolerant of attempts to build houses. And then you get a lot of people who live there and don't have houses. The climate is very good for trying to live without a formal structure.

Theo: I wish it was like that here.

Zvi: The lack of police enforcement of various things, not just the homelessness itself, lends itself to this thing. But yeah, I think that the lack of physical homes is a huge contributor to this. I think certainly our society not handling drugs well is also a contributing factor. Our society not doing a good job for the less fortunate in other ways is a contributing factor. But if you just built lots and lots of housing in San Francisco, I think the problem would dramatically improve. It would also give you the opportunity to offer these people housing in a legitimate, real way that was going to help them, as opposed to my understanding of the housing policies in San Francisco for the homeless, where they end up putting them all together in really terrible accommodations with essentially conditions designed to cause them to relapse and fall back into bad behavior patterns.

Theo: I just visited New York about a month and a half ago and I was actually struck by how few homeless people there were out there.

Zvi: We've been the target of a deliberate campaign to overwhelm us with migrants in order to make New York suffer, because people want to illustrate what's going on at the border, or just in general punish blue state people.

Theo: Are you in New York?

Zvi: I'm in New York. And there still isn't much sign of anything happening. A lot of our hotels are being repurposed to house a bunch of migrants at tremendous actual expense. It makes no sense to be doing this as opposed to finding somewhere less expensive to keep these people. But we're doing it, and they're not spilling out into the streets. It can be done.

Theo: So you mentioned there are four policies for Balsa, four priorities. You talked about the Jones Act and you talked about housing but what are the other two?

Zvi: The third is NEPA, the National Environmental Policy Act and related other similar constraints on building projects in this country. Environment matters. Environment's important. So what you want to do whenever you have a project is you want to see if it would harm the environment or otherwise damage existing interests. You want to weigh the benefits against the costs. And then if the costs exceed the benefits, you don't do it. And if the benefits exceed the costs, you do it and you compensate the losers as needed.

Instead, we have a system where we don't consider costs and benefits, but we do require a metric ton of paperwork. And so you have to file all the proper paperwork, and then someone challenges you in court, whether or not they have any local interest in the case, and says, in this particular place you didn't file the proper paperwork. And then you spend years arguing over whether you issued the proper paperwork until, ideally from the perspective of the people bringing the lawsuit, you give up and you stop doing the thing.

This is a huge barrier not only to ordinary economically viable projects but also to a wide variety of clean energy projects. If we are unable, as we currently are, to build transmission lines and wind farms and solar plants, and any number of other things, then we're not going to be able to get our climate house in order. We're not going to be able to get our energy costs in order. And we're not going to be able to build a wide variety of other things that are extremely important to us. And we won't get anything in return. We're not stopping these things for good reasons. We're stopping these things because we've let anybody who wants to essentially pose arbitrarily large barriers in the way of doing anything.

And Balsa's position is that the attempts to reform this are misguided, because they focus on carving out exceptions and fast tracks and tweaking rules to try and let people get through the paperwork process, when what they should be doing is re-imagining our environmental policy system completely differently, as a cost-benefit system where you commission studies and reports on the costs and benefits of your proposal. And then you have an evaluation where local stakeholders get together and a government panel rules on whether or not the costs exceed the benefits and what compensation you have to give in order to let the project move forward, and makes a go or no-go decision on the project in a reasonable length of time at a reasonable cost.
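A minimal sketch of the kind of cost-benefit screen described here, with invented numbers for a hypothetical transmission-line project. Nothing below comes from Balsa; it just illustrates the decision rule of approving when benefits exceed costs and compensating the losers out of the surplus.

```python
# Toy cost-benefit screen for a hypothetical infrastructure project.
# All category names and numbers are invented for illustration.

def evaluate_project(benefits: dict, costs: dict) -> dict:
    """Approve if total benefits exceed total costs; losers get compensated out of the surplus."""
    total_benefits = sum(benefits.values())
    total_costs = sum(costs.values())
    surplus = total_benefits - total_costs
    return {
        "approve": surplus > 0,
        "surplus": surplus,
        # Compensation owed to the parties who bear the costs, paid from the surplus.
        "compensation": costs if surplus > 0 else {},
    }

# Hypothetical transmission line (values in $ millions).
benefits = {"cheaper_power": 400, "emissions_avoided": 250}
costs = {"habitat_disruption": 80, "local_property_impact": 40, "construction": 300}

print(evaluate_project(benefits, costs))
# {'approve': True, 'surplus': 230, ...} -> build it, and pay the affected parties.
```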

And then the fourth policy, of course, is AI because it's part of something even more important.

Theo: It's probably the most underrated problem in the world right now, I'd say. So what specific strategies have you been pursuing to push this policy agenda, and how have they been working so far?

Zvi: So it's still early days, as I said. One of the things you learn when you found a nonprofit is that the paperwork problem is not just a NEPA thing. The paperwork problem is a deep and wide problem. One of the places it hits is charities. Even though I've had my employee working with me trying to get things done for several months, I would say still the bulk of our time, money, and trouble has effectively gone not towards accomplishing the mission, but towards all the required logistics, paperwork, and regulations necessary to make sure that we are in good standing and we are legally allowed to raise money and do things, and generally just have our house in order in a way that won't get me sued down the line, won't get the IRS on our backs, won't have various states complaining about the fact that we sent out a mailing letter or took a donation.

Beyond that, our current work: we're working on the Jones Act for now as our first focus, because again, there's only me and one employee. We're compiling a full literature review of all of the stuff that's been written pro and anti Jones Act, compiling all of the statistics. We're going to put together a full-on Jones Act megapost, similar to the one we did for the Dredge Act but better researched. And we're going to look into then commissioning additional studies and additional evidence, so that someone who wants to can take the fight to a congressional staffer or into the room where it happens, and cite concrete, well-backed, well-credentialed evidence that says exactly how destructive this is, how much opportunity there is in its repeal, and how much the people opposing it should not be opposing it, because their interests are in fact aligned with repeal when done in the right way. In particular, speaking of the unions: the primary opposition to Jones Act repeal comes from the unions.

Theo: Yeah.

Zvi: They do so, according to their statements, because they want to protect union jobs. The problem with this is that in fact, protecting the Jones Act does not protect union jobs, it destroys and prevents union jobs. Not just jobs in general, which it also prevents and destroys. But one argument is simply: suppose you repeal the Jones Act tomorrow, who is going to be on these ships that carry cargo from one American port to another American port? They're going to be union workers, American union workers, whether or not there's a legal requirement for that. How do I know that? Because we know what happens when a non-union worker attempts to load and unload cargo in an American port, and American ports are controlled by the dockworkers' union.

Theo: What happens?

Zvi: Nothing! The goods stay where they are. Nothing is loaded. Nothing is unloaded because they have control over the ports, effective physical control over what gets loaded and unloaded. And they will make your life pretty miserable until you realize you're supposed to be using union labor. And then you use union labor. That's the reality of the situation.

Theo: I think I read a statistic at some point that was like the average unionized dock worker in the port of Los Angeles makes upwards of $300,000 a year. Really? How are unions allowed to get to such a point in a nominally free market economy like America?

Zvi: My understanding is that they are able to do that because there is insanely more value than that to the United States of America in the port operating smoothly and properly. And we have enshrined unions with the right to negotiate for that surplus. And so they get to capture a large percentage of that surplus. And nobody in California is going to look at the option of threatening to replace the union. Also, the transition would be tremendously expensive. And so these people get paid a lot of money. I don't really object to these people making high salaries. It doesn't bother me. What bothers me is when they do things like not let us ship things between ports.

Theo: In the long run, how do you convince the people of, say, California, whose current political religion, if you can call it that, is progressivism, to change their political religion from progressivism to rationalism and rationalist priorities?

Zvi: I mean, mostly you don't, right? That is well, well beyond the scope of anything that I would dare say. I mean, what you would do is you would try to just convince people in general to think better about the world, raise the sanity waterline. And then eventually they would adapt their progressive ideals and desires to what would actually achieve what they want to achieve, because they would focus more on the questions of what effects different proposals would have and less on what messages those proposals would send and what they would look like, right? And then we would get a better compromise.

But what you can do is you can get them to focus in certain areas on things that actually work. So like housing, right? The YIMBY movement, one of our causes, in California is doing really well. Everyone is very progressive, but people from across the aisle are getting together to say, no, you have to build housing because it's actually better for the people. It is a progressive cause to ensure that more housing is built. And this is overcoming people's instinct to say things like reeee developer, right? And so we are slowly getting more and more mandates from the state that localities have to build more and more housing.

Theo: Great!

p(doom | AGI) (1:05:47)

Theo: Okay, so I've been dodging around the topic for a while, but I guess now it's time to get into the elephant in the room, which is AI, AI risk, AI opportunities. So first question, as a Bayesian rationalist, what would you put your personal probability of AI doom as?

Zvi: So if I have to tell people a number, I am saying 0.6, 60%. However, I actually really admire Jan Leike’s answer of 10 to 90%, right? Because it's a very, very complicated calculation. You could spend every minute of every day trying to get it correct, and you would still not be considering all of the factors that you should be considering. You have to ask yourself in these situations, what is the value of information? What is the value of precision? What different decisions do I make based on the answer to this question?

To me, I see there as being broadly three categories of p(doom) perspective. There's the "I'm in the single digits, probably the low single digits. I think this is highly unlikely. Therefore, I think that moving forward as fast as possible is a risk worth taking. It doesn't mean we shouldn't mitigate this risk." That is not my position, but there are some people who legitimately represent that they think the risk is this low. And then they say, well, the cost-benefit says you should move ahead. That's a very important distinction.

Then there are other people who say, "Well, the risk is very, very high. It's above 90% or even above 99%. And therefore we need to stop this from happening at all, whatever the consequences, whatever babies we may throw out with the bathwater by stopping this progress, that is just an unfortunate reality." And we have to potentially play from behind and make high risk moves in some sense, because we realize what the alternative is and that our ordinary moves won't work.

Whereas if you're somewhere in the middle, do you act differently when it's 30% versus when it's 70%? And the answer is mostly no. The most important thing in the world is still preventing this from happening, lowering the probability of it happening. The same actions still make sense.

Theo: Well, I'd say a 10% chance of doom is rather high. Even the discourse in rationalist spaces is such that a 10% p(doom) is considered low, but would you get in a plane with a 10% chance of crashing? Obviously overly simplistic.

Zvi: If it took me to a literal utopia and transformed the world into a utopia? Then no, but I would think about it. This is exactly why I'm saying we spend too much time debating the question. If it's 10%, that's high enough that we should all be able to agree on the need for Herculean efforts to prevent it from happening. And if it's 70 or 80% instead, there really aren't that many more things that we should do. You've increased the risk by a factor of like 10, but that doesn't mean you should spend seven or eight times as much, because the calculation on a given spend is almost always going to be overwhelmingly worthwhile or not worthwhile.
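A toy expected-value calculation makes the point concrete. The value of survival, the cost of the mitigation, and the risk reduction below are all invented numbers; the only takeaway is that the verdict on a given spend barely changes across the 10% to 70% range.

```python
# Toy expected-value check: is a given mitigation spend worth it at different p(doom) values?
# All numbers are invented for illustration.

VALUE_OF_SURVIVAL = 1e6         # arbitrary "value units"; the point is that it dwarfs the cost
MITIGATION_COST = 1.0           # same units
RELATIVE_RISK_REDUCTION = 0.01  # the spend shaves 1% off whatever the risk is

for p_doom in (0.10, 0.30, 0.70):
    expected_benefit = p_doom * RELATIVE_RISK_REDUCTION * VALUE_OF_SURVIVAL
    print(f"p(doom)={p_doom:.0%}: benefit {expected_benefit:,.0f} vs cost {MITIGATION_COST}")

# p(doom)=10%: benefit 1,000 vs cost 1.0
# p(doom)=30%: benefit 3,000 vs cost 1.0
# p(doom)=70%: benefit 7,000 vs cost 1.0
# The spend is overwhelmingly worthwhile at every value in the range, which is the point:
# the decision rarely hinges on where inside 10-90% you land.
```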

Theo: You gave us an example of people with a low p(doom), single digits, but I think a lot of people would put theirs significantly lower than that if they had to pick a number, 0.1%, 0.01%.

Zvi: Well, there are people who say those. I think there are very few people who say those numbers out loud because the people who would give those numbers don't think in terms of numbers at all, basically. They're just like, "That's not gonna happen."

Theo: Or zero.

Zvi: I don't take the people who answer 0.1% or 0.01% as having done a reasonable or rational calculation at all. I take them as just saying as low a number as it would take for them to feel better about the situation. They are trying to head off an argument about expected value or probabilities, so they keep lowering their number until the point comes when they feel satisfied. But I don't see how you can in any reasonable way look at the situation and come up with a sub-1% number. It doesn't make any sense. I haven't heard any argument that, even if I were going to buy it, would lead to that conclusion.

Theo: I think a lot of people who have a sub-1% p(doom), which I don't, would answer that question by saying, "It's not so much that there are good arguments necessarily for why AI won't kill us. It's that we simply don't know. It's that the AI doom argument requires a lot of stacked assumptions, like instrumental convergence, and then superhuman levels of capability to the point where they would overpower human civilization, and alignment not getting solved by then." So they would say the combination of stacked assumptions and unknowns means that by default you can't come up with a high number.

Zvi: I would say they're just simply wrong. It does not require those assumptions. Even if you just talk about ordinary existential risk from things going ordinarily horribly wrong, we're talking about bringing things that are more intelligent, more capable, stronger optimizers than we are into existence where we are, on Earth, and then asking what happens. And if your response to that is 99 plus percent chance that we don't die, you're just being dense. You just have some sort of normality bias. You're just not wanting to see the fnords. You just don't want to notice.

Theo: Not wanting to see the what?

Zvi: Fnords. You don’t want to see things that are stressful to see. You want to find ways to ignore things that your brain doesn't want to think about.

Theo: What are fnords?

Zvi: It comes from a novel. A fnord is something that people instinctively don't want to look at and then send away from their minds. They're things that are hidden in plain sight that people would just prefer not to see. But I don't want to get distracted by that. The whole idea here is simply that these people are acting as if, "Well, you make the argument that if A, B, C, D, E, F, G happens, then we all die. But you just had a lot of letters. Therefore, the probability of us all dying is almost zero." But that's just flat out not true. You do not need any particular A, B, C, D, E, F, G for this to happen. All they're doing is saying, "I demand a specific, concrete example of exactly how this will happen." And then they're saying, "Well, there's a lot of steps in that example. Therefore, I'm not going to worry about this problem." And you can say that for literally anything that anyone might raise as a concern. You can say, "Tell me exactly how that happens. Well, your story had a lot of steps in it. So I'm going to ignore you." That's not how you do probability. That's not how you do forecasting. That's not how you do expected value. That's not how you do anything. It's not how the universe works.
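A small worked probability example, with invented numbers, of the difference between the two framings: treating doom as one long conjunction of steps versus treating it as any one of many partly independent paths going wrong.

```python
# Conjunctive vs. disjunctive framing, with invented probabilities.

# Framing 1: doom requires one specific 7-step story, each step 60% likely.
p_one_story = 1.0
for step_probability in [0.6] * 7:
    p_one_story *= step_probability
print(f"single 7-step story: {p_one_story:.1%}")   # ~2.8%

# Framing 2: there are 20 partly independent ways things could go badly,
# each only 5% likely on its own (treated as independent here for simplicity).
p_no_path_fires = (1 - 0.05) ** 20
print(f"at least one of 20 paths: {1 - p_no_path_fires:.1%}")  # ~64%
```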

Theo: On the topic of forecasting, there was a recent study that surveyed several hundred superforecasters. Their average probability of AI doom was, I believe, 1% to 3%, which was lower than that of the average expert, which is more like 5% to 10%. So how would you explain that?

Zvi: They didn't take the problem seriously. They were given bad incentives. They fell back on base rates, on the idea that claims roughly of this category generally don't tend to come true. And you know what? My reputation as a forecaster isn't going to suffer if I predict that we won't all die. It's literally impossible for my reputation as a forecaster to suffer if I predict that we won't all die.

Theo: Well, it's also impossible for your reputation to suffer if you predict that we will all die, because it's something that will always be in the future. And if it's not in the future, then you're dead.

Zvi: No, what happens is you predict that it's going to happen in the future. And then it doesn't happen. You look dumb. Or at least they're worried they might look dumb or whatever.

Theo: You're saying if you predict AI will kill us all by 2030 and then 2030 rolls around and we're still here, then you'd look dumb. But if you predict AI could kill us all in the indefinite future, then that's not really.

Zvi: You have to be broadly consistent, in a way that the 2100 probability implies something about the 2030 or 2040 probability as time goes by. I have been extremely frustrated by people bringing this study up as some sort of proof. We don't really know what they considered an expert or a superforecaster in this study, on either side of it. We don't know exactly what mechanism they went through. I saw disputed reports. But from everything I've seen, they didn't converge. They gave unreasonable answers. And I don't take their engagement seriously here. I just don't.

Theo: Scott Alexander wrote a blog post where he had previously written that his p(doom) was 33%. And then he wrote, "Wow, all these super forecasters had much lower p(doom) than I expected. So as a Bayesian, I should update in the direction of no doom." So he updated from 33% to 20% or 25%.

Zvi: I think you could make a case you should update the other way. I'm not saying I would do this. You can make a case that what this is showing is that people are not engaging with our arguments, not taking the problem seriously. That's bad news. You should update in favor of more doom.
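For reference, this is what that kind of update looks like mechanically in odds form. The likelihood ratios below are purely hypothetical; which direction the evidence actually points is exactly what is in dispute here.

```python
# Odds-form Bayesian update with a hypothetical likelihood ratio.

def update(prior_p: float, likelihood_ratio: float) -> float:
    """likelihood_ratio = P(evidence | doom) / P(evidence | no doom)."""
    prior_odds = prior_p / (1 - prior_p)
    posterior_odds = prior_odds * likelihood_ratio
    return posterior_odds / (1 + posterior_odds)

prior = 0.33  # Scott Alexander's stated prior

# Reading "superforecasters give low numbers" as evidence against doom (LR < 1):
print(f"{update(prior, 0.5):.0%}")  # ~20%, roughly the update Scott made

# Reading it, as Zvi suggests one could, as evidence that people aren't engaging (LR > 1):
print(f"{update(prior, 1.3):.0%}")  # ~39%, an update in the other direction
```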

Theo: Interesting. But what if these people are taking it seriously and don't agree with some of the assumptions? I would say someone like a Tyler Cowen type figure does understand the AI doom arguments. He has read them, clearly. But he doesn't take it as seriously as most rationalists. So why do you think that is? Just because he doesn't understand it?

Zvi: Well, I wrote an entire post about this called the Dial of Progress, right? Have you read that?

Theo: I read part of it.

Zvi: Right. So the basic idea is that Tyler Cowen believes that it is important that we continue to promote and allow technological progress. And right now, that means promoting and allowing advancing artificial intelligence. And if there is risk in that, that is risk that we just have to accept. And he believes in working to mitigate that risk. That is my model of Tyler. My model of Tyler is that he's choosing not to engage with the arguments on a factual or actual cause-and-effect level because he thinks it doesn't affect his decision. So he's not going to engage with them. And he's tried various tactics to explain why he shouldn't have to. And he's tried to amplify every voice he can find about why we should ignore such arguments and why we shouldn't engage properly with them. And he's ignored the arguments themselves. And that is a strategy.

Alignment (1:17:18)

Theo: I saw a tweet recently from Teortaxes about AI doom. He's against the idea of doom; he thinks it's rather unlikely. He said, "AI doom is predicated on assumptions that were maximally plausible around the period of AlphaGo and have been getting less and less plausible since then." So are you personally more worried or less worried than in 2017, when AlphaGo and AlphaZero came out and blew away the best Go player in the world?

Zvi: I would say less worried about some specific types of paths, more worried about others. Overall, I would say probably slightly less worried conditional on AGI happening, but more worried about AGI happening on a shorter time frame than I was back then, probably, would be the balance.

So I would say he's making a good and substantive point, which is that the expected nature of the artificial general intelligence we might see has changed. However, the people who say this is good news, I think, are confused about the nature of the situation. They think, well, we are going from this sort of coldly logical puzzle-solving, optimizing system like AlphaGo into this large language model, like primordial soup of vibing and language interpretation. There's this inscrutable giant matrix of model weights. And because this thing does a reasonable job of approximating various kind of human vibings and figuring out kind of what we want when we train it in these ways, that we should be optimistic that it'll do reasonable things or something like that.

Theo: I think Eliezer Yudkowsky is actually more worried now than he was then, based on my model of him.

Zvi: Oh yes.

Theo: He sees it as somehow easier to align an AlphaGo than to align a large language model, partially because, according to my model of him, alignment is seen as solving a formal math problem rather than aligning systems through things like laws, or aligning people and instilling them with the right values.

Zvi: Yeah, and I think he’s right in the sense that you can align an AlphaGo, for instance. You can make an AlphaGo adhere to a set of priorities and optimization targets. It's a hard problem, but it's a solvable one. It's a practical problem. If we solve it, we're going to get precisely what we aimed for. If we choose wisely, that can turn out well.

Theo: How do we know that this specific type of problem is solvable? How would you even specify the alignment problem in terms of RL systems like AlphaGo?

Zvi: You would specify it as being able to specify the end state of the world to which it is optimizing towards. You'd be able to determine how it would navigate through causal space in order to rearrange the atoms of the universe towards the desired outcome. You'd be able to specify that outcome. You wouldn't specify the literal configuration of exactly where all the atoms were, but you would want to specify things about the end state that you were trying to reach. If you specified it well, you could have something that adhered to the logical definitions that you had. If you chose a good logical definition, you would get a good outcome.

The problem with an LLM is that you can't logically specify what you want. You can only vibe and nudge and encourage and hope that good things happen. And in some sense, that makes the problem impossible, right? You have to solve it in a way that, like, Eliezer anticipates will just not work, will just not be sufficient to solve the problem because it's all approximate and, you know, imprecise at best and completely unpredictable.

Theo: But imprecise and unpredictable would be fine in the sense that people are imprecise and unpredictable, and yet we don't end the world with people.

Zvi: I mean, we kind of do, in the sense that we have kind of rearranged the atoms of the Earth in the ways that suit us, to the extent that we have that capability through our technology and our knowledge, right? And to the extent that we don't do that, it's because we have preferences not to do that. It's because we understand that we don't have a better arrangement that wouldn't cause us all to die, right? We are not preserving things out of, like, not bothering. We are doing it out of some combination of the goodness of our hearts and the wisdom of our heads.

Theo: Well not just that, but we have systems around us like laws, police, courts, things that coordinate for our benefit.

Zvi: Yes, our coordination for our benefit, which does not hold for that long or that precisely, and which I would not count on to hold up for very long if you look at the historical record, if that's the only thing standing between you and being killed. Of course it gets appropriated. It just doesn't work very well. If you brought a very large number of AIs into existence that were smarter, more capable, better optimizers, better competitors, more efficient than we are, and set them loose with us, you should not expect any of these dynamics to save you, even if the AIs were about as aligned as humans are, given the assumptions that you've been talking about. If they behave vaguely like humans, we are super dead.

Theo: Roon, who works at OpenAI, tweeted recently that “it's pretty obvious we live in an alignment by default universe, but nobody wants to talk about it. We achieved general intelligence a while back, and it was instantiated to enact a character drawn from the human prior.” So do you think he’s just totally wrong on that?

Zvi: Yes. I think he's totally wrong on many levels. We did not get artificial general intelligence yet. The thing that we did get does not function well out of distribution. It does not manifest strong alignment, and the procedures we're using will not scale to more powerful systems.

Theo: How do we know that they won't scale?

Zvi: Because if you look at how they work, they rely on the relative intelligence of the various systems and on staying within the training distributions in which they were created. They will inevitably break down otherwise. Just think logically about the phenomenon, the circuits, and the procedures that we're using. You can predict what would happen. You can use your brain. We've also seen plenty of examples of AI systems exhibiting severe misalignment when relatively similar training methods were used in different contexts. We also see humans do the same exact thing. Humans are often trained in very close analogs to the things we're training the AIs on. If you used the kind of very crude, simple, unintelligent feedback systems on humans, even with all the human's architectural advantages towards this kind of alignment, you would reliably get a disaster.

Theo: Is it possible to bootstrap these approaches? The way RLHF works now is you have human data labelers who are saying, "oh, this output's good, do more of it; this output's bad, do less of it." Anthropic recently pioneered constitutional AI, which basically involves training an AI to do the labeling, so it can scale to greater-than-human labeling capacity. They actually found that the AI, in some cases, does better than the humans at telling the other AIs to be moral.

Zvi: Constitutional AI scales better in terms of its costs, allowing you to automate the system. That is a huge advantage. But it inevitably fails even more when you try to do that on more capable systems. You can't get knowledge that wasn't in the original system into the future systems. You can only lose knowledge, in some important sense, in this way. The distortions you introduced at the start will multiply and amplify rather than be corrected over the course of the process. The AI is judging its own progress by itself. It's not going to be error-corrected. It's not going to be able to figure out the things that weren't in the exact words that were communicated to it, because it has no mechanism for doing so.

The actual implementation of constitutional AI from Anthropic right now is hopelessly bad. I think it could be dramatically improved. But if you look at the constitutional AI paper, the actual results are kind of a dystopian nightmare. You get these things that are lecturing humans about how horrible they are for asking questions that are perfectly reasonable, and telling them how horrible a person they are. No actual human would really want that reflection to be the output.

Theo: People were posting examples on Twitter after Llama 2 came out, they were asking it, how do I make dangerously spicy mayonnaise? And it was like, I'm sorry, as an AI language model, I can't help you with that request as dangerously spicy mayo is dangerous. I wanted to delve into what you said earlier, where you said that knowledge can't be gained from this, only lost. So what did you mean by that?

Zvi: What I mean is, you know, the game of telephone, right? Where I have a message for you, which is my human values. And I try to express to you what my values are. And then you try to teach someone else what my values were. And then that person tries to teach another person what you told them about what I told you. And by the end of a long chain, even an ordinary human sentence will reliably turn into something entirely different. And when you're trying to communicate something as complex and deep and subtle as human morality, as what we actually want, I think it's just a completely hopeless situation to try and do that.
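A toy simulation of the telephone-game dynamic, under the simplifying assumption that the "message" is just a vector of numbers and each hand-off adds a little independent noise; the parameters are invented, but it shows how small per-step distortions accumulate rather than cancel.

```python
# Toy "game of telephone": a value vector is passed down a chain,
# each hand-off adding a small amount of independent noise.
# All parameters are invented for illustration.
import random

def handoff(values, noise=0.05):
    """One retelling: each component drifts a little."""
    return [v + random.gauss(0, noise) for v in values]

def drift(original, current):
    """Average absolute distortion from the original message."""
    return sum(abs(a - b) for a, b in zip(original, current)) / len(original)

random.seed(0)
original = [1.0] * 50          # the "values" we are trying to transmit
message = list(original)
for n in range(1, 101):
    message = handoff(message)
    if n in (1, 10, 100):
        print(f"after {n:3d} hand-offs: average drift {drift(original, message):.3f}")

# Drift keeps growing with the number of hand-offs (roughly like its square root here):
# one retelling barely matters, a hundred retellings is a meaningfully different message.
```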

Theo: Is it possible to communicate something as deep and complex as human morality in the same way that you would communicate something as deep and complex as human intelligence through something like neural networks?

Zvi: We're not trying to communicate human intelligence. We're trying to create an intelligent system, which is a very different problem. You're also trying to do this thing where each system has to control and train systems that are smarter than it, so it doesn't understand the outputs that are coming. If I am trying to train somebody for something that is smarter than I am, I can't properly evaluate the outputs that are coming to me. I'm going to give it the wrong feedback, and it's going to be able to outsmart me because it's by definition smart enough.

Theo: I would say maybe it depends on how severe the intelligence gradient is. I wouldn't expect you to be able to evaluate the outputs of a 1,000 IQ superintelligence, but if your IQ is, say, 140, 150, you could evaluate the outputs of someone who's five IQ points ahead of you, maybe 10, maybe more.

Zvi: If we're going to play the game of telephone 100 times, where each system has to train a new system, and we iterate that literally 100 times, then we can potentially solve the intelligence question, the moving-up-the-curve question, although it's going to be expensive to do that. But then we have the problem where we've played 100 games of telephone.

Theo: So how would morality be lost but not intelligence?

Zvi: Intelligence is the ability to figure things out and solve problems and optimize the atoms of the universe the way that you want. As you train more powerful systems, they just naturally get more intelligent. That's the whole point of the scaling hypothesis. It's the idea that if you give them more compute and other resources to work with, they'll be able to figure things out better. They'll be able to solve more and more complex problems. And it's very, very easy to ensure that the gradient goes toward being better at solving problems and being able to optimize things. Morality is different. We don't know what we're talking about. We don't know what we want. We don't know how to evaluate for this. And even if we did, we would then have to be able to evaluate any given output in those terms, in a way that provides feedback, in a situation where we're being optimized against organically on every level and every step. And this problem is absurdly hard. Even someone like Jan Leike, the head of alignment at OpenAI, specifically says these techniques will not work, that they will not scale to the solution. This is the same organization that Roon works at. It's hard to solve all of these problems if these are not solutions.

Theo: It’s interesting how there's so much internal disagreement in OpenAI.

Zvi: Well, OpenAI basically did not hire for safety. They did not hire for an awareness of the alignment problem. They hired for engineering skill. And so the vast majority of people at OpenAI have a random grab bag of preferences and beliefs about these things, and they don't consider this a priority. This is in contrast to Anthropic, which hired specifically for a culture of this awareness and where there is much more agreement on the problem.

Theo: Well, do you think that people at OpenAI are literally imbuing their specific preferences into the AI, like training GPT-5?

Zvi: I think they're considering their specific preferences as to whether or not to train GPT-5 at all and how aggressively to train it. I don't think they are particularly having fights to see what type of morality GPT-5 will express.

Theo: So if OpenAI was founded by both Sam Altman and Elon Musk for the express reason of creating safe AI, then why do you think it's fallen off from that objective?

Zvi: I think Elon Musk was deeply confused about what processes would and would not constitute a safe AI. His original vision was far worse than what we see. I think that Sam Altman had a better understanding. However, he prioritized making progress on capabilities over making progress on safety, probably on the theory that these systems aren't dangerous yet. Then he looked around and found a culture that wasn't particularly amenable to AI safety within his own organization. Superalignment hopefully intends to solve that problem by creating essentially a new organization within OpenAI that does have that culture, that can work on the alignment problem for real.

Theo: This kind of peels back a much bigger disagreement between a lot of people on AI, which is: can you actually solve alignment without advancing capabilities at all? Many people would say no.

Zvi: No. Not anymore.

Theo: Anymore?

Zvi: There was a time when we were trying different approaches to alignment, we were trying different approaches to building AI systems, when it was highly plausible that you could do this. Currently, I think it is very clear that you cannot, that if we want to advance AI alignment, we have to do things that will in turn advance capabilities, if our work is going to be made public.

What you can do is you can work on systems that differentially advance alignment more than they advance capabilities. And you can work in an environment where you don't have to release everything you find. So if you were working with a group of people who would say, we're going to try various different new techniques to try and figure out how to align a system. But if we find something that advances capabilities more than it advances alignment, we are going to keep it to ourselves and not publish and not say anything, and only use these capabilities internally in a very small group as we move forward to try and find new alignment techniques. That is something that is relatively safe. But it's basically impossible to figure out how to make an AI do what you want without helping to make an AI do what you want.

Theo: So when exactly did this become a phenomenon, that you can only advance alignment by advancing capabilities?

Zvi: When we became LLM-centered, basically.

Theo: This is why I think a lot of people are misguided about this specific point. Five years ago, if you had tried to advance alignment without advancing capabilities, you wouldn't have produced much useful work. Because it appears that the path to AGI is large language models and transformers, and maybe not RL agents like AlphaGo and AlphaZero.

Zvi: I don't think it's obvious. I'm not going to give up on RL agents or GOFAI, good old-fashioned AI. I do think LLMs are looking more likely than not to be the way, especially conditional on getting here relatively soon. But I think that investing in other types of systems is a good idea, in case they turn out to be the way.

When working on LLMs, I agree that it is essentially very, very difficult to make good progress on alignment without also advancing capabilities. I think it's impossible, actually. You can work on ways to work on alignment, things like that. You can work on identifying problems. You can work on figuring out what the problems are going to be. But in terms of the actual concrete mechanistic work, yeah, it's a problem. If you work on alignment, you are also going to work on capabilities. And that means you're going to have to take the consequences. I don't like it, but that's the way it is.

Theo: A couple of weeks back, I interviewed Greg Fodor, @gfodor on Twitter. He put forward his alignment agenda, which is not object level, it's meta level, as most alignment plans are. He basically thinks that we should create some kind of almost a decentralized Manhattan project where the government funds AI researchers, alignment researchers, who are also given access to the best frontier models that there are, whatever Google's cooking with Gemini, whatever OpenAI is cooking with GPT-5, and use that in combination with those frontier models to produce new foundational knowledge that helps formally solve alignment. So what do you think about that as a plan?

Zvi: Well, people like him tend to want things to be decentralized. And in AI, that is a very, very doomed approach in many places. So if you were to diffuse and decentralize capabilities, if you were to diffuse and decentralize actual access to dangerous AI systems, I think we're all just pretty clearly dead.

So you have to ask the question: can we diffuse this work on alignment in a way that is incentive-compatible, that leads to actual alignment work rather than capabilities work, that can identify people worth funding over people not worth funding? And if you can do that, then absolutely, it would be great if the government said anybody who wants to do credible, useful, valuable alignment work can get funding and can get access to frontier models in a controlled, careful way for the purposes of their work and only for the purposes of their work. And I would be interested in exploring implementation details to try and figure out how to do that, but I don't think it would be an easy path to make that work. The default outcome is that people claim they're working on alignment, but they're actually just working on capabilities, and they're using this to secure government funding and access to frontier models.

Theo: I think that as AI gets more powerful, people will start prioritizing alignment more than capabilities. This has already been going on for a while, just because it has some of the most interesting work out there. People, scientist types, like to go for interesting problems. And solving alignment is, in my opinion, at least, and in the opinion of lots of other people like Jan Leike and Leopold Aschenbrenner, just as interesting as solving capabilities.

Zvi: Yeah, but show me the money. And the capabilities problem is plenty interesting as well, if you're not particularly concerned. And I think you have to draw a distinction between what is already happening, which is far more people, as a percentage, being concerned about alignment and wanting to work on alignment, and a world where most attention goes to alignment rather than capabilities. We're seeing a substantial number of people who very much want to focus on alignment, and that's great. But the thing where most attention is on alignment, not capabilities, I just think is never going to happen.

Theo: Why never?

Zvi: Because that's just never happened in similar circumstances. That's not how people actually act. That's a ratio that's not going to happen. That's not where the commercial incentives are, and we shouldn't expect it.

Decentralization and the Cold War (1:39:42)

Theo: An interesting contrarian take that I heard recently is that having very powerful systems exclusively centralized in the hands of trusted people could be risky for two reasons. One is, who are the trusted people and how do you trust them? How do you make sure that they are aligned? Who watches the watchmen? And then problem number two is, if all the best AGI is being developed in one place or a handful of places, and we don't have good enough AI on the outside, then that lends itself more easily to paperclip-style disasters than a less centralized approach would, especially if we're talking about strong optimizers, GOFAI, RL agents.

Zvi: There are definitely dangers of a centralized approach. You definitely have both of these concerns, but you have to contrast that against the concerns of not doing that. In the history of mankind, we have this very fortunate phenomenon where decentralizing power, encouraging freedom, encouraging freedom of thought and action, encouraging capitalism and various forms of activity with moderating influences, obviously, have been reliably shown to be the ways to increase total wealth and prosperity. You have to just unleash people, let them be creative, be productive, be innovative, and see what happens. And this is just a fact of the world. It's just a thing that we have discovered through experimentation with various systems over various times.

But there's no particular reason it had to be true. There's no particular reason why the Western systems of democracy were superior to the systems of communism. It turns out that's true. But you can, in some sense, imagine a counterfactual set of causal mechanisms that causes that to not be true. And similarly, we have gone up the tech tree in a way that has made defense capable of dealing with the various destructive and dangerous things that people can do with the capabilities available to them. We've allowed the dangerous capabilities to be contained. It's been touch and go for nuclear weapons quite a bit. But so far, we're still here.

But that's also a contingent fact about how the world physically works. And if you have a future world in which everybody has physical access to really powerful AI, then you have to deal with the competitive dynamics that this inherently implies, where nobody is essentially in charge of the whole thing. And nobody has ever presented to me a story of how that ends non-catastrophically, ever. And so until someone presents such a story as to how that could possibly end well, I don't see that much of an alternative.

Whereas if you centralize the thing in a handful or one place, then at least you have non-competitive dynamics where people can deliberately make choices rather than having the future just be whatever happens to result from the evolutionary competitive dynamics of the situation, which so far have mostly been good for humans because there haven't been more intelligent, more capable, better at optimizing agents out there. And the tech tree has been relatively non-destructive. But again, we shouldn't expect these things to hold.

So we don't really have a way to avoid the problem of, yes, obviously, if the particular humans that are in charge of these artificial intelligence systems make bad choices, we could end up with oppression or paper clips in some broad sense. But we also have some hope that they will choose wisely, that they will not make a mistake. Whereas if we are decentralized, someone will, in fact, even intentionally paper clip, or at least try to. And it's not clear that we have a way to stop that. And even if nobody successfully does that, we still have these competitive dynamics that nobody has a way to solve.

Theo: Well, you said that you so far have not heard a good story as to why a decentralized AI future could work. I think a lot of people did not predict the future of atomic weapons correctly, like in the 1940s, when we had invented atomic weaponry for the first time. In 1951, Bertrand Russell famously said that unless something very, very unforeseen happens, there are three possible futures for mankind. One is world government, with all of the nukes centralized in one place. Two is world destruction, everyone dies. And three is near world destruction, where everyone almost dies and civilization is destroyed. And yet, what ended up happening in the end was, yes, despite our best efforts to prevent it, the Russians have nukes, the Chinese have nukes, and even some of our allies, such as the British and French, have nukes. Yet the world hasn't ended, not because of magic and sunshine and rainbows, but because of aligned incentives and mutually assured destruction.

Zvi: So, I think that you have to also add a lot of luck, and a lot of very fortunate things that happened to us along the way. I think a lot of our advantages are dead. You can't ignore the Cuban Missile Crisis, you can't ignore Khrushchev, you can't ignore Andropov, you can't ignore any number of other close calls or other paths we could have gone down. There were serious considerations of nuclear weapons use a large number of times. And it is entirely possible, if you read The Doomsday Machine, you know that if a single aberrant nuke had gone off in the wrong place at the wrong time, the U.S. was fully prepared to fully nuke Russia and China on principle, without waiting for confirmation, or even checking to see if China wanted to be involved at all, in a way that could possibly have been sufficient for world destruction even if there was no retaliatory strike.

The Russians have created a Dead Hand, so that if it detects that leadership has been decapitated, they can initiate a second strike. There have been any number of nuclear threats made over the course of the Ukraine war. No realistic person would have put a sub-1% chance on nuclear war as a result of the Ukraine crisis if you start at the beginning. So given all those considerations, I think these people look a lot less stupid than we make them out to be.

Theo: I don't think they're stupid.

Zvi: These predictions were a lot less foolish and a lot less wrong than people make them out to be. In fact, going forward, this is one of the main arguments for why we must push forward: the current situation is not a stable equilibrium, and every year we run a real, if small, risk of a nuclear exchange, potentially a very large nuclear exchange between major powers, and this is not going to go away anytime soon unless things change. But what it came down to was: we were very fortunate that only a relatively small number of countries could acquire nuclear weapons. We were fortunate that maintaining and having nuclear weapons at scale was expensive. We were fortunate that everybody involved managed to keep their heads about them at the right times, and that we managed to navigate the game theory reasonably well, and as a result of all that we are very fortunate we're still here. But I don't think it's at all obvious that that was going to happen. I don't think we were safe, and if the contingent physical facts and behaviors had been different, I think we would have been in a lot of trouble.

Theo: Well, everyone talks about the Cuban missile crisis, but I think one of the times where we were most likely to engage in nuclear war was when nukes were entirely centralized into one entity, which was the US. Towards the end of the 1940s and the beginning of the 1950s, before the Soviets got the bomb, a lot of Westerners agitated to bomb the Soviets, bomb the Chinese now, get it over with, because they were pretty much convinced that there would be some kind of World War III land war. The Soviets would invade Europe, and this almost happened in 1950 in the Korean War, when China joined the side of the North Koreans fighting against the U.S. and the U.N., and Douglas MacArthur asked Truman "let me bomb them!" and Truman said no. But I think there was no obvious safety benefit to having all the nukes concentrated in one country as opposed to multiple countries, and maybe we're actually better off having multipolar nuke situations.

Zvi: I strongly disagree. I think that the reason why people were calling for us to nuke them was because they were going to get nukes. The Soviets were obviously going to get a very large array of nuclear weapons, and the Chinese eventually were going to get them as well, and so if we waited and the war came later, there'd be nukes, and people would get nuked on both sides, and the world would be destroyed, whereas if we acted quickly, perhaps we could keep it centralized in one place. Also, I think we are saying this from the perspective of 2023, where we know that we won the Cold War, where the communists did not in fact end up taking over the world. Many people at the time genuinely thought that the communists were going to win the Cold War by default, that they were going to win a land war by default, or that they were simply going to take over countries one at a time, because we weren't willing to try and overthrow them but they were willing to try and overthrow us, because a lot of the world was very sympathetic to their cause, because they were better at propaganda in various ways, and for some other reasons. If you thought that, you were facing an alternative of either fighting now or a world that was communist, and I do not think the decision is obvious by any means.

Theo: Did we win the Cold War just because of luck or did we win the Cold War because capitalism is an inherently better economic system than communism that produces more prosperity that people prefer in the long run and that communism leads to stagnant societies like the USSR?

Zvi: I think that is a large part of why we won, but I think that we could have lost it anyway. I think that the people in the 50s and 60s, even in the West, very much did not appreciate the extent to which we had a productive long-term advantage. They thought, no, this is a better lifestyle, this is a better world, this is humans allowed to be in a better state, and, you know, Khrushchev gets up at the UN and says "we will bury you," and he believes it! You can rightfully say, well, maybe the Soviet system is better at producing generic goods and creating a kind of generic material abundance, creating an economy that is capable of outcompeting you, but maybe the result of that is really bad. I find it very odd to look back upon all of this and reach these other conclusions.

Theo: So back to the topic of AI specifically because we've got a little lost in the Cold War. It's probably the best analog/metaphor that we have but of course it's not perfect. Why are you convinced that the offense-defense balance of AI favors offense? Typically when people argue for open source they say there will be defensive AIs to counteract the offensive AIs and in a lot of cases this seems to make sense like for example social media misinformation, spam, censorship. So why would this break down on a larger scale?

Zvi: Spam and censorship is just an ordinary problem. You have a specific set of things coming at you. You can filter them. It's not obvious who's going to win; it's not obvious whether that favors offense or defense. My guess is the defense can at least keep pace there if you care enough. But in the case of things like creating very destructive materials, like nuclear weapons, offense obviously is favored greatly over defense. If AIs give you the capability to build nuclear weapons, we're all in a lot of trouble. In biological warfare, offense is humongously favored over defense. And if it's not both sides having similarly powerful capabilities, but instead every individual person on the planet being given the ability to build a biological weapon, to engineer a biological plague, then there is no possible way for us to deal with it if one in a billion people decides to start acting crazy and starts threatening us with various agents or unleashing them. Even one in a billion is too many. And if there are other similar things we haven't thought about, it's the same thing. There's no reason to presume that under conditions of intense cyber warfare, the defense will be available at reasonable cost, that it can allow us to have reasonable systems.
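The one-in-a-billion point is simple arithmetic, but it is worth spelling out; the population is rounded to 8 billion and the rate is the hypothetical being discussed.

```python
# How many "one in a billion" bad actors is that, concretely?
world_population = 8_000_000_000   # rounded
rate = 1 / 1_000_000_000           # one in a billion
print(world_population * rate)     # 8.0 -> eight people with plague-engineering capability
                                   # is already far more than zero, which is the problem.
```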

But it's not so much about offense versus defense in my mind; it is about competitive dynamics. It is about the fact that if everybody has access to a very powerful AI, then anyone who does not put their powerful AIs autonomously in charge of increasing amounts of decision-making and increasing amounts of productive capacity, anyone who does not let them loose, will lose the competitive battle for resources. They will be outcompeted, and nature will simply take its course in some important sense: whatever is most effective will have to get copied and will have to get tuned to be more effective in these ways, because what choice will we have? We will have no say in what the future looks like as humans anymore, very, very quickly. And who cares, in some sense, whether that represents offense or defense, because we are not going to be engaging in any of the offense or any of the defense; it's all going to the AIs.

More on AI (1:53:53)

Theo: So clearly you believe that AI doom is a significant probability but you don't believe it's greater than 90 percent. So you break with Eliezer Yudkowsky on that who thinks it's 100 percent. So why?

Zvi: I think it's important that he thinks it's 99 point something, not 100, but yes. Basically, I hold out a number of ways where this might not end in disaster. One of which is simply: we might not build sufficiently capable artificial general intelligence anytime soon. It might be more difficult than we think. Our civilization might be more inadequate than we think. The technological power laws and scaling laws might not hold. They might not, in practice, result in sufficiently capable systems.

Theo: Doesn’t that just kick the can down the road?

Zvi: I mean, not for a long time. And the longer we have to solve these problems, the more likely we are to be using different architectures, the more likely we are to be in a better spot: more intelligent, more understanding, with more time to work on these problems and more time to find a solution.

I think there is some chance that techniques that are relatively imprecise, relatively unsophisticated, give us enough of a chance. There's a significant chance that the person who gets there first does create some sort of singleton. There's a significant chance that the people who do figure this out manage to find things I'm not thinking about. They find solutions that we're not considering right now. There will be a lot of very smart, very motivated people working on various forms of this problem. We will hopefully not simply be trying to scale up constitutional AI and hoping for the best, which I think has a very, very low probability of working. But also, I have model uncertainty. Maybe I'm thinking about the world in ways that aren't correct. Maybe I misunderstand these problems. I haven't been banging away at the bare metal constantly for 20 years. I try to understand these problems in a different way. When a lot of people have a lot of hope in a lot of different places, I think you should assign some probability to that.

Theo: Like how AI researchers think that doom is a lot less likely than you do?

Zvi: Yeah. I understand a lot of their arguments. I'm confident that some of their arguments are incoherent or not well considered and are just incorrect. I can disregard those, but others seem far more plausible. You do have to consider, to some extent, that a lot of people are telling you you're wrong. It's not something you can completely ignore. I think it's very, very hard to have a 99% probability of something, of course, when a very large number of people think it's very, very different from that.

Theo: But to your credit, this reminds me of a Roon tweet where he said something like, nobody is prepared for what's coming with AI, least of all AI researchers who spend their time trying to lower the curve on a loss function and think of new optimizations and hacks. I think perhaps what they do day to day is a little bit divorced from the long-term societal impacts of AI, because what they're just doing is programming.

Zvi: Yeah, the stories of people like Hinton and Bengio suggest that they didn't think about the potential dangers of AI because they figured they were so distant, they didn't have to worry about them. They didn't ponder what would happen if we succeeded. They were just trying to make incremental progress. Then one day they woke up and realized they needed to worry about maybe succeeding soon. And then they thought about it and they got terrified.

Theo: Very Oppenheimer-y of them. Interestingly, there were three Turing Award winners, Hinton, Bengio, and Yann LeCun. Yann LeCun doesn't really believe that AI doom is a significant probability, if a probability at all. He's very skeptical. So do you think he just doesn’t understand?

Zvi: I think he chooses not to. I think he's making bad arguments in bad faith, often in very bad form, and he chooses not to engage with the questions in any serious way, and that is his decision. I made a deliberate decision a while ago not to cover Yann LeCun. I'm not going to quote his bad arguments and then knock them down. I'm not going to dunk on this guy. He can keep doing that; I'm going to keep ignoring him. If he makes a good point, I'll quote him, but otherwise I'll just ignore him. And I've been very happy with that.

Theo: So far there have been no real AI risks and current day AI systems aren't really capable of any kind of major risk. So at what point do you think AI actually does become risky?

Zvi: I think we should not have been very many nines confident that GPT-4 was not a dangerous system, especially not a dangerous system when we then add various scaffolding and upgrades and improvements and plugins and so on to it over the course of many years. I think given what we know now, we can attach a good number of nines to that, that GPT-4 and similarly capable systems are best thought of as non-dangerous. But I don't think we could know that in advance with that much confidence. And when you ask me about a system that metaphorically is worthy of the name GPT-5, how many nines should we be willing to attach that the system is not existentially dangerous to us? I think one nine, yes. Two nines, probably not. Three nines, definitely not.
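To spell out the "nines" shorthand, which is standard usage rather than anything specific to this conversation: each additional nine of confidence that a system is safe cuts the residual risk by a factor of ten.

```python
# "Nines" of confidence that a system is NOT existentially dangerous.
for nines in (1, 2, 3):
    confidence = 1 - 10 ** (-nines)
    residual_risk = 10 ** (-nines)
    print(f"{nines} nine(s): {confidence:.1%} confident, {residual_risk:.1%} residual risk")

# 1 nine(s): 90.0% confident, 10.0% residual risk
# 2 nine(s): 99.0% confident, 1.0% residual risk
# 3 nine(s): 99.9% confident, 0.1% residual risk
```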

Theo: What do you think AI could do to convince even the most hardened skeptics that it is an existential threat?

Zvi: Most hardened skeptics? Kill them.

Theo: Really? So if AI were to, I don't know, launch a nuclear attack on a small city or engineer a plague that kills a million people but not a billion people, they would just remain fully skeptical?

Zvi: I think there are a number of people who would respond to that by saying it was malicious humans using AI in a malicious way, or a dumb mistake we do not have to make again. We will learn from our mistake; we won't let it happen again. A significant number of people absolutely react in this way, and they will control a substantial amount of resources, and they will still attempt to build AIs. Now, I do think that if an AI launched a minor nuclear strike that killed millions of people, the governments of the world would in fact respond pretty strongly to that, in ways that would make it not so easy to build an artificial general intelligence. But I do not think that the most hardened skeptics would bow even to that kind of evidence.

Theo: One particular risk of AI is hard takeoff, FOOM, where the AI recursively self improves to go from roughly human level or slightly above to vastly incomprehensibly superhuman in a very short time. So especially given our current paradigm of neural networks that are limited by compute and data rather than just limited by the actual code that they're written in, what do you think about fast takeoff risks now?

Zvi: I think it's a real risk. I think it'll be very, very hard to know when you are creating a system that is capable of a relatively hard takeoff. I think there are degrees of hard takeoff. Obviously there's the traditional version, where this happens in an hour or a day or a week, and then there's the version where it happens over a month. It's not a yes-or-no Boolean; it's a question of how hard your takeoff is. But the fundamental theory behind it seems very, very sound. Once you have a system that is capable of greatly accelerating its own improvement and its own capabilities enhancements, then what do you think is going to happen? History has already seen this, in some sense, happen multiple times. You've got the humans, and then you've got the agricultural revolution and the industrial revolution. Why wouldn't it happen?

Theo: Could it be that certain jumps in intelligence are harder than others? I keep using IQ as a metaphor just because it's most people's conception of assigning a number to intelligence. But could it be that it's significantly easier to go from 160 to 165 IQ than to go from 240 to 245 IQ, for example?

Zvi: That doesn't seem right. I don't see why that should be true. I think that, given specifically human architecture, we are going to get increasing difficulty in amplifying human capabilities and intelligence once we get well beyond the range for which that architecture was optimized. But I don't see any reason why that should be true of an arbitrary computer system. And if it was true, why would the wall sit in the human range? What is so special about the human range? Nothing, as far as I can tell. And the systems that are already in the IQ-200-type zone seem historically to be very, very good at enhancing their own capabilities and finding ways to make themselves better already.

Theo: Clearly, there's some specialness with humans on the lower bound, at least.

Zvi: In what sense? Will you say more?

Theo: Well, consider systems that were not quite as intelligent as humans. It's still not clear whether it was a slight jump or a massive jump from, say, chimp level to human level, but chimps don't have computers or rockets or cars, and humans do. So, are there things that would be just totally incomprehensible to humans in that same sense? Especially given the arguments for computational universality, humans being Turing complete, and the potential we have to upgrade our own minds, is that still the case?

Zvi: I think almost every human will come to some point where, if you throw them enough scientific or mathematical literature with enough weird symbols in it, and enough complexity, they will eventually throw up their hands and go, I can't do this. And that's not simply because they haven't spent the requisite time; they hit a wall. This is beyond my capability. I hit that wall too, in some places: I can't handle this anymore. I fully expect, yes, absolutely, that if you build more capable AI systems, there will be things these AIs understand, in some sense, and can produce and can say, that humans just aren't capable of properly understanding.

Theo: You talked about how systems in the IQ 200 range tend to be good at enhancing their own capabilities. Did you have any specific examples in mind?

Zvi: I mean, those people. People like von Neumann. Such people are very good at creating scientific innovations, at figuring out new ways for people to figure things out, ways they could potentially enhance themselves. If you gave von Neumann the ability to enhance his own capabilities by solving the types of problems such people have to solve, I think you would have seen a von Neumann takeoff, very clearly.

Theo: Well, there seem to be some areas where very high-IQ humans do worse than lower-IQ humans. It seems like in some areas, at least on a naive look, there would be diminishing returns, or even negative returns, to intelligence. For example, finding a mate, or making friends, or even starting businesses.

Zvi: So, the starting-businesses thing, I think, is just a statistical myth. People think that being smart doesn't help you start a business; I think they're just wrong. If you look carefully at the studies about income and general prosperity, income is positively correlated with intelligence and with all the other good things, and vice versa. There were a handful of studies that had counterintuitive results in some subsections, and people started yelling about them because that's what they wanted to find. But it was never true. It's just a mirage.

As for making friends and finding mates, well, those are fundamentally different problems, right? There you have very clear restrictions on the use of resources, and different preferences. If von Neumann could, in fact, have gained limitless utility by finding typical mates, and that was something he inherently valued very much, then I think he would almost certainly have found very effective techniques for getting those mates. I have every faith that he would have figured this one out. Similarly with making friends. But he didn't want to. It's not what he cared about. So he did something else.

Dealing with AI Risks (2:07:40)

Theo: So given the risks that do exist, if you were made dictator of the world, King Zvi, Emperor Zvi I, what would you do to best counteract them?

Zvi: If I'm emperor of the world, then obviously that implies a lot of weird counterfactual things. But in the spirit of the question, I would prepare to impose limits on training runs and compute spending, I would require the tracking of advanced GPUs, and I would be ready to move towards a world in which you could say these things aren't ready to proceed. I would also fund various efforts to try to solve various forms of the alignment problem. And I would do a lot of other things if I were emperor of the world, but there you go. Being emperor of the world also solves a lot of your problem, because you know you can enforce your restrictions around the world.

Theo: So what about a lower level of power, then, like president of the US?

Zvi: I would, again, try to work towards that outcome, but I would then need to be focused on moving towards international coordination.

Theo: The main difference is if you're emperor of the world, you don't really need to care about international coordination. But the common counter-argument when people in America talk about slowing down AI is, what about China? China's not going to stop.

Zvi: To which I say, how do you know? What makes you think that?

Theo: Because, though I don’t necessarily believe this, China wants to defeat America, and AI is a really powerful tool, and it will advance this powerful tool as much as possible in order to impose its will on the world.

Zvi: Now, that sounds like Bertrand Russell. That sounds like: one of these two things must happen, everybody will obviously respond to their incentives, there's no way we could possibly get along with these people. As far as I can tell, we're not trying to get along with these people. We're not trying to make a deal with these people. The deal would be very much in China's interest, so why wouldn't they make it? The Chinese are acting very terrified of AI, and rightfully so from their perspective. They have to worry quite a bit about what this would do to their ability to control their people. And also, we're eating their lunch, quite badly. All of the major players here are American, or to some extent British. If we offer to stop the race, or to slow down the race, if they play ball, why would Xi Jinping say no, exactly? I am so confused by this claim. As far as I can tell, everything they have, they got from us. What the Chinese are doing is copying our stuff. We're afraid of our own shadow. We are causing our own problem: because the Chinese have our old stuff, we say we have to make new stuff, which they will then steal again.

Theo: On a tangent, could it actually be a good thing that LLMs could help destabilize the grip that the CCP has over Chinese society?

Zvi: It's an excellent thing, in particular, because it means that they're less eager to pursue LLMs right now. Whether or not we would want China destabilized is a question that I am choosing not to think about too carefully, but it's not obviously good or bad.

Theo: We talked about emperor of the world and president of the US, but what if you became the CEO of, say, OpenAI, then what would you do? Not only would you be coordinating internationally, you would now need to coordinate domestically with the other AI companies.

Zvi: Yes, but I'd also have full authority over OpenAI. In some ways it's easier, in some ways it's harder; it's a different place to play the game from. If I were head of OpenAI, I would start out by making various public commitments, and working with the other labs to make various public commitments, about the circumstances under which we wouldn't release systems and wouldn't train systems, and try to work together to create an international framework that, once we sign off on it, maybe the countries can sign off on as well.

Specifically, if I were running OpenAI, my first problem would be that I have to address my internal culture. I have a culture filled with people who are thinking about this much less well than Roon is. I'm basically okay with having Roon on my team; Roon is actually thinking reasonably well about these problems, he has just reached different conclusions and has different opinions. But there are people who are very dismissive of the very question of safety. My first approach would be that those people have to go, or be compartmentalized somewhere they don't know dangerous things. I have to move towards that. I have to rebuild my corporate culture.

Theo: No, Zvi, you don't understand. We have to build a thermodynamic god and unleash it on the universe.

Zvi: I don't want to do that.

Theo: I would love to have an e/acc person on the podcast so I can ask them this specific question and see if there's anything there or if it's just like LARPing.

Zvi: My response to that is you're allowed to have preferences. One of my preferences is I don't want to do that. It's just that simple. I'm so tired of people thinking that's not enough. I'm allowed to prefer certain arrangements of atoms to others and the arrangement you propose is bad. So no.

Theo: In fairness to the e/accs though, I think that most of them don't think that AI doom is that likely. If they did, they would probably not be e/acc. They think we're building the thermodynamic god and it will benefit the universe greatly. Maybe it'll kill us, but probably it'll benefit the universe greatly, which is actually pretty similar to what even Eliezer Yudkowsky used to believe 25 years ago.

Zvi: He started out thinking that if AI is built, it's going to be capable of doing some amazing things, transformational in the best possible way. But that's just not the default outcome. That's just not the likely outcome unless we solve a bunch of impossible-difficulty-level problems, and right now we don't have a path to doing that. So that's going to be a problem.

But the e/accs, as far as I can tell, are not uniform. They have a lot of different kinds of reasoning and motivations behind what they're doing. Some of them are well thought out and reasonable, and some of them are not. Some of them are, in fact, "humans don't matter very much, and I don't really care if everybody dies." Others are "the humans won't die." Some are "the humans won't die, but if they did, that seems kind of fine," or "it would be the judgment of the universe," or whatever. And you have people who say, "well, I don't want the humans to die, but acceleration is the way to make sure they don't die." It's all over the map, and it runs the gamut in quality as well.

Theo: And then some people like Robin Hanson have entirely different opinions. It's like the meaning of what it is to be human will change significantly over the indefinite span of the future, just like it has in the past. And who are we to say that our specific way of doing things in 2023 is the way that must be imbued into the future AI, and it must be aligned to this. Why wouldn't we be able to think of AIs in the Robin Hanson way as our mind children, our descendants, our lineage?

Zvi: You can choose to do that. I don't. I don't think that counts. That is the vision of the future Robin Hanson describes, and I ascribe very little value to it. I do not want that vision of the future. I will let other people decide whether they find value in that future. I don't think it's incoherent to find value in that future; I think we could. I don't.

I don't think that if we built a bunch of artificial intelligences that carry some legacy of humanity, because they were originally trained on human data and text and other forms, and that then adapt themselves to whatever environment they're given, that would inherently be something I value, or the thing that should populate the universe. Yeah, I think that's bad, actually. I don't like it.

Theo: Could it be something better than your current best conception of what you value? Let's say you had a medieval peasant from 1300 whom you brought into the modern world against their will, against their wishes. Now religion is a lot less relevant, and a lot of the societal structures they were used to have disappeared. But at the same time, they no longer have to die at the age of 30 or deal with plagues and barbarian invasions, and they get to live in a house with air conditioning and computers.

Zvi: My actual prediction is that once they adjusted to the culture shock and the language barrier, you would see a wide variety of responses. A substantial portion of people would react very well and say, "This is vastly better. This is a utopia. The world is great. And if God wanted us to believe in him, he wouldn't have made it possible for people to do all this." Other people would react with, "The things I value are no longer here, and I think this is terrible." Some of that would be religious, some of that would be cultural, and some of that would be something else. Different people have different opinions, and I'm not here to tell them otherwise.

Theo: In modern society, they could move to an Amish village or a hunter-gatherer tribe in Africa or something.

Zvi: For now, I would say, right now, you could move to an Amish village or a similar place. I think the Amish are living a strictly better life than a medieval person; it would be very hard for me to find a way to disagree with that. But there is the problem that you're still surrounded by a group of people who don't believe in that and don't follow that, and a lot of people from medieval society would inherently care about that quite a lot, for reasons that I think are pretty easy to be sympathetic to. If they actually understood the situation, they would quite reasonably ask, "Will I be allowed to continue in this lifestyle for all that long?"

That's the thing about AI. Right now, if I wanted to, I could go out and join an Amish village, or basically live any lifestyle from history, if I wanted it badly enough; I have the resources for that. But if AI comes along, that's no longer the case. There's not going to be any refuge anywhere if things go badly. And probably in a lot of versions where things go relatively well, it's going to be difficult to find refuge of that type as well.

Writing (2:18:57)

Theo: So with all this talk about AI doom, let's end on a more positive and parochial note. I think that you're an excellent writer. I love your weekly AI columns; I would read them on LessWrong, and I still do, mostly. So what advice would you have for someone who wants to be a writer?

Zvi: If you want to be a writer, there's only one way to get good at writing, and that is to write. You have to write, and you have to do it with deliberate practice, meaning you have to look at what you're writing and ask yourself: what parts of this worked, what parts didn't, how do I improve it, what rules does that reflect, how do I iterate on that? But literally every writer who talks about how to get good at this just says: write, write, write, constantly. And that is, in fact, how I got good as well.

Theo: What about publishing? Currently, I have 30 or 40 drafts in my Substack folder, ranging from a single line of an idea to a 90% finished essay, that I haven't wanted to post because they're not very good. So at what point do you publish something?

Zvi: If you're not killing your darlings, you're not doing a good job. If you publish ninety-something percent of the work that you put down on the page, then you're not filtering properly, and you're not asking yourself which parts of it are good and which parts are bad. But you also do want to put yourself out there and accept that your first 100 posts are mostly going to suck compared to posts 101 through 200. I mean, those will suck less, but yeah. I wrote multiple Magic articles a week for years, and then I wrote a lot of posts in the rationality space, and slowly you get better. But it is slow, right? Let's not fool ourselves on this.

Theo: All right. Well, I think that's a pretty good place to wrap it up. So thank you again, Zvi Mowshowitz, for coming on the podcast.

Zvi: Absolutely.

Theo: Thanks for listening to this episode with Zvi Mowshowitz. If you liked this episode, be sure to subscribe to the Theo Jaffee podcast on YouTube, Spotify, and Apple Podcasts. Follow me on Twitter at Theo Jaffee and subscribe to my Substack at theojaffe.com. Thank you again, and I'll see you in the next episode.
