Theo's Substack
Theo Jaffee Podcast
#2: Carlos de la Guardia

AGI, Deutsch, Popper, knowledge, and progress


Note: This transcript was transcribed from audio with OpenAI’s Whisper, and edited for clarity with GPT-4. There may be typos or other errors that I didn’t catch.

Intro (0:00)

Welcome back to the Theo Jaffee Podcast. Today, I had the pleasure of speaking with Carlos de la Guardia. Carlos is a former robotics engineer and longevity scientist who now works as a solo independent AGI researcher, inspired by the work of Karl Popper, David Deutsch, and Richard Dawkins. In his research, he seeks answers to some of humanity's biggest questions: how do humans create knowledge, how can AIs one day do the same, and how can we use this knowledge to figure out how to speed up our minds and even end death. Carlos is currently working on a book about AGI. To support his work, be sure to follow him on Twitter @dela3499 and subscribe to his Substack, Making Minds and Making Progress. This is the Theo Jaffee Podcast. Thank you for listening, and now, here's Carlos de la Guardia.

Carlos’ Research (0:55)

Theo: Welcome back to episode two of the Theo Jaffee Podcast. I'm here today with Carlos de la Guardia. I guess we'll start off by asking, what does a typical day look like for you as an AGI researcher?

Carlos: Well, partly, it's working on my book. So, I take the ideas I've worked through in the past and try to put them in some form that is less chaotic than they are in my notes. And then part of it is following various leads of things that have been interesting to me. I almost think of it like emails for somebody who doesn't have any collaborators. So the prior day, I'll kind of see interesting things, go into my notes, and I'll be working through them one by one and see how they connect with all my prior thoughts. And sometimes you'll find things that seem interesting in some very small way, but they'll turn out to connect to all kinds of bigger things.

Theo: And also in the category of what you do day-to-day, what kind of software do you find most interesting? Because you ask different people on Twitter, and they have such different answers. Some people want super complex systems. Other people say, oh, just Apple Notes. I like this. So just workflow-wise?

Carlos: Yeah. I use Roam right now. And so I like the idea of being able to connect things, and that's been helpful. I use Google Keep for just jotting notes down. Beyond that, I don't do much. Yeah, so I keep it pretty simple, I suppose. In fact, I actually met up with Conor, the CEO or the guy who runs Roam, and he told me that I probably shouldn't even use Roam because I use it only with the simplest features, and it has much more advanced things it could do. But it's been working well so far.

Theo: So how much does your AGI research overlap with conventional AGI research? Kind of like what OpenAI does, where they have neural networks, and they're training them on huge GPU clusters, and they're monitoring the loss functions, and so on. Do you do anything like that?

Carlos: Nope. Mine's purely theoretical. I focus on theoretical questions of what makes humans special, and what it is that we're doing so differently from every other algorithm, animal, and system. For me, that's mostly a theoretical question that only rarely involves running computational experiments. I often think of existing machine learning as a bottom-up approach: taking things that work in practice for some tasks and making them better and better. And I think of myself as taking more of a top-down approach: starting from philosophical ideas at a higher level and trying to make them more computational. If Popper worked on the logic of scientific discovery, I think of myself as working on the practice, or the computation, of scientific discovery. That also means that nothing I do ever works in the traditional sense of people running actual code today. So it's probably pretty unsatisfying for them to hear about theoretical ideas, because they seem so untested. But on the other hand, I'm often addressing problems that they're not considering at all. So I think that there's a place for the high-level research.

Theo: So how likely do you think it is that you or someone like you will get to AGI before someone like OpenAI? Consider AGI, for the purposes of this, like something that can do, let's say, most of what an economically productive college graduate level human can do.

Carlos: I don't think that's what AGI is, necessarily. So that's probably where we differ. In the sense of how soon it will be before I create a useful system? Maybe never. It's not really what I'm trying to do. I think there's a fundamental difference between people and tools, and I'm not working on tools at all, really. I'm curious about what it is that makes humans fundamentally different from everything else. So my objectives are quite different, and the book I'm working on tries to drill down on what those differences are. I think of it more in terms of human capabilities.

It's sort of wrong to think about our minds as things that happen to be able to do lots of specific things. What defines us isn't the many specific things we can do, but the fact that we can think of anything. If the thoughts that we can hold are unlimited, as I think they are, that's what defines us. If you give me any system, however many finite things it can do, I will think that it is fundamentally different from an infinite system like ours.

Maybe this is too much of a tangent, but when people talk about intelligence, I like to break that up into two separate things: one is knowledge, and the other is the ability to create knowledge. Compare a data center and a baby. A data center has a huge amount of inbuilt knowledge: millions of lines of code, running many independent complex processes. And it has almost no ability to create new knowledge. If something new happens that wasn't programmed into it, it won't be able to respond. Whereas the baby has almost no knowledge. It's terrible at running a data center. And yet it has infinite ability to learn new things. So if you looked at them and just asked, what is the intelligence of these things? Well, it's the wrong question. You'd ask, which one of these has more knowledge? Well, at the beginning it would be the data center. But which one of these has more knowledge-creating ability? The baby. And that ends up being decisive.

Theo: I see. So have you found GPT-4 or other current AI models at all useful for your AGI research? Last week, I interviewed Greg Fodor. And he talked about how he believes that in order for AGI research to progress, AI researchers need to have access to foundation models. Because that is like the single most helpful tool for AI research to progress. So do you have a similar or different idea on that?

Carlos: Well, in line with my previous kind of answer, I don't really do anything computational. So to me, it would only be helpful in the usual ways it would be helpful to any researcher. In that case, it could be useful. I haven't actually been using it. Maybe I should be. But I'd be using it purely as an ordinary researcher, not as an AGI researcher in particular.

Theo: Do you have a ChatGPT Plus subscription?

Carlos: No. I just have some basic one or whatever. I don't use it often. I've mostly used it so far for wordplay. I'll ask it like, can you give me a word that starts with P that means something like rain? Or whatever it is. No. It's good at that kind of stuff. It's good at many things. And I'm probably the least expert in them. I have a friend who is trying to get into programming. And so he's using it much more intensely than I am.

Theo: Yeah, I studied computer science in school. So I find it immensely helpful for learning. Though on the topic of conjecture, when I'm working on actual projects, something that hasn't been done before exactly like what I'm trying to do, it does fail in surprising ways sometimes. Even a system as advanced as GPT-4.

Carlos: Yeah, I think one thing that's interesting to note is that if you're judging GPT or some system like that, maybe there are two broad ways to judge it. One of them is, how well does it know what is known? And then the second thing might be to say, how well can it solve unsolved problems? And I think we sort of underappreciate how often we solve unsolved problems in the course of everyday life. Just in a way that is almost second nature to us. You know, we'll be asking, where are the keys? We'll find the keys and we'll carry on with our day, not realizing that we've just conducted a quite sophisticated search program in the midst of our ordinary moving around.

So I think when you look at the space of possible solutions to problems that we confront, it's huge. And I think this is where maybe GPT shines less because it's one thing to know what is known or to have text about what is known. It's quite a different thing to be able to use that to solve new problems. And, however many trillions of things have been discovered, that's really an infinitesimal part of what we can discover and of the search space that we're involved in every time we try to solve a hard problem.

So when you ask, can GPT do something plausible in some case, the answer is probably going to be yes. But if you ask, can it find the actual solution to some problem in some vast search space, where we don't know the answer yet? Well, that takes a special kind of program. And that's one half of my book. It's partly about universality, all the things that are possible for a system to do. And then secondly, if you have a search space, how efficiently can you search it? Because the search spaces that we're involved with in ordinary human problem solving are exponential. They're huge. So being able to actually navigate them usefully is an incredible challenge. I think that's where you see a real difference between different kinds of search processes, GPT being much inferior to the human one.

Theo: So if you think GPT is necessarily limited compared to humans, how far do you think the current paradigm can go? What do you think will be the capabilities of GPT-5, GPT-6, what will it be able to do relative to humans?

Carlos: I don't know. That's a question about machine learning, and I don't think I have any particular insight into what any particular technology can do. All I can say is that I'm interested in the things that distinguish different systems. So I'm interested in the questions of, first, what is the total range a system can explore? What is the search space of things that it can explore? And then secondly, how does it navigate that space? If you saw incredible leaps in the size of the search space that different models could explore, that'd be great. Although my fundamental question would be, is it unlimited or not? And as to what designers come up with in the coming years, they may surprise me in their answers, but that's the question I'll be asking.

Secondly, it'll be how efficiently does the system navigate the search space? In particular, what fraction of the new ideas that it comes up with are actually improvements over prior ideas? Biological evolution, for example, is very bad at this. With a billion letters in the genome, it sort of starts switching a few of them at random, so there's a one in a billion chance of being an improvement, let's say. So it's very bad at coming up with new things that are better. It's easy to come up with new things, but new things that are better, that's the hard part. And so those are the kinds of questions I'll be asking.

Theo: So what kinds of capabilities do you think you'd have to see in a system for you to say, yeah, this is like a human, this is universal, it can navigate any part of the search space?

Carlos: I don't know if you can test that behaviorally. If you're asking, can it do this infinite set of things, you can't actually check the infinite set of things. So you have to look more in terms of how it's built. My goal with my book is to try to see what fundamentally matters and then how those things were made achievable in the human case. I think that'll give you a clearer picture, so that if you were to open up the internals of some system, you could tell what it could do and what it couldn't do. If you knew that Turing completeness was an important feature of the system, and you could open up a CPU to see if it could achieve that result, then you might be able to tell, okay, this system is Turing complete. You asked a very specific question; you weren't necessarily testing its outputs for different things. So I think it'll be more about how is this system built rather than what is its behavior.

Theo: So kind of like, if you have a computer made out of crabs? Have you heard about the crab computer? It was a real experiment done by a team of scientists where they figured out how to move crabs around in such a way that they would actually represent a Turing machine and they could get it to represent basic programs. The crab computer, if you have enough crabs, would be universal. But a TV playing the same loop of content over and over again with no option to change its output would not, even though the TV could demonstrate something a lot more complex than the crabs in practice. Is that kind of what you’re getting at?

Carlos: I was just thinking to myself, the crab computer wouldn't be universal only because it doesn't have enough crabs.

Theo: With an infinite number of crabs, then it should be good. But if you only have a finite number of crabs, say a million crabs, then you would probably have a hard time running Google Chrome, especially at acceptable speeds. But if you have a monitor with a static image of Google Chrome on it, it would demonstrate more outwardly complex behavior than the crabs, but it would not be universal in the same way that the crabs are.

Carlos: There's no such thing as a universal Turing machine in physical reality. Every machine is going to be finite at any given time. But the question is, can you extend it? If there are humans around or something, or in the case of humans, if there is a system that itself has the knowledge to acquire more resources and turn them into more memory and more computational power, then that system itself will be able to extend itself. And at that point, it'll still be finite in its computational capacity at any given moment, but into the future, there'll be no limits on the size of its capacity.

So if the crab computer requires an external person to add more crabs, then it's not as impressive. But if it itself has the knowledge, as I think human civilization does, to go out and acquire more resources, well, then if we consider ourselves a big crab computer, the big human computer, we're able to get more non-crab materials from the world and turn that into what we want. That actually is a fundamental thing because there are no infinite computers in reality. There are only finite ones, but some of them know how to make themselves bigger. And it makes all the difference.

Theo: So there's no infinity, only a beginning of infinity.

Carlos: There you go.

Theo: On a slight tangent, on the header of your Twitter profile, you have five books: The Logic of Scientific Discovery by Karl Popper, The Blind Watchmaker by Richard Dawkins, The Beginning of Infinity by David Deutsch, Gödel, Escher, Bach by Douglas Hofstadter, and Knowledge and Decisions by Thomas Sowell. Personally, I've only read Beginning of Infinity, but the other four are certainly on my list to read in the near future. So how did you find these books? And how did they influence you?

Carlos: I read The Beginning of Infinity first. I found that through Sam Harris. He had posted on his website in 2012 that he was currently reading it. I read it, and then two years later, he had David Deutsch on his podcast. In his preface, he said, by the way, I apologize, I haven't read your book yet. So I had gotten the book from him, then read it, and then finished it before him, I guess. But that introduced me to Popper. I had known about Dawkins beforehand, but only started reading his stuff more recently. Gödel, Escher, Bach, I'd heard about, and I still haven't read it entirely. I read like half of it. I find that a lot of the things in there are the right kinds of questions to ask, although I sometimes disagree about the answers, like in light of Popper and Deutsch. But they're always interesting. And then Thomas Sowell, I was always binge-watching Milton Friedman in college, all his videos, and so naturally came across Thomas Sowell and liked his stuff. I especially like the essay from Friedrich Hayek, “The Use of Knowledge in Society”. The argument is that knowledge is distributed across society and capitalist systems make better use of it, free market systems make better use of it. Thomas Sowell also took this knowledge-based view of different economic systems. I haven't read the entire book, I just really loved the intro to the book more than anything. The idea of saying, everyone looks at the question of communism versus socialism versus free markets through a very ideological lens, or even an economic lens, but you could take a purely knowledge-based view. I thought that's very interesting. It's a different level of abstraction.

I like to think of replacing the word epistemology, which is a mouthful, with this more pleasant-to-the-ear phrase, the knowledge-based view. I also think of it like goggles you can put on, and suddenly you're looking at your iPhone or your computer, or other people around you, or civilization. Instead of seeing the atoms, you're seeing the knowledge. You ask questions like, how did it get there? What is its capability? What kinds of things does it affect, etc. I find that quite nice. I found it practical when I was talking to people who weren't philosophy nerds. I thought, do I really want to say the word epistemology right now, at this pool party, whatever it is? And I thought I'll use the phrase “knowledge-based view”. And that felt better.

Economics and Computation (19:46)

Theo: So on the topic of Sowell and Hayek, who I think are both fantastic, love both of them. A lot of people seem to assume implicitly that once we get AGI, or ASI, whatever that means, then humanity will be in a utopian communist paradise for all eternity. The AGI in its infinite wisdom and goodness will simply decide what resources to allocate to whom. I'm somewhat skeptical, how about you? What do you think about it?

Carlos: I think one question to ask is, what fundamentally changes in a system like that? If you introduce any kind of new thing, does it change the fundamental problem? The fundamental problem in economics in that case is, if knowledge is distributed and people are creating new knowledge everywhere, are you able to predict that and then make decisions or not? If you have more powerful computers, then everyone else would also be able to do more powerful, less predictable things all throughout the economy. So if you were trying to ask how predictable the economy is: if it only consisted of a very simple system, then maybe you could conceivably model the entire thing, in terms of its atoms and everything else, and figure out what it would do. But if the entire rest of the economy is many times more complex than your ability to simulate it, then it seems like that fact hasn't changed.

Theo: That kind of reminds me of an idea from Stephen Wolfram, where he talks about computational irreducibility. Wolfram is kind of like Deutsch in that he's very into computing, computers, and taking a computational view of everything. Unlike Deutsch, he emphasizes something slightly different. So whereas David Deutsch talks about how humans are universal, right, we can do anything in theory, Wolfram talks more about how humans are computationally bounded observers, and how the world as a whole is computationally irreducible. The only way that we can understand complex systems is by observing them. They're very hard to compute.

So I wonder if the economy would be totally computationally irreducible, like the only way to allocate resources efficiently would be to let capitalism just do its thing. Is it even possible in theory to have an AI advanced enough, a computer powerful enough to model the whole thing?

Carlos: Yeah, isn't this like Newcomb's paradox, or something like that, or like Newcomb's problem? I think, like, can you predict what a human will do? And I guess the only way to do it would be to actually simulate them.

But anyway, I think that there's interesting things to say about predictability, because it might be that if you were to say that complex systems are unpredictable, you can only run them to figure out what they're going to do. You could say, well, isn't the universe such a system? But you'd be wrong in that case, because there are all sorts of regularities in nature that can be exploited, and discovered and exploited. So if you were to say, let's just, all we have to do, our only recourse if you want to understand the system is simulate all its atoms, and then run it. Actually, you'd be missing out on the fact that, let's say, they're all going to obey conservation of momentum and other things like that. And so, without running anything, I would tell you, the total amount of energy or momentum in the system will be the same into the indefinite future. I haven't done any computations, just because I know this deeper principle. So I would be, in other words, predicting something important, even if I wasn't predicting the details. So I think that there are many patterns of reality to be discovered like that. So it's not purely just run it. And we can't say anything else about it.

Theo: He does talk about pockets, reducible pockets in the computationally irreducible universe. So yeah, conservation of momentum and the laws of physics are a reducible pocket. But take something like this: is there a law that you can come up with to describe the 50th step of an arbitrary cellular automaton? Wolfram would say, no, the only way to figure it out would be by running it. So could there be a law that allows us to predict the behavior of systems like that? It's possible. I wonder what it would take to discover.
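
Theo's example of having to run an automaton to learn its 50th step can be made concrete with an elementary cellular automaton, the kind of system Wolfram studies. The following is a minimal sketch, not anything shown on the podcast: the choice of Rule 30, the grid width, the wrap-around edges, and the function names `step` and `run` are all assumptions made for illustration. It simply applies the update rule 50 times, since no shortcut formula is used here.

```python
def step(cells, rule=30):
    """Apply one update of an elementary cellular automaton.

    Each new cell depends on its left neighbor, itself, and its right
    neighbor. Edges wrap around (an assumption made to keep this short).
    """
    n = len(cells)
    new_cells = []
    for i in range(n):
        left, center, right = cells[(i - 1) % n], cells[i], cells[(i + 1) % n]
        # The three neighbor bits form a number 0..7 that selects one bit of the rule.
        index = (left << 2) | (center << 1) | right
        new_cells.append((rule >> index) & 1)
    return new_cells


def run(width=201, steps=50, rule=30):
    """Start from a single live cell in the middle and run `steps` updates."""
    cells = [0] * width
    cells[width // 2] = 1
    for _ in range(steps):
        cells = step(cells, rule)
    return cells


# The 50th row is obtained here only by computing the 49 rows before it.
row_50 = run()
print("".join("#" if c else "." for c in row_50))
```

The grid is wide enough that the pattern never reaches the edges in 50 steps, so the wrap-around does not affect the result.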

Carlos: I think on some level, it's important to recognize that different levels of abstraction exist, and in some cases, you may not care about low-level details. For example, it's impossible to predict the location of a particular atom in the indefinite future. Yet, that might be the most boring thing to ask. At a higher level, things might be quite predictable. If I said something simple, like writing a loop in Python that just adds two plus two indefinitely, that would be perfectly predictable. But if you asked me what the electrons are doing in the CPU, that might always look different. The memory locations being used might always be different. There might be all kinds of incidental complexity that you just wouldn't care about.

I'm not sure what we should take away from the idea of computational irreducibility. I'm more a fan of Deutsch's point about the unpredictability of future knowledge. That highlights the difference between some systems which are very predictable, like this two plus two equals four, and people and what they're doing. There are other systems that would be computationally irreducible, perhaps, but also very boring, like just some kind of noise production system.
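
Carlos's Python loop is easy to write down. This is only a sketch of what he describes: the generator name `add_forever` is made up, and `itertools.islice` is used so the example stops after five results instead of running indefinitely.

```python
import itertools


def add_forever():
    """Yield the result of 2 + 2 indefinitely."""
    while True:
        yield 2 + 2


# Take only the first five results so the example terminates.
first_results = list(itertools.islice(add_forever(), 5))
print(first_results)  # prints [4, 4, 4, 4, 4]
```

At this level of abstraction every output is known in advance, whatever the electrons or memory addresses happen to be doing underneath.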

Theo: Aside from these five books, Popper, Dawkins, Deutsch, Hofstadter, and Sowell, what other books do you think had a large influence on you?

Carlos: Let me look around the room here.

Theo: By the way, you should really post a list of these books on Substack or Twitter, if you haven’t already.

Carlos: I guess the best I've done so far is I just put that thing at the top of my profile. The biggest book, really, like 90% of the way I think, is The Beginning of Infinity. That introduced me to Popper and other stuff, which is fairly interesting. Everything else is helpful to some extent, and gives some ideas. Lots of things are interesting. The reason I'm doing all the research I'm doing right now is The Beginning of Infinity, very directly. Deutsch’s essay on AGI in Aeon magazine, about why Popper is relevant to AGI, is an excellent one. That kind of stuff is what really got the ball rolling for me and introduced me to this whole area. I never would have gotten interested in Popper otherwise.

Then there's just all sorts of books that I forget the names of at the C level, which are interesting in various ways for one or two ideas. One of them, for instance, was a book from Minsky called Computation: Finite and Infinite Machines. That was kind of an interesting one. It's probably outdated at this point in some ways; people have probably invented better ways to explain the fundamentals of computability and so on. But it was so interesting to grapple with the low-level details from when people were initially figuring out, what can computers do? How should we think about them?

Bayesianism (27:54)

Theo: Deutsch and Popper have one epistemology, and a rival epistemology that is favored by a lot of people on our part of Twitter, on the internet at large, and in the AI community is the rationalism developed by Eliezer Yudkowsky. Rationalism, which Yudkowsky explained on his website LessWrong, heavily emphasizes Bayes’ theorem as essentially the central mechanism of knowledge. Basically, in rationalism, you have evidence from the outside world that causes you to update your prior probabilities in one direction or the other; everything is very mathy. That seems like it's in opposition to a lot of Deutsch and Popper's ideas. Deutsch has written about this. He wrote an article called “The simple refutation of the Bayesian philosophy of science” where he kind of destroys the whole idea, I think. So do you think that there are any ideas or any areas where Bayes’ theorem is applicable?
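
For readers who have not seen the kind of update Theo is describing, here is a small sketch of Bayes' theorem applied to a single hypothesis. The function name `bayes_update` and all of the numbers are invented for illustration; nothing here comes from the podcast or from LessWrong.

```python
def bayes_update(prior, likelihood_if_true, likelihood_if_false):
    """Return P(hypothesis | evidence) using Bayes' theorem.

    prior               -- P(hypothesis) before seeing the evidence
    likelihood_if_true  -- P(evidence | hypothesis is true)
    likelihood_if_false -- P(evidence | hypothesis is false)
    """
    # Total probability of seeing the evidence under either possibility.
    evidence = likelihood_if_true * prior + likelihood_if_false * (1 - prior)
    return likelihood_if_true * prior / evidence


# Hypothetical numbers: a 30% prior, and evidence that is four times
# more likely if the hypothesis is true than if it is false.
posterior = bayes_update(prior=0.3, likelihood_if_true=0.8, likelihood_if_false=0.2)
print(round(posterior, 3))  # prints 0.632
```

The formula only redistributes probability among hypotheses that have already been stated and assigned numbers; where those hypotheses and numbers come from is a separate question.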

Carlos: I'm not really an expert on it. So maybe there is value there that I'm missing, both in like the formal ideas and in the ideas of the broader community and so on. But for me, I'm mostly just interested in how the mind works. I think the fundamental questions there are like, Popper's questions, like, how do you come up with new ideas? And then how do you select among them?

I feel like whenever I tell people about, or talk about, the fundamental ideas of variation and selection, there's never any pushback. I think they're just simply logically obvious. If you're interested in knowledge creation, first of all, you have some things that exist right now. And if you need to have a new idea, it's going to be produced from existing ones in some form; non-existent things can't just come out of nowhere. So you take what you do have now and reassemble it somehow to come up with a new thing. And you can do that in better and worse ways. So that's just one thing that's like a logical necessity. If you want something better, you have to come up with something new. And you have to have some process that can come up with new things. And then secondly, if you have a bunch of new things you're coming up with all the time, eventually you run out of resources to explore them unless you start removing the ones that aren't good, so that you can focus on improving the ones that are. Keeping prior improvements is necessary if you want to have cumulative progress. You need these variation and selection elements. Then it's just a matter of saying there are better and worse ways of doing these things. That's where the discussion then goes. In the Bayesian case, you might ask, do we have a better way of creating new ideas or a better way of selecting among ideas? Those would be my main questions.
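
The variation and selection loop Carlos describes can be sketched in a few lines. This is a toy under strong assumptions, not his model of the mind: the target string, the `fitness` function, and the single-character mutation in `vary` are all invented for illustration, and having a fixed, known target is exactly what open-ended knowledge creation lacks. The sketch also counts how many variants turned out to be improvements, the quantity he raised earlier when discussing how efficiently a system navigates its search space.

```python
import random

ALPHABET = "abcdefghijklmnopqrstuvwxyz "
TARGET = "conjecture and refutation"  # hypothetical goal, known in advance


def fitness(candidate, target):
    """Count the positions where the candidate matches the target."""
    return sum(a == b for a, b in zip(candidate, target))


def vary(candidate, alphabet, rng):
    """Variation: build a new candidate from the existing one by changing one character."""
    i = rng.randrange(len(candidate))
    return candidate[:i] + rng.choice(alphabet) + candidate[i + 1:]


def evolve(target, alphabet, generations=2000, seed=0):
    """Selection: keep a variant only if it is better, so improvements accumulate."""
    rng = random.Random(seed)
    current = "".join(rng.choice(alphabet) for _ in target)
    improvements = 0
    for _ in range(generations):
        variant = vary(current, alphabet, rng)
        if fitness(variant, target) > fitness(current, target):
            current = variant  # the worse candidate is discarded
            improvements += 1
    return current, improvements


best, improvements = evolve(TARGET, ALPHABET)
print(best)
print("fraction of variants that were improvements:", improvements / 2000)
```

Because the variation here is blind, most variants are discarded, which illustrates the point that producing new things is easy while producing better things is the hard part.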

Theo: Do you think that there's any room to say, "I find that this theory or this trait has a 70% chance of being correct, and this one has a 30% chance of being correct. So until we get more evidence, we should favor theory A." Is there any room for numbers and probabilities? Or is the process of selecting better theories purely ordinal, and not cardinal?

Carlos: What's the difference there?

Theo: Ordinal meaning you can only compare which one's better and which one's worse. Cardinal meaning you can assign specific numbers, values. So you can say, this theory is 90% correct, or 90% likely to be correct based on my priors. And this theory is 70% likely to be correct.

Carlos: I guess the short answer is if it helps you out, use it. But that's a practical thing, rather than saying there's some fundamental importance to their probabilities. This is not an answer to your question, but it just reminds me of the price system. There's a value in the fact that there's all kinds of knowledge that is relevant to a given product, but not expressed in the price. The price is a simple number. And thank goodness for that, because it makes it easy to communicate what to do with that object, whatever the product is, in a way that is sensitive to lots of other knowledge, but you don't have to have all that knowledge explicitly represented to make use of it.

So to the extent that you can do useful things like that, then fine. I don't know if there's any fundamental importance to probabilities and so on for epistemology, though. Because again, to me, I guess, maybe throw the question back to you and see if you have any thoughts on it, do they help either in the creation of new theories or in the selecting between theories? I suppose you could say, if you prefer one theory, you could give it a higher number. But to me, I suspect that there are independent reasons that cause those numbers to be what they are. So like, I will first say to you, this theory has these problems, and seems to be consistent with these things. So that seems to be against that theory. I'd say this theory doesn't have those problems, maybe has different problems. And because of that discussion, I might then say, if I had to put it to a vote now, I prefer this one. But you know, they both have their problems. So because I'm going to prefer this one, I'll give that 60%, I'll give this other one 40% or whatever. But you see the order in which those things were arrived at. First, the ideas, then the problems with those theories, then I assign some number for a practical purpose. That's how I'd be thinking about that. That's not using the actual probability calculus or anything.

Theo: I think that a lot of the problems in Bayesian rationalism come from taking their probabilities too seriously, because really what they are is wild guesses based on some amount of knowledge and prior probabilities. For example, in the whole LK-99 superconductor saga, they had their prediction markets, and people were really, really trusting these prediction markets as almost like arbiters of truth. You know, they would decide whether or not the superconductor was real based on prediction markets that would swing wildly. If you remember, one day they were at like 20% and then the next day they were up past 50%, 60%. People were saying, oh, my prior or my probability that the superconductor is real is 99%, and now they're back down to 5 or 10%. So yeah, I think that the usefulness of Bayesianism is limited by the information that you have.

AI Doom and Optimism (34:21)

Theo: On the topic of AGI, you seem to be an optimist when it comes to the conventional question of like, will AI kill us all? So can you explain why you believe that it won't?

Carlos: I like one of Deutsch's points on this, that the battle between good and bad ideas will rage on, no matter the hardware it's running on. The real thing is about the quality of our ideas. And I think we should have reason to hope that the culture we've already developed over several hundred years through the Enlightenment has been largely about how to coexist peacefully.

When you ask the question, what does it take to coexist safely with other created beings? That's what we've been working on with our whole project of democracy and so on. In a way, it's sort of naive to imagine that we will create beings as good as us, with the same capabilities as us or greater ones in terms of their hardware, and then we will somehow deal with that in some kind of technological way, rather than in a cultural way. So I think that our greatest resource in terms of safety, dealing with other people, is our current institutions and culture, which make me not want to murder you, for instance. It’s not a great way to solve my problems.

That's a different question from algorithms run amok that aren't universal like us. Their whole problem is that they're too stupid in the relevant ways to have the right moral knowledge and so on. But if it's universal, then it should have the same capability that we do to not only learn technological things, but moral things.

Theo: So that kind of runs into an issue that's brought up by Eliezer Yudkowsky and originally developed by Nick Bostrom called the orthogonality thesis. For the audience who may not know, the thesis basically states that the intelligence of a system and the morality of a system are totally unrelated, not just the morality, but the goals of a system. For example, it's possible to create an incredibly intelligent AI system, according to Yudkowsky and Bostrom, that wants nothing more than to make as many paper clips as possible. So, do you think that is likely or not? And why?

Carlos: One question you can ask in such a case is this: What role does the idea of maximizing paper clips play in this larger AGI system? We often imagine that it forms the core, this immutable core, which drives everything else. Everything else is subsidiary to that one idea. If it is nice to you one day, that is because in its calculations, that will be better for achieving this ultimate goal, which never changes.

That's one picture of the role an idea can play in a mind, that it's immutable, central, everything else is subservient to that. A different picture is more of an ecosystem view where, if you drew a circle and your mind was the circle, lots of ideas are vying for power within that system, but there's nothing immutable there. If one day the idea arises in your head that you should create paper clips, that will never be instantiated in your brain so that everything else is subservient to that. Or if it does, maybe that would be what Deutsch calls an anti-rational meme, where it's somehow evolved to have that property. But throughout history, it's been quite hard to produce an idea that sticks in people's brains and makes everything else subservient to that.

These are two fundamentally different pictures of what minds are. In one case, you almost have this image of traditional neural network and machine learning algorithms where it's like an optimization loop, which is fixed on the outside. It says, here's the thing we're trying to do. This is our ultimate goal. Everything else is going to be in service of that. There's nothing in the system that can change that external goal, except an external program. There's nothing inside the system that can do it. Whereas in human minds, it's like this ecosystem where many ideas are vying for impact and power, and none of them has a monopoly on it. There is no outer loop. And in a system that's more of this ecosystem type of view, where everything is equal and bouncing around and trying to affect everything else, there is no outer loop. You have escaped that infinite regress.

Theo: Another thesis brought up by Yudkowsky and Bostrom is the idea of instrumental convergence. They believe that no matter what final goal you give an AI, like if you tell it to cure cancer, or if you tell it to make paperclips, it will converge on the same instrumental strategies, like preserving itself, trying to acquire resources, trying to enhance its intelligence, and trying to defend against people shutting it off or changing its goals. With AI systems that are stronger than humans, supposedly that can be very, very dangerous. So what do you think about the instrumental convergence idea?

Carlos: The idea is that it converges on certain things, but it diverges on others, right?

Theo: Basically, no matter what final goal you give an AI, it will converge on instrumental goals that involve preserving itself and preserving its final goal at all costs.

Carlos: I mean, you can't do anything if you can't survive long enough to do it. There are certain requirements that physical reality imposes on you, like if you want to do anything big, you need a certain amount of energy and so on. If the system doesn't realize that fact, then so much the worse for that, and it won't achieve those goals.

There are certain things, I guess you would say in that case, that are predictable prerequisites for achieving many things. So I imagine if it didn't have universal computers, you'd say, well, probably if it's going to be successful, at least it'll have to invent universal computers, because we know how important those are. That's less a statement about any kind of AGIs or anything in particular, just more about the causal relationships between different kinds of technologies and so on, which anything would have to adhere to.

Theo: Another crux of the AI doom argument is that AI systems will become vastly more powerful than humans, or even slightly more powerful than humans, and they'll somehow be able to exploit that asymmetry in order to end up killing all of us. So you've talked about humans being universal, and we can't be replaced by AI systems. Gwern Branwen, an internet writer, wrote a long essay called “Complexity No Bar to AI”, where he argues that even if humans and AIs are both universal in theory, the AIs will run on such better hardware or be able to implement such better algorithms that they would be able to become inconceivably powerful anyway, even if we're both theoretically in the same computability class. So what do you think about that?

Carlos: I think there are different kinds of universality. One is computational universality, but I think there are a few others which I have in another video. We don't literally have complexity classes in terms of what our thinking is about, because we don't have simple algorithms which we're executing on different kinds of inputs. But the core concern of complexity theory about the resources used for computations, that is absolutely essential in our case. It just isn't literally about complexity classes.

So, I think the answer to those cases is how efficiently do you navigate the space of ideas? And that consists primarily of two questions. How efficiently do you come up with new ideas that are actually good? And then how efficiently can you filter the good ideas from the bad ones? I think there are forms of universality associated with each of those.

So, if I said to you, I have a method for generating new ideas. I have some ideas here. I have some ways in which I can combine them. For instance, a genome consists of letters, and you can flip the letters. And I told you, all you can do to create a new mutation is to flip one letter. That's a certain way of navigating the space of genomes. But if that was the only way you could ever use, then it would be the equivalent of only being able to navigate the planet by taking one step at a time on your feet. No planes, no boats, no parasailing or anything like that, just your feet. And that would be quite a slow way to navigate the space of ideas.

Whereas we can continually invent new ideas, new ways of combining them that are the equivalent, if you're thinking of navigating the globe, of taking a plane someplace. My favorite example of this is a video called Pakistan Goes Metal, which in a sense combines only two things, traditional Pakistani music and metal music. This is not a low-level combination. This is a combination of two very high-level concepts that makes it seem very easy. And yet when you combine these two things, there are many lower-level details below them. But at this level of abstraction, the combination is very simple.

There was a time before Pakistan existed, a time before metal music existed, and those concepts had to be invented. And now that we have them, they allow us to form very simple but very powerful combinations. And I think of that as saying, basically, we have invented new ways of combining things. If you thought only in terms of notes, it would take thousands of things to express this Pakistan Goes Metal thing. Or rather, to express this idea of Pakistan plus metal purely in terms of notes would be very complicated. If you have a look at the MIDI file, it would be hundreds of different things that you have to tweak.

So in other words, there's a simple way of combining new things and that is essential for us to actually be able to efficiently create new ideas that are better. And there's analogous things for our ability to select among options and tell which ones are good. But yeah, the bottom line of all that is if you can't invent new ways like this Pakistan Goes Metal, new concepts, new ways of generating new ideas, then you're going to be hopelessly inefficient. You're going to be like the person trying to navigate the planet on their feet rather than with planes.
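Editor's sketch: the contrast between low-level and high-level ways of varying an idea can be made concrete with a toy melody. This is only an illustration under stated assumptions, not anything from the video Carlos mentions: the melody, the helper names (`flip_one_note`, `transpose`, `note_level_edits`) and the choice of transposition as the "high-level" operation are all hypothetical. A melody is a list of MIDI pitch numbers, and the cost of a change is counted as the number of individual notes that differ.

```python
def note_level_edits(before, after):
    """Count positions where two equal-length melodies differ."""
    if len(before) != len(after):
        raise ValueError("melodies must have the same length")
    return sum(1 for x, y in zip(before, after) if x != y)


def flip_one_note(melody, index, new_pitch):
    """Low-level variation: change exactly one note (like flipping one letter)."""
    changed = list(melody)
    changed[index] = new_pitch
    return changed


def transpose(melody, semitones):
    """High-level variation: one conceptual step that touches every note."""
    return [pitch + semitones for pitch in melody]


# Hypothetical eight-note melody as MIDI pitch numbers (60 is middle C).
melody = [60, 62, 64, 65, 67, 65, 64, 62]

one_flip = flip_one_note(melody, 0, 61)
one_octave_up = transpose(melody, 12)

# One low-level step changes one note; one high-level step changes all of them.
low_level_cost_of_flip = note_level_edits(melody, one_flip)
low_level_cost_of_transpose = note_level_edits(melody, one_octave_up)

print(low_level_cost_of_flip, low_level_cost_of_transpose)
```

A searcher that can only flip one note at a time needs as many steps as there are notes to reach the transposed melody, while a searcher that has the concept of transposition gets there in one. Combining two whole musical traditions is a far richer operation than transposition, so this only gestures at the size of the gap.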

Will More Compute Produce AGI? (46:04)

Theo: So back to the current paradigm of AI, where we have neural networks with tremendous amounts of data and tremendous amounts of compute. Do you think that there's a possibility that simply adding more will lead to conjecture and universality, in the same way that evolution, which is kind of a dumb process, you know, there is nothing really intelligent guiding it other than navigating the search space, led to human universality? So do you think that there's some point at which we add just enough compute so that AIs will become universal?

Carlos: I guess there's like two questions. I guess this isn't quite your question, but if you add more computational power to a system which is like less efficient, like exponentially less efficient, then it doesn't really matter how much more power you add to a system. So that's more this computational class type idea where it's like if you have a really bad algorithm for doing something, it doesn't really matter what constant you add in front. But that's, I think, not what you're asking.

I guess your question is more about almost like that instrumental convergence idea. It's like, well, if you have evolution doing its thing over here, you have machine learning doing its thing over here, and one of them ended up discovering a path to us, to this universal kind of algorithm, would the other do the same as well? Is this somewhere along the path of doing really anything impressive?

I don't know. Maybe that's more a question for people designing objective functions and so on. I suspect that the set of possible algorithms is so large that actually converging into things that we do is not so easy. But like you said, evolution did it, so maybe there's a way of getting machine learning to do it, too. Although we might not know why it's doing it at the time.

Theo: That's true, yeah. And clearly, AIs are able to do some things that seem impressive that we're not yet able to explain, like writing code and writing poetry and writing math. And of course, nothing that it's done yet is close to the level of the best humans. But it's been able to do some pretty impressive things just from having the amount of data and compute that it does.

Carlos: Yeah. So one thing that gets asked on Twitter a lot is, can machine learning algorithms do X, or can they explain things, or whatever? I think that's, again, if I think about my two fundamental questions about universality and efficiency, what can the system do in principle, given infinite resources, and then how efficiently can it do things? Those are the two questions that I think about. It may then be the case that a machine learning algorithm could do anything, given the right amount of time. If it's exploring the space of programs, and it doesn’t have any obvious limit to which programs it can come up with, then it can theoretically generate any kind of program. The challenge then becomes, can it come up with the right ones efficiently? For instance, what would it take for a machine learning system to come up with general relativity? This is a very specific computation, a particular idea. In the space of all ideas, it's a needle in a haystack. So how do you find that needle?

A machine learning system could do it if it can find any program. A random program generator can generate any program, so it could generate relativity. However, finding that needle in the haystack by accident is unlikely. So then the question is, how long would it take for the machine learning system to do that? And would it even recognize that it had found the right answer if it stumbled across it?

Theo: So I guess the difference is, certain people, like the people at OpenAI, Sam Altman for example, once famously said “gradient descent can do it”. He believes that simply adding more and more computing power and giving the model more and more knowledge will eventually cause it to either awaken or simply to know so much that it approximates a universal being like a human. While you think that, no, it doesn't matter. The search space is just too big. You can't put all of it into a model. It needs to be able to explore it by itself. Like the baby vs. the data center.

Carlos: Gradients are an attempt to solve the same efficiency problem. When you're navigating a huge space, you could try to train a modern neural network via evolutionary methods. You could say, here's all the weights. Let's try a small perturbation of the weights, see which ones are better, choose those, move on to the next. But the problem is that's vastly more computationally intense than using gradients, which tell you exactly where to go to get the next improvement. So the efficiency is still the relevant thing. But the limiting factor is like, can you get a gradient for your situation? In all these cases, you're trying to assemble a mathematical system, which is similar enough to the actual thing you care about, but also has the property that you have the full mathematics of how it works and can therefore navigate it. But that's not a given.
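Editor's sketch: the efficiency gap Carlos describes between perturbing weights at random and following a gradient shows up even on a toy problem. Everything here is an assumption chosen for illustration: the loss is a simple sum of squares rather than a neural network, the "evolutionary" method is a bare accept-if-better random perturbation, and the dimension, step count, learning rate and noise scale are arbitrary.

```python
import random


def loss(weights):
    """Toy loss: sum of squares, minimised when every weight is zero."""
    return sum(w * w for w in weights)


def gradient(weights):
    """Exact gradient of the toy loss."""
    return [2.0 * w for w in weights]


def gradient_descent(weights, steps, learning_rate=0.1):
    """Move directly downhill using the known gradient."""
    w = list(weights)
    for _ in range(steps):
        g = gradient(w)
        w = [wi - learning_rate * gi for wi, gi in zip(w, g)]
    return w


def random_perturbation_search(weights, steps, sigma=0.1, seed=0):
    """Try a small random change each step and keep it only if the loss improves."""
    rng = random.Random(seed)
    w = list(weights)
    best = loss(w)
    for _ in range(steps):
        candidate = [wi + rng.gauss(0.0, sigma) for wi in w]
        candidate_loss = loss(candidate)
        if candidate_loss < best:
            w, best = candidate, candidate_loss
    return w


start = [1.0] * 50  # 50 weights, all starting at 1.0
steps = 50

gd_loss = loss(gradient_descent(start, steps))
rp_loss = loss(random_perturbation_search(start, steps))

print(f"gradient descent loss:    {gd_loss:.3e}")
print(f"random perturbation loss: {rp_loss:.3e}")
```

With the same number of update steps, the gradient method shrinks every weight by a fixed factor each step, while the random search only keeps the occasional lucky perturbation. The comparison depends on having an exact gradient available, which is the point Carlos raises: for many problems no such gradient is given.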

Theo: Naval Ravikant agreed with this a couple of years ago. He wrote an article called “More Compute Power Doesn't Produce AGI”. And someone responded to him recently saying, wow, this Naval guy really missed the mark on AI. And Naval said, “We don't have AGI yet, but GPT-4 has definitely caused me to update my priors. So consider that piece obsolete while we all learn more”. So, have your ideas on this changed at all after GPT-4 or GPT-3.5, or even ChatGPT? Or are you still of the opinion that compute can't do it without algorithms?

Carlos: I think this goes back to the point I made before. This is where the analogy with complexity theory makes more sense, where adding a constant factor doesn't change the scaling factor. Given the size of the search space we're dealing with, the space of all possible ideas, it's like an infinite space. So navigating that efficiently is hard. If you have a machine that can do things a million times faster, that's almost no help if it doesn't have a fundamentally good way of navigating that space. It's like saying, I have a system which can search every grain of sand individually to find something buried someplace. Whereas actually you would want a high-level theory that could tell you, it couldn't possibly be here, it couldn't possibly be there. So with the right kinds of ideas, you can eliminate infinite swaths of the search space without ever checking them individually. So there are far better or worse ways to navigate the space of ideas. And so it's really important that you have that.
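Editor's sketch: the "grain of sand" point can be put in numbers with a toy search problem. The setup is an assumption made for illustration: the hidden item is an integer in a range of size 2^n, the exhaustive searcher checks candidates one by one, and the "theory-guided" searcher is plain bisection, which rules out half of the remaining range with each check. The function names are hypothetical.

```python
import math


def exhaustive_search(target, size):
    """Check every candidate in turn; returns (found, number_of_checks)."""
    checks = 0
    for candidate in range(size):
        checks += 1
        if candidate == target:
            return candidate, checks
    raise ValueError("target not in range")


def bisection_search(target, size):
    """Rule out half of the remaining range per check; returns (found, checks)."""
    lo, hi = 0, size  # invariant: target lies in [lo, hi)
    checks = 0
    while hi - lo > 1:
        mid = (lo + hi) // 2
        checks += 1
        if target < mid:
            hi = mid
        else:
            lo = mid
    return lo, checks


n_bits = 16
size = 2 ** n_bits
worst_case_target = size - 1

_, exhaustive_checks = exhaustive_search(worst_case_target, size)
_, bisection_checks = bisection_search(worst_case_target, size)

# A machine a million times faster lets exhaustive search cover only
# log2(1_000_000) more bits of search space in the same amount of time.
extra_bits_from_speedup = math.log2(1_000_000)

print(exhaustive_checks, bisection_checks, round(extra_bits_from_speedup, 2))
```

Each extra bit doubles the work for exhaustive search but adds a single check for bisection, so a fixed speedup of any size is absorbed after a few more bits. Real idea spaces do not come with a ready-made ordering to bisect; the sketch only shows why a method that eliminates regions wholesale scales differently from one that visits every point.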

But more broadly, I'm curious not so much about particular systems, I'm curious about the full spectrum of knowledge-creating systems. I like to think of it as comparative epistemology. So if I'm asking these questions about universality and efficiency, about variation and selection, these are the universal questions which apply both to genetic evolution, to any given algorithm, to human minds, to animal minds, to everything. So any given point in that spectrum, I'm interested in to shed light on the rest of the spectrum. Because there are things with GPT, which I would say like, it's not that I think it's AGI, but I do think it shows you another interesting point on the spectrum of knowledge-creating systems that didn't exist before. So that's what makes me interested in it, as opposed to saying this will be, this either is or will be an early successor or an ancestor of an AGI system.

AI Alignment and Interpretability (54:11)

Theo: Speaking of GPT-4, you talk about looking into different knowledge-creating systems and how that's very interesting. So one such thing where people are looking into knowledge-creating systems is mechanistic interpretability, where AI researchers are looking into the weights and biases of neural networks like GPT-4 and seeing if they can figure out what internal algorithms, internal circuitry it uses to do stuff like adding numbers or deciding what words to use for poetry or whatever it does in there. So do you think that mechanistic interpretability is interesting and/or useful?

Carlos: I find mechanistic interpretability intriguing. I haven't delved too deep into it, but I like the idea that something like GPT-4 is this large computational system and that the way it evolves is such that certain kinds of computations can arise within it, such as general algorithms for adding or doing different things like that. I like the idea that it shows that, for potentially a variety of tasks, there are certain computations that can be identified and understood. Arbitrary programs can arise in the system, and it's hard to predict which kinds of programs will arise. The question becomes, what kinds of things can arise within the system? What process can give rise to that and how efficiently? This comes back again to the idea that maybe it can discover relativity, but what would it take to do that? How would it distinguish that theory from all the alternative things that it might have come across along the way? Why would it then decide on relativity? I would be surprised to find general relativity within that system. I wouldn't say that it's impossible. It would be interesting to look, and my question would be about what it is about the human way of thinking that allows us to converge on something like general relativity. It's probably very unlikely for something like GPT. It's not that it's impossible exactly. It's just that I don't think it would be able to distinguish between all the possibilities and settle on that one. Because E = mc^2 is very similar to E = mc^2.001.

Theo: But one is right and the other is wrong.

Carlos: I don't know enough physics to go into the derivations of these things, but I would assume there's a pretty fundamental difference between those things. One is a very empirical thing. If you put enough zeros on it, you'd still be asking why is there a 0.001? Because if you think about it, it's a different situation where you have areas. I can understand very clearly why there would be a two there. The 2.001, it doesn't quite square with our understanding that a length times another length gives you the answer.

Theo: It is pretty cool how many easily human understandable mathematical constructs there are in reality. Like what you just mentioned, the area calculation or the area of a circle, pi is not really human understandable. It's an irrational number that goes on forever. But the squared part is, the R part is.

Anyway, you wrote a tweet where you said, "Don't treat digital or biological people like tools, that's slavery," something along those lines. So with the idea of AI alignment, that's becoming more and more popular, where depending on who you ask, it's either about making AI systems do what we want or making AI systems do things that are safe. What do you think about the field of alignment?

Carlos: I go with David Deutsch on the idea that there's a fundamental difference between AIs on the one hand and people on the other. If the thing is a person, then it's a person. The hardware doesn't matter. All the same rights and privileges apply that you and I have. If it's not a person, it's a tool, then there's almost no ethical concerns at all about its wellbeing or something like that. It's just a matter of, does it hurt other people? That's the only ethical matter.

So if you say like, I have created some kind of weapon system or something, then I'll be very curious to ask, is it gonna kill me? That's what I care about. But if it's a human, then the more important concern is, are you treating it right? The idea of trying to control its mind in some way other than via persuasion and ordinary argument, that's where you enter Orwellian territory. Orwell doesn't do a whole lot of actual neurosurgery, but you get the idea.

Theo: Do you think that there are AGI risks in the future, risks to creating digital people that would be as capable or more capable than biological people?

Carlos: If they are fundamentally the same as us in terms of the way in which they deal with ideas, it's just like the same program as you and me, just I guess basically saying, what if I took your brain and scanned it and then just ran it on fast hardware? What would happen then? Would we all be subject to your wishes because you had such an advantage over us? How much of an advantage is that, et cetera, et cetera. It's a bit analogous to the question of what happens if some country gets nukes or gains some kind of advantage over us? I'm not that well-practiced on geopolitical questions of that sort. So I'm not sure what the best strategy would be.

Theo: You're saying kind of that the world is robust enough so that giving one person, one entity, a big advantage in one area wouldn't just brick the universe.

Carlos: It's an open question. If we're supposing that we had a system that's basically you, but running on faster hardware, there's an open question. First of all, whether or not you would actually have faster hardware, like if we ran your brain on current CPUs or something, on the best current chips, would it actually be better? It's not obvious that it would be, but suppose it was, how much advantage would that be? And that's an open question. Maybe thirdly, what if there are other competitors to you that also have good hardware? And then again, in that case, we'd be running into this idea that David Deutsch raised as well, that the battle between good and bad ideas will continue regardless of the hardware it's running on. So then we'd be asking more perhaps about what are the different ideas of all these different fast-running AGIs? Because presumably they won't all agree. So we might be asking a question like that. I suppose that's what some other countries in World War II would have been asking about the Western world: will they join the war? Will they not join the war? And their fate might be decided by our decisions, but I don't know. It's hard to understand what advantage it would be to take your brain right now and run it on today's or tomorrow's hardware, and how well dispersed that would be, or whether or not we'd have the Neuralink sort of setup making us faster.

Theo: So one of your influences is Douglas Hofstadter.

Carlos: Sure. It's quite a lot less than David Deutsch, but I do find him interesting.

Theo: Hofstadter recently had an interview, which you may have seen where, over the last few decades, he's been highly skeptical of AI capabilities. He was of the mind that the human mind is really complex and it'd be very hard to write a computer program that could replicate the human mind. But in this recent interview, he was really freaked out about the progress in AI. He talked about being terrified. He thinks about this practically all day, every day. He feels an inferiority complex to future AI. Like humanity will soon just be utterly surpassed and all of his beliefs prior were wrong. So clearly David Deutsch still has a cooler head about this. And I think you agree with Deutsch over Hofstadter. So why do you think that Hofstadter, who's been highly attuned to computing for decades, just suddenly switched?

Carlos: I don't know. I haven't read his piece, so it'd be hard to say. But I think one thing that I just have in mind is that it's generally underappreciated how hard it is to make progress in reality. The search space of ideas is this infinite space, and good solutions like general relativity within it are so stupendously rare. By the way, there aren't simple gradients you can follow to get to them. If you were to think about the topography of some search space, general relativity would be this Eiffel Tower, and everything around it would be just flat desert. A slight variant of that theory doesn't work at all. So to find these things is a stupendous achievement that our minds are capable of. And it seems like nothing else is so far.

And so when I see that things are impressive in some ways, like maybe automating things that we've already done, maybe putting together ideas that we already have to some degree, those things can be impressive. But set against the enormity of the actual search space of ideas, and the actual search processes people routinely go through to define better neural networks and better engineering solutions, I just find that when you start to see how many choices are involved in actually doing a good design of something, how much knowledge is involved, how many options we face all along the way, a lot of these systems start to seem much weaker.

So I think my starting picture is that the space of possible ideas is vast and most systems aren't universal. They're finite and have no means of extending their capability. So there's an infinite difference between finite and infinite. So that's one thing. And then secondly, given the large search space, you really have to have just incredible mechanisms for efficiently navigating it. And I think most things just don't have what it takes. And so unless somebody appreciates both of those facts, I just don't think that they're really hitting the important issues.

And by the way, if something does do those things, that's not any reason to be worried about one's own consciousness. I mean, if something kills you, that's not good, but if something were to have those same properties that our minds already have, they would become equally as good as us in terms of their software. They may have better hardware, but then again, if there's better hardware for them, why not better hardware for you very soon? In which case you would then resume your status as equals.

Mind Uploading (1:05:44)

Theo: So on the topic of better hardware for humans, is it possible that there's no way for human minds to run on a computational substrate without killing you in the upload process?

Carlos: I think that's a question for optics engineers. I would not bet against somebody in the future discovering a way of scanning you without hurting you. But even if they did, if they could just scan all your atoms in one go, and then, let's say, they had very good technology for also replacing your atoms, then they could be scanning you and then replacing all your atoms a million times a second. I don't know if that's physically impossible. And it may be that we actually care about virtual reality more. It could be that's the reality that we actually want to live in, where everything is fully designed. In which case, I would say, yeah, I don't really care about these atoms. Destroy them, because I'm going to be living in virtual reality. By the way, I think the virtual reality people think about now is, of course, goggles and so on. But I think about virtual reality as being like everything you currently experience, but maybe 1,000 times better, a 10-dimensional space you can exist in. Whatever you like about the present reality, I don't think any of that is withheld from you by sufficiently good designers in the future.

Theo: Yeah, I think of things like the Apple Vision Pro as kind of like v0.0.1 virtual reality. And we are so far from coming out with even a v0.1, a Neuralink-type thing that actually works for very basic tasks for humans. But we'll see. Never bet against progress. Never bet against the future. It comes faster than you think. But what I meant by my question with uploading minds is, let's say you want to upload your brain to run on a computational substrate rather than a biological one.

Carlos: I should say, they're both computational substrates. One just happens to be built differently.

Theo: Right, yeah, a silicon-based substrate, for example. Is it possible that somehow consciousness is inherent to your actual biological neurons and that in the process of moving your synaptic connections from one substrate to another, you would die, subjectively stop experiencing consciousness, and that a copy of you would be in the uploaded form?

Carlos: I don't know that we have any good theories in terms of selfhood and these kinds of things. So, I don't think about consciousness at all. I think only about computation and capability. You have all this information in your brain, it has all this ability to cause other things, and then the new system would have all those same properties. That's the thing that leaps out at me as being the most important thing. As far as consciousness and selfhood, I guess I'll leave that to others. I sort of assume, like Deutsch does, that what matters is that you have particular neurons doing particular things. In other words, if you look at the brain functionally, that seems to be the most important perspective on it. In terms of consciousness and other things, I don't know.

Theo: A lot of people with this question in particular tend to just go on total gut intuition while we just have no explanation, no theories for any of it. People will say, when you go to sleep and you wake up, that's a discontinuity in your consciousness, but you don't die. But if you were to teleport your body, as in if you were to have your atoms disassembled, the information sent to a 3D printer and your body reprinted, that would kill you and create a copy. So say most people, some people. Or if you were to upload your mind, then that would kill you and create a copy. Or, there's the idea of the Moravec transfer, where instead of just destroying your brain and sending the data to a computer in one go, you basically have tiny nanobots going through your brain and one by one swapping out your biological neurons with silicon ones or whatever other substrate we find better.

Carlos: It just doesn't seem like there's any really fundamental difference between saying, what if I zap all your molecules out of existence right now, and I rebuild them exactly where they were, but one molecule to the left. Or, in the other case, I just swap out every molecule of your body for an identical, but different molecule. I just do them all in order. But also, I finish the whole thing in a nanosecond.

Theo: So you're saying that consciousness is more about the pattern than it is about the actual specific atoms or molecules that make up your body?

Carlos: Well, like I said, I don't know anything about consciousness, but if we are just talking about intuitions and so on, it might seem bad to be like, we'll just destroy your whole body and then reproduce another one, and now a second later. But then again, it doesn't seem like there's any fundamental difference between that and just doing them all one by one, like very quickly. And certainly both of these things would be functionally the same. I guess nobody really disputes that. They're saying, okay, well, yeah, if you put me in the computer, it'll say it's me, but it won't be me. That's the usual reply.

Theo: I guess there's only one way to find out, unless we come up with a good enough theory of consciousness, but I wonder if it's even possible to come up with a good enough theory of consciousness before we have nanotech level technology that can upload our minds into computers.

Carlos: Well, I think that probably it might not help at all with the question of consciousness. So it's like, we already exist here. If our actual existence right now isn't very helpful to this question, it's not clear that some other technology would be helpful either. But yeah, I think maybe it's the wrong kind of question. Like if you were to ask about different species, for instance, basically if we're talking about discrete differences versus continuous, then we'd be comparing, or rather, the question of whether this copy of you would be you may be the wrong sort of question in the same way that asking if two species that existed at different times on the tree of life are fundamentally different or something. They're linked by small, gradual changes throughout. So at no point was there some grand leap from this to that. And yet they're wildly different.

Carlos’ 6 Questions on AGI (1:12:47)

Theo: So, about three years ago, you wrote an article called "A Few Questions on AGI," where you talked about six questions that you had about AGI and related topics. And I'd love to ask you each of those questions now and revisit them and see what progress you've made intellectually three years later.

1. What are the limits of biological evolution? (1:13:06)

Theo: So question number one is, what are the limits of biological evolution?

Carlos: Yes, I guess I would just say that I've framed that more generally, perhaps, in terms of universality. So just asking more generally, given any system, what can it do and what can it never do, despite any question of resources? And I think certain limitations with biological systems are to do with the fact that they have to obey certain kinds of constraints in every generation. So you have to be able to get yourself copied or reproduction has to happen in every generation. Whereas with human ideas, if you think about the difference between Newton and Einstein, Einstein had to create many theories that were nowhere near as good as Newton's to begin with. They were no better at prediction. Only his last published version was as good as Newton's. Newton's theory was good in terms of prediction. So if you think about a graph of fitness, you'd say that when you go from Newton, which had high fitness, to Einstein, Einstein did a lot of things that were terrible for years. Then eventually, he did something better. But there was this large gap where he tried lots of things and made many improvements. They just weren't improvements in terms of predictive accuracy. He had to invent his own new criteria to improve that. Eventually, he did better at predictive accuracy. That's something that evolution couldn't really do because at every generation you would have to be better to survive according to that one criterion. So these are the gaps that evolution can't cross. That's one kind of limitation that it faces.
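
As a rough illustration of the "gap" Carlos describes, here is a minimal Python sketch. The fitness values, the function names, and the mapping of the two peaks onto "Newton" and "Einstein" are all assumptions made up for this example, not anything from the conversation. It compares a climber that must improve at every single step with a search that is allowed to get worse for a while.

```python
# Hypothetical 1-D fitness landscape: a lower peak at index 2 (think "Newton")
# and a higher peak at index 8 ("Einstein"), separated by a valley.
FITNESS = [1, 3, 5, 3, 2, 1, 4, 7, 9, 6]


def strict_climb(start: int) -> int:
    """Move to a neighbour only if it is strictly fitter (every step must pay off)."""
    x = start
    while True:
        neighbours = [n for n in (x - 1, x + 1) if 0 <= n < len(FITNESS)]
        best = max(neighbours, key=lambda n: FITNESS[n])
        if FITNESS[best] <= FITNESS[x]:
            return x  # stuck: no single step improves fitness
        x = best


def patient_search(start: int, max_steps: int) -> int:
    """Keep walking right even when fitness drops, then return the best point seen."""
    best = start
    x = start
    for _ in range(max_steps):
        if x + 1 >= len(FITNESS):
            break
        x += 1
        if FITNESS[x] > FITNESS[best]:
            best = x
    return best


print(strict_climb(0))       # stops on the lower peak
print(patient_search(2, 9))  # crosses the valley and finds the higher peak
```

The strict climber plays the role of selection that demands a pay-off in every generation, while the patient search stands in, very loosely, for a thinker who tolerates years of worse predictions on the way to a better theory.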

Theo: On a slight tangent, you talked about prediction, theories being good at prediction. So what do you think is the difference between prediction and explanation?

Carlos: Well, I think one thing you could say is that there can be relationships between some kind of idea in your head and some system out in reality. They can have certain kinds of similarity. General relativity expresses a certain kind of, or that mathematical idea, that theory is similar to actual reality in a very particular kind of way. That's what explanatory knowledge is all about is finding, exploiting, expressing patterns in reality. I guess predictions are talking maybe less about those wider scale patterns in reality and more to do with particular observations that you make about what you will see, not what reality is really like. I don't think too much about predictions versus explanations except in terms of the role in helping you improve your ideas.

Theo: Prediction would be more like the Celtics are likely to win the next NBA finals. And then explanation would be more like the Celtics have better athletes, better coaches, better whatever, and thus they're more likely to win the finals.

Carlos: Yeah, so discussing the actual things in reality, and how they give rise to something else. By the way, there are different kinds of ideas, but there's no such thing as an explanation per se, like as a particular kind of idea. Like if I say the word banana, in some contexts that will just be a random concept. And in some cases it would be an explanation. If you asked me why did you go to the grocery store? I said, banana. That would be an explanation of why I had gone. I wanted to get a banana, that's why I went. But absent that explanatory problem, it's just a concept. So explanation in some ways is actually more of a role. It's more of a function than anything else.

So if you said like, why did you go to the store? You're saying like, I can imagine many reasons you would go to the store. I don't know which one of these it is. So can you help tell me which one it is? So you have a certain kind of problem in your head, some kind of gap in your knowledge. You're like, I know a lot of things about the situation, but I just don't know that. And so you're asking, so what is the answer to that? And then I would give you banana. And that'd be the answer. Whereas if you had a different explanatory problem, you'd want a different kind of answer. You'd say, that store was really open? Or like, why was the store open? Because say I went, I told you, yeah, I went there at three in the morning. So I thought that store was closed. It was actually open, why is that? That's it, I might have an answer for that. Or I would say like, I broke in because I really wanted a banana. Try to fill in that gap, whatever that gap happened to be.

2. What makes explanatory knowledge special? (1:18:07)

Theo: So, question number two. What makes explanatory knowledge special?

Carlos: I often think about the difference between knowledge that is about what actually exists and knowledge that is helpful for action. Let's say you're trying to go from point A to point B, but there's a big monolith in the way. To avoid the monolith and get around to point B, the destination, you could have in your head a little rule. This is what engineering classes often do, by the way, with Lego race cars and things. You can just program in something like, if you see something dark, turn right and then turn left, or whatever. So you could put in a pretty simple algorithm for dealing with that situation that incorporated no explicit knowledge of that barrier in the way. And yet that knowledge would only be useful in a narrow range of circumstances. Whereas if I said, ah, there is this particular object in the way, you could then use that knowledge not only to avoid it, but to do any number of other things that you hadn't anticipated ahead of time. I think it's kind of interesting. So you might say later on, I need a cube, and there's a cube right there. With explicit knowledge of the object, you would now be able to use that cube. With only the rule, you could avoid the cube when that was the relevant thing to do, but you wouldn't really have any knowledge of the availability of cubes around you. So when you did need one, the fact that there was one there would be irrelevant to you. So I think that's sometimes a useful, simple explanation of how knowledge that is only about action can actually be pretty limiting.
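
To make the contrast concrete, here is a small Python sketch. Everything in it (`reactive_controller`, `WorldObject`, `WorldModel`, the grid coordinates) is a hypothetical name invented for illustration. The first function is the "if you see something dark, turn" kind of rule, and the class holds explicit knowledge of what is out there.

```python
from dataclasses import dataclass


def reactive_controller(sensor_reading: str) -> str:
    """Action-only knowledge: a fixed stimulus-response rule with no notion of objects."""
    if sensor_reading == "dark":
        return "turn right, then turn left"
    return "go straight"


@dataclass(frozen=True)
class WorldObject:
    shape: str
    position: tuple[int, int]


class WorldModel:
    """Knowledge about what exists; goals not anticipated in advance can query it."""

    def __init__(self, objects: list[WorldObject]) -> None:
        self.objects = objects

    def blocked(self, cell: tuple[int, int]) -> bool:
        return any(obj.position == cell for obj in self.objects)

    def find(self, shape: str) -> list[WorldObject]:
        return [obj for obj in self.objects if obj.shape == shape]


model = WorldModel([WorldObject("cube", (3, 0))])
print(reactive_controller("dark"))  # the rule can avoid the obstacle...
print(model.blocked((3, 0)))        # ...and so can the model,
print(model.find("cube"))           # but only the model can answer "where is a cube?"
```

Both approaches handle the original task of avoiding the obstacle. Only the second can be reused when the goal changes to "I need a cube".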

3. How can Popperian epistemology improve AI? (1:19:54)

Theo: Alright, and question number three: how can Popperian epistemology improve narrow AI algorithms?

Carlos: Well, I think, as I mentioned, I just have the two things that I always talk about, universality and efficiency. And so I think that when you see the full spectrum of knowledge creating systems and you compare them all on these terms, you can see that there's a lot of room for improvement. What is the space they are searching? What kinds of universality do they have? And secondly, how efficiently do they search that space? You can then get ideas. That's kind of what people have done with evolutionary algorithms and so on. They've taken some inspiration from evolution for their own algorithms.

With Popper, one of the more important things from him is, he has some good pithy phrases, one of which is “the content of a theory is in what it forbids”. He's most interested in this fact because of testability. If I tell you all swans are white and then you see a black swan, well, my theory says there can't be any black swans. So now I can discover there's a problem with the theory, we have something to work out. It's relevant to testability, but it's also relevant to efficient search. If I say, there's a lot of gravity here and it tells me that if I throw this ball, it will follow the parabola and that's how it has to go. Then I'm implicitly also saying that if my predictions say the ball will be here, I'm also saying it's not going to be anywhere else, which is pretty useful when you want to find the ball. There's an infinite "everywhere else" where I'm telling you that ball isn't going to be.

If you think of all of our laws, all our scientific knowledge as being like that, telling you not only what is, but therefore also what isn't going to be true, what things aren't worth searching, then it's actually very useful. Because I know that, for instance, if I'm looking for new scientific theories, I think they will all obey conservation of energy. Well, anytime I find a theory that doesn't obey that principle, I then feel I can probably throw that away or I know I have to fix it so that it does obey the principle. That knocks out a hell of a lot of possibilities. It saves me potentially infinite amounts of time. So I think that to the extent you can incorporate powerful constraints like that into any search algorithm, it's for the best.
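
A toy version of this pruning idea can be sketched in Python. The setup below is an assumption chosen for illustration (two unit-mass balls on a line, integer candidate velocities), and it filters candidate outcomes instead of candidate theories, but the effect of a conservation law on the size of the search space is the same in spirit.

```python
from itertools import product

# Toy setup (assumed for illustration): two unit-mass balls on a line,
# with initial velocities 2 and 0. Candidates are integer final velocities.
INITIAL = (2, 0)
candidates = list(product(range(-3, 4), repeat=2))


def conserves_momentum(v: tuple[int, int]) -> bool:
    # Equal masses, so total momentum is proportional to the sum of velocities.
    return sum(v) == sum(INITIAL)


def conserves_energy(v: tuple[int, int]) -> bool:
    # Kinetic energy is proportional to v**2 for equal masses.
    return sum(x * x for x in v) == sum(x * x for x in INITIAL)


after_momentum = [v for v in candidates if conserves_momentum(v)]
after_both = [v for v in after_momentum if conserves_energy(v)]

print(len(candidates), len(after_momentum), len(after_both))
print(after_both)
```

Each constraint says what cannot happen, and that is exactly what removes candidates before any of them has to be examined in detail.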

Theo: By the way, on a slight tangent, do you picture the first true AGI system to resemble more a neural network or resemble more a conventional computer program, like a GOFAI, good old fashioned AI is the term that they use, a simple optimizer or something?

Carlos: Well, as I mentioned before, there's a fundamental difference between an optimizer, which has a fixed objective that can't be changed from within the system, and a system which is more like an ecosystem, where there's no fixed idea to which everything else must be subservient. There's an open question as to whether one can simulate the other, in which case maybe that's okay. So if you have an optimization function that can, within itself, simulate this other more ecosystem type of view, maybe that would be good enough. So maybe it's possible to arrive at the right answer within the neural network sort of system.

I guess that's what people usually talk about anyway, they say like, if logical statements, for instance, are really valuable, do we have to bake them in at the beginning or can they be emergent within the system? I suppose that's an open question, and I probably would bet on it being emergent. It's part of the more general question of what things do we have to start off with versus what things can emerge later? And ideally, for someone who wants to build it, you want the core to be very simple so that you don't have to build very much and it can sort of discover everything important later.

4. What are the different kinds of idea conflicts? (1:23:58)

Theo: Number four is, what are the different kinds of conflicts between ideas?

Carlos: I think the reason Popper brings that up is that it helps you focus your problem solving in the right place. If you say there's a problem between quantum theory and relativity, then you know where to look to make an improvement in physics, making those theories not conflict anymore. And that's true more generally. So if you know that theory A and theory B don't work together, then you can try altering either of them or coming up with new ideas so you eliminate that conflict. So it gives you a kind of a barometer of progress in a way, something to meet.

But I think that those aren't the only things that can help guide your problem solving. So if you like to think more in terms of attention in a way, just what helps you guide your attention to fruitful areas. If you're going to come up with new ideas and you're going to start filtering ideas, well, which kinds of new ideas should you come up with and how should you select among them? And anything that can help you answer those questions and focus your energy into fruitful areas will be good because biological evolution doesn't do that. And it's terribly slow as a result. If you were to try to make an improvement in your ideas at random, you would be combining ideas as different as giraffes and spaceships, these are both in your head, you could combine them, but it wouldn't be helpful. So what you actually want is something that tells you there's a problem here, or there's just something interesting there. And it's all focused on that.

Theo: Yeah, I think those are definitely good points. I don't think many people think enough about this in particular.

5. Why is the brain a network of neurons? (1:25:47)

Theo: Question number five is, why is the brain a network of neurons?

Carlos: Yeah, I don't know that I have much to say on that one. Although I did read a paper recently that talked about the brain as a Turing machine. It made the point that a fairly small set of neurons allowed you to build up simple circuits that had the properties you would need to make a Turing machine. I forget the details of it. But I guess that wouldn't necessarily address the fundamental question of why neurons. Like, if Turing machines are important, it's not a given that you should build them out of neurons. The fact that we did is an open question. I can't say I made much progress on that question.

Theo: This one's definitely one of the tougher ones. I guess it's something that evolution converged on eventually. In the article you talk about, brains are evolvable while modern computers could be designed. So we can, I guess this ties into the bigger question of can we design architectures that are better than we are?

Carlos: They won't be better in a fundamental sense of what they can compute and so on. But they could be better in the sense of ordinary design: a GPU does X, Y and Z better than a CPU. So it could be better in that sense, just in terms of being faster, more efficient. But it wouldn't be a fundamentally different kind of program.

6. How do neurons make consciousness? (1:27:26)

Theo: Number six, there's another super interesting one. How does the low level behavior of neurons give rise to high level information processing?

Carlos: I think that's an area that I haven’t focused on so much. This paper that I mentioned before, about a Turing machine within the brain, would maybe be a really cool place to look, because lately I've been mostly interested in the higher level questions of just, how do we even distinguish between different kinds of systems and what makes us special? So that question of demarcation, in a way, has been the focus of late. But whenever I do dip into the lower level details, it's interesting. So yeah, that's kind of an area that I'm getting more into lately.

But I think, in some ways the question of how do you actually compute stuff is a secondary question because if you don't know what you're trying to compute, then it's not as helpful. Whereas if you know Turing completeness is an important thing, well, then we can talk about how you take subunits and get there. But if you don't know that Turing completeness is important, then you're going to be kind of trying to combine units in a way that is kind of aimless. So I think the focus on these high level ideas of universality and efficiency starts to tell you what it is that you want to compute. And then you can ask, okay, now how do we actually compute those things with the tools that we've got? But I do think, yeah, I'm slowly getting more interested in lower level details, as I mentioned, like through Minsky and other things, just to see, okay, what are the fundamental building blocks of computation? Now that I'm getting a better idea of what we want to compute.

Theo: So what do you think the fundamental building blocks of computation are?

Carlos: Well, I guess Minsky talks about different kinds of things, and building out of many different subunits, and that sort of thing. One of the things I think is interesting, by the way, and this doesn't necessarily answer your question, but it's just one of the things that came up when reading about this stuff, is that I used to think of Turing machines as being quite important. And they are, of course, historically very important, but then there's a separate question of how we should think about them now, just in terms of their actual performance and this and that. And of course, people will tell you that actually computing with Turing machines sucks. You don't want to actually use Turing machines in practice.

So I think that it's become clear to me, and maybe this is obvious to everyone who's already taken a class in computability or something, but to me, on the one hand, you have the space of all possible programs and that thing is pretty fundamental. That's like the space of all ideas. So that's there. That's this immutable thing. That's where all our resources are when it comes to all the ideas you can ever think. Separately from that, there are machines which can run any of those things and Turing machines are one of them, but already at that time of Turing, there were other alternatives to it. So I think there's probably an infinite variety of alternatives to it. Things that can compute anything. So the importance of any one of them is pretty questionable, actually, in the grand scheme of things and where they all differ isn't in what they can compute because they can all compute anything. What they do differ in is how easy they make it to express particular computations.

And so if you have a really bad programming language, like Brainfuck or whatever it is, or just assembly, it can compute anything, but it just makes it very difficult to express the kinds of computations you would want to run. Whereas a nice programming language like Python or something will make it very easy to express the kinds of computations you want to run. Things with loops, things with variables. In the space of all possible programs, it turns out that some of those are ones we actually want to run. So we make those easy to express. So that's where languages differ. And so that's kind of, I think, when we ask what are the building blocks of computation? In a way, the lesson seems to me to be that of all the things that can be computed, you can create whatever you want as the building blocks. And then depending on what you've assembled there, that will make different computations easy or hard to express.
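
The point about expressiveness can be seen in a few lines of Python. Below is a minimal sketch of a Brainfuck interpreter (the function name `run_brainfuck` and the eight-cell tape are choices made for this example, and the input/output commands are left out) that is used to add 3 and 2, next to the same computation written directly in Python.

```python
def run_brainfuck(program: str, tape_length: int = 8) -> list[int]:
    """Run a Brainfuck program (no I/O commands) and return the final tape."""
    # Pre-compute the matching position for every bracket.
    jump: dict[int, int] = {}
    stack: list[int] = []
    for i, ch in enumerate(program):
        if ch == "[":
            stack.append(i)
        elif ch == "]":
            j = stack.pop()
            jump[i], jump[j] = j, i

    tape = [0] * tape_length
    ptr = 0  # data pointer
    pc = 0   # program counter
    while pc < len(program):
        ch = program[pc]
        if ch == ">":
            ptr += 1
        elif ch == "<":
            ptr -= 1
        elif ch == "+":
            tape[ptr] = (tape[ptr] + 1) % 256
        elif ch == "-":
            tape[ptr] = (tape[ptr] - 1) % 256
        elif ch == "[" and tape[ptr] == 0:
            pc = jump[pc]  # skip the loop body
        elif ch == "]" and tape[ptr] != 0:
            pc = jump[pc]  # go back to the start of the loop
        pc += 1
    return tape


# 3 + 2 in Brainfuck: put 3 in cell 0, 2 in cell 1, then drain cell 0 into cell 1.
bf_sum = run_brainfuck("+++>++<[->+<]")[1]
py_sum = 3 + 2
print(bf_sum, py_sum)
```

Both lines compute the same thing, so the two languages do not differ in what they can compute here. They differ in how much work it takes to say it.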

And then that becomes relevant for evolution because it says, well, OK, what kinds of computations should human minds be doing? And then how is it that evolution put together what is arguably some very shitty computer, but managed to do those computations very well? Because it turns out that those ones were the ones that were most decisive. I don't think they're the ones that you see being done in an arbitrary program. The important computation for us isn't iterating through a trillion item list of numbers and multiplying them all by two. That's a computation that computers do. We don't do it in our heads, and it would be very difficult for us to do. But evidently, that's OK, because that's not the kind of thing that we ought to be doing for our normal idea-based computations.

Theo: So related to Deutsch's ideas on this, how exactly do we know that humans are theoretically universal? That we can theoretically do or understand anything? Why is it that we're not like, say, rats, where rats just cannot understand something no matter what you do or try to explain to them?

Carlos: I don't know if we have a perfect answer there. That's kind of my whole research question, in a way, is to try to get a clearer picture of what distinguishes us from other things. In my video on universality, I go through a few different ones. So first of all, you have computational universality. And there's also a point I mentioned before: if you want to have truly unlimited computational capacity, you have to be able to extend whatever finite capacity you have right now. So that's a given. So you need that, too. You can't just be a universal computer, because none of them actually exist. You're always going to have a finite computer, so it has to be able to extend its abilities. So that's one kind of thing.

And there's also, I mentioned before, you need to not only be able to come up with new ideas, but in new ways. This is the analogy of walking along the Earth just using your feet, rather than planes and so on. So in order to navigate the space of ideas, you need to be able to explore it in many different ways. And then also, you need to be able to distinguish between new ideas of all different kinds in order to efficiently navigate the space of ideas.

So if I told you, like, today, you're trying to decide between two ideas, and telling you, first of all, do you like chocolate or do you like vanilla? Well, that will require certain criteria. You will have to invent entirely different criteria when it comes to deciding between relativity and quantum theory, or the successors of both of those. You have to invent new criteria. So if you can't do that, then you're screwed, because you're going to be no better than chance when it comes to that later decision between those new alternatives. And I think there were a few others that I listed out there, too, but yeah. So I would say, hold off on trying to get a definitive answer, perhaps, on the question. But that is what I'm working on.

Theo: Yeah, so we'll see where that goes.

The Optimistic AI Future (1:34:50)

Theo: And then finally, how do you picture a good AI future? What does a good future of AI look like to you?

Carlos: AI or AGI?

Theo: Actually, let's do both.

Carlos: I don't really know much about AI, really. So I'll leave it to people who actually work on that to come up with cool stuff. Well, I guess I would say that what's cool is the idea of generally having, how do I put it? So, I think about this large spectrum of knowledge-creating systems. And I think humans are the zenith. There is a thing that can do anything. That's us. And aliens, and whatever else. But then there are these weaker forms, and there's evolution, and there's all kinds of different algorithms in between.

And so I think we can imagine, and we're probably seeing now, systems which are capable of some amount of search in the course of what they're doing. And so they have much more flexibility as a result. So traditional programs, you would write the whole program, and then if there's any kind of problem at any point in that, where it has a gap in what it knows how to do, the whole thing just fails. It doesn't have any ability to search for solutions to cross whatever gap it had in its algorithm.

Whereas a human, if you tell me, hey, go to the store and get me that banana or whatever, if there's some problem along the way, and I have to go around the back door of the place because the front door is under construction or something, that seems trivial to us. But I had to say, oh, I want to get to the store. How do I do that? The usual way doesn't work. So now what? I have to take a detour. I have to maybe ask somebody. I have to engage in a problem-solving process to deal with that situation.

So I think you can see other just ordinary programs involving that sort of search naturally within them and making them much more powerful as a result. Even if they're not human-level or anything, they don't have to be to be useful. So I think that's pretty cool, just baking search in this broader sense into everything that we do.

As far as AGI goes, it's mainly just a matter of, what is it that I like to say? Longevity, making death optional. And that really requires backups, ultimately, if you want that. So we need to have control over the hardware we're running on so we can make backups and have longevity. And if you have that, we can't guarantee you'll live forever, but we can guarantee you'll at least live as long as our civilization is around. If an asteroid wipes everything out, then you die, too. We can live as long as civilization, at least.

And then the other one is virtual reality. I think that once we control the space of experiences, that would be very cool. We can start designing that. I often like to say that we could live in a 10-dimensional reality if only we knew how to throw and catch a ball there. In 3D, I can throw a ball and you can judge distances and so on well enough to catch it. And mathematically, we could do that with 10 dimensions. It's just that algorithm for doing that isn't in your head right now. So if I throw you a ball in 10 dimensions, you won't catch it, but there's no fundamental reason you couldn't.

So I think we could inhabit such realities and everything else that we haven't yet imagined. So those are the two that usually come to mind. I think I listed a few more in my latest essay, but I'm going to have to make do with those for now.

Theo: All right. So I think that's a pretty good place to wrap it up. So thank you so much. Thank you, Carlos de la Guardia, for coming on the show.

Thanks for listening to this episode with Carlos de la Guardia. If you liked this episode, be sure to subscribe to the Theo Jaffee Podcast on YouTube, Spotify, and Apple Podcasts, follow me on Twitter @theojaffee, and subscribe to my Substack at Thank you again, and I’ll see you in the next episode.
