Let me preface this by saying I am not an AI foomer doomer. I am squarely for continuing to make progress on AI. I don’t think AI will destroy us all in the blink of a super-intelligence explosion. Having said that, one should always be cautious with dangerous and powerful technology, and AGI will definitely be the most powerful thing we ever invent.
Every serious AI person I know and know about, including those at the forefront like Sam and Demis, is either agnostic or mindful of the same concerns and advises cautious progress. Very few actually advise reversing or banning progress like Yudkowsky recommends. Most people believe that if we give the problem its due (take precautions, dedicate attention), it can be solved.
However, the effective a’s on both sides have turned the discourse to shit. I am guessing the origins were well-intentioned in both cases. But while EA obviously lost its steam quite spectacularly with the SBF drama, the e/acc mind-virus has now escaped the lab and is infecting people on my timeline. Once the problem was framed as optimism versus pessimism, it became easy to pick optimism, because optimism feels good. And naturally all the salesmen glom on to the optimism side. God I hate sales. But a lot (not all) of it is indefinite optimism: “It will be fine because it always is.”
I might try to pick out some specific issues I see online on either side and think out loud. I don’t intend to become the expert on this topic or to convince anyone. The most likely benefit is clarifying my own thinking, so I know which discourse is high signal and worth listening to and which is low signal and worth avoiding.
Topic 1 - Simulations
The proximate impetus for this post was seeing @atroyn’s recent post:


I honestly don’t know why Yud goes to the post-biological nanobot hypothetical. Whether or not it is feasible, it clearly turns off a large percentage of the audience. If for some reason you read this, Yud, stop using it; people clearly don’t respond to it. I propose just sticking with the engineered virus. People have a much more visceral, imminent fear of that post-COVID.
Back to @atroyn’s question though. Is this magic? Could an ASI use DNA/protein printing services destructively? I think it is possible. The reason is that the AI should be able to simulate the most likely outcomes beforehand. There are many other things I could discuss here, but let me focus this post on “simulations”.
What is a simulation?
A simulation is an experiment that allows you to recreate some phenomenon, usually for the purpose of studying it. Anything that is allowed by the laws of physics can also be simulated, because in the limit, doing the experiment itself is a ‘simulation’ of doing it again. So by our definition, simulation is always possible. The real question, then, is what makes a ‘useful’ simulation.
What is a ‘useful’ simulation?
For a simulation to be useful, doing the simulation must be easier than doing the experiment in reality. Say you want to study what happens to a person in a car crash. It is easier to put a bunch of test dummies in a car and ram it into a wall hundreds of times than to try to capture live footage of real-life accidents. But ease is not enough. To be useful, the simulation must capture the information you want. Dummies made of steel would not tell you what happens to a human, so the dummies must be made of materials that are as susceptible to damage as the human body. The key is that you only need to simulate the necessary detail. You don’t, for example, need to simulate the temperature or the smell in the air at the accident scene, because those details don’t matter for deciding where to reinforce the car.
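To make this concrete, here is a toy sketch (entirely made-up numbers, nothing like a real crash model) of what “only the necessary detail” means: track just the occupant’s deceleration against a crumple zone and ignore everything else.

```python
# Toy crash model: illustrative only, all numbers are made up.
# We track just the occupant's average deceleration against a crumple zone,
# and deliberately ignore every detail that doesn't matter (temperature, smell, paint color).

def peak_deceleration_g(speed_mps: float, crumple_distance_m: float) -> float:
    """Average deceleration in g, assuming the crumple zone absorbs the impact
    uniformly over its length: v^2 = 2*a*d  =>  a = v^2 / (2*d)."""
    return (speed_mps ** 2 / (2 * crumple_distance_m)) / 9.81

for crumple_m in (0.3, 0.6, 0.9):
    g = peak_deceleration_g(speed_mps=17.0, crumple_distance_m=crumple_m)  # ~60 km/h impact
    print(f"crumple zone {crumple_m:.1f} m -> ~{g:.0f} g on the occupant")
```

A model this crude still answers the one question it was built for (how much crumple zone is enough), which is the whole point.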
Can AI simulate novel biology?
Yes.
I don’t see why not. There is this weird idea I see floating around that biology is special, that experiments need to happen in the real world, and that there is no way to speed them up by simulating them in compute. Maybe simulating complex ecosystems is hard, sure. But if you are trying to predict whether a virus with a given DNA/RNA sequence will be a deadly pathogen or not, of course that can be simulated.
Don’t simulations have to be tested in real-life?
The only two reasons simulations have to be tested against reality are:
a) we do not yet have an accurate model of reality, so we don’t know how to build an accurate simulation (often the case in biology).
b) running the simulation would take as much time or longer than running the experiment itself (often the case with quantum physics).
Neither problem is insurmountable in principle.
AlphaFold is an example where we did not know how to simulate something; then we built a better algorithm using data, and now we can predict protein structure. It is not perfect, but it is much better than what came before, and there is no reason why it can’t get better. Crucially, there is no real-life iteration required to build AlphaFold: if we had the right data, we could be confident of its performance without ever doing an experiment.
Making a simulation more efficient can be done with better hardware (e.g. GPUs make physics simulations faster), better software (e.g. faster algorithms), or entirely new ways of encoding information (e.g. quantum computers).
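As a minimal sketch of the “better software” lever (a toy pairwise-force calculation, not any particular physics engine), the same simulation step can be written naively or as whole-array operations; the second form is dramatically faster on the same machine and maps directly onto GPUs:

```python
import numpy as np

rng = np.random.default_rng(0)
pos = rng.normal(size=(500, 3))  # 500 toy particles in 3D

def forces_naive(pos):
    """O(n^2) inverse-square pairwise forces, written as plain Python loops."""
    n = len(pos)
    f = np.zeros_like(pos)
    for i in range(n):
        for j in range(n):
            if i != j:
                r = pos[j] - pos[i]
                f[i] += r / np.linalg.norm(r) ** 3
    return f

def forces_vectorized(pos):
    """Same physics as above, expressed as whole-array operations."""
    r = pos[None, :, :] - pos[:, None, :]      # all pairwise displacement vectors
    d = np.linalg.norm(r, axis=-1)             # all pairwise distances
    np.fill_diagonal(d, np.inf)                # exclude self-interaction
    return (r / d[..., None] ** 3).sum(axis=1)

assert np.allclose(forces_naive(pos), forces_vectorized(pos))  # same answer, far less wall-clock time
```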
This is not just possible in theory. We don’t do hundreds of real-life trials of space missions. We have a good enough model of space, and enough compute, that we can launch a rocket from Earth today and know we will land a rover on Mars in the exact crater we want, at the exact hour we want, years in the future. We can do this on the first try, based on just simulations.
Won’t it take huge compute to simulate biology?
Is biology more complex than space? Yes.
Are some simulations much harder than others? Absolutely.
Simulating a whole human biological system or a human society would be very hard. Complexity scaling is real, and societies (cellular or human) are extraordinarily complex. But a virus? I would be surprised if we don’t get there within 50 years ourselves, even without AGI. I fully expect we will be able to predict all the properties of a virus from just its genetic code and be able to simulate mutations.
Consider, however, the progress we have made on genomics, drug design, and protein folding. Go was thought to be impossibly complex; AlphaGo showed it wasn’t. Protein structure prediction was thought to be impossibly complex; AlphaFold indicates it isn’t. You are welcome to claim that simulating something efficiently is impossible, but unless you have an impossibility proof, I would withhold judgement. There is a reason 50-year-old, or even 500-year-old, math problems sometimes get solved.
But AlphaZero and AlphaFold are not perfect?
It is true: the only perfect simulation is reality itself, and all useful simulations are imperfect.
Simulations don’t have to be perfect to be useful. In the space travel example, we are probably not simulating individual particles of space dust and debris nudging our probe off its planned trajectory. Could that disrupt our plans? Maybe. But if it only affects 1 in a thousand launches, the simulation is still super useful and does its job.
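As a back-of-envelope check (hypothetical numbers: a 1-in-1000 failure mode the simulation knows nothing about), the gap between the idealized simulation and reality barely moves the answer:

```python
import numpy as np

rng = np.random.default_rng(1)
launches = 1_000_000
# Hypothetical: the simulation predicts every launch succeeds, but reality has an
# unmodeled 1-in-1000 failure mode (say, a stray debris strike).
unmodeled_failure = rng.random(launches) < 1e-3

print("success rate per the simulation: 100.00%")
print(f"success rate including the unmodeled effect: {100 * (1 - unmodeled_failure.mean()):.2f}%")
```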
The recent AlphaZero loophole indicates that the AI doesn’t even understand the rules of Go the way we understand them. Yet for 6+ years, the whole world, including world champions who have dedicated their lives to the game, accepted it as a superhuman Go player. Because the gaps in the AI’s knowledge were small enough not to matter in practice.
If the virus that the AI makes only wipes out 95% of humanity and 5% of us turn out to have innate immunity, would we be claiming, “Ha, see, the simulation was imperfect”? Even if humans defeat Skynet in the end because the machine could not anticipate everything, would you consider going through Judgment Day a success?
Remember, imperfection along some axes is fine as long as you get the information you are looking for.
But complexity scaling prevents knowing all future interactions?
Say the AI can predict the behavior of the virus perfectly, but it would take too much compute to predict all the ways we will interact with it. Different bodies will respond differently; each society and government will respond differently. Surely the AI can’t predict the future.
Maybe it can’t. But maybe it doesn’t need to. Complexity is a strange beast that we don’t fully understand. True, things get more and more complex as they scale, but beyond a point, statistical patterns start emerging that don’t require modeling the details. There are many examples in physics - thermodynamics, superconductivity, fluid dynamics, plasmas - where the impossibility of tracking the low-level complexity doesn’t preclude us from understanding emergent properties of the system as a whole. Does this apply to biological and social systems? Of course it does. To what extent is unclear, sure. But the story of all human civilization is our attempt to steer complex systems like society and nature down the paths we specify, and there is no reason to believe a super-intelligence would not be much better at it.
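A toy illustration of that point (simple random walks, nothing domain-specific): no individual trajectory is predictable, yet the spread of the whole population follows a clean square-root law you could write down without simulating any one walker in detail.

```python
import numpy as np

rng = np.random.default_rng(2)

# 10,000 random walkers taking 1,000 +/-1 steps each.
steps = rng.choice([-1, 1], size=(10_000, 1_000))
positions = steps.cumsum(axis=1)

for t in (100, 400, 900):
    measured = positions[:, t - 1].std()    # spread observed across the population
    predicted = np.sqrt(t)                  # diffusion law: spread grows as sqrt(time)
    print(f"t={t}: measured spread {measured:.1f}, predicted {predicted:.1f}")
```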
Remember, imperfection along some axes is fine as long as you get the information you are looking for. AlphaFold does not simulate everything there is to know about protein folding; it simulates just enough to predict the structure. Does GPT-4 have a model of the world? Does it know what an app is, that people have phones and you talk to your grandma on them? Probably not. But it doesn’t need that to build an app.
But the ASI will only get one shot at killing us?
Why? If the ASI’s game-theoretic calculations have led it to decide to end the human race because it thinks humans will try to stop it from reaching its objective, why would it yolo on some risky do-or-die shot? Why would it not plan out a better strategy, one which gives it many shots on goal? If the AI is so powerful that it can end us and so motivated that it wants to, I don’t think k-level game theory is going to stop it.
If you believe all this, why don’t you fear AI x-risk?
The key here is that if we ever get to an adversarial, super-intelligent, super-capable AI connected to the internet, I can’t see outcomes that are good for us. The idea that AI x-risk is not physically possible is false. But I just don’t see how we get there.
a) I think truly exponential self-improvement (not mere prompt optimizing) is not possible on short timescales. Exponential self-improvement requires code and hardware upgrades and then training; that takes months or years, not minutes or hours. The odds of us accidentally creating an ASI on a laptop also seem small to me. Even if we create an AGI like that, it will be more like a resource-constrained sentient robot than an omnipotent God. The point I am making is that the AI → AGI → godlike ASI pipeline will not be a momentary blink or an accident. It will require a conspicuous amount of energy, resources, and time. We will know how to make one well before we actually make one.
b) I also don’t believe that alignment is impossible. We are, of course, studying alignment right now with our narrow AI. If we are actually able to make resource-constrained sentient AGI robots well before god-like ASI, then we will be able to study alignment even on true AGI.
c) We only need narrow AI + humans. There is a world in which we never (in our lifetime) get to agentic AGI, but our narrow AI is good enough to produce abundance on Earth and colonize space. By narrow AI, I mean AI that is capable of everything physical except self-awareness, introspection, and disobedience. Alignment for such AI is much easier IMO: you can have dumber rules, and narrower AI monitoring and limiting more capable AI, since none of it will intentionally go rogue against us. No single narrow AI needs to be god-like, but their combination will make humanity god-like. Since all our social-techno-capital incentives point in this direction, and since I don’t think we will get to ASI by accident, this is the most likely world, and the one I am rooting for.
I know there are many details I have not covered here, but maybe I will in a future post. This post was just an introduction to my overarching thoughts on the matter and a specific riff on simulations.