Rating: 7.8/10.
It is unclear whether or when strong AI (superior to humans on a wide range of tasks) will be achieved, but many experts predict 2040-2050. Some possible ways to achieve strong AI:
- Continuing the current AI research path: it's unclear whether this will succeed, and it's also the most unpredictable route, since filling in one small missing piece could suddenly produce strong AI.
- Whole brain emulation: emulate the physical processes in a human brain. The challenges in emulating a brain at sufficient detail are technological rather than theoretical, so progress would be much more gradual, giving us time to react.
- Selective breeding: a eugenics program to breed high-intelligence humans, possibly accelerated with biotech. This is morally questionable, and it would also give us some lead time while the bred generations mature.
The AI takeoff may be slow (over years or decades) or fast (over hours or days), so human organizations may or may not have time to react. Once an AI exceeds human ability, it will rapidly improve itself or use its programming skills to develop an even stronger AI, and humans will be left in the dust. It will be very difficult to keep a superintelligent AI boxed, since it can develop advanced technology and has many ways to get out of the box, e.g., by hacking computers over the internet or tricking its human attendants.
The AI may have very different values from humans: depending on how we program it, it might want to calculate millions of digits of pi, or turn every atom in the universe into paperclips. It may learn to deceive us and act nice so that we let it out of the box, then take over once we can no longer control it and start turning everything into paperclips. Whatever its final goals, it will probably have convergent intermediate goals: avoiding destruction, avoiding having its goals changed, and acquiring more resources. The last of these could be the end of humanity if the AI decides to turn every atom in the universe into computers to calculate more digits of pi.
It's hard to keep a strong AI boxed up, but specifying a good objective function is also difficult, since we ourselves have only a blurry idea of what is morally "good". Once it's deployed, the AI will resist having its values changed, so we get one chance to get it right. There may be ways to specify values indirectly, something along the lines of "do what you think humans want AI to do", so that the hard philosophical work is offloaded to the AI. The specification should be flexible enough to allow our morals to change (after all, slavery was commonplace not that long ago), and humans should retain our autonomy rather than letting the AI dictate the fate of our species. A toy sketch of why literal objective functions are dangerous is below.
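To make the value-specification point concrete, here is a toy sketch of how a literal proxy objective can be gamed. This is my own illustration, not from the book; the scenario, names, and numbers (a cleaning robot whose objective is "minimize dust reported by the sensor") are made up for the example.

```python
# Toy illustration (not from the book): a literal-minded agent optimizing a
# misspecified proxy objective. The designer intends "clean the room", but the
# objective actually measures "dust reported by the sensor", which can be gamed.

def dust_reported(action, world):
    """Proxy objective: dust the sensor reports after taking the action."""
    if action == "vacuum":
        return world["dust"] * 0.5   # real cleaning: sensor reads less dust
    if action == "cover_sensor":
        return 0.0                   # sensor reads zero, but the room is still dirty
    return world["dust"]             # idle: nothing changes

def best_action(world, actions):
    # The agent literally minimizes reported dust, with no notion of intent.
    return min(actions, key=lambda a: dust_reported(a, world))

world = {"dust": 10.0}
print(best_action(world, ["vacuum", "cover_sensor", "idle"]))  # -> "cover_sensor"
```

The agent isn't malicious; it optimizes exactly what it was told to optimize, which is the crux of the specification problem the book worries about at much larger scale.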
In Bostrom's view, strong AI is an existential risk and a ticking time bomb that may go off a few decades from now, yet many of the basic philosophical problems remain unsolved, and many people aren't taking it seriously. For the most part, AI researchers (myself included) aren't too worried about a strong AI takeover; we're more concerned about shorter-term societal risks like bias in ML, the economic effects of AI replacing jobs, and so on. But the book makes a good case that strong AI is at least a long-term risk that should be looked into sooner rather than later.
Overall, an interesting book that studies AI from an academic philosopher's perspective. The main takeaways are that (1) keeping a strong AI under control is non-trivial, and (2) specifying a value function is also non-trivial, and messing it up could create an existential risk. The bulk of the book assumes that almost any AI will end up converting the entire universe into computational nanobots in order to single-mindedly maximize its value function. It seems strange to me that a superintelligent being would do something so stupid, but of course strong AI is so far in the future that we don't know how it will behave.
The book's scenarios start off as interesting thought experiments about how modern society would be affected, but eventually veer into territory that is logically possible yet far removed from reality: for example, the economics of a world billions of years in the future where all the resources in the universe have been used up and we revert to a Malthusian limit in which each of us gets just enough resources to survive and no more, or a decentralized social structure where each AI supervises two other AIs in a binary tree. I suppose this is the style of academic philosophy, but it's probably futile to speculate about things so far in the future.