Why might a maximizing AI cause bad outcomes?
Computers only do what you tell them. But any programmer knows that this is precisely the problem: computers do exactly what you tell them, with no common sense or attempts to interpret what the instructions really meant. If you tell a human to cure cancer, they will instinctively understand how this interacts with other desires and laws and moral rules; if a maximizing AI has the sole goal of trying to cure cancer, it will literally just want to cure cancer.
Giving a superintelligence
An AI with cognitive abilities far greater than those of humans in a wide range of important domains.
To take a deliberately extreme example: suppose someone programs a superintelligence to calculate as many digits of pi as it can within one year. And suppose that, with its current computing power, it can calculate one trillion digits during that time. It can either accept one trillion digits, or spend a month trying to figure out how to get control of the TaihuLight supercomputer, which can calculate two hundred times faster. Even if it loses a little bit of time in the effort, and even if there’s a small chance of failure, the payoff — two hundred trillion digits of pi, compared to a mere one trillion — is enough to make the attempt. But on the same basis, it would be even better if the superintelligence could control every computer in the world and set it to the task. And it would be better still if the superintelligence controlled human civilization, so that it could direct humans to build more computers and speed up the process further.
Now we’re in a situation where a superintelligence is incentivized to take over the world as an instrumental goal A goal which is pursued as means to some other end, rather than as an end in itself.