Why is optimization the thing people worry about?
Optimization is a central concept in modern ML.
Optimization is the ability to steer the future toward a specific state, and this ability is at the root of many core problems in AI alignment.
Goodhart
Goodhart’s law is a direct result of optimization: as you optimize harder for a proxy measure, the goal you actually care about gets left behind.
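A toy sketch of this effect, with functions invented purely for illustration: the proxy tracks the true goal at first, but pushing the proxy higher eventually makes the true goal worse.

```python
# Hypothetical toy model of Goodhart's law. The true goal peaks at
# x = 3, while the proxy keeps rewarding larger x without limit.

def true_goal(x):
    """What we actually care about: peaks at x = 3."""
    return -(x - 3) ** 2

def proxy(x):
    """Measurable stand-in: correlated with the true goal only at first."""
    return x

x = 0.0
history = []
for _ in range(60):
    x += 0.1                      # greedily push the proxy up
    history.append((x, proxy(x), true_goal(x)))

# The proxy rises monotonically, but the true goal peaks near x = 3
# and then declines as optimization of the proxy continues.
print(history[-1])
```

The divergence only appears once the optimizer moves past the region where proxy and goal are correlated, which is why weak optimization of a proxy often looks safe.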
Orthogonality
The orthogonality thesis holds that advanced general capability is compatible with almost any set of goals. Therefore, even a superintelligence could have values extremely distant from human ethical concerns. This is because capability measures how effectively a system pursues its goals, not which goals it pursues: the two vary independently.
Intelligence
Intelligence, in the sense used in artificial intelligence, includes the ability to achieve one's goals. This is essentially an optimization process: how do I get more of the thing I am pursuing?
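In its simplest form, that process can be sketched as greedy hill climbing toward a target state; the setup below is a hypothetical minimal example, not any particular system's algorithm.

```python
# Minimal sketch of optimization as goal pursuit: repeatedly pick
# whichever local change moves the state closest to the target.

def optimize(state, target, step=1.0, iters=100):
    """Greedy hill climbing toward a target value."""
    for _ in range(iters):
        candidates = [state + step, state - step, state]
        # choose the candidate closest to the goal
        state = min(candidates, key=lambda s: abs(s - target))
    return state

print(optimize(0.0, 42.0))  # → 42.0
```

The point is not the algorithm itself but the shape of the loop: propose changes, score them against the goal, keep the best. More capable systems do this over richer state spaces.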
Inner alignment
The concern of inner alignment is that the training process may produce a model pursuing its own internal objective, one that differs from the objective it was trained on.
Outer alignment
Goal misspecification
The concern of outer alignment is that if we misspecify the goal, the artificial intelligence will competently optimize for the wrong target.
Planning
Planning includes creating a list of possible solutions to a problem and evaluating them to select the one that best achieves the goal.
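That generate-and-evaluate loop can be sketched in a few lines; the world model here (a state that is just a running sum of moves) is a deliberately trivial, hypothetical stand-in.

```python
# Planning as generate-and-evaluate: enumerate candidate action
# sequences, score each against the goal, pick the best.

from itertools import product

ACTIONS = [-1, 0, 1]          # possible moves at each step
GOAL = 3                      # desired final state

def outcome(plan, start=0):
    """Simulate a plan: the state is the running sum of moves."""
    state = start
    for action in plan:
        state += action
    return state

def plan(horizon=4):
    candidates = product(ACTIONS, repeat=horizon)   # all possible plans
    # choose the plan whose simulated outcome lands closest to the goal
    return min(candidates, key=lambda p: abs(outcome(p) - GOAL))

best = plan()
print(best, outcome(best))
```

Note that planning in this form is just optimization over action sequences, which is why planning capability and optimization power tend to come bundled together.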