Can you give an AI a goal which involves “minimally impacting the world”?
Penalizing an AI for affecting the world too much is called impact regularization and is an active area of alignment research.
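Concretely, most impact-regularization proposals share a simple overall shape: the agent's reward is its task reward minus a weighted penalty that measures how much its actions changed the world relative to some baseline (such as doing nothing). The sketch below is only a hypothetical illustration of that general shape, not any particular published method; the function name, parameters, and numbers are made up for the example, and defining the impact measure itself is the hard open problem.

```python
# Minimal illustrative sketch of the general form of impact regularization.
# Not a specific published method; names and values are hypothetical.

def impact_regularized_reward(task_reward, impact_penalty, penalty_weight=1.0):
    """Combine the task reward with a penalty for side effects.

    task_reward: reward for doing the task the agent was given.
    impact_penalty: a nonnegative measure of how much the agent's action
        changed the world relative to a baseline (e.g., taking no action).
        Choosing a good measure here is the open research question.
    penalty_weight: how strongly impact is penalized (an assumed knob).
    """
    return task_reward - penalty_weight * impact_penalty


# Example: an action earning 1.0 task reward with 0.3 units of measured
# impact and penalty weight 2.0 yields a net reward of 0.4.
print(impact_regularized_reward(1.0, 0.3, penalty_weight=2.0))
```

With a larger penalty weight, actions with big side effects become unattractive even if they accomplish the task, which is the intended behavior of a "minimally impacting" goal.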