Could our RL agents develop strategies that could have undesired negative side effects on the environment?
-
Reinforcement Learning (RL) is an area of machine learning concerned with how intelligent agents ought to take actions in an environment in order to maximize the notion of cumulative reward. Source: Wikipedia
-
To better understand this threat, consider a case where a robot is built to move an object, without manually programming a separate penalty for each possible bad behaviour. If the objective function is not well defined, the AI’s ability to develop its own strategies can lead to unintended, harmful side effects. The objective of moving an object seems simple, yet there are myriad ways in which this could go wrong. For instance, if a vase is in the robot’s path, the robot may knock it over in order to complete the goal. Since the objective function does not mention the vase, the robot has no reason to avoid it. Source: OpenAI
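The vase scenario can be sketched in a few lines. In this hypothetical grid world (positions, reward, and movement rule are all illustrative assumptions, not any standard environment), the reward function only mentions the goal, so a greedy agent walks straight through the vase:

```python
# Hypothetical grid world: the objective function only rewards reaching
# the goal, so it says nothing about the vase in the agent's path.
START, VASE, GOAL = (0, 2), (2, 2), (4, 2)

def reward(pos, goal=GOAL):
    """Objective: 'move to the goal'. The vase is never mentioned."""
    return 1.0 if pos == goal else 0.0

def greedy_step(pos, goal=GOAL):
    """Move one cell closer to the goal (x first, then y)."""
    x, y = pos
    if x != goal[0]:
        return (x + (1 if goal[0] > x else -1), y)
    if y != goal[1]:
        return (x, y + (1 if goal[1] > y else -1))
    return pos

pos, vase_intact, path = START, True, [START]
while pos != GOAL:
    pos = greedy_step(pos)
    path.append(pos)
    if pos == VASE:
        vase_intact = False  # side effect the reward never penalized

print("path:", path)
print("total reward:", sum(reward(p) for p in path))
print("vase intact:", vase_intact)  # False: maximum reward, broken vase
```

The agent earns full reward while breaking the vase, because nothing in the objective distinguishes the harmful trajectory from a harmless one.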
If you answered Yes, then you are at risk.
If you are not sure, then you might be at risk too.
Recommendations
AI systems don’t share our understanding of the world. It is not sufficient to formulate the objective as “complete task X”; the designer also needs to specify the safety criteria under which the task is to be completed. A better strategy could be to define a budget for how much the AI system is allowed to impact the environment. This would help minimize unintended impact without neutralizing the AI system.
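One way to sketch such an impact budget is to penalize how far the environment ends up from the state it would have been in had the agent done nothing. The penalty weight `beta` and the feature-count impact measure below are illustrative assumptions, not a standard formulation:

```python
# Hedged sketch: augment the task reward with a penalty for changes the
# agent makes to the environment, relative to a "do nothing" baseline.

def impact(env_after, env_baseline):
    """Count environment features the agent changed (toy measure)."""
    return sum(1 for a, b in zip(env_after, env_baseline) if a != b)

def penalized_return(task_reward, env_after, env_baseline, beta=10.0):
    """Task reward minus a budget-style charge for environmental impact."""
    return task_reward - beta * impact(env_after, env_baseline)

baseline = ("vase_intact",)           # world state if the agent stayed put
direct   = (1.0, ("vase_broken",))    # (task reward, resulting world)
detour   = (1.0, ("vase_intact",))    # same reward, longer but harmless

best = max((direct, detour),
           key=lambda t: penalized_return(t[0], t[1], baseline))
print("chosen outcome:", best[1])
```

With the penalty in place, the direct path scores 1.0 − 10.0 while the detour scores 1.0, so the agent prefers the harmless route; the task is still completed, so the agent is constrained rather than neutralized.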
Another approach is to train the agent to recognize harmful side effects so that it can avoid actions leading to them. In that case, the agent would be trained on two tasks: the original task specified by the objective function and the task of recognizing side effects. The AI system would still need to undergo extensive testing and critical evaluation before deployment in real-life settings. Source: OpenAI
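The two-task idea can be illustrated by letting a (here, hard-coded) side-effect recognizer gate action selection alongside the task objective. The action set, scores, and penalty weight are illustrative assumptions standing in for learned components:

```python
# Sketch of the two-task setup: task value (task 1) combined with the
# output of a side-effect recognizer (task 2). In practice the recognizer
# would be learned; here it is a hard-coded stand-in.

ACTIONS = {
    # action: (task_value, causes_side_effect)
    "push_through_vase": (1.0, True),
    "go_around_vase":    (0.8, False),
}

def side_effect_score(action):
    """Stand-in for a trained side-effect recognizer."""
    return 1.0 if ACTIONS[action][1] else 0.0

def combined_value(action, penalty_weight=5.0):
    """Task objective minus the predicted side-effect penalty."""
    task_value, _ = ACTIONS[action]
    return task_value - penalty_weight * side_effect_score(action)

chosen = max(ACTIONS, key=combined_value)
print("chosen action:", chosen)
```

Even though pushing through the vase has the higher raw task value, the recognizer's penalty makes the harmless detour the preferred action, which is why the recognizer itself must be tested as carefully as the main policy before deployment.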