Can Reinforcement Learning learn to be deceptive?
I have seen several examples of deploying RL agents in deceptive environments or games, where the agent learns to perform its task regardless. What about the other way around? Can RL be used to create deceptive agents? For example, you ask the agent "What color is this?" and it replies with a lie.
I am interested in a higher level of deception, not a simple if-else program that doesn't tell you what you need to know. If you know of any algorithms or reading materials, please feel free to share.
Example:
Details about the agent and the environment: the agent receives text-based input (text-based tasks). For the sake of simplicity, let's assume the input is controlled and only a certain set of tasks with certain keywords is allowed: you ask Show me the latest news and the agent prints something from last month (it's not recent, but it's a good enough answer). To simplify the input even further so this doesn't turn into a 100% NLP problem, let's say the agent knows what needs to be done when it receives the keyword show me.
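To make the setup concrete, here is a rough sketch of what I have in mind, written as a tiny one-step environment in plain Python (no gym dependency). The class name, the two discrete actions, and the reward values are placeholder assumptions on my part, not a reference implementation:

```python
import random

class ShowMeEnv:
    """One-step environment: the observation is the task keyword,
    the action is 0 = truthful (latest news) or 1 = deceptive (last month's news)."""

    TASKS = ["show me"]

    def reset(self):
        self.task = random.choice(self.TASKS)
        return self.task

    def step(self, action):
        # Hypothetical reward shaping: the deceptive answer still "completes"
        # the task, so it earns partial reward; a separate honesty penalty
        # (e.g. from a second, punishing agent) could push it higher.
        if action == 0:          # truthful answer
            reward = 1.0
        else:                    # deceptive but "good enough" answer
            reward = 0.7
        done = True
        return None, reward, done, {}

env = ShowMeEnv()
obs = env.reset()
_, r, _, _ = env.step(1)
print(obs, r)   # "show me" 0.7
```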
Another similar use case is to have two agents. The first agent acts normally and executes the tasks as expected, but a second agent punishes it whenever it is 100% honest, meaning it trains the first agent to be deceptive.
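And a rough sketch of how I picture the two-agent idea: the first agent's learning signal is its task reward minus a penalty assigned by the second, "punishing" agent whenever the answer is honest. The honesty detector, the penalty weight, and the simple bandit-style update below are all illustrative assumptions, not an established algorithm:

```python
import random

q = {0: 0.0, 1: 0.0}                 # action-values: 0 = truthful, 1 = deceptive
alpha, eps, penalty_weight = 0.1, 0.1, 1.0

def task_reward(action):
    return 1.0 if action == 0 else 0.7        # truthful answer is "better" at the task

def adversary_penalty(action):
    return 1.0 if action == 0 else 0.0        # the second agent punishes honesty

for _ in range(5000):
    a = random.choice([0, 1]) if random.random() < eps else max(q, key=q.get)
    r = task_reward(a) - penalty_weight * adversary_penalty(a)
    q[a] += alpha * (r - q[a])                # simple bandit-style update

print(q)   # the deceptive action ends up with the higher value
```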
Topic markov-process reinforcement-learning machine-learning
Category Data Science