December 11, 2025
AI can pick up cultural values by mimicking how kids learn
In the Overcooked video game, players work to cook and deliver as much onion soup as possible. In the study's version of the game, one player can give onions to help the other, who has farther to travel to make the soup. The research team wanted to find out if AI systems could learn altruism by watching different cultural groups play the game.
Artificial intelligence systems absorb values from their training data. The trouble is that values differ across cultures. So an AI system trained on data from the entire internet won't work equally well for people from different cultures.
But a new UW study suggests that AI could learn cultural values by observing human behavior. Researchers had AI systems observe people from two cultural groups playing a video game. On average, participants in one group behaved more altruistically. The AI assigned to each group learned that group's degree of altruism, and was able to apply that value to a novel scenario beyond the one it was trained on.
The team published its findings Dec. 9 in PLOS One.
“We shouldn't hard-code a universal set of values into AI systems, because many cultures have their own values,” said senior author Rajesh Rao, a UW professor in the Paul G. Allen School of Computer Science & Engineering and co-director of the Center for Neurotechnology. “So we wanted to find out if an AI system can learn values the way children do, by observing people in their culture and absorbing their values.”
As inspiration, the team looked to research showing that 19-month-old children raised in Latino and Asian households were more altruistic than those from other cultures.
In the AI study, the team recruited 190 adults who identified as white and 110 who identified as Latino. Each group was assigned an AI agent, a system that can function autonomously.
These agents were trained with a method called inverse reinforcement learning, or IRL. In the more common AI training method, reinforcement learning, or RL, a system is given a goal and gets rewarded based on how well it works toward that goal. In IRL, the AI system observes the behavior of a human or another AI agent, and infers the goal and underlying rewards. So a robot trained to play tennis with RL would be rewarded when it scores points, while a robot trained with IRL would watch professionals playing tennis and learn to emulate them by inferring goals such as scoring points.
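The difference between the two training methods can be sketched with a toy example. This is not the study's actual model; it assumes a hypothetical binary "share vs. keep" decision and a simple Boltzmann-rational choice model, under which inferring the demonstrator's reward weight reduces to computing the log-odds of sharing in the observed behavior:

```python
import math

def infer_reward_weight(choices):
    """IRL in miniature: infer a scalar 'altruism' reward weight from
    observed binary choices (True = shared, False = kept). Assumes a
    Boltzmann-rational demonstrator, P(share) = sigmoid(w), so the
    maximum-likelihood w is the log-odds of sharing."""
    eps = 0.5  # smoothing keeps the estimate finite for one-sided data
    n_share = sum(choices) + eps
    n_keep = len(choices) - sum(choices) + eps
    return math.log(n_share / n_keep)

def p_share(w):
    """Probability the trained agent shares in a novel scenario,
    reusing the inferred reward weight under the same choice model."""
    return 1.0 / (1.0 + math.exp(-w))

# Hypothetical demonstrations from two groups (not the study's data):
group_a = [True] * 7 + [False] * 3   # shares 70% of the time
group_b = [True] * 4 + [False] * 6   # shares 40% of the time

w_a = infer_reward_weight(group_a)
w_b = infer_reward_weight(group_b)

# The agent trained on the more altruistic group shares more often,
# even in a decision it never observed during training.
assert p_share(w_a) > p_share(w_b)
```

The key point the sketch illustrates: the agent is never told a goal or handed a reward, as it would be in RL. It infers the reward from behavior, and the inferred reward then generalizes to new decisions.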
This IRL approach more closely aligns with how humans develop.
“Parents don't simply train children to do a specific task over and over. Rather, they model or act in the general way they want their children to act. For example, they model sharing and caring toward others,” said co-author Andrew Meltzoff, a UW professor of psychology and co-director of the Institute for Learning & Brain Sciences (I-LABS). “Kids learn almost by osmosis how people act in a community or culture. The human values they learn are more ‘caught’ than ‘taught.’”
In the study, the AI agents were given the data of the participants playing a modified version of the video game Overcooked, in which players work to cook and deliver as much onion soup as possible. Players could see into another kitchen where a second player had to walk farther to accomplish the same tasks, putting them at an obvious disadvantage. Participants didn't know that the second player was a bot programmed to ask the human players for help. Participants could choose to give away onions to help the bot, but at the personal cost of delivering less soup.
Researchers found that, overall, people in the Latino group chose to help more than those in the white group, and the AI agents learned the altruistic values of the group they were trained on. When playing the game, the agent trained on Latino data gave away more onions than the other agent.
To see if the AI agents had learned a general set of values for altruism, the team conducted a second experiment. In a separate scenario, the agents had to decide whether to donate a portion of their money to someone in need. Again, the agents trained on Latino data from Overcooked were more altruistic.
“We think that our proof-of-concept demonstrations would scale as you increase the amount and variety of culture-specific data you feed to the AI agent. Using such an approach, an AI company could potentially fine-tune its model to learn a specific culture's values before deploying its AI system in that culture,” Rao said.
Additional research is needed to know how this type of IRL training would perform in real-world scenarios, with more cultural groups, competing sets of values, and more complicated problems.
“Creating culturally attuned AI is an essential question for society,” Meltzoff said. “How do we create systems that can take the perspectives of others into account and become civic-minded?”
A UW research engineer in the Allen School and a software engineer at Microsoft, who completed this research as a UW student, were co-lead authors. Other co-authors include a scientist at the Allen Institute who completed this research as a UW doctoral student; an assistant professor at San Diego State University, who completed this research as a postdoctoral scholar at UW; and a professor in the Allen School at UW.
For more information, contact Rao at rao@cs.washington.edu.