The University of Washington electrical engineering research team includes Baicen Xiao, Radha Poovendran and Hossein Hosseini.

University of Washington researchers have shown that a Google tool that uses machine learning to automatically analyze and label video content can be deceived by inserting a photograph into videos periodically and at a very low rate. After they inserted an image of a car into a video about animals, for instance, the system returned results suggesting the video was about an Audi.

Google released its Cloud Video Intelligence API to help developers build applications that can automatically recognize objects in videos. Automated video annotation would be a breakthrough technology, helping law enforcement efficiently search surveillance videos, sports fans instantly find the moment a goal was scored or video hosting sites weed out inappropriate content.

Google launched a website that allows anyone to select a video for annotation. The API quickly identifies the key objects within the video, detects scene changes and provides shot labels of the video events over time. The API website says the system can be used to “separate signal from noise, by retrieving relevant information at the video, shot or per frame” level.

The University of Washington electrical engineers and security researchers, including doctoral students Hossein Hosseini and Baicen Xiao and professor Radha Poovendran, demonstrated that the API can be deceived by slightly manipulating the videos. They showed that one can subtly modify a video by inserting an image into it, so that the system returns only the labels related to the inserted image.

The same research team has shown that Google’s machine-learning-based platform designed to identify and weed out comments from internet trolls can be easily deceived by typos, misspelled offensive words or unnecessary punctuation.

“Machine learning systems are generally designed to yield the best performance in benign settings. But in real-world applications, these systems are susceptible to intelligent subversion or attacks,” said senior author Radha Poovendran, chair of the University of Washington electrical engineering department and director of the Network Security Lab. “Designing systems that are robust and resilient to adversaries is critical as we move forward in adopting the AI products in everyday applications.”

As an example, a screenshot of the API’s output is shown below for a sample video named “animals.mp4,” which is provided by the API website. Google’s tool does indeed accurately identify the video labels.

The researchers then inserted the following image of an Audi car into the video once every two seconds. The modification is hardly visible, since the image is added only once every 50 video frames at a frame rate of 25 frames per second.

Still image of car the research team inserted into the wildlife video
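The insertion schedule described above can be illustrated with a short sketch. It is not the researchers' code: frames are stood in for by plain strings, and the function name `insert_periodically`, the 10-second clip length and the labels are assumptions made only for this example. It shows how sparse the modification is when one image is added after every 50 original frames of a 25-frames-per-second video.

```python
FPS = 25      # frame rate stated in the article
PERIOD = 50   # one inserted image per 50 original frames


def insert_periodically(frames, adversarial_frame, period=PERIOD):
    """Return a new list with adversarial_frame added after every `period` original frames."""
    modified = []
    for index, frame in enumerate(frames, start=1):
        modified.append(frame)
        if index % period == 0:
            modified.append(adversarial_frame)
    return modified


# Hypothetical 10-second clip; each string stands in for one video frame.
original = ["animal"] * (FPS * 10)
modified = insert_periodically(original, "car")

inserted = modified.count("car")
seconds_between = PERIOD / FPS           # time between inserted images
fraction = inserted / len(modified)      # share of frames that show the car

print(f"{inserted} inserted frames, one every {seconds_between} s, "
      f"{fraction:.1%} of all frames")
```

In this toy clip only a small share of the frames show the car, which is consistent with the article's point that the change is hard for a viewer to notice.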

The following figure shows a screenshot of the API’s output for the manipulated video. As seen below, the Google tool believes with high confidence that the manipulated video is all about the car.

“Such vulnerability of the video annotation system seriously undermines its usability in real-world applications,” said lead author and University of Washington electrical engineering doctoral student Hossein Hosseini. “It’s important to design the system such that it works equally well in adversarial scenarios.”

“Our Network Security Lab research typically works on the foundations and science of cybersecurity,” said Poovendran, the lead principal investigator of a recently awarded grant in which adversarial machine learning is a significant component. “But our focus also includes developing robust and resilient systems for machine learning and reasoning systems that need to operate in adversarial environments for a wide range of applications.”

The research is funded by the National Science Foundation, Office of Naval Research and Army Research Office.

For more information, contact Poovendran at chair@ee.washington.edu.