AI Isn't Smart Enough (Yet) to Spot Horrific Facebook Videos

But it's getting there.

When Steve Stephens uploaded a 57-second video to Facebook of himself shooting and killing a man Sunday, the video stayed on Stephens' Facebook page for more than 2 hours before the company finally pulled it down. It was enough time for thousands of people to watch and share it on Facebook, and for third-party websites to download and reupload the video to their own servers. The incident reignited a fierce, if familiar, debate about what social media companies can do to keep gruesome content off of their sites and how these companies should go about removing offensive material. The murder also reminded us that once something hits the internet and gets shared around, it's incredibly difficult to scrub it from every corner of the web. So how much should companies do to prevent that content from appearing at all?

The best way to prevent a graphic video from being seen is to never let it be uploaded in the first place. Facebook could take steps to prevent just that. It could insist that someone (or something) watch every single video you try to post and allow it to be uploaded only after it's been approved. But if you had to wait for Facebook's approval of your video of a cat on a vacuum, you'd just post that video somewhere else. Facebook would alienate a large constituency of people who want the ability to immediately and easily share their lives. And Facebook can't afford that.

Others suggest Facebook simply delete offensive videos as soon as they're published, but there's one problem: it's not technically feasible to immediately pinpoint and delete graphic material. The technology isn't ready for algorithms to do it automatically, and it's impractical to hire enough humans to do it manually. If Facebook gave an algorithm the permission to pull down videos, it would inevitably make mistakes. And even if the algorithm got it right according to Facebook's terms of service (a big "if"), the company would be accused of censorship. That would have a chilling effect, because who would want to deal with the possibility of an algorithm wrongly deleting their videos? No one. Again, not something Facebook can afford.

Which is why right now, Facebook mounts a multi-pronged defense. The front line is you, the Facebook user, whom Facebook relies on to watch---and flag---videos like Stephens'. Backing you up in this task is some amount of AI, which can look out for things like videos whose ID is already known to be associated with child porn. When videos are flagged, they are sent to Facebook's content moderators, a cavalry of thousands of humans whose job is to watch hours of footage and determine whether it should be deleted. This system is imperfect, but human moderators remain smarter than AI---for now.
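In rough code terms, that "known ID" check is just a lookup: uploads whose fingerprint matches a database of previously identified material get flagged automatically, while everything else is published and waits on user reports. Here is a minimal Python sketch of the idea; the hash list, function names, and exact-match fingerprinting are illustrative assumptions, not Facebook's actual system, which relies on perceptual hashing rather than a raw checksum.

```python
import hashlib
from collections import deque

# Hypothetical fingerprint blocklist for previously identified material.
# Real systems use perceptual hashes (PhotoDNA-style) rather than a raw
# checksum, so that re-encoded copies still match.
KNOWN_BAD_FINGERPRINTS = {
    "3a7bd3e2360a3d29eea436fcfb7e44c735d117c42d1c1835420b6b9942dd4f1b",
}

review_queue = deque()  # items waiting for a human moderator


def fingerprint(video_bytes: bytes) -> str:
    """Fingerprint an upload (exact-match SHA-256 here, for simplicity)."""
    return hashlib.sha256(video_bytes).hexdigest()


def handle_upload(uploader: str, video_bytes: bytes) -> str:
    """Auto-flag matches against known material; publish everything else."""
    if fingerprint(video_bytes) in KNOWN_BAD_FINGERPRINTS:
        review_queue.append((uploader, "auto-flag: matched known content"))
        return "held for review"
    return "published"


def handle_user_report(uploader: str, reason: str) -> None:
    """A user flag lands in the same human review queue."""
    review_queue.append((uploader, f"user report: {reason}"))
```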

Eventually, though, AI will be able to effectively flag videos like the one uploaded Sunday, and when that day comes, it will be the realization of the promise that AI can work with humans---rather than replace them---to augment their skills. "I don't think there is a task that, with enough training data, would not be possible to do, frankly," says Yann LeCun, director of AI research at Facebook. Though LeCun declined to answer questions about this particular video and how to fight content like it, what he's saying is that soon AI will be able to do more. It's not a matter of if Facebook will be able to use AI to monitor video in real time and flag a murder, but of when.

Not Ready For Prime Time

In an ideal world, here's how Facebook would have handled Stephens' video: When he first uploaded a video of himself saying he intended to kill people, AI-powered software would have "watched" it immediately and flagged it as a high priority. That flag would have alerted Facebook's team of human moderators, who would have watched it, seen the direct and dire threat, removed the video, shut down Stephens' account, and alerted authorities.
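Mechanically, that ideal flow is a triage queue: machine-generated flags carry a priority, and the most urgent items reach human moderators first. The sketch below illustrates the pattern; the priority values and field names are invented for illustration and don't reflect how Facebook actually ranks reports.

```python
import heapq
import itertools

_tiebreaker = itertools.count()   # keeps equal-priority flags in FIFO order
moderation_queue = []             # min-heap keyed on negative priority


def flag_video(video_id: str, priority: int, source: str) -> None:
    """Queue a flagged video; a higher priority number means more urgent."""
    heapq.heappush(moderation_queue, (-priority, next(_tiebreaker), video_id, source))


def next_for_review():
    """Hand the human moderator the most urgent flagged video, if any."""
    if not moderation_queue:
        return None
    _, _, video_id, source = heapq.heappop(moderation_queue)
    return video_id, source


# Example: a machine flag for an apparent threat outranks a routine report.
flag_video("video-123", priority=10, source="ai: possible threat of violence")
flag_video("video-456", priority=2, source="user report: spam")
print(next_for_review())  # ('video-123', 'ai: possible threat of violence')
```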

That's not what happened. No one flagged the first video at all, according to a statement released yesterday by Justin Osofsky, Facebook's vice president of global operations. The second video---the one of the murder itself---wasn't flagged until more than an hour and a half after Stephens uploaded it. Once a user flagged it, Osofsky said, it took Facebook's moderators 23 minutes to take it down.

For now, though, that's how the process has to work. Artificial intelligence is not sophisticated enough to identify the risk factors in that first video, or even necessarily in the second one, which showed the murder. For AI to intervene, it would have needed to process Stephens' language; parse that speech and its intonation to differentiate it from a joke or a performance; and take the threat seriously. "There are techniques for this, but it is not clear they are integrated into the deep learning framework and can run efficiently. And there are kind of stupid mistakes that systems make because of lack of common sense," LeCun says. "Like if someone is twice the size, they are twice as close. There is common sense that all of us learn, animals learn too, that machines haven't quite been able to figure out yet."

Facebook knows it needs AI to learn this. It has invested heavily in the effort---LeCun's team is second only to Google's in advancing the field. And it already employs algorithms to help flag certain questionable content where computer vision is better suited---namely child pornography, nudity, and copyright violations. In an interview with WIRED last fall, Facebook CEO Mark Zuckerberg said that half of all flags on the network now come from AI as opposed to people. "This is an area where there are two forces that are coming together," he said. "There's this community that is helping people to solve problems on an unprecedented scale. At the same time, we're developing new technologies that augment what this community can do."

But even Zuckerberg realizes that for now, human curators must continue to work alongside AI, and the video that Stephens uploaded on Sunday is a prime example of why. At the F8 developer conference in San Jose Tuesday, Zuckerberg addressed this controversy directly. "We have a lot more to do here. We're reminded of this this week by the tragedy in Cleveland," he told the crowd. "And we have a lot of work, and we will keep doing all we can to prevent tragedies like this from happening."

Training a computer to identify that kind of violence is much harder than merely asking it to spot a naked body. It's a lot like trying to identify fake news: It requires a complex understanding of context cues and formats.

Practical Options Right Now

Since it will take time for Facebook to train its neural networks to streamline that process, in the immediate future Facebook will need to make changes to its moderation process, something the company acknowledges. In his statement after the incident, Osofsky said, "As a result of this terrible series of events, we are reviewing our reporting flows to be sure people can report videos and other material that violates our standards as easily and quickly as possible."

This will mean making it easier to flag high-priority content, adding more human moderators, and insisting they work faster. And those human moderators will have to continue training the AI, which in itself will take a long time. Before AI can be trained to effectively identify offensive content, it needs lots of examples to learn from. So the first thing it needs is a large amount of properly labeled data to use as fodder. That labeling requires hourly-wage human employees to watch endless amounts of on-screen violence and threatening language---grueling work that takes time.
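The underlying pattern is ordinary supervised learning: humans label examples, and a model generalizes from them. The toy sketch below shows that loop on short text snippets with scikit-learn; the examples and labels are made up, and a real moderation model would train on video frames, audio, and transcripts at a vastly larger scale.

```python
# Toy supervised-learning loop: humans label examples, a model learns from
# them, and borderline predictions still go to human reviewers.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Labels supplied by human reviewers: 1 = violates policy, 0 = benign.
labeled_examples = [
    ("i am going to hurt someone today", 1),
    ("graphic footage of an assault", 1),
    ("my cat riding the robot vacuum", 0),
    ("happy birthday grandma, love you", 0),
]
texts, labels = zip(*labeled_examples)

model = make_pipeline(TfidfVectorizer(), LogisticRegression())
model.fit(texts, labels)

# Score a new, unlabeled post. In practice, vast numbers of labeled examples
# are needed before scores like this become reliable.
print(model.predict_proba(["threatening to hurt people on camera"]))
```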

The challenge is even bigger when Facebook Live is taken into consideration. Live video is hard to control, which is why some people have called for Facebook to get rid of its Live feature completely. That's unrealistic; the company introduced it last year in order to compete with other live-streaming services, and it's not going anywhere. The service has also captured another side of violent incidents. Last year, after police shot Philando Castile, his girlfriend used Facebook Live to capture the aftermath of the shooting, essentially turning the streaming service into a way to send a global SOS.

"Instant video and live video are here to stay, for better or worse," according to Jeremy Littau, assistant professor of journalism and communication at Lehigh University. "And Facebook has to compete in that reality."

Short of getting rid of Live, Facebook could treat the feature the way broadcast networks do and insist that all video run on a delay. But for the reasons articulated above, that delay wouldn't be of much use unless someone or something were monitoring every video, and that's not yet possible.
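A broadcast-style delay is at least simple to describe in code: hold incoming frames in a buffer and release them only after a fixed window, leaving time to cut a stream that gets flagged. The sketch below is purely illustrative; the delay length and class design are assumptions, and, as noted above, the window only helps if something is actually watching during it.

```python
import time
from collections import deque


class DelayedStream:
    """Buffer live frames and release them only after a fixed delay,
    so a flagged stream can be cut before viewers see the held frames."""

    def __init__(self, delay_seconds: float = 30.0):  # assumed delay length
        self.delay = delay_seconds
        self.buffer = deque()   # (arrival_time, frame) pairs
        self.blocked = False

    def ingest(self, frame: bytes) -> None:
        """Accept a frame from the broadcaster unless the stream is blocked."""
        if not self.blocked:
            self.buffer.append((time.monotonic(), frame))

    def block(self) -> None:
        """Called if a moderator (or model) flags the stream mid-broadcast."""
        self.blocked = True
        self.buffer.clear()

    def release_ready(self) -> list:
        """Return frames old enough to have cleared the delay window."""
        now = time.monotonic()
        ready = []
        while self.buffer and now - self.buffer[0][0] >= self.delay:
            ready.append(self.buffer.popleft()[1])
        return ready
```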

One thing Facebook could do is make it harder to download videos from Facebook, similar to how Instagram (also owned by Facebook) works. This could keep third-party sites like LiveLeak from grabbing and redistributing videos like the one Stephens uploaded Sunday. And while a small tweak like that won't stop a video from being posted in the first place, it could prevent it from being reuploaded elsewhere and entering the memory of the internet, never to be erased.

Cade Metz contributed reporting.