



One of the latest chapters in the Facebook Files shows that Facebook’s AI often fails to detect harmful content, despite significant investment and technological improvements. Perhaps most surprisingly, Facebook employees estimated that 99.4% of content that violated the company’s policies against violence and incitement remained on the platform. AI detection struggles with this difficult task, yet it is essential, because there is far too much content to monitor manually. As Facebook’s senior engineers put it, the company does not have, and probably never will have, a model that captures even most integrity harms.

This content moderation problem is not unique to Facebook. It plagues all large social media platforms. However, at least when it comes to misinformation, the recent focus on content moderation distracts us from something important. In addition to detecting misinformation on social media, AI can be a tool for defunding misinformation and preventing it from reaching social media in the first place. So far, it has not been used as effectively as it could be for this second purpose.

The reason there is so much dangerous misinformation is, unfortunately, that it is very profitable. Fake news produces real clicks, and those clicks bring in real money in the form of advertising revenue. If this advertising revenue dries up, so will a lot of misinformation. We need to make it harder for publishers of misinformation to host online ads.

Google is the largest digital advertising company in the world. Like Facebook and Twitter, Google displays ads on its own sites, but Google also acts as an intermediary between advertisers and independent sites that want to host ads. To do this, Google runs online auctions that algorithmically place ads on more than 2 million non-Google sites, collectively called the Google Display Network (GDN). These GDN sites are paid by advertisers to host ads, based on the number of times each ad is viewed, and they share a portion of this payment with Google.

The Global Disinformation Index (GDI), a nonprofit, nonpartisan U.K. organization, estimated that disinformation sites generated about $250 million in advertising revenue in 2019, of which Google accounted for about 40%. A Google spokesperson said the GDI report in question was fundamentally flawed because it did not define what should be considered disinformation. This methodological criticism from Google did not dissuade more than a dozen leaders of top U.S. philanthropic groups from sending a letter to Google CEO Sundar Pichai expressing concern about the issues revealed in the GDI report. More recently, NewsGuard Technologies, a technology company that evaluates news sources, estimated that annual advertising revenue for misinformation (a broader category than disinformation) exceeds $2.5 billion.

NewsGuard identified more than 150 sites that published falsehoods and conspiracy theories about the 2020 presidential election between Election Day and Inauguration Day. It found that 80% of these sites, including One America News Network and the Gateway Pundit, received ad placements from Google. NewsGuard also found some examples of reputable medical institutions inadvertently advertising on, and thereby funding, sites that published harmful medical misinformation.

Google has policies that prohibit certain types of content within GDN. These include several categories of misinformation about elections and health, and on Nov. 8, misinformation denying climate change was added to the list. Google’s first step is to remove ad hosting privileges from individual pages that violate its policies. It resorts to site-wide demonetization only in cases of persistent and egregious violations.

AI is used to facilitate and scale the efforts of human moderators, because the sheer volume of ad delivery requires a semi-automated approach to detecting policy violations. In 2020, Google demonetized 1 billion pages for policy violations. This is an astronomical number, but as noted above, one revelation from the Facebook Files is that however much bad content AI catches, there is always much more that it misses. For example, Facebook estimates that it removes only 3 to 5% of the hate speech that is banned on the platform. This is because any automated AI detection system must be highly confident before it removes content without human verification.

An AI system cannot determine with certainty whether a piece of content violates a policy. Instead, it estimates the probability of a policy violation, and if this score exceeds a certain threshold, action is taken. Action thresholds are usually set very high to keep the number of false positives low. That’s why investigations such as those by GDI and NewsGuard have found so much misinformation that slips through Google’s detection system, even though it explicitly violates Google’s ad hosting policies. If the threshold is set at, say, 90%, content with a score of 89% is treated as safe.
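The all-or-nothing behavior described above can be sketched in a few lines. This is only an illustration, not Google's or Facebook's actual system: the function name, the page names, and the scores are hypothetical, and the 90% threshold is simply the example figure used in the text.

```python
# Illustrative sketch only. All names and numbers are assumptions,
# except the 90% threshold, which is the example used in the text.
ACTION_THRESHOLD = 0.90


def triggers_action(violation_score: float, threshold: float = ACTION_THRESHOLD) -> bool:
    """Return True only when the estimated violation probability exceeds the threshold."""
    if not 0.0 <= violation_score <= 1.0:
        raise ValueError("violation_score must be a probability between 0 and 1")
    return violation_score > threshold


# Hypothetical pages with hypothetical violation scores.
scores = {"page_a": 0.97, "page_b": 0.89, "page_c": 0.12}
decisions = {page: triggers_action(score) for page, score in scores.items()}

for page, acted_on in decisions.items():
    print(page, "demonetized" if acted_on else "treated as safe")
```

Under these assumptions, the page scoring 89% is handled exactly like the page scoring 12%: neither crosses the line, so neither faces any consequence.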

There is a painfully obvious fix. These scores should also be used as a penalty in the algorithmic auctions Google runs, in which GDN sites bid for ad placements.

Even the much-maligned Facebook knows that this is how to use AI. Facebook removes only 3 to 5% of hate speech, but it has significantly reduced the amount of hate speech users encounter (what Facebook calls the prevalence of hate speech). This is because it uses the probability of a policy violation as a penalty in its News Feed algorithm. Even when the AI is not confident enough for a post to be removed outright, the more likely the post is to be hateful, the less often it is shown. Facebook still has a long way to go, but at least with this downranking approach, it is using AI effectively to keep users from seeing so much bad content on the platform.
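To make the contrast with a hard threshold concrete, here is a minimal sketch of using the violation score as a graduated penalty in a ranking. Everything in it is an assumption made for illustration: the candidate names, the base scores, and especially the multiplicative penalty formula, which is one simple choice among many and is not taken from any platform's documentation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    base_score: float       # hypothetical ranking or auction score before any penalty
    violation_score: float  # estimated probability of a policy violation, in [0, 1]


def penalized_score(candidate: Candidate) -> float:
    """Assumed penalty form: scale the base score by the probability the content is fine."""
    return candidate.base_score * (1.0 - candidate.violation_score)


def rank(candidates: list[Candidate]) -> list[Candidate]:
    """Order candidates from highest to lowest penalized score."""
    return sorted(candidates, key=penalized_score, reverse=True)


# Hypothetical candidates. The borderline one has the highest base score,
# but its 0.89 violation score sits just under a 90% removal threshold.
candidates = [
    Candidate("reliable_site", base_score=1.0, violation_score=0.02),
    Candidate("borderline_site", base_score=1.5, violation_score=0.89),
    Candidate("mixed_site", base_score=1.2, violation_score=0.40),
]

ranking = [c.name for c in rank(candidates)]
print(ranking)
```

With a threshold alone, the borderline candidate would win on its base score. With the assumed penalty, it drops to last place even though no removal decision was ever made, which is the effect the downranking approach relies on.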

Google’s task is much easier than Facebook’s. Facebook has to score individual pieces of content, while Google can use a site’s track record to assign penalties. In fact, Google already does this. It frequently touts its efforts to elevate quality journalism in its search rankings by demoting sites with a proven track record of publishing misinformation. This effort has been somewhat successful, which is partly why, although both Facebook and Google were pilloried in 2016 for their roles in spreading misinformation, today we mostly hear about Facebook’s failures. For example, in 2020 Google thankfully did not repeat its embarrassing 2016 blunder, when the top result for the search phrase “final election results” was a WordPress blog that falsely claimed Trump had won the popular vote.

There’s no reason Google can’t apply in its ad auction and delivery algorithms the same kind of demotion it already applies in its search ranking algorithm to sites that publish misinformation. Instead, Google relies on a porous all-or-nothing approach to detecting misinformation on GDN while sitting idly on a proven method that could help a great deal.

This is the height of hypocrisy. In the private shadows of ad serving, Google is knowingly funneling money to exactly the same misinformation sites that it proudly fights in its public-facing search system.

Asked for comment, a Google spokeswoman wrote that the company has strict publisher policies that explicitly prohibit a wide range of unreliable claims and misinformation. When it finds content that violates these policies, she wrote, it blocks or removes ads from serving, and depending on the nature and pervasiveness of the violation, this may happen at the level of an individual page or site-wide. These strict policies are undermined by Google’s needlessly anemic, all-or-nothing approach to enforcing them.

We need to keep pressing Facebook to fix the engagement-driven algorithms that undermine democracy. We also need much more transparency in order to better understand these algorithms and hold Facebook accountable for the harm they cause. But anti-Zuckerberg hysteria shouldn’t distract us from Google’s quiet role in financially supporting much of the harmful content posted on Facebook.

To give Google some credit, it did eventually discontinue an earlier option that let GDN sites remain anonymous to advertisers (a common choice for misinformation sites). It also recently added an option for advertisers to import dynamically updated exclusion lists of host sites curated by third parties. This helps advertisers avoid misinformation sites when they are willing to do so.

Google is the largest ad distributor and bears a lot of responsibility here, but it isn’t the only one funding misinformation. We need Congress to regulate the dangerous and dysfunctional digital advertising market. It should require all online ad distributors to disclose their efforts to ban and downrank misinformation sites. This would provide at least some public accountability, and it would help advertisers choose healthier places to put their ads, regardless of which ad distributor they use.

Thanks to Frances Haugen and her whistleblowing, we can see how much harmful content Facebook hosts and how little the company has done to stop it. What we haven’t focused on enough is who profits from creating this harmful content in the first place, and how to stop it before it spreads and becomes a technical and free-speech nightmare.

