



Success online usually hinges on one factor that matters more than anything else: your website’s ranking in Google search.

For the past few decades, an entire industry known as Search Engine Optimization, or “SEO,” has revolved around trying to crack the code to boost the ranking of specific pages for various keyword search queries on Google.

This week, that “code” — or more specifically, the secrets behind Google's search engine algorithm — was leaked.

“We've never seen a breach of this magnitude or detail reported from Google's search division in the last quarter century,” said Rand Fishkin, CEO of SparkToro and a longtime SEO influencer.

Fishkin has worked in the industry for many years and founded the well-established SEO company Moz. That long history in SEO is likely what led an anonymous source to send him an internal Google document called “Content API Warehouse.” The 2,500-page document details a wealth of previously unknown or unconfirmed information about how Google ranks websites in its search engine.

Upon receiving the leak, Fishkin and several other SEO and digital marketing leaders set out to verify the document. After examining it, they determined that the leak was genuine. Google did not initially acknowledge the leak's legitimacy, but Fishkin revealed that a Google employee had contacted him asking him to change some of the details he had posted about the document.

Late Wednesday, Google confirmed to The Verge in an email that the document was indeed authentic.

The document contains a lot of technical information and seems more geared towards developers and technical SEO professionals than the general public or content creation SEO professionals, but there are some very interesting details that everyone can take away from this leak.

Google appears to use Chrome to rank pages

This is especially interesting because Google has previously denied using Chrome to rank websites.

According to documents analyzed by experts like Fishkin, Google appears to track how often users click on web pages in its Chrome web browser in order to select which of a website's pages to include in its sitelinks for search queries.

So while it doesn't appear that Google uses this information to determine site-wide rankings, analysts speculate that the company uses Chrome activity to decide which internal pages to show in search results below a website's homepage.

Google seems to be tagging “small personal” sites for some reason

SEO expert Mike King of iPullRank flagged this detail, which raises more questions than it answers.

According to an analysis of Google's internal documents, the company specifically flags “small personal websites.” It's unclear how Google determines which websites are “small” or “personal,” and there's no information on why Google is tagging websites this way. Is this to help those websites rank higher in searches, or to lower their rankings?

At this time its purpose remains a mystery.

Clicks are important

This is another point that SEO experts have long speculated about and Google has denied for years. Once again, it looks like the experts were right.

It turns out that Google relies on user clicks to determine search rankings much more than previously thought.

NavBoost is a Google ranking factor that uses click data to improve search results. King noted that NavBoost has a “specific module that is entirely focused on click signals.” One of the main factors that determines a website's ranking for a search query is the balance of short clicks vs. long clicks, i.e., how long a user stays on a page after clicking a link from a Google search.
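To make the short-click vs. long-click idea concrete, here is a minimal Python sketch. It is not Google's implementation: the 30-second cutoff, the `Click` record, and the `long_click_share` helper are all hypothetical, since the leaked document does not spell out thresholds or formulas.

```python
from dataclasses import dataclass

# Hypothetical cutoff: the leak does not state where "short" ends and "long" begins.
LONG_CLICK_SECONDS = 30.0


@dataclass
class Click:
    url: str
    dwell_seconds: float  # time spent on the page before returning to the results


def long_click_share(clicks):
    """Return, for each URL, the fraction of its clicks at or above the cutoff."""
    totals = {}
    longs = {}
    for click in clicks:
        totals[click.url] = totals.get(click.url, 0) + 1
        if click.dwell_seconds >= LONG_CLICK_SECONDS:
            longs[click.url] = longs.get(click.url, 0) + 1
    return {url: longs.get(url, 0) / count for url, count in totals.items()}


# Made-up sample data for illustration only.
sample_clicks = [
    Click("a.example/page", 4.0),    # short click: the user bounced quickly
    Click("a.example/page", 95.0),   # long click: the user stayed
    Click("b.example/page", 120.0),  # long click
]
shares = long_click_share(sample_clicks)
```

In this toy model, a page whose visitors mostly stay would score closer to 1.0 than a page that visitors abandon within seconds. How Google actually weighs such a signal is not known from the leak.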

Exact match domains can hurt your search rankings

If you've ever seen a domain name that strings together multiple keywords with dashes, like used-cars-for-sale.net, it's likely that at least part of the reason is SEO. It's long been believed among domain investors and the digital marketing community that Google values exact match domain names.

However, this isn't necessarily true – in fact, exact match domains can actually have a negative impact on your rankings.

Nearly 10 years ago, Google announced that exact match domain names, once favored by their algorithm, would no longer be valued highly as a ranking tool. However, this leak provides evidence that there are mechanisms in place to actively demote these websites in Google Search. It turns out that Google equates many of these types of domains with keyword stuffing practices. The algorithm considers these types of URLs as potential spam.
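As a rough illustration of what treating such a domain as keyword stuffing could look like, the Python sketch below flags domains made up of several dash-separated words. The `looks_keyword_stuffed` function and its three-word threshold are assumptions made for this example and do not come from the leaked document.

```python
def looks_keyword_stuffed(domain, min_words=3):
    """Flag domains built from several dash-separated words (illustrative heuristic)."""
    # Drop the final label, e.g. ".net", and keep the rest of the name.
    name = domain.lower().rsplit(".", 1)[0]
    # Split on dashes and ignore empty pieces from doubled dashes.
    words = [word for word in name.split("-") if word]
    return len(words) >= min_words


flag_dashed = looks_keyword_stuffed("used-cars-for-sale.net")
flag_brand = looks_keyword_stuffed("mashable.com")
```

Under this heuristic, used-cars-for-sale.net is flagged and a single-word brand domain is not. A real spam classifier would presumably rely on far more signals than the domain string alone.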

Topic whitelist

According to the document analysis, Google has a whitelist for certain topics, meaning that websites that appear in Google Search for these types of search queries must be manually approved rather than surfaced by the usual algorithmic ranking factors.

Some topics aren't all that surprising: websites containing content related to COVID information and political questions, especially election information, are whitelisted.

However, there is also a whitelist for travel websites. The exact purpose of this whitelist is unclear. SEO experts suggest that it may be related to travel sites that appear in certain Google travel tabs or widgets.

Google “lied”

The leaked documents have allowed Fishkin, King and other SEO experts to confirm or debunk a significant number of SEO theories, and it's clear to them that for years Google hasn't been telling the whole truth about how its search algorithm works.

“'Lied' is a harsh word, but it's the only accurate word that can be used here,” King wrote in his own analysis of the Google Content API Warehouse documents.

“While I don't necessarily blame Google representatives for protecting their company's proprietary information, I do take issue with the company's efforts to actively discredit people in marketing, technology and journalism who have published reproducible findings,” he said.

As industry experts continue to sift through this voluminous document, even more intriguing details hidden in Google's search algorithms may soon emerge.

A Google representative declined Mashable's request for comment.

Update: May 30, 2024 at 10:52 a.m. EDT Google has since confirmed the legitimacy of the leaked documents. This article has been updated to reflect this information.

