Date: 2024-05-30



Google's search algorithms pretty much rule the internet. Because Google is the dominant search engine on the planet, its search rankings can make or break a website, so everyone is competing for the top spot. These algorithms are closely guarded secrets, but leaked documents purportedly shed some light on how Google search works.

SparkToro claims to have had access to over 2,500 API documents purportedly obtained from Google's internal “Content API Warehouse.” These documents appear to contain important details of Google's search algorithms. Android Authority points out that the documents don't show how search ranks different websites or how it weighs different site characteristics. However, they do seem to show what data Google actually collects to provide users with the most useful search results.

Interestingly, SparkToro claims that these documents were leaked on GitHub in March and have since been removed. The company is now working with iPullRank to figure out what these alleged APIs are for.

That's a huge glimpse into how search works. Google hasn't explicitly acknowledged the leak, but it has all but confirmed that the documents are real. In a statement to The Verge, the company noted that people shouldn't “make inaccurate inferences about searches based on out-of-context, out-of-date, or incomplete information.” In other words, the documents are real, but no longer relevant. At least, according to Google.

Google regularly offers tips and best practices to help websites improve and optimize their content for search, but it has never publicly spelled out exactly how its ranking works, likely to keep people from gaming the system. That still happens, and Google appears to update its algorithms continually to combat it.

How Google Search and Algorithms Work

Google has always claimed to encourage “human-centric content” that focuses on readers and users, not search engines. The general philosophy there is “E-E-A-T”: experience, expertise, authoritativeness, and trustworthiness. This is all pretty self-explanatory. But the leaked documents suggest that Google actually takes a different approach.

Analysis of these documents by SparkToro and iPullRank suggests a variety of factors may be at play, including domain authority, Chrome data, click counts as a measure of success, author names in bylines, and the apparent use of sandboxes to isolate new sites that have not yet earned search engine trust.


These are all elements that Google has denied using in the past. It's understandable that Google would want to keep the inner workings of its flagship product secret, but these documents suggest that the company has been intentionally misleading.

Other factors mentioned in the documents are things we already knew: freshness of content matters, as do links to other relevant content. Branding and a page's change history also come into play, and demotions can happen for other reasons, such as links that don't match their target or content that is pornographic. In short, the more of these signals work in your favor, the more likely your content is to stand out.

Still, Google remains steadfast and claims that the documents are outdated, inaccurate, or don't fully reflect how Google Search works, so we'll just have to wait and see how this story unfolds in the coming weeks.

