



It's been an interesting week. Rand Fishkin published an article about a set of documents that were shared with him. These documents include documentation for making API calls to Google's Cloud Content Warehouse. Naturally, the speculation is that they will help us learn a lot about Google's search system.

Here is the documentation:

https://hexdocs.pm/google_api_content_warehouse/0.4.0/api-reference.html#module

There is also an even shorter version, 5.0.

This story is shrouded in mystery. Here is a video from someone who contacted Rand. Here are Mike King's thoughts on what we can learn from these documents. Every SEO should read them.

I believe there are many questions and a lot to learn about these files, and I intend to publish a blog post series as I learn and dig deeper each day.

These are not ranking algorithms, but we might be able to learn something about rankings by studying them.

The file contains two lists. The first contains attributes, including some interesting ones. We'll look more into these in the future (not an AI, really, just me), but first we need to understand what they are and speculate on how, or if, they might be used for ranking.

The second documentation list contains information about thousands of modules that provide instructions to help developers connect to specific APIs (Application Programming Interfaces) on Google Cloud Platform.

Google Cloud Platform is a set of services that enables businesses to use Google's infrastructure and machine learning models.

The file is called Google_API_Content_Warehouse.

So I Googled contentwarehouse and found this document: contentwarehouse is a warehouse of documentation that developers use to connect to Google's AI.

This leads to some important conclusions and is perhaps a good place to end the first part of our investigation.

These documents are not the code used by Google's systems, but are intended to assist developers building with Google's AI on Cloud Platform.

Still, I think it's worth spending more time on.

Why do I think it is important to study these documents?

In April 2024, Google announced at its Cloud Next keynote that its AI technology will enable businesses to build with more confidence and more accuracy. Tools built to use Gemini can now be grounded in Google Search. This is important because grounding makes it less likely that AI will fabricate information.

Companies can now develop with Gemini on Google Cloud Platform to create products based on their own data and on Google Search.

So that brings up a question I hope to answer by the end of this series.

Are the attributes listed in these API files the ones used in Google's search ranking algorithms? If so, how is this information handled?

My take on it is that the attributes are all things Google can use in its calculations. Ranking is about using math to predict what will be useful to a searcher. It started with PageRank, and over time Google learned to use more signals than just links and to do more calculations with its machine learning algorithms. As these machine learning systems learn, they adjust the weight they give to each signal (attribute?) they use.
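To make that idea concrete, here is a tiny Python sketch of what "adjusting the weight given to each signal" can look like in the simplest possible case. This is my own toy illustration, not anything from the documents and not how Google ranks pages. The signal names (`links`, `good_clicks`), the numbers, and the `score` and `update_weights` functions are all made up for the example. It assumes a plain weighted sum and one step of gradient descent on squared error.

```python
from typing import Dict

# Toy example only. Signal names and values are hypothetical and do not
# come from the Content Warehouse documentation or from Google.


def score(signals: Dict[str, float], weights: Dict[str, float]) -> float:
    """Return the weighted sum of signal values (missing weights count as 0)."""
    return sum(weights.get(name, 0.0) * value for name, value in signals.items())


def update_weights(
    weights: Dict[str, float],
    signals: Dict[str, float],
    target: float,
    learning_rate: float = 0.1,
) -> Dict[str, float]:
    """Take one gradient-descent step on squared error between score and target."""
    error = score(signals, weights) - target
    new_weights = dict(weights)  # copy so the caller's weights are not mutated
    for name, value in signals.items():
        new_weights[name] = weights.get(name, 0.0) - learning_rate * error * value
    return new_weights


if __name__ == "__main__":
    signals = {"links": 1.0, "good_clicks": 0.5}   # hypothetical page signals
    weights = {"links": 0.5, "good_clicks": 0.0}   # hypothetical starting weights
    target = 1.0                                   # pretend "this page was useful"

    before = score(signals, weights)
    weights = update_weights(weights, signals, target)
    after = score(signals, weights)
    print(f"score before: {before:.4f}, after one update: {after:.4f}")
```

After one update the score moves toward the target, because each weight is nudged in proportion to how much its signal contributed to the error. Real systems are far more complex than this, but the basic idea of learned weights on signals is the same one I'm describing above.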

I think we could learn a lot more by studying the attributes. There are a few Navboost-related ones mentioned, like navquery, which uses Navboost query data. In fact, there could be multiple blog posts just looking at the mentions of Navboost. Clicks (including badClicks) are mentioned too. There's PagerankWeight, which is the weight stored in the PageRank link map. There's also AnchorSpamPenalizer!?! And there's some interesting information about the quality raters.

Stay tuned! As I learn more, I'll continue this series. (I'll add a link here when part 2 is ready.)

Marie

(I document everything I find interesting and important about this and other topics related to rankings and AI in Marie's Notes.)

