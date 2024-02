A heartbreaking analysis by independent product review site HouseFresh explains why. This site is dedicated to air purifier reviews and spends hundreds of hours testing each air purifier.

But after Google changed its SEO rules, HouseFresh found itself overwhelmed by large media brands, some owned by private equity firms, publishing reviews that weren't reviews at all. These are just SEO-compliant filler designed to attract affiliate marketing revenue.

The result is recommendations of poorly performing products that have not been independently tested, including some from defunct companies. Bad drives out good.

These spammy and fraudulent sites are now being fed back into AI models as training data. Chatbots need high-quality information, and that doesn't come for free.

So maybe Google could use AI itself, rather than the real web, to create nice, clean synthetic training data? Oh, that's even worse.

If you train an AI on the output it produces, the model breaks down. Research shows that such models converge on homogeneous results, with differences and diversity stripped out.

For example, train a model on a varied set of faces of all ages and races, then retrain it repeatedly on nothing but its own generated output. By the third run, almost all faces that were non-white, under 25, or over 40 had disappeared. By the fifth run, the output had converged on a small number of nearly identical faces.

In the researchers' words, the modes of the synthetic data drift and collapse around a single (high-quality) image before merging.
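The loss of diversity described above can be illustrated with a toy simulation. This is only a sketch under loud assumptions: the "model" here is nothing more than the empirical distribution of its training set, far simpler than the image generators the researchers studied, and the function name `retrain_on_own_output` and the 50 labelled "faces" are hypothetical, chosen purely for illustration.

```python
import random


def retrain_on_own_output(population, generations, seed=0):
    """Repeatedly replace the training set with samples drawn from it.

    Assumption: the "model" is just the empirical distribution of its
    training data, so "generating" means sampling with replacement from
    whatever it was trained on. Returns the number of distinct items
    present at each generation, starting with the original data.
    """
    rng = random.Random(seed)
    data = list(population)
    distinct_counts = [len(set(data))]
    for _ in range(generations):
        # "Train" on the current data, then "generate" a same-sized
        # synthetic set that becomes the next generation's training data.
        data = rng.choices(data, k=len(data))
        distinct_counts.append(len(set(data)))
    return distinct_counts


# Hypothetical starting set: 50 distinct "faces", labelled 0..49.
counts = retrain_on_own_output(range(50), generations=30)
print(counts)
```

In this setup diversity can only fall: an item that fails to be sampled in one generation has no way of reappearing in a later one, so the count of distinct items shrinks from run to run.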

Others have reproduced this phenomenon, and it is not unique to images. We recently talked about how AI is making the world less interesting.

Google failed to heed a simple business adage. When signing a contract with a supplier, it's wise to leave something on the table, especially if you value their products and want to do business with them in the future.

Giving your suppliers some leeway allows them to invest and hopefully thrive. But Google has rarely treated its information supply chain with such care or thought. It thought of them less as suppliers and more as a mining site.

Now Google's AI needs high-quality data, and that data can't be scraped for free. To misquote Oscar Wilde, it takes a heart of stone not to enjoy the irony.

