Why Reddit's licensing agreement provides Google with data mining fortunes

Social media platform Reddit signed a licensing agreement with Google on Thursday, giving the search giant access to Reddit users' posts to train its artificial intelligence (AI) engine. As part of the deal, Google will pay the social news aggregator $60 million annually for access to user-generated content from the platform.

This agreement could not have come at a better time for both companies. Reddit is hoping for cash and investor love ahead of its planned initial public offering (IPO). And Google is trying to save face after its AI failures.

Although Reddit generates revenue, the company is not profitable. Its IPO documents filed with U.S. stock market regulators show 2023 revenue of $804 million, most of it from advertisers. However, the platform posted a net loss of $90.8 million.

Google's annual payment provides cash to the platform and helps it monetize its content. Additionally, a data partnership with one of the biggest names in the AI business could boost Reddit's standing ahead of its IPO. In an age of AI where chatbots are tearing apart monolithic social platforms, it gives investors a reason to see value in the platform.
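To put the payment in perspective, here is a back-of-the-envelope calculation that uses only the figures quoted above. It is a rough sketch, not a financial analysis: it assumes simple ratios, ignores any costs tied to the deal, and compares the annual payment with 2023 results even though the payment was not part of them. The variable names are illustrative.

```python
# Figures quoted in the article, in millions of US dollars.
annual_google_payment = 60.0
revenue_2023 = 804.0
net_loss_2023 = 90.8

# Assumption: plain ratios, with no adjustment for costs or timing.
share_of_revenue = annual_google_payment / revenue_2023
share_of_net_loss = annual_google_payment / net_loss_2023

print(f"Payment as a share of 2023 revenue: {share_of_revenue:.1%}")    # 7.5%
print(f"Payment as a share of 2023 net loss: {share_of_net_loss:.1%}")  # 66.1%
```

On these numbers, the deal is small next to Reddit's advertising revenue but would offset roughly two-thirds of a loss the size of the one reported for 2023.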

The licensing deal gives the Mountain View, Calif.-based company a mine of data to help it navigate the AI disaster it's currently facing.

What's wrong with Google?

Google's sporadic attempts to break OpenAI's dominance in AI have seriously hurt the search giant. The company's first AI chatbot, Bard, launched as a rival to OpenAI's ChatGPT, had flaws. Its first demo video contained a factual error. Subsequent iterations did not prove much smarter either.

More recently, the company's Gemini chatbot overcompensated for a lack of diversity by displaying inaccurate images in response to queries. Its AI-based image generator displayed an image of a Black woman in response to the query, “Who is the founding father of the United States?” In another example, Asians were depicted as Nazi-era German soldiers. Such unintelligent responses caused quite a stir.

These blunders prompted Prabhakar Raghavan, the company's senior vice president who oversees search operations, to apologize and say the product had missed the mark.

While these issues relate to large language models (LLMs) and the weights attached to tokens, another challenge facing Google is raw data. LLMs are data-hungry algorithms, so the quality of the information flowing into them is very important.

To be able to output accurate text, a generative AI (GenAI) model must first read a large amount of text. For a long time, tech companies have been free riding by scraping the web for text or using open source crawling tools to sneak into websites and retrieve data from them.

The tactic has been called into question as users and publishers push back against AI companies indiscriminately collecting data from the web. A proposed class action lawsuit in July 2023 accuses Google of misusing vast amounts of web users' personal information to train its AI models.

Separately, in December, news publisher The New York Times sued OpenAI and Microsoft for copyright infringement. The lawsuit alleges that the companies used millions of news articles to train their AI models, including the one behind ChatGPT.

Complaints like these from individuals and businesses are prompting lawmakers to step up and create policies for the ethical use of information available on the Web.

U.S. lawmakers have introduced a bill, the AI Foundation Model Transparency Act, that would require the Federal Trade Commission (FTC) and the National Institute of Standards and Technology (NIST) to set rules for reporting training data transparency for AI models. It would require builders of foundation models to disclose the sources of their training data.

If such a law were passed, AI companies would have to compensate content owners for using their data to train models. As a result, the cost of building AI models would increase. To pre-empt such laws, big tech companies are entering into licensing agreements with news publishers and other content sources. OpenAI's deal with news agency Associated Press is a case in point.

Other news organizations, including Gannett (the largest newspaper company in the U.S.) and News Corp (owner of the Wall Street Journal), are also in talks with OpenAI, according to media reports. Publications that sign deals with AI companies will receive fees based on how often their content is used.

How different is this deal?

Google's deal with Reddit comes against this backdrop. However, unlike other platforms, Reddit functions as a social news website, with content curated and promoted by its users. The platform is made up of hundreds of sub-communities, known as subreddits, where members post content that is upvoted or downvoted by other members.

Under the agreement, Google will have access to Reddit's Data API, which provides unique content in real time from a large and dynamic platform. This will help the company's AI models access behavioral and trend data. Separately, Google will continue to use crawlers to access information on the web.

However, there is one problem with Reddit. Concerns over content moderation and accessibility arose in July 2023 when Reddit decided to implement a new policy that would charge some third-party apps to access data on the platform. Several groups protested the changes proposed by Reddit. More than 8,000 subreddits went dark. At the time, these subreddit groups said the change threatened to eliminate historically important ways to customize the platform.

To avoid such conflicts this time around, Reddit is giving an unspecified number of top users, including moderators and users with high Karma scores, the opportunity to buy shares in the IPO, The Verge reports.

Reddit plans to do that through a tier-based allocation system. The first tier will consist of specific users and moderators identified as having contributed meaningfully to Reddit community programs. The second tier consists of users with a karma score of 200,000 or higher, a score that shows how much a user has contributed to the Reddit community, or moderators who have performed at least 5,000 moderator actions.

This is an unusual move since this privilege is usually given to professional investors who want to buy shares at a theoretically lower price before they are listed on an exchange. Reddit currently has approximately 267.5 million weekly active users, more than 100,000 active communities, and 1 billion total posts, according to SEC filings.

Have other platforms used user data to train AI models?

Unlike Reddit, few platforms have announced whether their users' public information will be used to train AI models. X (formerly Twitter) announced in September that it would use user posts to train AI models for purposes outlined in its policy. The policy does not specify which AI models it refers to.

Meta said user data from applications such as Facebook, Instagram, and Threads will be used to train its AI chatbot. TikTok and Snapchat have both launched AI chatbots, but neither has said whether it uses user posts to train its AI models.

Using user data to train algorithms is nothing new in the tech world. Most platforms' recommender engines use your personal usage data to suggest videos, articles, and movies. But using that information to train an AI model is new and requires caution, given that these chatbots tend to spew out personal information when responding to prompts.

A case in point is when Samsung banned the use of AI chatbots in its offices after employees used them and it was discovered that the bots spewed out trade secrets.

Sources

1/ https://Google.com/

2/ https://www.thehindu.com/sci-tech/technology/why-reddit-licensing-deal-offers-google-a-data-mine-to-push-its-luck/article67891673.ece
