Conflict between Reddit and Perplexity: a case of alleged data theft for AI training

Amid growing tensions over the use of data to train artificial intelligence, Reddit has filed a high-profile lawsuit against Perplexity. Accused of illegally exploiting Reddit data, the AI search company now finds itself at the center of a heated debate on digital ethics and the ownership of online information. Here are the details of a case that could redefine the limits of data use on the internet.

3 key points to note

  • Reddit accuses Perplexity of copying its data without a license to train its AI.
  • The lawsuit was filed in New York and also names three other technology companies.
  • Reddit wants to prohibit the use of its data and is seeking financial compensation.

Reddit’s accusations against Perplexity

Reddit has filed a lawsuit in the United States, accusing Perplexity of circumventing its platform’s security measures to access its data. The complaint, filed in federal court in New York, comes amid broader scrutiny of technology companies over their unauthorized use of data.

Reddit claims that Perplexity, along with three other companies (Oxylabs, AWMProxy, and SerpApi), accessed billions of posts without permission. This data was allegedly used to feed Perplexity’s AI search engine.

The defense of Perplexity and the other companies

Perplexity has responded by describing its approach as “principled and responsible” and plans to vigorously defend itself against Reddit’s accusations. SerpApi has expressed “strong disagreement” with the allegations against it, while Oxylabs has said it is “shocked and disappointed” by Reddit’s lack of dialogue.

The companies involved appear determined to contest the allegations, which highlights the legal and ethical complexities surrounding the use of data on the internet.

Reddit and data protection

Reddit has taken an increasingly firm stance on protecting its data. Last year, the platform decided to stop allowing search engines to display its content for free. It reached an agreement with Google, reportedly worth $60 million per year, that allows Reddit posts to be displayed in search results and used to train Gemini models.

This strategy aims to protect Reddit’s rights to its data while monetizing its use by third parties, highlighting the growing importance of data ownership in the digital ecosystem.

Background and implications of the case

Reddit, founded in 2005 by Steve Huffman and Alexis Ohanian, has become one of the largest online discussion platforms, known for its diverse and dynamic communities. In 2023, the platform took steps to protect the monetization of its content, including limiting free access to it by search engines.

Perplexity, meanwhile, positions itself as an innovative company in the field of AI search engines. The company promotes an ethical and responsible approach to data use, which makes this case all the more complex. The outcome of this lawsuit could have major repercussions on how data is shared and used by technology companies in the future.
