OpenAI Offers Million-Dollar Deals to News Publishers for AI Training Content

TL;DR:

  • OpenAI is offering news publishers between $1 million and $5 million to license content for AI training.
  • The actual value depends on the terms of the agreements.
  • Publishing giant Axel Springer partnered with OpenAI in a deal estimated at tens of millions of dollars.
  • Apple Inc. is reportedly investing around $50 million in agreements with various publications for generative AI development.
  • Several media companies, including The New York Times, CNN, Reuters, and Vox Media, have blocked OpenAI’s access to their data.
  • Legal actions have been taken against OpenAI and Microsoft for alleged copyright infringement in AI training.
  • The cost of training Large Language Models is on the rise, reflecting the challenges in acquiring training data.

Main AI News:

OpenAI is extending offers ranging from $1 million to as much as $5 million to news publishing firms. Its objective? To secure content for training its Large Language Models (LLMs). While these figures might appear modest for the company behind ChatGPT, the true value hinges on the specifics of the agreements.

This development comes from a report by The Information, which cites two industry executives familiar with the negotiations. OpenAI is currently in discussions with approximately twelve media companies, signaling its intention to bolster the capabilities of its language models.

In a partnership announced in December 2023, OpenAI joined forces with Axel Springer, a publishing giant whose portfolio includes prominent brands like Business Insider and Politico. Although the precise financial details remain undisclosed, our sources suggest that the deal is worth tens of millions of dollars.

Meanwhile, Apple Inc. has entered the race to develop generative AI and is not holding back. It has reportedly secured agreements with prestigious publishers, including Condé Nast (home to Vogue and The New Yorker), NBC News, and IAC (owner of The Daily Beast and Better Homes and Gardens). These agreements reportedly total around $50 million. Apple’s willingness to spend at this scale stems from its ambition to use the content more expansively than OpenAI, whose objectives are more specific.

However, the path to training generative AI is fraught with challenges. Several major media companies have taken measures to protect their intellectual property. The New York Times, CNN, Reuters, and Vox Media, among others, have recently barred OpenAI’s web crawler, GPTBot, from accessing their content. In December 2023, The New York Times filed a lawsuit against both OpenAI and Microsoft Corp., alleging illegal use of its copyrighted material in the training of AI models.

Reddit Inc. took a similar protective stance last year, clamping down on companies that use its content for AI training. Additionally, authors who have seen their work potentially exploited without consent have initiated legal action against AI developers. As a result, the cost of training Large Language Models is rising, reflecting the increasingly complex landscape of AI training data acquisition.

Conclusion:

The market for AI training data is becoming increasingly complex and litigious. OpenAI’s offers to news publishers and its partnership with Axel Springer demonstrate the growing importance of securing high-quality content for AI development. Apple’s substantial investments underline how competitive this market has become. However, lawsuits and data access restrictions show that AI developers cannot sidestep the ethical and legal questions around training data.
