Over 88% of top US news outlets block AI web crawlers used for chatbots and AI projects

TL;DR:

  • Over 88% of top US news outlets, including renowned publications, block AI web crawlers.
  • Right-wing media outlets such as Fox News, Daily Caller, and Breitbart have chosen not to block AI crawlers.
  • Speculation arises about whether this is a strategic move to counter perceived political bias in AI models.
  • Debates continue on the actual impact of including right-wing content in AI training data.
  • Copyright concerns and ideological differences may also help explain the split.
  • Some right-wing outlets cited unintentional oversight as the reason for not blocking AI crawlers.
  • Larger, well-resourced news outlets are more likely to block AI bots as a defensive measure.

Main AI News:

Artificial intelligence is reshaping the media landscape, and publishers are pushing back. Recent data shows that more than 88 percent of leading US news outlets block the web crawlers that AI companies such as OpenAI use to collect content for chatbots and other AI projects. Right-wing media outlets are a conspicuous exception, which raises questions about their motivations and strategies.

A Dive into the Data

Research conducted in mid-January by Originality AI, an Ontario-based AI detection startup, offers a look at this standoff. The study surveyed 44 top news websites and found that nearly all of them block AI web crawlers. Publications such as The New York Times, The Washington Post, and The Guardian, as well as general-interest magazines like The Atlantic, were among them. OpenAI’s GPTBot was the most widely blocked crawler.
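For readers curious how such a check might be run: the article does not describe Originality AI's method, but publishers usually signal crawler blocks through a site's robots.txt file. The sketch below is a minimal illustration under that assumption. The function name `blocked_crawlers`, the sample robots.txt text, and the list of crawler names (GPTBot, Google-Extended, CCBot) are illustrative choices, not details taken from the study.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content: blocks GPTBot, allows everyone else.
SAMPLE_ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

# Crawler user-agent tokens to test (an illustrative list, not the study's).
AI_CRAWLERS = ["GPTBot", "Google-Extended", "CCBot"]


def blocked_crawlers(robots_txt, crawlers=AI_CRAWLERS, url="https://example.com/"):
    """Return the crawlers that the given robots.txt text disallows for `url`."""
    parser = RobotFileParser()
    # parse() takes the file's lines; no network request is made here.
    parser.parse(robots_txt.splitlines())
    return [agent for agent in crawlers if not parser.can_fetch(agent, url)]


if __name__ == "__main__":
    print(blocked_crawlers(SAMPLE_ROBOTS_TXT))
```

To test a live site instead of sample text, the same parser can load a real file with `set_url()` and `read()`. Note that robots.txt is a request that crawlers may or may not honor, so this only shows what a publisher has asked for.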

A Striking Outlier

However, a clear exception emerged. Right-wing news outlets such as Fox News, the Daily Caller, and Breitbart, along with The Free Press, founded by pundit Bari Weiss, did not block the most prominent AI web scrapers, including Google’s data collection bot, unlike their liberal counterparts.

Unmasking the Motives

The reason for this discrepancy is unclear. One prevailing theory suggests that right-wing media may be deliberately leaving their content open to AI companies to counter perceived political bias in AI models. “AI models reflect the biases of their training data,” says Originality AI founder and CEO Jon Gillham, hinting at the potential for ideological balance.

Debates on Impact

Yet experts disagree about the impact of such a strategy. Data scientist David Rozado, creator of the RightWingGPT AI model, acknowledges that including right-wing content in AI training data could influence model parameters. By contrast, AI ethics researcher Jeremy Baum remains skeptical, citing the substantial volume of existing mainstream news data and the tendency of AI companies to maintain a neutral stance.

AI Companies’ Perspective

OpenAI spokesperson Kayla Wood emphasizes the company’s commitment to diverse training data, asserting that the inclusion or exclusion of a single news sector has negligible influence on the model’s intended learning and output.

The Copyright Conundrum

Another facet of this saga may revolve around copyright. The New York Times is currently pursuing a lawsuit against OpenAI for copyright infringement, characterizing AI data collection as unlawful. Mainstream media leaders like Condé Nast CEO Roger Lynch echo this sentiment. However, the right-wing media remains relatively silent on the matter, suggesting they may see data scraping as a protected endeavor under the fair use doctrine.

Unintentional Oversights

Some right-wing outlets attributed their non-blocking stance to simple oversight. The Washington Examiner, for instance, began blocking GPTBot only after being made aware of the option. Similarly, the Daily Caller said that its permissiveness towards AI crawlers was an unintentional mistake.

Navigating the Landscape

While right-wing media outlets wield considerable influence, they are smaller and leaner than their mainstream counterparts. Data journalist Ben Welsh’s research indicates that larger news outlets with better resources and technical knowledge are more likely to block AI bots, perhaps as a defensive measure.

A Forward-Thinking Approach

One right-leaning news outlet, The Daily Wire, is taking a forward-thinking approach. It is exploring ways to protect its intellectual property while also striving to prevent AI from inheriting the biases of the mainstream press.

Conclusion:

The decision by right-wing media to allow AI web crawlers to access their content may reflect an attempt to influence AI models’ political biases. However, the actual impact remains debatable, and the copyright issue adds complexity to this evolving landscape. For the market, this highlights the need for AI companies to navigate diverse media access policies while striving for balanced and unbiased AI models.
