
Major web platforms unite against AI using their content: A new era for fair use

In a digital age showdown, Medium leads the charge against AI content scraping, paving the way for ethical web use.

Major websites unite against web crawler GPTBot
New Delhi, UPDATED: Sep 29, 2023 12:24 IST

Highlights

  • Leading web platforms, including Medium, CNN, and The New York Times, band together to block OpenAI's content-scraping GPTBot
  • Medium's CEO criticises AI companies for profiting from writers' work without consent, sparking Medium's move to ban GPTBot
  • Medium seeks partners to form a coalition addressing ethical concerns around AI content use, potentially reshaping the future of web content

In a significant development, the web publishing platform Medium has chosen to block OpenAI's GPTBot, a web crawler that collects web content for AI model training. This move aligns Medium with CNN, The New York Times, and various other media outlets, all of which have added a "User-Agent: GPTBot" rule to their websites' robots.txt files to deny it access. This collective stance represents a growing resistance to what many view as the exploitation of their content.
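For readers curious how such a block works in practice, the following is a minimal sketch using Python's standard `urllib.robotparser` module. The robots.txt content, the `is_allowed` helper, and the example.com URL are illustrative assumptions, not a copy of any site's actual file.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real site's file may contain more rules.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt text permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


# The URL is a placeholder used only for illustration.
gptbot_allowed = is_allowed(ROBOTS_TXT, "GPTBot", "https://example.com/some-article")
other_allowed = is_allowed(ROBOTS_TXT, "SomeOtherBot", "https://example.com/some-article")
print(gptbot_allowed, other_allowed)
```

Under these assumptions, the rule denies GPTBot while leaving other crawlers unaffected. The check only has effect if the crawler chooses to run it, since robots.txt is a request and not an enforcement mechanism.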

Medium's robots.txt file, showing that it has blocked the GPTBot crawler. (Photo: Medium)

AI's shadow on the web: A clash of ethics and profit

Tony Stubblebine, the CEO of Medium, stated plainly that the current state of generative AI does not benefit the internet as a whole. According to him, AI companies are profiting from writers' work without obtaining consent, offering compensation, or giving credit. Consequently, Medium has decided to close the door on OpenAI's web scraper.

I’m not a hater, but I also want to be plain-spoken that the current state of generative AI is not a net benefit to the Internet. They are making money on your writing without asking for your consent, nor are they offering you compensation and credit. AI companies have leached value from writers in order to spam Internet readers.

Tony Stubblebine, the CEO of Medium

However, Stubblebine also acknowledges that a robots.txt block is only a request, and that unscrupulous AI platforms are likely to disregard it. Actively corrupting their training data with fake content (known as data poisoning) has been proposed as a countermeasure, although it risks escalation and costly legal disputes.

A coalition for the future: Websites join hands for fair use

Nevertheless, there is hope on the horizon. Medium is actively seeking partners to create a coalition of platforms committed to shaping the future of fair content use in the era of AI. Conversations have already begun with several major organisations, even though they are not yet ready to publicly collaborate.

Progress in this regard is slow, as the realm of publishing and copyright faces complex legal and ethical challenges posed by the nascent field of AI. The central challenge lies in defining intellectual property (IP) and copyright in this evolving landscape and aligning the diverse interests of various stakeholders.

The uncertainty surrounding IP and copyright laws in the context of AI makes it difficult to establish clear rules. A bold step may require a major player like Wikipedia to lead the way and break the ice for others.

While some organisations may be constrained by business considerations, others are unburdened by such concerns and can forge ahead without worrying about disappointing stockholders. Until a pioneering entity steps up, the fate of web content remains at the mercy of web crawlers, which respect or disregard consent as they please.

Published on: Sep 29, 2023 12:24 IST
Posted by: Minaal, Sep 29, 2023 12:24 IST
