Reddit Prevents Wayback Machine from Saving User Posts


Reddit intends to prevent the Internet Archive’s Wayback Machine from saving users’ posts, aiming to stop AI firms from extracting archived comments to train their models without compensation. As reported by The Verge, Reddit will block the Wayback Machine from archiving post detail pages, comments, and profiles, though the homepage will remain reachable. Reddit says the move protects users, because AI firms have been scraping archived data in violation of its policies. Reddit representative Tim Rathschmidt stated that access is restricted until the Internet Archive adheres to platform policies.

Despite these assertions, Reddit has shown that it is willing to sell user data to AI firms. In 2024, Reddit blocked search engines such as Microsoft Bing and DuckDuckGo from indexing its platform but permitted Google to keep accessing data for AI training through a $60 million agreement. A comparable arrangement was established with OpenAI, the developer of ChatGPT.

Reddit CEO Steve Huffman indicated that without agreements in place, Reddit has no control over how its data is used, so it blocks those who refuse to negotiate terms. Users have little control over how their public posts are used: Reddit does not let them opt out of data sales or AI training, which leaves them with only one option, to stop posting.

The choice to obstruct the Wayback Machine appears to be financially driven, as AI companies were obtaining Reddit data at no cost. Huffman informed the New York Times that Reddit’s data holds significant value and shouldn’t be provided at no charge to large corporations.

Reddit has been striving to reduce its financial losses, which has led to unpopular changes such as charging developers for API access, eliminating ad personalization opt-outs, and announcing plans for paid subreddits. Despite these efforts, Reddit recorded a net loss of $484.3 million last year, far exceeding the $90.8 million loss in 2023.