Codeberg was asking about this. The toot linked by a commenter points to:
These are CC-BY-SA 4.0 remixes of the Stack Exchange Creative Commons Data Dumps. 100% Unendorsed by Stack Exchange, Inc.
They are minimal. They provide the data you probably care about and the data you need to comply with the original license in SQLite format.
0 points
How could anybody stop the AI robbers from stealing content from the fediverse?
1 point
0 points
robots.txt may help: https://neil-clarke.com/block-the-bots-that-feed-ai-models-by-scraping-your-website. Blocking by IP address is another option.
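For anyone who wants to check what such a file would actually do, here is a rough sketch using Python's standard `urllib.robotparser`. The robots.txt content and the `is_allowed` helper are my own illustration, not taken from the linked article; `GPTBot` and `CCBot` are the user-agent tokens published by OpenAI's and Common Crawl's crawlers, and the example.org URL is a placeholder. Keep in mind that robots.txt is only advisory, so this tells you what a well-behaved crawler would do, not what a misbehaving one will.

```python
from urllib.robotparser import RobotFileParser

# Illustrative robots.txt (an assumption, not copied from the linked article):
# disallow two known AI-related crawlers, allow everyone else.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: *
Allow: /
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if a crawler that honors robots.txt may fetch the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    # Placeholder URL on a hypothetical instance.
    page = "https://example.org/@someone/12345"
    for agent in ("GPTBot", "CCBot", "Mozilla/5.0"):
        print(agent, "allowed" if is_allowed(ROBOTS_TXT, agent, page) else "blocked")
```

Crawlers that ignore robots.txt would still need to be blocked at the server, for example by user agent or IP address.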
1 point