The beef between Microsoft and Reddit came to light after I published a story revealing that Reddit is currently blocking every crawler from every search engine except Google, which earlier this year agreed to pay Reddit $60 million a year to scrap the site for its generative AI products.
I know the author meant “scrape”, but sometimes it really does feel like AI is just scrapping the old internet for parts.
Yep, and the longer that happens, the less value the dataset has. It's becoming dated.
[Joke] See, Reddit’s doing a nice thing here! They’re making sure nobody ends up toxifying their own dataset by using Reddit’s garbage heap of bot posts!