

They didn’t care at first. The only reason they began destructively scanning books is that they started to care about copyright law:
Anthropic first chose to amass digitized versions of pirated books to avoid what CEO Dario Amodei called “legal/practice/business slog”—the complex licensing negotiations with publishers. But by 2024, Anthropic had become “not so gung ho about” using pirated ebooks “for legal reasons” and needed a safer source.
Plus, even if they were to implement those features, the challenges would still get harder the more bot-like a scraper behaves.
You can’t prevent scraping entirely, but you can certainly prevent scraping that behaves like a DoS attack.
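
To make the "DoS-like" part concrete, here is a rough sketch of per-client rate limiting with a token bucket. This is a different technique from the challenges discussed above, and it's only an illustration: the `TokenBucket` and `RateLimiter` names, the `client_id` key, and the rate and burst numbers are all made up for the example, not taken from any particular tool.

```python
class TokenBucket:
    """Hypothetical token bucket: refills `rate` tokens per second, holds at most `capacity`."""

    def __init__(self, rate, capacity, now):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity  # start full so a short burst is allowed
        self.last = now

    def allow(self, now):
        # Refill in proportion to the time elapsed since the last request.
        elapsed = max(0.0, now - self.last)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1.0:
            self.tokens -= 1.0
            return True
        return False  # over budget: the caller could answer with HTTP 429


class RateLimiter:
    """Keeps one bucket per client key (assumed here to be an IP or similar)."""

    def __init__(self, rate, capacity):
        self.rate = rate
        self.capacity = capacity
        self.buckets = {}

    def allow(self, client_id, now):
        bucket = self.buckets.get(client_id)
        if bucket is None:
            bucket = TokenBucket(self.rate, self.capacity, now)
            self.buckets[client_id] = bucket
        return bucket.allow(now)
```

With something like `RateLimiter(rate=1.0, capacity=3)`, a polite crawler making one request per second never gets blocked, while a client hammering the server runs out of tokens after its first burst. In a real server you would pass `time.monotonic()` as `now`; it is a parameter here only so the behaviour is easy to check.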