Path: news.eternal-september.org!eternal-september.org!.POSTED!not-for-mail
From: D Finnigan <dog_cow@macgui.com>
Newsgroups: comp.misc
Subject: Re: bad bot behavior
Date: Tue, 18 Mar 2025 12:00:07 -0500
Organization: A noiseless patient Spider
Lines: 15
Message-ID: <vrc8qm$2tkq5$1@dont-email.me>
References: <vrc2r4$2okrp$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Tue, 18 Mar 2025 18:00:07 +0100 (CET)
Injection-Info: dont-email.me; posting-host="a578e7d7c236ab6edd8250f49303d6f2"; logging-data="3068741"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19yDzDzY4wJEEuxQxsUQDpS"
Cancel-Lock: sha1:qBPLslCF1JAZ7j82Y5qcKFGiBIc=
In-Reply-To: <vrc2r4$2okrp$1@dont-email.me>
Content-Language: en-US

On 3/18/25 10:17 AM, Ben Collver wrote:
> Please stop externalizing your costs directly into my face
> ==========================================================
> March 17, 2025 on Drew DeVault's blog
>
> Over the past few months, instead of working on our priorities at
> SourceHut, I have spent anywhere from 20-100% of my time in any given
> week mitigating hyper-aggressive LLM crawlers at scale.

This is happening at my little web site, and if you have a web site, it's
happening to you too. Don't be a victim.

Actually, I've been wondering where they're storing all this data; and how
much duplicate data is stored from separate parties all scraping the web
simultaneously, but independently.