Message-ID: <67d9f16a@news.ausics.net>
From: not@telling.you.invalid (Computer Nerd Kev)
Subject: Re: bad bot behavior
Newsgroups: comp.misc
References: <vrc2r4$2okrp$1@dont-email.me> <vrc8qm$2tkq5$1@dont-email.me>
User-Agent: tin/2.0.1-20111224 ("Achenvoir") (UNIX) (Linux/2.4.31 (i586))
NNTP-Posting-Host: news.ausics.net
Date: 19 Mar 2025 08:19:22 +1000
Organization: Ausics - https://newsgroups.ausics.net
Lines: 28
X-Complaints: abuse@ausics.net
Path: ...!weretis.net!feeder9.news.weretis.net!news.bbs.nz!news.ausics.net!not-for-mail
Bytes: 1926

D Finnigan <dog_cow@macgui.com> wrote:
> On 3/18/25 10:17 AM, Ben Collver wrote:
>> Please stop externalizing your costs directly into my face
>> ==========================================================
>> March 17, 2025 on Drew DeVault's blog
>> 
>> Over the past few months, instead of working on our priorities at
>> SourceHut, I have spent anywhere from 20-100% of my time in any given
>> week mitigating hyper-aggressive LLM crawlers at scale.
> 
> This is happening at my little web site, and if you have a web site, 
> it's happening to you too. Don't be a victim.

Meh, my little Web site runs so light that even when Amazon's bot
got stuck in a recursive loop grabbing the same dynamic page tens of
times a second from different IPs, the server load was near nil as
usual. The main problem it caused was access logs of hundreds of
megabytes per day. Amazon is still scraping the hell out of
everything I put online (even a mirror that's tens of GBs), and
other bots squeeze into the logs too. Maybe even a few humans view
things sometimes? I don't care, they're welcome to it, and they
helped me find the bug in the Apache configuration which allowed
that recursive loop (though I still don't get why bots started
forming such URLs in the first place).
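
For anyone wanting to check their own logs for the same sort of loop,
here is a rough Python sketch. It is only an illustration: it assumes
the looping URLs repeat a path segment (e.g. /docs/docs/docs/...),
which is a guess about the pattern, and it assumes Apache's default
common/combined log layout. The names looks_recursive and
recursive_requests are made up for this example.

```python
import re
from collections import Counter

# Pulls the request path out of an Apache common/combined log line,
# e.g. "GET /a/b/ HTTP/1.1". Assumes the default log layout.
REQUEST_RE = re.compile(r'"(?:GET|HEAD|POST) (\S+) HTTP/[\d.]+"')


def looks_recursive(path, min_repeats=3):
    """Return True if any path segment occurs at least min_repeats times."""
    # Ignore the query string and empty segments from leading/double slashes.
    segments = [s for s in path.split("?", 1)[0].split("/") if s]
    if not segments:
        return False
    return max(Counter(segments).values()) >= min_repeats


def recursive_requests(lines, min_repeats=3):
    """Count hits per path for paths that look like a crawler loop."""
    hits = Counter()
    for line in lines:
        match = REQUEST_RE.search(line)
        if match and looks_recursive(match.group(1), min_repeats):
            hits[match.group(1)] += 1
    return hits
```

Feeding it an open log file, as in recursive_requests(open("access.log")),
gives a Counter whose most_common() lists the worst offenders first.
The threshold of three repeats is arbitrary; raise it if legitimate
URLs on the site repeat a directory name.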

-- 
__          __
#_ < |\| |< _#