Path: ...!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Lawrence D'Oliveiro
Newsgroups: comp.misc
Subject: Re: Emigration from Usenet [was: Re: PTD was the most-respected of the AUE regulars ...]
Date: Sun, 28 Jul 2024 01:55:16 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 13
Message-ID:
References: <20240724115828.5d9d85d9305fe8300a91db5d@g{oogle}mail.com> <66a2d000@news.ausics.net> <20240726013343.02805fe30e4853cf7cd40797@gmail.moc> <66a31b29@news.ausics.net> <66a39428@news.ausics.net>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Injection-Date: Sun, 28 Jul 2024 03:55:16 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="3b558a041f0a0aed486aeb8fa027d259"; logging-data="3818776"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/1OUXUZEy8pQTMkweSJU6V"
User-Agent: Pan/0.159 (Vovchansk; )
Cancel-Lock: sha1:bbAxvmrp1XswawdLNZ+3e+x9YKg=
Bytes: 2142

On 26 Jul 2024 22:18:48 +1000, Computer Nerd Kev wrote:

> I'm not really sure whether a HTML parser
> library would be helpful or just a pointless extra layer of complexity.
> So far I've just used regular expressions for scraping webpages.

I learned about BeautifulSoup early on, and never looked back. I use it
for all my web-scraping projects nowadays.

By the way, this is the kind of discussion you could not have on a
platform like Discord. The last time I was on there, the server Ts&Cs had
prohibitions against talking about web-scraping, since so many websites
didn’t like it.
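
On the regex-versus-parser question in the quoted text, here is a minimal sketch of where the two approaches tend to diverge. It is only an illustration, not anyone's actual scraping code: the sample markup, the pattern, and the names SAMPLE, NAIVE_HREF, LinkCollector, links_by_regex and links_by_parser are all made up for this example. To keep it free of third-party dependencies it uses the Python standard library's html.parser as a stand-in for BeautifulSoup.

```python
import re
from html.parser import HTMLParser

# Hypothetical sample markup, invented for illustration only.
SAMPLE = """
<a href="/one">One</a>
<A HREF='/two' class="x">Two</A>
<a class="y"
   href="/three">Three</a>
"""

# A deliberately naive pattern: it only matches a lowercase <a> tag whose
# first attribute is a double-quoted href.
NAIVE_HREF = re.compile(r'<a href="([^"]*)"')


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> start tag."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag and attribute names already lowercased,
        # and copes with either quote style and attributes split over lines.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def links_by_regex(html):
    """Return hrefs found by the naive regular expression."""
    return NAIVE_HREF.findall(html)


def links_by_parser(html):
    """Return hrefs found by walking the parsed start tags."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


if __name__ == "__main__":
    print("regex :", links_by_regex(SAMPLE))
    print("parser:", links_by_parser(SAMPLE))
```

On this sample the naive pattern finds only the first link, while the parser finds all three. A more careful regular expression could of course handle these particular variations; the point is only that a parser deals with them without each case being spelled out. With BeautifulSoup installed, something along the lines of `[a["href"] for a in BeautifulSoup(html, "html.parser").find_all("a", href=True)]` should give the same list in one expression.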