From: candycanearter07
Newsgroups: comp.os.linux.advocacy
Subject: Re: Need Assistance -- Network Programming
Date: Thu, 20 Jun 2024 05:50:02 -0000 (UTC)
Organization: the-candyden-of-code
References: <17da6bead1f52684$159717$3694546$802601b3@news.usenetexpress.com>
User-Agent: slrn/1.0.3 (Linux)

[...] wrote at 13:47 this Wednesday (GMT):
> Ordinarily, I don't give a flying fuck about network programming
> but necessity does dictate.
>
> I need to read through an HTML file, find all external HTTP(S) links,
> and then determine if those external links are still viable, i.e.
> if the pages to which they link still exist.
>
> Perl is the language of choice.
>
> Finding the links is not a problem, but how do I determine viability?
> Do I look for the "404" error or is there another way?
>
> I don't want no fucking Python code.
>
> Pseudocode or just the relevant commands would be preferable.

Well, if you're fine with bash, this should do it:

    while IFS= read -r url; do
        curl -fs "$url" > /dev/null || echo "$url down"
    done < "$1"

(or replace $1 with a path to the file)
-- 
user is generated from /dev/urandom
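
A rough sketch of the same idea that also does the link-extraction step, in case that helps. It assumes curl and grep with -E/-o are available, and the href regex is deliberately naive (it will miss links split across lines or quoted oddly), so treat it as a starting point rather than a finished tool:

    # sketch: pull out http(s) URLs from the file given as $1,
    # then report any that look dead
    grep -Eo 'https?://[^"<> ]+' "$1" | sort -u | while read -r url; do
        # -s silent, -o /dev/null discard the body,
        # -w '%{http_code}' print only the numeric status
        status=$(curl -s -o /dev/null -w '%{http_code}' "$url")
        if [ "$status" -ge 400 ] || [ "$status" -eq 0 ]; then
            echo "$url appears dead (HTTP $status)"
        fi
    done

A 404 is only one failure mode; DNS errors and timeouts show up here as status 000, which is why the check treats 0 the same as a 4xx/5xx code.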