Path: news.eternal-september.org!eternal-september.org!feeder3.eternal-september.org!tncsrv06.tnetconsulting.net!newsfeed.endofthelinebbs.com!.POSTED.47.186.35.85!not-for-mail
From: Nigel Reed <sysop@endofthelinebbs.com>
Newsgroups: news.admin.peering
Subject: Re: Newsgroups files
Date: Mon, 3 Mar 2025 17:13:34 -0600
Organization: End Of The Line BBS
Sender: nelgin@47.186.35.85
Message-ID: <20250303171334.785ee79e@wibble.sysadmininc.com>
References: <20250303133017.7b629d4a@wibble.sysadmininc.com> <8mwmd5x3c6.fsf@raybanana.net> <20250303143634.5f78bc54@wibble.sysadmininc.com> <vq57l9$1nji2$2@news.trigofacile.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: quoted-printable
Injection-Info: newsfeed.endofthelinebbs.com; posting-host="47.186.35.85"; logging-data="797777"; mail-complaints-to="abuse@endofthelinebbs.com"
X-Newsreader: Claws Mail 4.3.1git13 (GTK 3.24.41; x86_64-pc-linux-gnu)

On Mon, 3 Mar 2025 22:40:57 +0100
Julien ÉLIE <iulius@nom-de-mon-site.com.invalid> wrote:

> Hi Nigel,
>
> > I'm probably just going to get a script to pull the most popular of
> > the descriptions for the list and ignore the moderated part unless
> > the group has moderated in its name or a majority think it's
> > moderated when I do a manual check on those.
>
> I would suggest to instead just use the latest known descriptions
> (from checkgroups when they are sent).
> I maintain the list encoded in UTF-8 (the standard according to RFCs)
> here:
> https://raw.githubusercontent.com/Julien-Elie/usenet-hierarchies/refs/heads/main/website/data/newsgroups.utf8
>
> Also, FWIW, the same list in pure ASCII:
>
> https://raw.githubusercontent.com/Julien-Elie/usenet-hierarchies/refs/heads/main/website/data/newsgroups.ascii
>
>
> The usual master file for these descriptions has unfortunately mixed
> charsets (like windows-1252 for some descriptions, UTF-8 for others,
> ISO-8859-xx variants, etc.):
> https://ftp.isc.org/pub/usenet/CONFIG/newsgroups
>
> That's why I generate the above first two lists :)
> Feel free to use!
>

Yes, we've sort of had this discussion before about encoding. This one
is more about the inconsistency of the labeling of the groups.

In the newsgroups list above, pretty much every group whose description
contains letters outside plain A-Z is garbled, probably because the
file is ISO-8859 while I'm using UTF-8. The cn.* groups are definitely
garbled.

I'll just do my best to make a valid UTF-8 file for my server.

-- 
End Of The Line BBS - Plano, TX
telnet endofthelinebbs.com 23
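
For anyone wanting to try the "make a valid UTF-8 file" step themselves, here is a minimal sketch in Python of one possible approach, not something taken from this thread: decode each line as UTF-8 first, and only fall back to windows-1252 and then ISO-8859-1 when that fails. The function names and the fallback order are assumptions chosen for illustration. This heuristic only helps with single-byte Western charsets; descriptions in multi-byte legacy encodings (which may be the case for the cn.* groups) would still come out garbled and need separate handling.

```python
# Hypothetical helper names; the fallback order is an assumption, not a standard.
FALLBACK_ENCODINGS = ("utf-8", "cp1252", "latin-1")


def decode_line(raw: bytes) -> str:
    """Decode one line, trying strict UTF-8 first, then single-byte fallbacks."""
    for encoding in FALLBACK_ENCODINGS:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    # Not reached in practice: latin-1 maps every byte to a code point.
    return raw.decode("latin-1")


def normalize_newsgroups(data: bytes) -> bytes:
    """Re-encode a mixed-charset newsgroups file as UTF-8, line by line."""
    lines = data.split(b"\n")
    return "\n".join(decode_line(line) for line in lines).encode("utf-8")


if __name__ == "__main__":
    # Two made-up entries: one in ISO-8859-1/windows-1252, one already UTF-8.
    sample = b"example.one\tCaf\xe9\nexample.two\tCaf\xc3\xa9\n"
    print(normalize_newsgroups(sample).decode("utf-8"))
```

Working per line matters here because the master file mixes charsets between entries, so a single whole-file guess would be wrong for some of them.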