Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: DFS
Newsgroups: comp.os.linux.advocacy
Subject: Re: Try BlockNews and/or UsenetNews again, an hour later.
Date: Thu, 11 Apr 2024 10:40:24 -0400
Organization: A noiseless patient Spider
Lines: 104
Message-ID:
References:
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 11 Apr 2024 16:40:25 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="a4a3e89b3479fda13df9e68f7b8f5de1"; logging-data="1835742"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19Da4w0byOcdg26agTh5m9/"
User-Agent: Betterbird (Windows)
Cancel-Lock: sha1:jdMXy/1OMSPdXBayRZF4Ih4OOzQ=
Content-Language: en-US
In-Reply-To:
Bytes: 5203

On 4/9/2024 12:24 AM, Relf wrote:
> You (DFS) replied ( to me ):
>>> MessageID (instead of Article Number) "bookmarking" is tricky.
>>> I list MessageIDs (100 at a time), starting from the highest Article Number,
>>> until I find one of the 10 Message-ID "bookmarks" I've saved.
>>
>> Can you give me the code you run to retrieve articles?
>> How were you getting 1000 articles in 4 seconds from glorb?
>
> Glorb was lightning fast.
> When possible, I sent it 100 commands at a time, instead of individually.
>
> From "z1.CPP" in "Jeff-Relf.Me/z1.ZIP":
>
> typedef __int64 i64 ; i64 _LowArtNum, _HighArtNum, HighArtNum ;
>
> const int max_BookMarks = 10, MIDsPerPage = 100, Max_szMID = 200, Max_Sz_BookMark = max_BookMarks * Max_szMID ;
>
> OpenGroup₍₎( LnP NG ) { // 211 303,341,573 1,844,605,951 2,147,947,523 Comp.OS.Linux.Advocacy
>   if ( !Go( L"Group %s", LowerCaseNG( NG ) ) ) { Sh( L"No Such NewsGroup, “ %c%s%c ” ?", ChNew, NG, ChText ); return 0 ; }
>   strCpy( _T, *Ln.PP ), P = _T, gTok, gTok, gTok, LowArtNum = AtoI( Tok ) - 1, gTok, HighArtNum = AtoI( Tok );
>   Sh( L"%s Articles; From: %s To: %s .", Commas( HighArtNum - LowArtNum ), Commas( LowArtNum + 1 ), Commas( HighArtNum ) ); return HighArtNum ; }
>
> _LowArtNum = HighArtNum - MIDsPerPage - 1 ;
> Go( L"xHdr Message-ID %I64d-%I64d", _LowArtNum, _HighArtNum );
>
>> The python code I've relied on for years has become very slow and
>> unstable when running it against blocknews. Connections are dropped all
>> the time. SuckMT is OK fast, but it drops connections too.
>>
>> I want to stay with blocknews because it has articles going back to
>> 2003, and they might recover from their current woes.
>
> Try BlockNews and/or UsenetNews again, an hour later.

It's been slow for weeks... well, my previously quick python code, anyway.

I used to do head and body commands (via Python nntplib), but they don't
work very well lately. The 'article' command is pretty quick, though.

This code prints the 211 response for the group, then sends the
'article' command one at a time, parses and displays the response. It's
pretty fast - up to ~9 articles retrieved per second late in the evening.

==============================================================================================
# nntplib ships with the standard library through Python 3.12 (removed in 3.13)
import nntplib, time
from datetime import datetime

startStr = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
startTime = time.time()

srvName = 'usnews.blocknews.net'
grpName = 'comp.os.linux.advocacy'

# connect, log in, and select the group
news = nntplib.NNTP(srvName,119,'dfsblocknews','hogsblocknews')
r,a,b,e,g = news.group(grpName)   # response, count, first, last, name
print(r)

cnt, found, notfound = 0,0,0
for artID in range(904779, 904785):
    print("------------------------------------------------------------------------------")
    print(artID)
    msgID, msg = 0,''
    try:
        # info is (article number, message-id, list of raw byte lines)
        response, info = news.article(artID)
        artID = info[0]
        msgID = info[1]
        msg   = info[2]
        for line in msg:
            line = line.decode('latin-1')
            if line != "Path: not-for-mail":
                print(line)
        found += 1
    except (nntplib.NNTPTemporaryError, nntplib.NNTPPermanentError) as err:
        # typically a 4xx "no such article" reply for a gap in the numbering
        print('Error:',err)
        notfound += 1
    cnt += 1

print("------------------------------------------------------------------------------")
print("done")

endStr = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
totTime = time.time() - startTime
print("\nStart: " + startStr)
print("End  : " + endStr)
print("%.1fs to request %d messages (%.f per sec)" % (totTime, cnt, cnt / totTime))
print("%d messages downloaded" % (found))
print("%d messages not found" % (notfound))
==============================================================================================

Save it as a .py file and run it:
$ python prog.py

a minute ago, from blocknews:
211 1795999 899518 2695516 comp.os.linux.advocacy

But blocknews recently screwed up their article numbering, and there are
a lot of gaps in the sequence, so there aren't 1795999 articles
available on their server.
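
A quick way to see why that 211 count can't be trusted: the rough sketch below splits the response line and compares the reported count with the width of the first..last range. The helper names (parse_group_response, numbering_span) are made up for illustration and are not part of nntplib. For the response above the two numbers match exactly, which suggests (my inference, not something blocknews documents) that the server derives the count from the range instead of counting stored articles. The NNTP spec only calls that field an estimate anyway.

```python
def parse_group_response(resp):
    """Split a '211 count first last group' reply into typed fields."""
    code, count, first, last, name = resp.split(maxsplit=4)
    if code != "211":
        raise ValueError("not a GROUP success reply: " + resp)
    return int(count), int(first), int(last), name


def numbering_span(first, last):
    """How many article numbers lie in the inclusive range first..last."""
    return last - first + 1 if last >= first else 0


# the response line quoted above
resp = "211 1795999 899518 2695516 comp.os.linux.advocacy"
count, first, last, name = parse_group_response(resp)
span = numbering_span(first, last)

print("reported count :", count)
print("span of numbers:", span)
print("count == span  :", count == span)
```

If count equals the span, it says nothing about how many of those numbers actually hold an article; the only way to know is to request them and tally the "not found" replies, as the script above does.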