Path: ...!feeder1.cambriumusenet.nl!feed.tweak.nl!217.73.144.44.MISMATCH!feeder.ecngs.de!ecngs!feeder2.ecngs.de!144.76.237.92.MISMATCH!3.eu.feeder.erje.net!feeder.erje.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: legg <legg@nospam.magma.ca>
Newsgroups: sci.electronics.design
Subject: Re: Chinese downloads overloading my website
Date: Tue, 12 Mar 2024 20:08:47 -0400
Organization: A noiseless patient Spider
Lines: 112
Message-ID: <afq1viha37gjs37sprgfb30dfm0m1ok5jh@4ax.com>
References: <7qujui58fjds1isls4ohpcnp5d7dt20ggk@4ax.com> <6lekuihu1heui4th3ogtnqk9ph8msobmj3@4ax.com> <usec35$130bu$1@solani.org> <u14quid1e74r81n0ajol0quthaumsd65md@4ax.com> <usjiog$15kaq$1@solani.org> <t7rrui5ohh07vlvn5vnl277eec6bmvo4p9@4ax.com> <usm6v6$17e2c$1@solani.org> <gabuui56k0fn9iovps09um30lhiqhvc61t@4ax.com> <usqjih$h74g$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Transfer-Encoding: 7bit
Injection-Info: dont-email.me; posting-host="0c6ae09a78fd51de96ac32b31ad71a74";
	logging-data="616450"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX19N4ZGfReYHUGg1wDViqyt4"
Cancel-Lock: sha1:esAMoX84LNdXZBIRgtHCZ4/yGTM=
X-Newsreader: Forte Agent 4.2/32.1118
Bytes: 6067

On Tue, 12 Mar 2024 15:05:00 -0700, Don Y
<blockedofcourse@foo.invalid> wrote:

>On 3/11/2024 9:48 AM, legg wrote:
>>>>> When I ask google for "how to add a captcha to your website"
>>>>> I see many solutions, for example this:
>>>>> https://www.oodlestechnologies.com/blogs/create-a-captcha-validation-in-html-and-javascript/
>>>>>
>>>>> Maybe some HTML guru here knows?
>>>>
>>>> That looks like it's good for accessing an html page.
>>>> So far the Chinese are accessing the top level index, where
>>>> files are offered for download at a click.
>>>>
>>>> Ideally, if they can't access the top level, a direct address
>>>> access to the files might be prevented?
>> 
>> Using barebones (Netscape) Seamonkey Composer, the Oodlestech
>> script generates a web page with a 4-figure manually-entered
>> human test.
>> 
>> How do I get a correct response to open the protected web page?
>
>Why not visit a page that uses it and inspect the source?

I'm afraid to find out. If it's a Google product . . . .

>
>>> What I am doing now is using a html://mywebsite/pub/ directory
>>> with lots of files in it that I want to publish in for example this newsgroup,
>>> I then just post a direct link to that file.
>>> So it has no index file and no links to it from the main site.
>>> It has many sub directories too.
>>> https://panteltje.nl/pub/GPS_to_USB_module_component_site_IXIMG_1360.JPG
>>> https://panteltje.nl/pub/pwfax-0.1/README
>>>
>>> So you need the exact link to access anything
>>> fine for publishing here...
>> <snip>
>> 
>> The top (~index) web page of my site has lists of direct links
>> to subdirectories, for double-click download by user.
>
>You could omit the actual links and just leave the TEXT for a link
>present (i.e., highlight text, copy, paste into address bar) to
>see if the "clients" are exploring all of your *links* or are
>actually parsing the *text*.

After the Chinese IPs were blocked, there was not much more
I could learn by fiddling about. My ISP had to reset the auto 
suspension and up the limit with each (failed) iteration. 
The current block is considered as dusting of the hands.
Case closed.
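
For anyone wanting to reproduce that kind of block in their own code, a
minimal sketch of the membership test follows. It assumes a list of CIDR
ranges is already to hand; the two ranges shown are reserved documentation
addresses used as placeholders, not the ranges blocked here, and the name
is_blocked is hypothetical.

```python
import ipaddress

# Placeholder blocklist: reserved documentation ranges (RFC 5737),
# standing in for whatever ranges actually need blocking.
BLOCKED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),
    ipaddress.ip_network("198.51.100.0/24"),
]


def is_blocked(client_ip: str) -> bool:
    """Return True if client_ip falls inside any blocked network."""
    addr = ipaddress.ip_address(client_ip)
    return any(addr in net for net in BLOCKED_NETWORKS)


if __name__ == "__main__":
    for ip in ("203.0.113.7", "192.0.2.1"):
        print(ip, "blocked" if is_blocked(ip) else "allowed")
```

On shared hosting the same test is more likely to live in the server or
ISP configuration than in a script; the sketch only shows the logic.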

>
>> It also has links to other web pages that, in turn, offer links or
>> downloads to on-site and off-site locations. A great number of
>
>Whether or not you choose to "protect" those assets is a separate
>issue that only you can resolve (what's your "obligation" to a site that
>you've referenced on YOUR page?)
>
>> off-site links are invalid, after ~10-20years of neglect. They'll
>> probably stay that way until something or somebody convinces me
>> that it's all not just a waste of time.
>> 
>> At present, I only maintain data links or electronic publications
>> that need it. This may not be necessary, as the files are generally
>> small enough for the Wayback machine to have scooped up most of the
>> databases and spreadsheets. They're also showing up in other places,
>> with my blessing. Hell - Wayback even has tube curve pages from the
>> 'Conductance Curve Design Manual' - they've got to be buried 4 folders
>> deep - and each is a hefty image.
>
>You can see if bitsavers has an interest in preserving them in a
>more "categorical" framework.

The PDF version of the complete CCDM is already out there in a couple
of free doc sites. Chart images in that PDF might have sample envy.
>
>> Somebody, please tell me the 'Internet Archive' is NOT owned
>> by Google?
>> 
>> Some off-site links for large image-bound mfr-logo-ident web pages
>> (c/o geek@scorpiorising) seem already to have introduced a
>> captcha-type routine. Wouldn't need many bot hits to bump that
>> location into a data limit. Those pages take a long time
>> simply to load.
>
>There is an art to designing all forms of documentation
>(web pages just being one).  Too abridged and folks spend forever
>chasing links (even if it's as easy as "NEXT").  Too verbose and
>the page takes a long time to load.

The problem with mfr logo ident is the raw volume of tiny images.
Don't recall if an epub version was made - I think, if anything, 
that attempt just made a bigger file . . . .
Slow as it is - it's already split up alpha numerically into six 
sections . . . .
>
>OTOH, when I'm looking to scrape documentation for <whatever>,
>I will always take the "one large document" option, if offered.
>It's just too damn difficult to rebuild a site's structure,
>off-line, in (e.g.) a PDF.  And, load times for large LOCAL documents
>are insignificant.
>> Anyway - how to get the Oodlestech script to open the appropriate
>> page, after vetting the user as being human?
>
>No examples, there?
>
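
One way to frame the question: the step that "opens" the protected page is
just a comparison followed by a redirect. The sketch below is not the
Oodlestech script; it is a hypothetical illustration of that step, and the
page paths and function names are assumptions. A check done only in the
browser would not stop a bot that requests the files directly, so the
comparison is shown as something a server would run.

```python
import hmac
import secrets
import string

ALPHABET = string.ascii_uppercase + string.digits
PROTECTED_PAGE = "/downloads/index.html"  # hypothetical path
CHALLENGE_PAGE = "/captcha.html"          # hypothetical path


def new_challenge(length: int = 4) -> str:
    """Generate a random code to show the visitor (4 characters by default)."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def next_page(expected: str, typed: str) -> str:
    """Return the page to send the visitor to after they submit the code."""
    ok = hmac.compare_digest(
        expected.upper().encode(), typed.strip().upper().encode()
    )
    return PROTECTED_PAGE if ok else CHALLENGE_PAGE


if __name__ == "__main__":
    code = new_challenge()
    print("challenge:", code)
    print("correct entry goes to:", next_page(code, code.lower()))
    print("wrong entry goes to:", next_page(code, "????"))
```

The expected code would have to be kept on the server between showing the
challenge and checking the answer (for example in a session), which the
sketch leaves out.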

RL