Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: BillGill <tonisdad215@gmail.com>
Newsgroups: sci.electronics.design
Subject: Re: hobby electronics
Date: Mon, 8 Jul 2024 08:45:08 -0500
Organization: A noiseless patient Spider
Lines: 98
Message-ID: <v6gqh7$t2eb$1@dont-email.me>
References: <j5a88jhm7pge920n2io4jnhs101i8ntb2g@4ax.com>
 <v635o1$24goj$1@dont-email.me> <v63k0i$271d8$1@dont-email.me>
 <v63ldd$26rbm$2@dont-email.me> <v667qj$2p9gt$4@dont-email.me>
 <v66doo$2q0be$1@dont-email.me> <v68tfj$3abt3$1@dont-email.me>
 <v698an$3c5jp$2@dont-email.me> <v6bge1$3qrun$1@dont-email.me>
 <v6busc$3t51f$1@dont-email.me> <v6e5ut$bs5n$1@dont-email.me>
 <v6ed9e$cpga$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 08 Jul 2024 15:45:12 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="eaa3dba28bc49828c7efbcb54c5fc7ac";
	logging-data="952779"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX19rhHtHnlTqtYl3AY1uxE068viZK0yuRN8="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:OdY0dpKBpGvMCwZkmS+zFphJvsM=
Content-Language: en-US
In-Reply-To: <v6ed9e$cpga$1@dont-email.me>
Bytes: 6066

On 7/7/2024 10:46 AM, Don Y wrote:
> On 7/7/2024 6:41 AM, BillGill wrote:
>> On 7/6/2024 12:28 PM, Don Y wrote:
>>> Yes -- definitely true of "pocket books".  Do you have
>>> to take care in positioning the book to ensure it is in the
>>> cameras' focused field?  I.e., the scanner approach automatically
>>> crops the image to the actual page size so you just load pages
>>> and wait -- to load MORE pages.
>>
>> I am using my own KISS (Keep It Simple Stupid) scanner, which
>> I designed myself.  I originally called it a Tower Scanner, but
>> changed the name when I realized that I had made it as simple
>> as possible.
>>
>> I posted a description of it on DIY Book Scanner, at:
>> https://diybookscanner.org/forum/viewtopic.php?p=22274&hilit=tower+scanner#p22274
>>
>> As you can see I have a mirror in the base of the scanner
>> that I used to verify that the page is correctly placed.  It
>> doesn't zoom in to fit the page, it just overscans.
> 
> If the USB i/f worked, you wouldn't need the mirror (?)
There is some software that works with some cameras that will
send the images directly to the computer.  When I was KISSing
the scanner I went with the simplest approach.  This way it
will work with any camera; you just have to build it so that
the camera is at the correct distance from the platen.
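
For anyone wondering what "the correct distance" works out to, here is a
rough sketch using the thin-lens approximation.  It is not taken from my
build notes; the function names, the 10 mm overscan margin, and the
example numbers are all made up for illustration, and a real camera lens
will only roughly follow the thin-lens formula, so treat the result as a
starting point for trial and error.

```python
def lens_to_platen_mm(focal_length_mm, sensor_mm, field_mm):
    """Thin-lens estimate of the lens-to-subject distance at which a
    field of view field_mm wide just fills sensor_mm on the sensor.

    From 1/f = 1/d_obj + 1/d_img with magnification m = sensor/field,
    d_obj = f * (1 + 1/m) = f * (1 + field/sensor).
    """
    if focal_length_mm <= 0 or sensor_mm <= 0 or field_mm <= 0:
        raise ValueError("all dimensions must be positive")
    return focal_length_mm * (1.0 + field_mm / sensor_mm)


def camera_height_mm(focal_length_mm, sensor_wh_mm, page_wh_mm, margin_mm=10.0):
    """Distance needed to cover the whole page plus an overscan margin
    on every side (margin_mm is an assumed value, not a measured one)."""
    sensor_w, sensor_h = sensor_wh_mm
    page_w, page_h = page_wh_mm
    need_w = lens_to_platen_mm(focal_length_mm, sensor_w, page_w + 2 * margin_mm)
    need_h = lens_to_platen_mm(focal_length_mm, sensor_h, page_h + 2 * margin_mm)
    # Whichever dimension needs more distance decides the camera height.
    return max(need_w, need_h)


if __name__ == "__main__":
    # Hypothetical camera: 50 mm lens, 36 x 24 mm sensor, 340 x 220 mm page.
    print(camera_height_mm(50.0, (36.0, 24.0), (340.0, 220.0)))
```

The distance is measured from the lens, not from the back of the camera,
so the mounting point has to allow for that.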
> 
> Your approach seems more like the Reading Machine used
> (in "paper handling") -- though it used a moving camera-illuminator
> to scan the actual page (which meant the book had to remain in place for
> a considerable length of time):
> <https://life.ieee.org/wp-content/uploads/Harvey-Lauer-with-Kurzweil-Reading-Machine-1200x819.png>
>
Well, I see that the way the book is set on the scanner is
similar, but of course my scanner takes the whole page at
once.  The goal is, of course, different.  The reading machine
is turning the printed text into sound, which is a whole
different thing from turning it into text.

>> I don't do much manipulation of the images before I OCR them.
>> I use ABBYY FineReader 14 which does a pretty good job of
>> picking out the text.  I stick with 14 because it works good
>> and newer versions are only available as subscriptions.
> 
> All OCR tools "have problems".  My scanner will do OCR but then I
> lose the original images (so how do I sort out what the OCR *should* have
> been once the original is gone?).  I've also had some luck with
> Omnipage.
Since I keep the original scans on the computer, rather than running
them straight through the OCR and losing the originals, that is not
a problem.  And of course I still have the original books, so I
can proof the text with confidence.
> 
>> Understand that I am making ebooks that I can carry around on
>> different devices, not PDFs that can also be viewed on different
>> devices, but don't necessarily have all the text correct.
> 
> The PDF doesn't have to get the text correct; it can store the
> image of the page (and let your eyes/brain do the OCR).
> 
> I can store the OCRed text "behind" the image so that you can select
> the text with your cursor (in a PC application).  But, again, you
> are stuck relying on the quality of the OCR algorithm.
Does the PDF reflow the text so that the text is the same
size on all devices, including a phone?  The EPUB format does
that.  It also resizes any illustrations so that they will fit.

Illustrations of course have to be handled separately.  I run
any page with illustrations through a graphics program, such as
GIMP, to do any cleanup, such as cropping the image down to
only the illustration.  Then I reinsert the illustration into
the text file at the appropriate location.
> 
>> And I don't digitize technical books.  They are a whole different
>> proposition, with lots of finicky illustrations.  Not something
>> that I would like to try to digitize.
> 
> As my goal is to be rid of dead trees, I have no choice in the
> matter.  Even discarding (scanning) all of my "paperbacks" leaves me
> with a few hundred cubic feet of paper.
> 
>> Also I don't want to destroy my paper books.  I like reading
>> books on paper. After all that is how I grew up.
> 
> Agreed.  But, if you are proactively safeguarding your collection against
> the possibility of downsizing into a different living situation, you've
> already decided that they will be discarded -- even if not "destroyed".
> 
> 
As I say, I prefer real books, and have the space to keep them.
When I do have to dispose of them I will sell them to
a used book store, or donate them to Goodwill,
which, here in Tulsa, has a pretty good used book store in
the back corner.

Bill