Path: ...!weretis.net!feeder9.news.weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Don Y <blockedofcourse@foo.invalid>
Newsgroups: sci.electronics.design
Subject: Re: hobby electronics
Date: Mon, 8 Jul 2024 08:31:19 -0700
Organization: A noiseless patient Spider
Lines: 193
Message-ID: <v6h0oa$u3fa$1@dont-email.me>
References: <j5a88jhm7pge920n2io4jnhs101i8ntb2g@4ax.com>
 <v635o1$24goj$1@dont-email.me> <v63k0i$271d8$1@dont-email.me>
 <v63ldd$26rbm$2@dont-email.me> <v667qj$2p9gt$4@dont-email.me>
 <v66doo$2q0be$1@dont-email.me> <v68tfj$3abt3$1@dont-email.me>
 <v698an$3c5jp$2@dont-email.me> <v6bge1$3qrun$1@dont-email.me>
 <v6busc$3t51f$1@dont-email.me> <v6e5ut$bs5n$1@dont-email.me>
 <v6ed9e$cpga$1@dont-email.me> <v6gqh7$t2eb$1@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Mon, 08 Jul 2024 17:31:24 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="1204b6b780de24b9cb690be0c57d3575"; logging-data="986602"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX1/ACibLbFq9s+HAKuJVHh4d"
User-Agent: Mozilla/5.0 (Windows NT 6.1; Win64; x64; rv:102.0) Gecko/20100101 Thunderbird/102.2.2
Cancel-Lock: sha1:GQ12rnyzJ1mVh0krswNvDS6Pjrc=
In-Reply-To: <v6gqh7$t2eb$1@dont-email.me>
Content-Language: en-US
Bytes: 10509

On 7/8/2024 6:45 AM, BillGill wrote:
>>> As you can see I have a mirror in the base of the scanner
>>> that I used to verify that the page is correctly placed. It
>>> doesn't zoom in to fit the page, it just overscans.
>>
>> If the USB i/f worked, you wouldn't need the mirror (?)
> There is some software that works with some cameras that will
> send the images directly to the computer.

But, would it show "viewfinder" images or just *snapped* images?
I.e., could you use the images delivered "in real time" at the
computer to PREVIEW the photos that will be snapped?
Or, does it only transfer the images after they have been taken?
(which would be more tedious to try to use to align the book)

> When I was KISSing
> the scanner I went with the simplest approach. This way it
> will work with any camera, you just have to build it so that
> the camera is at the correct distance from the platen.

Makes sense. And, as you are shooting THROUGH the glass, this distance
is constant (whereas those builds that shot the "open book" have to
accommodate the distance changing as the number of pages increases
the "thickness" as they are sequentially "flipped").

>> Your approach seems more like the Reading Machine used
>> (in "paper handling") -- though it used a moving camera-illuminator
>> to scan the actual page (which meant the book had to remain in place for
>> a considerable length of time):
>> <https://life.ieee.org/wp-content/uploads/Harvey-Lauer-with-Kurzweil-Reading-Machine-1200x819.png>
>
> Well, I see that the way the book is set on the scanner is
> similar, but of course my scanner takes the whole page at
> once.

The point was that the approach makes it easier to access deeper into
the gutter. I would imagine the builds that have the book face up,
opened to a pair of pages have to contend with pages infringing on
the gutter as the position *in* the book changes (and more paper
piles up on one side or the other)

> The goal is, of course, different. The reading machine
> is turning the printed text into sound, which is a whole
> different thing from turning it into text.

Actually, The Reading Machine was one of the first commercial "omnifont"
OCR machines. The scanner (a linear CCD) assembles the image of the
characters and then the characters are recognized, fed to the
text-to-speech system and converted to sound (by the speech synthesizer).

An early attempt to commercialize the OCR capabilities was The Data
Entry System.
There, the text-to-speech module and speech synthesizer were
elided from the system with the recognized text as the primary output.
A graphic display allowed a *sighted* operator (The Reading Machine was
targeted to the visually impaired) to verify and correct the OCR as the
pages were being scanned.

Note this is mid/late 70's so these sorts of capabilities didn't
otherwise exist. How would you get a *digital* copy of a telephone book
(name, address, phone)? Or, copies of published newspapers (AS they were
being published)?

>>> I don't do much manipulation of the images before I OCR them.
>>> I use ABBYY FineReader 14 which does a pretty good job of
>>> picking out the text. I stick with 14 because it works good
>>> and newer versions are only available as subscriptions.
>>
>> All OCR tools "have problems". My scanner will do OCR but then I
>> lose the original images (so how do I sort out what the OCR *should* have
>> been once the original is gone?). I've also had some luck with
>> Omnipage.
> Since I have the original scans in the computer, rather than running
> them straight through the OCR and losing the originals that is not
> a problem. And of course I still have the original books so that I
> can proof the text with confidence.

Yes, but this means KEEPING extra "stuff" -- the original books, the
original scans, AND the output of your process.

My approach keeps the scans *as* the output -- so the books are redundant.
And, being TIFFs, they are lossless so I can do (or RE-do) the OCR
at any time -- including as I am reading them.

>>> Understand that I am making ebooks that I can carry around on
>>> different devices, not PDFs that can also be viewed on different
>>> devices, but don't necessarily have all the text correct.
>>
>> The PDF doesn't have to get the text correct; it can store the
>> image of the page (and let your eyes/brain do the OCR).
>>
>> I can store the OCRed text "behind" the image so that you can select
>> the text with your cursor (in a PC application). But, again, you
>> are stuck relying on the quality of the OCR algorithm.
>
> Does the PDF reflow the text so that the text size is the same
> size on all devices, including a phone? The EPUB format does
> that. It also resizes any illustrations so that they will fit.

No. The PDF is a (lossless) photo of the page. For "pocket books"
(i.e., the paperbacks of the 60's), my ereader screens are large enough
that it is as if I was holding the original book in my hand (but only
seeing recto or verso page-at-a-time).

If I want to read a technical paper typically typeset in 8.5x11 format,
I have to use a larger display -- or, flip the ereader to landscape mode
(so the display is 8.5" wide) and scroll through the image. This is
tedious for multicolumn layouts.

But, I could also view them "full size" on a PC's display (24" diagonal
displays are about 11 inches tall) or on a small (~14") laptop, "sideways".
Eventually, I will buy a larger tablet and install all of these documents
in its internal memory; so, the tablet will be my "library".

Even larger pages (e.g., B-size foldout schematics) require even larger
displays. But, you also would likely want the ability to easily zoom and
pan the display to examine the finer details in such documents.

> Illustrations of course have to be handled separately. I run
> any page with illustrations through a graphics program, such as
> GIMP to do any cleanup, such as cropping the image to provide
> only the illustration. Then I reinsert the illustration into
> the text file at the appropriate location.

Yes, I have to do this with "foldout" pages that exceed the capabilities
of the "small" scanner. This makes scanning service manuals a bit tedious
as they may have five 8.5x11 pages followed by three 11x17 foldouts followed
by more 8.5x11's, etc.
But, assembling the final document (PDF) is relatively easy in Acrobat;
I just import ALL of the images and then rearrange their order using the
graphical thumbnails. If a page got scanned upside down, I can flip it.
If pages were typeset to be read in landscape mode (e.g., rotate the
document to read the table on page 27), I can perform that rotation in
the PDF so the user doesn't have to turn the screen sideways.

It's also helpful as I can add other content to the "container" to
preserve it as originally packaged. E.g., audio files that accompany
the text or program listings that really want to be *attachments* and
not "in-lined".

>>> Also I don't want to destroy my paper books. I like reading
>>> books on paper. After all that is how I grew up.
>>
>> Agreed. But, if you are proactively safeguarding your collection against
>> the possibility of downsizing into a different living situation, you've
>> already decided that they will be discarded -- even if not "destroyed".
>
> As I say, I prefer real books, and have the space to keep them.

Then you are scanning as a preemptive action in the hope that when you
need to be rid of the paper, *someone* will be able to do that (I make
my plans assuming that I may not have the same physical or mental
competencies as I do, now).

E.g., SWMBO would curse me up and down if *she* had to sort through all
of my books -- even if she KNEW that they should all be discarded ("Why
the hell didn't HE do this??") Ditto my business records, software
archive, financial records, etc.

(I've seen too many people "rushed" by "unexpected events" that have had
to take a broad brush approach to discarding "stuff" because they didn't
have the time or abilities to more selectively filter it)

========== REMAINDER OF ARTICLE TRUNCATED ==========