From: Paul <nospam@needed.invalid>
Newsgroups: comp.lang.c
Subject: Re: ASCII to ASCII compression.
Date: Fri, 7 Jun 2024 16:57:08 -0400
Organization: A noiseless patient Spider
Message-ID: <v3vs75$27u7g$1@dont-email.me>
References: <v3snu1$1io29$2@dont-email.me> <v3spmv$1jbjq$1@dont-email.me> <v3t150$1kia9$1@dont-email.me> <v3ukbb$20s0s$3@dont-email.me> <v3uva6$22nnp$1@dont-email.me>
In-Reply-To: <v3uva6$22nnp$1@dont-email.me>

On 6/7/2024 8:43 AM, Malcolm McLean wrote:
> On 07/06/2024 10:36, David Brown wrote:
>> On 06/06/2024 21:02, Malcolm McLean wrote:
>>> On 06/06/2024 17:55, bart wrote:
>>>> On 06/06/2024 17:25, Malcolm McLean wrote:
>>>>>
>>>>> Not strictly a C programming question, but smart people will
>>>>> see the relavance to the topicality, which is portability.
>>>>>
>>>>> Is there a compresiion algorthim which converts human language
>>>>> ASCII text to compressed ASCII, preferably only "isgraph"
>>>>> characters?
>>>>>
>>>>> So "Mary had a little lamb, its fleece was white as snow".
>>>>>
>>>>> Would become
>>>>>
>>>>> QWE£$543GtT£$"||x|VVBB?
>>>>
>>>> What's the problem with compressing to binary (using existing,
>>>> efficient utilities), then turning that binary into ASCII (like
>>>> Mime or Base64)?
>>>>
>>> Because if a single bit flips in a zip archive, it's likely the
>>> entire archive will be lost. This scheme is robust.
>>> We can emed compressed text in programs, and if it is corruped,
>>> only a single line will become unreadable.
>>
>> Ah, you want something that will work like your newsreader program
>> that randomly changes letters or otherwise corrupts your spelling
>> while leaving most of it readable? :-)
>>
>> Pass the data through a compressor and then add forward error
>> checking mechanisms such as Reed-Solomon codes. Then convert to
>> ASCII base64 or similar.
>>
> Yes, exactly.
>
> I want a system for compression which is robust to corruption, can be
> stored as text, and with a compressor / decompressor which can be
> written by a child hobby programmer with only a very little bit of
> experience of programming.
>
> That's what I need for Baby X. The FileSystem XML files can get very
> large, and of course Baby X programmers are going to ask about
> compression. And I don't think there is an existing system, and so I
> shall devise one.
>

"XML Compression"
https://link.springer.com/referenceworkentry/10.1007/978-1-4899-7993-3_783-2

   "The size increase incurred by publishing data in XML format is
    estimated to be as much as 400 % [14], making it a prime target
    for compression. While standard general-purpose compressors,
    such as zip, gzip or bzip, typically compress XML data
    reasonably well... "

Show us a "dir" or an "ls -al" so we can better understand
the magnitude of what you're working on.

Lots of things have used ZIP, implicitly or explicitly, mainly
because it is a kind of standard and does not form a barrier
to access.

In addition, if a structure is voluminous (a thousand control
files representing one project), users appreciate having them
stored in a container, rather than filling the file system
with fluff. A ZIP can do that too. And if the ZIP has a convenient
library you can get from FOSS-land, that could save time on
building a standards-based container.

But what's more important than any techie adventure is
not annoying your users. What do the users want most?
The ability to edit the files in question, on a moment's notice?
Or would the files, 99.999% of the time, comfortably remain
hidden from view?

If the "blob" involved was 100GB, then yes, I'd compress it :-)
If it is 4KB, well, those little files are a nuisance no
matter what you do. I would leave that uncompressed, unless
I could containerize it perhaps.

As an example, Mozilla has used .jsonlz4 as a file format solution.
I have no idea what problem they thought they were solving, but
I can tell you I consider the solution obnoxious and inconsiderate
of the user. LZ4 decompressors are not a stockroom item. I had to
write a very short program, so I could deal with that. Mozilla has
made a perfect example of what not to do, by doing that.

   Paul
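
For anyone stuck in the same spot with .jsonlz4, here is a minimal
sketch of the kind of short program that can deal with it. It is not
the program mentioned above, only an illustration. It assumes the
commonly described layout: an 8-byte magic "mozLz40\0", a 4-byte
little-endian uncompressed size, then a single raw LZ4 block. The
function names lz4_block_decode and mozlz4_decode are made up for
this example and do not come from any library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Decode one raw LZ4 block from src into dst.
   Returns the number of bytes written, or -1 if the input is
   malformed or dst is too small. */
long lz4_block_decode(const uint8_t *src, size_t src_len,
                      uint8_t *dst, size_t dst_cap)
{
    size_t ip = 0, op = 0;
    assert(src != NULL && dst != NULL);

    while (ip < src_len) {
        uint8_t token = src[ip++];

        /* Literal run: high nibble, extended by extra bytes
           for as long as each extra byte is 255. */
        size_t lit = token >> 4;
        if (lit == 15) {
            uint8_t b;
            do {
                if (ip >= src_len) return -1;
                b = src[ip++];
                lit += b;
            } while (b == 255);
        }
        if (lit > src_len - ip || lit > dst_cap - op) return -1;
        memcpy(dst + op, src + ip, lit);
        ip += lit;
        op += lit;

        if (ip == src_len) break;  /* last sequence: literals only */

        /* Match: 2-byte little-endian offset back into the output. */
        if (src_len - ip < 2) return -1;
        size_t offset = (size_t)src[ip] | ((size_t)src[ip + 1] << 8);
        ip += 2;
        if (offset == 0 || offset > op) return -1;

        /* Match length: low nibble plus 4, extended like literals. */
        size_t mlen = (size_t)(token & 15) + 4;
        if ((token & 15) == 15) {
            uint8_t b;
            do {
                if (ip >= src_len) return -1;
                b = src[ip++];
                mlen += b;
            } while (b == 255);
        }
        if (mlen > dst_cap - op) return -1;

        /* Copy byte by byte: source and destination may overlap. */
        for (size_t i = 0; i < mlen; i++, op++)
            dst[op] = dst[op - offset];
    }
    return (long)op;
}

/* Hypothetical wrapper for the assumed .jsonlz4 layout:
   "mozLz40\0" + uint32 LE uncompressed size + raw LZ4 block. */
long mozlz4_decode(const uint8_t *buf, size_t len,
                   uint8_t *dst, size_t dst_cap)
{
    static const uint8_t magic[8] =
        { 'm', 'o', 'z', 'L', 'z', '4', '0', '\0' };
    assert(buf != NULL && dst != NULL);
    if (len < 12 || memcmp(buf, magic, 8) != 0) return -1;

    uint32_t expect = (uint32_t)buf[8]
                    | ((uint32_t)buf[9] << 8)
                    | ((uint32_t)buf[10] << 16)
                    | ((uint32_t)buf[11] << 24);
    if (expect > dst_cap) return -1;

    long n = lz4_block_decode(buf + 12, len - 12, dst, dst_cap);
    return (n >= 0 && (size_t)n == expect) ? n : -1;
}
```

To use it, read the whole file into memory, take the size from bytes
8..11 to allocate the output buffer, and call mozlz4_decode. The
decoder is lenient: it does not enforce the LZ4 end-of-block rules,
only the bounds checks needed to stay inside both buffers.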