From: Malcolm McLean <malcolm.arthur.mclean@gmail.com>
Newsgroups: comp.lang.c
Subject: Re: ASCII to ASCII compression.
Date: Thu, 6 Jun 2024 20:09:03 +0100
Organization: A noiseless patient Spider
Message-ID: <v3t1gf$1kia9$2@dont-email.me>
References: <v3snu1$1io29$2@dont-email.me> <874ja657s9.fsf@bsb.me.uk>
In-Reply-To: <874ja657s9.fsf@bsb.me.uk>

On 06/06/2024 17:56, Ben Bacarisse wrote:
> Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
>
>> Not strictly a C programming question, but smart people will see the
>> relevance to the topicality, which is portability.
>
> I must not be smart as I can't see any connection to the topic of this
> group!
>
>> Is there a compression algorithm which converts human language ASCII text
>> to compressed ASCII, preferably only "isgraph" characters?
>>
>> So "Mary had a little lamb, its fleece was white as snow".
>>
>> Would become
>>
>> QWE£$543GtT£$"||x|VVBB?
>
> Obviously such algorithms exist.  One that is used a lot is just base64
> encoding of binary compressed text, but that won't beat something
> specifically crafted for the task which is presumably what you are
> asking for.  I don't know of anything aimed at that task specifically.
>
> One thing you should specify is whether you need it to work on small
> texts, or, even better, at what sort of size you want the pay-off to
> start to kick in.  For example, the xz+base64 encoding of the complete
> works of Shakespeare is still less than 40% of the size of the original
> but your single line will end up much larger using that off-the-shelf
> scheme.
>
What I was thinking of was using Huffman codes to convert ASCII to a
string of bits. Then chop the bit string into six-bit groups and
represent each group with an isgraph character. Then you have a special
character to represent end of line. So if the compressed text gets
corrupted, you only lose a single line.

The idea is that the compressed data can then be embedded in C source.
Or stored in text files. A bit like uuencoding. But for text, not binary.

-- 
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm
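
The six-bit packing step described above could be sketched in C roughly as follows. This is only an illustration under assumptions that are not in the post: the Huffman stage is taken to have already produced a string of '0' and '1' characters for one line, the 64-character alphabet is borrowed from base64, and '!' is a hypothetical choice for the end-of-line marker. The names pack_line, ALPHABET and EOL_MARK are made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* 64 printable symbols, one per 6-bit value. Assumption: the base64 set is
   used here; any 64 isgraph() characters that are safe inside a C string
   literal would serve equally well. */
static const char ALPHABET[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Hypothetical end-of-line marker: isgraph() and not part of ALPHABET. */
#define EOL_MARK '!'

/* Pack a string of '0'/'1' characters (assumed to be the Huffman code for
   one line of text) into 6-bit groups, one output character per group.
   The last group is padded with zero bits, then EOL_MARK is appended.
   Returns the number of characters written, excluding the terminating
   '\0', or 0 if the output buffer is too small. */
size_t pack_line(const char *bits, char *out, size_t outsize)
{
    size_t nbits, nsyms, i, j;

    assert(bits != NULL && out != NULL);
    nbits = strlen(bits);
    nsyms = (nbits + 5) / 6;          /* round up to whole 6-bit groups */
    if (outsize < nsyms + 2)          /* symbols + marker + '\0' */
        return 0;

    for (i = 0; i < nsyms; i++) {
        unsigned v = 0;
        for (j = 0; j < 6; j++) {
            size_t k = i * 6 + j;
            /* bits past the end of the input count as zero padding */
            v = (v << 1) | (unsigned)(k < nbits && bits[k] == '1');
        }
        out[i] = ALPHABET[v];
    }
    out[nsyms] = EOL_MARK;
    out[nsyms + 1] = '\0';
    return nsyms + 1;
}
```

Because every line ends with its own marker, a decoder can resynchronise at the next marker after a corrupted character, which is the "only lose a single line" property. The zero padding in the final group is a weak point of this sketch: a real scheme would probably need an explicit end-of-line Huffman code or a length so the decoder does not read the padding as extra symbols.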