Path: ...!npeer.as286.net!npeer-ng0.as286.net!weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Malcolm McLean <malcolm.arthur.mclean@gmail.com>
Newsgroups: comp.lang.c
Subject: Re: ASCII to ASCII compression.
Date: Thu, 6 Jun 2024 20:09:03 +0100
Organization: A noiseless patient Spider
Lines: 42
Message-ID: <v3t1gf$1kia9$2@dont-email.me>
References: <v3snu1$1io29$2@dont-email.me> <874ja657s9.fsf@bsb.me.uk>
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Thu, 06 Jun 2024 21:09:04 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="fd1a4517109bed24a981b559f1527ee3";
	logging-data="1722697"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX19ImuRVXyc1Ie0hJl1A5Mca3vsvQICW7HA="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:mw8E26mp2MIbmcx0wXHWnV1N3TA=
In-Reply-To: <874ja657s9.fsf@bsb.me.uk>
Content-Language: en-GB
Bytes: 2887

On 06/06/2024 17:56, Ben Bacarisse wrote:
> Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
> 
>> Not strictly a C programming question, but smart people will see the
>> relevance to the topicality, which is portability.
> 
> I must not be smart as I can't see any connection to the topic of this
> group!
> 
>> Is there a compression algorithm which converts human language ASCII text
>> to compressed ASCII, preferably only "isgraph" characters?
>>
>> So "Mary had a little lamb, its fleece was white as snow".
>>
>> Would become
>>
>> QWE£$543GtT£$"||x|VVBB?
> 
> Obviously such algorithms exist.  One that is used a lot is just base64
> encoding of binary compressed text, but that won't beat something
> specifically crafted for the task which is presumably what you are
> asking for.  I don't know of anything aimed at that task specifically.
> 
> One thing you should specify is whether you need it to work on small
> texts, or, even better, at what sort of size you want the pay-off to
> start to kick in.  For example, the xz+base64 encoding of the complete
> works of Shakespeare is still less than 40% of the size of the original
> but your single line will end up much larger using that off-the-shelf
> scheme.
> 
What I was thinking of was using Huffman codes to convert the ASCII to a
string of bits, then chopping that bit string into six-bit groups and
writing each group as one isgraph character. A special character would
then represent end of line, so if the compressed text gets corrupted, you
only lose a single line.

The idea is that the compressed data can then be embedded in C source.
Or stored in text files. A bit like uuencoding. But for text, not binary.
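
A minimal sketch of just the six-bit packing step, not the Huffman stage. 
The names sixbit_pack and sixbit_unpack and the 64-character alphabet are 
assumptions made up for illustration. The alphabet is all isgraph in the 
C locale and avoids '"', '\' and '?' so the output can sit inside a C 
string literal. Bits are passed one per byte to keep the example short.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical alphabet: 64 isgraph characters, safe in a C string literal. */
static const char SIXBIT_ALPHABET[65] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+-";

/* Pack nbits bits (one per byte, values 0 or 1) into printable characters.
   The final group is padded with zero bits. Writes a NUL terminator and
   returns the number of characters produced. */
size_t sixbit_pack(const unsigned char *bits, size_t nbits,
                   char *out, size_t outcap)
{
    size_t nchars = (nbits + 5) / 6;
    assert(outcap >= nchars + 1);
    for (size_t i = 0; i < nchars; i++) {
        unsigned v = 0;
        for (size_t j = 0; j < 6; j++) {
            size_t k = i * 6 + j;
            v = (v << 1) | (k < nbits ? (bits[k] & 1u) : 0u);
        }
        out[i] = SIXBIT_ALPHABET[v];
    }
    out[nchars] = '\0';
    return nchars;
}

/* Unpack a NUL-terminated string back into bits (one per byte).
   Returns the number of bits written, or (size_t)-1 if a character
   outside the alphabet is found, which is how corruption would show up. */
size_t sixbit_unpack(const char *text, unsigned char *bits, size_t bitcap)
{
    size_t n = 0;
    for (; *text != '\0'; text++) {
        const char *p = strchr(SIXBIT_ALPHABET, *text);
        if (p == NULL)
            return (size_t)-1;
        unsigned v = (unsigned)(p - SIXBIT_ALPHABET);
        assert(n + 6 <= bitcap);
        for (int j = 5; j >= 0; j--)
            bits[n++] = (unsigned char)((v >> j) & 1u);
    }
    return n;
}
```

Packing the 13 bits 000001 111111 1 gives the three characters "B-g", 
with the last group zero-padded. The decoder cannot tell padding from 
data by itself, so the Huffman stage would presumably need its own 
end-of-line code to say where each line's bits stop.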

-- 
Check out Basic Algorithms and my other books:
https://www.lulu.com/spotlight/bgy1mm