Path: ...!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Mikko <mikko.levanto@iki.fi>
Newsgroups: comp.lang.c
Subject: Re: ASCII to ASCII compression.
Date: Fri, 7 Jun 2024 08:20:03 +0300
Organization: -
Lines: 51
Message-ID: <v3u5a3$1ul3c$1@dont-email.me>
References: <v3snu1$1io29$2@dont-email.me> <874ja657s9.fsf@bsb.me.uk> <v3t1gf$1kia9$2@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1; format=flowed
Content-Transfer-Encoding: 8bit
Injection-Date: Fri, 07 Jun 2024 07:20:04 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="6ffc97c8aed538b65f79c8ced6ee8603";
	logging-data="2053228"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1+F9vtUf8W06e2SGUDqv6Wh"
User-Agent: Unison/2.2
Cancel-Lock: sha1:0v6n5nWzcH5Cz4OD7u/YAvw1MKE=
Bytes: 3077

On 2024-06-06 19:09:03 +0000, Malcolm McLean said:

> On 06/06/2024 17:56, Ben Bacarisse wrote:
>> Malcolm McLean <malcolm.arthur.mclean@gmail.com> writes:
>> 
>>> Not strictly a C programming question, but smart people will see the
>>> relevance to the topicality, which is portability.
>> 
>> I must not be smart as I can't see any connection to the topic of this
>> group!
>> 
>>> Is there a compression algorithm which converts human language ASCII text
>>> to compressed ASCII, preferably only "isgraph" characters?
>>> 
>>> So "Mary had a little lamb, its fleece was white as snow".
>>> 
>>> Would become
>>> 
>>> QWE£$543GtT£$"||x|VVBB?
>> 
>> Obviously such algorithms exist.  One that is used a lot is just base64
>> encoding of binary compressed text, but that won't beat something
>> specifically crafted for the task which is presumably what you are
>> asking for.  I don't know of anything aimed at that task specifically.
>> 
>> One thing you should specify is whether you need it to work on small
>> texts, or, even better, at what sort of size you want the pay-off to
>> start to kick in.  For example, the xz+base64 encoding of the complete
>> works of Shakespeare is still less than 40% of the size of the original
>> but your single line will end up much larger using that off-the-shelf
>> scheme.
>> 
> What I was thinking of was using Huffman codes to convert ASCII to a 
> string of bits.

That works if one knows, at the time one writes one's compression and
decompression algorithms, how often each short sequence of characters
will occur in the files that will be compressed. If you use
adaptive Huffman coding (or any other adaptive coding), a single error
will corrupt the rest of your line. If you reset the adaptation at the
end of each line, it does not adapt well and the result is not much
better than without adaptation. If you reset the adaptation at the
end of each page, you can get better compression, but an error corrupts
the rest of the page.
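
To make the static (non-adaptive) case concrete, here is a rough sketch in
C. It assumes the frequency table is fixed in advance and shared by both
ends, builds a canonical Huffman code from it, and packs the resulting bits
six at a time into the 64 base64 characters, all of which satisfy isgraph.
The names huff_lengths, huff_codes and huff_encode, the 32-bit limit on
code length, and the choice of output alphabet are my own assumptions for
illustration, not anything proposed earlier in the thread.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NSYM   128   /* 7-bit ASCII */
#define MAXLEN 32    /* assumption: no code is longer than 32 bits */

/* Huffman code lengths for a fixed frequency table. Symbols with
   frequency 0 get length 0 (no code). Assumes at least two symbols
   have a nonzero frequency. */
static void huff_lengths(const unsigned long freq[NSYM], unsigned len[NSYM])
{
    unsigned long w[2 * NSYM];
    int parent[2 * NSYM], live[2 * NSYM];
    int n = NSYM, i;

    for (i = 0; i < NSYM; i++) {
        w[i] = freq[i];
        parent[i] = -1;
        live[i] = freq[i] > 0;
    }
    for (;;) {
        int a = -1, b = -1;              /* the two lightest live nodes */
        for (i = 0; i < n; i++) {
            if (!live[i]) continue;
            if (a < 0 || w[i] < w[a]) { b = a; a = i; }
            else if (b < 0 || w[i] < w[b]) b = i;
        }
        if (b < 0) break;                /* at most one root left */
        w[n] = w[a] + w[b];              /* merge them under a new node */
        parent[n] = -1;
        live[n] = 1;
        live[a] = live[b] = 0;
        parent[a] = parent[b] = n;
        n++;
    }
    for (i = 0; i < NSYM; i++) {         /* code length = depth of leaf */
        int p;
        len[i] = 0;
        for (p = parent[i]; p >= 0; p = parent[p]) len[i]++;
    }
}

/* Canonical code assignment: shorter codes first, ties by symbol value. */
static void huff_codes(const unsigned len[NSYM], unsigned long code[NSYM])
{
    unsigned long next = 0;
    unsigned l;
    int i;

    for (i = 0; i < NSYM; i++) code[i] = 0;
    for (l = 1; l <= MAXLEN; l++) {
        for (i = 0; i < NSYM; i++)
            if (len[i] == l) code[i] = next++;
        next <<= 1;
    }
}

static const char ALPHABET[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

/* Encode text as Huffman bits, six bits per output character.
   Returns the number of characters written (not counting the '\0'),
   or 0 if out is too small. The last character is zero-padded. */
static size_t huff_encode(const char *text, const unsigned len[NSYM],
                          const unsigned long code[NSYM],
                          char *out, size_t outsize)
{
    unsigned acc = 0;                    /* bits not yet written out */
    int nacc = 0;
    size_t n = 0;

    for (; *text; text++) {
        unsigned c = (unsigned char)*text;
        unsigned k;
        /* every input character must have a code in the table */
        assert(c < NSYM && len[c] > 0 && len[c] <= MAXLEN);
        for (k = len[c]; k-- > 0; ) {    /* most significant bit first */
            acc = (acc << 1) | (unsigned)((code[c] >> k) & 1);
            if (++nacc == 6) {
                if (n + 1 >= outsize) return 0;
                out[n++] = ALPHABET[acc];
                acc = 0;
                nacc = 0;
            }
        }
    }
    if (nacc > 0) {                      /* flush a partial group */
        if (n + 1 >= outsize) return 0;
        out[n++] = ALPHABET[acc << (6 - nacc)];
    }
    if (n >= outsize) return 0;
    out[n] = '\0';
    return n;
}
```

With the made-up frequencies a=5, b=2, c=1, d=1, the six characters
"abacad" become 11 bits and come out as the two characters "TO". The
sketch only encodes: because the last character is zero-padded, a real
format would also need a length or an end marker so that the decoder does
not read the padding as an extra symbol. Since the table never changes, a
damaged character disturbs the decoding only until the prefix code falls
back into step, not the whole rest of the text.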

For ordinary texts (except short ones) and many other purposes Lempel-Ziv
and its variants work better than Huffman.

-- 
Mikko