<v3vln3$26am6$1@dont-email.me>

Path: ...!feeds.phibee-telecom.net!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Richard Harnden <richard.nospam@gmail.invalid>
Newsgroups: comp.lang.c
Subject: Re: ASCII to ASCII compression.
Date: Fri, 7 Jun 2024 20:06:09 +0100
Organization: A noiseless patient Spider
Lines: 37
Message-ID: <v3vln3$26am6$1@dont-email.me>
References: <v3snu1$1io29$2@dont-email.me> <874ja657s9.fsf@bsb.me.uk>
 <v3t1gf$1kia9$2@dont-email.me> <v3u5a3$1ul3c$1@dont-email.me>
 <v3ui89$20jte$1@dont-email.me> <v3uvq9$22s77$1@dont-email.me>
 <v3vkn1$265uv$2@dont-email.me>
Reply-To: nospam.harnden@invalid.com
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 07 Jun 2024 21:06:11 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="459b11261669d4f8bcbaef599a03cd81";
	logging-data="2304710"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX19/Puyn558DKqxizXpu63K8ntSfePMEi/4="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:/2cSneGHuyg5MXjf1g1hYB+q3HQ=
X-Antivirus-Status: Clean
In-Reply-To: <v3vkn1$265uv$2@dont-email.me>
Content-Language: en-US
X-Antivirus: AVG (VPS 240607-6, 6/7/2024), Outbound message
Bytes: 3082

On 07/06/2024 19:49, Malcolm McLean wrote:
> On 07/06/2024 13:52, Mikko wrote:
>> On 2024-06-07 09:00:57 +0000, Malcolm McLean said:
>>
>>> Yes, but Huffman is easy to decode. It's the sort of project you give 
>>> to people who have just got past the beginner stage but aren't very 
>>> experienced programmers yet, whilst implementing Lempel-Ziv is a job 
>>> for someone who knows what he is doing.
>>>
>>> Because the lines will often be very short, adaptive Huffman coding 
>>> is no good. I need a fixed Huffman table with 128 entries for each 7 
>>> bit value plus one for "stop". I wonder if any such standard table 
>>> exists.
>>
>> You don't need a standard table. You need statistics. Once you have the
>> statistics, the table is easy to construct with Huffman's algorithm.
>>
> No, you do. The text might be very short, like "Mary had a little lamb", 
> and you will compress it because you know that you are being fed 
> meaningful ASCII. For example, even this tiny fragment contains the 
> letter "e", which would have a short Huffman code. And four a's and two 
> t's, which are the third and the second most common letters. So it should 
> compress.
> 
> And we're compressing each line independently, and choosing a visually 
> distinctive ASCII character as the line break. So anyone seeing the 
> compressed data will immediately be able to home in on the line breaks, 
> and will be able to fix any corruption without special tools.
> 
> And you have a standard table which never changes. And so that makes the 
> decompressor much easier to write.
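
For reference, whichever way the table is obtained, building it from a
set of counts is short. A rough sketch, assuming 129 symbols (the 128
seven-bit values plus "stop") and counts you supply yourself. The names
NSYM, STOP and huffman_lengths are made up for illustration, and
bumping zero counts to 1 is my own assumption so that every symbol
gets a code. It only computes code lengths, not the bit patterns.

```c
#define NSYM 129   /* 128 seven-bit values plus one "stop" symbol */
#define STOP 128   /* hypothetical index for the "stop" symbol */

/* Fill len[i] with the Huffman code length of symbol i, given counts
   freq[i]. Zero counts are treated as 1 (an assumption of this sketch). */
void huffman_lengths(const unsigned long freq[NSYM], unsigned char len[NSYM])
{
    unsigned long weight[2 * NSYM - 1];
    int parent[2 * NSYM - 1];
    int alive[2 * NSYM - 1];
    int n = NSYM;

    for (int i = 0; i < NSYM; i++) {
        weight[i] = freq[i] ? freq[i] : 1;
        parent[i] = -1;
        alive[i] = 1;
    }

    /* Huffman's algorithm: repeatedly merge the two lightest live nodes. */
    while (n < 2 * NSYM - 1) {
        int a = -1, b = -1;   /* a = lightest, b = second lightest */
        for (int i = 0; i < n; i++) {
            if (!alive[i])
                continue;
            if (a < 0 || weight[i] < weight[a]) {
                b = a;
                a = i;
            } else if (b < 0 || weight[i] < weight[b]) {
                b = i;
            }
        }
        weight[n] = weight[a] + weight[b];
        parent[n] = -1;
        alive[n] = 1;
        alive[a] = alive[b] = 0;
        parent[a] = parent[b] = n;
        n++;
    }

    /* The depth of each leaf in the tree is its code length. */
    for (int i = 0; i < NSYM; i++) {
        int depth = 0;
        for (int j = i; parent[j] >= 0; j = parent[j])
            depth++;
        len[i] = (unsigned char)depth;
    }
}
```

Feed it counts taken from whatever text you consider representative;
the lengths can then be frozen into a fixed table.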

Will your babyx be able to handle, say, utf-8?
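
To make the question concrete: every byte of a multi-byte UTF-8
sequence has its top bit set, so it falls outside a table with 128
entries indexed by 7-bit value. A minimal sketch of the check an
encoder would need before using such a table (is_7bit_clean is a
made-up name, not anything from babyx):

```c
/* Returns 1 if every byte of s fits in 7 bits, so a 128-entry table
   indexed by byte value could encode it; returns 0 otherwise. */
int is_7bit_clean(const char *s)
{
    for (; *s != '\0'; s++) {
        if ((unsigned char)*s & 0x80)
            return 0;
    }
    return 1;
}
```

"Mary had a little lamb" passes; "café" encoded as UTF-8 does not,
because the é becomes the two bytes 0xC3 0xA9.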
