Path: ...!feeds.phibee-telecom.net!news.mixmin.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Richard Harnden <richard.nospam@gmail.invalid>
Newsgroups: comp.lang.c
Subject: Re: ASCII to ASCII compression.
Date: Fri, 7 Jun 2024 20:06:09 +0100
Organization: A noiseless patient Spider
Lines: 37
Message-ID: <v3vln3$26am6$1@dont-email.me>
References: <v3snu1$1io29$2@dont-email.me> <874ja657s9.fsf@bsb.me.uk> <v3t1gf$1kia9$2@dont-email.me> <v3u5a3$1ul3c$1@dont-email.me> <v3ui89$20jte$1@dont-email.me> <v3uvq9$22s77$1@dont-email.me> <v3vkn1$265uv$2@dont-email.me>
Reply-To: nospam.harnden@invalid.com
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Fri, 07 Jun 2024 21:06:11 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="459b11261669d4f8bcbaef599a03cd81"; logging-data="2304710"; mail-complaints-to="abuse@eternal-september.org"; posting-account="U2FsdGVkX19/Puyn558DKqxizXpu63K8ntSfePMEi/4="
User-Agent: Mozilla Thunderbird
Cancel-Lock: sha1:/2cSneGHuyg5MXjf1g1hYB+q3HQ=
X-Antivirus-Status: Clean
In-Reply-To: <v3vkn1$265uv$2@dont-email.me>
Content-Language: en-US
X-Antivirus: AVG (VPS 240607-6, 6/7/2024), Outbound message
Bytes: 3082

On 07/06/2024 19:49, Malcolm McLean wrote:
> On 07/06/2024 13:52, Mikko wrote:
>> On 2024-06-07 09:00:57 +0000, Malcolm McLean said:
>>
>>> Yes, but Huffman is easy to decode. It's the sort of project you give
>>> to people who have just got past the beginner stage but aren't very
>>> experienced programmers yet, whilst implementing Lempel-Ziv is a job
>>> for someone who knows what he is doing.
>>>
>>> Because the lines will often be very short, adaptive Huffman coding
>>> is no good. I need a fixed Huffman table with 128 entries, one for
>>> each 7-bit value, plus one for "stop". I wonder if any such standard
>>> table exists.
>>
>> You don't need a standard table. You need statistics. Once you have the
>> statistics the table is easy to construct with Huffman's algorithm.
>>
> No, you do. The text might be very short, like "Mary had a little lamb",
> and you will compress it because you know that you are being fed
> meaningful ASCII. For example even this tiny fragment contains the
> letter "e", which would have a short Huffman code. And four a's and two
> t's, which are the third and the second most common letters. So it should
> compress.
>
> And we're compressing each line independently, and choosing a visually
> distinctive ASCII character as the line break. So anyone seeing the
> compressed data will immediately be able to home in on the line breaks,
> and will be able to fix any corruption without special tools.
>
> And you have a standard table which never changes. And so that makes the
> decompressor much easier to write.

Will your babyx be able to handle, say, utf-8?
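
For reference, the "statistics, then Huffman's algorithm" step quoted above can be sketched in a few lines of C. This is only a minimal sketch under stated assumptions, not anything taken from babyx or from the posts: the names huffman_lengths, NSYM and STOP_SYMBOL are hypothetical, the counts are assumed to come from whatever sample text you choose, and symbols that never occur are given a count of one so that every 7-bit value still gets a code. It computes code lengths only; assigning the actual bit patterns is left out.

```c
#include <assert.h>

#define NSYM 129         /* 128 seven-bit values plus one "stop" (assumed layout) */
#define STOP_SYMBOL 128  /* hypothetical index used here for "stop" */

/* Fill len[s] with the Huffman code length, in bits, of symbol s, given
   count[s] occurrences in some sample text.  A zero count is treated as
   one so that no symbol is left without a code.  Plain O(n^2) search for
   the two lightest nodes; fine for a 129-symbol alphabet. */
void huffman_lengths(const unsigned long count[NSYM], unsigned char len[NSYM])
{
    unsigned long weight[2 * NSYM - 1];
    int parent[2 * NSYM - 1];
    int live[2 * NSYM - 1];
    int nodes = NSYM;

    assert(count != 0 && len != 0);

    /* Leaves: one node per symbol. */
    for (int i = 0; i < NSYM; i++) {
        weight[i] = count[i] ? count[i] : 1;
        parent[i] = -1;
        live[i] = 1;
    }

    /* Repeatedly merge the two lightest live nodes into a new node. */
    while (nodes < 2 * NSYM - 1) {
        int a = -1, b = -1;  /* a = lightest, b = second lightest */
        for (int i = 0; i < nodes; i++) {
            if (!live[i])
                continue;
            if (a < 0 || weight[i] < weight[a]) {
                b = a;
                a = i;
            } else if (b < 0 || weight[i] < weight[b]) {
                b = i;
            }
        }
        weight[nodes] = weight[a] + weight[b];
        parent[nodes] = -1;
        live[nodes] = 1;
        parent[a] = parent[b] = nodes;
        live[a] = live[b] = 0;
        nodes++;
    }

    /* A symbol's code length is its leaf's depth in the tree. */
    for (int s = 0; s < NSYM; s++) {
        int depth = 0;
        for (int n = s; parent[n] >= 0; n = parent[n])
            depth++;
        len[s] = (unsigned char)depth;
    }
}
```

Feed it counts gathered from representative text and the resulting lengths can be frozen into the fixed table. Note that a table with entries only for 0..127 plus "stop" has no code for bytes with the high bit set, which is exactly what UTF-8 uses for anything outside ASCII.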