Path: ...!weretis.net!feeder9.news.weretis.net!i2pn.org!i2pn2.org!.POSTED!not-for-mail
From: D <nospam@example.net>
Newsgroups: comp.os.linux.misc
Subject: Re: Script to conditionally find and compress files recursively
Date: Thu, 13 Jun 2024 11:55:23 +0200
Organization: i2pn2 (i2pn.org)
Message-ID: <647f0226-265e-2757-bd2a-3aa89de38107@example.net>
References: <v48s96$u6fg$1@dont-email.me> <v4b46s$7dh$1@tncsrv09.home.tnetconsulting.net> <wwvo7868waw.fsf@LkoBDZeT.terraraq.uk> <083d0e35-e02d-8668-726f-7aa89980e9b2@example.net> <v4dtih$23kjq$2@dont-email.me>
MIME-Version: 1.0
Content-Type: text/plain; format=flowed; charset=US-ASCII
Injection-Info: i2pn2.org;
	logging-data="4054350"; mail-complaints-to="usenet@i2pn2.org";
	posting-account="w/4CleFT0XZ6XfSuRJzIySLIA6ECskkHxKUAYDZM66M";
X-Spam-Checker-Version: SpamAssassin 4.0.0
In-Reply-To: <v4dtih$23kjq$2@dont-email.me>
Bytes: 2473
Lines: 38



On Thu, 13 Jun 2024, J Newman wrote:

> On 12/06/2024 16:13, D wrote:
>> 
>> 
>> On Wed, 12 Jun 2024, Richard Kettlewell wrote:
>> 
>>> Grant Taylor <gtaylor@tnetconsulting.net> writes:
>>>> On 6/11/24 01:53, J Newman wrote:
>>>>> Any suggestions on how to proceed?
>>>> 
>>>> As others have said, it's very difficult to tell within the first five
>>>> seconds what the ultimate compression ratio will be.
>>> 
>>> Not just difficult but impossible in general: the input file could
>>> change character in its second half, switching the overall result from
>>> one that is (for example) a gzip win to an xz win.
>>> 
>>> 
>> 
>> This is true! The only things I can imagine are parsing the file type and,
>> from that file type, drawing conclusions about the compressibility of the
>> data, or doing a flawed statistical analysis, but as said, the end could be
>> vastly different from the start.
>
> OK good point...as mentioned elsewhere my experience is with compressing 
> video files with lzma.
>
> But if we accept that the script will make mistakes sometimes in choosing the 
> right algorithm for compression, do you suggest parsing the file type, or 
> trying to compress each file for the first 5 seconds, as the option with the 
> least errors in choosing the right compression algorithm?
>

Hmm, I'd say parse the file type first, perhaps with a little database
that maps file type to compression algorithm, and if that doesn't yield
anything, fall back to "brute force", i.e. trial-compress the file with
each candidate and keep the best result.
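
Something along these lines, as a rough sketch only. The table contents,
the 1 MiB sample size and the names TYPE_TO_ALGO, brute_force and
choose_algorithm are all my own assumptions, not anything from the thread.
It guesses the type from the file name only (no content sniffing), and the
brute-force step just compresses a leading sample, so it can still be fooled
by a file whose second half looks different, as discussed above.

```python
import bz2
import lzma
import mimetypes
import zlib

# Hypothetical "little database": MIME type -> algorithm name.
# None means "already compressed, leave the file alone".
TYPE_TO_ALGO = {
    "text/plain": "xz",
    "text/html": "xz",
    "application/json": "xz",
    "image/jpeg": None,
    "video/mp4": None,
    "application/zip": None,
}

# Candidate compressors from the Python standard library.
COMPRESSORS = {
    "gzip": lambda data: zlib.compress(data, 6),
    "bzip2": lambda data: bz2.compress(data),
    "xz": lambda data: lzma.compress(data),
}

SAMPLE_BYTES = 1 << 20  # assumed sample size: first 1 MiB of the file


def brute_force(sample):
    """Try every compressor on the sample and return the name of the one
    with the smallest output, or None if nothing shrinks the sample."""
    best_name, best_size = None, len(sample)
    for name, compress in COMPRESSORS.items():
        size = len(compress(sample))
        if size < best_size:
            best_name, best_size = name, size
    return best_name


def choose_algorithm(path):
    """Consult the type table first; fall back to brute force."""
    mime, encoding = mimetypes.guess_type(path)
    if encoding is not None:
        # Names like foo.tar.gz report an encoding: already compressed.
        return None
    if mime in TYPE_TO_ALGO:
        return TYPE_TO_ALGO[mime]
    with open(path, "rb") as fh:
        sample = fh.read(SAMPLE_BYTES)
    return brute_force(sample)
```

You would call choose_algorithm(path) for each file that find turns up and
then hand the file to the matching tool, skipping it when the answer is None.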