From: D
Newsgroups: comp.os.linux.misc
Subject: Re: Script to conditionally find and compress files recursively
Date: Thu, 13 Jun 2024 11:55:23 +0200

On Thu, 13 Jun 2024, J Newman wrote:

> On 12/06/2024 16:13, D wrote:
>>
>> On Wed, 12 Jun 2024, Richard Kettlewell wrote:
>>
>>> Grant Taylor writes:
>>>> On 6/11/24 01:53, J Newman wrote:
>>>>> Any suggestions on how to proceed?
>>>>
>>>> As others have said, it's very difficult to tell within the first five
>>>> seconds what the ultimate compression ratio will be.
>>>
>>> Not just difficult but impossible in general: the input file could
>>> change character in its second half, switching the overall result from
>>> one that is (for example) a gzip win to an xz win.
>>
>> This is true! The only things I can imagine are parsing the file type,
>> and from that file type, drawing conclusions about the compressibility
>> of the data, or doing a flawed statistical analysis, but as said, the
>> end could be vastly different from the start.
>
> OK good point...as mentioned elsewhere my experience is with compressing
> video files with lzma.
>
> But if we accept that the script will make mistakes sometimes in choosing the
> right algorithm for compression, do you suggest parsing the file type, or
> trying to compress each file for the first 5 seconds, as the option with the
> least errors in choosing the right compression algorithm?
Hmm, I'd say parse the file type first, perhaps with a little database that maps file type to compression algorithm, and if that doesn't yield anything, fall back to "brute force".