From: fir
Newsgroups: comp.lang.c
Subject: Re: program to remove duplicates
Date: Sat, 21 Sep 2024 21:27:08 +0200
Organization: i2pn2 (i2pn.org)
References: <4fdc265edcfdf2fea23aa6fa4c1c58cc7cbde376@i2pn2.org>

fir wrote:
> fir wrote:
>>
>> I am thinking about writing a simple command-line program
>> that removes duplicates in a given folder.
>>
>> I mean that someone should copy the program to a given folder,
>> run it, and all duplicates and multiplicates (where a
>> duplicate means a file with a different name but
>> exactly the same size and byte content) will be removed,
>> leaving only one file from each set of copies.
>>
>> This should work for a large number of files.
>> I need it because, for example, I once recovered an HDD,
>> and since I had some copies of files on this disk,
>> the removed files are generally present in multiple copies
>> and consume a lot of disk space.
>>
>> So is there some approach I need to take to make this
>> process faster?
>>
>> Probably I would need to read the list of files and sizes in
>> the current directory, then sort or go through the list, and if I find
>> files of exactly the same size, read them into RAM and compare them byte by byte.
>>
>> I'm not sure whether to do the sorting, as I also need to write this quickly,
>> and sorting may complicate things a bit without gaining much.
>>
>> Some thoughts?
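
A minimal sketch of the approach described in the quoted text, assuming the file names have already been collected by some directory-listing code: sort the entries by size, then byte-compare only files whose sizes match. The names FileEntry, file_size, files_identical and mark_duplicates are hypothetical and do not come from the original post. The sketch only marks duplicates and leaves the deleting to the caller.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical record for one file; the name is assumed to come from
   the directory-listing code, the size from file_size() below. */
typedef struct {
    const char *name;
    long size;     /* -1 if the size could not be read */
    int is_dup;    /* caller sets to 0; set to 1 if an identical file is kept */
} FileEntry;

/* Size of a file in bytes, or -1 on error.
   Limitation: long may be too small for files over 2 GB on some systems. */
long file_size(const char *path)
{
    FILE *f = fopen(path, "rb");
    long size = -1;
    if (!f) return -1;
    if (fseek(f, 0, SEEK_END) == 0) size = ftell(f);
    fclose(f);
    return size;
}

/* Returns 1 if both files hold the same bytes, 0 if they differ,
   -1 if a file could not be opened or read. Reads in chunks, so the
   files never have to fit into RAM. Not reentrant (static buffers). */
int files_identical(const char *a, const char *b)
{
    static unsigned char ba[65536], bb[65536];
    FILE *fa = fopen(a, "rb");
    FILE *fb = fopen(b, "rb");
    int same = -1;
    if (fa && fb) {
        same = 1;
        for (;;) {
            size_t na = fread(ba, 1, sizeof ba, fa);
            size_t nb = fread(bb, 1, sizeof bb, fb);
            if (na != nb || memcmp(ba, bb, na) != 0) { same = 0; break; }
            if (na == 0) {                      /* both at end, or error */
                if (ferror(fa) || ferror(fb)) same = -1;
                break;
            }
        }
    }
    if (fa) fclose(fa);
    if (fb) fclose(fb);
    return same;
}

/* qsort comparator: ascending by size. */
static int by_size(const void *pa, const void *pb)
{
    const FileEntry *a = pa, *b = pb;
    return (a->size > b->size) - (a->size < b->size);
}

/* Sorts the entries by size and marks every file that is byte-identical
   to an earlier unmarked file of the same size. Returns the number of
   entries marked as duplicates. */
size_t mark_duplicates(FileEntry *files, size_t n)
{
    size_t dups = 0;
    qsort(files, n, sizeof *files, by_size);
    for (size_t i = 0; i < n; i++) {
        if (files[i].is_dup || files[i].size < 0) continue;
        for (size_t j = i + 1; j < n && files[j].size == files[i].size; j++) {
            if (!files[j].is_dup &&
                files_identical(files[i].name, files[j].name) == 1) {
                files[j].is_dup = 1;
                dups++;
            }
        }
    }
    return dups;
}
```

After mark_duplicates() returns, the caller could walk the array and call remove(files[k].name) for every entry with is_dup set. Printing the names first and deleting only after a check is probably safer for a first run. The sort is what keeps the expensive byte comparison limited to files of equal size, so it seems worth the few extra lines.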
>
> Curiously, I could add that I once searched for programs to remove duplicates,
> but they did not look good, so such a command-line program
> (or in fact one without a command line, as I may not even want to add
> command-line options) is quite practically needed.

Assuming I have code to read in the list of filenames in a given directory
(which I found), what do you suggest I should add to remove such duplicates?

The code to read those filenames into a list (tested to work, but not
tested for being 100% error-free):

#include
#include

void StrCopyMaxNBytes(char* dest, char* src, int n)
{
  for(int i=0; i