Path: ...!news.misty.com!weretis.net!feeder9.news.weretis.net!i2pn.org!i2pn2.org!.POSTED!not-for-mail
From: fir <fir@grunge.pl>
Newsgroups: comp.lang.c
Subject: Re: program to remove duplicates
Date: Sat, 21 Sep 2024 21:27:08 +0200
Organization: i2pn2 (i2pn.org)
Message-ID: <dadedb8538fff0ea1879112d4c41900e1e005d7c@i2pn2.org>
References: <ecb505e80df00f96c99d813c534177115f3d2b15@i2pn2.org> <4fdc265edcfdf2fea23aa6fa4c1c58cc7cbde376@i2pn2.org>
MIME-Version: 1.0
Content-Type: text/plain; charset=ISO-8859-1; format=flowed
Content-Transfer-Encoding: 7bit
Injection-Date: Sat, 21 Sep 2024 19:27:08 -0000 (UTC)
Injection-Info: i2pn2.org;
logging-data="2917491"; mail-complaints-to="usenet@i2pn2.org";
posting-account="+ydHcGjgSeBt3Wz3WTfKefUptpAWaXduqfw5xdfsuS0";
User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:27.0) Gecko/20100101 Firefox/27.0 SeaMonkey/2.24
In-Reply-To: <4fdc265edcfdf2fea23aa6fa4c1c58cc7cbde376@i2pn2.org>
X-Spam-Checker-Version: SpamAssassin 4.0.0
Bytes: 3884
Lines: 96
fir wrote:
> fir wrote:
>>
>>
>> i am thinking of writing a simple command-line program
>> that removes duplicates in a given folder
>>
>> i mean one would copy the program to the given folder,
>> run it, and all duplicates and multiple copies (where
>> a duplicate means a file with a different name but
>> exactly the same size and byte content) would be removed,
>> leaving only one file from each set of copies
>>
>> this should work for a large number of files -
>> i need it because, for example, i once recovered a hard disk,
>> and since i already had some copies of files on that disk,
>> the recovered files are generally multiplied
>> and consume a lot of disk space
>>
>> so is there some approach i should take to make this
>> process faster?
>>
>> probably i would need to read the list of files and sizes in
>> the current directory, then sort or go through the list, and when two
>> files have exactly the same size, read them into ram and compare them byte by byte
>>
>> i am not sure whether to sort, as i also need to write it quickly,
>> and sorting may complicate things a bit without gaining much
>>
>> any thoughts?
>
> curiously, i could add that i once searched for programs to remove duplicates
> but they did not look good.. so such a command-line
> (or in fact command-line-less, as i may not even want to add command-line
> options) program is quite practically needed
assuming i have code to read the list of filenames in a given directory
(which i found), what do you suggest i should add to remove such duplicates?
- the code to read those filenames into a list
(tested to work, but not tested for being 100% error-free)
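#include <windows.h>
#include <stdio.h>
#include <stdlib.h>   // realloc, free, exit

// copy at most n-1 bytes from src and always null-terminate dest
void StrCopyMaxNBytes(char* dest, const char* src, int n)
{
    if (n <= 0) return;
    int i = 0;
    for (; i < n-1 && src[i]; i++) dest[i] = src[i];
    dest[i] = 0;
}

//list of file names
// (a macro, because in C a const int is not a constant expression for an array size)
#define FileNameListEntry_name_max 500
typedef struct FileNameListEntry { char name[FileNameListEntry_name_max]; } FileNameListEntry;

FileNameListEntry* FileNameList = NULL;
int FileNameList_Size = 0;

void FileNameList_AddOne(const char* name)
{
    // realloc into a temporary so the old list is not lost on failure
    FileNameListEntry* grown = (FileNameListEntry*) realloc(FileNameList,
                               (FileNameList_Size+1) * sizeof(FileNameListEntry) );
    if(!grown) { printf("out of memory\n"); exit(EXIT_FAILURE); }
    FileNameList = grown;
    StrCopyMaxNBytes(FileNameList[FileNameList_Size].name,
                     name, FileNameListEntry_name_max);
    FileNameList_Size++;
}

// collect list of filenames
void ReadDIrectoryFileNamesToList(const char* dir)
{
    WIN32_FIND_DATAA ffd;
    // FindFirstFile returns INVALID_HANDLE_VALUE (not NULL) on failure
    HANDLE h = FindFirstFileA(dir, &ffd);
    if(h == INVALID_HANDLE_VALUE) { printf("error reading directory\n"); exit(EXIT_FAILURE); }

    do {
        // skip subdirectories, keep plain files only
        if (!(ffd.dwFileAttributes & FILE_ATTRIBUTE_DIRECTORY))
            FileNameList_AddOne(ffd.cFileName);
    }
    while (FindNextFileA(h, &ffd));

    FindClose(h);
}

int main(void)
{
    ReadDIrectoryFileNamesToList("*");

    for(int i=0; i< FileNameList_Size; i++)
        printf("%d %s\n", i, FileNameList[i].name );

    free(FileNameList);
    return 0;
}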
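
one possible way to continue from the file name list above, as a rough sketch and not a finished tool: get the size of every file, sort the entries by size, and compare byte by byte only those files whose sizes match. the names file_size, files_equal, SizedFile and mark_duplicates are made up for this sketch. it uses plain stdio instead of the Win32 API, so it assumes file sizes fit in a long (on Windows that means files under 2 GB), and it only marks duplicates, it does not delete anything.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* hypothetical helper: size of a file in bytes, or -1 on error
   (assumption: the size fits in a long) */
long file_size(const char *path)
{
    FILE *f = fopen(path, "rb");
    long size = -1;
    if (!f) return -1;
    if (fseek(f, 0, SEEK_END) == 0) size = ftell(f);
    fclose(f);
    return size;
}

/* hypothetical helper: 1 if both files have identical content,
   0 if they differ, -1 if either file cannot be read */
int files_equal(const char *a, const char *b)
{
    static unsigned char bufa[65536], bufb[65536];
    FILE *fa = fopen(a, "rb");
    FILE *fb = fopen(b, "rb");
    int result = -1;
    if (fa && fb) {
        result = 1;
        for (;;) {
            /* compare chunk by chunk instead of loading whole files into ram */
            size_t na = fread(bufa, 1, sizeof bufa, fa);
            size_t nb = fread(bufb, 1, sizeof bufb, fb);
            if (na != nb || memcmp(bufa, bufb, na) != 0) { result = 0; break; }
            if (na == 0) break;   /* both files ended at the same point */
        }
        if (ferror(fa) || ferror(fb)) result = -1;
    }
    if (fa) fclose(fa);
    if (fb) fclose(fb);
    return result;
}

/* hypothetical record: one file name plus its size and a duplicate flag */
struct SizedFile { const char *name; long size; int is_dup; };

static int by_size(const void *x, const void *y)
{
    long a = ((const struct SizedFile *)x)->size;
    long b = ((const struct SizedFile *)y)->size;
    return (a > b) - (a < b);
}

/* fills in sizes, sorts by size, and sets is_dup on every file that is
   byte-identical to an earlier file of the same size;
   returns the number of files marked */
int mark_duplicates(struct SizedFile *files, int n)
{
    int count = 0;
    for (int i = 0; i < n; i++) {
        files[i].size = file_size(files[i].name);
        files[i].is_dup = 0;
    }
    qsort(files, (size_t)n, sizeof *files, by_size);
    for (int i = 0; i < n; i++) {
        if (files[i].is_dup || files[i].size < 0) continue;
        /* after sorting, files of equal size sit next to each other */
        for (int j = i + 1; j < n && files[j].size == files[i].size; j++) {
            if (!files[j].is_dup &&
                files_equal(files[i].name, files[j].name) == 1) {
                files[j].is_dup = 1;
                count++;
            }
        }
    }
    return count;
}
```

to use it one would fill an array of SizedFile from FileNameList, call mark_duplicates, and then delete (or at first just print) every entry with is_dup set. qsort is not stable, so which copy of a set survives is not defined by this sketch. the sort is cheap compared with reading the files, and it means only files of equal size are ever opened for comparison.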