Path: ...!weretis.net!feeder8.news.weretis.net!eternal-september.org!feeder3.eternal-september.org!news.eternal-september.org!.POSTED!not-for-mail
From: Borax Man <rotflol2@hotmail.com>
Newsgroups: comp.os.linux.misc
Subject: Re: Files tree
Date: Fri, 12 Apr 2024 15:29:02 -0000 (UTC)
Organization: A noiseless patient Spider
Lines: 109
Message-ID: <slrnv1ikps.ggn.rotflol2@zerosignal.strangled.net>
References: <uvba27$2c40q$1@dont-email.me>
Injection-Date: Fri, 12 Apr 2024 17:29:02 +0200 (CEST)
Injection-Info: dont-email.me; posting-host="52546525aeba82f1915b6a88574e5482";
	logging-data="2563073"; mail-complaints-to="abuse@eternal-september.org";	posting-account="U2FsdGVkX1+i5t6RyysjQbR0tuZ7kki2sKexsFFk4zs="
User-Agent: slrn/1.0.3 (Linux)
Cancel-Lock: sha1:7gC8AJcACNxwMkUIaCyq16BoDBM=
Bytes: 5405

On 2024-04-12, James Harris <james.harris.1@gmail.com> wrote:
> For a number of reasons I am looking for a way of recording a list of 
> the files (and file-like objects) on a Unix system at certain points in 
> time. The main output would simply be sorted text with one 
> fully-qualified file name on each line.
>
> What follows is my first attempt at it. I'd appreciate any feedback on 
> whether I am going about it the right way or whether it could be 
> improved either in concept or in coding.
>
> There are two tiny scripts. In the examples below they write to 
> temporary files f1 and f2 to test the mechanism but the idea is that the 
> reports would be stored in timestamped files so that comparisons between 
> one report and another could be made later.
>
> The first, and primary, script generates nothing other than names and is 
> as follows.
>
> export LC_ALL=C
> sudo find /\
>   -path "/proc/*" -prune -o\
>   -path "/run/*" -prune -o\
>   -path "/sys/*" -prune -o\
>   -path "/tmp/*/*" -prune -o\
>   -print0 | sort -z | tr '\0' '\n' > /tmp/f1
>
> You'll see I made some choices such as to omit files from /proc but not 
> from /dev, for example, to record any lost+found contents, to record 
> mounted filesystems, to show just one level of /tmp, etc.
>
> I am not sure I coded the command right albeit that it seems to work on 
> test cases.
>
> The output from that starts with lines such as
>
> /
> /bin
> /boot
> /boot/System.map-5.15.0-101-generic
> /boot/System.map-5.15.0-102-generic
> ...etc...
>
> Such a form would be ideal for input to grep and diff to look for 
> relevant files that have been added or removed between any two runs.
>
> The second, and less important, part is to store (in a separate file) 
> info about each of the file names as that may be relevant in some cases. 
> That takes the first file as input and has the following form.
>
> cat /tmp/f1 |\
>   tr '\n' '\0' |\
>   xargs -0 sudo ls -ld > /tmp/f2
>
> The output from that is such as
>
> drwxr-xr-x  23 root   root         4096 Apr 13  2023 /
> lrwxrwxrwx   1 root   root            7 Mar  7  2023 /bin -> usr/bin
> drwxr-xr-x   3 root   root         4096 Apr 11 11:30 /boot
> ...etc...
>
> As for run times, if anyone's interested, despite the server I ran this 
> on having multiple locally mounted filesystems and one NFS the initial 
> tests ran in 90 seconds to generate the first file and 5 minutes to 
> generate the second, which would mean (as long as no faults are found) 
> that it would be no problem to run at least the first script whenever 
> required. Other than that, I'd probably also schedule both to run each 
> night.
>
> That's the idea. As I say, comments, advice and criticisms on the idea 
> or on the coding would be appreciated!
>

One thing: GNU find has a "-printf" action, which lets you format the
output.  You can remove the need for "tr" by using this instead of
"-print0".

-printf "%P\n"

%P strips the starting point from each name, so with a starting point
of "/" the leading slash goes too, which I think is a good idea in
this case.  Use the lower-case %p instead to keep the starting point
and print the full path.
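
As a rough sketch of the difference between the two formats, assuming
GNU find (-printf and -mindepth are GNU extensions).  The helper name
list_tree and the throwaway demo directory are made up for
illustration; they are not part of the original scripts.

```shell
#!/bin/sh
# Sketch: list every path under a starting point, one per line, sorted.
# Assumes GNU find. list_tree is a hypothetical helper name.
export LC_ALL=C

list_tree() {
    # $1 = starting point
    # $2 = name format: %P drops the starting point, %p keeps it
    # -mindepth 1 skips the starting point itself, which %P would
    # otherwise print as an empty line.
    find "$1" -mindepth 1 -printf "$2\n" | sort
}

# Small throwaway tree to show both formats.
demo=$(mktemp -d)
mkdir -p "$demo/etc" "$demo/boot"
touch "$demo/etc/fstab"

list_tree "$demo" '%P'   # relative names: boot, etc, etc/fstab
list_tree "$demo" '%p'   # same entries, each prefixed with the demo path

rm -rf "$demo"
```

Note that a plain newline-separated list cannot represent file names
that themselves contain a newline, so this trades a little robustness
for simplicity.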

If you want to validate a directory tree, that is, see if it has
changed, I would recommend using mtree.  It's available in Debian
in the mtree-netbsd package.

Mtree can output a list of files, plus other attributes, to a spec
file, and can tell you later, according to the spec file, what changes
have been made.  The problem with your "find" method is that a list of
names only shows files added or removed; you can't tell if a file has
simply been modified.

Using mtree, you can do two things: generate a specification file,
which is really a list of files plus selected attributes at a point
in time, AND see what changes have been made since.  As a bonus, you
can get it to output the spec in a simple format, using the "-C"
option, and you get output very similar to "find" with a little extra
info tacked on, which you could remove using a pipe.

You could output to a spec file which has the date in the filename,
then run mtree against any previous spec file to see what has changed
between that spec and the current state.
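
Something along these lines might do as a starting point.  It is only
a sketch: it assumes the NetBSD-derived mtree is installed under the
command name "mtree", and the function names and the spec file naming
scheme are made up for illustration.  It relies on mtree's default
keyword set, which includes size and modification time.

```shell
#!/bin/sh
# Sketch only: assumes a NetBSD-derived mtree is on PATH as "mtree".
# Function names and the spec file naming scheme are hypothetical.

# Record the current state of a tree in a dated spec file and print
# the name of the spec file that was written.
snapshot_tree() {
    # $1 = directory to record, $2 = directory that holds spec files
    spec="$2/spec-$(date +%Y%m%d-%H%M%S).mtree"
    mtree -c -p "$1" > "$spec" && printf '%s\n' "$spec"
}

# Compare a tree with an earlier spec. mtree reports each difference
# and exits non-zero when the tree no longer matches the spec.
check_tree() {
    # $1 = directory to check, $2 = spec file from an earlier snapshot
    mtree -p "$1" -f "$2"
}

# Convert a spec to one line per entry with the path first, which is
# closer to what find prints.
flatten_spec() {
    # $1 = spec file
    mtree -C -f "$1"
}
```

For example, spec=$(snapshot_tree /etc /var/tmp) one night and
check_tree /etc "$spec" the next would list whatever changed in
between.  The spec directory is best kept outside the tree being
recorded, and reading everything under / would still need root.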

If you just want the list of files, find works fine, with the
suggestion I made about printing the filename, but have a look at
mtree because I think it will save you a bit of coding.

That's the thing with Linux, or computing in general: it's likely that
what you thought of has already been done, and there is a tool which
does it, or is easily adapted to do it.