Perl: Searching Through Directory Trees

I needed to scan a huge directory tree today to identify the users and Unix groups owning all the files. The problem I faced was that some usernames and group names were so long that the

find /directory -ls

command, which I normally use for such tasks, wasn’t terribly useful: there was no space delimiter left between a username and a group. The results of such a scan will later have to be parsed by other tools, which is why splitting the output cleanly into separate fields is so important.

 

This issue was motivation enough to refresh my Perl skills and sketch the following script (based entirely on the article “Never Run Unix Find Again”).

It’s a very simple piece of code which takes a directory to scan as a parameter.

How this works

As you can see, we’re using the standard File::Find module. Its find function takes two parameters: a reference to the wanted function, where you put the conditions for your search, and the directory to scan.
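To make the calling convention concrete, here is a minimal sketch of find with an anonymous callback. The helper name list_plain_files is made up for illustration and is not part of File::Find; it simply collects the full path of every plain file under a directory.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper (not part of File::Find): returns the full paths
# of all plain files found under $dir, sorted alphabetically.
sub list_plain_files {
    my ($dir) = @_;
    my @found;
    find(sub {
        # Inside the callback, $_ is the bare entry name and
        # $File::Find::name is the full path including $dir.
        push @found, $File::Find::name if -f $_;
    }, $dir);
    my @sorted = sort @found;
    return @sorted;
}

# Only runs when a directory is given on the command line.
if (@ARGV) {
    print "$_\n" for list_plain_files($ARGV[0]);
}
```

The callback is where the search conditions live: swapping -f for -d, or for a test on the file name, changes what gets collected without touching the traversal itself.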

Within this function, you call lstat to obtain all the necessary information about each directory entry, and then output the necessary fields.
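If you only need a few of the values lstat returns, a list slice keeps things short. The sketch below is just an illustration: owner_record is a hypothetical helper name, and printing the permission bits in octal is my own choice here (the script in this post prints the raw mode number instead).

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: builds one colon-delimited record for $path,
# or returns nothing if lstat fails on it.
sub owner_record {
    my ($path) = @_;
    # lstat returns a 13-element list; indexes 2, 4, 5 and 7 are
    # mode, uid, gid and size.
    my ($mode, $uid, $gid, $size) = (lstat($path))[2, 4, 5, 7];
    return unless defined $mode;
    my $user  = getpwuid($uid);
    my $group = getgrgid($gid);
    # Fall back to the numeric IDs when no name is registered.
    $user  = $uid unless defined $user;
    $group = $gid unless defined $group;
    # Keep only the permission bits and show them in octal.
    my $perms = sprintf("%04o", $mode & 07777);
    return join(":", $path, $perms, $size, $user, $group);
}

# Only runs when a path is given on the command line.
if (@ARGV) {
    my $record = owner_record($ARGV[0]);
    print defined $record ? "$record\n" : "cannot lstat $ARGV[0]\n";
}
```

Falling back to the numeric uid and gid matters on large trees, where files owned by deleted accounts are common.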

Perl code

#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# The directory to scan is the first command-line argument.
my $dir = $ARGV[0];
if (!defined $dir || $dir eq "") {
        print STDERR "Please specify a directory!\n";
        exit 1;
}

# find() takes a reference to the callback, then the directory to scan.
find(\&wanted, $dir);

sub wanted {
  my $dev;         # the file system device number
  my $ino;         # inode number
  my $mode;        # file type and permissions
  my $nlink;       # number of hard links to the file
  my $uid;         # the ID of the file's owner
  my $gid;         # the group ID of the file
  my $rdev;        # the device identifier (special files only)
  my $size;        # file size in bytes
  my $atime;       # last access time
  my $mtime;       # last modification time
  my $ctime;       # last inode change time
  my $blksize;     # preferred block size for file system I/O
  my $blocks;      # number of blocks allocated
  my $user;        # username
  my $group;       # Unix group name

  # Right below here you're telling lstat to retrieve all this info on each
  # and every file/directory. The name of the current entry is passed in $_.

  ($dev,$ino,$mode,$nlink,$uid,$gid,$rdev,$size,$atime,$mtime,$ctime,$blksize,$blocks) = lstat($_);
  return unless defined $dev;            # skip entries lstat cannot read
  $user = getpwuid($uid);
  $group = getgrgid($gid);
  $user = $uid unless defined $user;     # no name registered: use the numeric ID
  $group = $gid unless defined $group;

  print $File::Find::name . ":$mode:$size:$user:$group:$ctime:$mtime\n";
}
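
Since the whole point is to feed this output to other tools, here is a rough sketch of reading one line back in. The helper name parse_record is hypothetical. It assumes the seven-field name:mode:size:user:group:ctime:mtime layout printed by the script, and that only the file name may itself contain colons, so it takes the six trailing fields from the right.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: turns one output line into a hash reference,
# or returns nothing if the line has too few fields.
sub parse_record {
    my ($line) = @_;
    chomp $line;
    # A limit of -1 keeps trailing empty fields.
    my @parts = split /:/, $line, -1;
    return if @parts < 7;
    # The last six fields are fixed; whatever remains is the file name,
    # which may contain colons of its own.
    my @tail = splice(@parts, -6);
    my %rec;
    @rec{qw(mode size user group ctime mtime)} = @tail;
    $rec{name} = join(":", @parts);
    return \%rec;
}

# Only runs when a file of records is given on the command line.
if (@ARGV) {
    while (my $line = <>) {
        my $rec = parse_record($line) or next;
        print "$rec->{user} $rec->{group} $rec->{name}\n";
    }
}
```

File names containing newlines would still break this line-based format, so treat it as a starting point rather than a robust parser.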

Hope you find this useful. Good luck with finding all your files! 🙂

For further reading, please consult the Perldoc section on File::Find.