I see that my Finding Large Files and Directories post is quite popular, yet there are a few more ways to simplify your search for the largest disk space consumers in your Unix system.
Make find command show file sizes
If you remember, the default way a find command reports results includes only the fully qualified (that means including the full path) filenames.
Now, if you look at the task of identifying the largest files, it's great to get a list of all the files bigger than some figure you specify, but what would be even better is to include the exact size of each file right in the output of the find command.
Here's how you do it: with the -printf option, it's possible to specify which information about each file you'd like to see. Check out the find command man page for all the possibilities, but in today's example I'm using two parameters: %s means the size of a file in bytes and %p means the filename including its path.
Let's say I want to get a list of all the files under the /usr directory which are larger than 15 MB each, and show the exact size of each file. Here's how it can be done:
ubuntu$ find /usr -size +15M -printf "%s - %p\n"
39859372 - /usr/lib/vmware/webAccess/java/jre1.5.0_07/lib/rt.jar
35487120 - /usr/lib/vmware/bin/vmware-hostd
16351166 - /usr/lib/vmware/bin/vmplayer
38353296 - /usr/lib/vmware/hostd/libtypes.so
54366585 - /usr/lib/vmware/hostd/docroot/client/VMware-viclient.exe
92143616 - /usr/lib/vmware/isoimages/linux.iso
23494656 - /usr/lib/vmware/isoimages/windows.iso
47070920 - /usr/lib/libgcj.so.81.0.0
20890468 - /usr/share/fonts/truetype/arphic/uming.ttf
17733780 - /usr/share/icons/crystalsvg/icon-theme.cache
18597793 - /usr/share/myspell/dicts/th_en_US_v2.dat
45345879 - /usr/src/linux-source-2.6.22.tar.bz2
Just to refresh your memory, here's the explanation of all the parameters in the command line:
- /usr is the directory where we'd like to find the files of interest
- -size +15M narrows our interest to only the files larger than 15 MB (GNU find counts M in units of 1,048,576 bytes)
- -printf "%s - %p\n" is the magic which shows the nice list of files along with their sizes.
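If you'd like to see what the format parameters do before pointing find at a real directory, here's a small sketch you can run safely. It assumes GNU find (the -printf option is a GNU extension), and show_sizes is just a name I made up for the example. It prints the size in bytes (%s), the file name without its directory (%f) and the path as find reached it (%p):

```shell
#!/bin/sh
# Sketch only: show_sizes is a hypothetical helper, not part of find.
# It needs GNU find, because -printf is a GNU extension.
show_sizes() {
    # $1 = directory to search
    # %s = size in bytes, %f = base name only, %p = path as find reached it
    find "$1" -type f -printf '%s | %f | %p\n'
}

# Demo in a throwaway directory, so nothing real is touched.
demo=$(mktemp -d)
mkdir "$demo/sub"
head -c 2048 /dev/zero > "$demo/sub/data.bin"   # a 2048-byte file
show_sizes "$demo"
rm -rf "$demo"
```

The demo prints a single line starting with 2048 | data.bin, followed by the full path of the file inside the temporary directory.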
Sort the list of files by filesize
The next really useful thing we can do is sort this list, so that we get a nicely ordered view of how big each file is. It's easily done by piping the output of the find command to the sort command:
ubuntu$ find /usr -size +15M -printf "%s - %p\n" | sort -n
16351166 - /usr/lib/vmware/bin/vmplayer
17733780 - /usr/share/icons/crystalsvg/icon-theme.cache
18597793 - /usr/share/myspell/dicts/th_en_US_v2.dat
20890468 - /usr/share/fonts/truetype/arphic/uming.ttf
23494656 - /usr/lib/vmware/isoimages/windows.iso
35487120 - /usr/lib/vmware/bin/vmware-hostd
38353296 - /usr/lib/vmware/hostd/libtypes.so
39859372 - /usr/lib/vmware/webAccess/java/jre1.5.0_07/lib/rt.jar
45345879 - /usr/src/linux-source-2.6.22.tar.bz2
47070920 - /usr/lib/libgcj.so.81.0.0
54366585 - /usr/lib/vmware/hostd/docroot/client/VMware-viclient.exe
92143616 - /usr/lib/vmware/isoimages/linux.iso
As you can see, the smallest files (just above 15 MB) are at the top of the list, and the largest ones are at the bottom.
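If you'd rather have the largest files at the top, sort can reverse the order with its -r option. Here's a minimal sketch; largest_first is only a name I picked for the example, and the three input lines are taken from the listing above:

```shell
#!/bin/sh
# Sketch only: largest_first is a hypothetical name for "sort -rn".
largest_first() {
    # -n compares the leading number, -r reverses the order
    sort -rn
}

# Three lines taken from the listing above, fed in unsorted.
printf '%s\n' \
    '16351166 - /usr/lib/vmware/bin/vmplayer' \
    '92143616 - /usr/lib/vmware/isoimages/linux.iso' \
    '35487120 - /usr/lib/vmware/bin/vmware-hostd' |
    largest_first
```

This prints the linux.iso line first, then vmware-hostd, then vmplayer.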
Limit the number of files returned by find
The last trick I'll show you today is going to make your task even easier: why look through pages of find command output if you're after only the largest files? After all, your list can be much longer than the one shown above. To solve this little problem we'll pipe the output of the previous commands to yet another Unix command, tail.
The tail command shows only the last lines of its standard input or of any text file you point it to. By default it prints the last 10 lines, which may be enough for your purposes.
Here's how you can get a list of the 10 largest files under /usr:
ubuntu$ find /usr -size +15M -printf "%s - %p\n" | sort -n | tail
18597793 - /usr/share/myspell/dicts/th_en_US_v2.dat
20890468 - /usr/share/fonts/truetype/arphic/uming.ttf
23494656 - /usr/lib/vmware/isoimages/windows.iso
35487120 - /usr/lib/vmware/bin/vmware-hostd
38353296 - /usr/lib/vmware/hostd/libtypes.so
39859372 - /usr/lib/vmware/webAccess/java/jre1.5.0_07/lib/rt.jar
45345879 - /usr/src/linux-source-2.6.22.tar.bz2
47070920 - /usr/lib/libgcj.so.81.0.0
54366585 - /usr/lib/vmware/hostd/docroot/client/VMware-viclient.exe
92143616 - /usr/lib/vmware/isoimages/linux.iso
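If 10 lines is not the number you want, tail takes the count with its -n option. Here's a small sketch with made-up input lines; top_n is a hypothetical helper name, not a standard command:

```shell
#!/bin/sh
# Sketch only: top_n is a hypothetical helper.
top_n() {
    # $1 = how many of the largest entries to keep
    sort -n | tail -n "$1"
}

# Made-up sample lines in the same "size - name" layout.
printf '%s\n' '30 - c' '10 - a' '40 - d' '20 - b' | top_n 2
```

This keeps only the two largest entries, 30 - c and 40 - d, still in ascending order.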
Show the largest 10 files in your Unix system
Now that you know all the most useful tricks, you can easily identify and show the list of the 10 largest files in your whole system. Bear in mind that you should probably run this command with root privileges, as files in your system belong to various users, and a single standard user account will most likely have insufficient privileges to even list such files.
If you're trying to locate your largest files in Ubuntu, use the sudo command (assuming you have the sudo privileges to become root):
ubuntu$ sudo find / -size +15M -printf "%s - %p\n" | sort -n | tail
Alternatively, just become root by doing something like this (you obviously need to know the root password to do that):
$ su - root
and then run the find command itself. Here's how the output looks on my Ubuntu desktop:
ubuntu$ find / -size +15M -printf "%s - %p\n" | sort -n | tail
39859372 - /usr/lib/vmware/webAccess/java/jre1.5.0_07/lib/rt.jar
45345879 - /usr/src/linux-source-2.6.22.tar.bz2
45356784 - /var/cache/apt/archives/linux-source-2.6.22_2.6.22-14.52_all.deb
45424028 - /var/cache/apt/archives/kde-icons-oxygen_4%3a4.0.2-0ubuntu1~gutsy1~ppa1_all.deb
47070920 - /usr/lib/libgcj.so.81.0.0
54366585 - /export/dist/vmware/server2b2/vmware-server-distrib/lib/hostd/docroot/client/VMware-viclient.exe
54366585 - /usr/lib/vmware/hostd/docroot/client/VMware-viclient.exe
92143616 - /export/dist/vmware/server2b2/vmware-server-distrib/lib/isoimages/linux.iso
92143616 - /usr/lib/vmware/isoimages/linux.iso
340199772 - /export/dist/vmware/server2b2/VMware-server-e.x.p-63231.x86_64.tar.gz
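When you search from /, a few extra options can make the output easier to read. The sketch below is one possible way to put them together, not the only one: biggest_files is a hypothetical helper name, it assumes GNU find, and it assumes your file names contain no tabs or newlines. It stays on one filesystem with -xdev, looks at regular files only with -type f, hides error messages, and converts bytes to MiB with awk:

```shell
#!/bin/sh
# Sketch only: biggest_files is a hypothetical helper. Needs GNU find.
biggest_files() {
    # $1 = directory to search, $2 = how many files to show
    # -xdev    : do not descend into other mounted filesystems
    # -type f  : regular files only
    # 2>/dev/null hides error messages such as "Permission denied"
    find "$1" -xdev -type f -printf '%s\t%p\n' 2>/dev/null |
        sort -rn |
        head -n "$2" |
        LC_ALL=C awk -F '\t' '{ printf "%.1f MiB\t%s\n", $1 / 1048576, $2 }'
}

# Demo in a throwaway directory with files of known sizes.
demo=$(mktemp -d)
head -c 2097152 /dev/zero > "$demo/two.bin"    # exactly 2 MiB
head -c 1048576 /dev/zero > "$demo/one.bin"    # exactly 1 MiB
head -c 10 /dev/zero > "$demo/tiny.bin"        # 10 bytes
biggest_files "$demo" 2
rm -rf "$demo"
```

The demo lists two.bin as 2.0 MiB and one.bin as 1.0 MiB, largest first, and leaves tiny.bin out. On a real system you would call it as something like biggest_files / 10, with root privileges as described above.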
That's it for today, hope this helps! Please bookmark this post if you liked it, and leave comments if there are any questions!
ThomasU says
The quotes in the command lines prevent us from copy-pasting them effectively. After copy pasting, one should replace the quotes with "
Gleb Reys says
That's a valid point, Thomas. Thanks for the comment! I'll see if this can be fixed.
Gleb Reys says
Thomas, this is fixed now. Thanks for letting me know!
scott says
The terminal on my jailbroken iPhone didn't have a man find, so this worked great! thanks
Ezequiel says
Very useful. Thanks!
Matt says
After several search queries I finally found this. This has helped me so much. Thank you!