Sorting files according to size recursively

  • I need to find the largest files in a folder.
    How do I scan a folder recursively and sort the contents by size?

    I have tried using ls -R -S, but this lists the directories as well.
    I also tried using find.

    Do you want to list the files in each subdirectory separately or do you want to find all files in all subdirs and list them by size irrespective of which subdir they are in? Also, what do you mean by "directory" and "folder"? You seem to be using them to describe different things.

    Are you saying that you just want to list the files in a given directory as well as the files in its sub-directories, without showing the sub-directories themselves? Please try to clean up your question, it's not very clear.

  • slm

    Correct answer

    7 years ago

    You can also do this with just du. Just to be on the safe side I'm using this version of du:

    $ du --version
    du (GNU coreutils) 8.5
    

    The approach:

    $ du -ah ..DIR.. | grep -v "/$" | sort -rh
    

    Breakdown of approach

    The command du -ah DIR lists every file and directory under DIR together with its disk usage. The -a switch includes files (by default du reports directories only), and -h prints human-readable sizes, which I prefer. If you don't want them then drop that switch. The head -6 is there only to limit the amount of output.

    $ du -ah ~/Downloads/ | head -6
    4.4M    /home/saml/Downloads/kodak_W820_wireless_frame/W820_W1020_WirelessFrames_exUG_GLB_en.pdf
    624K    /home/saml/Downloads/kodak_W820_wireless_frame/easyshare_w820.pdf
    4.9M    /home/saml/Downloads/kodak_W820_wireless_frame/W820_W1020WirelessFrameExUG_GLB_en.pdf
    9.8M    /home/saml/Downloads/kodak_W820_wireless_frame
    8.0K    /home/saml/Downloads/bugs.xls
    604K    /home/saml/Downloads/netgear_gs724t/GS7xxT_HIG_5Jan10.pdf
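
    If you do drop -h, or your sort does not support -h, the same listing can be ordered numerically. The following is a minimal sketch under that assumption; list_by_kib is a hypothetical helper name, and -k (sizes in 1024-byte units) is used so the numbers mean the same thing on every system.

```shell
#!/bin/sh
# Sketch: list everything under a directory, largest first, without `sort -h`.
# list_by_kib is a hypothetical helper name, not part of du or sort.
list_by_kib() {
    # du -a   : report files as well as directories
    # du -k   : print sizes as plain numbers in 1024-byte units
    # sort -rn: numeric sort on the leading size field, descending
    du -ak "$1" | sort -rn
}

# Show the six largest entries under the given directory (default: current one).
list_by_kib "${1:-.}" | head -n 6
```

    Directory totals still appear in this output, exactly as they do with du -ah.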
    

    Easy enough to sort it smallest to biggest:

    $ du -ah ~/Downloads/ | sort -h | head -6
    0   /home/saml/Downloads/apps_archive/monitoring/nagios/nagios-check_sip-1.3/usr/lib64/nagios/plugins/check_ldaps
    0   /home/saml/Downloads/data/elasticsearch/nodes/0/indices/logstash-2013.04.06/0/index/write.lock
    0   /home/saml/Downloads/data/elasticsearch/nodes/0/indices/logstash-2013.04.06/0/translog/translog-1365292480753
    0   /home/saml/Downloads/data/elasticsearch/nodes/0/indices/logstash-2013.04.06/1/index/write.lock
    0   /home/saml/Downloads/data/elasticsearch/nodes/0/indices/logstash-2013.04.06/1/translog/translog-1365292480946
    0   /home/saml/Downloads/data/elasticsearch/nodes/0/indices/logstash-2013.04.06/2/index/write.lock
    

    Reverse it, biggest to smallest:

    $ du -ah ~/Downloads/ | sort -rh | head -6
    10G /home/saml/Downloads/
    3.8G    /home/saml/Downloads/audible/audio_books
    3.8G    /home/saml/Downloads/audible
    2.3G    /home/saml/Downloads/apps_archive
    1.5G    /home/saml/Downloads/digital_blasphemy/db1440ppng.zip
    1.5G    /home/saml/Downloads/digital_blasphemy
    

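    Don't show me the top-level directory itself (the ~/Downloads/ argument, which is the only entry ending in a slash); subdirectories are still listed: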

    $ du -ah ~/Downloads/ | grep -v "/$" | sort -rh | head -6 
    3.8G    /home/saml/Downloads/audible/audio_books
    3.8G    /home/saml/Downloads/audible
    2.3G    /home/saml/Downloads/apps_archive
    1.5G    /home/saml/Downloads/digital_blasphemy/db1440ppng.zip
    1.5G    /home/saml/Downloads/digital_blasphemy
    835M    /home/saml/Downloads/apps_archive/cad_cam_cae/salome/Salome-V6_5_0-LGPL-x86_64.run
    

    If you want the 6 largest entries listed smallest to biggest, drop the -r switch from sort and use tail -6 instead of head -6:

    $ du -ah ~/Downloads/ | grep -v "/$" | sort -h | tail -6
    835M    /home/saml/Downloads/apps_archive/cad_cam_cae/salome/Salome-V6_5_0-LGPL-x86_64.run
    1.5G    /home/saml/Downloads/digital_blasphemy
    1.5G    /home/saml/Downloads/digital_blasphemy/db1440ppng.zip
    2.3G    /home/saml/Downloads/apps_archive
    3.8G    /home/saml/Downloads/audible
    3.8G    /home/saml/Downloads/audible/audio_books
    

    The `grep -v "/$"` part doesn't seem to be doing what you expected, as the directories don't have a slash appended. Does anyone know how to exclude directories from results?

    @JanekWarchol - what version of coreutils are you using?

    I'm on 8.13. But anyway, the output in your answer doesn't have trailing `/`s either - for example `/home/saml/Downloads/audible` seems to be a directory, but it doesn't have a slash. Only `/home/saml/Downloads/` has a slash, but that's probably because you wrote it with a slash when specifying the argument for initial `du`.

    @JanekWarchol - Look at the 5th text box. The grep is just to filter the `~/Downloads/` bit out. As you've stated, it's just to filter out the argument of `~/Downloads` when `du` processes it. I changed the word directories to directory since I think that's ultimately what was causing the confusion. Thanks for the feedback!
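
    The point in the comment above can be checked on a throwaway directory. This is only an illustrative sketch: the directory layout and the variable names (demo, slash_lines, filtered) are made up for the demonstration, and it assumes du prints the starting directory the way it was typed, as in the answer's output.

```shell
#!/bin/sh
# Sketch: show which du output lines `grep -v "/$"` actually removes.
# The directory layout and variable names are invented for this demo.
demo=$(mktemp -d)
mkdir "$demo/sub"
printf 'hello\n' > "$demo/sub/file.txt"

# Count the lines that end in a slash: only the argument itself,
# because it was passed to du with a trailing slash.
slash_lines=$(du -a "$demo/" | grep -c '/$')
echo "lines ending in a slash: $slash_lines"

# What survives the filter: the subdirectory "sub" is still listed.
filtered=$(du -a "$demo/" | grep -v '/$')
printf '%s\n' "$filtered"

rm -rf "$demo"
```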

    @JanekWarchol - incidentally, to omit the directories you'll have to change tactics: use `find` to generate a list of files only, and then have `du` tally them up.

    This finds dirs also

    This doesn't list just files, but also lists directories :(

    `du --version` is not an option for me; however, the approach does work.

    Building on that solution, and the solution offered on this post: https://unix.stackexchange.com/questions/22432/getting-size-with-du-of-files-only, I was able to get a result with files only using the following command: `find . -type f -exec du -ah {} + | grep -v "/$" | sort -rh`

    @flochtililoch Super answer! It works for me :)
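
    The files-only command from the comment above can be wrapped in a small function. This is a sketch rather than a definitive version: files_by_size is a hypothetical name, the grep stage is left out on the assumption that find -type f never emits a path ending in a slash, and sort -h is assumed to be available as in the answer.

```shell
#!/bin/sh
# Sketch: list regular files only, largest first.
# files_by_size is a hypothetical helper name.
files_by_size() {
    # find -type f    : select regular files, so directory totals never appear
    # -exec du -h {} +: have du report each file's disk usage, many files per call
    # sort -rh        : order by human-readable size, descending
    find "$1" -type f -exec du -h {} + | sort -rh
}

# Show the six largest files under the given directory (default: current one).
files_by_size "${1:-.}" | head -n 6
```

    Because the pipeline is line-based, file names containing newlines would be split across lines.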

Licensed under CC-BY-SA with attribution


Content dated before 6/26/2020 9:53 AM