Tracking down where disk space has gone on Linux?

  • When administering Linux systems I often find myself struggling to track down the culprit after a partition fills up. I normally use `du / | sort -nr`, but on a large filesystem this takes a long time before any results are returned.

    Also, while this usually highlights the worst offender, in more subtle cases I've often found myself resorting to du without the sort and then having to trawl through the output.

    I'd prefer a command line solution which relies on standard Linux commands since I have to administer quite a few systems and installing new software is a hassle (especially when out of disk space!)

    @Bart thanks for taking the time to improve posts here, but may I ask you to take a little more care when adding tags? Tags aren’t for visibility, they describe the question. The disk tag isn’t appropriate here (see its tag description), and at least two of the tags you added in this suggested edit weren’t appropriate there (Kali isn’t Debian, and there are no PPAs involved).

  • Try ncdu, an excellent command-line disk usage analyser:

    (screenshot of ncdu's interactive disk-usage view)

    When I try to ./configure this, it tells me a required header is missing

    Typically, I hate being asked to install something to solve a simple issue, but this is just great.

    Install size is 81k... And it's super easy to use! :-)

    I was looking for a fast way to find what takes up disk space in an ordered way. This tool does it and it also provides sorting and easy navigation. Thank you for the reference.

    `sudo apt install ncdu` on Ubuntu gets it easily. It's great.

    You quite probably know which filesystem is short of space. In which case you can use `ncdu -x` to only count files and directories on the same filesystem as the directory being scanned.

    Best answer. Also: `sudo ncdu -rx /` should give a clean read on the biggest dirs/files on the root filesystem only (`-r` = read-only, `-x` = stay on the same filesystem, i.e. do not traverse other filesystem mounts).

    @Alf47 Required header for what? You've only listed part of the error; you're missing a library dependency. Try installing the ncurses library, which seems to be the usual culprit. The build output tells you what your system is missing. See: https://unix.stackexchange.com/a/113493/186861
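
    On Debian/Ubuntu-style systems the missing header usually comes from the ncurses development package. A minimal sketch (the exact package name is an assumption and varies by release):

    # package name is an assumption; older releases may call it libncurses5-dev
    sudo apt install libncurses-dev
    # then retry the build
    ./configure && make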

    @bshea had a great suggestion: many times on AWS it's only your root filesystem that is small; everything else is a huge EBS or EFS mount, so you only need to find and clean up the root partition.

    I have so little space that I can't install ncdu

    Hands down the best. ncdu is an amazing and beautiful tool!

    Error, can't install ncdu, E: You don't have enough free space in /var/cache/apt/archives/. :(

    Problem is... ran out of disk space so can't install another dependency :)
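
    If apt itself has no room to work with, clearing its package cache sometimes frees enough space to proceed; a minimal sketch:

    # remove downloaded .deb files from apt's cache to reclaim space
    sudo apt-get clean
    # check how much room that freed
    df -h /var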

    This is like WinDirStat for Linux users - absolutely perfect for evaluating disk consumption and dealing with out-of-control scenarios.

    Pressing r while browsing disk usage refreshes the current directory.

  • Don't go straight to du /. Use df to find the partition that's hurting you, and then try du on that mount point.
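
    For example, a quick df pass narrows things down before you run any du at all:

    # list mounted filesystems with human-readable sizes;
    # the one at or near 100% in the Use% column is the one to dig into
    df -h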

    One I like to try is

    # Locales that use "." as the decimal separator
    du -h <dir> | grep '[0-9\.]\+G'
    # Locales that use "," as the decimal separator
    du -h <dir> | grep '[0-9\,]\+G'
    

    because it prints sizes in "human readable form". Unless you've got really small partitions, grepping for directories in the gigabytes is a pretty good filter for what you want. This will take you some time, but unless you have quotas set up, I think that's just the way it's going to be.

    As @jchavannes points out in the comments, the expression can get more precise if you're finding too many false positives. I incorporated the suggestion, which does make it better, but there are still false positives, so there are just tradeoffs (a simpler expression gives worse results; a more complex, longer expression gives better results). If you have too many small directories showing up in your output, adjust your regex accordingly. For example,

    grep '^\s*[0-9\.]\+G'
    

    is even more accurate (no < 1GB directories will be listed).

    If you do have quotas, you can use

    quota -v
    

    to find users that are hogging the disk.
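
    If you are root, repquota can summarize every user's usage in one go; a minimal sketch, assuming quotas are actually enabled on the filesystem:

    # report block and inode usage, plus limits, for all users
    # on every quota-enabled filesystem
    sudo repquota -a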

    This is very quick, simple and practical

    `grep '[0-9]G'` contained a lot of false positives and also omitted any decimals. This worked better for me: `sudo du -h / | grep -P '^[0-9\.]+G'`

    @BenCollins I think you also need the -P flag for Perl regex.

    @jchavannes `-P` is unnecessary for this expression because there's nothing specific to Perl there. Also, `-P` isn't portable to systems that don't have the GNU implementation.

    Ahh. Well, having a caret at the beginning will remove false positives from directories that have a number followed by a G in their name, which I had.

    In case you have really big directories, you'll want `[GT]` instead of just `G`

    Is there a tool that will continuously monitor disk usage across all directories (lazily) in the filesystem? Something that can be streamed to a web UI? Preferably soft-realtime information.

    I like to use `du -h | sort -hr | head`

  • For a first look, use the “summary” view of du:

    du -s /*
    

    The effect is to print the size of each of its arguments, i.e. every top-level directory in the case above.

    Furthermore, both GNU du and BSD du can be depth-restricted (but POSIX du cannot!):

    • GNU (Linux, …):

      du --max-depth 3
      
    • BSD (macOS, …):

      du -d 3
      

    This will limit the output display to depth 3. The calculated and displayed size is still the total of the full depth, of course. But despite this, restricting the display depth drastically speeds up the calculation.

    Another helpful option is -h (works on both GNU and BSD but, once again, not on POSIX-only du) for “human-readable” output (i.e. using KiB, MiB etc.).

    If `du` complains about `-d`, try `--max-depth 5` instead.

    Great answer. Seems correct to me. I suggest `du -hcd 1 /directory`: -h for human readable, c for a grand total and d for depth.

    I use `du -hd 1 | sort -hr | head`

    `du --max-depth 5 -h /* 2>&1 | grep '[0-9\.]\+G' | sort -hr | head` to filter out the "Permission denied" messages

  • You can also run the following command using du:

    ~# du -Pshx /* 2>/dev/null
    
    • The -s option summarizes and displays a total for each argument.
    • h prints sizes in human-readable form (MiB, GiB, etc.).
    • x = stay on one filesystem (very useful).
    • P = don't follow symlinks (which could otherwise cause files to be counted twice, for instance).

    Be careful: the /root directory will not be shown; you have to run ~# du -Pshx /root 2>/dev/null to get that (once, I struggled for a long time before realising that my /root directory had filled up).

    Edit: Corrected option -P

    `du -Pshx .* * 2>/dev/null` to include hidden/system directories as well

    `/root/` shows without issues. Why would it not be shown?

  • Finding the biggest files on the filesystem is always going to take a long time. By definition you have to traverse the whole filesystem looking for big files. The only solution is probably to run a cron job on all your systems to have the file ready ahead of time.

    One other thing: the -x option of du is useful to keep du from descending through mount points into other filesystems. I.e.:

    du -x [path]
    

    The full command I usually run is:

    sudo du -xm / | sort -rn > usage.txt
    

    The -m means return results in megabytes, and sort -rn will sort the results largest number first. You can then open usage.txt in an editor, and the biggest folders (starting with /) will be at the top.

    Thanks for pointing out the `-x` flag!

    "Finding the biggest takes a long time…" - well, it depends, but I tend to disagree: it doesn't take that long with utilities like `ncdu` - at least quicker than `du` or `find` (depending on depth and arguments).

    Since I prefer not to be root, I had to adapt where the file is written: `sudo du -xm / | sort -rn > ~/usage.txt`

  • I always use du -sm * | sort -n, which gives you a sorted list of how much the subdirectories of the current working directory use up, in mebibytes.

    You can also try Konqueror, which has a "size view" mode, which is similar to what WinDirStat does on Windows: it gives you a visual representation of which files/directories use up most of your space.

    Update: on more recent versions, you can also use du -sh * | sort -h which will show human-readable filesizes and sort by those. (numbers will be suffixed with K, M, G, ...)

    People looking for an alternative to KDE3's Konqueror file size view may take a look at filelight, though it's not quite as nice.

    That's only Konqueror 3.x though - the file size view _still_ hasn't been ported to KDE4.

    `du -sh * | sort -h` works perfectly on my Linux (CentOS) box. Thanks!

  • I use this for the top 25 worst offenders below the current directory

    # -S: report each directory's size without its subdirectories; sorted and limited to the top 25
    du -S . | sort -nr | head -25
    

    This command did the trick to find a hidden folder that seemed to be increasing in size over time. Thanks!

    Is this in bytes?

    By default, on my system, 'du -S' gives a nice human readable output. You get a plain number of bytes for small files, then a number with a 'KB' or 'MB' suffix for bigger files.

    You could do du -Sh to get a human readable output.

    @Siddhartha If you add `-h`, it will likely change the effect of the `sort -nr` command - meaning the sort will no longer work, and then the `head` command will also no longer work

    On Ubuntu, I need to use `-h` with `du` for human readable numbers, as well as `sort -h` for human-numeric sort. With plain `sort -h` the list comes out ascending, so either use `tail` or add `-r` to reverse the order.
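
    Putting those last comments together, a sketch that keeps the sort working with human-readable sizes (assumes GNU du and sort, both of which support the -h forms):

    # human-readable per-directory sizes, biggest first, top 25
    du -Sh . | sort -hr | head -25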

  • At a previous company we used to have a cron job that was run overnight and identified any files over a certain size, e.g.

    find / -size +10000k
    

    You may want to be more selective about the directories that you are searching, and watch out for any remotely mounted drives which might go offline.

    You can use the `-xdev` option of find (spelled `-x` on BSD find) to make sure you don't descend into devices other than the one containing the start point of your find command. This fixes the remotely mounted drives issue.
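
    A minimal sketch combining both suggestions (GNU find and coreutils assumed; the 100 MB threshold is just an example):

    # stay on the starting filesystem, find regular files over 100 MB,
    # print their sizes and show the largest first
    find / -xdev -type f -size +100M -exec du -h {} + 2>/dev/null | sort -hr | head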

  • One option would be to run your du/sort command as a cron job, and output to a file, so it's already there when you need it.
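
    A minimal sketch of such a crontab entry (the schedule, scanned path and output file are placeholders):

    # root's crontab: rebuild the usage report every night at 02:30
    30 2 * * * du -xm / 2>/dev/null | sort -rn > /var/log/du-usage.txt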

  • I use

    du -ch --max-depth=2 .
    

    and I change the max-depth to suit my needs. The "c" option prints a grand total and the "h" option prints the sizes in K, M, or G as appropriate. As others have said, it still scans all the directories, but it limits the output in a way that makes it easier to spot the large ones.
