Find where inodes are being used

  • So I received a warning from our monitoring system on one of our boxes that the number of free inodes on a filesystem was getting low.

    df -i output shows this:

    Filesystem       Inodes  IUsed    IFree IUse% Mounted on
    /dev/xvda1       524288 422613   101675   81% /
    

    As you can see, the root partition has 81% of its inodes used.
    I suspect they're all being used in a single directory. But how can I find where that is?

  • Patrick (correct answer)

    I saw this question over on stackoverflow, but I didn't like any of the answers, and it really is a question that should be here on U&L anyway.

    Basically an inode is used for each file on the filesystem. So running out of inodes generally means you've got a lot of small files lying around. The question then really becomes, "which directory has a large number of files in it?"

    In this case, the filesystem we care about is the root filesystem /, so we can use the following command:

    find / -xdev -printf '%h\n' | sort | uniq -c | sort -k 1 -n
    

    This will dump a list of every directory on the filesystem prefixed with the number of files (and subdirectories) in that directory. Thus the directory with the largest number of files will be at the bottom.

    In my case, this turns up the following:

       1202 /usr/share/man/man1
       2714 /usr/share/man/man3
       2826 /var/lib/dpkg/info
     306588 /var/spool/postfix/maildrop
    

    So basically /var/spool/postfix/maildrop is consuming all the inodes.

    Note, this answer does have three caveats that I can think of:

    • It does not properly handle anything with newlines in the path. I know my filesystem has no files with newlines in their names, and since this output is only for human consumption, the potential issue isn't worth solving (and one can always replace the \n with \0 and use sort -z, as sketched below).
    • It does not handle the case where the files are spread out among a large number of directories. That isn't likely though, so I consider the risk acceptable.
    • It will count hard links to the same file (which consume only one inode) several times. Again, this is unlikely to produce false positives.
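    A minimal sketch of that newline-safe variant, assuming a GNU userland where sort and uniq both accept -z/--zero-terminated; the trailing tr only converts the NULs back to newlines for display:

    find / -xdev -printf '%h\0' | sort -z | uniq -z -c | sort -z -k 1 -n | tr '\0' '\n'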


    The key reason I didn't like any of the answers on the stackoverflow question is that they all cross filesystem boundaries. Since my issue was on the root filesystem, this means the search would traverse every single mounted filesystem. Merely throwing -xdev on the find commands wouldn't even work properly.
    For example, the most upvoted answer is this one:

    for i in `find . -type d `; do echo `ls -a $i | wc -l` $i; done | sort -n
    

    If we change this instead to

    for i in `find . -xdev -type d `; do echo `ls -a $i | wc -l` $i; done | sort -n
    

    even though /mnt/foo is a mount point, it is also a directory on the root filesystem, so it'll still turn up in find . -xdev -type d, and then it'll get passed to ls -a $i, which will dive into the mounted filesystem.

    The find in my answer instead lists the directory of every single file on the mount. So basically with a file structure such as:

    /foo/bar
    /foo/baz
    /pop/tart
    

    we end up with

    /foo
    /foo
    /pop
    

    So we just have to count the number of duplicate lines.
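
    As a purely illustrative sketch (the /tmp/inode-demo path is made up), the same pipeline can be run on a tiny throwaway tree; note that the counts also include the subdirectory entries themselves, since find lists directories too:

    mkdir -p /tmp/inode-demo/foo /tmp/inode-demo/pop
    touch /tmp/inode-demo/foo/bar /tmp/inode-demo/foo/baz /tmp/inode-demo/pop/tart
    find /tmp/inode-demo -xdev -printf '%h\n' | sort | uniq -c | sort -k 1 -n

    which should turn up something like (exact padding aside):

          1 /tmp
          1 /tmp/inode-demo/pop
          2 /tmp/inode-demo
          2 /tmp/inode-demo/foo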

    `ls -a` is a bad choice for scripting recursion, because it shows `.` and `..`, so you end up with duplicated data; you can use `-A` instead of `-a`.

    @MohsenPahlevanzadeh that isn't part of my answer, I was commenting on why I dislike the solution as it's a common answer to this question.

    Using a bind mount is a more robust way to avoid searching other file systems, as it allows access to files under mount points. E.g., imagine I create 300,000 files under `/tmp` and then later the system is configured to mount a tmpfs on `/tmp`. Then you won't be able to find the files with `find` alone. Unlikely scenario, but worth noting.
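
    For reference, a minimal sketch of that bind-mount approach (the /mnt/rootfs mount point is made up; this assumes Linux's mount --bind and root privileges):

    mkdir -p /mnt/rootfs
    mount --bind / /mnt/rootfs   # a plain (non-recursive) bind mount does not carry sub-mounts, so files hidden under mount points become visible
    find /mnt/rootfs -xdev -printf '%h\n' | sort | uniq -c | sort -k 1 -n
    umount /mnt/rootfs           # clean up afterwards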

    @Graeme good point, I forgot about that one.

    @StephaneChazelas Why did you put an intermediate `sort` in the command? That should not be necessary. The entries will already be grouped.

    `find` may output a/b, a/b/c, a/b (try `find . -printf '%h\n' | uniq | sort | uniq -d`)

    Ah, good catch. I forgot that subdirectories show up in the middle of a directory's files, so the output isn't already fully grouped.

    @Patrick, I recently encountered a similar sort of issue. However, in my case I knew the directory responsible for the large inode count, and I could verify it by using `ls -l | wc -l`. But if I had seen this post earlier, I could have checked the file system once before backing up. Nevertheless, +1 for a great answer and the explanation :)

    Both work; I just had to remove the final sort, because sort needs to create a temporary file when the output is big enough, which wasn't possible since I had hit 100% inode usage.

    Note that `-printf` appears to be a GNU extension to find, as the BSD version available in OS X does not support it.

    @Graeme are bind mounts POSIX (as opposed to Linux-only)? Patrick: best workaround ever (out of a total of one that I care about)!

    The assumption that all the files are in a single directory is a difficult one. A lot of programs know that many files in a single directory perform badly, and thus hash the files across one or two levels of subdirectories.

    @PlasmaHH `du --inodes -x / | sort -n`

    is there a way to limit the depth, like `--max-depth=1` in `du`?
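
    For what it's worth, a minimal sketch assuming GNU du, whose -d/--max-depth option works together with --inodes:

    du --inodes -x -d 1 / | sort -n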

    This lists any directory that contains more than 1000 inodes (files, directories, or other): `sudo find / -xdev -printf "%h\n" | gawk '{a[$1]++}; END{for (n in a){ if (a[n]>1000){ print a[n],n } } }' | sort -nr | less`

Licensed under CC BY-SA with attribution

