Find where inodes are being used
So I received a warning from our monitoring system on one of our boxes that the number of free inodes on a filesystem was getting low.
`df -i` output shows this:
Filesystem     Inodes  IUsed  IFree IUse% Mounted on
/dev/xvda1     524288 422613 101675   81% /
As you can see, the root partition has 81% of its inodes used.
I suspect they're all being used in a single directory. But how can I find out where that is?
I saw this question over on Stack Overflow, but I didn't like any of the answers, and it really is a question that belongs here on U&L anyway.
Basically, an inode is used for each file on the filesystem. So running out of inodes generally means you've got a lot of small files lying around. So the question really becomes, "what directory has a large number of files in it?"
In this case, the filesystem we care about is the root filesystem `/`, so we can use the following command:
find / -xdev -printf '%h\n' | sort | uniq -c | sort -k 1 -n
This will dump a list of every directory on the filesystem prefixed with the number of files (and subdirectories) in that directory. Thus the directory with the largest number of files will be at the bottom.
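Broken down stage by stage, the pipeline does the following (same command as above, just annotated):

```shell
# The same pipeline, one stage per line:
find / -xdev -printf '%h\n' |  # print the parent directory of every file
    sort |                     # group identical directory names together
    uniq -c |                  # collapse each run, prefixed with its length
    sort -k 1 -n               # numeric sort on the count: biggest last
```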
In my case, this turns up the following:
  1202 /usr/share/man/man1
  2714 /usr/share/man/man3
  2826 /var/lib/dpkg/info
306588 /var/spool/postfix/maildrop
`/var/spool/postfix/maildrop` is consuming all the inodes.
Note, this answer does have three caveats that I can think of. It does not properly handle anything with newlines in the path. I know my filesystem has no files with newlines, and since this is only being used for human consumption, the potential issue isn't worth solving (and one could always switch to NUL-delimited output with `-printf '%h\0'` and `sort -z`). It also does not help if the files are spread out among a large number of directories; that isn't likely though, so I consider the risk acceptable. Finally, it will count hard links to the same file (which use only one inode) several times; again, that is unlikely to give false positives.
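For the newline caveat, a NUL-delimited variant is possible. This is a sketch assuming GNU find and GNU coreutils (`-printf`, `sort -z`, and `uniq -z` are all GNU extensions):

```shell
# Newline-safe variant: delimit records with \0 instead of \n
# (GNU find, sort, and uniq assumed).
find / -xdev -printf '%h\0' |
    sort -z |
    uniq -zc |
    sort -z -k 1 -n |
    tr '\0' '\n'
```

The final `tr` only makes the result printable again; paths containing newlines will still look odd on screen, but the counts will be correct.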
The key reason I didn't like any of the answers on the Stack Overflow question is that they all cross filesystem boundaries. Since my issue was on the root filesystem, this means they would traverse every single mounted filesystem. Even throwing `-xdev` on the find commands wouldn't work properly.
For example, the most upvoted answer is this one:
for i in `find . -type d `; do echo `ls -a $i | wc -l` $i; done | sort -n
If we change this instead to
for i in `find . -xdev -type d `; do echo `ls -a $i | wc -l` $i; done | sort -n
it still doesn't work. Even though `/mnt/foo` is a mount, it is also a directory on the root filesystem, so it'll turn up in the `find . -xdev -type d` output, and then it'll get passed to `ls -a $i`, which will dive into the mount.
The `find` in my answer instead lists the directory of every single file on the mount. So basically, with a file structure such as:
/foo/bar
/foo/baz
/pop/tart
we end up with
/foo
/foo
/pop
So we just have to count the number of duplicate lines.
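That duplicate-counting idea can be reproduced on a scratch tree. A self-contained sketch (with `-type f` added here so only the three example files are counted, not the directories themselves):

```shell
#!/bin/sh
# Build the example tree from above in a temp directory and count how
# often each parent directory appears.
demo=$(mktemp -d)
mkdir -p "$demo/foo" "$demo/pop"
touch "$demo/foo/bar" "$demo/foo/baz" "$demo/pop/tart"

# Every file contributes its parent directory (%h); duplicates = file count.
find "$demo" -xdev -type f -printf '%h\n' | sort | uniq -c | sort -k 1 -n

rm -rf "$demo"
```

This prints `/pop` with a count of 1 and `/foo` with a count of 2 (under the temp directory prefix), matching the duplicate lines above.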
`ls -a` is a bad choice for scripting recursion, because it shows `.` and `..`, so you'll get duplicated data. You can use `-A` instead of `-a`.
@MohsenPahlevanzadeh that isn't part of my answer, I was commenting on why I dislike the solution as it's a common answer to this question.
Using a bind mount is a more robust way to avoid searching other file systems, as it allows access to files under mount points. E.g., imagine I create 300,000 files under `/tmp` and then later the system is configured to mount a tmpfs on `/tmp`. Then you won't be able to find the files with `find` alone. Unlikely scenario, but worth noting.
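A sketch of the bind-mount approach (requires root; `/mnt/rootfs` here is a hypothetical scratch mount point, not anything standard):

```shell
#!/bin/sh
# Sketch only (needs root): bind-mount / somewhere else, then search there.
# Files hidden under other mounts (e.g. files created in /tmp before a
# tmpfs was mounted on it) become visible again through the bind mount.
if [ "$(id -u)" -eq 0 ]; then
    mkdir -p /mnt/rootfs
    mount --bind / /mnt/rootfs
    find /mnt/rootfs -xdev -printf '%h\n' | sort | uniq -c | sort -k 1 -n | tail -n 3
    umount /mnt/rootfs
    rmdir /mnt/rootfs
fi
```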
@StephaneChazelas Why did you put an intermediate `sort` in the command? That should not be necessary. The entries will already be grouped.
`find` may output a/b, a/b/c, a/b (try `find . -printf '%h\n' | uniq | sort | uniq -d`)
@Patrick, I recently encountered a similar sort of issue. However, in my case I knew the directory responsible for the large inode count. I could verify it by using `ls -l | wc -l`. But if I had seen this post earlier, I could have checked the file system before backing up. Nevertheless, +1 for a great answer and the explanation :)
Both work; I just had to remove the `sort`, because `sort` needs to create a temporary file when the output is big enough, which wasn't possible since I had hit 100% inode usage.
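If `sort` can't create its temporary files on the full filesystem, GNU sort's `-T` option can point them at another writable filesystem instead of dropping `sort` entirely. A sketch (`/dev/shm` is just an assumed tmpfs location; substitute any writable path on another filesystem):

```shell
# Put sort's temp files on a tmpfs instead of the full filesystem
# (GNU sort; adjust /dev/shm to taste).
find / -xdev -printf '%h\n' | sort -T /dev/shm | uniq -c | sort -T /dev/shm -k 1 -n
```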
Note that `-printf` appears to be a GNU extension to find, as the BSD version available in OS X does not support it.
@Graeme are bind mounts POSIX (as opposed to Linux-only)? Patrick: best workaround ever (out of a total of 1 that I care about)!
@XiongChiamiov seems like you're right http://pubs.opengroup.org/onlinepubs/009695399/utilities/find.html
The assumption that all the files are in a single directory is a difficult one. A lot of programs know that many files in a single directory perform badly, and thus hash the files across one or two levels of subdirectories.
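When files are hashed across subdirectories like that, no single leaf directory stands out, but totalling inodes per subtree still works. A rough sketch (using `/var` as an example starting point; adjust the path to the area under suspicion):

```shell
# Count inodes used under each top-level subtree of /var instead of per
# leaf directory, so hashed layouts still stand out.
for d in /var/*/; do
    printf '%s %s\n' "$(find "$d" -xdev 2>/dev/null | wc -l)" "$d"
done | sort -n
```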