Find and remove large files that are open but have been deleted

  • How does one find large files that have been deleted but are still open in an application? How can one remove such a file, even though a process has it open?

    The situation is that we are running a process that is filling up a log file at a terrific rate. I know the reason, and I can fix it. Until then, I would like to rm or empty the log file without shutting down the process.

    Simply doing rm output.log removes only references to the file, but it continues to occupy space on disk until the process is terminated. Worse: after rming I now have no way to find where the file is or how big it is! Is there any way to find the file, and possibly empty it, even though it is still open in another process?

    I specifically refer to Linux-based operating systems such as Debian or RHEL.

    If you know the PID, you can use `lsof -p <pid>` to list its open files and their sizes. The deleted file will have `(deleted)` next to it, and it will probably be linked at `/proc/<pid>/fd/1`. I don't know how to make a process stop writing to its file descriptor without terminating it. I would think that depends on the process.
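
    Where lsof is not available, roughly the same check can be done on Linux by reading the symlinks under `/proc/<pid>/fd` directly. This is only a sketch: the helper name `deleted_fds` is made up for illustration, and matching on the ` (deleted)` suffix assumes no live file name happens to end that way.

```shell
#!/usr/bin/env bash
# List the file descriptors of one process that point at unlinked files (Linux only).
# Output columns: fd number, size in bytes, link target as shown by /proc.
# deleted_fds is a hypothetical helper name, not a standard tool.
deleted_fds() {
    local pid=$1 fd target
    for fd in /proc/"$pid"/fd/*; do
        # Skip descriptors that vanish or cannot be read.
        target=$(readlink "$fd" 2>/dev/null) || continue
        case $target in
            *' (deleted)')
                # stat -L follows the /proc symlink to the still-open inode.
                printf '%s\t%s\t%s\n' "${fd##*/}" "$(stat -L -c %s "$fd")" "$target"
                ;;
        esac
    done
}
```

    Calling `deleted_fds 1234` would print one line per deleted-but-open file held by PID 1234 (a placeholder PID). Inspecting another user's process needs root.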

    Thanks. How might one get the PIDs of all `rm`ed files that are still open?

    @donothingsuccessfully The "deleted" tag reported by lsof is Solaris specific, in fact Solaris 10 or later only. The OP did not specify what operating system he is using. @dotancohen On Solaris you can pipe the output of lsof to search for deleted, e.g. `lsof | grep "(deleted)"`. When there are no more processes holding a deleted file open, the kernel will free up the inode and disk blocks. Processes do not have "handlers" by which they can be notified that an open, essentially locked file has been removed from disk.

    @Johan, the `lsof | grep '(deleted)'` works on Linux as well. On Linux, you can be notified of file deletion (even files that already don't have any entry in any directory other than /proc/some-pid/fd anymore) with the inotify mechanism (IN_DELETE_SELF event)

    I created `somefile` and opened it in VIM, then `rm`ed it in another bash process. I then ran `lsof | grep somefile` and it is not in there, even though the file is open in VIM.

    @dotancohen: try running `tail -f /tmp/somefile` in one terminal and `rm /tmp/somefile` in another. `tail -f` will keep the fd open until you stop it; I'm not sure vi/vim keeps the fd open when it doesn't need it. Then use `lsof -p PID` to see all fds of process PID (i.e. the tail). To find its PID, run this before deleting the file: `ps -ef | grep '/tmp/[s]omefile'` (`[s]omething` greps for "something" but will not show the "grep ...." line itself, as that line contains "[s]omething" instead).

  • If you can't kill your application, you can truncate the log file instead of deleting it to reclaim the space. If the file was not opened in append mode (with O_APPEND), it will appear as big as before the next time the application writes to it (with the leading part sparse, reading as if it contained NUL bytes), but the space will still have been reclaimed. That does not apply to HFS+ file systems on Apple OS X, which don't support sparse files.

    To truncate it:

    : > /path/to/the/file.log
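
    The O_APPEND caveat can be observed with a small experiment. This is a sketch, not part of the original answer: the shell itself plays the "application" by holding fd 9 open, `demo_truncate` is a made-up name, and the 1 MiB size is arbitrary. In bash, `>` opens without O_APPEND and `>>` opens with it.

```shell
#!/usr/bin/env bash
# Compare how a truncated log behaves for a writer with and without O_APPEND.
# demo_truncate is a hypothetical helper; it works on a throwaway temp file.
demo_truncate() {
    local mode=$1 f
    f=$(mktemp) || return
    if [ "$mode" = append ]; then
        exec 9>>"$f"                # '>>' opens with O_APPEND
    else
        exec 9>"$f"                 # '>' opens without O_APPEND
    fi
    head -c 1048576 /dev/zero >&9   # the "application" writes 1 MiB
    : > "$f"                        # the admin truncates the log
    printf 'x' >&9                  # the application writes one more byte
    exec 9>&-
    # %s is the apparent size in bytes, %b the number of allocated blocks.
    stat -c '%s %b' "$f"
    rm -f "$f"
}

demo_truncate noappend
demo_truncate append
```

    Without O_APPEND the apparent size comes back as 1048577 bytes, because the writer keeps its old offset; with O_APPEND it is 1 byte. The block count in the second column depends on the file system, so it is best read as an indication only.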
    

    If it was already deleted, on Linux, you can still truncate it by doing:

    : > "/proc/$pid/fd/$fd"
    

    Where $pid is the process ID of the process that has the file open, and $fd is one of the file descriptors it has it open under (which you can check with lsof -p "$pid").
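
    A slightly more defensive version of the same step, as a sketch: it refuses to touch anything that is not a regular file, so a mistyped fd number does not end up pointing at a pipe, socket or terminal. The name `truncate_fd` is made up for illustration.

```shell
#!/usr/bin/env bash
# Truncate the file behind /proc/PID/fd/FD, but only if it is a regular file (Linux only).
# truncate_fd is a hypothetical helper name.
truncate_fd() {
    local path=/proc/$1/fd/$2
    # -f follows the /proc symlink, so this also works for already-deleted files.
    if [ ! -f "$path" ]; then
        echo "not a regular file: $path" >&2
        return 1
    fi
    : > "$path"
}
```

    Usage would look like `truncate_fd "$pid" "$fd"`, with the same meaning of $pid and $fd as above.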

    If you don't know the pid, and are looking for deleted files, you can do:

    lsof -nP | grep '(deleted)'
    

    lsof -nP +L1, as mentioned by @user75021, is an even better (more reliable and more portable) option: it lists open files that have fewer than 1 link.

    Or (on Linux):

    find /proc/*/fd -ls | grep '(deleted)'
    

    Or to find the large ones with zsh:

    ls -ld /proc/*/fd/*(-.LM+1l0)
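
    For shells other than zsh, a rough equivalent of that glob can be sketched with GNU `stat` and awk: regular files, link count 0, larger than a threshold. The helper name `find_unlinked_open` and the 1 MiB default are assumptions for illustration.

```shell
#!/usr/bin/env bash
# List open regular files that have no remaining links and exceed min_bytes (Linux only).
# Output columns: size in bytes, then the /proc/PID/fd/N path; largest first.
# find_unlinked_open is a hypothetical helper name.
find_unlinked_open() {
    local min_bytes=${1:-1048576}
    # -L follows each /proc symlink; %h = link count, %s = size, %F = file type, %n = path.
    # Errors for processes that exit or cannot be inspected are discarded.
    stat -L -c '%h|%s|%F|%n' /proc/[0-9]*/fd/* 2>/dev/null |
        awk -F'|' -v min="$min_bytes" '
            $1 == 0 && ($2 + 0) > (min + 0) && $3 ~ /^regular/ { print $2 "\t" $4 }
        ' |
        sort -rn
}
```

    Running `find_unlinked_open` as root covers every process; as an ordinary user it only sees your own.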
    

    An alternative, if the application is dynamically linked, is to attach a debugger to it and make it call close(fd) followed by a new open("the-file", ....).

    Thank you Stephane. However this of course depends on knowing the PID which I don't and won't know. I'll see where I can take it, though.

    There's also a `truncate` command that does the same thing more explicitly.
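
    A quick sketch of that, assuming the coreutils `truncate` is installed; the temp file stands in for the real log.

```shell
#!/usr/bin/env bash
log=$(mktemp)                     # stand-in for the real log file
head -c 4096 /dev/zero > "$log"   # give it some content
truncate -s 0 "$log"              # same effect as ': > "$log"'
size=$(stat -c %s "$log")
echo "size after truncate: $size"
rm -f "$log"
# For an already-deleted file the target would be "/proc/$pid/fd/$fd" instead.
```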

    @dotancohen Stephane edited to include info on how to do this when the pid is not known.

    very nice answer! +1 for it. However, it won't work on all unix systems (for ex: AIX 6.1 + lsof 4.82, at least, where lsof doesn't show the path to the file, and /proc/pid/fd/x acts apparently differently (not sure of the details yet)) (I know it is tagged linux only, but I'd welcome a "most unix" answer very much!)

    hmm, maybe because I didn't run it as root... apologies, i'll investigate further on AIX

    nope, even as root: AIX 6.1, lsof 4.82 doesn't show the filename. Using `procfiles -n pid` instead of `lsof -p pid` will show the filename, UNTIL you delete it (i.e. after deletion, it still shows its other information, inode, modes, etc., but what `-n` was showing (its full path: `name:.........`) is no longer shown once the corresponding file is deleted). So please, if anyone knows a solution for AIX 6.1, I'm interested.

    @OlivierDulac, `lsof` is probably going to be the closest to a portable solution you can get to list open files. the debugger approach to close the fd under the application feet should be quite portable as well.

    @StephaneChazelas: thanks. I found a way to list all PIDs which have a file open on each partition: `df -k | awk 'NR>1 { print $NF }' | xargs fuser -Vud` (and then it's easy to send signals to the offenders to force them to release the fd)

    You can also use `lsof +L1`. From the lsof man page: "A specification of the form `+L1` will select open files that have been unlinked. A specification of the form `+aL1 <file_system>` will select unlinked open files on the specified file system." That should be a bit more reliable than grepping.
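
    To sort the `+L1` results by size without depending on lsof's column layout, one option is its field output (`-F`, where `p` is the PID, `f` the descriptor, `s` the size and `n` the name). The following is a sketch under that assumption; `lsof_fields_to_table` is a made-up helper name and only reformats whatever lsof prints.

```shell
#!/usr/bin/env bash
# Turn `lsof -F pfsn` field output into "size<TAB>pid<TAB>fd<TAB>name" lines, largest first.
# lsof_fields_to_table is a hypothetical helper name.
lsof_fields_to_table() {
    awk '
        # Print the file record collected so far, then reset it.
        function emit() {
            if (fd != "") printf "%s\t%s\t%s\t%s\n", (size == "" ? 0 : size), pid, fd, name
            fd = ""; size = ""; name = ""
        }
        /^p/ { emit(); pid = substr($0, 2); next }   # start of a process set
        /^f/ { emit(); fd = substr($0, 2); next }    # start of a file set
        /^s/ { size = substr($0, 2); next }
        /^n/ { name = substr($0, 2); next }
        END  { emit() }
    ' | sort -rn
}

# Usage (needs lsof; run as root to see every process):
#   lsof -nP +L1 -F pfsn | lsof_fields_to_table
```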

    +1 in particular for including instructions on finding file handles without using lsof. I had a system that didn't have it installed, and didn't have enough space to install it.

License under CC-BY-SA with attribution


Content dated before 6/26/2020 9:53 AM