How can I find broken symlinks?

  • Is there a way to find all symbolic links that don't point anywhere?

    find ./ -type l
    

    will give me all symbolic links, but makes no distinction between links that go somewhere and links that don't.

    I'm currently doing:

    find ./ -type l -exec file {} \; | grep broken
    

    But I'm wondering what alternate solutions exist.

  • I'd strongly advise against using find -L for this task (see below for an explanation). Here are some other ways to do it:

    • If you want to use a "pure find" method, it should rather look like this:

      find . -xtype l
      

      (-xtype is a test performed on the dereferenced link.) It may not be available in all versions of find, but there are other options as well:

    • You can also exec test -e from within the find command:

      find . -type l ! -exec test -e {} \; -print
      
    • Even a grep trick can be better (i.e., safer) than find -L, though not exactly as presented in the question, which greps entire output lines, including the filenames:

       find . -type l -exec sh -c 'file -b "$1" | grep -q ^broken' sh {} \; -print
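
To try the variants above safely, here is a minimal sketch that builds a scratch directory with one valid and one dangling link and runs the `test -e` variant against it. The directory and file names are made up for illustration, and `mktemp -d` is assumed to be available (it is not required by POSIX).

```shell
# Scratch directory with one valid and one dangling symlink (illustrative names).
demo=$(mktemp -d)
touch "$demo/target"
ln -s target "$demo/good_link"
ln -s nonexistent "$demo/bad_link"

# test -e follows the link, so it fails only for links whose target is missing.
broken=$(cd "$demo" && find . -type l ! -exec test -e {} \; -print)
printf '%s\n' "$broken"

rm -rf "$demo"
```

Only the dangling link is printed. The valid link passes `test -e` and is filtered out by the `!`.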
      

    The find -L trick quoted by solo from commandlinefu looks nice and hacky, but it has one very dangerous pitfall: all symlinks are followed. Consider a directory with the contents shown below:

    $ ls -l
    total 0
    lrwxrwxrwx 1 michal users  6 May 15 08:12 link_1 -> nonexistent1
    lrwxrwxrwx 1 michal users  6 May 15 08:13 link_2 -> nonexistent2
    lrwxrwxrwx 1 michal users  6 May 15 08:13 link_3 -> nonexistent3
    lrwxrwxrwx 1 michal users  6 May 15 08:13 link_4 -> nonexistent4
    lrwxrwxrwx 1 michal users 11 May 15 08:20 link_out -> /usr/share/
    

    If you run find -L . -type l in that directory, all of /usr/share/ will be searched as well (and that can take a really long time) [1]. For a find command that is "immune to outgoing links", don't use -L.

    [1] This may look like a minor inconvenience (the command will "just" take a long time to traverse all of /usr/share), but it can have more severe consequences. For instance, consider chroot environments: they can exist in some subdirectory of the main filesystem and contain symlinks to absolute locations. Those links can appear broken to the "outside" system, because they only point to the proper places once you've entered the chroot. I also recall that some bootloader used symlinks under /boot that only made sense in an initial boot phase, when the boot partition was mounted as /.

    So if you use a find -L command to find and then delete broken symlinks from some harmless-looking directory, you might even break your system...
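
The difference can be reproduced without touching any real system directory. The sketch below uses two hypothetical scratch directories, `inside` (the tree being searched) and `outside` (standing in for something like /usr/share/), and compares what each command reports.

```shell
# "inside" is the tree we search; "outside" lives elsewhere (illustrative names).
base=$(mktemp -d)
mkdir "$base/inside" "$base/outside"
ln -s nonexistent "$base/inside/broken_here"
ln -s nonexistent "$base/outside/broken_elsewhere"
ln -s ../outside "$base/inside/link_out"

# With -L, find descends through link_out and reports links outside the tree.
with_L=$(cd "$base/inside" && find -L . -type l | sort)

# Without -L, only links that live inside the tree are examined.
without_L=$(cd "$base/inside" && find . -type l ! -exec test -e {} \; -print | sort)

printf 'with -L:\n%s\n\nwithout -L:\n%s\n' "$with_L" "$without_L"

rm -rf "$base"
```

The `-L` run lists `./link_out/broken_elsewhere` in addition to `./broken_here`, which is exactly the kind of out-of-tree link you would not want to hand to a delete command.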

    This is something I hadn't considered, @rozcietrzewiacz, and it will definitely affect my particular case. Thanks for the thorough follow-up.

    I think this might be the answer to a major performance problem we've been having with a particular script. Thanks!

    This is awesome! I've been writing throw-away Python scripts for this particular task!! :-/

    I think `-type l` is redundant since `-xtype l` will operate as `-type l` on non-links. So `find -xtype l` is probably all you need. Thanks for this approach.

    @quornian Great catch! Indeed, `find -xtype l` is enough. The only difference is in the number of system calls performed by each command (which indicates `find -type l -xtype l` should be faster). But I guess this would make a difference only on large filesystem trees.

    I don't think grepping for a string is a good solution here. If you have a different locale, the output is in a different language, and your grep expression will fail.

    I don't understand why `-xtype l` works. `find . -type l -xtype l` means "find all the symlinks to symlinks", rather than, "find all the symlinks to files that don't exist", right?

    Be aware that those solutions don't work for all filesystem types. For example, they won't work for checking whether a `/proc/XXX/exe` link is broken. For this, use `test -e "$(readlink /proc/XXX/exe)"`.

    @Flimm `find . -xtype l` means "find all symlinks whose (ultimate) target files are symlinks". But the ultimate target of a symlink cannot be a symlink, otherwise we could still follow the link and it would not be the ultimate target. Since there are no such symlinks, we can define them as something else, i.e. broken symlinks.

    @weakish I would rather say that `-xtype` follows the chain of symbolic links and evaluates the file at the end, which can only be a symbolic link in case it is broken.

    @JoóÁdám "which can only be a symbolic link in case it is broken". Giving "broken symbolic link" or "nonexistent file" a type of its own, instead of overloading `l`, would be less confusing to me.
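
The behaviour discussed in the comments above can be checked on a chain of links. This is a small sketch that assumes GNU find (`-xtype` is not in POSIX); the file and link names are invented for the demonstration.

```shell
# Two chains: one ends at a real file, the other ends nowhere (illustrative names).
demo=$(mktemp -d)
touch "$demo/file"
ln -s file "$demo/ok_end"           # ok_end    -> file
ln -s ok_end "$demo/ok_start"       # ok_start  -> ok_end -> file
ln -s nonexistent "$demo/bad_end"   # bad_end   -> (missing)
ln -s bad_end "$demo/bad_start"     # bad_start -> bad_end -> (missing)

# -xtype looks at the end of the chain; only when nothing is there does the
# entry still count as type "l".
xtype_l=$(cd "$demo" && find . -xtype l | sort)
printf '%s\n' "$xtype_l"

rm -rf "$demo"
```

Both links of the dangling chain are reported, while neither link of the chain that ends at `file` is.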

    The warning at the end is useful, but note that this does not apply to the `-L` hack but rather to (blindly) removing broken symlinks in general.

    I somehow suspected that *`-L`* was risky, and that's why I found this. On breaking systems through symlinks: I recall doing something like that with *`/dev/null`* years ago (I was able to salvage it because it was an experiment of sorts). I needn't elaborate on why there was a symlink to it; it was risky in any case. It had nothing to do with *`find`*, but it did involve recursively following (or rather, dereferencing) the link. I have a memory of having played with the same functionality before, but I don't know for sure and it's not really important. (1/2)

    As for *`-xtype l`*, some comments on how the symlink-handling options of *`find`* affect the search for broken symlinks: (1) Because of how *`-L`* works, **don't use that option** for broken symlink tests. (2) *`-P`* is the default, and *`-xtype l`* works fine with it to find broken symlinks. But (3) *`-H`* **appears** to work, though I do not know if it has any implications in different circumstances. So, given what you and the commenter suggest, it seems the thing to do is **only use** *`-xtype l`*. Oh, and correct: *`-xtype`* is **not specified in POSIX**. (2/2)

    Another comment. Because *`-xtype`* isn't available on all systems (e.g. macOS), the second option, *`find . -type l ! -exec test -e {} \; -print`*, seems to me to be the proper answer. However, and maybe this is an artefact of old books I read aeons ago, I think it should actually be *`find . -type l \! -exec test -e '{}' \; -print`*. And if you want to act on each file, maybe *`-exec`* is a better idea? Or, if you have GNU find, *`-print0 | xargs -0`* even more so.
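
As a rough sketch of the "act on each file" idea from the last comment: list the candidates first, then rerun the same tests with `-exec ... {} +`, which passes each name as a single argument, so names containing spaces stay intact. The scratch directory and names are hypothetical, and given the warning in the answer, this is only meant for a tree you have reviewed.

```shell
# Scratch directory; "bad link" contains a space on purpose (illustrative names).
demo=$(mktemp -d)
touch "$demo/real"
ln -s real "$demo/good"
ln -s nonexistent "$demo/bad link"

# Step 1: only list the candidates, so they can be reviewed first.
found=$(cd "$demo" && find . -type l ! -exec test -e {} \; -print)
printf 'would remove: %s\n' "$found"

# Step 2: same tests, but hand the matches to rm in batches with -exec ... +.
(cd "$demo" && find . -type l ! -exec test -e {} \; -exec rm -- {} +)

remaining=$(cd "$demo" && find . -type l)
printf 'symlinks left: %s\n' "$remaining"

rm -rf "$demo"
```

After the second step only the valid link remains in the scratch directory.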

Licensed under CC-BY-SA with attribution


Content dated before 6/26/2020 9:53 AM