How to delete the rest of each line after a certain pattern or a string in a file?

  • Suppose I have a list of URLs in a text file:

    I want to delete everything that comes after '.com'.

    Expected Results:

    I tried

    sed 's/.com*//' file.txt 

    but it deleted .com as well.

    Is there a specific reason you want to search for `.com` only, instead of removing everything from the first `/` character onward? What if you had a URL like `` in your list?

  • To explicitly delete everything that comes after ".com", just tweak your existing sed solution to replace ".com(anything)" with ".com":

    sed 's/\.com.*/.com/' file.txt

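    I escaped the first period in your regex. Unescaped, `.` matches any single character rather than a literal period, so the pattern would also have matched something like "". I also changed `m*` (zero or more `m`) to `m.*`, so that the rest of the line is matched.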

    Note that you may want to further anchor the ".com" pattern with a trailing forward-slash so that you don't accidentally trim something like "":

    sed 's/\.com\/.*/.com/' file.txt
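
To see how the three commands in this thread differ, they can be run side by side on throwaway input. This is only a sketch: both URLs are made-up samples (the original list is not shown here), and the variable names are purely illustrative.

```shell
# Hypothetical sample lines (not from the original list).
plain='http://www.example.com/path/page.html'
tricky='http://news.community.example.com/a/b'

# Original attempt: '.' matches any character and 'm*' means "zero or more m",
# so only ".com" itself is matched and deleted.
broken=$(printf '%s\n' "$plain" | sed 's/.com*//')

# Escaped period plus '.*': everything after ".com" is dropped.
fixed=$(printf '%s\n' "$plain" | sed 's/\.com.*/.com/')

# Without the slash anchor, the leftmost ".com" is the start of ".community".
loose=$(printf '%s\n' "$tricky" | sed 's/\.com.*/.com/')

# With the slash anchor, only ".com/" qualifies.
strict=$(printf '%s\n' "$tricky" | sed 's/\.com\/.*/.com/')

printf '%s\n' "$broken" "$fixed" "$loose" "$strict"
```

On these samples the unanchored command cuts the second URL down to `http://news.com`, while the anchored one keeps the full host name. Note that the anchored pattern leaves a line untouched when `.com` is not followed by a slash, which is usually what you want since there is nothing to trim.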

Licensed under CC-BY-SA with attribution

Content dated before 6/26/2020 9:53 AM