count lines in a file

  • I'm sure there are many ways to do this: how can I count the number of lines in a text file?

    $ <cmd> file.txt
    1020 lines
    
  • The standard way is with wc, which takes arguments to specify what it should count (bytes, chars, words, etc.); -l is for lines:

    $ wc -l file.txt
    1020 file.txt
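
    As a small sketch of the behaviour described above: `wc -l` prints the file name after the count when given a file argument, and prints only the count when it reads from standard input. The sample file here is a hypothetical temporary file created just for illustration. Note that `wc -l` counts newline characters, so a last line without a trailing newline is not included.

```shell
# Build a small sample file (hypothetical, for illustration) with three lines.
tmp=$(mktemp)
printf 'one\ntwo\nthree\n' > "$tmp"

# With a file argument, wc prints the count followed by the file name.
wc -l "$tmp"

# Reading from standard input prints the count alone.
count=$(wc -l < "$tmp")
count=$((count))   # normalise away padding that some wc implementations add
echo "$count lines"

# wc -l counts newline characters, so an unterminated last line is not counted.
unterminated=$(printf 'one\ntwo' | wc -l)
unterminated=$((unterminated))
echo "$unterminated"

rm -f "$tmp"
```

    The redirect form is convenient in scripts, where the count usually needs to be stored in a variable without the file name attached.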
    

    How do I count the lines in a file if I want to **ignore** comments? Specifically, I want to *not* count lines that begin with a +, some white space (could be no white space) and then a %, which is the way comment lines appear in a git diff of a MATLAB file. I tried doing this with grep, but couldn't figure out the correct regular expression.

    @Gdalya I hope the following pipeline will do this (no tests were performed): `cat matlab.git.diff | sed -e '/^+[ ]*%/d' | wc -l`. `/regexp/d` deletes a line if it matches `regexp`, and `-e` introduces the script that `sed` should run.

    Why not simply `grep -v '^+ *%' matlab.git.diff | wc -l`?
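
    A minimal sketch of the comment-skipping count, assuming a made-up diff fragment (the file contents below are invented for illustration, not taken from the question). It uses `[[:space:]]` rather than a literal space so that tabs are also accepted, and shows that `grep -v` and a `sed` delete command give the same count.

```shell
# Hypothetical diff fragment: two comment lines, two code lines, one context line.
diff_file=$(mktemp)
cat > "$diff_file" <<'EOF'
+% a comment at the start of the line
+   % an indented comment
+x = 1;
+y = x + 1;  % trailing comment, still code
 unchanged context line
EOF

# -v inverts the match and -c counts the selected lines, so no wc is needed.
with_grep=$(grep -c -v '^+[[:space:]]*%' "$diff_file")

# The same filter with sed: delete matching lines, then count what is left.
with_sed=$(sed '/^+[[:space:]]*%/d' "$diff_file" | wc -l)
with_sed=$((with_sed))   # normalise away padding that some wc implementations add

echo "grep: $with_grep, sed: $with_sed"

rm -f "$diff_file"
```

    In a basic regular expression a bare `+` is a literal plus sign, so it does not need escaping here.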

    @celtschk, since this is common in comment lines: can your `grep` command be modified to also treat cases like `" + Hello"` as comments (note the space(s) before the `+`)?

    @SopalajodeArrierez: Of course it is possible: `grep -v '^ *+' matlab.git.diff | wc -l`. I'm assuming the quote signs were not actually meant to be part of the line, and that lines both with and without spaces in front of the `+` are meant to be comments. If at least one space is mandatory, either replace the star `*` with `\+` (a GNU extension in basic regular expressions), or just add another space in front of the star. Instead of matching only spaces, you probably want to match arbitrary whitespace; for this, replace the space with `[[:space:]]`. Note that I've also removed the `%` from the pattern since it's not in your example.
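
    To make the difference between optional and mandatory leading whitespace concrete, here is a short sketch on an invented four-line sample. It writes "one or more" as `[[:space:]][[:space:]]*`, which is the portable spelling of the `\+` suggestion above.

```shell
# Hypothetical sample: one '+' at column 0, one indented '+', two other lines.
sample=$(mktemp)
printf '%s\n' '+ flush plus' '   + indented plus' 'code = 1;' '  more code' > "$sample"

# Optional whitespace before '+': both '+' lines are treated as comments.
optional=$(grep -c -v '^[[:space:]]*+' "$sample")

# At least one whitespace character before '+': only the indented line is.
mandatory=$(grep -c -v '^[[:space:]][[:space:]]*+' "$sample")

echo "optional: $optional, mandatory: $mandatory"

rm -f "$sample"
```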

Licensed under CC-BY-SA with attribution


Content dated before 6/26/2020 9:53 AM