Understanding of diff output

  • I have file1.txt

    this is the original text  
    line2  
    line3  
    line4  
    happy hacking !  
    

    and file2.txt

    this is the original text  
    line2  
    line4  
    happy hacking !  
    GNU is not UNIX  
    

    if I do: diff file1.txt file2.txt I get:

    3d2  
    < line3  
    5a5  
    > GNU is not UNIX  
    

    How is the output generally interpreted? I think that < means removed but what do 3d2 or 5a5 mean?

    If I do:

    $ diff -u file1.txt file2.txt  
    --- file1.txt        2013-07-06 17:44:59.180000000 +0200  
    +++ file2.txt        2013-07-06 17:39:53.433000000 +0200  
    @@ -1,5 +1,5 @@  
     this is the original text  
     line2  
    -line3  
     line4  
     happy hacking !  
    +GNU is not UNIX  
    

    The results are clearer but what does @@ -1,5 +1,5 @@ mean?

    Thanks for nothing `man` entry!

  • In your first diff output (the so-called "normal" format), the meaning is as follows:

    < - denotes lines in file1.txt

    > - denotes lines in file2.txt

    3d2 and 5a5 are change commands: they give the line numbers affected and the action performed. d stands for delete, a for add (append), and c for change. The number to the left of the letter is a line number in file1.txt; the number to the right is a line number in file2.txt. Both always refer to the original numbering of each file. So 3d2 tells you that line 3 of file1.txt was deleted, and that it would have appeared after line 2 of file2.txt had it not been deleted. 5a5 tells you that a line was added after line 5 of file1.txt, and that this added line is line 5 of file2.txt.
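
    To check this reading against real output, the files from the question can be recreated in a scratch directory. This is a minimal sketch, assuming a diff that prints the normal format by default (GNU diffutils does); the temporary directory and the normal_out variable are there only for illustration.

```shell
# Recreate the two files from the question in a scratch directory.
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'this is the original text' line2 line3 line4 'happy hacking !' > file1.txt
printf '%s\n' 'this is the original text' line2 line4 'happy hacking !' 'GNU is not UNIX' > file2.txt

# diff exits with status 1 when the files differ; "|| true" keeps that
# from being treated as a failure.
normal_out=$(diff file1.txt file2.txt || true)
printf '%s\n' "$normal_out"
# 3d2                 <- line 3 of file1.txt is deleted
# < line3
# 5a5                 <- after line 5 of file1.txt, add line 5 of file2.txt
# > GNU is not UNIX
```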

    The output of diff -u is formatted a bit differently (the so-called "unified" format). Here diff shows a single piece of text instead of two separate ones. In the line @@ -1,5 +1,5 @@, the part -1,5 relates to file1.txt and the part +1,5 to file2.txt. They tell us that the hunk covers 5 lines of file1.txt starting at line 1, and likewise 5 lines of file2.txt starting at line 1.

    As I have already said, the lines from both files are shown together

     this is the original text  
     line2  
    -line3  
     line4  
     happy hacking !  
    +GNU is not UNIX  
    

    Here - marks the lines that were deleted from file1.txt and + marks the lines that were added in file2.txt. Lines that start with a space are unchanged context.
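
    The two counts in the hunk header can be verified by counting lines in the hunk itself. The sketch below assumes the same two files as in the question; the variable names (unified_out, header, old_count, new_count) are made up for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'this is the original text' line2 line3 line4 'happy hacking !' > file1.txt
printf '%s\n' 'this is the original text' line2 line4 'happy hacking !' 'GNU is not UNIX' > file2.txt

# The first two output lines (--- and +++) carry file names and
# timestamps; sed drops them so only the hunk remains.
unified_out=$({ diff -u file1.txt file2.txt || true; } | sed '1,2d')

header=$(printf '%s\n' "$unified_out" | sed -n '1p')
printf '%s\n' "$header"
# @@ -1,5 +1,5 @@

# Lines belonging to file1.txt start with a space or "-";
# lines belonging to file2.txt start with a space or "+".
old_count=$(printf '%s\n' "$unified_out" | sed '1d' | grep -c '^[ -]')
new_count=$(printf '%s\n' "$unified_out" | sed '1d' | grep -c '^[ +]')
printf 'file1.txt lines in hunk: %s\n' "$old_count"
printf 'file2.txt lines in hunk: %s\n' "$new_count"
```

    Both counts come out as 5, matching the ,5 after each starting line number in the header.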

  • Summary:

    Given diff file1 file2, < means the line is missing in file2 and > means the line is missing in file1. The 3d2 and 5a5 can usually be ignored when reading the output by eye; they are commands for patch, which is often used with diff.

    Full Answer:

    Many *nix utilities offer Texinfo manuals as well as the simpler man pages. You can access these by running the info command, for example info diff. In this case, the section you are interested in is:

    2.4.2 Detailed Description of Normal Format


    The normal output format consists of one or more hunks of differences; each hunk shows one area where the files differ. Normal format hunks look like this:

     CHANGE-COMMAND
     < FROM-FILE-LINE
     < FROM-FILE-LINE...
     ---
     > TO-FILE-LINE
     > TO-FILE-LINE...
    

    There are three types of change commands. Each consists of a line number or comma-separated range of lines in the first file, a single character indicating the kind of change to make, and a line number or comma-separated range of lines in the second file. All line numbers are the original line numbers in each file. The types of change commands are:

    `LaR'
         Add the lines in range R of the second file after line L of the
         first file.  For example, `8a12,15' means append lines 12-15 of
         file 2 after line 8 of file 1; or, if changing file 2 into file 1,
         delete lines 12-15 of file 2.
    
    `FcT'
         Replace the lines in range F of the first file with lines in range
         T of the second file.  This is like a combined add and delete, but
         more compact.  For example, `5,7c8,10' means change lines 5-7 of
         file 1 to read as lines 8-10 of file 2; or, if changing file 2 into
         file 1, change lines 8-10 of file 2 to read as lines 5-7 of file 1.
    
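    `RdL'
         Delete the lines in range R from the first file; line L is where
         they would have appeared in the second file had they not been
         deleted.  For example, `5,7d3' means delete lines 5-7 of file 1;
         or, if changing file 2 into file 1, append lines 5-7 of file 1
         after line 3 of file 2.
    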
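    The question's files only produce d and a hunks. A change (c) hunk, including the --- separator from the hunk layout above, can be seen with a small made-up pair of files. This is a sketch assuming GNU diff's normal format; old.txt, new.txt and change_out are hypothetical names.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' a b c d > old.txt
printf '%s\n' a X d > new.txt

# Lines 2-3 of old.txt (b, c) are replaced by line 2 of new.txt (X).
change_out=$(diff old.txt new.txt || true)
printf '%s\n' "$change_out"
# 2,3c2
# < b
# < c
# ---
# > X
```
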
  • The above answers are good. However, as a beginner, I found them slightly difficult to understand, and upon searching further I found a very useful link: Linux Diff Command & Examples

    The site explains the concept in a simple and easy to understand manner.

    The diff command is easier to understand if you consider it this way:

    Essentially, it outputs a set of instructions for how to change one file to make it identical to the second file.

    Each of the following cases is explained well:

    a for add, c for change, d for delete
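
    The "set of instructions" view can be tried out by feeding diff's output to patch. The following is only a sketch: it assumes the patch utility is installed (it is skipped otherwise), and the file names patched.txt and changes.diff and the variable patch_result are invented for the example.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

printf '%s\n' 'this is the original text' line2 line3 line4 'happy hacking !' > file1.txt
printf '%s\n' 'this is the original text' line2 line4 'happy hacking !' 'GNU is not UNIX' > file2.txt

# Save the instructions that turn file1.txt into file2.txt.
diff file1.txt file2.txt > changes.diff || true

# Apply them to a copy of file1.txt and compare the result with file2.txt.
cp file1.txt patched.txt
if command -v patch >/dev/null 2>&1; then
    patch -s patched.txt changes.diff
    if cmp -s patched.txt file2.txt; then
        patch_result=identical
    else
        patch_result=different
    fi
else
    patch_result=skipped   # patch is not installed on this system
fi
printf '%s\n' "$patch_result"
```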

  • I suggest using:

    diff -rupP file1.txt file2.txt > result.patch

    Then, when you read result.patch, you will instantly know the difference.

    These are the meanings of the command line switches:

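    -r: recursive; when comparing directories, also compare the subdirectories found in them

    -u: use the unified output format (three lines of context by default)

    -p (lowercase): show which C function each change is in

    -P (uppercase): when comparing directories, treat a file that is missing from the first directory as empty, so a new file shows up as a full addition

    Note that -r and -P only have an effect when the arguments are directories.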
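
    The effect of -r and -P is easiest to see on two small directories. This sketch assumes GNU diff; the directory names old and new and the files inside them are made up, and LC_ALL=C is set only so that diff's messages appear in English.

```shell
LC_ALL=C
export LC_ALL

workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir old new
printf '%s\n' 'same line' > old/common.txt
printf '%s\n' 'same line' > new/common.txt
printf '%s\n' hello > new/extra.txt

# Without -P, a file that exists only in the second directory is just reported.
without_P=$(diff -ru old new || true)
printf '%s\n' "$without_P"
# Only in new: extra.txt

# With -P, the missing old/extra.txt is treated as empty, so the new
# file's content appears as added lines in a unified hunk.
with_P=$(diff -ruP old new || true)
printf '%s\n' "$with_P"
```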

  • Rename parameters to help you remember what's going on:

    diff <file-to-edit> <file-with-updates> # Rather than: diff f1 f2

    i.e. the results operate on the file-to-edit (file1), applying various updates to it.


    Similarly, I find these additional renames helpful to conceptualize the results:

    d stands for delete, but 'remove' is more clearly what happens
    a stands for add, ...... but 'insert' is more clearly what happens

    Used like this:

    2,4d1 --- D(s)-d-N --- delete ('remove') D line(s). Then sync at line N in both.

    4a2,4 --- N-a-U(s) --- At line N, add ('insert') update's line(s) U

    Note: Parameters for these two are nearly symmetric; just reversed left to right.


    Change= 'remove & insert'.

    2,4c5,6 --- R(s)-c-U(s) --- Remove R(s) lines, then insert updated lines U(s) in their place.



    For example:

    4a2,4 --- starting at 4, add (insert) updated lines 2-4 (i.e. "2,4" means lines 2, 3 and 4)

    2,4d1 --- remove lines 2-4 (2, 3 and 4).

    2,4c5,6 --- remove lines 2-4 (2, 3 and 4), and insert updated lines 5-6 (5 and 6).


    Note: I know that these are editor commands designed to be processed by a machine. For example, a really is the ed command for append, not insert, but it's more helpful for me to think of insert, which is what in the end is being done to the file.

    They are editing operations, but I prefer to think in terms of results.
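
    The mnemonic can be turned into a small helper that splits a change command at its action letter and prints the "remove/insert" reading. This is a sketch in plain POSIX shell; describe_change is a hypothetical function written for this answer, not part of diff.

```shell
# Print a plain-language reading of a normal-format change command
# such as 2,4d1, 4a2,4 or 2,4c5,6.
describe_change() {
    cmd=$1
    left=${cmd%%[acd]*}    # everything before the action letter
    right=${cmd#*[acd]}    # everything after the action letter
    op=${cmd#"$left"}      # strip the left part ...
    op=${op%"$right"}      # ... and the right part, leaving the letter
    case $op in
        a) printf 'after line %s of file1, insert line(s) %s of file2\n' "$left" "$right" ;;
        d) printf 'remove line(s) %s of file1, then resume after line %s of file2\n' "$left" "$right" ;;
        c) printf 'remove line(s) %s of file1 and insert line(s) %s of file2 in their place\n' "$left" "$right" ;;
        *) printf 'not a change command: %s\n' "$cmd" >&2
           return 1 ;;
    esac
}

describe_change 4a2,4
describe_change 2,4d1
describe_change 2,4c5,6
```

    A range such as 2,4 is printed as written; it means lines 2 through 4.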

Licensed under CC-BY-SA with attribution


Content dated before 6/26/2020 9:53 AM