Understanding of diff output
file1.txt:
this is the original text
line2
line3
line4
happy hacking !

file2.txt:
this is the original text
line2
line4
happy hacking !
GNU is not UNIX
If I do:
diff file1.txt file2.txt
I get:
3d2
< line3
5a5
> GNU is not UNIX
How is the output generally interpreted? I think that < means removed, but what do 3d2 and 5a5 mean?
If I do:
$ diff -u file1.txt file2.txt
--- file1.txt 2013-07-06 17:44:59.180000000 +0200
+++ file2.txt 2013-07-06 17:39:53.433000000 +0200
@@ -1,5 +1,5 @@
 this is the original text
 line2
-line3
 line4
 happy hacking !
+GNU is not UNIX
The results are clearer, but what does @@ -1,5 +1,5 @@ mean?
In your first diff output (the so-called "normal diff"), the meaning is as follows:
< denotes lines from file1.txt
> denotes lines from file2.txt
3d2 and 5a5 denote the line numbers affected and which actions were performed.
d stands for deletion, a stands for adding, and c stands for changing. The number to the left of the letter is the line number in file1.txt; the number to the right is the line number in file2.txt. Both are the original line numbers of each file. So:
3d2 tells you that the 3rd line of file1.txt was deleted, and that the files line up again at line 2 of file2.txt (had the line not been deleted, it would have followed line 2 there).
5a5 tells you that a line was added after line 5 of file1.txt, and that this added line is line 5 of file2.txt.
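To make the structure of these change commands concrete, here is a minimal Python sketch that splits one into its parts. The names `Change` and `parse_change_command` are made up for this illustration; they are not part of diff or of any library.

```python
import re
from typing import NamedTuple


class Change(NamedTuple):
    """One normal-format change command (hypothetical helper type)."""
    first_start: int   # first affected line in file1
    first_end: int     # last affected line in file1
    action: str        # 'a' (add), 'c' (change) or 'd' (delete)
    second_start: int  # first affected line in file2
    second_end: int    # last affected line in file2


# A line number or "start,end" range, then a/c/d, then another number or range.
_COMMAND = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")


def parse_change_command(command: str) -> Change:
    match = _COMMAND.match(command)
    if match is None:
        raise ValueError(f"not a normal-format change command: {command!r}")
    f_start, f_end, action, s_start, s_end = match.groups()
    # A single number is treated as a one-line range.
    return Change(int(f_start), int(f_end or f_start), action,
                  int(s_start), int(s_end or s_start))


if __name__ == "__main__":
    for cmd in ("3d2", "5a5", "5,7c8,10"):
        print(cmd, "->", parse_change_command(cmd))
```

The left-hand numbers always refer to file1.txt and the right-hand numbers to file2.txt, whichever letter sits between them.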
The output of the diff -u command is formatted a bit differently (the so-called "unified diff" format). Here diff shows a single piece of text with the changes marked inline, instead of two separate texts. In the line @@ -1,5 +1,5 @@, the part -1,5 relates to file1.txt and the part +1,5 to file2.txt. They tell us that diff shows a piece of text 5 lines long starting at line 1 of file1.txt, and likewise 5 lines starting at line 1 of file2.txt.
As I have already said, the lines from both files are shown together:
 this is the original text
 line2
-line3
 line4
 happy hacking !
+GNU is not UNIX
- denotes the lines that were deleted from file1.txt, and + denotes the lines that were added. Lines with neither marker are unchanged context.
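As a rough illustration, the same hunk can be reproduced with Python's standard difflib module and its header taken apart with a regular expression. This is only a sketch: difflib is not the diff program, and `parse_hunk_header` is a made-up helper name.

```python
import difflib
import re

file1 = ["this is the original text", "line2", "line3", "line4", "happy hacking !"]
file2 = ["this is the original text", "line2", "line4", "happy hacking !", "GNU is not UNIX"]

# lineterm="" because the lists above carry no trailing newlines.
unified = list(difflib.unified_diff(file1, file2,
                                    fromfile="file1.txt", tofile="file2.txt",
                                    lineterm=""))

# "@@ -start,count +start,count @@"; a count may be omitted in the format.
_HUNK = re.compile(r"^@@ -(\d+)(?:,(\d+))? \+(\d+)(?:,(\d+))? @@")


def parse_hunk_header(line):
    """Return (start1, count1, start2, count2) for a unified-diff hunk header."""
    match = _HUNK.match(line)
    if match is None:
        raise ValueError(f"not a hunk header: {line!r}")
    start1, count1, start2, count2 = match.groups()
    # An omitted count stands for a single line.
    return (int(start1), int(count1) if count1 is not None else 1,
            int(start2), int(count2) if count2 is not None else 1)


if __name__ == "__main__":
    print("\n".join(unified))
    print(parse_hunk_header(unified[2]))
```

The first pair of numbers describes where the hunk starts in file1.txt and how many lines it covers there; the second pair does the same for file2.txt.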
With diff file1 file2, < means the line is missing in file2 and > means the line is missing in file1. The 3d2 and 5a5 can be ignored; they are commands for patch, which is often used with diff.
Many *nix utilities offer Texinfo manuals as well as the simpler man pages. You can access these by running info command, for example info diff. In this case, the section you are interested in is:
2.4.2 Detailed Description of Normal Format
The normal output format consists of one or more hunks of differences; each hunk shows one area where the files differ. Normal format hunks look like this:
CHANGE-COMMAND
< FROM-FILE-LINE
< FROM-FILE-LINE...
---
> TO-FILE-LINE
> TO-FILE-LINE...
There are three types of change commands. Each consists of a line number or comma-separated range of lines in the first file, a single character indicating the kind of change to make, and a line number or comma-separated range of lines in the second file. All line numbers are the original line numbers in each file. The types of change commands are:
`LaR'
Add the lines in range R of the second file after line L of the first file. For example, `8a12,15' means append lines 12-15 of file 2 after line 8 of file 1; or, if changing file 2 into file 1, delete lines 12-15 of file 2.
`FcT'
Replace the lines in range F of the first file with lines in range T of the second file. This is like a combined add and delete, but more compact. For example, `5,7c8,10' means change lines 5-7 of file 1 to read as lines 8-10 of file 2; or, if changing file 2 into file 1, change lines 8-10 of file 2 to read as lines 5-7 of file 1.
`RdL'
Delete the lines in range R from the first file; line L is where they would have appeared in the second file had they not been deleted.
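The manual's remark about "changing file 2 into file 1" means every change command has a mirror image: swap the two ranges, and turn an add into a delete or the other way round. A small Python sketch of that idea follows; `reverse_change_command` is a made-up name for illustration, not something diff provides.

```python
import re

# Left range, action letter, right range.
_COMMAND = re.compile(r"^(\d+(?:,\d+)?)([acd])(\d+(?:,\d+)?)$")

# Adding to file 1 is deleting from file 2, and vice versa; a change stays a change.
_REVERSE = {"a": "d", "d": "a", "c": "c"}


def reverse_change_command(command):
    """Rewrite a file1-to-file2 change command as the file2-to-file1 one."""
    match = _COMMAND.match(command)
    if match is None:
        raise ValueError(f"not a normal-format change command: {command!r}")
    left, action, right = match.groups()
    return f"{right}{_REVERSE[action]}{left}"


if __name__ == "__main__":
    for cmd in ("8a12,15", "5,7c8,10", "3d2"):
        print(cmd, "<->", reverse_change_command(cmd))
```

For the manual's own example, 8a12,15 turns into 12,15d8, which reads as "delete lines 12-15 of file 2".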
The above answers are good. However, as a beginner, I found them slightly difficult to understand, and upon searching further I found a very useful link: Linux Diff Command & Examples
The site explains the concept in a simple and easy to understand manner.
The diff command is easier to understand if you consider it this way:
Essentially, it outputs a set of instructions for how to change one file to make it identical to the second file.
Each of the following cases is explained well:
a for add, c for change, d for delete
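To see the "set of instructions" idea in action, here is a minimal Python sketch that applies a normal-format diff to the lines of the first file. It is a toy under stated assumptions (hunks arrive in ascending order, as diff emits them, and the input is well formed); `apply_normal_diff` is a hypothetical name, and the real tool for this job is patch.

```python
import re

_COMMAND = re.compile(r"^(\d+)(?:,(\d+))?([acd])(\d+)(?:,(\d+))?$")


def apply_normal_diff(original, diff_lines):
    """Apply a normal-format diff to `original` (a list of lines); return the new list."""
    # Collect [start, end, action, new_lines] for every hunk.
    hunks = []
    for line in diff_lines:
        match = _COMMAND.match(line)
        if match:
            start, end, action = match.group(1), match.group(2), match.group(3)
            hunks.append([int(start), int(end or start), action, []])
        elif line.startswith(">") and hunks:
            hunks[-1][3].append(line[2:])  # strip the "> " marker
        # "< ..." lines and the "---" separator carry nothing needed here.

    result = list(original)
    # Work from the bottom of the file upwards so earlier line numbers stay valid.
    for start, end, action, new_lines in reversed(hunks):
        if action == "a":
            result[start:start] = new_lines      # insert after line `start`
        elif action == "d":
            del result[start - 1:end]            # remove lines start..end
        else:                                    # "c": replace lines start..end
            result[start - 1:end] = new_lines
    return result


if __name__ == "__main__":
    file1 = ["this is the original text", "line2", "line3", "line4", "happy hacking !"]
    diff_output = ["3d2", "< line3", "5a5", "> GNU is not UNIX"]
    print("\n".join(apply_normal_diff(file1, diff_output)))
```

Applying the hunks in reverse order is a design choice that keeps the sketch short: the line numbers in each command refer to the original file, so editing from the end avoids having to track offsets.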
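I suggest using:
diff -rupP file1.txt file2.txt > result.patch
Then, when you read
result.patch, you will instantly know the difference.
These are the meanings of the command line switches (as in GNU diff):
-r: compares directories recursively (it has no effect on two plain files)
-u: produces unified format output, with a few lines of context around each change
-p (lowercase): shows which C function each change is in
-P (capital): when comparing directories, treats a file that exists only in the second directory as present but empty in the first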
Rename parameters to help you remember what's going on:
> # Rather than: diff
i.e. the results operate on the file-to-edit (file1), applying various updates to it.
Similarly, I find these additional renames helpful to conceptualize the results:
d stands for delete, but 'remove' is more clearly what happens
a stands for add, but 'insert' is more clearly what happens
Used like this:
2,4d1 --- D(s)-d-N --- delete ('remove') D line(s). Then sync at line N in both.
4a2,4 --- N-a-U(s) --- At line N, add ('insert') update's line(s) U
Note: Parameters for these two are nearly symmetric; just reversed left to right.
Change= 'remove & insert'.
2,4c5,6 --- R(s)-c-U(s) --- Remove R(s) lines, then insert updated lines U(s) in their place.
4a2,4 --- starting at 4, add (insert) updated lines 2-4 (i.e. "2,4" means lines 2, 3 and 4)
2,4d1 --- remove lines 2-4 (2, 3 and 4).
2,4c5,6 --- remove lines 2-4 (2, 3 and 4), and insert updated lines 5-6 (5 and 6).
I know that these are stream editor commands, designed to be processed by a machine. For example, it really is the ed command add, not insert, but it is more helpful for me to think of insert, which is what in the end is being done to the file.
They use stream operations, but I prefer to think in terms of results.