diff-highlight
==============

Line-oriented diffs are great for reviewing code, because for most
hunks, you want to see the old and the new segments of code next to each
other. Sometimes, though, when an old line and a new line are very
similar, it's hard to immediately see the difference.

You can use "--color-words" to highlight only the changed portions of
lines. However, this can often be hard to read for code, as it loses
the line structure, and you end up with oddly formatted bits.

Instead, this script post-processes the line-oriented diff, finds pairs
of lines, and highlights the differing segments. It's currently very
simple and stupid about doing these tasks. In particular:

  1. It will only highlight hunks in which the number of removed and
     added lines is the same, and it will pair lines within the hunk by
     position (so the first removed line is compared to the first added
     line, and so forth). This is simple and tends to work well in
     practice. More complex changes don't highlight well, so we tend to
     exclude them due to the "same number of removed and added lines"
     restriction. Or, even if we do try to highlight them, they end up
     not highlighting because of our "don't highlight if the whole line
     would be highlighted" rule.

  2. It will find the common prefix and suffix of two lines, and
     consider everything in the middle to be "different". It could
     instead do a real diff of the characters between the two lines and
     find common subsequences. However, the point of the highlight is to
     call attention to a certain area. Even if some small subset of the
     highlighted area actually didn't change, that's OK. In practice it
     ends up being more readable to just have a single blob on the line
     showing the interesting bit.
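
The two rules above can be sketched in a few lines. The following is a
minimal Python illustration, not the script itself (which is written in
Perl and also has to cope with color codes and the leading "-"/"+"
markers). The function names, the "-{}" and "+{}" markers standing in
for real highlighting, and the simplified whole-line test are all
assumptions made for this example.

```python
def common_prefix_len(a, b):
    """Length of the leading run of characters shared by a and b."""
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    return n


def common_suffix_len(a, b, prefix_len):
    """Length of the shared trailing run, never overlapping the prefix."""
    limit = min(len(a), len(b)) - prefix_len
    n = 0
    while n < limit and a[-1 - n] == b[-1 - n]:
        n += 1
    return n


def mark(line, start, end, sign):
    """Wrap line[start:end] in sign + "{...}"; leave empty spans alone."""
    if start >= end:
        return line
    return line[:start] + sign + "{" + line[start:end] + "}" + line[end:]


def highlight_pair(old, new):
    """Mark everything between the common prefix and common suffix."""
    p = common_prefix_len(old, new)
    s = common_suffix_len(old, new, p)
    # Simplified "whole line would be highlighted" rule: if nothing but
    # whitespace is shared at either end, leave both lines untouched.
    if not old[:p].strip() and not old[len(old) - s:].strip():
        return old, new
    return (mark(old, p, len(old) - s, "-"),
            mark(new, p, len(new) - s, "+"))


def highlight_hunk(removed, added):
    """Pair lines by position, but only when the counts are equal."""
    if len(removed) != len(added):
        return list(removed), list(added)
    pairs = [highlight_pair(o, n) for o, n in zip(removed, added)]
    return [o for o, _ in pairs], [n for _, n in pairs]
```

For instance, calling highlight_hunk(["one", "two", "three", "four"],
["two 2", "three 3", "four 4", "five 5"]) marks only the second and
fourth pairs, because the first and third pairs share no prefix or
suffix at all.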

The goal of the script is therefore not to be exact about highlighting
changes, but to call attention to areas of interest without being
visually distracting. Non-diff lines and existing diff coloration are
preserved; the intent is that the output should look exactly the same as
the input, except for the occasional highlight.

Use
---

You can try out the diff-highlight program with:

---------------------------------------------
git log -p --color | /path/to/diff-highlight
---------------------------------------------

If you want to use it all the time, drop it in your $PATH and put the
following in your git configuration:

---------------------------------------------
[pager]
	log = diff-highlight | less
	show = diff-highlight | less
	diff = diff-highlight | less
---------------------------------------------


Color Config
------------

You can configure the highlight colors and attributes using git's
config. The colors for "old" and "new" lines can be specified
independently. There are two "modes" of configuration:

  1. You can specify a "highlight" color and a matching "reset" color.
     This will retain any existing colors in the diff, and apply the
     "highlight" and "reset" colors before and after the highlighted
     portion.

  2. You can specify a "normal" color and a "highlight" color. In this
     case, existing colors are dropped from that line. The non-highlighted
     bits of the line get the "normal" color, and the highlights get the
     "highlight" color.

If no "new" colors are specified, they default to the "old" colors. If
no "old" colors are specified, the default is to reverse the foreground
and background for highlighted portions.

Examples:

---------------------------------------------
# Underline highlighted portions
[color "diff-highlight"]
	oldHighlight = ul
	oldReset = noul
---------------------------------------------

---------------------------------------------
# Varying background intensities
[color "diff-highlight"]
	oldNormal = "black #f8cbcb"
	oldHighlight = "black #ffaaaa"
	newNormal = "black #cbeecb"
	newHighlight = "black #aaffaa"
---------------------------------------------
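
The fallback rules described in this section can be pictured roughly as
follows. This is a hedged Python sketch, not how the script reads its
configuration: the plain dict standing in for git's config, the
per-key fallback from "new" to "old", and the names resolve_colors and
mode are assumptions for illustration. "reverse" and "noreverse" are
used here only as symbolic labels for the built-in default.

```python
# Symbolic stand-ins for the built-in default (an assumption of this sketch).
DEFAULT_HIGHLIGHT = "reverse"
DEFAULT_RESET = "noreverse"


def resolve_colors(config):
    """Fill in the old/new color slots from a dict of configured values.

    Keys mirror the names used in the examples: oldNormal, oldHighlight,
    oldReset, newNormal, newHighlight, newReset. All are optional.
    """
    old = {
        "normal": config.get("oldNormal"),
        "highlight": config.get("oldHighlight", DEFAULT_HIGHLIGHT),
        "reset": config.get("oldReset", DEFAULT_RESET),
    }
    # Each "new" slot falls back to the corresponding "old" slot.
    new = {
        "normal": config.get("newNormal", old["normal"]),
        "highlight": config.get("newHighlight", old["highlight"]),
        "reset": config.get("newReset", old["reset"]),
    }
    return old, new


def mode(colors):
    """Name which of the two configuration modes a resolved set implies."""
    if colors["normal"] is not None:
        return "normal+highlight"
    return "highlight+reset"
```

With only oldHighlight and oldReset set, as in the underline example,
both sides resolve to the same underline pair and stay in the
"highlight+reset" mode. With nothing set at all, both sides fall back to
the reverse-video default.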


Bugs
----

Because diff-highlight relies on heuristics to guess which parts of
changes are important, there are some cases where the highlighting is
more distracting than useful. Fortunately, these cases are rare in
practice, and when they do occur, the worst case is simply a little
extra highlighting. This section documents some cases known to be
sub-optimal, in case somebody feels like working on improving the
heuristics.

  1. Two changes on the same line get highlighted in a blob. For example,
     highlighting:

----------------------------------------------
-foo(buf, size);
+foo(obj->buf, obj->size);
----------------------------------------------

     yields (where the inside of "+{}" would be highlighted):

----------------------------------------------
-foo(buf, size);
+foo(+{obj->buf, obj->}size);
----------------------------------------------

     whereas a more semantically meaningful output would be:

----------------------------------------------
-foo(buf, size);
+foo(+{obj->}buf, +{obj->}size);
----------------------------------------------

     Note that doing this right would probably involve a set of
     content-specific boundary patterns, similar to word-diff. Otherwise
     you get junk like:

-----------------------------------------------------
-this line has some -{i}nt-{ere}sti-{ng} text on it
+this line has some +{fa}nt+{a}sti+{c} text on it
-----------------------------------------------------

     which is less readable than the current output.

  2. The multi-line matching assumes that lines in the pre- and post-image
     match by position. This is often the case, but can be fooled when a
     line is removed from the top and a new one added at the bottom (or
     vice versa). Unless the lines in the middle are also changed, diffs
     will show this as two hunks, and it will not get highlighted at all
     (which is good). But if the lines in the middle are changed, the
     highlighting can be misleading. Here's a pathological case:

-----------------------------------------------------
-one
-two
-three
-four
+two 2
+three 3
+four 4
+five 5
-----------------------------------------------------

     which gets highlighted as:

-----------------------------------------------------
-one
-t-{wo}
-three
-f-{our}
+two 2
+t+{hree 3}
+four 4
+f+{ive 5}
-----------------------------------------------------

     because it matches "two" to "three 3", and so forth. It would be
     nicer as:

-----------------------------------------------------
-one
-two
-three
-four
+two +{2}
+three +{3}
+four +{4}
+five 5
-----------------------------------------------------

     which would probably involve pre-matching the lines into pairs
     according to some heuristic.
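
The pre-matching idea from item 2 could look something like the
following. This is only one possible heuristic, written in Python for
illustration, and is not part of diff-highlight. It scores similarity
with SequenceMatcher from Python's standard difflib module; the name
pair_lines and the 0.5 threshold are arbitrary choices for this sketch.

```python
from difflib import SequenceMatcher


def pair_lines(removed, added, threshold=0.5):
    """Greedily pair removed and added lines by similarity.

    Returns a sorted list of (removed_index, added_index) tuples. Lines
    with no sufficiently similar partner are left unpaired.
    """
    candidates = []
    for i, old in enumerate(removed):
        for j, new in enumerate(added):
            score = SequenceMatcher(None, old, new).ratio()
            if score >= threshold:
                candidates.append((score, i, j))

    # Take the most similar candidates first; use each line at most once.
    candidates.sort(key=lambda c: (-c[0], c[1], c[2]))
    pairs = []
    used_old, used_new = set(), set()
    for _score, i, j in candidates:
        if i in used_old or j in used_new:
            continue
        pairs.append((i, j))
        used_old.add(i)
        used_new.add(j)
    return sorted(pairs)
```

On the pathological case above, pair_lines(["one", "two", "three",
"four"], ["two 2", "three 3", "four 4", "five 5"]) gives
[(1, 0), (2, 1), (3, 2)]: "two" is paired with "two 2", "three" with
"three 3", and "four" with "four 4", while "one" and "five 5" stay
unpaired. Comparing every removed line against every added line is
quadratic in the hunk size, which a real implementation would have to
weigh against the benefit.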