@@ -179,6 +179,107 @@ in the message does not reject such hints stored outside baked-in data
179
179
structure, which allows mistakes to be corrected without affecting the
180
180
real history".
181
181
182
+ * [BUG in git diff-index](https://public-inbox.org/git/[email protected]/)
183
+
184
+ In March 2016 Andy Lowry described what he believed to be a bug in
185
+ `git diff-index`. After creating a file and then touching it, `git
186
+ diff-index` reports the file has changed, and reports "bogus
187
+ destination SHA":
188
+
189
+ ```
190
+ $ git diff-index HEAD
191
+ :100644 100644 e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
192
+ 0000000000000000000000000000000000000000 M A
193
+ ```
194
+
195
+ But then running `git diff` followed by `git diff-index` again reports
196
+ no changes:
197
+
198
+ ```
199
+ $ git diff
200
+ $ git diff-index HEAD
201
+ $
202
+ ```
203
+
204
+ Carlos Martín Nieto first replied to Andy that "this is expected and
205
+ matches the documentation".
206
+
207
+ Indeed, the `git diff-index` documentation says that this command has
208
+ two different operating modes. The "cached" mode, when `--cached` is
209
+ specified, makes Git trust the index file entirely, while the
210
+ "non-cached" mode also shows files that don’t match the stat state as
211
+ being "tentatively changed" using "the magic 'all-zero' sha1".
212
+
213
+ Jeff King, alias Peff, also pointed to the same documentation and
214
+ explained in more detail how the command works and the historical
215
+ background:
216
+
217
+ > Back when diff-index was written, it was generally assumed that
218
+ > scripts would refresh the index as their first operation, and then
219
+ > proceed to do one or more operations like diff-index, which would
220
+ > rely on the refresh from the first step.
221
+
222
+ Peff wrote that running `git diff` does refresh the index, which is
223
+ why Andy's last step shows no diff.
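
The behavior discussed above can be reproduced in a throwaway repository. The following is only a minimal sketch, not taken from the thread: the temporary directory, the demo identity passed to `git commit`, and the use of `touch -t` with an old timestamp (so the stat change does not depend on timing) are all assumptions made for illustration.

```shell
#!/bin/sh
# Sketch: show the stat-only "tentatively changed" report, then clear it
# by refreshing the index. Everything happens in a temporary repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
: > A                                   # empty file named A
git add A
git -c user.name=demo -c user.email=demo@example.invalid commit -q -m 'add A'

# Change only the stat data (mtime); the content stays identical.
touch -t 200001010000 A

# Non-cached mode: the stat mismatch is reported with an all-zero SHA.
before=$(git diff-index HEAD)

# Refresh the cached stat information; the content is found unchanged.
git update-index -q --refresh

# With a refreshed index, nothing is reported any more.
after=$(git diff-index HEAD)

printf 'before refresh: %s\n' "$before"
printf 'after refresh:  %s\n' "$after"
```

Here `git update-index --refresh` plays the role that `git diff` played in Andy's report: it updates the stat information stored in the index.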
224
+
225
+ Andy thanked Peff, but replied that he is after "a tree-to-filesystem
226
+ comparison, regardless of index":
227
+
228
+ > I've currently got a "diff" thrown in as a "work-around" before
229
+ > "diff-index", but now I understand it's not a workaround at all. If
230
+ > there's a better way to achieve what I'm after, I'd appreciate a
231
+ > tip.
232
+
233
+ Peff suggested using `git update-index --refresh` rather than `git
234
+ diff` to just refresh the index.
235
+
236
+ Andy appreciated this answer, then described his use case in
237
+ detail and asked:
238
+
239
+ > So I think now that the script should do "update-index --refresh"
240
+ > followed by "diff-index --quiet HEAD". Sound correct?
241
+
242
+ Junio confirmed that:
243
+
244
+ > Yes. That always been one of the kosher ways for any script to make
245
+ > sure that the files in the working tree that are tracked have not
246
+ > been modified relative to HEAD (assuming that the index matches
247
+ > HEAD).
248
+
249
+ and then described a few other "kosher ways" along with their
250
+ benefits.
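
As an illustration of the sequence Andy and Junio agreed on, a script could wrap the two commands in a small helper. This is a sketch under stated assumptions: the function name `worktree_is_clean` is hypothetical, and ignoring the exit status of the refresh step is a choice made here so that the final answer comes only from `git diff-index`.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if tracked files match HEAD.
worktree_is_clean() {
    # Refresh the stat information cached in the index first. A non-zero
    # exit here only means some path needs an update, which the
    # diff-index call below reports anyway, so it is ignored.
    git update-index -q --refresh >/dev/null 2>&1 || true

    # --quiet suppresses output and sets the exit status:
    # 0 when there are no differences, 1 when there are.
    git diff-index --quiet HEAD --
}
```

A caller would use it as `if worktree_is_clean; then ...; fi` from inside a repository. As Junio's reply notes, this answers the question relative to HEAD on the assumption that the index matches HEAD.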
251
+
252
+ Recently Marc Herbert chimed in on this 18-month-old discussion,
253
+ adding the Linux kernel mailing list and a number of kernel developers
254
+ in CC, saying:
255
+
256
+ > Too bad kernel/scripts/setlocalversion didn't get the memo
257
+
258
+ Marc pointed to
259
+ [a commit from 2013 in the Linux kernel repo](https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=cdf2bc632ebc9ef51)
260
+ that removes a `git update-index` call from the "setlocalversion"
261
+ script. And he added that "this causes a spurious '-dirty' suffix when
262
+ building from a directory copy".
263
+
264
+ In the "PS:" part of his email Marc also wondered whether a
265
+ "robots.txt" file on https://public-inbox.org blocks indexing of the
266
+ site's contents, as Google couldn't find the thread he was replying to
267
+ on the site.
268
+
269
+ Eric Wong, the creator and maintainer of public-inbox.org, replied:
270
+
271
+ > There's no blocks on public-inbox.org and I'm completely against
272
+ > any sort of blocking/throttling. Maybe there's too many pages
273
+ > to index? Or the Message-IDs in URLs are too ugly/scary? Not
274
+ > sure what to do about that...
275
+ >
276
+ > Anyways, I just put up a robots.txt with Crawl-Delay: 1, since I
277
+ > seem to recall crawlers use a more conservative delay by default:
278
+ >
279
+ > ==> https://public-inbox.org/robots.txt <==
280
+ > User-Agent: *
281
+ > Crawl-Delay: 1
282
+
182
283
## Developer Spotlight: Philip Oakley
183
284
184
285
* Who are you and what do you do?