Commit
watchman: try to detect symlink changes more reliably on osx
Summary: Looking for a way to mitigate facebook#172, wherein fsevents won't send us notifications about certain classes of changes involving symlinks. The main one we're aware of is where a symlink is changed to point to a file that doesn't (yet) exist on the filesystem.

The approach in this diff is to add a configuration option (`recheck_dirs_interval_ms`) that specifies how long we'll wait before we go back and re-examine candidate files. This parameter is checked after we've settled; it defaults to the trigger settle interval on mac and is turned off on Linux.

Once we reach this point in time, we perform a query over the recently changed file nodes and collect all dir nodes, or the dir-node parents of the files, that have changed since we last checked for this category of change. If this list is empty then nothing has changed and we have no further work to do. Otherwise, we search for all symlink file nodes and add them to the set, and then re-examine this complete list of file nodes and directories to try to discover changes that fsevents failed to inform us of. After this check we look at how many nodes were fixed up; if any were, we reset the settle to its base.

I found several bugs while looking at this:

1. We didn't respect the recursive flag in `crawler` and would unconditionally recursively crawl every item contained in a dir.
2. If we re-examined a file node that was already !file->exists and realized that it still didn't exist, we'd observe a change for it even though it logically could not have changed (it was deleted and still is deleted).
3. YATOCTOU (Yet Another...) when we `lstat(2)`: we can traverse symlinks in the path, so we add `w_lstat`, which uses our strict checking for the path. This fixes a bug where we'd want to assess a path that had previously been deleted when its parent transitioned from a dir to a symlink, and we'd then flip-flop between exists and !exists.
4. I added an additional fstat in our opendir implementation on mac. There is a TOCTOU-style issue where we can return a dir handle for something that isn't actually a dir any more (for dir->symlink transitions), and the deferred detection of the error condition is suboptimal. We now stand a greater chance of detecting and dealing with this condition when we open the dir.

Test Plan: `make integration` has no regressions. Then I repeatedly checked out between two revisions in a large repo where there are a large number of dir and symlink transitions:

```
$ hg co master ; rm .hg/watchman.state ; hg status --time
$ hg co arcpatch-XYZ ; rm .hg/watchman.state ; hg status --time
```

Without this diff, the second step would consistently report 3 files as modified in the `hg status` output. A subsequent `hg status` wouldn't show the same files, due to the way that it records and re-checks notable files.

To validate that we're not busy waiting or polling too aggressively:

```
$ watchman -p --server-encoding=json log-level debug | jq .log
```

is quiet once we've settled. Activity Monitor shows 0% CPU in this state.