Commit 2e66a0e

Update FAILURE_CASES.md
1 parent a2a093b commit 2e66a0e

1 file changed

+6
-0
lines changed

docs/FAILURE_CASES.md

@@ -6,8 +6,12 @@ Like all methods, XMem can fail. Here, we try to show some illustrative and fran

The first one is fast motion with similar-looking objects that do not provide sufficient appearance cues for XMem to track. Below is an example from the YouTubeVOS validation set (0e8a6b63bb):

+https://user-images.githubusercontent.com/7107196/179459162-80b65a6c-439d-4239-819f-68804d9412e9.mp4
+
And the source video:

+https://user-images.githubusercontent.com/7107196/179459166-a48884d3-58e6-4d09-9118-7df51c5305bf.mp4
+
In principle, this can be solved by using more positional and motion cues, which XMem does not exploit well enough.
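
To illustrate what a motion cue could look like, here is a minimal sketch that is not part of XMem. It assumes each object is reduced to a centroid and that the target moves at roughly constant velocity; `predict_next` and `pick_candidate` are hypothetical helper names introduced only for this example.

```python
from math import hypot

def predict_next(prev, curr):
    """Constant-velocity guess: next = curr + (curr - prev)."""
    return (2 * curr[0] - prev[0], 2 * curr[1] - prev[1])

def pick_candidate(prev, curr, candidates):
    """Return the index of the candidate centroid closest to the predicted position."""
    px, py = predict_next(prev, curr)
    return min(
        range(len(candidates)),
        key=lambda i: hypot(candidates[i][0] - px, candidates[i][1] - py),
    )

# Two look-alike objects; the target moved from (10, 10) to (20, 10).
# Candidate 0 is closer to the current position, candidate 1 is closer
# to where a constant-velocity target would be in the next frame.
candidates = [(22, 14), (31, 11)]
chosen = pick_candidate((10, 10), (20, 10), candidates)
```

With fast motion, picking the candidate nearest to the last known position would select the wrong object here, while the motion prior selects the second one. A real tracker would have to combine such a prior with appearance matching rather than replace it.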

## Shot changes; saliency shift
@@ -16,4 +20,6 @@ Ever wondered why I did not include the final scene of Chika Dance when the roac

XMem seems to be attracted to any new salient object in the scene when the (true) target object is missing. By new I mean an object that did not appear (or had a different appearance) earlier in the video -- as XMem could not have a memory representation for that object. This happens a lot if the camera shot changes.

+https://user-images.githubusercontent.com/7107196/179459190-d736937a-6925-4472-b46e-dcf94e1cafc0.mp4
+
Note that the first shot change is not as problematic.
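
Since this failure is tied to shot changes, one way to reason about it is to flag cuts explicitly. The sketch below is not something XMem does; it is a common histogram-difference heuristic, assuming frames are given as non-empty flat lists of 8-bit grayscale values. The function names, the bin count, and the `threshold` of 0.5 are arbitrary choices made for illustration.

```python
def histogram(frame, bins=8, max_value=256):
    """Normalized intensity histogram of a flat list of grayscale pixel values."""
    counts = [0] * bins
    for v in frame:
        counts[min(v * bins // max_value, bins - 1)] += 1
    total = len(frame)
    return [c / total for c in counts]

def is_shot_change(frame_a, frame_b, threshold=0.5):
    """Flag a cut when the L1 distance between the two histograms exceeds the threshold."""
    ha, hb = histogram(frame_a), histogram(frame_b)
    return sum(abs(a - b) for a, b in zip(ha, hb)) > threshold

# Toy frames: two dark frames from the same shot, then a bright frame after a cut.
dark = [10, 20, 30, 25] * 4
dark_shifted = [12, 22, 28, 27] * 4
bright = [200, 220, 240, 230] * 4

same_shot = is_shot_change(dark, dark_shifted)
cut = is_shot_change(dark, bright)
```

Here `same_shot` is `False` and `cut` is `True`. A pipeline could treat predictions right after a detected cut with more caution, although a global histogram will miss cuts between visually similar shots.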
