Make assertion code consistent and remove asserting unknown #110

vesalvojdani · 2020-10-08T15:18:02Z

I want to make some semi-brutal changes to how assertions are handled for our regression tests.

Instead of exclamation marks, I would use "TODO" or "SKIP" annotations. These will be tracked by the script and we can give a positive warning once such a TODO starts working. I want all our test annotations to be about the program, and goblint failures should explicitly marked. (I already made this change in the test script for the witness branch).
I want to eliminate UNKNOWN for assertions. The main problem with asserting unknown is that we have some tests that now fail if we make the analysis path sensitive. For each path the assertion is either known to be true or known to be false. I want to only allow no annotation = definitely succeeds and "MAY FAIL".
The goblint assertion code behaves differently for regression tests by not improving the value. Without the debug flag, it does refine the value based on the assertion. This is precisely the sort of thing that can be very painful when trying to understand why regression tests behave differently from running it normally.

This is a fairly easy fix that I plan to make soon after we submit the SV-COMP benches, but I thought I would check if there are any opinions about it.

michael-schwarz · 2020-10-08T15:42:10Z

I want to eliminate UNKNOWN for assertions. The main problem with asserting unknown is that we have some tests that now fail if we make the analysis path sensitive. For each path the assertion is either known to be true or known to be false. I want to only allow no annotation = definitely succeeds and "MAY FAIL".

I would argue against this. We quite often had the situation where the regression tests were passing, but there still was some unsoundness somewhere, where we claimed to know things that can either be true or false when executing the program. Having such UNKNOWN! assertions to me seems the best way to express this.
Also, I think these UNKNOWN! assertions can be implemented in a way that does not interact badly with path-sensitivity, if this is not already the case.

vesalvojdani · 2020-10-08T18:13:45Z

Can you think of any example where we really need uncertainty in both directions? Most of the time, we just want to make sure goblint isn't certain that some value is still the same when some other indirect access or thread could change it. For these cases, it suffices to say that the assertion may fail. I've seen only a few real examples, like join after thread creation, where I'd actually need to make sure goblint doesn't take the value from either branch. I would rather handle these rare cases by explicitly doing a new non-det branch in the test file and make two assertions that may fail.

I do not think you can easily implement "unknown" assertions in a way that interacts well with path-sensitivity. The unknown assertion is making claims over multiple paths, so it would have to be handled within the path-sensitivity functor where one would have to rejoin split paths just for "asserts" or collect generated warnings from different paths and check if all agree.

Anyway, I think it would be better if our assertion language corresponded to real program behavior. I'd rather force people to write tests in a way that assertions are true or may fail when running the program. And then combine this with TODO annotation to express whether goblint can answer this correctly at the moment. I do not like that our tests are very closely tied to how goblint analyzes programs.

michael-schwarz · 2020-10-09T05:42:48Z

One recent example from this month that comes to mind is this one that we discovered while fixing branching:

analyzer/tests/regression/27-inv_invariants/04-ints-not-interval.c

Lines 1 to 9 in a56c407

    
           //PARAM: --disable ana.int.def_exc --enable ana.int.interval 
        
           int main() { 
        
               int x; 
        
               if(!x) { 
        
               } else { 
        
                   assert(x==1); //UNKNOWN! 
        
               } 
        
           }

Due to a bug (fixed in 77da471) with how the LNot operation was handled in invariant, Goblint claimed to know that x -> [1,1] in the else branch.

How would one express that while x==1 may hold, it may also not, (i.e. if Goblint knows that it either holds definitely or does not hold, it is unsound) without the UNKNOWN! ?

assert(x==1) // MAY FAIL
- would still succeed even if Goblint computed x -> [1,1], right?
assert(x!=1) // MAY FAIL
- same here
assert(x!=0)
- would succeed for [1,1] (and would need to be marked // TODO for intervals as the interval domain can not represent this)

While this seems super trivial, it is exactly the sort of thing that should be part of the regression tests, as this test failing points to a very specific root cause, while locating the error in Goblint for the more elaborate ones is a lot harder.

vesalvojdani · 2020-10-09T20:05:13Z

Okay, so "may fail" is not the best label. I did not intend never fail to be LEQ may fail. By "may fail", I mean there exists at least one execution that violates it. In this particular example, where the fear is that Goblint is over-confident about a specific value, it would suffice to state assert(x == 1) // MAY FAIL. Maybe a better word is VIOLATION EXISTS or even keep calling it "unknown", but it's the other direction that I want to eliminate: the distinction between a violation existing and the assertion definitely failing.

Let's say you have a case where [0,1] are possible. If you assert x == 0 is unknown, then a path sensitive analysis will say [0,0] is definitely true, and [1,1] is definitely false, so nope, it's not unknown. This is what currently happens if you turn on path-sensitivity for thread ids, for example.

Basically, our current assert e is UNKNOWN means that there definitely exists an execution where e holds and there exists an execution where e does not hold. It would be better to have an assert_unknown macro for this that explicitly makes a non-deterministic choice and assert both e and !e as possibly failing. Since our assertions do not refine the value in debug mode, there isn't even the need to make a non-det choice; one can simply assert one after the other. It seems we very often do that anyway:

  if (x) {
    i = 5;
  } else {
    i = 7;
  }

  assert(i==5); // UNKNOWN!
  assert(i==7); // UNKNOWN!

This would work the same way. I'm still not sure about this debug mode not refining asserts. It is very useful to be able to query the current state like this.

michael-schwarz · 2020-10-12T05:06:28Z

Alright, I guess it was mostly a case of me not understanding what you meant by //MAY FAIL, thanks for clarifying!
No objections anymore then.

I'm still not sure about this debug mode not refining asserts. It is very useful to be able to query the current state like this.

I would leave that like it is, and not refine on asserts. One could rename assert to goblint_assert or something to clear up confusion there: If I run without debug enabled, these are just calls to some external function (that we should mark as safe) that takes an int. So behavior would be the same as when running with debug enabled, except that I don't output any messages about asserts holding or not.

sim642 · 2021-07-14T13:35:11Z

For completeness, I'll quote #286 (comment) here:

Although we also have some cases like this:

analyzer/tests/regression/01-cpa/32-earlyglobs.c

Lines 7 to 9 in cd6cf51

g = 100;

// This is only unknown because exp.earlyglobs is on

assert(g == 100); //UNKNOWN!

It's not UNKNOWN! because in the concrete semantics, it is known. It's not TODO because it shouldn't be fixed, because it's the whole point of earlyglobs. So probably one needs a third UNKNOWN!-like category for things which should remain unknown but only by our own choice of imprecision, not by concrete semantics.

vesalvojdani added the testing label Oct 8, 2020

vesalvojdani self-assigned this Oct 8, 2020

michael-schwarz mentioned this issue Mar 10, 2021

Some //UNKNOWN! annotations need to be removed #170

Closed

jerhard mentioned this issue May 7, 2021

tests shouldn't fail if the analysis is more precise than expected #41

Closed

sim642 mentioned this issue Jul 14, 2021

Undo update_suite ignoring precision difference for UNKNOWN #286

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make assertion code consistent and remove asserting unknown #110

Make assertion code consistent and remove asserting unknown #110

vesalvojdani commented Oct 8, 2020

michael-schwarz commented Oct 8, 2020 •

edited

Loading

vesalvojdani commented Oct 8, 2020

michael-schwarz commented Oct 9, 2020 •

edited

Loading

vesalvojdani commented Oct 9, 2020 •

edited

Loading

michael-schwarz commented Oct 12, 2020

sim642 commented Jul 14, 2021

Make assertion code consistent and remove asserting unknown #110

Make assertion code consistent and remove asserting unknown #110

Comments

vesalvojdani commented Oct 8, 2020

michael-schwarz commented Oct 8, 2020 • edited Loading

vesalvojdani commented Oct 8, 2020

michael-schwarz commented Oct 9, 2020 • edited Loading

vesalvojdani commented Oct 9, 2020 • edited Loading

michael-schwarz commented Oct 12, 2020

sim642 commented Jul 14, 2021

michael-schwarz commented Oct 8, 2020 •

edited

Loading

michael-schwarz commented Oct 9, 2020 •

edited

Loading

vesalvojdani commented Oct 9, 2020 •

edited

Loading