Undo update_suite ignoring precision difference for UNKNOWN #286

sim642 · 2021-07-14T10:37:18Z

In #196 among many other changes we merged this innocent-looking change: 0cb2c9c. This PR reverts that.

During #278 I looked over all the UNKNOWN asserts to see whether making assert refine the state would cause problems. I saw cases where problems should have appeared but didn't, because the change being reverted here silenced every case where we still have an UNKNOWN annotation but Goblint suddenly outputs success instead. I only realized this when I used regtest (which awfully does similar expected comparisons within Goblint...) and saw different output.

We haven't yet gone through the entire regression test suite to recategorize each UNKNOWN as either UNKNOWN! or TODO, so we still have cases where UNKNOWN is used to mean that the result must be unknown. Therefore the script should report if we're suddenly unsoundly reporting success or fail. See #110.
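As a rough sketch of the reporting rule described above (this is not the actual Ruby code of update_suite; the names `outcome`, `expected_outcome` and `check_assert` are hypothetical, and the mapping of annotations to expectations is a simplifying assumption that leaves TODO out):

```c
#include <string.h>

/* What the analyzer reported for one assert. */
enum outcome { OUT_SUCCESS, OUT_FAIL, OUT_UNKNOWN };

/* Map the trailing annotation of an assert line to the expected outcome.
 * Assumption: no annotation means success is expected, and both UNKNOWN
 * and UNKNOWN! mean the result must stay unknown. */
static enum outcome expected_outcome(const char *annotation) {
  if (strcmp(annotation, "FAIL") == 0) return OUT_FAIL;
  if (strncmp(annotation, "UNKNOWN", 7) == 0) return OUT_UNKNOWN;
  return OUT_SUCCESS;
}

/* Returns NULL when the actual outcome matches the annotation,
 * otherwise a message describing the mismatch. */
static const char *check_assert(const char *annotation, enum outcome actual) {
  enum outcome expected = expected_outcome(annotation);
  if (actual == expected) return NULL;
  if (expected == OUT_UNKNOWN)
    return "expected unknown, but got a definite result (possible unsoundness)";
  return "unexpected result";
}
```

With this rule, `check_assert("UNKNOWN", OUT_SUCCESS)` yields a message instead of being silently accepted, which is the behaviour this PR wants back.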

This reverts commit 0cb2c9c.

vogler · 2021-07-14T10:44:11Z

> I only realized this when I used regtest (which awfully does similar expected comparisons within Goblint...) and saw different output.

I agree with 'awful' concerning duplication, but IMO it'd be better for goblint to handle these things instead of some ruby script - ideally we'd get rid of it in favor of an OCaml executable or some minimal shell script (parallel) with goblint doing most of the work.

michael-schwarz · 2021-07-14T10:59:03Z

@jerhard and I were tossing this back and forth over lunch, and something we discussed at some point was using something along the lines of __goblint_check(a == 42, __GOBLINT_UNKNOWN) or __goblint_check(x == 17, __GOBLINT_SUCCESS) in the regression tests instead of having an assert(...) //UNKNOWN.

Then, the expected result is immediately obvious to Goblint itself, and Goblint can either output things for all asserts or only those where something failed without having to somehow do a regex on the comments.
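For illustration, a minimal sketch of what such a header could look like when a test is compiled and run natively. Everything beyond the names `__goblint_check`, `__GOBLINT_UNKNOWN` and `__GOBLINT_SUCCESS` is an assumption: the enum `goblint_expect`, the helper `goblint_check_ok`, the extra constant `__GOBLINT_FAIL`, and the fallback to `assert` are all hypothetical. Note also that identifiers starting with a double underscore are reserved in C, so a real header might need different names.

```c
#include <assert.h>

/* Hypothetical expectation constants; only the UNKNOWN and SUCCESS names
 * appear in the suggestion, __GOBLINT_FAIL is added by analogy. */
enum goblint_expect { __GOBLINT_SUCCESS, __GOBLINT_FAIL, __GOBLINT_UNKNOWN };

/* Returns 1 if the concrete truth value of the condition is compatible
 * with the expected analysis result, 0 otherwise. */
static int goblint_check_ok(int holds, enum goblint_expect expect) {
  switch (expect) {
    case __GOBLINT_SUCCESS: return holds;   /* must hold at runtime */
    case __GOBLINT_FAIL:    return !holds;  /* must not hold at runtime */
    case __GOBLINT_UNKNOWN: return 1;       /* either value is acceptable */
  }
  return 0;
}

/* When executed natively, fall back to a plain runtime assert. */
#define __goblint_check(cond, expect) \
  assert(goblint_check_ok((cond) != 0, (expect)))
```

A test would then write `__goblint_check(x == 17, __GOBLINT_SUCCESS);` where it previously had `assert(x == 17);`, and the expectation is available as a constant instead of a comment.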

Also, one could use the syntactic search to identify those places where __goblint_check(...) is called that are unreachable and then also warn there that the assert is unreachable. (These are the cases where warnings are missing in ./regtest.sh vs the ruby script).

The question that is still open is how to integrate warnings different from asserts into this setting.

Opinions?

vogler · 2021-07-14T12:28:47Z

> Also, one could use the syntactic search to identify those places where __goblint_check(...) is called that are unreachable and then also warn there that the assert is unreachable. (These are the cases where warnings are missing in ./regtest.sh vs the ruby script).

Good point to keep in mind.
This (together with the rest?) could be done as post-processing in finalize of the assert-analysis.

> using something along the lines of __goblint_check(a == 42, __GOBLINT_UNKNOWN) or __goblint_check(x == 17, __GOBLINT_SUCCESS) in the regression tests instead of having an assert(...) //UNKNOWN.

Pro:

- type-safe (constant instead of comment)
- available in CIL instead of having to read the comment from source

Con:

- can't compile and execute it anymore without including another file?
- assert is easily understood and portable (assert.h is available on every system)

sim642 · 2021-07-14T12:31:55Z

Tbh, I don't really like the idea of this regression testing check (dbg.regression for regtest) being inside Goblint itself at all. It's ideologically wrong to have the testing logic right in the middle of Goblint (which would be both the tester and the testee simultaneously) instead of being independent. And having to write __goblint_check(x == 17, __GOBLINT_SUCCESS) is so much more verbose than a standard assert(x == 17).

But that's all beside the point for this issue of update_suite not reporting possible unsoundness cases. If we want to discuss completely redoing the regression testing architecture, it's best to have another issue for that. And that won't solve the fact that we still have ambiguous uses of UNKNOWN.

vogler · 2021-07-14T13:20:28Z

> And that won't solve the fact that we still have ambiguous uses of UNKNOWN.

#288 (comment)
If we did something like this, we'd at least see those diffs.

$ grep -E 'UNKNOWN$' -r tests/regression | wc -l
313

All these need to be replaced with UNKNOWN! xor TODO?

sim642 · 2021-07-14T13:33:44Z

> All these need to be replaced with UNKNOWN! xor TODO?

This is the idea of #110, yes. Although we also have some cases like this:

analyzer/tests/regression/01-cpa/32-earlyglobs.c

Lines 7 to 9 in cd6cf51

    
    g = 100;
    // This is only unknown because exp.earlyglobs is on
    assert(g == 100); //UNKNOWN!

It's not UNKNOWN! because in the concrete semantics, it is known. It's not TODO because it shouldn't be fixed, because it's the whole point of earlyglobs. So probably one needs a third UNKNOWN!-like category for things which should remain unknown but only by our own choice of imprecision, not by concrete semantics.

sim642 · 2021-07-20T09:14:55Z

This revealed an issue with a congruences test now: #260 (comment).

sim642 added 3 commits July 14, 2021 13:19

Add update_suite sanity test

aaf2de3

Revert "Treat UNKNOWN and UNKNOWN! differently"

da8c514

This reverts commit 0cb2c9c.

Remove unknown from completed assert in 31/13

3da36e9

sim642 added bug testing precision labels Jul 14, 2021

vogler mentioned this pull request Jul 14, 2021

better regression testing #288

Open

sim642 mentioned this pull request Jul 14, 2021

Make assertion code consistent and remove asserting unknown #110

Open

sim642 merged commit 837b273 into master Jul 20, 2021

sim642 deleted the update_suite-unknown branch July 20, 2021 08:23

sim642 mentioned this pull request Jul 20, 2021

Add congruence integer domain and enable mutual refinement of the integer domains #260

Merged

michael-schwarz added a commit that referenced this pull request Jul 22, 2021

Test 37/05 Remove UNKNOWN and unskip. References #260. References #286

34cab0b
