Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

regression2sv-benchmarks improvements #383

Open
7 tasks
sim642 opened this issue Oct 11, 2021 · 0 comments
Open
7 tasks

regression2sv-benchmarks improvements #383

sim642 opened this issue Oct 11, 2021 · 0 comments
Labels

Comments

@sim642
Copy link
Member

sim642 commented Oct 11, 2021

This issue is to document possible improvements and fixes to the regression2sv-benchmarks script. Not intended to be done for SV-COMP 2022 any more.

  • Support multi-property benchmarks.

    Technically this is already possible to an extent (data race and assert properties might end up in the same benchmark), but the validity of these combinations is questionable. For example, benchmarks containing data races wouldn't be valid for unreach-call at all.
    Also, our testing language doesn't allow specifying expected output for assert and race on the same line, which makes it difficult to properly label such multi-property benchmarks.

  • Support successful asserts with irrelevant comments (i.e. not indicative of any other result).

    We have some asserts, where the comment is just a human-readable explanation, but this currently makes it unacceptable for the script. This was to ensure that we don't accidentally submit untrue assertions.
    Extra checks have to be put in place to consider comments that aren't failing or unknown or ignored to be handled as succeeding.

  • Support asserts with NOWARN.

    Sometimes we do this to indicate unreachable code (i.e. Goblint shouldn't say anything about the assertion success/unknown/fail). These can be converted to reach_error() with expected unreach-call verdict being true (given compatibility with other asserts).

  • Support always true asserts.

    Sometimes we use always true asserts (e.g. assert(1)) to indicate that the location should be reachable. As an SV-COMP benchmark, it would always be correct since it never fails, but such benchmark doesn't actually require the verifier to know it's reachable.
    Instead these should be converted to reach_error() with expected unreach-call verdict being false (given compatibility with other asserts).

  • Check reachability of all normal asserts.

    Generalizing the previous point about always true asserts, in our testing language we actually require all asserts to be (may-)reachable. Therefore it would be possible to generate a special reachability variant for each assert individually, by converting it to reach_error() with expected unreach-call verdict being false.
    This requires that all of our asserts are actually reachable and aren't simply bogus checks on unreachable code (which should be explicitly NOWARN).

  • Change from 32bit to 64bit architecture?

    Currently all benchmarks are generated to have 32bit architecture, but Goblint doesn't actually care and analyzes everything with the architecture where it was compiled on (make sizes of primitive types configurable with current machine as default #54). Since we all probably use 64bit machines and some singlethreaded benchmarks do checks specific to 64bit integer ranges, 64bit would probably be more appropriate.

  • Generate all singlethreaded benchmarks.

    Currently the script is only restricted to multithreaded benchmarks. This is purely for logistical reasons:

    • We have much more singlethreaded tests and they'd have to be carefully considered for sv-benchmarks submission.
    • In sv-benchmarks the directory c/goblint-regression is part of ConcurrencySafety, so only multithreaded benchmarks should go in there. To submit singlethreaded benchmarks we would have to split the directories in sv-benchmarks first.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

1 participant