[CI Problem] Figure out a way to assert certain tests aren't skipped in CI #11670

kellyseattle · 2022-06-10T19:47:03Z

Is there a way to verify that a PR hasn't resulted in an increased number of skipped tests? The "Tests" link in the Jenkins report shows skipped tests, but it looks like that currently includes all tests that are run on other shards, making it very difficult to tell what tests were actually skipped. (Example, Tests tab for PR#11313, which introduced the bug Mehrdad found, which lists 238k skipped tests, most of which were run on a different shard.)

Brainstorming on possible steps:

When running pytest_collection_modifyitems, if a test is running on another shard, remove it from items entirely instead of...Downside: If a test isn't assigned to any shard, it wouldn't show up in any report as being skipped.
When generating the "Tests" tab in the Jenkins report, only show a test as being skipped if it was skipped on all shards.
If the number of skipped tests has changed relative to the Jenkins report for the previous commit on main, highlight in the PR.
Would require additional github/Jenkins interactions, uncertain how difficult that would be.

Comments

we're working on a bot to comment a link to the docs on a PR. i think maybe we should also add in the # of added/removed/skipped tests somewhere, or perhaps see if Jenkins can know this (edited)
jenkins has some notion of test stats already, it tracks passed tests, failed tests, skipped tests (and ‘fixed tests’: those that were broken but are now passing, and ‘regressed tests’ which is the opposite), we could query for these in the bot for it + the base commit and post them on the PR
option 1 of Eric’s is the cleanest and easiest and I’ve never seen any problems with the sharding but it could silently break in the future
we’ll probably have to do something like option 2 eventually (i.e. post process the junit xml before sending to Jenkins)
would looking at just the number of tests ran (i.e. passed + failed tests) carry the same signal?
we could also run collection one time and produce a count or master list and then fail if each test isn't accounted for
there's also https://pypi.org/project/pytest-shard/
looks like they go the ‘remove from items ’ route: https://github.com/AdamGleave/pytest-shard/blob/master/pytest_shard/pytest_shard.py#L49-L53
added a graph of passed/failed tests to the main ci dashboard, the change is pretty clear
Thank you, and that definitely helps! I think the passed+failed tests graph is the best visual. The "skipped" tests has the advantage of being near zero, and therefore being noticed if/when it jumps much higher. Though, I'm not sure what the number of skipped tests for other reasons (e.g. no GPU on CPU nodes) is, so maybe it would be much farther from zero than I had been imagining.
we can always download a junit XML from a recent build and check it manually
test coverage from pytest or l/gcov for C++ would also tell us about these kinds of things but coverage comes with its own bag of worms
e could also have like a lint step (not sure if it coudl run in lint) that compares the set of skipped tests to an include list of regexes or a master "skippable test list"

CC: @hpanda-naut

areusch · 2022-08-02T17:58:24Z

@gigiblender is going to look at this one

This PR does the following: - When running on a shard, removes the tests that will not run on the current shard from the suite. We'll no longer get the message `Test running on shard X of Y` in the XML reports. This will result in cleaner test reports. - Adds the shard index or `no-shard` to the generated XML report file name. Currently, the reports generated when sharding is present are overwritten in S3 by the last shard to finish in the CI pipeline. This change is needed as part of #11670. - Since the same tests might run on different configurations (CPU, GPU), it uploads the result of each configuration in a subdirectory in S3 (e.g `/pytest-results/frontend_aarch64`).

This PR adds a script that does a diff of skipped tests between the latest successful build on the main and the current branch. Then, it posts a comment with the report on the open PR. #11670

This PR does the following: - When running on a shard, removes the tests that will not run on the current shard from the suite. We'll no longer get the message `Test running on shard X of Y` in the XML reports. This will result in cleaner test reports. - Adds the shard index or `no-shard` to the generated XML report file name. Currently, the reports generated when sharding is present are overwritten in S3 by the last shard to finish in the CI pipeline. This change is needed as part of apache#11670. - Since the same tests might run on different configurations (CPU, GPU), it uploads the result of each configuration in a subdirectory in S3 (e.g `/pytest-results/frontend_aarch64`).

This PR adds a script that does a diff of skipped tests between the latest successful build on the main and the current branch. Then, it posts a comment with the report on the open PR. apache#11670

hpanda-naut assigned areusch Jun 21, 2022

areusch removed their assignment Aug 2, 2022

gigiblender mentioned this issue Aug 7, 2022

[CI] Deduplicate and clean XML test reports #12332

Merged

gigiblender mentioned this issue Aug 15, 2022

[CI] Assert some unittests are not skipped in CI #12436

Merged

areusch added the needs-triage PRs or issues that need to be investigated by maintainers to find the right assignees to address it label Oct 19, 2022

hpanda-naut added dev:ci and removed needs-triage PRs or issues that need to be investigated by maintainers to find the right assignees to address it labels Oct 19, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[CI Problem] Figure out a way to assert certain tests aren't skipped in CI #11670

[CI Problem] Figure out a way to assert certain tests aren't skipped in CI #11670

kellyseattle commented Jun 10, 2022

areusch commented Aug 2, 2022