Basic implementation of partial scoring. #3047

Draft: wants to merge 14 commits into base: main

meisterT (Member) commented Aug 3, 2025

This follows the legacy spec in:
https://icpc.io/problem-package-format/spec/legacy.html#graders

We might decide at a later point whether we want to support this or just the new (currently draft) spec, but the basic constructs of the change should be the same.

Not yet implemented:

  • any form of testing
  • anything on the team interface
  • shadowing

Part of #2518

meisterT (Member, Author) commented Aug 5, 2025

Some notes:

  • there's a to-do already, but we need to talk about internal precision versus precision in presentation
  • we should add some info to the jury submission page that allows following the result calculation across test case groups (should be easy to do with the existing function)
  • the number of tries on a scoring scoreboard should really be the total number of tries (not the number up to the first correct attempt, as for pass-fail)
  • it seems we do not always handle the accept_if_any_accepted case correctly; this needs more testing
  • while we can ingest results from a primary CCS now, we don't yet compare the scores for shadow differences
  • the layout of the shadow differences table needs to be fixed
  • optimization problems are by their very nature less deterministic: they incentivize teams to try one approach for a set amount of time and then another approach for a set amount of time, and the use of RNGs is generally much more prevalent

nickygerritsen (Member) left a comment

Initial round of review done. Of course, there are still the to-dos and the comments from PHPStan / PHPCS.

@@ -211,6 +211,11 @@ class Problem extends BaseApiEntity implements
#[Serializer\Exclude]
private Collection $languages;

#[ORM\ManyToOne(inversedBy: 'problems')]
Member commented:

I don't see this reverse relation. And should it be ManyToOne? When does a testcase group have many problems?

Member commented:

And to me, calling it "parent testcase group" sounds odd; maybe "root testcase group"?

Member commented:

I pushed the change for the reverse relation.

@@ -1855,6 +1859,10 @@ protected function importRun(Event $event, EventData $data): void
->setRuntime($runTime)
->setResult($judgementTypeId === null ? null : $verdictsFlipped[$judgementTypeId]);

if (isset($data->score)) {
$run->setScore($data->score);
Member commented:

This comment from PHPStan is not lying 😛

@@ -259,6 +260,8 @@ public function calculateScoreRow(

$contestStartTime = $contest->getStarttime();

$pointsJury = "0";
Member commented:

Why not $scoreJury? We have points already for multipoint problems, so this might be confusing?

Comment on lines +62 to +67
/**
* Returns a two-element array:
* - The score for the testcase group, or null if not all results are ready.
* - The result for the testcase group, or null if not all results are ready.
* @return array{string|null, string|null}
*/
Member commented:

I would return a string-indexed array to make this clearer:

Suggested change
- /**
-  * Returns a two-element array:
-  * - The score for the testcase group, or null if not all results are ready.
-  * - The result for the testcase group, or null if not all results are ready.
-  * @return array{string|null, string|null}
-  */
+ /**
+  * Returns a two-element array:
+  * - The score for the testcase group, or null if not all results are ready.
+  * - The result for the testcase group, or null if not all results are ready.
+  * @return array{score: string|null, result: string|null}
+  */
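
For illustration, a minimal sketch of what a string-indexed return shape could look like at the call site. The function name `aggregateGroup`, the input format, the plain "sum" aggregation, and the use of floats are all assumptions made for brevity; none of this is the code in this PR.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (not the PR's implementation): sums the scores of a
 * testcase group and reports the first non-correct result.
 *
 * @param list<array{score: string|null, result: string|null}> $runs
 * @return array{score: string|null, result: string|null}
 */
function aggregateGroup(array $runs): array
{
    $total  = 0.0;
    $result = 'correct';
    foreach ($runs as $run) {
        if ($run['score'] === null || $run['result'] === null) {
            // Not all results are ready yet, so nothing can be reported.
            return ['score' => null, 'result' => null];
        }
        // Floats are used only to keep the sketch short; precision is an open to-do.
        $total += (float)$run['score'];
        if ($result === 'correct' && $run['result'] !== 'correct') {
            $result = $run['result'];
        }
    }
    return ['score' => number_format($total, 2, '.', ''), 'result' => $result];
}

// The keys document themselves at the call site, unlike positional [0] and [1].
['score' => $score, 'result' => $result] = aggregateGroup([
    ['score' => '50', 'result' => 'correct'],
    ['score' => '25', 'result' => 'wrong-answer'],
]);
```

With positional elements, a caller has to remember which index holds the score; with named keys, swapping the two by accident becomes much harder.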

nickygerritsen (Member) commented

> Some notes:
>
>   • the number of tries on a scoring scoreboard should really be the total number of tries (not the number up to the first correct attempt, as for pass-fail)

I changed this to show the total number of submissions, but I wonder whether that is what we want. Say you have 5 submissions:

  1. AC, score = 50
  2. AC, score = 75
  3. AC, score = 100
  4. WA
  5. WA

Do we want 3 or 5 tries? And what if the fourth one is AC, score = 100?

Also, please check my logic. Basically, for scoring problems I no longer stop processing submissions after the first correct one (and I have changed the FTS logic, since it depended on that assumption).
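
To make the two options concrete, here is a minimal sketch that computes both counts for the example above. The function name `countTries`, the array layout, and the "strictly higher score" rule are assumptions for illustration only, not the scoreboard code in this PR.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch: two ways to count tries on a scoring problem.
 *
 * @param list<array{result: string, score: float}> $submissions In submission order.
 * @return array{total: int, untilBest: int, best: float}
 */
function countTries(array $submissions): array
{
    $best      = 0.0;
    $untilBest = 0;
    foreach ($submissions as $index => $submission) {
        // Only a strictly higher score moves the "until best" marker, so a
        // later submission that merely ties the best score does not count.
        if ($submission['result'] === 'AC' && $submission['score'] > $best) {
            $best      = $submission['score'];
            $untilBest = $index + 1;
        }
    }
    return ['total' => count($submissions), 'untilBest' => $untilBest, 'best' => $best];
}

$tries = countTries([
    ['result' => 'AC', 'score' => 50.0],
    ['result' => 'AC', 'score' => 75.0],
    ['result' => 'AC', 'score' => 100.0],
    ['result' => 'WA', 'score' => 0.0],
    ['result' => 'WA', 'score' => 0.0],
]);
// $tries['total'] is 5 and $tries['untilBest'] is 3.
```

Under this rule, a fourth submission with AC and score = 100 would leave `untilBest` at 3, because it does not improve on the best score, while `total` would still be 5.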

>   • the layout of the shadow differences table needs to be fixed

Fixed

I also fixed some other things:

  • I added the reverse relation for problem->testcaseGroup.
  • We now correctly import (and export) the scoreboard type from contest.yaml/json (for this I renamed the SCORING enum case to SCORE, as that is what the spec says).
  • I fixed the single team scoreboard row, it didn't show the scores.
  • I added the score to the team submission list and detail modal.
  • I fixed a crash that occurred when we calculated the scorecache for a submission without a judgement.
  • I now check the scoreboard type when starting up the shadow script.
  • I fixed some logic for importing an event with no items, as that is what we had in the replay for the NAC challenge.
  • I also fixed importing the ZIPs from that challenge that contained folders.

2 participants