adding signac diff functionality #247
Discussion on the appropriate scope and API of this PR is happening at #248. Once everything there is finalized, we can update this PR.
I will make a couple commits to this branch to add proper docs.
Reviewers @vyasr @cbkerr and author @atravitz: I made some final touch-ups to this PR, and I am satisfied with the feature's quality and completeness for merging to |
I tested the performance: for 10,000 jobs with a trivial state point (`{'i': i} for i in range(int(1e4))`), the diff took less than a second. For 100,000 jobs, just opening all the jobs can take a while (perhaps many minutes, depending on whether any of them are in the FS cache and on the nature of the filesystem), but the diffing itself takes less than ten seconds, so my concerns about performance aren't too significant IMO. No need for progress bars.
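A rough way to reproduce the shape of this measurement without touching the filesystem is sketched below. It is only an approximation: `diff_items` and `time_diff` are hypothetical helpers written for this illustration, not the code in this PR, and they operate on plain in-memory dictionaries, so the cost of opening jobs is not included.

```python
import time


def diff_items(state_points):
    """Drop (key, value) pairs common to every state point.

    Hypothetical stand-in for the diff in this PR; it assumes flat state
    points with hashable values.
    """
    item_sets = {job_id: set(sp.items()) for job_id, sp in state_points.items()}
    shared = set.intersection(*item_sets.values()) if item_sets else set()
    return {job_id: dict(items - shared) for job_id, items in item_sets.items()}


def time_diff(n):
    """Build n trivial state points ({'i': i}) and time the in-memory diff."""
    state_points = {f"job_{i}": {"i": i} for i in range(n)}
    start = time.perf_counter()
    result = diff_items(state_points)
    elapsed = time.perf_counter() - start
    return result, elapsed


result, elapsed = time_diff(int(1e4))
print(f"diffed {len(result)} state points in {elapsed:.3f} s")
```

Because every state point here has a distinct value of `i`, nothing is shared and each job keeps its single key in the result.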
Regarding the different use-cases outlined in #248, I think the existing API is sufficiently general to allow this. We would simply need to add an extra argument to the Python and CLI such as |
Suggestions from @vyasr have been addressed.
Very nearly there! Just a few very small changes.
Description
Compares two or more jobs and returns the job ids, each with only those state point keys that are not shared by all of the other jobs.
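The behavior described above can be illustrated with a minimal sketch on plain dictionaries. The function name `diff_state_points`, the job ids, and the state point keys are assumptions made for this example; the sketch handles only flat state points and is not the implementation in this PR.

```python
def diff_state_points(state_points):
    """Return, per job id, only the entries not shared by all state points.

    `state_points` maps a job id to a flat state point dictionary.
    Hypothetical helper for illustration only.
    """
    if not state_points:
        return {}
    all_sps = list(state_points.values())
    first, rest = all_sps[0], all_sps[1:]
    # A key is "shared" when every state point has it with the same value.
    shared = {
        key
        for key, value in first.items()
        if all(key in sp and sp[key] == value for sp in rest)
    }
    return {
        job_id: {k: v for k, v in sp.items() if k not in shared}
        for job_id, sp in state_points.items()
    }


# Two state points that differ only in "p".
example = {
    "a1": {"T": 1.0, "p": 2.0, "N": 100},
    "b2": {"T": 1.0, "p": 3.0, "N": 100},
}
print(diff_state_points(example))
```

For the two example state points, only the `p` entry survives for each job id, which matches the use case of state points that vary by one or two values.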
Motivation and Context
There have been several requests for this functionality, often for cases where two state points vary by only one or two values.
Resolves #248.
Types of Changes
[1] The change breaks (or has the potential to break) existing functionality.
Checklist:
If necessary: