-
Notifications
You must be signed in to change notification settings - Fork 23
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Mixscape reproducibility #683
Conversation
The test fixes are unrelated and I'm hoping to fix them next week. Just ping me if you want a review, please. |
Originally, I wanted to add more to this PR, but I think it’s easier to create a separate one. Feel free to review, if you have time @Zethson! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great, thank you!
-
Input required: The perturbation signature was previously calculated by subtracting the average expression of the N nearest control (NT) cells from the observed expression. But, in the original implementation, this subtraction is performed in the reverse order (see here). This only affects the sign of the result, but aligning our calculation with the original implementation would make it easier to compare scores. I suggest adopting our calculation to the original approach, but happy to discuss this further.
Yes, let's adopt the way they did it. Whenever we can align with them, let's do it.
- I had to merge
main
into your branch to get the CI to run and hopefully pass. You may now also have the code and issue of Why is np.round().astype("int64") applied to post_prob in Mixscape? #694. I'll have a look at this ASAP but your results may now be different.
copy: bool | None = False, | ||
): | ||
"""Identify perturbed and non-perturbed gRNA expressing cells that accounts for multiple treatments/conditions/chemical perturbations. | ||
|
||
The implementation resembles https://satijalab.org/seurat/reference/runmixscape | ||
The implementation resembles https://satijalab.org/seurat/reference/runmixscape. Note that in the original implementation, the | ||
perturbation signature is calculated on unscaled data by default and we therefore recommend to do the same. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we test somewhere whether the data is unscaled and print a warning if it is? We can at least test whether the input is count or normalized data. There's code somewhere in pertpy for this I think. If not, please ping me.
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #683 +/- ##
==========================================
- Coverage 65.56% 64.89% -0.68%
==========================================
Files 47 46 -1
Lines 6105 5994 -111
==========================================
- Hits 4003 3890 -113
- Misses 2102 2104 +2
|
PR Checklist
docs
is updatedDescription of changes
GaussianMixture
model. I added a seed parameter topt.tl.Mixscape.mixscape
.n_dims
, which specifies the number of dimensions from the chosen representation (usually PCA) to be used for determining the nearest neighbors. I have added this parameter to our implementation ofpt.tl.Mixscape.perturbation_signature
and set its default value to 15, which is also the default used in the original implementation.pt.tl.Mixscape.perturbation_signature
that not only verifies that a layer has been added to the AnnData object but also checks the correctness of the computed scores