Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix confusing default torch transform #21

Merged
merged 5 commits into from
Sep 25, 2024

Conversation

danielkovtun
Copy link
Contributor

Fixed

  • default structure2tensor_transform in the torch PinderDataset was using atom_coordinates for residue_coordinates. This PR fixes that by setting residue_coordinates to the coordinates of the c-alpha atoms

Changed

  • structure2tensor encoding was previously storing a one-hot encoding of element numbers in the returned atom_types key. This PR removes some of this confusion by using a consistent tensor dimension matching the convention for residue_types and introduces an additional key element_types to store the element integer encodings
  • atom_types is adjusted to represent atom names/types, not elements, and now stores the atom name integer encodings derived from pinder.core.utils.constants.ALL_ATOM_POSNS

Notes

Reminder: the encodings used by default in the PinderDataset and PPIDataset dataset classes are only example implementations. The default transform(s) can and likely should be customized to fit the end-user's featurization/loading workflow.

Closes #20 - thanks for catching and reporting the residue_coordinates bug @eswan01!

danielkovtun added 5 commits September 25, 2024 16:39
…d use simpler default encoding containing the integer index of the string defined in pinder constants
… for the residue_coordinates property; also add atom names and element names separately
Copy link

codecov bot commented Sep 25, 2024

Codecov Report

Attention: Patch coverage is 88.88889% with 1 line in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
src/pinder-core/pinder/core/loader/geodata.py 88.88% 0 Missing and 1 partial ⚠️
Flag Coverage Δ
80.05% <88.88%> (+0.07%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Files with missing lines Coverage Δ
src/pinder-core/pinder/core/loader/geodata.py 79.34% <88.88%> (+0.27%) ⬆️

... and 6 files with indirect coverage changes

@danielkovtun danielkovtun merged commit c9ac589 into main Sep 25, 2024
5 checks passed
@danielkovtun danielkovtun deleted the fix-confusing-default-torch-transform branch September 25, 2024 21:27
danielkovtun added a commit that referenced this pull request Sep 25, 2024
* fix: element_types were left out of collate function after PR #21

* docs: re-run cached loader notebooks

---------

Co-authored-by: danielkovtun <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Confusing features in dataset returned by get_torch_loader
1 participant