Fix confusing default torch transform #21
Merged
Fixed
- `structure2tensor_transform` in the torch `PinderDataset` was using `atom_coordinates` for `residue_coordinates`. This PR fixes that by setting `residue_coordinates` to the coordinates of the C-alpha atoms.
Changed
- The `structure2tensor` encoding was previously storing a one-hot encoding of element numbers in the returned `atom_types` key. This PR removes some of this confusion by using a consistent tensor dimension matching the convention for `residue_types`, and introduces an additional key, `element_types`, to store the element integer encodings.
- `atom_types` is adjusted to represent atom names/types, not elements, and now stores the atom name integer encodings derived from `pinder.core.utils.constants.ALL_ATOM_POSNS`.
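
To illustrate the key layout described above, here is a minimal sketch in plain Python. It is not the code from this PR: the two vocabularies, the toy `atoms` list, and the `encode_structure` helper are all hypothetical, and the flat one-integer-per-atom layout is an assumption (the real atom-name vocabulary is derived from `pinder.core.utils.constants.ALL_ATOM_POSNS`, and the real transform returns tensors).

```python
# Hypothetical vocabularies, for illustration only. The real atom-name
# vocabulary is derived from pinder.core.utils.constants.ALL_ATOM_POSNS.
ATOM_NAME_TO_INDEX = {"N": 0, "CA": 1, "C": 2, "O": 3, "CB": 4}
ELEMENT_TO_INDEX = {"C": 0, "N": 1, "O": 2, "S": 3}

# Toy structure: (residue_index, atom_name, element, (x, y, z)).
atoms = [
    (0, "N", "N", (0.0, 0.0, 0.0)),
    (0, "CA", "C", (1.5, 0.0, 0.0)),
    (0, "C", "C", (2.0, 1.4, 0.0)),
    (1, "N", "N", (3.3, 1.5, 0.0)),
    (1, "CA", "C", (4.0, 2.8, 0.0)),
    (1, "CB", "C", (5.5, 2.6, 0.0)),
]


def encode_structure(atoms):
    """Return per-atom and per-residue features as plain Python lists."""
    return {
        # One coordinate triple per atom.
        "atom_coordinates": [list(xyz) for _, _, _, xyz in atoms],
        # One integer per atom, encoding the atom NAME (not the element).
        "atom_types": [ATOM_NAME_TO_INDEX[name] for _, name, _, _ in atoms],
        # One integer per atom, encoding the chemical element.
        "element_types": [ELEMENT_TO_INDEX[el] for _, _, el, _ in atoms],
        # One coordinate triple per residue: only the C-alpha atoms are kept.
        "residue_coordinates": [
            list(xyz) for _, name, _, xyz in atoms if name == "CA"
        ],
    }


features = encode_structure(atoms)
```

In this toy example `atom_types` and `element_types` each hold one integer per atom, while `residue_coordinates` holds one entry per residue, which is the distinction the fix restores.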
Notes
Reminder: the encodings used by default in the `PinderDataset` and `PPIDataset` dataset classes are only example implementations. The default transform(s) can and likely should be customized to fit the end-user's featurization/loading workflow.
Closes #20 - thanks for catching and reporting the `residue_coordinates` bug @eswan01!
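
As an example of the customization mentioned in the notes, downstream code that relied on the old one-hot element encoding could rebuild it from the new integer `element_types` key. The sketch below is a hypothetical illustration in plain Python: the function names, the dict-in/dict-out shape of the transform, the `element_one_hot` key, and the `num_elements` default are all assumptions, and how such a callable is passed to the dataset classes is not shown here.

```python
def one_hot(indices, num_classes):
    """Expand integer class indices into one-hot rows."""
    rows = []
    for index in indices:
        if not 0 <= index < num_classes:
            raise ValueError(f"index {index} outside [0, {num_classes})")
        row = [0] * num_classes
        row[index] = 1
        rows.append(row)
    return rows


def add_element_one_hot(features, num_elements=4):
    """Hypothetical custom transform: add a one-hot view of element_types.

    Takes a dict of features and returns a new dict, leaving the input
    untouched. The integer encodings are kept alongside the one-hot rows.
    """
    out = dict(features)
    out["element_one_hot"] = one_hot(features["element_types"], num_elements)
    return out


example = {"element_types": [1, 0, 0, 2]}
transformed = add_element_one_hot(example)
```

Keeping the integer key and adding the one-hot view under a separate name avoids reintroducing the ambiguity that this PR removes from `atom_types`.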