Deterministic miRDeep2_core_algorithm.pl #65

Open
wants to merge 2 commits into
base: master

Pfaendner

This PR modifies miRDeep2_core_algorithm.pl at two sites:

  1. sub find_mature_query: Until now, the read with the highest frequency has been chosen as the mature sequence of the potential precursor. However, if two or more reads share the same maximal read count, which of them is picked depends on the hash ordering and is effectively random. An additional lexicographic ordering makes the read selection deterministic.
  2. sub print_hash_comp: An additional lexicographic ordering makes the printing of the signature deterministic.
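
The tie-breaking idea in item 1 can be sketched as follows. This is a minimal Python illustration, not the Perl code from this PR: the function name `pick_mature_read`, the example sequences, and the choice of the lexicographically smallest sequence as the winner are all assumptions made for the example.

```python
def pick_mature_read(read_counts):
    """Return the read with the highest count.

    Ties are broken by taking the lexicographically smallest sequence
    (an assumed direction; the PR may order differently).
    """
    if not read_counts:
        raise ValueError("no reads given")
    # Key: negated count (so the highest count wins), then the sequence itself.
    return min(read_counts, key=lambda seq: (-read_counts[seq], seq))


# Hypothetical reads: two of them share the maximal count of 12.
counts_a = {
    "UGAGGUAGUAGGUUGUAUAGUU": 12,
    "AGAGGUAGUAGGUUGCAUAGUU": 12,
    "CUAUACAAUCUACUGUCUUUC": 3,
}
# Same data in the opposite iteration order.
counts_b = dict(reversed(list(counts_a.items())))

print(pick_mature_read(counts_a))
print(pick_mature_read(counts_b))
```

Because the sequence is part of the sort key, both calls select the same read regardless of the order in which the reads are iterated.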

Christian Pfaendner added 2 commits March 30, 2020 16:00
The read with the highest frequency is used to determine the mature sequence of the candidate precursor.
If two or more reads have the same frequency, one of them is chosen effectively at random due to the use of hashes.
Now, the reads are additionally ordered lexicographically, making this process deterministic and replicable.

The signature is ordered according to begin and end position in the potential precursor.
Now, the entries are additionally ordered lexicographically, making the output deterministic and replicable.
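
The signature ordering described in the second commit message can be sketched in the same way. Again, this is a hedged Python illustration rather than the Perl code: the function name `order_signature`, the field names `begin`, `end`, and `id`, and the use of the read ID as the final lexicographic key are assumptions.

```python
def order_signature(reads):
    """Sort reads by begin position, then end position, then read ID.

    The read ID as the last key is an assumption; it gives reads with
    identical positions a fixed relative order.
    """
    return sorted(reads, key=lambda r: (r["begin"], r["end"], r["id"]))


# Hypothetical reads: the first two map to the same begin and end positions.
reads_a = [
    {"id": "seq_7_x3", "begin": 10, "end": 31},
    {"id": "seq_2_x9", "begin": 10, "end": 31},
    {"id": "seq_5_x1", "begin": 4, "end": 25},
]
reads_b = list(reversed(reads_a))

for read in order_signature(reads_a):
    print(read["begin"], read["end"], read["id"])
```

Sorting on begin and end alone would leave the relative order of the first two reads dependent on the input order; the extra key removes that dependence.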
@mschilli87
Member

@Pfaendner: Thank you for the contribution and apologies for the long silence.
From reading #66 I understand you are fine with setting the Perl seed as suggested by @Drmirdeep instead of the changes suggested in this PR?

@mschilli87 mschilli87 removed the request for review from Drmirdeep October 13, 2021 10:53
@Pfaendner
Author

For me, the Perl seed is a quick-and-dirty solution with two disadvantages: its behaviour is not guaranteed to be preserved across different versions of Perl, and it is not very user-friendly. Thus, I have made the code deterministic (including randfold) for my use case. If you are interested in the code, please let me know. I am glad to provide it.

@mschilli87
Member

mschilli87 commented Nov 8, 2021

@Pfaendner: That would be for @Drmirdeep to decide. AFAICT there was no interest in this feature due to the additional maintenance burden.
But you could definitely link your code here for others to find or even update this PR based on the latest master if you don't mind the extra bit of work. If you keep your fork up-to-date with upstream, it might not even require any additional work at your end.
Either way, thank you for following up (and for your contribution in the first place, of course).

@mschilli87 mschilli87 removed the stale label Nov 8, 2021
@Pfaendner
Author

OK. I will update this pull request by the end of next week. Thanks for your response, too!

3 participants