New parameter 'seed' and code cleanup on geom_sina(). #104
@thomasp85 can you have a look at merging the pull request? It only affects sina.R; it fixes a reported bug and adds a requested seed parameter.
I generally hate having seed as an argument. It will affect all code that runs afterwards, but this is not apparent to the user, who often thinks that it only affects the code within the function... setting Is there a very real reason to have it as an argument?
It mainly serves reproducibility purposes. I tried consecutive calls of geom_sina, one with seed and one without and there is no memory of the seed in the second one.
The memory is in how it affects all subsequent sampling.

p <- ggplot(midwest, aes(state, popdensity)) + scale_y_log10()
p + geom_sina(seed = 1) + geom_sina(color = "2")
sample(10)

If you call the above block repeatedly, you'll notice that the sample call always returns the same sequence, even though it is supposed to be random. This will confuse people who only expect the seed argument to affect the produced plot and not the code that follows it. On the other hand, the following block makes it obvious that you are meddling with the global state:

set.seed(10)
p <- ggplot(midwest, aes(state, popdensity)) + scale_y_log10()
p + geom_sina() + geom_sina(color = "2")
sample(10)

and people will not be surprised that all code will be affected by it. So, if the seed argument is to remain, it will need to be implemented in such a way that it does not affect the global environment. One way of achieving that is to get a new seed prior to setting it, so your code will be:

if (!is.null(params$seed)) {
  new_seed <- sample(.Machine$integer.max, 1)
  set.seed(params$seed)
  on.exit(set.seed(new_seed))
}

Another way would simply be to educate users about using
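For comparison, here is a minimal sketch of a related approach: instead of reseeding with a freshly drawn value, save the global `.Random.seed` before calling `set.seed()` and restore it on exit, so the caller's random stream continues exactly where it left off. This is an illustration under assumptions, not what the pull request implements; `with_local_seed` is a hypothetical helper name and is not part of the package.

```r
# Hypothetical helper (an assumption for illustration, not package code):
# evaluate `expr` under a fixed seed, then restore the global RNG state.
with_local_seed <- function(seed, expr) {
  has_state <- exists(".Random.seed", envir = globalenv(), inherits = FALSE)
  if (has_state) {
    # Remember the caller's RNG state and put it back when we leave.
    old_state <- get(".Random.seed", envir = globalenv(), inherits = FALSE)
    on.exit(assign(".Random.seed", old_state, envir = globalenv()), add = TRUE)
  } else {
    # The RNG was never initialised in this session; leave it that way.
    on.exit(rm(".Random.seed", envir = globalenv()), add = TRUE)
  }
  set.seed(seed)
  # `expr` is a lazily evaluated argument, so it only runs here,
  # after the seed has been set.
  force(expr)
}

# Usage: the seeded draws are reproducible ...
first  <- with_local_seed(1, runif(3))
second <- with_local_seed(1, runif(3))
identical(first, second)

# ... while code that follows is not pinned to the seed.
sample(10)
```

The trade-off against the snippet above is that drawing `new_seed` consumes one value from the caller's stream and then restarts it from a new seed, whereas saving and restoring `.Random.seed` leaves the stream untouched. The withr package provides `withr::with_seed()` for the same purpose, if an extra dependency is acceptable.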
I see, thanks for clarifying that. I followed the first of your suggestions and removed the parameter completely, as it seems like the neater solution.
Have you been requested to add it? In that case you should probably forward the
I just did. Is the rest of the request OK? I am working on this request on a separate branch and would like to finish this one before I make any new merges.
Fixes #90
I reset my master fork and recreated the previous pull request.