Issue with read.data #30
Comments
Here is a reproducible example (take any four datasets you have locally, but don't change the origins and experiments labels):
pooled_env <- initialize.project(datasets = c("LCMV1", "LCMV2","LCMV3", "LCMV4"),
pooled_env <- read.data(pooled_env)
table(combined$dataset.labels)
@asmagen The problem after removing the unique command is that rerunning changes the lengths of origins, experiments, and dataset.labels again, which causes errors in the subsequent workflow. See the following output.
Notice how the lengths of origins, experiments and dataset.labels are inconsistent across the two runs. Any suggestions on how this should be resolved? The easiest fix is to just disable rerun, since we are not going to use this data structure after the next update anyway.
Good catch, thanks for letting me know. Since we need to retain compatibility with earlier versions after the SCE update, I would suggest adding a couple of variables (baseline.origins, baseline.experiments) in which initialize.project stores origins and experiments. Then read.data reads from baseline.origins and baseline.experiments and saves to origins and experiments. Basically, it prevents reading from and writing to the same variables, which is what is causing the problem. Let me know if that makes sense.
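A minimal sketch of that suggestion, assuming the project object is an R environment. The functions `init.sketch` and `read.sketch` and the `keep` argument are hypothetical stand-ins for illustration, not the package's actual `initialize.project` and `read.data`, and the label values are made up. `keep` simply models whatever step inside `read.data` shortens the vectors.

```r
# Hypothetical stand-ins for initialize.project / read.data; not the package code.
init.sketch <- function(datasets, origins, experiments) {
  env <- new.env()
  env$datasets <- datasets
  # Store the user-supplied labels once; read.sketch never writes to these.
  env$baseline.origins <- origins
  env$baseline.experiments <- experiments
  env
}

read.sketch <- function(env, keep = rep(TRUE, length(env$datasets))) {
  # Always read from the baseline copies and write to the working copies,
  # so a rerun starts from the same input every time.
  env$origins <- env$baseline.origins[keep]
  env$experiments <- env$baseline.experiments[keep]
  invisible(env)
}

env <- init.sketch(c("LCMV1", "LCMV2", "LCMV3", "LCMV4"),
                   origins = rep("CD44+ cells", 4),       # made-up labels
                   experiments = c("Rep1", "Rep2", "Rep1", "Rep2"))

read.sketch(env, keep = c(TRUE, TRUE, FALSE, TRUE))
first <- length(env$origins)

read.sketch(env, keep = c(TRUE, TRUE, FALSE, TRUE))  # rerun
second <- length(env$origins)

identical(first, second)  # lengths agree across the two runs
```

Because the baseline vectors are only written at initialization, rerunning the read step cannot shrink its own input.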
Found another issue with read.data: genes.filter, which is meant to filter genes from the expression matrix, is used incorrectly:
Similarly to read.preclustered.datasets, the unique command needs to be removed from all of the following code (taken from the read.data function). We have already done that for read.preclustered.datasets. The structure of initialize.project is that each dataset has a value of origins and experiments, and these values can be duplicated. Applying unique to these vectors is problematic. See this example of how I run initialize.project:
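To illustrate why `unique` breaks things, here is a small sketch with made-up labels (the origin and experiment values are hypothetical, not the ones from the actual project). When several datasets share a label, `unique` shortens the vector so it no longer lines up with `datasets`.

```r
# Hypothetical labels: one value per dataset, duplicates allowed.
datasets    <- c("LCMV1", "LCMV2", "LCMV3", "LCMV4")
origins     <- c("CD44+ cells", "CD44+ cells", "CD44+ cells", "CD44+ cells")
experiments <- c("Rep1", "Rep2", "Rep1", "Rep2")

# unique() collapses duplicates, so the vectors stop being parallel to datasets.
n.origins     <- length(unique(origins))      # 1, not 4
n.experiments <- length(unique(experiments))  # 2, not 4

# Looking up the label of the third dataset now runs past the end of the vector.
third.origin <- unique(origins)[3]            # NA

# Keeping the vectors as supplied preserves one label per dataset.
per.dataset <- data.frame(dataset = datasets,
                          origin = origins,
                          experiment = experiments)
```

Leaving the vectors untouched keeps them the same length as `datasets`, which is what per-dataset indexing relies on.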