To implement, in order of ease:
1. get overexpressed genes for each cluster
- DONE
2. more visualization tools - heat maps, gene expression in scatter plot
3. uncertainty sampling by cluster assignment entropy
4. re-labeling/re-initialization of uncurl
- including cluster merging/splitting
- TODO
5. improved distribution selector, using p-values?
- try out various things, e.g. implementing a chi-square test?
- implement that crazy p-value thing from that one paper?
6. improved gene selection methods? bring back robust uncurl?
- use Poisson test/c-score test on cluster subsets to identify differentiating genes?
7. hierarchical uncurl
- binary splitting w/uncurl, soft assignments based on entropy
8. picking K
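Rough sketch for the uncertainty sampling item above. It assumes the soft cluster assignments are available as a nonnegative matrix W of shape (clusters, cells); the function names and the eps guard are made up for illustration and are not part of uncurl.

```python
import numpy as np

def assignment_entropy(W, eps=1e-12):
    """Shannon entropy (in nats) of each cell's soft cluster assignment.

    W is assumed to be a (clusters, cells) nonnegative array; each column
    is normalized to sum to 1 before the entropy is computed.
    """
    W = np.asarray(W, dtype=float)
    col_sums = np.maximum(W.sum(axis=0, keepdims=True), eps)
    P = W / col_sums
    # eps inside the log keeps exact zeros from producing nan
    return -(P * np.log(P + eps)).sum(axis=0)

def most_uncertain_cells(W, n=10):
    """Indices of the n cells with the highest assignment entropy."""
    H = assignment_entropy(W)
    return np.argsort(H)[::-1][:n]
```

Cells returned by most_uncertain_cells would be the candidates to show the user for re-labeling; the same entropy could also drive the soft assignments in the hierarchical splitting idea.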
Uncurl question: how many iterations do we actually need???
to investigate:
- gene selection while doing cell state estimation - use Poisson Test? (might fail for very sparse genes). Use c-score? (too ad-hoc).
- streaming uncurl? in terms of returning/yielding partial results during the optimization process, not in terms of operating over data streams (although that would be interesting too? streaming NMF?)
- regularization in uncurl
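Rough sketch for the "yielding partial results" idea above, which also gives a way to poke at the iteration-count question. This is NOT uncurl's optimizer: plain multiplicative-update NMF (Frobenius loss) is swapped in as a stand-in, and nmf_iter / run_until_converged are made-up names. The point is only the generator pattern: yield the current factors every iteration and let the caller decide when to stop.

```python
import numpy as np

def nmf_iter(X, k, max_iters=100, seed=0, eps=1e-10):
    """Yield (iteration, M, W, error) after every update step.

    Stand-in optimizer: multiplicative-update NMF, X ~ M @ W, with
    X of shape (genes, cells), M (genes, k), W (k, cells).
    """
    rng = np.random.default_rng(seed)
    n_genes, n_cells = X.shape
    M = rng.random((n_genes, k)) + eps
    W = rng.random((k, n_cells)) + eps
    for it in range(1, max_iters + 1):
        W *= (M.T @ X) / (M.T @ M @ W + eps)
        M *= (X @ W.T) / (M @ W @ W.T + eps)
        err = np.linalg.norm(X - M @ W)
        # copies so the caller can keep partial results safely
        yield it, M.copy(), W.copy(), err

def run_until_converged(X, k, tol=1e-4, **kwargs):
    """Consume the generator until the relative improvement drops below tol."""
    prev = np.inf
    for it, M, W, err in nmf_iter(X, k, **kwargs):
        if prev - err < tol * max(prev, 1e-12):
            break
        prev = err
    return it, M, W, err
```

A frontend could consume nmf_iter directly and redraw the scatter plot on each yielded W; logging the iteration at which run_until_converged stops for a few datasets would give a first answer to how many iterations are actually needed.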