feature barcoding data (like CITE-seq) is taking too much time to process #630
-
There have been a couple of enquiries regarding quantification of feature barcoding data like CITE-seq, where fixed-length barcodes are matched (instead of the full read sequences) to a small set of reference antibody-derived tags. Some users observe that alevin takes too much time to process such data when they follow the feature-barcoding alevin-tutorial.
Replies: 1 comment
-
Basically, the answer lies in how complicated the UMI graph is. An experiment with antibody-derived barcodes (ADT) and a panel of ~200 proteins generally doesn't need --naiveEqclass mode UMI deduplication, unless the experiment is very deeply sequenced. However, with very low diversity, such as ~20 barcodes (e.g. HTO-like sample barcodes), the UMI graph becomes exponentially hard to solve, which can substantially increase alevin's running time. In general, if you expect very low diversity in the number of barcodes in your experiment, we'd recommend using --naiveEqclass; otherwise, prefer avoiding it. Generally, the experiment with low diversity barcodes results in such a highly de…
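To make the rule of thumb above concrete, here is a minimal Python sketch that assembles a salmon alevin command line and appends --naiveEqclass only when the expected number of feature barcodes is small. The helper name, the file paths, and the cutoff of 50 are assumptions made for illustration: the thread only contrasts ~20 barcodes (HTO) with a ~200 protein panel and gives no exact threshold. The other feature-barcoding options from the tutorial are omitted here and would still need to be added.

```python
import shlex
from typing import List

# Hypothetical cutoff, not from the thread: the discussion only says that
# ~20 barcodes (HTO) is "very low diversity" and ~200 (ADT panel) is not.
LOW_DIVERSITY_CUTOFF = 50


def alevin_feature_command(
    index: str,
    reads1: str,
    reads2: str,
    out_dir: str,
    n_barcodes: int,
    threads: int = 8,
    low_diversity_cutoff: int = LOW_DIVERSITY_CUTOFF,
) -> List[str]:
    """Build an illustrative `salmon alevin` argument list.

    --naiveEqclass is appended only for low-diversity barcode sets, following
    the recommendation in the reply. Very deeply sequenced ADT experiments
    may also benefit from it, which this sketch does not model.
    """
    cmd = [
        "salmon", "alevin",
        "-l", "ISR",          # library type
        "-i", index,          # index built from the feature barcodes
        "-1", reads1,         # read 1 FASTQ
        "-2", reads2,         # read 2 FASTQ
        "-o", out_dir,        # output directory
        "-p", str(threads),   # number of threads
    ]
    if n_barcodes <= low_diversity_cutoff:
        cmd.append("--naiveEqclass")
    return cmd


if __name__ == "__main__":
    # Placeholder paths; replace them with your own files.
    hto = alevin_feature_command("hto_index", "R1.fastq.gz", "R2.fastq.gz", "hto_quant", n_barcodes=20)
    adt = alevin_feature_command("adt_index", "R1.fastq.gz", "R2.fastq.gz", "adt_quant", n_barcodes=200)
    print(shlex.join(hto))
    print(shlex.join(adt))
```

The returned list can be passed to subprocess.run as-is. Keeping the cutoff as a parameter makes it easy to adjust once you know how your own panel size and sequencing depth affect the running time.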