FastRNA: An efficient solution for PCA of single-cell RNA-sequencing data based on a batch-accounting count model
Cited 2 times in Web of Science
Cited 2 times in Scopus
- Authors
- Issue Date
- 2022-11
- Publisher
- Cell Press
- Citation
- American Journal of Human Genetics, Vol.109 No.11, pp.1974-1985
- Abstract
- The analysis of single-cell RNA-sequencing (scRNA-seq) data almost always begins with generating a low-dimensional embedding of the data by principal-component analysis (PCA). Because scRNA-seq data are count data, log transformation is routinely applied to correct skewness prior to PCA, a step that is often argued to add bias to the data. Alternatively, studies have proposed methods that directly assume a count model and use approximately normally distributed count residuals for PCA. Despite their theoretical advantage of directly modeling count data, these methods are extremely slow for large datasets. In fact, as the data size grows, even the standard log normalization becomes inefficient. Here, we present FastRNA, a highly efficient solution for PCA of scRNA-seq data based on a count model accounting for both batches and cell size factors. Although we assume the same general count model as previous methods, our method uses two orders of magnitude less time and memory than the other count-based methods and an order of magnitude less time and memory than the standard log normalization. This achievement results from our unique algebraic optimization that completely avoids the formation of the large dense residual matrix in memory. In addition, our method has the benefit that batch effects are eliminated from the data prior to PCA. Generating a batch-accounted PC of an atlas-scale dataset with 2 million cells takes less than a minute and 1 GB of memory with our method.
- ISSN
- 0002-9297
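The abstract's central idea, PCA on count-model residuals without ever forming the dense residual matrix, can be illustrated with a minimal sketch. This is not the FastRNA algorithm: it assumes a simplified Poisson model with no batch term, mu[c, g] = n[c] * p[g], and the names `residual_gram` and `pca_scores` are hypothetical. Under that assumption the Pearson residual matrix is R = Z - sqrt(n) sqrt(p)^T, where Z is Y with its rows and columns rescaled, so R^T R = Z^T Z - N sqrt(p) sqrt(p)^T. Dense NumPy arrays are used for brevity; with a sparse count matrix, Z would keep the sparsity pattern of Y. The PCA here is uncentered (an eigendecomposition of R^T R), and every cell and gene is assumed to have at least one count.

```python
import numpy as np


def residual_gram(Y):
    """Return R.T @ R for Poisson Pearson residuals R, without building R.

    Y is a (cells x genes) count matrix. Assumed model (illustrative only,
    no batch term): mu[c, g] = n[c] * p[g], where n holds per-cell totals
    and p holds per-gene fractions of the grand total.
    """
    Y = np.asarray(Y, dtype=float)
    n = Y.sum(axis=1)              # per-cell size factors (total counts)
    N = n.sum()                    # grand total of all counts
    p = Y.sum(axis=0) / N          # per-gene expected fraction
    sqrt_p = np.sqrt(p)
    # Rescaling rows and columns only changes nonzero entries of Y,
    # so a sparse Y would remain sparse at this step.
    Z = Y / np.sqrt(n)[:, None] / sqrt_p[None, :]
    # R = Z - sqrt(n) sqrt(p)^T, and the cross terms collapse to
    # R^T R = Z^T Z - N * sqrt(p) sqrt(p)^T  (a genes x genes matrix).
    gram = Z.T @ Z - N * np.outer(sqrt_p, sqrt_p)
    return gram, Z, n, p


def pca_scores(Y, k):
    """Top-k uncentered principal-component scores of the residuals."""
    gram, Z, n, p = residual_gram(Y)
    evals, evecs = np.linalg.eigh(gram)   # eigenvalues in ascending order
    V = evecs[:, ::-1][:, :k]             # top-k gene loadings
    # Scores equal R @ V, computed again without forming R.
    scores = Z @ V - np.outer(np.sqrt(n), np.sqrt(p) @ V)
    return scores, V
```

The only dense objects created beyond the rescaled counts are the genes-by-genes Gram matrix and the cells-by-k score matrix, which is why this kind of rearrangement can save memory when the number of cells is much larger than the number of genes.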