Cell Calling and Quality Control
The ScaleBio RNA workflow implements multiple approaches to distinguish real cells from background barcodes in single-cell RNA sequencing data. This guide explains the different cell calling methods and quality control options available.
1. Overview
For QuantumScale RNA, single cells are identified by a combination of RT (sample) barcode, bead barcode, and library (PCR) index. However, not all barcode combinations correspond to real cells; many instead represent background, e.g. empty beads. The workflow implements the following approaches to distinguish cells from background barcodes:
- CellFinder: EmptyDrops-like approach with ambient RNA profile rescue
- TopCell Thresholding: Dynamic threshold based on transcript count percentiles
- Fixed UTC Threshold: User-defined unique transcript count threshold
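How the three methods relate to the run parameters can be sketched in a few lines of Python. This is an illustration based on the parameter descriptions in the sections below, not the workflow's own code. The function name select_cell_calling_method is hypothetical, and two points are assumptions: that TopCell is used whenever cellFinder is off and UTC is 0, and that multi-species samples fall back to the threshold-based methods.

```python
def select_cell_calling_method(params, multi_species=False):
    """Return a label for the cell calling method implied by the run parameters.

    `params` is a dict mirroring runParams.yml. This helper is hypothetical
    and only restates the rules described in this guide.
    """
    cell_finder = params.get("cellFinder", True)  # documented default: true
    utc = params.get("UTC", 0)                    # documented default: 0 (not set)

    # CellFinder is disabled for multi-species (barnyard) samples.
    # Assumption: those samples then use one of the threshold-based methods.
    if cell_finder and not multi_species:
        return "CellFinder"
    # A fixed threshold applies when UTC > 0 and CellFinder is not enabled.
    if utc > 0:
        return "Fixed UTC Threshold"
    # Assumption: with no fixed UTC, the threshold is computed by TopCell.
    return "TopCell Thresholding"


if __name__ == "__main__":
    print(select_cell_calling_method({"cellFinder": False, "UTC": 300}))
```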
2. CellFinder (Default Method)
2.1. What is CellFinder?
CellFinder is an EmptyDrops-like cell calling approach that is used by default (cellFinder parameter). Cell barcodes that are above a minimal transcript count (minUTC) but below the provided or calculated UTC threshold can be rescued based on expression differences from the ambient RNA profile.
2.2. How It Works
- Barcodes above minUTC but below the UTC threshold are evaluated
- Expression differences from the ambient RNA profile are calculated
- Cells are rescued based on a false discovery rate (cellFinderFdr)
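The rescue step can be illustrated with a minimal Python sketch. The statistical test CellFinder uses to compare a barcode against the ambient profile is not described here, so the sketch takes per-barcode p-values as a given input. It then applies the Benjamini-Hochberg procedure, which is one common way to control a false discovery rate; whether the workflow uses exactly this procedure is an assumption. The function name rescue_cells and its arguments are hypothetical.

```python
def rescue_cells(counts, pvalues, threshold, min_utc=100, fdr=0.001):
    """Split barcodes into directly called cells and rescued cells.

    counts:    dict of barcode -> unique transcript count
    pvalues:   dict of barcode -> p-value for "differs from ambient RNA"
               (hypothetical input; how it is computed is not shown here)
    threshold: the provided or calculated UTC threshold
    """
    # Barcodes at or above the UTC threshold are cells without further testing.
    called = {bc for bc, n in counts.items() if n >= threshold}

    # Candidates sit between minUTC and the UTC threshold.
    candidates = sorted(
        (bc for bc, n in counts.items() if min_utc <= n < threshold),
        key=lambda bc: pvalues[bc],
    )

    # Benjamini-Hochberg: find the largest rank k with p_(k) <= k / m * fdr,
    # then accept the k candidates with the smallest p-values.
    m = len(candidates)
    n_rescued = 0
    for rank, bc in enumerate(candidates, start=1):
        if pvalues[bc] <= rank / m * fdr:
            n_rescued = rank
    rescued = set(candidates[:n_rescued])
    return called, rescued


if __name__ == "__main__":
    counts = {"a": 5000, "b": 400, "c": 300, "d": 150, "e": 20}
    pvalues = {"b": 1e-6, "c": 0.0004, "d": 0.5}
    print(rescue_cells(counts, pvalues, threshold=1000))
```

Barcodes below minUTC never enter the test, so they need no p-value.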
2.3. Enable CellFinder
# In your runParams.yml
cellFinder: true
UTC: 0 # Optional: set to a value greater than 0 to use a fixed threshold
cellFinderFdr: 0.001
2.4. Parameters
Parameter | Description | Default |
---|---|---|
cellFinder | Enables CellFinder cell calling | true |
UTC | Fixed unique transcript count threshold (optional) | 0 |
cellFinderFdr | False discovery rate for cell rescue | 0.001 |
Important
CellFinder is disabled for multi-species (barnyard) samples.
3. TopCell Thresholding
3.1. What is TopCell?
TopCell thresholding sets a hard threshold on the unique transcript count per cell-barcode; all barcodes with counts above this threshold are called as cells, while all other barcodes are background.
3.2. How It Works
- Barcodes with counts less than the parameter minUTC are filtered from the data.
- Next, the top cell is identified: the cell barcode at the topCellPercent percentile of counts (if the expectedCells parameter is set, the percentile is applied to that number of cells).
- The UTC of the top cell is then divided by the parameter minCellRatio to determine the unique transcript count threshold for the dataset.
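The three steps above can be sketched in Python as follows. This is a minimal illustration, not the workflow's implementation. The function name top_cell_threshold is hypothetical, the percentile is taken by simple rank without interpolation, and expectedCells is interpreted as limiting the percentile to the top N barcodes; both of those details are assumptions.

```python
def top_cell_threshold(counts, min_utc=100, top_cell_percent=99,
                       min_cell_ratio=10, expected_cells=None):
    """Compute a TopCell-style UTC threshold from per-barcode counts."""
    # Step 1: drop barcodes with fewer than minUTC counts, sort high to low.
    kept = sorted((c for c in counts if c >= min_utc), reverse=True)
    if not kept:
        raise ValueError("no barcodes pass minUTC")

    # Step 2: locate the top cell at the requested percentile.
    # Assumption: if expected_cells is set, the percentile is taken
    # within the top `expected_cells` barcodes instead of all kept ones.
    n = min(expected_cells, len(kept)) if expected_cells else len(kept)
    idx = int(n * (100 - top_cell_percent) / 100)
    idx = min(idx, len(kept) - 1)
    top_cell = kept[idx]

    # Step 3: divide the top cell's count by minCellRatio.
    return top_cell / min_cell_ratio


if __name__ == "__main__":
    # 100 high-count barcodes, 300 low-count barcodes, 1000 near-empty ones.
    counts = [10000] * 100 + [150] * 300 + [20] * 1000
    print(top_cell_threshold(counts))
```

Using a high percentile rather than the maximum makes the top cell robust to a few outlier barcodes with unusually high counts.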
3.3. Enable TopCell
# In your runParams.yml
cellFinder: false
minUTC: 100
topCellPercent: 99
minCellRatio: 10
3.4. Parameters
Parameter | Description | Default |
---|---|---|
minUTC | Minimum counts to consider a barcode as a potential cell | 100 |
topCellPercent | Percentile to use for the top cell (robust max) | 99 |
minCellRatio | Ratio between transcript counts of the top cell and the lower cell threshold | 10 |
4. Fixed UTC Thresholding
4.1. What is Fixed UTC Thresholding?
A fixed unique transcript count threshold is used when the parameter UTC is greater than 0 and cellFinder is not enabled. In this case, every barcode with a total count >= UTC is called as a cell.
4.2. Enable Fixed Threshold
# In your runParams.yml
cellFinder: false
UTC: 300
4.3. How It Works
- Every barcode with total count ≥ UTC is called as a cell
- All other barcodes are considered background
- The threshold can be set in a sample-specific manner by using the sample barcode table (samples.csv). See the example below:
sample | barcodes | UTC |
---|---|---|
sample1 | 1A-6H | 300 |
sample2 | 7A-12H | 500 |
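The effect of per-sample thresholds can be sketched in Python. The CSV text below mirrors the table above, and its exact column layout is an assumption. The helper names load_sample_thresholds and call_cells_fixed are hypothetical, as is the fallback to a run-wide UTC value when a sample has no entry of its own.

```python
import csv
import io

# Assumed CSV form of the sample barcode table shown above.
SAMPLES_CSV = """sample,barcodes,UTC
sample1,1A-6H,300
sample2,7A-12H,500
"""


def load_sample_thresholds(csv_text, default_utc):
    """Map each sample name to its UTC threshold."""
    thresholds = {}
    for row in csv.DictReader(io.StringIO(csv_text)):
        value = (row.get("UTC") or "").strip()
        # Assumption: an empty UTC cell falls back to the run-wide value.
        thresholds[row["sample"]] = int(value) if value else default_utc
    return thresholds


def call_cells_fixed(counts_by_sample, thresholds):
    """Call every barcode with total count >= its sample's UTC as a cell."""
    return {
        sample: {bc for bc, n in counts.items() if n >= thresholds[sample]}
        for sample, counts in counts_by_sample.items()
    }


if __name__ == "__main__":
    thresholds = load_sample_thresholds(SAMPLES_CSV, default_utc=300)
    counts = {
        "sample1": {"bc1": 350, "bc2": 120},
        "sample2": {"bc3": 350, "bc4": 900},
    }
    print(call_cells_fixed(counts, thresholds))
```

In this toy input, a barcode with 350 transcripts is a cell in sample1 but background in sample2, because the two samples use different thresholds.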
8. Output Files
For detailed information about cell calling outputs, see:
- Outputs Overview: General output structure and organization
- Samples Directory: The samples directory has information about the cells called per sample as well as background ambient barcodes.
Need Help?
For more information, please contact support@scale.bio or visit our support website.