Reporting Workflow
Overview
A separate sub-workflow, reporting, is included to run only the QC and reporting steps of the workflow, after alignment and quantification (i.e. after STARsolo). This includes:
- Cell filtering (UMI threshold)
- Sample metric and report generation
- Library metric and report generation
- ScalePlex assignment
The reporting-only workflow allows you to generate or regenerate QC reports and summary metrics from existing analysis results without re-running alignment or quantification. This is useful for:
- Quickly updating reports after changing cell filtering or QC parameters.
- Merging results from multiple extended throughput plates or sequencing runs.
- Generating new reports for downstream analysis or sharing.
Important
The reporting workflow operates on outputs from a previous full analysis run (after STARsolo alignment and quantification). It does not re-align reads or re-quantify gene expression.
When to Use the Reporting Workflow
- Parameter tuning: Adjust cell filtering thresholds or QC parameters and regenerate reports without re-running the full pipeline.
- Merging runs: Combine results from multiple libraries or sequencing runs (see Extended Throughput).
How to Run
nextflow run /PATH/TO/ScaleRna \
-profile ... \
--genome genome.json \
--samples samples.csv \
--reporting \
--resultDir <outDir from previous workflow run>
- --samples and --genome should match those used in the original analysis.
- --reporting turns on the reporting-only workflow option.
- --resultDir points to the output directory of the previous run; it can also be added as a column in the samples barcode table.
Warning
Do not specify --runFolder or --fastqDir for a reporting run.
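When reporting runs are launched from a script, the rule above can be checked before Nextflow starts. The following is a minimal Python sketch and is not part of ScaleRna: the helper name build_reporting_command and its arguments are hypothetical, and it uses only the options shown in the command above. The profile is passed in as an argument because its value depends on your installation.

```python
from typing import Dict, List, Optional

# Options that must not be passed to a reporting run (see the warning above).
FORBIDDEN = ("runFolder", "fastqDir")


def build_reporting_command(pipeline_dir: str, profile: str, genome: str,
                            samples: str, result_dir: str,
                            extra: Optional[Dict[str, str]] = None) -> List[str]:
    """Assemble the argument list for a reporting-only run (hypothetical helper)."""
    extra = dict(extra or {})
    bad = [name for name in FORBIDDEN if name in extra]
    if bad:
        raise ValueError(
            "not allowed with --reporting: " + ", ".join("--" + b for b in bad)
        )
    cmd = [
        "nextflow", "run", pipeline_dir,
        "-profile", profile,
        "--genome", genome,
        "--samples", samples,
        "--reporting",
        "--resultDir", result_dir,
    ]
    # Any further pipeline parameters are appended unchanged.
    for name, value in extra.items():
        cmd += ["--" + name, str(value)]
    return cmd
```

The returned list can be handed to subprocess.run(cmd, check=True); building a list rather than a single string avoids shell quoting problems with paths that contain spaces.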
Inputs
Previous analysis outputs:
- STARsolo raw output: resultDir/alignment/<sample>.<libName>.star.solo
- STARsolo log file: resultDir/alignment/<sample>.<libName>.star.align
Reference files:
- genome.json
- library.json
Sample barcode table:
- samples.csv (can include a resultDir column for merging, see below)
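Before starting a reporting run, it can help to confirm that the previous output directory still contains the STARsolo results listed above. The sketch below is an illustration only: missing_inputs is a hypothetical helper, and the sample and library names must be supplied by the caller. It checks only for the two paths named in this section and makes no assumption about whether they are files or directories.

```python
from pathlib import Path
from typing import List


def missing_inputs(result_dir: str, sample: str, lib_name: str) -> List[str]:
    """Return the expected STARsolo paths that are absent from result_dir."""
    alignment = Path(result_dir) / "alignment"
    prefix = f"{sample}.{lib_name}"
    expected = [
        alignment / f"{prefix}.star.solo",   # STARsolo raw output
        alignment / f"{prefix}.star.align",  # STARsolo log file
    ]
    # exists() is true for both files and directories.
    return [str(p) for p in expected if not p.exists()]
```

An empty list means both inputs were found; otherwise the returned paths show what is missing from the previous run.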
Outputs
Filtered gene expression matrix:
- samples/<sample>.<libName>.filtered/
Cell metrics:
- samples/<sample>.<libName>.allCells.csv
QC reports:
- reports/<sample>/<sample>.<libName>.report.html
- CSV metric files in the same directory
Important
The reporting workflow does not aggregate read-trimming statistics from the original run. Metrics such as Total Sample Reads (pre-trimming) and Average Trimmed Read Length will be missing from regenerated reports.
Merging and Combining Runs
Please see merging multiple runs for a detailed explanation of how the merging function works alongside the --reporting option, because this depends on the experimental setup. You can use the reporting workflow to combine results from multiple previous analysis runs. Here is a simple example:
- Add a resultDir column to your samples.csv to specify the output directory for each sample:
sample | resultDir |
---|---|
pbmc1 | /PATH/TO/RUN1/ScaleRna.out |
pbmc2 | /PATH/TO/RUN2/ScaleRna.out |
- See samples.ext-merge.csv for a complete example.
- If --merge is set (default), the workflow produces combined outputs for each sample across all libraries (plates).
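If the sample table already exists, the resultDir column can be added programmatically rather than by hand. This is a minimal sketch under stated assumptions: add_result_dir is a hypothetical helper, and it assumes samples.csv is comma-delimited with a header row that contains a sample column. Other columns are passed through unchanged.

```python
import csv
import io
from typing import Dict


def add_result_dir(samples_csv_text: str, result_dirs: Dict[str, str]) -> str:
    """Return samples.csv text with a resultDir column filled in per sample."""
    reader = csv.DictReader(io.StringIO(samples_csv_text))
    fields = list(reader.fieldnames or [])
    if "resultDir" not in fields:
        fields.append("resultDir")
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fields, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        # Raises KeyError for an unlisted sample instead of guessing a directory.
        row["resultDir"] = result_dirs[row["sample"]]
        writer.writerow(row)
    return out.getvalue()
```

For the example table above, the mapping would be {"pbmc1": "/PATH/TO/RUN1/ScaleRna.out", "pbmc2": "/PATH/TO/RUN2/ScaleRna.out"}. Working on text rather than file paths keeps the helper easy to test; read the file with open(...).read() and write the result back when you are satisfied with it.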
Important
This function only combines cells or samples from different runs, not reads for the same set of cells. To combine multiple sequencing runs of the same library, you must merge FASTQ files and re-run the full pipeline to properly detect duplicate reads.
Additional Functionality and Notes
- Cell filtering: The reporting workflow can re-apply cell filtering (e.g., UMI thresholds, CellFinder) to existing data.
- Metric definitions: For details on barcode and cell metrics, see Barcode Metric Definitions and Cell Calling.
- Report content: HTML and CSV reports include mapping metrics, cell counts, sensitivity, and barcode distributions.
Related Documentation
- Analysis Parameters — How to set reporting and filtering options
- Outputs — Output file structure and descriptions
- QC Reports — Report content and interpretation
- Extended Throughput — Merging multiple runs
- Barcode Metric Definitions — Explanation of barcode metrics
- ScalePlex - Explanation of ScalePlex analysis and assignment
If you need more details on the reporting script's internal logic, see the reporting.py source code.
Need Help?
For more information, please contact support@scale.bio or visit our support website.