Dependency Management

To ensure reproducible analysis, the ScaleBio RNA workflow uses modern dependency management strategies. We recommend using containerized environments (Docker or Singularity/Apptainer) for most users. Conda is available as an alternative if containers are not supported. Manual installation is possible but not recommended.

Tip

If you're unsure, try -profile docker first on your local machine, or -profile singularity on a cluster.

Quickstart: Which Dependency Method Should I Use?

Environment	Recommended Profile	Requirements
Local/cloud	`-profile docker`	Docker installed
HPC/cluster	`-profile singularity`	Singularity/Apptainer installed
No containers	`-profile conda`	Conda installed
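
The table above can be approximated in a small shell helper. This is only a sketch: `pick_profile` is a hypothetical function name, not part of the workflow, and it simply checks which engine is on `$PATH` in the order the table lists them. On a system with several engines installed you may still want to choose by hand.

```shell
# Hypothetical helper: print the profile name matching the first
# supported engine found on $PATH (docker, then singularity/apptainer,
# then conda). Returns non-zero if none is found.
pick_profile() {
  if command -v docker >/dev/null 2>&1; then
    echo docker
  elif command -v singularity >/dev/null 2>&1 ||
       command -v apptainer >/dev/null 2>&1; then
    echo singularity
  elif command -v conda >/dev/null 2>&1; then
    echo conda
  else
    echo "No supported engine (docker, singularity/apptainer, conda) found" >&2
    return 1
  fi
}
```

The printed value could then be passed to Nextflow as `-profile "$(pick_profile)"`.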

1. Using Containers (Recommended)

If your system supports Docker containers, this is the recommended way to handle all dependencies for the ScaleBio RNA workflow. We provide pre-built Docker containers, and the workflow is set up to use them automatically. Enable this by adding -profile docker to the Nextflow command line.

If your system does not support Docker, Singularity (version 2.3.x or newer; recent releases are named Apptainer) is an alternative that is available on many HPC clusters. Setting -profile singularity uses the Singularity engine for all dependencies. The environment variable NXF_SINGULARITY_CACHEDIR controls where Singularity images are stored; it should point to a writable location that is available on all compute nodes. Similarly, if the default /tmp is not writable from inside the container, set TMPDIR to a location that is.
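
As a minimal sketch of the two environment variables described above: `SHARED_DIR` is a hypothetical variable introduced here for illustration, and its default (a directory under the current working directory) is only a placeholder. On a cluster, point it at storage that every compute node can write to.

```shell
# SHARED_DIR is an assumption for this example, not a workflow setting.
# Replace the default with a location visible on all compute nodes.
SHARED_DIR="${SHARED_DIR:-$PWD/shared}"

# Where Nextflow stores pulled Singularity images.
export NXF_SINGULARITY_CACHEDIR="$SHARED_DIR/singularity-cache"

# Temporary directory used instead of the default /tmp.
export TMPDIR="$SHARED_DIR/tmp"

# Create both directories and warn if either is not writable.
mkdir -p "$NXF_SINGULARITY_CACHEDIR" "$TMPDIR"
for dir in "$NXF_SINGULARITY_CACHEDIR" "$TMPDIR"; do
  [ -w "$dir" ] || echo "Warning: $dir is not writable" >&2
done
```

These lines would typically go in the job submission script or shell profile, before Nextflow is started with -profile singularity.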

See Nextflow Containers for details and additional configuration options.

All input and output paths must be available (bind-mounted) inside the containers. For Docker, Nextflow sets the relevant options automatically at runtime; for Singularity, this requires user mounts to be enabled in the system-wide configuration (see the notes in the Nextflow Singularity documentation).

Tip

For both Docker and Singularity, Nextflow will automatically pull and use the correct images. You do not need to install any workflow tools manually.

2. Using Conda

Another option is using the Conda package manager. Nextflow can automatically create conda environments with most dependencies. This mode is selected by setting -profile conda.

Prerequisites

Install and update conda: `conda update -n base -c defaults conda`
Install ScaleBio Tools: `bash /PATH/TO/ScaleRNA/envs/download-scale-tools.sh`
BCL Convert (if using runFolder input): If running from a sequencer runFolder (BCL files), Illumina BCL Convert must be installed and available on $PATH
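
Before launching the workflow, the prerequisites above can be checked with a short helper. This is a sketch under assumptions: `check_tools` is a hypothetical function name, and the executable name `bcl-convert` used in the note below is assumed rather than taken from this document.

```shell
# Hypothetical helper: report whether each named executable is on $PATH.
# Returns non-zero if at least one tool is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found:   $tool"
    else
      echo "missing: $tool" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

For example, `check_tools conda bcl-convert` would report both prerequisites when starting from a runFolder.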

Troubleshooting Conda Installation

If Nextflow reports an error while installing packages with -profile conda, try the more verbose YAML files, which contain comprehensive package lists: envs/scalerna_verbose.conda.yml and envs/scalernareport_verbose.conda.yml.

To use these verbose files, edit the nextflow.config and replace the process.conda section:

Change process.conda = "$projectDir/envs/scalerna.conda.yml" to process.conda = "$projectDir/envs/scalerna_verbose.conda.yml"
Change conda = "$projectDir/envs/scalernareport.conda.yml" to conda = "$projectDir/envs/scalernareport_verbose.conda.yml"
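
The two changes above can also be scripted. The following is a sketch, not a supported tool: `use_verbose_conda_envs` is a hypothetical function name, and it assumes the two file names appear in nextflow.config exactly as quoted above. A backup of the original file is kept with a `.bak` suffix.

```shell
# Hypothetical helper: point a nextflow.config at the verbose conda
# YAML files. The original file is preserved as <config>.bak.
use_verbose_conda_envs() {
  config="$1"
  sed -i.bak \
    -e 's|envs/scalerna\.conda\.yml|envs/scalerna_verbose.conda.yml|' \
    -e 's|envs/scalernareport\.conda\.yml|envs/scalernareport_verbose.conda.yml|' \
    "$config"
}
```

It would be called as `use_verbose_conda_envs /PATH/TO/ScaleRNA/nextflow.config`; restoring the `.bak` file reverts the change.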

Important

Clean Base Environment: Automatic installation works best if the base conda environment is clean. Avoid extra channels or complex packages in the base environment.

See the Nextflow documentation for additional details on Conda support in Nextflow.

3. Manual Installation (Advanced)

Not recommended unless you have special requirements.
All required tools must be installed and available on your $PATH.
See envs/scalerna.conda.yml and envs/scalernareport.conda.yml for a full list of dependencies.
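
To get a quick overview of what a manual installation has to cover, the entries under `dependencies:` in those files can be printed with a small helper. This is a rough sketch that assumes the usual conda environment file layout (a top-level `dependencies:` key followed by `- package` items); `list_conda_deps` is a hypothetical name, and the helper is not a full YAML parser.

```shell
# Hypothetical helper: print the list items found under the top-level
# "dependencies:" key of a conda environment YAML file.
list_conda_deps() {
  awk '
    /^dependencies:/ { in_deps = 1; next }   # start of the section
    /^[^ #-]/        { in_deps = 0 }         # next top-level key ends it
    in_deps && /^ *- / {
      sub(/^ *- */, "")                      # strip the list marker
      print
    }
  ' "$1"
}
```

For example, `list_conda_deps /PATH/TO/ScaleRNA/envs/scalerna.conda.yml` would print one dependency per line.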

4. Reference and Further Reading

Need Help?

For more information, please contact support@scale.bio or visit our support website.