Dependency Management
To ensure reproducible analysis, the ScaleBio RNA workflow uses modern dependency management strategies. We recommend using containerized environments (Docker or Singularity/Apptainer) for most users. Conda is available as an alternative if containers are not supported. Manual installation is possible but not recommended.
Tip
If you're unsure, try -profile docker first on your local machine, or -profile singularity on a cluster.
Quickstart: Which Dependency Method Should I Use?
Environment | Recommended Profile | Requirements |
---|---|---|
Local/cloud | -profile docker | Docker installed |
HPC/cluster | -profile singularity | Singularity/Apptainer installed |
No containers | -profile conda | Conda installed |
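The table above can be turned into a quick check of what is installed on the current machine. The following is only a sketch: the function name suggest_profile is hypothetical and not part of the workflow, and it assumes the usual executable names docker, singularity, apptainer, and conda. It prefers Docker when present, which may not be the right choice on a shared cluster.

```shell
# Hypothetical helper (not part of the ScaleBio workflow): print the profile
# name that matches the first supported tool found on PATH.
suggest_profile() {
    if command -v docker >/dev/null 2>&1; then
        echo docker
    elif command -v singularity >/dev/null 2>&1 || command -v apptainer >/dev/null 2>&1; then
        # Apptainer installs are used through the singularity profile
        echo singularity
    elif command -v conda >/dev/null 2>&1; then
        echo conda
    else
        # Nothing found: manual installation would be the only option left
        echo none
    fi
}

# Print the suggestion for this machine
suggest_profile
```

The printed name can then be passed to Nextflow as -profile NAME.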
1. Using Containers (Recommended)
If your system supports Docker containers, this is the recommended way to handle all dependencies for the ScaleBio RNA workflow. We provide pre-built Docker containers, and the workflow is set up to use them automatically.
This is enabled by adding -profile docker to the nextflow command line.
If your system does not support Docker, Singularity (version 2.3.x or newer; recent versions are called Apptainer) is an alternative that is available on many HPC clusters. Setting -profile singularity will use the Singularity engine for all dependencies. The environment variable NXF_SINGULARITY_CACHEDIR controls where Singularity images are stored; it should point to a writable location that is available on all compute nodes. Similarly, TMPDIR should be changed from the default /tmp to a location writable from the container if necessary.
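As an illustration of the two environment variables described above, the sketch below creates both directories under a shared base path and exports the variables before Nextflow is launched. The function name setup_singularity_dirs and the subdirectory names are assumptions made for this example; choose a base path that is visible on all compute nodes of your cluster.

```shell
# Hypothetical helper: prepare a Singularity image cache and a temp directory
# under a shared base path ($1), then export the variables Nextflow reads.
setup_singularity_dirs() {
    # Create both locations; stop if the base path cannot be used
    mkdir -p "$1/singularity_cache" "$1/tmp" || return 1
    # Fail early if either location is not writable by the current user
    [ -w "$1/singularity_cache" ] && [ -w "$1/tmp" ] || return 1
    export NXF_SINGULARITY_CACHEDIR="$1/singularity_cache"
    export TMPDIR="$1/tmp"
}
```

For example, calling setup_singularity_dirs with a shared scratch directory in the same shell session (or job script) that later runs nextflow with -profile singularity makes both settings visible to the workflow.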
See Nextflow Containers for details and additional configuration options.
One important point is that all input and output paths need to be available (bound) inside the containers. For Docker, Nextflow sets the relevant options automatically at runtime; for Singularity, this requires user mounts to be enabled in the system-wide configuration (see the notes in the Nextflow Singularity documentation).
Tip
For both Docker and Singularity, Nextflow will automatically pull and use the correct images. You do not need to install any workflow tools manually.
2. Using Conda
Another option is the Conda package manager. Nextflow can automatically create Conda environments with most dependencies. This mode is selected by setting -profile conda.
Prerequisites
- Install and update conda:
  conda update -n base -c defaults conda
- Install ScaleBio Tools:
  bash /PATH/TO/ScaleRNA/envs/download-scale-tools.sh
- BCL Convert (if using runFolder input): If running from a sequencer runFolder (.bcls), Illumina BCL Convert must be installed and available on $PATH.
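Before launching with -profile conda, it can help to confirm that the prerequisites above are actually on $PATH. The sketch below is a hypothetical pre-flight check, not something shipped with the workflow. It assumes the BCL Convert executable is named bcl-convert, and it does not verify the ScaleBio tools installed by download-scale-tools.sh.

```shell
# Hypothetical pre-flight check for the conda profile.
# Pass "yes" as the first argument when starting from a sequencer runFolder.
check_conda_prereqs() {
    missing=""
    command -v conda >/dev/null 2>&1 || missing="$missing conda"
    if [ "${1:-no}" = "yes" ]; then
        # Assumed executable name for Illumina BCL Convert
        command -v bcl-convert >/dev/null 2>&1 || missing="$missing bcl-convert"
    fi
    if [ -n "$missing" ]; then
        echo "Missing from PATH:$missing" >&2
        return 1
    fi
    echo "All checked prerequisites found"
}
```

Calling check_conda_prereqs yes covers the runFolder case; with no argument, only conda is checked.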
Troubleshooting Conda Installation
If Nextflow throws an error installing packages while using -profile conda, more verbose YAML files with comprehensive package lists are available in the envs directory.
To use these verbose files, edit nextflow.config and replace the conda settings:
- Change process.conda = "$projectDir/envs/scalerna.conda.yml" to process.conda = "$projectDir/envs/scalerna_verbose.conda.yml"
- Change conda = "$projectDir/envs/scalernareport.conda.yml" to conda = "$projectDir/envs/scalernareport_verbose.conda.yml"
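The two changes above can also be applied with sed rather than by hand. This is a minimal sketch under the assumption that the file names appear in nextflow.config exactly as quoted above. The function name use_verbose_envs is hypothetical, and a backup copy is written next to the config so the change can be reverted.

```shell
# Hypothetical helper: switch a nextflow.config ($1) to the verbose conda
# environment files, keeping the original as <config>.bak.
use_verbose_envs() {
    cp "$1" "$1.bak" || return 1
    # Each pattern matches only the non-verbose file name, so running the
    # function a second time leaves the paths unchanged
    sed -e 's#envs/scalerna\.conda\.yml#envs/scalerna_verbose.conda.yml#' \
        -e 's#envs/scalernareport\.conda\.yml#envs/scalernareport_verbose.conda.yml#' \
        "$1.bak" > "$1"
}
```

For example, use_verbose_envs /PATH/TO/ScaleRNA/nextflow.config rewrites the config in place; note that a second run would overwrite the backup with the already-edited file.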
Important
Clean Base Environment: Automatic installation works best if the base conda environment is clean. Avoid extra channels or complex packages in the base environment.
See the Nextflow documentation for additional details on Conda support in Nextflow.
3. Manual Installation (Advanced)
- Not recommended unless you have special requirements.
- All required tools must be installed and available on your $PATH.
- See envs/scalerna.conda.yml and envs/scalernareport.conda.yml for a full list of dependencies.
4. Reference and Further Reading
- Nextflow: Managing Dependencies
- Nextflow: Using Containers
- nf-core: Dependency Installation
- For ScaleBio tools, see Pipeline Steps.
Need Help?
For more information, please contact support@scale.bio or visit our support website.