Essentials
Running Information
After the pipeline is invoked, Nextflow reports progress to stdout, with each row representing a process.
N E X T F L O W ~ version 23.10.1
Launching `/thunderData/pipeline/starscope/scRNA-seq/main.nf` [adoring_ekeblad] DSL2 - revision: 8e27902b23
executor > slurm (9)
[e0/1d00d4] process > scRNAseq:CAT_FASTQ (human_test) [100%] 1 of 1 ✔
[37/8c0795] process > scRNAseq:TRIM_FASTQ (human_test) [100%] 1 of 1 ✔
[20/1edf9b] process > scRNAseq:MULTIQC (human_test) [100%] 1 of 1 ✔
[5a/e0becc] process > scRNAseq:STARSOLO (human_test) [100%] 1 of 1 ✔
[02/15a3b1] process > scRNAseq:CHECK_SATURATION (human_test) [100%] 1 of 1 ✔
[09/e25428] process > scRNAseq:GET_VERSIONS (get_versions) [100%] 1 of 1 ✔
[48/703c20] process > scRNAseq:FEATURESTATS (human_test) [100%] 1 of 1 ✔
[79/cd2784] process > scRNAseq:GENECOVERAGE (human_test) [100%] 1 of 1 ✔
[e6/808adf] process > scRNAseq:REPORT (human_test) [100%] 1 of 1 ✔
Completed at: 09-May-2024 09:07:55
Duration : 25m 9s
CPU hours : 3.7
Succeeded : 9
When an error occurs, Nextflow stops the run and prints the error message directly to stderr.
The error message can also be found in the run log file, .nextflow.log.
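Because the log file is verbose, it can help to filter it down to warnings and errors. The sketch below assumes the log line layout seen in .nextflow.log (timestamp, thread name in brackets, then the level); log_problems is a hypothetical helper name, not part of Nextflow or StarScope.

```shell
# Hypothetical helper: print only WARN and ERROR entries from a Nextflow log.
# Assumes lines of the form:
#   May-09 08:42:37.523 [main] DEBUG nextflow.cli.Launcher - ...
log_problems() {
    # The level follows the closing bracket of the thread name.
    # Defaults to .nextflow.log in the current directory.
    grep -E '\] (WARN|ERROR) ' "${1:-.nextflow.log}"
}
```

Running log_problems in the project running directory prints the matching lines; grep exits with a non-zero status when there are none. Stack traces that follow an ERROR entry are not matched, so the full log is still worth opening once the failing entry is located. For orientation, the first lines of the log from the run above are: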
$ head .nextflow.log
May-09 08:42:37.523 [main] DEBUG nextflow.cli.Launcher - $> nextflow run /thunderData/pipeline/starscope/scRNA-seq -c /thunderData/pipeline/nf_scRNAseq_config/latest/thunderbio_human_config --input sampleList.csv
May-09 08:42:37.924 [main] INFO nextflow.cli.CmdRun - N E X T F L O W ~ version 23.10.1
May-09 08:42:38.096 [main] DEBUG nextflow.plugin.PluginsFacade - Setting up plugin manager > mode=prod; embedded=false; plugins-dir=/home/xzx/.nextflow/plugins; core-plugins: nf-amazon@2.1.4,nf-azure@1.3.3,nf-cloudcache@0.3.0,nf-codecommit@0.1.5,nf-console@1.0.6,nf-ga4gh@1.1.0,nf-google@1.8.3,nf-tower@1.6.3,nf-wave@1.0.1
May-09 08:42:38.147 [main] INFO o.pf4j.DefaultPluginStatusProvider - Enabled plugins: []
May-09 08:42:38.150 [main] INFO o.pf4j.DefaultPluginStatusProvider - Disabled plugins: []
May-09 08:42:38.163 [main] INFO org.pf4j.DefaultPluginManager - PF4J version 3.4.1 in 'deployment' mode
May-09 08:42:38.234 [main] INFO org.pf4j.AbstractPluginManager - No plugins
May-09 08:42:42.225 [main] DEBUG nextflow.config.ConfigBuilder - Found config base: /thunderData/pipeline/starscope/scRNA-seq/nextflow.config
May-09 08:42:42.231 [main] DEBUG nextflow.config.ConfigBuilder - User config file: /thunderData/pipeline/nf_scRNAseq_config/latest/thunderbio_human_config_v2
May-09 08:42:42.233 [main] DEBUG nextflow.config.ConfigBuilder - Parsing config file: /thunderData/pipeline/starscope/scRNA-seq/nextflow.config
Nextflow Log CLI
After each invocation, information about the run can be retrieved with the nextflow log
command. Its output lists the RUN NAME, STATUS and SESSION ID of each run.
$ nextflow log
TIMESTAMP DURATION RUN NAME STATUS REVISION ID SESSION ID COMMAND
2024-05-09 08:42:44 25m 12s adoring_ekeblad OK 8e27902b23 8670925f-ce5a-4f7a-b327-a98b288e6aa6 nextflow run /thunderData/pipeline/starscope/scRNA-seq -c /thunderData/pipeline/nf_scRNAseq_config/latest/thunderbio_human_config --input sampleList.csv
Work Dir and Intermediate Files
Each task of a process runs in its own sub-directory of the workDir set in
the Nextflow configuration file. By default, StarScope sets this to the work folder
under the project running directory. To find a task's working directory,
first look up its task hash with the command below, where adoring_ekeblad is
the RUN NAME from the nextflow log output.
$ nextflow log adoring_ekeblad -f hash,name,exit,status
e0/1d00d4 scRNAseq:CAT_FASTQ (human_test) 0 COMPLETED
09/e25428 scRNAseq:GET_VERSIONS (get_versions) 0 COMPLETED
37/8c0795 scRNAseq:TRIM_FASTQ (human_test) 0 COMPLETED
20/1edf9b scRNAseq:MULTIQC (human_test) 0 COMPLETED
5a/e0becc scRNAseq:STARSOLO (human_test) 0 COMPLETED
79/cd2784 scRNAseq:GENECOVERAGE (human_test) 0 COMPLETED
48/703c20 scRNAseq:FEATURESTATS (human_test) 0 COMPLETED
02/15a3b1 scRNAseq:CHECK_SATURATION (human_test) 0 COMPLETED
e6/808adf scRNAseq:REPORT (human_test) 0 COMPLETED
To check the working directory of the CAT_FASTQ task, use its hash (e0/1d00d4) to
locate the folder in work:
$ ls -a work/e0/1d00d49d7d562790a4d4f5993852ba/
. .command.begin .command.log .command.run .command.trace human_test_1.merged.fq.gz human_test.R1.fq.gz
.. .command.err .command.out .command.sh .exitcode human_test_2.merged.fq.gz human_test.R2.fq.gz
The work directory always contains several important hidden files:
.command.out: STDOUT from tool.
.command.err: STDERR from tool.
.command.log: contains both STDOUT and STDERR from tool.
.command.begin: created as soon as the job launches.
.exitcode: created when the job ends, with exit code.
.command.trace: logs of compute resource usage.
.command.run: wrapper script used to run the job.
.command.sh: process command used for this task.
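When a task fails, the hidden files most often consulted are .exitcode and .command.err. The following is a minimal sketch of a helper that expands the short task hash from nextflow log into the full task directory and prints both. The name task_summary is hypothetical (it is not part of StarScope or Nextflow), and the default of ./work assumes the StarScope default workDir described above.

```shell
# Hypothetical helper: summarize one task given its short hash (e.g. e0/1d00d4).
task_summary() {
    local hash="$1"            # short task hash as printed by `nextflow log`
    local work="${2:-work}"    # workDir; assumed to be ./work unless given
    local dir="" d
    # The short hash is a prefix of the real directory name, so expand it with a glob.
    for d in "$work/$hash"*; do
        if [ -d "$d" ]; then
            dir="$d"
            break
        fi
    done
    if [ -z "$dir" ]; then
        echo "no task directory matches $work/$hash*" >&2
        return 1
    fi
    echo "task dir : $dir"
    # .exitcode only exists once the job has ended.
    if [ -f "$dir/.exitcode" ]; then
        echo "exit code: $(cat "$dir/.exitcode")"
    else
        echo "exit code: (not finished)"
    fi
    # Show the last lines of STDERR, if the tool wrote any.
    if [ -s "$dir/.command.err" ]; then
        echo "--- last lines of .command.err ---"
        tail -n 20 "$dir/.command.err"
    fi
}
```

For example, task_summary e0/1d00d4, run from the project running directory, would report on the CAT_FASTQ task listed above. The exact command that the task executed is kept in .command.sh: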
$ cat work/e0/1d00d49d7d562790a4d4f5993852ba/.command.sh
#!/bin/bash -ue
ln -s human_test.R1.fq.gz human_test_1.merged.fq.gz
ln -s human_test.R2.fq.gz human_test_2.merged.fq.gz
Running in Background
The Nextflow pipeline can be executed in the background with the -bg option:
starscope gex --input sampleList.csv --config custom_config -bg
Resume Previous Run
One of the core features of Nextflow is the ability to cache task executions and re-use them in subsequent runs to minimize duplicate work. Resumability is useful both for recovering from errors and for iteratively developing a pipeline. It is similar to checkpointing, a common practice used by HPC applications.
To resume a previous run, enter the project running directory and use the command below:
starscope gex --input sampleList.csv --config custom_config -bg -resume
Or resume a specific run by its session ID (shown in the nextflow log output):
starscope gex --input sampleList.csv --config custom_config -bg -resume 8670925f-ce5a-4f7a-b327-a98b288e6aa6
Additional resources: