Combine outputs of mutually exclusive processes in a Nextflow (DSL2) pipeline

I have a DSL2 workflow in Nextflow set up like this:
nextflow.enable.dsl=2
// process 1, mutually exclusive with process 2 below
process bcl {
    tag "bcl2fastq"
    publishDir params.outdir, mode: 'copy', pattern: 'fastq/**fastq.gz'
    publishDir params.outdir, mode: 'copy', pattern: 'fastq/Stats/*'
    publishDir params.outdir, mode: 'copy', pattern: 'InterOp/*'
    publishDir params.outdir, mode: 'copy', pattern: 'Run*.xml'
    beforeScript 'export PATH=/opt/tools/bcl2fastq/bin:$PATH'

    input:
    path runfolder
    path samplesheet

    output:
    path 'fastq/Stats/', emit: bcl_ch
    path 'fastq/**fastq.gz', emit: fastqc_ch
    path 'InterOp/*', emit: interop_ch
    path 'Run*.xml'

    script:
    // processing omitted
}
// Process 2, note the slightly different outputs
process bcl_convert {
    tag "bcl-convert"
    publishDir params.outdir, mode: 'copy', pattern: 'fastq/**fastq.gz'
    publishDir params.outdir, mode: 'copy', pattern: 'fastq/Reports/*'
    publishDir params.outdir, mode: 'copy', pattern: 'InterOp/*'
    publishDir params.outdir, mode: 'copy', pattern: 'Run*.xml'
    beforeScript 'export PATH=/opt/tools/bcl-convert/:$PATH'

    input:
    path runfolder
    path samplesheet

    output:
    path 'fastq/Reports/', emit: bcl_ch
    path 'fastq/**fastq.gz', emit: fastqc_ch
    path 'InterOp/', emit: interop_ch
    path 'Run*.xml'

    script:
    // processing omitted
}
// downstream process that needs either the first or the second to work, agnostic
process fastqc {
    cpus 12
    publishDir "${params.outdir}/", mode: "copy"
    module 'conda//anaconda3'
    conda '/opt/anaconda3/envs/tools/'

    input:
    path fastq_input

    output:
    path "fastqc", emit: fastqc_output

    script:
    """
    mkdir -p fastqc
    fastqc -t ${task.cpus} $fastq_input -o fastqc
    """
}
Now I have a variable, params.bcl_convert, which can be used to switch from one process to the other, and I set up the workflow like this:
workflow {
    runfolder_repaired = "${params.runfolder}".replaceFirst(/$/, "/")
    runfolder = Channel.fromPath(runfolder_repaired, type: 'dir')
    sample_data = Channel.fromPath(params.samplesheet, type: 'file')

    if (!params.bcl_convert) {
        bcl(runfolder, sample_data)
    } else {
        bcl_convert(runfolder, sample_data)
    }
    fastqc(bcl.out.mix(bcl_convert.out)) // Problematic line
}
The problem lies in the problematic line: I'm not sure how (and whether it is even possible) to have fastqc take the output of bcl or bcl_convert (but only the fastqc_ch output, not the rest) regardless of which process generated it.
Some of the things I've tried include (inspired by https://github.com/nextflow-io/nextflow/issues/1646, though that one uses the output of a process):
if (!params.bcl_convert) {
    def bcl_out = bcl(runfolder, sample_data).out
} else {
    def bcl_out = bcl_convert(runfolder, sample_data).out
}
fastqc(bcl_out.fastqc_ch)
But then compilation fails with Variable "runfolder" already defined in the process scope, even when using the approach in a similar way to that post:
def result_bcl2fastq = !params.bcl_convert ? bcl(runfolder, sample_data) : Channel.empty()
def result_bclconvert = params.bcl_convert ? bcl_convert(runfolder, sample_data) : Channel.empty()
I thought about using conditionals in a single process script; however, the outputs of the two processes differ, so that's not really possible.
The only way I got it to work is by duplicating all outputs, like:
if (!params.bcl_convert) {
    bcl(runfolder, sample_data)
    fastqc(bcl.out.fastqc_ch)
} else {
    bcl_convert(runfolder, sample_data)
    fastqc(bcl_convert.out.fastqc_ch)
}
However, this looks to me like unnecessary duplication. Is what I want to do actually possible?

I was able to figure this out with a lot of trial and error.
Assigning the result of a process invocation to a variable acts like the .out property of that process. So I assign the same variable in both branches (the two processes declare the same emit names, as seen in the question) and then access the named outputs directly on that variable, without .out:
workflow {
    runfolder_repaired = "${params.runfolder}".replaceFirst(/$/, "/")
    runfolder = Channel.fromPath(runfolder_repaired, type: 'dir')
    sample_data = Channel.fromPath(params.samplesheet, type: 'file')

    if (!params.bcl_convert) {
        bcl_out = bcl(runfolder, sample_data)
    } else {
        bcl_out = bcl_convert(runfolder, sample_data)
    }
    fastqc(bcl_out.fastqc_ch)
}
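The same idea can be written more compactly with a ternary expression. A minimal sketch, assuming the process and parameter names from the question:

workflow {
    runfolder = Channel.fromPath("${params.runfolder}", type: 'dir')
    sample_data = Channel.fromPath(params.samplesheet, type: 'file')

    // Only one branch ever runs; both processes expose the same emit
    // names, so downstream processes don't care which one produced them.
    bcl_out = params.bcl_convert
        ? bcl_convert(runfolder, sample_data)
        : bcl(runfolder, sample_data)
    fastqc(bcl_out.fastqc_ch)
}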

Related

Terraform: YAML file rendering issue in storage section of container linux config of flatcar OS

I am trying to generate a file by template rendering to pass to the user data of an EC2 instance. I am using a third-party Terraform provider to generate an Ignition file from the YAML.
data "ct_config" "worker" {
content = data.template_file.file.rendered
strict = true
pretty_print = true
}
data "template_file" "file" {
...
...
template = file("${path.module}/example.yml")
vars = {
script = file("${path.module}/script.sh")
}
}
example.yml
storage:
  files:
    - path: "/opt/bin/script"
      mode: 0755
      contents:
        inline: |
          ${script}
Error:
Error: Error unmarshaling yaml: yaml: line 187: could not find expected ':'

  on ../../modules/launch_template/launch_template.tf line 22, in data "ct_config" "worker":
  22: data "ct_config" "worker" {
If I change ${script} to sample data then it works. Also, no matter what I put in script.sh, I get the same error.
You want this outcome (pseudocode):
storage:
  files:
    - path: "/opt/bin/script"
      mode: 0755
      contents:
        inline: |
          {{content of script file}}
In your current implementation, every line of script.sh after the first is inserted without indentation, so a YAML decoder no longer reads it as part of the block scalar (the entire script.sh content) and the document fails to parse.
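For example, with a hypothetical two-line script.sh, the rendered document would look roughly like this; the unindented second line falls outside the block scalar and breaks the YAML:

storage:
  files:
    - path: "/opt/bin/script"
      mode: 0755
      contents:
        inline: |
          #!/bin/bash
echo "hello"   # not indented, so no longer part of the inline block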
Using indent you can correct the indentation, and using the newer templatefile function you get a slightly cleaner setup for the template:
data "ct_config" "worker" {
content = local.ct_config_content
strict = true
pretty_print = true
}
locals {
ct_config_content = templatefile("${path.module}/example.yml", {
script = indent(10, file("${path.module}/script.sh"))
})
}
For clarity, here is the example.yml template file (from the original question) to use with the code above:
storage:
  files:
    - path: "/opt/bin/script"
      mode: 0755
      contents:
        inline: |
          ${script}
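With indent(10, ...), the same hypothetical two-line script renders with every line at the block scalar's column (10 spaces, matching the template's nesting), which parses cleanly:

storage:
  files:
    - path: "/opt/bin/script"
      mode: 0755
      contents:
        inline: |
          #!/bin/bash
          echo "hello"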
I had this exact issue with ct_config, and figured it out today. You need to base64encode your script to ensure it's written correctly without newlines: without that, the newlines in your script make it through to CT, which then attempts to build an Ignition file that cannot contain them, causing the error you originally ran into.
Once encoded, you then just need to tell CT to !!binary the file to ensure Ignition correctly base64 decodes it on deploy:
data "template_file" "file" {
...
...
template = file("${path.module}/example.yml")
vars = {
script = base64encode(file("${path.module}/script.sh"))
}
}
storage:
  files:
    - path: "/opt/bin/script"
      mode: 0755
      contents:
        inline: !!binary |
          ${script}

Commander can't handle multiple command arguments

I have the following commander command with multiple arguments:
var program = require('commander');

program
  .command('rename <id> [name]')
  .action(function() {
    console.log(arguments);
  });

program.parse(process.argv);
Using the app yields the following result:
$ node app.js rename 1 "Hello"
{ '0': '1',
  '1':
   { commands: [],
     options: [],
     _execs: [],
     _args: [ [Object] ],
     _name: 'rename',
     parent:
      { commands: [Object],
        options: [],
        _execs: [],
        _args: [],
        _name: 'app',
        Command: [Function: Command],
        Option: [Function: Option],
        _events: [Object],
        rawArgs: [Object],
        args: [Object] } } }
As you can see, the action receives the first argument (<id>) and the program object, but doesn't receive the second argument: [name].
I've tried:
Making [name] a required argument.
Passing the name unquoted to the tool from the command line.
Simplifying my real app into the tiny reproducible program above.
Using a variadic argument for name (rename <id> [name...]), but this results in both 1 and Hello being assigned to the same array as the first parameter of action, defeating the purpose of having id.
What am I missing? Does commander only accept one argument per command (it doesn't look that way in the documentation)?
I think this was a bug in an old version of commander. This works now with commander#2.9.0.
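For reference, a minimal sketch of the same command with the positional action callback (parameter names assumed from the question); with commander 2.9.0 or later both values arrive:

var program = require('commander');

program
  .command('rename <id> [name]')
  .action(function(id, name) {
    // each declared argument is passed positionally to the callback
    console.log(id, name);
  });

program.parse(process.argv);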
I ran into the same problems, and decided to use Caporal instead.
Here's an example from their docs on Creating a command:
When writing complex programs, you'll likely want to manage multiple commands. Use the .command() method to specify them:
program
  // a first command
  .command("my-command", "Optional command description used in help")
  .argument(/* ... */)
  .action(/* ... */)

  // a second command
  .command("sec-command", "...")
  .option(/* ... */)
  .action(/* ... */)
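A filled-in sketch of the rename command from the question, assuming Caporal v1's (args, options, logger) action signature:

var prog = require('caporal');

prog
  .version('1.0.0')
  .command('rename', 'Rename an item')
  .argument('<id>', 'Id of the item')
  .argument('[name]', 'New name')
  .action(function(args, options, logger) {
    // named arguments arrive on the args object
    logger.info(args.id, args.name);
  });

prog.parse(process.argv);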

Clearing require cache

I am trying to delete a module from the cache, as suggested here.
In the documentation we read:
require.cache
Object
Modules are cached in this object when they are required. By deleting a key value from this object, the next require will reload the module.
So, I created a file named 1.js that contains a single line:
module.exports = 1;
Then I require it via node shell:
ionicabizau@laptop:~/Documents/test$ node
> require("./1")
1
> require.cache
{ '/home/ionicabizau/Documents/test/1.js':
   { id: '/home/ionicabizau/Documents/test/1.js',
     exports: 1,
     parent:
      { id: 'repl',
        exports: [Object],
        parent: undefined,
        filename: '/home/ionicabizau/Documents/test/repl',
        loaded: false,
        children: [Object],
        paths: [Object] },
     filename: '/home/ionicabizau/Documents/test/1.js',
     loaded: true,
     children: [],
     paths:
      [ '/home/ionicabizau/Documents/test/node_modules',
        '/home/ionicabizau/Documents/node_modules',
        '/home/ionicabizau/node_modules',
        '/home/node_modules',
        '/node_modules' ] } }
# edited file to export 2 (module.exports = 2;)
> require.cache = {}
{}
> require.cache
{}
> require("./1") // supposed to return 2
1
So, why does require("./1") return 1 when my file contains module.exports = 2 and the cache is cleared?
Doing some debugging I saw that there is a Module._cache object that is not cleared when I do require.cache = {}.
require.cache is just an exposed reference to the cache object; the property itself is not read back by the module system, so reassigning it does nothing. You need to iterate over its keys and actually delete them:
for (var i in require.cache) { delete require.cache[i] }
If you only need to clear specific files, you can filter i with a regular expression or whatever rule fits your case:
for (var i in require.cache) { if (i.startsWith('src/cache/')) delete require.cache[i] }
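A common way to package this up is a small helper that evicts one module and re-requires it. A minimal sketch, assuming CommonJS; requireUncached is a hypothetical name:

function requireUncached(modulePath) {
  // require.resolve returns the absolute filename that is used as the cache key
  delete require.cache[require.resolve(modulePath)];
  return require(modulePath);
}

var one = requireUncached('./1'); // re-reads 1.js from disk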

Grunt watch: compile only one file not all

I have Grunt set up to compile all of my coffee files into JavaScript while maintaining the folder structure, using dynamic_mappings, which works great.
coffee: {
  dynamic_mappings: {
    files: [{
      expand: true,
      cwd: 'assets/scripts/src/',
      src: '**/*.coffee',
      dest: 'assets/scripts/dest/',
      ext: '.js'
    }]
  }
}
What I would like to do is then use watch to compile any changed coffee file and still maintain folder structure. This works using the above task with this watch task:
watch: {
  coffeescript: {
    files: 'assets/scripts/src/**/*.coffee',
    tasks: ['coffee:dynamic_mappings']
  }
}
The problem is that when one file changes, it compiles the entire directory of coffee into JavaScript again; it would be great if it compiled only the single coffee file that changed. Is this naturally possible in Grunt, or is it a custom feature? The key is that it must maintain the folder structure, otherwise it would be easy.
We have custom watch scripts at work and I'm trying to sell them on Grunt, but I will need this feature to do it.
You can use something like the following Gruntfile. Whenever a CoffeeScript file changes, it updates the configuration for coffee:dynamic_mappings to only use the modified file as the src.
This example is a slightly modified version of the example in the grunt-contrib-watch readme.
Hope it helps!
var path = require("path");

var srcDir = 'assets/scripts/src/';
var destDir = 'assets/scripts/dest/';

module.exports = function(grunt) {
  grunt.initConfig({
    coffee: {
      dynamic_mappings: {
        files: [{
          expand: true,
          cwd: srcDir,
          src: '**/*.coffee',
          dest: destDir,
          ext: '.js'
        }]
      }
    },
    watch: {
      coffeescript: {
        files: 'assets/scripts/src/**/*.coffee',
        tasks: "coffee:dynamic_mappings",
        options: {
          spawn: false // important so that the task runs in the same context
        }
      }
    }
  });

  grunt.event.on('watch', function(action, filepath, target) {
    var coffeeConfig = grunt.config("coffee");
    // Update the files src to be the path to the modified file (relative to srcDir).
    coffeeConfig.dynamic_mappings.files[0].src = path.relative(srcDir, filepath);
    grunt.config("coffee", coffeeConfig);
  });

  grunt.loadNpmTasks("grunt-contrib-coffee");
  grunt.loadNpmTasks("grunt-contrib-watch");

  grunt.registerTask("default", ["coffee:dynamic_mappings", "watch:coffeescript"]);
};
Found a solution in an answer to a similar question: https://stackoverflow.com/a/19722900/1351350
Short answer: try https://github.com/tschaub/grunt-newer
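With grunt-newer, the existing configuration stays as it is; you prefix the task name with newer: so that only source files newer than their destinations are recompiled. A minimal sketch, assuming the config from the question:

grunt.loadNpmTasks('grunt-newer');

// in the watch config, run the coffee task through the newer: prefix
watch: {
  coffeescript: {
    files: 'assets/scripts/src/**/*.coffee',
    tasks: ['newer:coffee:dynamic_mappings']
  }
}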

Compile less files with grunt-contrib-less won't work

I'm using Grunt to build my web project. I installed the grunt-contrib-less package and added a task to my grunt.initConfig({..});
less: {
  options: {
    paths: ['js/base']
  },
  files: {
    'js/base/*.css': 'js/base/*.less'
  }
}
When I run the less target via grunt less, it finishes without errors but doesn't compile the less file to a css file.
Running "less:files" (less) task
Done, without errors.
I have installed the lessc package via node, too. Doing lessc <source> <dest> works fine.
Currently the files option points directly at one directory which contains a single less file for testing. Even if I write the full file name into the files option, nothing happens...
Later on I want to scan the whole js directory and compile all newly modified *.less files.
I have installed following versions:
grunt-cli v0.1.6
grunt v0.4.0
node v0.8.7
npm 1.1.49
BR,
mybecks
The glob pattern js/base/*.css does not match any files, therefore there is no destination. Usually, tasks like this expect multiple inputs that get combined into a single output. Also bear in mind that less is a multi-task; putting files as a direct child of less does not do what you expect (it is treated as a target, not a src/dest map).
If you want a 1-to-1 transform of .less into .css, you can use dynamic expansion (or you can define each src/dest pair manually, but who wants to do that?).
In your case:
less: {
  options: {
    paths: ['js/base']
  },
  // target name
  src: {
    // no need for "files"; the config below should work
    expand: true,
    cwd: "js/base",
    src: "*.less",
    ext: ".css"
  }
}
I used Anthony's solution but still had an error:
Warning: Object true has no method indexOf
If I changed the order, putting expand: true second, it gave me the error
Unable to read "less" file
where "less" was the value of the first item in my list.
I solved it by changing files into an array like this:
less: {
  options: {
    paths: ["js/base"]
  },
  files: [{
    expand: true,
    cwd: "js/base",
    src: ["**/*.less"],
    dest: "js/base",
    ext: ".css"
  }]
},
I used "grunt-contrib-less" : "^0.11.0"
This works for me, modified to reflect this scenario:
less: {
  options: {
    paths: ["js/base"]
  },
  files: {
    expand: true,
    cwd: "js/base",
    src: ["**/*.less"],
    dest: "js/base",
    ext: ".css"
  }
},
