VCR BadAlias error when used with Cucumber tag - cucumber

I have two feature steps:
@vcr
Given A
@vcr
Given B
and their definitions:
Given /^A$/ do
  a_method_that_makes_a_request
end

Given /^B$/ do
  a_method_that_makes_a_request
end
This fails with:
Unknown alias: 70305756847740 (Psych::BadAlias)
The number changes on each run. But when I did this:
# Feature step
Given B

# Step definition
Given /^B$/ do
  VCR.use_cassette 'a_cassette' do
    a_method_that_makes_a_request
  end
end
It works. Can I avoid this patch and use the @vcr tag instead?
This is my config:
# features/support/vcr_setup.rb
require 'vcr'

VCR.configure do |c|
  # c.allow_http_connections_when_no_cassette = true
  c.cassette_library_dir = 'spec/fixtures/cassettes'
  c.hook_into :webmock
  c.ignore_localhost = true

  log_path = File.expand_path('../../../log/vcr.log', __FILE__)
  c.debug_logger = File.open(log_path, 'w')
end
VCR.cucumber_tags do |t|
  t.tag '@localhost_request' # uses default record mode since no options are given
  t.tags '@disallowed_1', '@disallowed_2', :record => :none
  t.tag '@vcr', :use_scenario_name => true, record: :new_episodes
end
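In the meantime, a tagged Around hook is one way to keep the behaviour tag-driven without wrapping every step definition (a sketch, not VCR's built-in tag integration; the cassette name here is just derived from the scenario name):

# features/support/vcr_around_hook.rb (hypothetical file name)
Around('@vcr') do |scenario, block|
  cassette_name = scenario.name.downcase.gsub(/\W+/, '_')
  VCR.use_cassette(cassette_name, record: :new_episodes) do
    block.call
  end
end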

Related

snakemake - replacing wildcards in input directive by anonymous function

I am writing a Snakemake workflow that will run a bioinformatics pipeline for several input samples. The input files (two for each analysis, one with the partial string match R1 and the second with the partial string match R2) start with a pattern and end with the extension .fastq.gz. Eventually I want to perform multiple operations, but for this example I just want to align the fastq reads against a reference genome using bwa mem. So for this example my input file is NIPT-N2002394-LL_S19_R1_001.fastq.gz and I want to generate NIPT-N2002394-LL.bam (see the code below specifying the directories where the input and output are).
My config.yaml file looks like so:
# Run_ID
run: "200311_A00154_0454_AHHHKMDRXX"
# Base directory: the analysis directory from which I will fetch the samples
bd: "/nexusb/nipt/"
# Define the prefix
# will be used to subset the folders in bd
prefix: "NIPT"
# Reference:
ref: "/nexus/bhinckel/19/ONT_projects/PGD_breakpoint/ref_hg19_local/hg19_chr1-y.fasta"
And below is my Snakefile:
import os
import re
#############
# config file
#############
configfile: "config.yaml"
#######################################
# Parsing variables from config.yaml
#######################################
RUN = config['run']
BD = config['bd']
PREFIX = config['prefix']
FQDIR = f'/nexusb/Novaseq/{RUN}/Unaligned/'
BASEDIR = BD + RUN
SAMPLES = [sample for sample in os.listdir(BASEDIR) if sample.startswith(PREFIX)]
# explanation: in BASEDIR I have multiple subdirectories. The names of the subdirectories starting with PREFIX will be the name of the elements I want to have in the list SAMPLES, which eventually shall be my {sample} wildcard
#############
# RULES
#############
rule all:
    input:
        expand("aligned/{sample}.bam", sample = SAMPLES)

rule bwa_map:
    input:
        REF = config['ref'],
        R1 = FQDIR + "{sample}_S{s}_R1_001.fastq.gz",
        R2 = FQDIR + "{sample}_S{s}_R2_001.fastq.gz"
    output:
        "aligned/{sample}.bam"
    shell:
        "bwa mem {input.REF} {input.R1} {input.R2} | samtools view -Sb - > {output}"
But I am getting:
Building DAG of jobs...
WildcardError in line 55 of /nexusb/nipt/200311_A00154_0454_AHHHKMDRXX/testMetrics/snakemake/Snakefile:
Wildcards in input files cannot be determined from output files:
's'
when calling snakemake -np.
I believe my error lies in the definitions of R1 and R2 in the input directive. I find it puzzling because according to the official documentation snakemake should interpret any wildcard as the regex .+. But it is not doing that for sample NIPT-PearlPPlasma-05-PPx, whose R1 and R2 should be NIPT-PearlPPlasma-05-PPx_S5_R1_001.fastq.gz and NIPT-PearlPPlasma-05-PPx_S5_R2_001.fastq.gz, respectively.
Take another look at the Snakemake tutorial on how input is inferred from output; in any case, I think the problem lies in this piece of code:
output:
    expand("aligned/{sample}.bam", sample = SAMPLES)
which needs to be changed into:
output:
    "aligned/{sample}.bam"
What you had didn't work because expand("aligned/{sample}.bam", sample = SAMPLES) is evaluated up front into a list like ["aligned/sample0.bam", "aligned/sample1.bam"]. When you remove the expand, you only give a "description" of what the output should look like, and thus Snakemake can infer the wildcards and the input.
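To make that concrete, expand is resolved as soon as the Snakefile is parsed (the sample names below are made up):

# expand() evaluates immediately to a plain list of file names
expand("aligned/{sample}.bam", sample=["sample0", "sample1"])
# -> ["aligned/sample0.bam", "aligned/sample1.bam"]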
edit:
It's difficult to test since I don't have the actual files, but you should do something like this. It won't work if multiple S-thingies exist.
def get_reads(wildcards):
    R1 = FQDIR + f"{wildcards.sample}_S{{s}}_R1_001.fastq.gz"
    R2 = FQDIR + f"{wildcards.sample}_S{{s}}_R2_001.fastq.gz"
    globbed = glob_wildcards(R1)
    R1, R2 = expand([R1, R2], s=globbed.s)
    return {"R1": R1, "R2": R2}

rule bwa_map:
    input:
        unpack(get_reads),
        REF = config['ref']
    output:
        "aligned/{sample}.bam"
    shell:
        "bwa mem {input.REF} {input.R1} {input.R2} | samtools view -Sb - > {output}"
The problem is here:
rule bwa_map:
    input:
        REF = config['ref'],
        R1 = FQDIR + "{sample}_S{s}_R1_001.fastq.gz",
        R2 = FQDIR + "{sample}_S{s}_R2_001.fastq.gz"
    output:
        "aligned/{sample}.bam"
Your output clearly defines a pattern where {sample} is a wildcard. When Snakemake builds the DAG and finds that another rule requires a file matching this pattern, it sets a concrete value for the sample wildcard. At that moment all the inputs must be defined, but you are introducing one more level of indirection: the wildcard {s}, which is not defined.
The value of {s} must be inferable from the output. If you can determine it at design time, substitute it with concrete values; otherwise you may use the checkpoint feature of Snakemake.
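A minimal sketch of the "substitute at design time" option, assuming exactly one S-number per sample and the FQDIR layout from the question; glob_wildcards is used to look the number up while the Snakefile is parsed:

# map each sample to its S-number once, when the Snakefile is parsed
SAMPLE_TO_S = {
    sample: glob_wildcards(FQDIR + sample + "_S{s}_R1_001.fastq.gz").s[0]
    for sample in SAMPLES
}

rule bwa_map:
    input:
        REF = config['ref'],
        R1 = lambda wc: FQDIR + f"{wc.sample}_S{SAMPLE_TO_S[wc.sample]}_R1_001.fastq.gz",
        R2 = lambda wc: FQDIR + f"{wc.sample}_S{SAMPLE_TO_S[wc.sample]}_R2_001.fastq.gz"
    output:
        "aligned/{sample}.bam"
    shell:
        "bwa mem {input.REF} {input.R1} {input.R2} | samtools view -Sb - > {output}"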

Python enum set as parameter in a function

I am looking at a Python API with the following function example:
bpy.ops.object.bake(type='COMBINED', pass_filter={"DIFFUSE", "DIRECT"})
while the pass_filter parameter accepts one or more of the following:
pass_filter (enum set in {
'NONE', 'AO', 'EMIT',
'DIRECT', 'INDIRECT',
'COLOR', 'DIFFUSE', 'GLOSSY',
'TRANSMISSION', 'SUBSURFACE',
})
On the other hand, I have the following flags to determine whether or not each value should be added to pass_filter:
is_NONE = False
is_AO = True
is_EMIT = False
is_DIRECT = True
#..etc.
How do I pass these to the function, as something like a list or array for the parameter?
The key to this is knowing that {"DIFFUSE", "DIRECT"} is a set:
#!/usr/bin/python3
is_NONE = False
is_AO = True
is_EMIT = False
is_DIRECT = True

pass_filter = set()
if is_AO:
    pass_filter.add('AO')
if is_DIRECT:
    pass_filter.add('DIRECT')

print(pass_filter)
Feel free to add your extra if statements!
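As a variation, the set can also be built from a dict of the flags in one step, and then passed straight to the call from the question (the flag names are the ones from the question; bpy is only available inside Blender):

# equivalent set construction from the flags, without the if chain
flags = {'NONE': is_NONE, 'AO': is_AO, 'EMIT': is_EMIT, 'DIRECT': is_DIRECT}
pass_filter = {name for name, enabled in flags.items() if enabled}

# pass the set straight to the call from the question
bpy.ops.object.bake(type='COMBINED', pass_filter=pass_filter)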

How do I escape true/false in terraform?

I need to pass the word true or false to a data template file in Terraform. However, if I try to provide the value, it comes out as 0 or 1 due to interpolation syntax. I tried doing \\true\\ as recommended at https://www.terraform.io/docs/configuration/interpolation.html, but that results in \true\, which obviously isn't right. The same happens with \\false\\, which gives \false\.
To complicate matters, I also have a scenario where I need to pass it the value of a variable, which can either equal true or false.
Any ideas?
# control whether to enable REST API and set other port defaults
data "template_file" "master_spark_defaults" {
template = "${file("${path.module}/templates/spark/spark- defaults.conf")}"
vars = {
spark_server_port = "${var.application_port}"
spark_driver_port = "${var.spark_driver_port}"
rest_port = "${var.spark_master_rest_port}"
history_server_port = "${var.history_server_port}"
enable_rest = "${var.spark_master_enable_rest}"
}
}
var.spark_master_enable_rest can be either true or false. I tried setting the variable as "\\${var.spark_master_enable_rest}\\", but again this resulted in either \true\ or \false\.
Edit 1:
Here is the relevant portion of conf file in question:
spark.ui.port ${spark_server_port}
# set to default worker random number.
spark.driver.port ${spark_driver_port}
spark.history.fs.logDirectory /var/log/spark
spark.history.ui.port ${history_server_port}
spark.worker.cleanup.enabled true
spark.worker.cleanup.appDataTtl 86400
spark.master.rest.enabled ${enable_rest}
spark.master.rest.port ${rest_port}
I think you must be overthinking this. If I set my var value as
spark_master_enable_rest = "true"
then I get:
spark.worker.cleanup.enabled true
spark.worker.cleanup.appDataTtl 86400
spark.master.rest.enabled true
in my result when I apply.
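Put differently, declaring the variable with a quoted string default keeps the literal word intact all the way into the template (a sketch in the same 0.11-style syntax as the snippets above; the variable name is taken from the question):

variable "spark_master_enable_rest" {
  # a quoted string, not a bare boolean, so the template receives the word "true"
  default = "true"
}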
I ended up creating a cloud-config script to find/replace the 0/1 in the file:
part {
  content_type = "text/x-shellscript"
  content      = <<SCRIPT
#!/bin/sh
sed -i.bak -e '/spark.master.rest.enabled/s/0/false/' -e '/spark.master.rest.enabled/s/1/true/' /opt/spark/conf/spark-defaults.conf
SCRIPT
}

Parameter aliasing

When implementing Origen::Parameters, I understood the importance of defining a 'default' set. But, in essence, my real default is named something different, so I implemented a hack of a parameter alias:
Origen.top_level.define_params :default do |params|
  params.tconds.override = 1
  params.tconds.override_lev_equ_set = 1
  params.tconds.override_lev_spec_set = 1
  params.tconds.override_levset = 1
  params.tconds.override_seqlbl = 'my_pattern'
  params.tconds.override_testf = 'tm_3'
  params.tconds.override_tim_spec_set = 'bist_xxMhz'
  params.tconds.override_timset = '1,1,1,1,1,1,1,1'
  params.tconds.site_control = 'parallel:'
  params.tconds.site_match = 2
end

Origen.top_level.define_params :cpu_mbist_hr, inherit: :default do |params|
  # way of aliasing parameter names
end
Is there a proper method of parameter aliasing that is just not documented?
There is no other way to do this currently, though I would be open to a PR to enable something like:
default_params = :cpu_mbist_hr
If you don't want them to be called :default in this case though, then maybe you don't really want them to be the default anyway.
e.g. adding this immediately after you define them would effectively give you an alternative default and would do pretty much the same job as the proposed API above:
# self is required here to help Ruby know that you are calling the params= API
# and not defining a local variable called params
self.params = :cpu_mbist_hr

Include monotonically increasing value in logstash field?

I know there's no built-in "line count" functionality while processing files through logstash (for various understandable and documented reasons). But there should be a mechanism, within any given logstash instance, to have a monotonically increasing variable / count for every parsed line.
I don't want to go the metrics route since it's a continuous polling mechanism (every n seconds). Alternatives include pre-processing of log files, which, given my particular use case, is unacceptable.
Again, let me reiterate: I need the ability to generate/read a monotonically increasing variable that I can store and use in a logstash filter.
Thoughts?
There's nothing built into logstash to do it.
You can build a filter to do it pretty easily.
Just drop something like this into lib/logstash/filters/seq.rb:
# encoding: utf-8
require "logstash/filters/base"
require "logstash/namespace"
require "set"

# This filter adds a sequence number to a log entry
#
# The config looks like this:
#
#     filter {
#       seq {
#         field => "seq"
#       }
#     }
#
# The `field` is the field you want added to the event.
class LogStash::Filters::Seq < LogStash::Filters::Base

  config_name "seq"
  milestone 1

  config :field, :validate => :string, :required => false, :default => "seq"

  public
  def register
    # Nothing
  end # def register

  public
  def initialize(config = {})
    super
    @threadsafe = false

    # This filter needs to keep state.
    @seq = 1
  end # def initialize

  public
  def filter(event)
    return unless filter?(event)
    event[@field] = @seq
    @seq = @seq + 1
    filter_matched(event)
  end # def filter
end # class LogStash::Filters::Seq
This will start at 1 every time Logstash is restarted, but for most situations this would be ok. If you need something that is persistent across restarts, you need to do a bit more work to persist it somewhere.
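A minimal sketch of one way to do that, assuming a writable state file (the path below is made up), by reworking the register and filter methods of the plugin above:

  public
  def register
    @state_file = "/var/lib/logstash/seq.state" # hypothetical location
    # resume from the persisted value if it exists, otherwise start at 1
    @seq = File.exist?(@state_file) ? File.read(@state_file).to_i : 1
  end # def register

  public
  def filter(event)
    return unless filter?(event)
    event[@field] = @seq
    @seq = @seq + 1
    File.write(@state_file, @seq.to_s) # write-through on every event; simple but adds I/O
    filter_matched(event)
  end # def filter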
For anyone finding this in 2018+: logstash now has a ruby filter that makes this much simpler. Put the following in a file somewhere:
# encoding: utf-8
def register(params)
  @seq = 1
end

def filter(event)
  event.set("seq", @seq)
  @seq += 1
  return [event]
end
And then configure it like this in your logstash.conf (substitute in the filename you used):
ruby {
  path => "/usr/local/lib/logstash/seq.rb"
}
It would be pretty easy to make the field name configurable from logstash.conf, but I'll leave that as an exercise for the reader.
I suspect this isn't thread-safe, so I'm running only a single logstash worker.
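If you do need more than one worker, one option is to guard the counter with a Mutex (a sketch in the same script-file style as above; event ordering across workers is still not guaranteed):

def register(params)
  @seq = 1
  @lock = Mutex.new
end

def filter(event)
  # only one thread at a time reads and bumps the counter
  @lock.synchronize do
    event.set("seq", @seq)
    @seq += 1
  end
  return [event]
end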
This is another way to solve the problem; it works for me. Thanks to the previous answer for the note about thread safety. I use the seq field to sort in descending order.
This is my configuration:
logstash.conf
filter {
  ruby {
    code => 'event.set("seq", Time.now.strftime("%N").to_i)'
  }
}
logstash.yml
pipeline.batch.size: 200
pipeline.batch.delay: 60
pipeline.workers: 1
pipeline.output.workers: 1
