While trying to set up Nextflow with Azure Batch (nf-core), I am getting the following error. I tried this on multiple workflows (sarek, atacseq, etc.) and I get the same error every time:
N E X T F L O W ~ version 22.04.0
Pulling nf-core/atacseq ...
downloaded from https://github.com/nf-core/atacseq.git
Launching `https://github.com/nf-core/atacseq` [rhl6d5529] DSL1 - revision: 1b3a832db5 [1.2.1]
Downloading plugin nf-azure@0.13.1
----------------------------------------------------
,--./,-.
___ __ __ __ ___ /,-._.--~'
|\ | |__ __ / ` / \ |__) |__ } {
| \| | \__, \__/ | \ |___ \`-._,-`-,
`._,._,'
nf-core/atacseq v1.2.1
----------------------------------------------------
Run Name : rhl6d5529
Data Type : Paired-End
Design File : https://raw.githubusercontent.com/nf-core/test-datasets/atacseq/design.csv
Genome : Not supplied
Fasta File : https://raw.githubusercontent.com/nf-core/test-datasets/atacseq/reference/genome.fa
GTF File : https://raw.githubusercontent.com/nf-core/test-datasets/atacseq/reference/genes.gtf
Mitochondrial Contig : MT
MACS2 Genome Size : 1.2E+7
Min Consensus Reps : 1
MACS2 Narrow Peaks : No
MACS2 Broad Cutoff : 0.1
Trim R1 : 0 bp
Trim R2 : 0 bp
Trim 3' R1 : 0 bp
Trim 3' R2 : 0 bp
NextSeq Trim : 0 bp
Fingerprint Bins : 100
Save Genome Index : No
Max Resources : 6 GB memory, 2 cpus, 12h time per job
Container : docker - nfcore/atacseq:1.2.1
Output Dir : ./results
Launch Dir : /
Working Dir : /nextflow/atacseq/rhl6d5529
Script Dir : /.nextflow/assets/nf-core/atacseq
User : root
Config Profile : test,azurebatch
Config Description : Minimal test dataset to check pipeline function
Config Contact : Venkat Malladi (@vsmalladi)
Config URL : https://azure.microsoft.com/services/batch/
----------------------------------------------------
Uploading local `bin` scripts folder to az://nextflow/atacseq/rhl6d5529/tmp/66/bd55d79e42999df38ba04a81c3aa04/bin
[- ] process > CHECK_DESIGN -
[- ] process > CHECK_DESIGN [ 0%] 0 of 1
[- ] process > CHECK_DESIGN [ 0%] 0 of 1
Error executing process > 'CHECK_DESIGN (design.csv)'
Caused by:
Cannot find a matching VM image with publisher=microsoft-azure-batch; offer=centos-container; OS type=linux; verification type=verified
[58/55b7f7] process > CHECK_DESIGN (design.csv) [100%] 1 of 1, failed: 1
Error executing process > 'CHECK_DESIGN (design.csv)'
Caused by:
Cannot find a matching VM image with publisher=microsoft-azure-batch; offer=centos-container; OS type=linux; verification type=verified
I tried looking into the Nextflow source code and found that the error is raised in AzBatchService.groovy (line linked below).
https://github.com/nextflow-io/nextflow/blob/0e593e6ab82880810d8139a4fe6e3c47ff69a531/plugins/nf-azure/src/main/nextflow/cloud/azure/batch/AzBatchService.groovy#L442
I did some further digging in my Azure Batch account. Basically, I wanted to confirm whether the list of supported images returned by the Azure Batch account includes the one required for this pipeline. I could confirm that the server did indeed respond with the required image -
What could be the issue here? I remember running the exact same pipeline a few weeks back and it did work a few times. Am I missing something?
Just had another look through the Azure Cloud docs and think this might be relevant:
By default, Nextflow creates CentOS 8-based pool nodes, but this
behavior can be customised in the pool configuration. Below are the
configurations for image reference/SKU combinations to select two
popular systems.
Ubuntu 20.04:
sku = "batch.node.ubuntu 20.04"
offer = "ubuntu-server-container"
publisher = "microsoft-azure-batch"
CentOS 8 (default):
sku = "batch.node.centos 8"
offer = "centos-container"
publisher = "microsoft-azure-batch"
I think the issue here is a mismatched nodeAgentSkuId: Nextflow is expecting a CentOS 8 node agent SKU, but your pool has a CentOS 7 SKU. If it's not possible to change the nodeAgentSkuId on the Azure side, the node agent SKU that Nextflow uses can be overridden by adding this to your nextflow.config:
azure.batch.pools.<name>.sku = 'batch.node.centos 7'
Where <name> is the pool identifier:
azure.batch.pools.<name>.sku
Specify the ID of the Compute Node agent SKU which the pool identified with <name> supports (default: batch.node.centos 8, requires nf-azure@0.11.0).
https://www.nextflow.io/docs/edge/azure.html#advanced-settings
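For example, to request the Ubuntu 20.04 image/SKU combination quoted from the docs above, the pool block could look roughly like this. This is only a sketch: the pool name `auto` and the `vmType` are assumptions for illustration, so adjust them to the pool you actually configure:
azure {
    batch {
        pools {
            auto {                                     // assumed pool name; use your own
                vmType    = 'Standard_D4_v3'           // assumed VM size; pick one available in your region
                sku       = 'batch.node.ubuntu 20.04'
                offer     = 'ubuntu-server-container'
                publisher = 'microsoft-azure-batch'
            }
        }
    }
}
The same block with sku = 'batch.node.centos 7' and offer = 'centos-container' would pin the pool to the CentOS 7 SKU mentioned above instead.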
Related
I want to set up a Linux kernel with PREEMPT_RT using Yocto.
According to meta/recipes-rt/README, I added the following to build/conf/local.conf and ran bitbake core-image-sato, but bitbake fails.
MACHINE ?= "genericx86-64"
PREFERRED_PROVIDER_virtual/kernel = "linux-yocto-rt"
COMPATIBLE_MACHINE_genericx86-64 = "genericx86-64"
COMPATIBLE_MACHINE_quilt-native = "genericx86-64"
Yocto outputs the following error:
NOTE: Bitbake server didn't start within 5 seconds, waiting for 90
Loading cache: 100% |#######################################################################################################################################################################| Time: 0:00:13
Loaded 1330 entries from dependency cache.
NOTE: Resolving any missing task queue dependencies
Build Configuration:
BB_VERSION = "1.46.0"
BUILD_SYS = "x86_64-linux"
NATIVELSBSTRING = "universal"
TARGET_SYS = "x86_64-poky-linux"
MACHINE = "genericx86-64"
DISTRO = "poky"
DISTRO_VERSION = "3.1.20"
TUNE_FEATURES = "m64 core2"
TARGET_FPU = ""
meta
meta-poky
meta-yocto-bsp = "dunfell:90a6f6a110ab14890e2f6a1616e74ee259fc0f8f"
Initialising tasks: 100% |##################################################################################################################################################################| Time: 0:00:48
Sstate summary: Wanted 14 Found 0 Missed 14 Current 1203 (0% match, 98% complete)
NOTE: Executing Tasks
ERROR: linux-yocto-rt-5.4.213+gitAUTOINC+2f18e629f7_03cd66d981-r0 do_kernel_metadata: Could not locate BSP definition for genericx86-64/preempt-rt and no defconfig was provided
ERROR: linux-yocto-rt-5.4.213+gitAUTOINC+2f18e629f7_03cd66d981-r0 do_kernel_metadata: Execution of '/media/fff/disk1T/yocto/demo3/poky/build/tmp/work/genericx86_64-poky-linux/linux-yocto-rt/5.4.213+gitAUTOINC+2f18e629f7_03cd66d981-r0/temp/run.do_kernel_metadata.1138429' failed with exit code 1
ERROR: Logfile of failure stored in: /media/fff/disk1T/yocto/demo3/poky/build/tmp/work/genericx86_64-poky-linux/linux-yocto-rt/5.4.213+gitAUTOINC+2f18e629f7_03cd66d981-r0/temp/log.do_kernel_metadata.1138429
Log data follows:
| DEBUG: Executing python function extend_recipe_sysroot
| NOTE: Direct dependencies are ['/media/fff/disk1T/yocto/demo3/poky/meta/recipes-kernel/kern-tools/kern-tools-native_git.bb:do_populate_sysroot']
| NOTE: Installed into sysroot: []
| NOTE: Skipping as already exists in sysroot: ['kern-tools-native', 'quilt-native']
| DEBUG: Python function extend_recipe_sysroot finished
| DEBUG: Executing shell function do_kernel_metadata
| NOTE: do_kernel_metadata: for summary/debug, set KCONF_AUDIT_LEVEL > 0
| ERROR: Could not locate BSP definition for genericx86-64/preempt-rt and no defconfig was provided
| WARNING: exit code 1 from a shell command.
| ERROR: Execution of '/media/fff/disk1T/yocto/demo3/poky/build/tmp/work/genericx86_64-poky-linux/linux-yocto-rt/5.4.213+gitAUTOINC+2f18e629f7_03cd66d981-r0/temp/run.do_kernel_metadata.1138429' failed with exit code 1
ERROR: Task (/media/fff/disk1T/yocto/demo3/poky/meta/recipes-kernel/linux/linux-yocto-rt_5.4.bb:do_kernel_metadata) failed with exit code '1'
NOTE: Tasks Summary: Attempted 3173 tasks of which 3172 didn't need to be rerun and 1 failed.
Summary: 1 task failed:
/media/fff/disk1T/yocto/demo3/poky/meta/recipes-kernel/linux/linux-yocto-rt_5.4.bb:do_kernel_metadata
Summary: There were 2 ERROR messages shown, returning a non-zero exit code.
My hardware is an x86-64 CPU. The error hints that genericx86-64/preempt-rt does not exist; what should I do to build core-image-sato with PREEMPT_RT? Please leave a comment and help me if you are familiar with this problem.
I have tried checking out various Yocto branches (dunfell, langdale, kirkstone), but it makes no difference. I intend to use dunfell.
Following is my build/conf/bblayers.conf:
POKY_BBLAYERS_CONF_VERSION = "2"
BBPATH = "${TOPDIR}"
BBFILES ?= ""
BBLAYERS ?= " \
/media/fff/disk1T/yocto/demo3/poky/meta \
/media/fff/disk1T/yocto/demo3/poky/meta-poky \
/media/fff/disk1T/yocto/demo3/poky/meta-yocto-bsp \
"
I am noticing that all my rules request memory twice: once at a lower maximum than what I requested (mem_mb) and then at what I actually requested (mem_gb). If I run the rules as localrules they do run faster. How can I make sure the default settings do not interfere?
resources: mem_mb=100, disk_mb=8620, tmpdir=/tmp/pop071.54835, partition=h24, qos=normal, mem_gb=100, time=120:00:00
The rules are as follows:
rule bwa_mem2_mem:
input:
R1 = "data/results/qc/{species}.{population}.{individual}_1.fq.gz",
R2 = "data/results/qc/{species}.{population}.{individual}_2.fq.gz",
R1_unp = "data/results/qc/{species}.{population}.{individual}_1_unp.fq.gz",
R2_unp = "data/results/qc/{species}.{population}.{individual}_2_unp.fq.gz",
idx= "data/results/genome/genome",
ref = "data/results/genome/genome.fa"
output:
bam = "data/results/mapped_reads/{species}.{population}.{individual}.bam",
log:
bwa ="logs/bwa_mem2/{species}.{population}.{individual}.log",
sam ="logs/samtools_view/{species}.{population}.{individual}.log",
benchmark:
"benchmark/bwa_mem2_mem/{species}.{population}.{individual}.tsv",
resources:
time = parameters["bwa_mem2"]["time"],
mem_gb = parameters["bwa_mem2"]["mem_gb"],
params:
extra = parameters["bwa_mem2"]["extra"],
tag = compose_rg_tag,
threads:
parameters["bwa_mem2"]["threads"],
shell:
"bwa-mem2 mem -t {threads} -R '{params.tag}' {params.extra} {input.idx} {input.R1} {input.R2} | "
"samtools sort -l 9 -o {output.bam} --reference {input.ref} --output-fmt CRAM -# {threads} /dev/stdin 2> {log.sam}"
and the config is:
cluster:
mkdir -p logs/{rule} && # change the log file to logs/slurm/{rule}
sbatch
--partition={resources.partition}
--time={resources.time}
--qos={resources.qos}
--cpus-per-task={threads}
--mem={resources.mem_gb}
--job-name=smk-{rule}-{wildcards}
--output=logs/{rule}/{rule}-{wildcards}-%j.out
--parsable # Required to pass job IDs to scancel
default-resources:
- partition=h24
- qos=normal
- mem_gb=100
- time="04:00:00"
restart-times: 3
max-jobs-per-second: 10
max-status-checks-per-second: 1
local-cores: 1
latency-wait: 60
jobs: 100
keep-going: True
rerun-incomplete: True
printshellcmds: True
scheduler: greedy
use-conda: True # Required to run with local conda environment
cluster-status: status-sacct.sh # Required to monitor the status of the submitted jobs
cluster-cancel: scancel # Required to cancel the jobs with Ctrl + C
cluster-cancel-nargs: 50
Cheers,
Angel
Right now there are two separate memory resource requirements:
mem_mb
mem_gb
From the perspective of Snakemake these are different resources, so both will be passed to the cluster. A quick fix is to use the same units everywhere; e.g. if the resource really requires only 100 MB, then the default resources should be changed to:
default-resources:
- partition=h24
- qos=normal
- mem_mb=100
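If you would rather keep the gigabyte values in your parameters file, the alternative is to standardise on mem_mb everywhere so only one memory resource reaches the cluster. A sketch of that approach (the *1024 conversion and the 4096 MB default are my own choices, not taken from your setup):
# in the rule: request memory through mem_mb only
resources:
    time = parameters["bwa_mem2"]["time"],
    mem_mb = parameters["bwa_mem2"]["mem_gb"] * 1024,
# in the profile: submit with the same resource; a bare number
# passed to sbatch --mem is interpreted as megabytes
cluster:
  mkdir -p logs/{rule} &&
  sbatch
    --partition={resources.partition}
    --time={resources.time}
    --qos={resources.qos}
    --cpus-per-task={threads}
    --mem={resources.mem_mb}
    --job-name=smk-{rule}-{wildcards}
    --output=logs/{rule}/{rule}-{wildcards}-%j.out
    --parsable
default-resources:
  - partition=h24
  - qos=normal
  - mem_mb=4096
  - time="04:00:00"
With a single memory resource, the cluster submission and the scheduler see the same number, so nothing is requested twice.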
I am trying to add a custom regular-expression rule in order to block the log line below.
Mar 17 18:46:52 s21409974 named[1577]: client @0x7g246c107030 1.1.1.1#8523 (.): query (cache) './ANY/IN' denied
I tried online tools like this one (https://www.regextester.com), but the fail2ban-regex test command shows that the rule misses it.
Any suggestions about the rule, or about how to troubleshoot this better?
Thanks in advance
Why are you trying to write a custom regex? This message is matched perfectly well by the stock fail2ban filter named-refused:
$ msg="Mar 17 18:46:52 s21409974 named[1577]: client #0x7g246c107030 1.1.1.1#8523 (.): query (cache) './ANY/IN' denied"
$ fail2ban-regex "$msg" named-refused
Running tests
=============
Use failregex filter file : named-refused
Use single line : Mar 17 18:46:52 s21409974 named[1577]: client @0x7...
Results
=======
Prefregex: 1 total
| ^(?:\s*\S+ (?:(?:\[\d+\])?:\s+\(?named(?:\(\S+\))?\)?:?|\(?named(?:\(\S+\))?\)?:?(?:\[\d+\])?:)\s+)?(?: error:)?\s*client(?: @\S*)? (?:\[?(?:(?:::f{4,6}:)?(?P<ip4>(?:\d{1,3}\.){3}\d{1,3})|(?P<ip6>(?:[0-9a-fA-F]{1,4}::?|::){1,7}(?:[0-9a-fA-F]{1,4}|(?<=:):)))\]?|(?P<dns>[\w\-.^_]*\w))#\S+(?: \([\S.]+\))?: (?P<content>.+)\s(?:denied|\(NOTAUTH\))\s*$
`-
Failregex: 1 total
|- #) [# of hits] regular expression
| 1) [1] ^(?:view (?:internal|external): )?query(?: \(cache\))?
`-
Ignoreregex: 0 total
Date template hits:
|- [# of hits] date format
| [1] {^LN-BEG}(?:DAY )?MON Day %k:Minute:Second(?:\.Microseconds)?(?: ExYear)?
`-
Lines: 1 lines, 0 ignored, 1 matched, 0 missed
[processed in 0.01 sec]
But if you need it, here you go (regex interpolated from fail2ban's pref- & failregex):
^\s*\S+\s+named\[\d+\]: client(?: @\S*)? <ADDR>#\S+(?: \([\S.]+\))?: (?:view (?:internal|external): )?query(?: \(cache\))? '[^']+' denied
Replace <ADDR> with <HOST> if your fail2ban version is older than 0.10.
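If you do go the custom-filter route, it could be wired up roughly like this. This is only a sketch: the filter name named-custom, the logpath and maxretry are assumptions for illustration.
# /etc/fail2ban/filter.d/named-custom.conf (hypothetical file name)
[Definition]
# use <HOST> instead of <ADDR> on fail2ban < 0.10
failregex = ^\s*\S+\s+named\[\d+\]: client(?: @\S*)? <ADDR>#\S+(?: \([\S.]+\))?: (?:view (?:internal|external): )?query(?: \(cache\))? '[^']+' denied

# /etc/fail2ban/jail.local
[named-custom]
enabled  = true
filter   = named-custom
logpath  = /var/log/named/security.log
maxretry = 3
You can then re-run fail2ban-regex against your named log with the new filter name to confirm it matches before reloading fail2ban.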
I've been looking at tutorials online to set up Sphinx search, and I have got the test database working. However, I am having trouble getting my own database to work.
sphinx.conf
source src1
{
type = mysql
sql_host = localhost
sql_user = root
sql_pass = MyPassword
sql_db = MyDatabase
sql_port = 3306
sql_query = \
SELECT listing_id, title, description, image_id \
FROM listings
sql_attr_uint = listing_id
sql_query_info = SELECT listing_id, title, description, image_id FROM listings
}
index test1
{
source = src1
path = /var/lib/sphinxsearch/data/test1
docinfo = extern
charset_type = sbcs
}
searchd
{
listen = 9312
log = /var/log/sphinxsearch/searchd.log
However, when I try to run:
sudo indexer --all --rotate
The output in putty is:
using config file '/etc/sphinxsearch/sphinx.conf'...
indexing index 'test1'...
WARNING: attribute 'listing_id' not found - IGNORING
WARNING: Attribute count is 0: switching to none docinfo
WARNING: collect_hits: mem_limit=0 kb too low, increasing to 24576 kb
collected 3 docs, 0.0 MB
sorted 0.0 Mhits, 100.0% done
total 3 docs, 49 bytes
total 0.002 sec, 16740 bytes/sec, 1024.94 docs/sec
total 2 reads, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
total 6 writes, 0.000 sec, 0.0 kb/call avg, 0.0 msec/call avg
rotating indices: succesfully sent SIGHUP to searchd (pid=911)
However, when I try to run "search df", for example, I get:
Sphinx 2.0.4-id64-release (r3135)
Copyright (c) 2001-2012, Andrew Aksyonoff
Copyright (c) 2008-2012, Sphinx Technologies Inc (http://sphinxsearch.com)
using config file '/etc/sphinxsearch/sphinx.conf'...
FATAL: 'sql_query_info' value must contain '$id'
I am running Sphinx Search on Ubuntu 14.04, using an account called "user" which is in the sudoers file.
I have lost my mind over this, so I would appreciate someone's help.
Thanks
Your sql_query_info is invalid. It needs to contain $id, as the message says.
However, I would highly recommend not using the search tool at all; it's broken, so skip it (articles recommending it are outdated). sql_query_info is only used by search.
Move right on to starting searchd, and use test.php if you don't have an application yet. Using test.php to test your index is MUCH better.
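For completeness, if you do want the search tool to keep working, the fix is simply to include $id in that query, e.g. (a sketch; it assumes listing_id is your document ID, which it is here because it is the first column of your sql_query):
sql_query_info = SELECT * FROM listings WHERE listing_id = $id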
I am trying to compute the eigenvalues of a big matrix in MATLAB using the Parallel Computing Toolbox.
I first tried:
A = rand(10000,2000);
A = A*A';
matlabpool open 2
spmd
C = codistributed(A);
tic
[V,D] = eig(C);
time = gop(@max, toc) % Time for all labs in the pool to complete.
end
matlabpool close
The code starts its execution:
Starting matlabpool using the 'local' profile ... connected to 2 labs.
But, after few minutes, I got the following error:
Error using distcompserialize
Out of Memory during serialization
Error in spmdlang.RemoteSpmdExecutor/initiateComputation (line 82)
fcns = distcompMakeByteBufferHandle( ...
Error in spmdlang.spmd_feval_impl (line 14)
blockExecutor.initiateComputation();
Error in spmd_feval (line 8)
spmdlang.spmd_feval_impl( varargin{:} );
I then tried to apply what I saw on tutorial videos from the parallel toolbox:
>> job = createParallelJob('configuration', 'local');
>> task = createTask(job, @eig, 1, {A});
>> submit(job);
waitForState(job, 'finished');
>> results = getAllOutputArguments(job)
>> destroy(job);
But after two hours of computation, I got:
results =
Empty cell array: 2-by-0
My computer has 2 GB of memory and an Intel dual-core CPU (2 × 2 GHz).
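A rough back-of-the-envelope estimate (assuming 8-byte doubles): A starts as 10000 × 2000 × 8 bytes ≈ 160 MB, but after A = A*A' it is a dense 10000 × 10000 matrix, i.e. 10000 × 10000 × 8 bytes ≈ 800 MB, and serializing it to ship to the two workers needs at least one more full copy, which already approaches the 2 GB available.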
My questions are the following:
1/ Looking at the first error, I guess my memory is not sufficient for this problem. Is there a way I can divide the input data so that my computer can handle this matrix?
2/ Why is the second result empty (after two hours of computation)?
EDIT: @pm89
You were right, an error occurred during the execution:
job =
Parallel Job ID 3 Information
=============================
UserName : bigTree
State : finished
SubmitTime : Sun Jul 14 19:20:01 CEST 2013
StartTime : Sun Jul 14 19:20:22 CEST 2013
Running Duration : 0 days 0h 3m 16s
- Data Dependencies
FileDependencies : {}
PathDependencies : {}
- Associated Task(s)
Number Pending : 0
Number Running : 0
Number Finished : 2
TaskID of errors : [1 2]
- Scheduler Dependent (Parallel Job)
MaximumNumberOfWorkers : 2
MinimumNumberOfWorkers : 1