I'm trying to use multi-threading (or, better, multi-processing) in Julia. Just using Base.Threads made my application slower, so I wanted to try Distributed.
module Parallel
# ... includes ...
using Distributed
@Distributed.everywhere include("...jl")
# ... includes needed in the worker processes
export loop_inner

@Distributed.everywhere function loop_inner(parentValue, value, i, depth)
    ...
end

function langfordSequence(parentValue, depth)
    ...
    if depth < 4 && depth > 1
        futures = [@spawnat :any loop_inner(parentValue, value, i, depth) for i = 0:possibilites]
        return sum(fetch.(futures))
        # 2) PROBLEMATIC LINE ^^
    else
        return sum([loop_inner(parentValue, value, i, depth) for i = 0:possibilites])
        # 1) PROBLEMATIC LINE ^^
    end
end

end
But I run into:
$ julia -L ...jl -L Parallel.jl main.jl -p 8
ERROR: LoadError: UndefVarError: loop_inner not defined (at the line marked "# 1) PROBLEMATIC LINE ^^" in the code above)
I hope someone can tell me what I'm doing wrong.
If I change if depth < 4 && depth > 1 to just if depth < 4, I get UndefVarError: Parallel not defined (at the line marked "# 2) PROBLEMATIC LINE ^^" in the code above).
Thanks in advance.
It does not work because each worker process needs to separately load your module.
Your module should look like this:
module MyParallel
include("somefile.jl")
export loop_inner
function loop_inner(parentValue, value, i, depth)
end
end
Now you use it in the following way (this assumes that the module is inside your private package):
using Distributed
using Pkg
Pkg.activate(".") # or wherever is the package
using MyParallel # enforces module compilation which should not occur in parallel
addprocs(4) # should always happen before any @everywhere macro, or use the -p command line option instead
@everywhere using Pkg, Distributed
@everywhere Pkg.activate(".")
@everywhere using MyParallel
Now you are ready to work with the MyParallel module.
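For example, a single remote call should now resolve loop_inner on a worker (a minimal sketch; the argument values are placeholders, since the question elides loop_inner's body):

fetch(@spawnat 2 loop_inner(0, 0, 0, 1))  # placeholder arguments, just to show the function is visible on worker 2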
EDIT
To make it clearer: the important purpose of a module is to provide a common namespace for a set of functions, types and global (module-level) variables. If you put any distributed code inside a module you break this design, because each Julia worker is a totally separate system process with its own memory and namespace. Hence good design, in my opinion, is to keep all the do-the-work code inside the module and the distributed-computation management outside the module. In some scenarios one might want the distributed orchestration code to live in a second module, but normally it is just more convenient to keep such code outside of the module.
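As a concrete illustration of that split, the distributed driver could live in the main script while the module only contains the computation. This is only a sketch: the argument names and possibilites are carried over from the question, whose function bodies are elided, so it shows the shape of the fan-out rather than the actual algorithm.

# main.jl -- orchestration stays outside the MyParallel module
using Distributed, Pkg
Pkg.activate(".")              # or wherever the package lives
using MyParallel               # compile the package once, before adding workers
addprocs(8)                    # or start julia with -p 8
@everywhere using Pkg
@everywhere Pkg.activate(".")
@everywhere using MyParallel

# fan the per-index work out to the workers and sum the partial results
function langfordSequence(parentValue, value, depth, possibilites)
    futures = [@spawnat :any loop_inner(parentValue, value, i, depth) for i = 0:possibilites]
    return sum(fetch.(futures))
end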
I'm fairly new to Snakemake and inherited a rather huge workflow that consists of a sequence of 17 rules that run serially.
Each rule takes outputs from the previous rules and uses them to run a Python script. Everything has worked great so far, except that now I'm trying to improve the workflow, since some of the rules can be run in parallel.
Below is a rough example of what I'm trying to achieve; my understanding is that wildcards should allow me to solve this.
grid = [ 10 , 20 ]
rule all:
input:
expand("path/to/C/{grid}/file_C" ,grid = grid)
rule process_A:
input:
path_A = "path/to/A/file_A"
path_B = "path/to/B/{grid}/file_B" # A rule further in the worflow could need a file from a previous rule saved with this structure
params:
grid = lambda wc: wc.get(grid)
output:
path_C = "path/to/C/{grid}/file_C"
script:
"script_A.py"
And inside the script I retrieve the grid size parameter:
grid = snakemake.params.grid
In the end, the whole rule process_A should be run once with grid = 10 and once with grid = 20, saving each result to a folder whose path also depends on grid.
I know there are several things wrong with this, but I can't seem to find where to start to figure it out. The error I'm getting now is:
name 'params' is not defined
Any help as to where to start?
It would be useful to post the error stack trace of name 'params' is not defined to know exactly what is causing it. For now...
And inside the script I retrieve the grid size parameter:
grid = snakemake.params.grid
I suspect you are mixing the script directive with the shell directive. Probably you want something like:
rule process_A:
input: ...
output: ...
params: ...
script:
"script_A.py"
Inside script_A.py, Snakemake injects a snakemake object, so snakemake.params.grid resolves to the actual param value.
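For example, script_A.py could look roughly like this (a minimal sketch; the rule and file names are the ones from the question):

# script_A.py -- the `snakemake` object is injected by the script directive
grid = snakemake.params.grid        # 10 or 20, depending on the wildcard
path_a = snakemake.input.path_A
path_c = snakemake.output.path_C

with open(path_c, "w") as out:
    out.write("processed %s with grid=%s\n" % (path_a, grid))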
Alternatively, write a standalone Python script that parses command-line arguments and execute it like any other program using the shell directive. (I tend to prefer this solution as it makes things more explicit and easier to debug, but it also means more boilerplate code to write a standalone script.)
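A hedged sketch of that alternative, with the rule invoking the script through the shell directive and the script parsing its own arguments (the option names are only illustrative):

rule process_A:
    input:
        path_A = "path/to/A/file_A"
    output:
        path_C = "path/to/C/{grid}/file_C"
    params:
        grid = lambda wc: wc.grid
    shell:
        "python script_A.py --input {input.path_A} --grid {params.grid} --output {output.path_C}"

and the corresponding script_A.py:

# script_A.py -- standalone version, no snakemake object needed
import argparse

parser = argparse.ArgumentParser()
parser.add_argument("--input")
parser.add_argument("--grid", type=int)
parser.add_argument("--output")
args = parser.parse_args()

# ... do the work with args.input, args.grid and args.output ...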
I am attempting to write a python script using the angr binary analysis library (http://angr.io/). I have written code that successfully loads a core dump of the process I want to play with by using the ElfCore back end (http://angr.io/api-doc/cle.html#cle.backends.elf.elfcore.ELFCore) passed to the project constructor, doing something like the following:
ap = angr.Project("corefile", main_opts={'backend': 'elfcore'})
What I am wondering is, how do I now "run" the program forward from the state (registers and memory) which was defined by the core dump? For example, when I attempted to create a SimState using the above project:
ss = angr.sim_state.SimState(project=ap)
ss.regs.rip
I got back that rip was uninitialized (even though it was certainly initialized in the core dump, i.e. at the point when the core dump was generated).
Thanks in advance for any help!
Alright! I figured this out. Being a total angr n00b® this may not be the best way of doing this, but since nobody offered a better way this is what I came up with.
First...
ap = angr.Project("corefile", main_opts={'backend': 'elfcore'}, rebase_granularity=0x1000)
ss = angr.factory.AngrObjectFactory(ap).blank_state()
the rebase_granularity was needed because my core file had the stack mapped high in the address range and angr refuses to map things above your main binary (my core file in this case).
From inspecting the angr source (and playing at a Python terminal) I found out that at this point, the above state will have memory all mapped out the way the core file defined it to be, but the registers are not defined appropriately yet. Therefore I needed to proceed to:
import warnings
import cle

# Get the ELFCore backend object from among the loaded objects
elfcore_object = None
for o in ap.loader.all_objects:
    if isinstance(o, cle.backends.elf.elfcore.ELFCore):
        elfcore_object = o
        break
if elfcore_object is None:
    raise RuntimeError("no ELFCore object found in the loader")

# Set the reg values from the elfcore_object on the sim state, realizing that not all
# of the registers will be supported (particularly some segment registers)
for regval in elfcore_object.initial_register_values():
    try:
        setattr(ss.regs, regval[0], regval[1])
    except Exception:
        warnings.warn("could not set register %s" % regval[0])

# get a simulation manager starting from that state
simgr = ap.factory.simgr(ss)
Now, I was able to run forward from here using the state defined by the core dump as my starting point...
for ins in ap.factory.block(simgr.active[0].addr).capstone.insns:
print(ins)
simgr.step()
...repeat
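Spelling out the "...repeat" part, the loop might look roughly like this (just a sketch: it keeps stepping while there is at least one active state and prints each block's instructions as above):

# keep stepping the simulation manager from the core-dump state,
# printing the instructions of the block about to execute
while simgr.active:
    addr = simgr.active[0].addr
    for ins in ap.factory.block(addr).capstone.insns:
        print(ins)
    simgr.step()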
As you know, $fdisplay can print info into a file. But if you instantiate a module (like a BFM: bus functional model) several times in the testbench and each instance has its own $fdisplay, then a problem may occur: simultaneous access to a file.
In my experience that issue produces a messy output file.
So how can I achieve my goal?
The Python equivalent of my question is here.
P.S. The simulator's console has a limit on how much it can aggregate and my logs are somewhat long, so I have to print them to a file. Also, merging all the Verilog code into one file is not possible at all (think of how the BFM models are structured).
If you want all the output from your BFMs to go into a single file, your problem is not with $fdisplay but with $fopen. You need to create a top level function that calls $fopen only if it has not been called before.
integer file=0;
function integer bfm_fopen;
begin
if (file)
bfm_fopen = file;
else begin
file = $fopen("logfile");
bfm_fopen = file;
end
end
endfunction
Then call top_level.bfm_fopen from your BFMs.
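A minimal sketch of how a BFM instance might use it, assuming the function above lives in a module instantiated as top_level (an argument-less function like that needs a SystemVerilog-capable simulator, and the message text is only illustrative):

// inside each BFM module
integer log_fd;

initial begin
  log_fd = top_level.bfm_fopen();  // every instance gets the same file handle
  $fdisplay(log_fd, "%m: BFM started at time %0t", $time);
end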
I have an input file where some variables are defined. For each iteration in a loop, I would like to read the file, update the values of some of the variables, then run calculations.
I have an input file called input.jl with
myval=1
Then I have a file myscript.jl with the following commands
for i=1:2
include("input.jl")
println(myval)
myval=2
end
If I run the file (julia myscript.jl), I get an error that myval is not defined. If I comment out the third or fourth lines, then it runs with no problem. If I remove the for loop, the three lines run with no problem. How can I read myval from input.jl, use it, then update its value during each iteration of the loop?
Unfortunately, it seems that the include function executes things at global scope, and then continues from where it left off. So if you're trying to dynamically include new variables into local scope, this is not the way to do it.
You can either introduce the variable at global scope first, so that the loop body has access to it and the assignment will work (but be aware that the variable will then be updated at global scope).
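A minimal sketch of that first option, on the Julia versions this answer targets (where a top-level loop can assign to an existing global; on Julia 1.0+ you would also add a `global myval` declaration inside the loop):

# in the main file
myval = 0                  # introduce the variable at global scope first
for i = 1:2
    include("input.jl")    # (re)sets the global myval to 1
    println(myval)
    myval = 2              # updates the global myval
end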
or
you can cheat by wrapping your input file into a module first. You still need to call the variable by its name, and you will get warnings about updating the module, but this way you can update your local variable dynamically at least, without needing that variable at global scope:
# in input.jl
module Input
myval = 1;
end
# in your main file
for i=1:2
include("input.jl")
myval = Input.myval;
println(myval)
myval=2
end
or
you could add a separate process and offload the calculation to its global scope, and retrieve it to your current process locally, e.g.
# in file input.jl
myval = 1
# in main file
addprocs(1);
for i=1:2
myval = remotecall_fetch(() -> (global myval; include("input.jl"); myval), 2);
println(myval)
myval=2
end
I'm currently using the Atom editor to work with Julia 0.5 and somehow fail to make functions available to my worker processes. Here is my test file test.jl:
module testie
export t1
function t1()
a= rand()
println("a is $a, thread $(myid())")
return a
end
end
if nprocs()<2
addprocs(1)
end
#everywhere println("Hi")
using testie
t1()
println(remotecall_fetch(t1,2))
Executing this file, I get as output a "Hi" from the master and the worker, and the master will also output the "a is ..." line. But the worker won't, and on the remotecall_fetch line it throws the following error message (shortened):
LoadError: On worker 2:
UndefVarError: testie not defined
http://docs.julialang.org/en/release-0.5/manual/parallel-computing/ states: "using DummyModule causes the module to be loaded on all processes; however, the module is brought into scope only on the one executing the statement." Beyond that I could not see how to solve this situation. I tried to add an @everywhere before the using line, and also tried to add an @everywhere include("test.jl") right before it. That didn't help. This should be really simple, but I can't figure it out.
On SO I only found Julia parallel programming - Making existing function available to all workers, but this doesn't really answer it for me.
If you are importing a module yourself with include, then you need to tell Julia that you want to use t1 from that module by prefixing it: testie.t1.
Try this
if nprocs()<2
addprocs(1)
end
#everywhere include("testie.jl")
println(remotecall_fetch(testie.t1,2)) #NB prefix here
where testie.jl is:
module testie
export t1
function t1()
a= rand()
println("a is $a, thread $(myid())")
return a
end
end
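A hedged variation on the same idea, collecting one result per worker (this assumes testie.jl sits in the current working directory):

@everywhere include("testie.jl")
results = [remotecall_fetch(testie.t1, w) for w in workers()]
println(results)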