Running parallel threads in PowerShell - multithreading

I need to execute some operations in a PS script that should run in parallel. Using PS Jobs is not a real option, since the tasks that must be parallelized depend on custom functions that are defined inside a separate module. Although I know that I can use the -InitializationScript flag and import the module that contains my custom functions, I think that I lose speed, since importing the whole module is a time-consuming operation.
Bearing all that in mind, I'm trying to launch those "tasks" in separate threads that share the runspace. My code looks like:
$ps = [Powershell]::Create().AddScript({ Get-CustomADDomain -dnsdomain $env: })
$threadRes = $ps.beginInvoke()
$ps.EndInvoke($threadRes)
The drawback of this approach is that, since I'm creating a new "powershell process", this runspace does not have my custom modules loaded, and thus I'm in the same situation as with Jobs.
If I try to attach the current runspace to the newly created $ps using the following code:
$ps = [Powershell]::Create()
$ps.runspace = $host.runspace
$ps.AddScript({ Get-CustomADDomain -dnsdomain $env: })
$threadRes = $ps.beginInvoke()
$ps.EndInvoke($threadRes)
I get an error because I'm trying to close the current pipeline (bad thing).
I think my second attempt is on the right track, but I cannot retrieve results from the invocation of the script, or at least I can't see how to do it.
It's obvious that I must be missing something, so any advice you may have will be very much appreciated!
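For reference, a minimal sketch of one way to get results back from a background invocation without touching the host's runspace: build the runspace from an InitialSessionState that imports the module. The module name and the environment variable below are placeholders, since the question leaves them out.
# Sketch only; 'MyCustomModule' and $env:USERDNSDOMAIN are placeholders.
$iss = [System.Management.Automation.Runspaces.InitialSessionState]::CreateDefault()
$iss.ImportPSModule('MyCustomModule')
$rs = [runspacefactory]::CreateRunspace($iss)
$rs.Open()
$ps = [PowerShell]::Create()
$ps.Runspace = $rs
$null = $ps.AddScript({ Get-CustomADDomain -dnsdomain $env:USERDNSDOMAIN })
$handle = $ps.BeginInvoke()
# ... other work runs here in parallel ...
$results = $ps.EndInvoke($handle)   # EndInvoke returns the script's output objects
$ps.Dispose(); $rs.Close()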

A new job or runspace isn't going to inherit functions from a module that was imported into the current session. That being said, you don't have to import the entire module. If you've got specific functions in the current session you need to have available in the job, you can add just those functions like this:
function test_function {'This is a test'}
function test_function2 {'This is also a test'}
$job_functions = 'test_function','test_function2'
$init = [scriptblock]::Create(
$(foreach ($job_function in $job_functions)
{
@"
function $job_function
{$((get-item function:$job_function).definition)}
"@
}))
$init
function test_function
{'This is a test'}
function test_function2
{'This is also a test'}
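The excerpt stops at inspecting $init; as a hedged sketch of the next step, the generated scriptblock can be handed to -InitializationScript so the job starts with just those functions defined:
$job = Start-Job -InitializationScript $init -ScriptBlock { test_function; test_function2 }
Receive-Job -Job $job -Wait -AutoRemoveJob   # prints: This is a test / This is also a test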


Azure function run code on startup for Node

I am developing a chatbot using Azure Functions. I want to load some of the conversations for the chatbot from a file, and I am looking for a way to load this conversation data before the function app starts, with some function callback. Is there a way to load the conversation data only once, when the function app is started?
This question is actually a duplicate of Azure Function run code on startup, but that question is asked for C#, and I wanted a way to do the same thing in Node.js.
After like a week of messing around I got a working solution.
First some context:
The question at hand: running custom code at app start for Node.js Azure Functions.
The issue is currently being discussed here and has been open for almost 5 years, and doesn't seem to be going anywhere.
As of now there is an Azure Functions "warmup" trigger feature, found here: AZ Funcs Warm Up Trigger. However, this trigger only runs on scale-out, so the first, initial instance of your app won't run the "warmup" code.
Solution:
I created a start.js file and put the following code in there
const ErrorHandler = require('./Classes/ErrorHandler');
const Validator = require('./Classes/Validator');
const delay = require('delay');
let flag = false;

module.exports = async () =>
{
    console.log('Initializing Globals')
    global.ErrorHandler = ErrorHandler;
    global.Validator = Validator;
    // this is just to test if it will work with async funcs
    const wait = await delay(5000)
    // add additional logic...
    // await db.connect(); etc // initialize a db connection
    console.log('Done Waiting')
}
To run this code I just have to do
require('../start')();
in any of my functions. Just one function is fine. Since all of the function dependencies are loaded when you deploy your code, as long as this line is in one of the functions, start.js will run and initialize all of your global/singleton variables, or whatever else you want it to do on function start. I made a literal function called "startWarmUp", and it is just a timer-triggered function that runs once a day.
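For illustration, a minimal sketch (folder and file names are hypothetical) of what such a timer-triggered function could look like:
// startWarmUp/index.js -- hypothetical timer-triggered function
require('../start')();   // runs start.js once per worker process when this module loads

module.exports = async function (context, myTimer) {
    context.log('Warm-up function ran; globals are initialized.');
};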
My use case is that almost every function relies on ErrorHandler and Validator class. And though generally making something a global variable is bad practice, in this case I didn't see any harm in making these 2 classes global so they're available in all of the functions.
Side Note: when developing locally you will have to include that function in your func start --functions <function requiring start.js> <other funcs> in order to have that startup code actually run.
Additionally, there is a feature request for this functionality that can be voted on here: Azure Feedback
I have a similar use case that I am also stuck on.
Based on this resource I have found a good way to approach the structure of my code. It is simple enough: you just need to run your initialization code before you declare your module.exports.
https://github.com/rcarmo/azure-functions-bot/blob/master/bot/index.js
I also read this thread, but it does not look like there is a recommended solution.
https://github.com/Azure/azure-functions-host/issues/586
However, in my case I have an additional complication in that I need to use promises, as I am waiting on external services to come back. These promises run within bot.initialise(). initialise() only seems to run when the first call to the bot occurs. Which would be fine, but as it is running a promise, my code doesn't block, which means that when it calls listener(req, context.res), the listener doesn't yet exist.
The next thing I will try is to restructure my code so that bot.initialise returns a promise, but the code would be much simpler if there were an initialisation webhook that guaranteed that the code within it was executed at startup, before everything else.
Has anyone found a good workaround?
My code looks something like this:
var listener = null;

if (process.env.FUNCTIONS_EXTENSION_VERSION) {
    // If we are inside Azure Functions, export the standard handler.
    listener = bot.initialise(true);
    module.exports = function (context, req) {
        context.log("Passing body", req.body);
        listener(req, context.res);
    }
} else {
    // Local server for testing
    listener = bot.initialise(false);
}
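For what it's worth, a rough sketch of the restructuring mentioned above, assuming bot.initialise() can be changed to return a promise that resolves to the listener:
var listenerPromise = bot.initialise(!!process.env.FUNCTIONS_EXTENSION_VERSION);

module.exports = async function (context, req) {
    var listener = await listenerPromise;   // resolves once; reused on later invocations
    context.log("Passing body", req.body);
    listener(req, context.res);
};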
You can use a module-level variable to load data once, before function execution.
var data = [1, 2, 3];

module.exports = function (context, req) {
    context.log(data[0]);
    context.done();
};
The data variable is initialized only once, when the module is first loaded, and is reused across function invocations.
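Applied to the question, a minimal sketch (the file name conversations.json is an assumption) of loading conversation data once at module load:
const fs = require('fs');
const path = require('path');
// Read once when the module is first loaded, then reuse on every invocation.
const conversations = JSON.parse(
    fs.readFileSync(path.join(__dirname, 'conversations.json'), 'utf8'));

module.exports = function (context, req) {
    context.log('Loaded ' + conversations.length + ' conversation entries');
    context.done();
};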

Use Node.js as Shell

How might I set up node.js as a shell replacement for bash? For example I should be able to run vi('file') to open a file and cd('location') to change between directories.
Is this even possible?
Sure you can! It will become much less straightforward to use your computer, though.
First off, you will need to know how to set this up. While you could likely set your user shell in Linux to /usr/bin/node, this will leave you with only a Node.js REPL with no additional programs set up. What you're going to want to do is write a setup script that can do all of the below setup/convenience steps for you, essentially something that ends with repl.start() to produce a REPL after setting everything up. Of course, since UNIX shell settings can't specify arguments, you will need to write a small C program that executes your shell with those arguments (essentially, exec("/usr/bin/node", "path/to/setup/script.js");) and set that as your UNIX shell.
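As a rough illustration of that wrapper (the script path is a placeholder; this is a sketch, not a vetted login shell):
/* exec node with the setup script baked in, since the login-shell entry
   in /etc/passwd cannot carry arguments */
#include <unistd.h>

int main(void) {
    execl("/usr/bin/node", "node", "/path/to/setup/script.js", (char *)NULL);
    return 1; /* reached only if execl fails */
}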
The main idea here is that any commands that you use beyond the very basics must be require()d into your shell - e.g. to do anything with your filesystem, execute
var fs = require("fs")
and do all of your filesystem calls from the fs object. This is analogous to adding things to your PATH. You can get basic shell commands by using shelljs or similar, and to get at actual executable programs, use Node's built-in child_process.spawnSync for a foreground task or child_process.spawn for a background task.
Since part of your requirement is that you want to call each of your programs like a function, you will need to produce these functions yourself, getting something like:
const child_process = require('child_process');   // needed for spawnSync below

function ls(path) {
    child_process.spawnSync('/bin/ls', [path], { stdio: 'inherit' });
}
for everything you want to run. You can probably do this programmatically by iterating through all the entries in your PATH and using something involving eval() or new Function() to generate execute functions for each, assigning them to the global object so that you don't have to enter any prefixes.
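A rough sketch of that approach, avoiding eval() by skipping names that are not valid identifiers; it assumes every PATH entry is a readable directory of executables:
const fs = require('fs');
const path = require('path');
const child_process = require('child_process');

// Create a global wrapper function for every executable found on PATH.
for (const dir of (process.env.PATH || '').split(path.delimiter)) {
    let names = [];
    try { names = fs.readdirSync(dir); } catch (err) { continue; }
    for (const name of names) {
        if (!/^[A-Za-z_$][A-Za-z0-9_$]*$/.test(name) || global[name]) continue;
        const full = path.join(dir, name);
        global[name] = (...args) =>
            child_process.spawnSync(full, args.map(String), { stdio: 'inherit' });
    }
}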
Again, it will become much less straightforward to use your computer, despite having these named functions. Lots of programs that cheat and use bash commands in the background will likely no longer work. But I can certainly see the appeal of being able to leverage JavaScript in your command-line environment.
ADDENDUM: For writing this setup script, the REPLServer object returned by repl.start() has a context object that is the same as the global object accessible to the REPL session it creates. When you write your setup script, you will need to assign everything to the context object:
const context = repl.start('> ').context;
context.ls = function ls(path) { /* . . . */ }
context.cd = function cd(path) { /* . . . */ }
I think it would be an interesting proposition. Create a test account and tell it to use node as its shell. See 'man useradd' for all the options.
$ useradd -s /usr/bin/node test
$ su - test
This works on macOS and Linux.
require('child_process').spawnSync('vi', ['file.txt'], { stdio: 'inherit' })
You could bootstrap a repl session with your own commands, then run the script
#!/bin/bash
node --experimental-repl-await -i -e "$(< scripts/noderc.js)"
This allows for things like:
> ls()
> run('vi','file.txt')
> await myAsyncFunc()
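scripts/noderc.js is not shown here; a minimal sketch of what it might contain to support the calls above:
// scripts/noderc.js (sketch)
const { spawnSync } = require('child_process');

global.run = (cmd, ...args) => spawnSync(cmd, args, { stdio: 'inherit' });
global.ls  = (dir = '.') => global.run('ls', dir);
global.cd  = (dir) => process.chdir(dir);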
I think you're looking for something like this: https://youtu.be/rcwcigtOwQ0 !
If so... YES, you can!
If you like, I can share my code, but I need to fix some bugs first!
Tell me if you'd like that.
my .sh function:
const hey = Object.create(null),
      sh = Object.create(null);
hey.shell = Object.create(null);
hey.shell.run = require('child_process').exec;

sh.help = 'Execute an OS command';
sh.action = (...args) => {
    // repl_ is the replServer
    // the runningExternalProgram property is one way to know if I should
    // render the prompt and is not needed. I will create a better
    // way to do this (action without if/decision)!
    repl_.runningExternalProgram = true;
    hey.shell.run(args.join(' '),
        (...args) => {
            ['error', 'log'].forEach((command, idx) => {
                if (args[idx]) {
                    console[command](args[idx]);
                }
            });
            repl_.runningExternalProgram = false;
        });
};
PS: to 'cd' into some directory, you just need to change the current working directory with process.chdir() (process.cwd() only reads it).
PS2: to avoid needing to write .sh for every OS program/command, you can use a Proxy on the global object.
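A sketch of one way to approximate that Proxy idea; it exposes a sh proxy object instead of trapping the real global object, which Node does not let you replace:
const { spawnSync } = require('child_process');

// sh.anything('args...') runs "anything" as an external program.
const sh = new Proxy({}, {
    get: (_target, command) => (...args) =>
        spawnSync(String(command), args.map(String), { stdio: 'inherit' }),
});

// Usage in the REPL:  sh.ls('-la');  sh.git('status');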

node.js multithreading with max child count

I need to write a script that takes an array of values and, in a multithreaded way (forks?), runs another script with a value from the array as a parameter, but with a maximum number of running forks, so it would wait for a script to finish if there are more than n running already. How do I do that?
There is a module named child_process, but I'm not sure how to get it done, as it always waits for child termination.
Basically, in PHP it would be something like this (wrote it from memory, it may contain some syntax errors):
<?php
declare(ticks = 1);

$data = file('data.txt');
$max = 20;
$child = 0;

function sig_handler($signo) {
    global $child;
    switch ($signo) {
        case SIGCHLD:
            $child -= 1;
    }
}

pcntl_signal(SIGCHLD, "sig_handler");

foreach ($data as $dataline) {
    $dataline = trim($dataline);
    while ($child >= $max) {
        sleep(1);
    }
    $child++;
    $pid = pcntl_fork();
    if ($pid) {
        // SOMETHING WENT WRONG? NEVER HAPPENS!
    } else {
        exec("php processdata.php \"$dataline\"");
        exit;
    } //fork
}

while ($child != 0) {
    sleep(1);
}
?>
After the conversation in the comments, here's how to have Node execute your PHP script.
Since you're calling an external command, there's no need to create a new thread. The Node.js event loop understands that calls to external commands are async operations, and it can execute all of them at the same time.
You can see different ways for executing an external process in this SO question (linked answer may be the best in your case).
However, since you're already moving everything to Node, you may even consider rewriting your "process.php" script to Node.js code. Since, as you explained, that script connects to remote servers and databases and uses nslookup (which you may not really need with Node.js), you won't need any separate thread: they're all async operations that Node.js excels at performing.
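To make the advice concrete, here is a minimal sketch (not from the original answer) of the PHP loop above translated to Node.js, keeping the cap of 20 concurrent children:
const fs = require('fs');
const { execFile } = require('child_process');

const MAX_CHILDREN = 20;
const lines = fs.readFileSync('data.txt', 'utf8').split('\n')
    .map((l) => l.trim()).filter(Boolean);

let running = 0;

function startNext() {
    // Launch children until the cap is reached or the queue is empty.
    while (running < MAX_CHILDREN && lines.length > 0) {
        const dataline = lines.shift();
        running++;
        execFile('php', ['processdata.php', dataline], (err) => {
            if (err) console.error(err);
            running--;
            startNext();   // a slot freed up; launch the next one
        });
    }
}

startNext();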

is it possible to call lua functions defined in other lua scripts in redis?

I have tried to declare a function without the local keyword and then call that function from another script, but it gives me an error when I run the command.
test = function ()
    return 'test'
end

-- from some other script
test()
Edit:
I can't believe I still have no answer to this. I'll include more details of my setup.
I am using node with the redis-scripto package to load the scripts into redis. Here is an example.
var Scripto = require('redis-scripto');
var scriptManager = new Scripto(redis);   // 'redis' here is an already-created redis client
scriptManager.loadFromDir('./lua_scripts');

var keys = [key1, key2];
var values = [val];
scriptManager.run('run_function', keys, values, function(err, result) {
    console.log(err, result);
});
And the lua scripts.
-- ./lua_scripts/dict_2_bulk.lua
-- turns a dictionary table into a bulk reply table
dict2bulk = function (dict)
    local result = {}
    for k, v in pairs(dict) do
        table.insert(result, k)
        table.insert(result, v)
    end
    return result
end

-- run_function.lua
return dict2bulk({ test=1 })
Throws the following error.
[Error: ERR Error running script (call to f_d06f7fd783cc537d535ec59228a18f70fccde663): #enable_strict_lua:14: user_script:1: Script attempted to access unexisting global variable 'dict2bulk' ] undefined
I'm going to be contrary to the accepted answer, because the accepted answer is wrong.
While you can't explicitly define named functions, you can call any script that you can call with EVALSHA. More specifically, all of the Lua scripts that you have explicitly defined via SCRIPT LOAD or implicitly via EVAL are available in the global Lua namespace at f_<sha1 hash> (until/unless you call SCRIPT FLUSH), which you can call any time.
The problem that you run into is that the functions are defined as taking no arguments, and the KEYS and ARGV tables are actually globals. So if you want to be able to communicate between Lua scripts, you either need to mangle your KEYS and ARGV tables, or you need to use the standard Redis keyspace for communication between your functions.
127.0.0.1:6379> script load "return {KEYS[1], ARGV[1]}"
"d006f1a90249474274c76f5be725b8f5804a346b"
127.0.0.1:6379> eval "return f_d006f1a90249474274c76f5be725b8f5804a346b()" 1 "hello" "world"
1) "hello"
2) "world"
127.0.0.1:6379> eval "KEYS[1] = 'blah!'; return f_d006f1a90249474274c76f5be725b8f5804a346b()" 1 "hello" "world"
1) "blah!"
2) "world"
127.0.0.1:6379>
All of this said, this is in complete violation of spec, and is entirely possible to stop working in strange ways if you attempt to run this in a Redis cluster scenario.
Important Notice: See Josiah's answer below. My answer turns out to be wrong, or at the least incomplete. Which makes me very happy, of course; it makes Redis all the more flexible.
My incorrect/incomplete answer:
I'm quite sure this is not possible. You are not allowed to use global variables (read the docs), and the script itself gets a local and temporary scope from the Redis Lua engine.
Lua functions automatically set a 'writing' flag behind the scenes if they do any write action. This starts a transaction. If you cascade Lua calls, the bookkeeping in Redis would become very cumbersome, especially when the cascade is executed on a Redis slave. That's why EVAL and EVALSHA are intentionally not made available as valid Redis calls inside a Lua script. Same goes for calling an already 'loaded' Lua function which you are trying to do. What would happen if the slave is rebooted between the load of the first script and the exec of the second script?
What we do to overcome this limitation:
Don't use EVAL, only use SCRIPT LOAD and EVALSHA.
Store the SHA1 inside a redis hash set.
We automated this in our versioning system, so a committed Lua script automatically gets its SHA1 checksum stored in the Redis master, in a hash set, with a logical name. The clients can't use EVAL (on a slave; we disabled EVAL+LOAD in config), but a client can ask for the SHA1 for the next step. Almost all our Lua functions return a SHA1 for the next call.
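Illustrative only (the hash key name and the <sha1> placeholder are made up), the registry idea looks roughly like this:
127.0.0.1:6379> SCRIPT LOAD "return 1"
"<sha1>"
127.0.0.1:6379> HSET lua:scripts return_one <sha1>
(integer) 1
127.0.0.1:6379> HGET lua:scripts return_one
"<sha1>"
127.0.0.1:6379> EVALSHA <sha1> 0
(integer) 1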
Hope this helps, TW
Because I'm not one to leave well enough alone, I built a package that allows for simple internal calling semantics. The package (for Python) is available on GitHub.
Long story short, it uses ARGV as a call stack, translates KEYS/ARGV references to _KEYS and _ARGV, uses Redis as a name -> hash mapping internally, and translates CALL.<name>(<keys>, <argv>) to a table append + Redis lookup + Lua function call.
The METHOD.txt file describes what goes on, and all of the regular expressions I used to translate the Lua scripts are available in lua_call.py. Feel free to re-use my semantics.
The use of the function registry makes this very unlikely to work in Redis cluster or any other multi-shard setup, but for single-master applications, it should work for the foreseeable future.

Writing a persistent perl script

I am trying to write a persistent/cached script. The code would look something like this:
...
memoize('process_file');
print process_file($ARGV[0]);
...
sub process_file {
    my $filename = shift;
    my ($a, $b, $c) = extract_values_from_file($filename);
    if (exists $my_hash{$a}{$b}{$c}) {
        return $my_hash{$a}{$b}{$c};
    }
    return $default;
}
Which would be called from a shell script in a loop as follows
value=`perl my_script.pl`;
Is there a way I could call this script in such a way that it keeps its state from call to call? Let's assume that both initializing %my_hash and calling extract_values_from_file are expensive operations.
Thanks
This is kind of dark magic, but you can store state after your script's __DATA__ token and persist it.
use Data::Dumper;    # or JSON, YAML, or any other data serializer
package MyPackage;
my $DATA_ptr;
our $state;
INIT {
    $DATA_ptr = tell DATA;
    $state = eval join "", <DATA>;
}

...
manipulate $MyPackage::state in this and other scripts
...

END {
    open DATA, '+<', $0;    # $0 is the name of this script
    seek DATA, $DATA_ptr, 0;
    print DATA Data::Dumper::Dumper($state);
    truncate DATA, tell DATA;    # in case new data is shorter than old data
    close DATA;
}

__DATA__
$VAR1 = {
    'foo' => 123,
    'bar' => 42,
    ...
}
In the INIT block, store the position of the beginning of your file's __DATA__ section and deserialize your state. In the END block, you reserialize the current state and overwrite the __DATA__ section of your script. Of course, the user running the script needs to have write permission on the script.
Edited to use INIT block instead of BEGIN block -- the DATA block is not set up during the compile phase.
If %my_hash in your example has a moderate size in its final initialized state, you can simply use one of the serialization modules like Storable, JSON::XS or Data::Dumper to keep your data in pre-assembled form between runs. Generate the file when it is absent, and just reload the ready content from there when it is present.
Also, you've mentioned that you would call this script in loops. A good strategy would be to not call the script right away inside the loop, but to build a queue of arguments instead and then pass all of them to the script after the loop, in a single execution. The script would set up its environment once and then loop over the arguments, doing its easy work without needing to redo the setup steps for each of them.
You can't get the script to keep state. As soon as the process exits, any information not written to disk is gone.
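A minimal sketch with Storable (the cache file name and the build_my_hash() helper are illustrative):
use strict;
use warnings;
use Storable qw(store retrieve);

my $cache_file = 'my_hash.storable';
my %my_hash;

if (-e $cache_file) {
    %my_hash = %{ retrieve($cache_file) };   # reload the pre-assembled data
} else {
    %my_hash = build_my_hash();              # hypothetical expensive initialization
    store(\%my_hash, $cache_file);
}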
There are a few ways you can accomplish this though:
Write a daemon which listens on a network or unix socket. The daemon can populate my_hash and answer questions sent from a very simple my_script.pl; it'd only have to open a connection to the daemon, send the question and return the answer (a rough sketch of the daemon side follows after this list).
Create an efficient look-up file format. If you need the information often it'll probably stay in the VFS cache anyway.
Set up a shared memory region. The first time your scripts starts you save the information there, then re-use it later. That might be tricky from a Perl script though.
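A rough sketch of that daemon idea (the socket path, the one-line protocol and the build_my_hash() helper are all assumptions):
use strict;
use warnings;
use IO::Socket::UNIX;

my %my_hash = build_my_hash();   # hypothetical expensive setup, done once

my $sock_path = '/tmp/my_script.sock';
unlink $sock_path;
my $server = IO::Socket::UNIX->new(
    Type   => SOCK_STREAM(),
    Local  => $sock_path,
    Listen => 5,
) or die "cannot create socket: $!";

while (my $client = $server->accept) {
    chomp(my $question = <$client>);          # e.g. "a b c"
    my ($a, $b, $c) = split ' ', $question;
    my $answer = $my_hash{$a}{$b}{$c};
    print {$client} (defined $answer ? $answer : 'default'), "\n";
    close $client;
}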
No. Not directly, but it can be achieved in very many ways.
1) I understand extract_values_from_file() parses the given file, returning a hash.
2) Step 1 can be made into its own script, which dumps the parsed hash into a file using Data::Dumper.
3) When running my_script.pl, ensure that the file generated by step 2 is newer than the config file. This can be achieved via make.
3.1) Use the file generated by step 2 to retrieve the values.
The same can be achieved via freeze/thaw
