How to add Azure durableActivity to batch

$batch = @(
    Invoke-DurableActivity -FunctionName "F_ComputeEngineTasks" -Input (ConvertTo-Json $computeEngineTasksInput -Depth 3) -NoWait
    Invoke-DurableActivity -FunctionName "F_UpdateInstancesAndMembers" -Input (ConvertTo-Json $updateInstancesAndMembersInput) -NoWait
    foreach ($contactId in $members.contacts.modified.Keys) {
        $taskInput = @{ contactId = $contactId; contactDetails = $members.contacts.modified.$contactId }
        Invoke-DurableActivity -FunctionName "F_UpdateContactInfo" -Input (ConvertTo-Json $taskInput -Depth 3) -NoWait
    }
)
$output = Wait-ActivityFunction -Task $batch
I want to execute F_ComputeEngineTasks, F_UpdateInstancesAndMembers, and all the F_UpdateContactInfo activities in parallel.
I tried to make a batch and execute it. The thing is, I have no idea whether this is working or not (I get all the outputs of my functions, but I have no way to know whether they were executed at the same time).
If someone can tell me whether I am doing this wrong, I would gladly appreciate it.

Your code looks correct. Whether the activities actually run concurrently depends on many factors (scale-out, concurrency limits, workload, etc.). Strictly speaking, using Invoke-DurableActivity with the -NoWait switch means that the function author does not object to running these activities concurrently, but it does not guarantee that they will indeed run concurrently. It is still up to the Azure Functions platform to determine the execution order depending on the available resources and concurrency configuration settings (such as https://aka.ms/functions-powershell-concurrency).
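For instance (the setting names here are my assumption, so verify them against the linked doc), the PowerShell worker's concurrency can be raised through application settings:
# App settings (assumed names; see the linked doc):
PSWorkerInProcConcurrencyUpperBound = 10    # concurrent runspaces per worker process (defaults to 1)
FUNCTIONS_WORKER_PROCESS_COUNT = 4          # worker processes per host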
Perhaps the easiest way to check whether your activities run in parallel is to measure the time it takes to finish the entire orchestration and compare it with your expectations, as in the sketch below. You may need to insert artificial delays into your activities to measure confidently. But if you cannot measure the difference, does the answer really matter to you? :-)
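A minimal sketch of that measurement (the Start-Sleep goes inside each activity function; the timing wraps whatever client code starts the orchestration and waits for it to finish):
# Inside each activity function, add a fixed artificial delay:
Start-Sleep -Seconds 10

# Then time the orchestration end to end from the calling side:
$start = Get-Date
# ... start the orchestration and wait for it to complete ...
$elapsed = ((Get-Date) - $start).TotalSeconds
# ~10s total suggests the activities ran in parallel;
# ~30s suggests they ran one after another.
Write-Host "Orchestration took $elapsed seconds"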

Related

How to speed up Parallel::ForkManager in perl

I am using an EC2 Amazon server to perform data processing of 63 files.
The server I am using has 16 cores, but when I use Perl's Parallel::ForkManager with the number of processes equal to the number of cores, it seems like half the cores are sleeping and the working cores are not at 100%, fluctuating around 25%~50%.
I also checked IO and it is mostly idling.
use Sys::Info;
use Sys::Info::Constants qw( :device_cpu );
use Parallel::ForkManager;

my $info = Sys::Info->new;
my $cpu  = $info->device( CPU => %options );

my $manager=new Parallel::ForkManager($cpu->count);
for ($i = 0; $i <= $#files_l; $i++) {
    $manager->start and next;
    do_stuff($files_l[$i]);
    $manager->finish;
}
$manager->wait_all_children;
The short answer is: we can't tell you, because it depends entirely on what 'do_stuff' is doing.
The major reasons why parallel code doesn't produce linear speed increases are:
Process creation overhead - some 'work' is done to spawn a process, so if the children are trivially small, that 'wastes' effort.
Contended resources - the most common is disk IO, but things like file locks, database handles, sockets, or interprocess communication can also play a part.
Something else causing a 'back off' that stalls a process.
And without knowing what 'do_stuff' does, we can't second-guess what it might be.
However I'll suggest a couple of steps:
Double the number of processes to twice the CPU count. That's often a 'sweet spot', because it means that any non-CPU delay in one process just lets one of the others go full speed (see the sketch after this list).
Try strace -fTt <yourprogram> (if you're on Linux; the commands are slightly different on other Unix variants). Then do it again with strace -fTtc, because the c will summarise syscall run times. Look at which ones take the most 'time'.
Profile your code to see where the hot spots are. Devel::NYTProf is one library you can use for this.
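A minimal sketch of the first suggestion, reusing do_stuff and @files_l from the question:
use Parallel::ForkManager;

# Oversubscribe: twice as many workers as cores, so a worker that is
# blocked on IO does not leave a core idle.
my $manager = Parallel::ForkManager->new( 2 * $cpu->count );

foreach my $file (@files_l) {
    $manager->start and next;
    do_stuff($file);
    $manager->finish;
}
$manager->wait_all_children;
For the profiling suggestion, Devel::NYTProf is typically run as perl -d:NYTProf yourscript.pl, followed by nytprofhtml to generate a report.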
And on a couple of minor points:
my $manager=new Parallel::ForkManager($cpu->count);
would be better written as:
my $manager = Parallel::ForkManager->new( $cpu->count );
Rather than using indirect object notation.
If you are just iterating @files, it might be better not to use a loop count variable, and instead:
foreach my $file (@files) {
    $manager->start and next;
    do_stuff($file);
    $manager->finish;
}

How to deal with tasks running too long (comparing to others in job) in yarn-client?

We use a Spark cluster in yarn-client mode to run several business calculations, but sometimes we have a task that runs too long:
We don't set a timeout, but I think the default timeout for a Spark task is not as long as what we see here (1.7h).
Can anyone give me an idea for working around this issue?
There is no way for Spark to kill its tasks if they are taking too long.
But I figured out a way to handle this using speculation,
This means if one or more tasks are running slowly in a stage, they will be re-launched.
spark.speculation true
spark.speculation.multiplier 2
spark.speculation.quantile 0
Note: spark.speculation.quantile means the "speculation" will kick in from your first task, so use it with caution. I am using it because some jobs get slowed down by GC over time. So I think you should know when to use this - it's not a silver bullet.
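If you set these programmatically rather than in spark-defaults.conf, a sketch (assuming the Spark 1.x-era Scala API this thread is about) would be:
import org.apache.spark.{SparkConf, SparkContext}

// The same speculation settings applied through SparkConf
// before the context is created.
val conf = new SparkConf()
  .set("spark.speculation", "true")
  .set("spark.speculation.multiplier", "2")
  .set("spark.speculation.quantile", "0")
val sc = new SparkContext(conf)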
Some relevant links: http://apache-spark-user-list.1001560.n3.nabble.com/Does-Spark-always-wait-for-stragglers-to-finish-running-td14298.html and http://mail-archives.us.apache.org/mod_mbox/spark-user/201506.mbox/%3CCAPmMX=rOVQf7JtDu0uwnp1xNYNyz4xPgXYayKex42AZ_9Pvjug@mail.gmail.com%3E
Update
I found a fix for my issue (it might not work for everyone). I had a bunch of simulations running per task, so I added a timeout around the run. If a simulation takes longer (due to data skew for that specific run), it will time out.
import java.util.concurrent.*;

ExecutorService executor = Executors.newCachedThreadPool();
Callable<SimResult> task = () -> simulator.run();
Future<SimResult> future = executor.submit(task);
SimResult result = null;
try {
    result = future.get(1, TimeUnit.MINUTES);
} catch (TimeoutException ex) {
    future.cancel(true); // interrupts the simulation thread
    SPARKLOG.info("Task timed out");
} catch (InterruptedException | ExecutionException ex) {
    SPARKLOG.info("Task failed: " + ex);
}
Make sure you handle an interrupt inside the simulator's main loop, like:
if (Thread.currentThread().isInterrupted()) {
    throw new InterruptedException();
}
The trick here is to log in directly to the worker node and kill the process. Usually you can find the offending process with a combination of top, ps, and grep. Then just do a kill <pid>.
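For example (a sketch; the grep pattern depends on how your executors are launched):
# On the worker node: find the stuck executor's JVM and kill it.
ps aux | grep -i executor    # note the PID of the offending process
kill <pid>                   # escalate to kill -9 <pid> only if SIGTERM is ignored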

How can I improve performance with FutureTasks

The problem seems simple: I have a (huge) number of operations that I need to run, and the main thread can only proceed when all of those operations have returned their results. I tried running them in a single thread, and each operation took, let's say, from 2 to 10 seconds at most; in the end the whole run took about 2.5 minutes. Then I tried FutureTasks and submitted them all to the ExecutorService. All of them were processed at once, but each of them took, let's say, from 40 to 150 seconds. At the end of the day the full process took about 2.1 minutes.
If I'm right, all the threads were nothing but a way of executing everything at once while sharing the processor's power; what I thought I would get was the processor working heavily to execute all the tasks at the same time, each taking about the same time it takes when executed in a single thread.
The question is: is there a way I can achieve this? (Maybe not with FutureTasks, maybe with something else, I don't know.)
Detail: I don't need them to run at exactly the same time; that actually doesn't matter to me. What really matters is the performance.
You might have created way too many threads. As a consequence, the CPU was constantly switching between them, thus generating noticeable overhead.
You probably need to limit the number of running threads; then you can simply submit your tasks and they will execute concurrently.
Something like:
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.*;

ExecutorService es = Executors.newFixedThreadPool(8);
List<Future<?>> futures = new ArrayList<>(runnables.size());
for (Runnable r : runnables) {
    futures.add(es.submit(r)); // keep the Future so we can wait on it
}
// wait until they all finish:
for (Future<?> f : futures) {
    f.get(); // throws InterruptedException/ExecutionException; handle or declare
}
// all done
es.shutdown();

multithreading or forking in perl

In my Perl script I'm collecting a large data set that I later need to post to a server. Up to this point I'm good, but the post to the server takes a long time, so I need a threading/forking approach: one part posts, and in parallel I can dig out my second data set while the post to the server is taking place.
code snippet
if (system("curl -sS $post_url --data-binary \@$filename -H 'Content-type:text/xml;charset=utf-8' 1>/dev/null") != 0)
{
    exit_script(" xml: Error ", "Unable to update $filename xml on $post_url");
}
Can anyone tell me whether this is achievable with threading or forking?
It's difficult to give an answer to your question, because it depends.
Yes, Perl supports both forking and threading.
In general, I would suggest looking at threading for data-oriented tasks, and forking for almost anything else.
And so what you want to do is eminently achievable.
First you need to:
Encapsulate your tasks into subroutines. Get that working first. (This is very important - parallel stuff causes worlds of pain and is difficult to troubleshoot if you're not careful - get it working single-threaded first.)
Run your subroutines as threads, and capture their results.
Something like this:
use threads;

sub curl_update {
    my $result = system("your_curl_command");
    return $result;
}

# start the async curl
my $thr = threads->create( \&curl_update );

# do your other stuff....
sleep(60);

my $result = $thr->join();
if ($result) {
    # do whatever you would if the curl update failed
}
In this, the join is a blocking call - your main code will stop and wait for your thread to complete. If you want to do something more complicated, you can use is_running or is_joinable, which are non-blocking; see the sketch below.
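A minimal sketch of the non-blocking variant (do_other_work is a hypothetical placeholder for your data-digging code):
# Poll with is_running instead of blocking on join.
my $thr = threads->create( \&curl_update );
while ( $thr->is_running() ) {
    do_other_work();    # the main thread stays busy while the post runs
}
my $result = $thr->join();    # thread has finished, so this returns immediately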
I'd suggest neither.
You're just talking lots of HTTP. You can talk concurrent HTTP a lot more nicely, because it's just network IO, by using any of the asynchronous IO systems. Perl has many of them.
Principally I'd suggest IO::Async, but then I wrote it. You can use Net::Async::HTTP to make an HTTP hit. This will fully support doing many of them at once - many hundreds or thousands if need be.
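For instance, a sketch (here $post_url and @payloads stand in for whatever your script has collected; check the Net::Async::HTTP docs for the exact parameters):
use IO::Async::Loop;
use Net::Async::HTTP;
use Future;

my $loop = IO::Async::Loop->new;
my $http = Net::Async::HTTP->new( max_connections_per_host => 4 );
$loop->add($http);

# Fire off all the posts at once; each call returns a Future.
my @futures = map {
    $http->POST( $post_url, $_, content_type => 'text/xml;charset=utf-8' )
} @payloads;

# Block until every request has completed (or failed).
Future->needs_all(@futures)->get;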
Otherwise, you can also try either POE or AnyEvent, which will both support the same thing in their own way.

context switch measure time

I wonder if any of you know how to use the function get_timer() to measure the time for a context switch.
How do I find the average? When should I display it?
Could someone help me out with this? Is there any expert who knows this?
One fairly straightforward way would be to have two threads communicating through a pipe. One thread would do (pseudo-code):
for (n = 1000; n--;) {
    now = clock_gettime(CLOCK_MONOTONIC_RAW);
    write(pipe, now);
    sleep(1 msec); // to make sure that the other thread blocks again on the pipe read
}
Another thread would do:
context_switch_times[1000];
for (n = 1000; n--;) {
    time = read(pipe);
    now = clock_gettime(CLOCK_MONOTONIC_RAW);
    context_switch_times[n] = now - time;
}
That is, it measures the duration between the moment one thread writes the data into the pipe and the moment the other thread wakes up and reads it. A histogram of the context_switch_times array shows the distribution of context switch times.
The times include the overhead of the pipe read and write and of getting the time; still, they give a good sense of how big the context switch times are.
In the past I did a similar test using a stock Fedora 13 kernel and real-time FIFO threads. The minimum context switch times I got were around 4-5 usec.
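A runnable C sketch of the same idea (Linux-specific; error handling and CPU pinning omitted; build with cc -pthread):
#include <stdio.h>
#include <stdint.h>
#include <time.h>
#include <unistd.h>
#include <pthread.h>

static int fds[2];    /* fds[0] = read end, fds[1] = write end */

static uint64_t now_ns(void) {
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC_RAW, &ts);
    return (uint64_t)ts.tv_sec * 1000000000ULL + ts.tv_nsec;
}

static void *writer(void *arg) {
    for (int n = 0; n < 1000; n++) {
        uint64_t t = now_ns();
        write(fds[1], &t, sizeof t);    /* timestamp into the pipe */
        usleep(1000);                   /* 1 ms: let the reader block again */
    }
    return NULL;
}

int main(void) {
    pthread_t thr;
    pipe(fds);
    pthread_create(&thr, NULL, writer, NULL);
    for (int n = 0; n < 1000; n++) {
        uint64_t sent;
        read(fds[0], &sent, sizeof sent);
        /* wake-up latency: includes pipe and clock-read overhead */
        printf("%llu ns\n", (unsigned long long)(now_ns() - sent));
    }
    pthread_join(thr, NULL);
    return 0;
}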
I don't think we can actually measure this time from user space, since in the kernel you never know when your process will be picked up after its time slice expires. So whatever you get in user space includes scheduling delays as well. From user space you can get a closer measurement, but not always an exact one. Even a jiffy of delay matters.
I believe LTTng can be used to capture detailed traces of context switch timings, among other things.
