Hi, I need to send a WMI query to each system in a domain (potentially thousands), and WMI queries seem to take a long time to return. So I am reviewing the best way to send multiple requests on multiple threads, so the process can run in the background and the calls can overlap.
I like the features that BackgroundWorker offers, and I read HERE that it uses the ThreadPool under the covers. I don't really understand, though, how I would leverage this to serve my purposes. It seems that if I had to send 1000 queries, I could do a loop in which I invoke a new BG worker for each query, the ThreadPool would use up to 25(?) threads at one time, and the remaining 975 requests would be queued. Is that what happens?
If this is right, I imagine the process of queuing up 1000 requests will itself freeze the UI, so should the queuing loop run in another BG worker?
Is there a problem with invoking other worker threads from a worker thread?
Should I instead create only, say, 20 BG workers and manually launch another whenever one completes?
Am I understanding this right? Any advice would be much appreciated!
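For illustration, the "queue them all up front and let the pool throttle" approach I have in mind might look like this minimal sketch (QueryHost is a hypothetical placeholder for the actual WMI call, and CountdownEvent requires .NET 4 or later):

using System;
using System.Collections.Generic;
using System.Threading;

class QueueDemo
{
    static void Main()
    {
        var hosts = new List<string> { "pc1", "pc2" }; // imagine ~1000 entries
        using (var done = new CountdownEvent(hosts.Count))
        {
            foreach (var host in hosts)
            {
                // Every item is queued immediately; the pool decides how many
                // run concurrently and holds the rest in its internal queue.
                ThreadPool.QueueUserWorkItem(state =>
                {
                    try { QueryHost((string)state); } // hypothetical WMI call
                    finally { done.Signal(); }
                }, host);
            }
            done.Wait(); // blocks, so run this whole loop off the UI thread
        }
    }

    static void QueryHost(string host) { /* WMI query goes here */ }
}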
I use the Parallel.ForEach method found in the System.Threading.Tasks namespace.
Make a List<string> containing all the host names you want to query. Then make a method that takes a string as its input, queries that host, and does whatever you want with the data.
Pass them to the ForEach method like this:
Parallel.ForEach(ComputerList, QueryAComputer);
and let it rip. Be sure to call Dispose() on your ManagementObjects as soon as you no longer need them. I think there's some kind of issue that causes WMI to break when too many queries are performed at once; Dispose() should help release those resources and prevent deadlock.
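For instance, a minimal end-to-end sketch might look like the following, assuming Win32_OperatingSystem as an example class to query and a placeholder host list (it needs a project reference to System.Management):

using System;
using System.Collections.Generic;
using System.Management;
using System.Threading.Tasks;

class WmiScan
{
    // Queries one host; the searcher, the result collection, and each
    // ManagementObject are disposed promptly so WMI resources are released.
    static void QueryAComputer(string host)
    {
        var scope = new ManagementScope($@"\\{host}\root\cimv2");
        var query = new ObjectQuery("SELECT Caption FROM Win32_OperatingSystem");
        using (var searcher = new ManagementObjectSearcher(scope, query))
        using (var results = searcher.Get())
        {
            foreach (ManagementObject mo in results)
            {
                using (mo)
                {
                    Console.WriteLine($"{host}: {mo["Caption"]}");
                }
            }
        }
    }

    static void Main()
    {
        var computerList = new List<string> { "host1", "host2" }; // fill from AD, a file, etc.
        // Optionally cap concurrency so thousands of hosts don't hit WMI at once.
        var options = new ParallelOptions { MaxDegreeOfParallelism = 20 };
        Parallel.ForEach(computerList, options, QueryAComputer);
    }
}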
I am trying to work out how to bulk-index records into Elasticsearch using the bulk function, and I need to use threads to get some performance out of it. But I am stuck trying to work out how to limit it to 5 concurrent threads so it's not too heavy on Elasticsearch.
I was thinking of just looping over the db and filling a list, then when it hits e.g. 50 records, pushing it to a thread for processing and continuing. But this method will spawn too many threads, and I cannot see an obvious way to limit the threads without waiting for all of them to finish before adding another.
I have done this in golang before, where you can just add threads and, when it hits the limit, it will just wait before adding more to the queue, but this seems a little more elusive in Python so far.
I am open to alternatives, but this seems like the cleanest way to go so far. There might be better methods, though, like db -> queue with a limit, then threads consuming from the queue?
Looking forward to some responses.
I have a Node program that does a lot of heavy synchronous work. The work could easily be split into several parts. I would like to utilize all processor cores on my machine for this. Is this possible?
From the docs on child processes and clusters, I see no obvious solution. Child processes seem to be focused on running external programs, and clusters only work for incoming HTTP connections (or have I misunderstood that?).
I have a simple function, var output = fn(input), and would just like to run it several times, spreading the calls across the cores of my machine and providing each result in a callback. Can that be done?
Yes, child processes and clusters are the way to do that. There are a couple of ways of implementing a solution to your problem.
Your server creates a queue and manages that queue. Whenever you need to call your function, you will drop it into the queue. You will then process the queue N items at a time, where N equals the number of your cores. When you start processing, you will spawn a child process, probably either using spawn or exec, with the argument being another standalone Node.js script, along with any additional parameters (it's just a command line call, basically). Inside that script you will do your work, and emit the result back to the server. The worker is then freed up.
You can create a dedicated server with cluster, where all it will do is run your function. With the cluster module, you can (once again) create N workers and delegate work to these workers.
Now this may seem like a lot of work, and it is. For that reason you should use an existing library, as this is, for the most part, a solved problem at this point. I really like Redis-based queues, so if you're interested in that, see this answer for some queue recommendations.
I am using X_Trader (the TT API) to get the prices for some products. Whenever a new product price arrives, I call a method inside a thread to update/insert the database:
new Thread(() => { /* call method to insert/update */ }).Start();
The prices for a product arrive asynchronously via a callback method. What could be causing the delay in inserting into the database? Is there anything specific I am missing?
You should definitely think about another way of handling those product updates.
If you get a high frequency of updates, you will spawn a lot of threads, which consumes a lot of resources (memory, CPU, etc.).
Take a look at the Task Parallel Library (TPL), which uses a thread pool under the covers for optimized use of many parallel tasks.
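As a rough sketch of that route, the per-update thread can be replaced with a pooled task (UpsertPrice is a hypothetical name for your insert/update method):

using System.Threading.Tasks;

// Runs the database work on a ThreadPool thread instead of spawning a new one.
Task.Run(() => UpsertPrice(product, price));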
You can also collect the updates in a stack or queue and write them to the database once every 5 minutes (or less). This saves spawning many threads and won't hammer the database with many single update queries in a short span of time.
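And a minimal sketch of the batching idea, with PriceUpdate and SaveBatch as hypothetical placeholders:

using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;

class PriceBatcher
{
    private readonly ConcurrentQueue<PriceUpdate> _pending = new ConcurrentQueue<PriceUpdate>();
    private readonly Timer _flushTimer;

    public PriceBatcher(TimeSpan interval)
    {
        // Write the buffered updates out on a timer instead of once per tick.
        _flushTimer = new Timer(_ => Flush(), null, interval, interval);
    }

    // Called from the price callback; cheap, and never blocks on the database.
    public void Enqueue(PriceUpdate update) => _pending.Enqueue(update);

    private void Flush()
    {
        var batch = new List<PriceUpdate>();
        while (_pending.TryDequeue(out var update)) batch.Add(update);
        if (batch.Count > 0)
            SaveBatch(batch); // hypothetical: one bulk insert/update round trip
    }

    private void SaveBatch(List<PriceUpdate> batch) { /* bulk write here */ }
}

class PriceUpdate { /* product id, price, timestamp, ... */ }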
I am building a simple application to download a set of XML files and parse them into a database using the async module (https://npmjs.org/package/node-async) for flow control. The overall flow is as follows:
1. Download the list of datasets from the API (single Request call)
2. Download the metadata for each dataset to get the link to its XML file (async.each)
3. Download the XML for each dataset (async.parallel)
4. Parse the XML for each dataset into JSON objects (async.parallel)
5. Save each JSON object to a database (async.each)
In effect, for each dataset there is a parent process (2) which sets off a series of asynchronous child processes (3, 4, 5). The challenge I am facing is that, because so many parent processes fire before all of the children of a particular process are complete, child processes seem to get queued up in the event loop, and it takes a long time for all of the child processes of a particular parent to resolve and allow garbage collection to clean everything up. The result is that, even though the program doesn't appear to have any memory leaks, memory usage is still too high, ultimately crashing the program.
One solution which worked was to make some of the child processes synchronous so that they can be grouped together in the event loop. However, I have also seen an alternative solution discussed here: https://groups.google.com/forum/#!topic/nodejs/Xp4htMTfvYY, which pushes parent processes into a queue and only allows a certain number to run at once. My question, then, is: does anyone know of a more robust module for handling this type of queueing, or any other viable alternative for this kind of flow control? I have been searching, but so far no luck.
Thanks.
I decided to post this as an answer:
Don't launch all of the processes at once. Let the callback of one request launch the next one. The overall work is still asynchronous, but each request runs in series. You can then pool up a certain number of connections to run simultaneously and maximize I/O throughput. Look at async.eachLimit and replace each of your async.each calls with it.
Your async.parallel calls may be causing issues as well.
I have a worker thread in a class that is owned by a ChildView. (I intend to move this to the Doc eventually.) When the worker thread completes a task, I want all the views to be updated. How can I tell the Doc to issue an UpdateAllViews()? Or is there a better approach?
Thank you.
Added by OP: I am looking for a simple solution. The app is running on a single-user, single-CPU computer and does not need network (or Internet) access. There is nothing to cause a deadlock.
I think I would like to have the worker thread post (or send) a message to cause the views to update.
Everything I read about threading seems way more complicated than what I need - and, yes, I understand that all those precautions are necessary for applications that are running in multiprocessor, multiuser, client-server systems, etc. But none of those apply in my situation.
I am just stuck on getting the right combination of obtaining the window handle, posting the message, and responding to the message in the right functions and classes, so that it compiles and works at all.
UpdateAllViews is not thread-safe, so you need to marshal the call to the main thread.
I suggest signalling a manual-reset event to mark your thread's completion and checking the event's status in a WM_TIMER handler.
Suggested reading:
First Aid for the Thread-Impaired: Using Multiple Threads with MFC
More First Aid for the Thread-Impaired: Cool Ways to Take Advantage of Multithreading