A specific process consumes a huge amount of CPU time when opening a file - windows-10

I have a program whose process name is OZReportDesigner_eform.exe.
When I try to open a specific file with this program, it consumes a huge amount of CPU time, about 50 seconds.
Using the Process Monitor tool, I found that it performs many Process Profiling operations while opening the file. I don't know what went wrong.
Any advice would be helpful.
Thanks.

Related

How to know how many resources a process uses over its entire execution time? linux

I would like to know if there is a program that can analyze how many resources it takes to execute a command,
for example, like this:
# magic_program python3 app.py
The program would tell you how many resources the execution used: CPU, memory, disk, network, etc.
In other words, something that watches over the program during its execution and then gives you a report. If it doesn't exist, I would love to build a project like this.
Questions
Does this magic program exist? If not, how feasible would it be to create one?
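There is no single standard "magic program" that reports all of these at once (tools like time, top, and iotop each cover a slice), but a small version of one is quite feasible with the psutil library. Below is a minimal sketch, not a finished tool: the script name and sampling interval are my own choices, and it only tracks CPU and peak memory.

    #!/usr/bin/env python3
    # magic_program.py -- hypothetical sketch: run a command, report its resource use.
    # Usage: python3 magic_program.py python3 app.py
    import sys
    import psutil

    def main() -> None:
        proc = psutil.Popen(sys.argv[1:])        # spawn the command under observation
        peak_rss = 0
        cpu_samples = []
        while proc.poll() is None:               # poll until the child exits
            try:
                cpu_samples.append(proc.cpu_percent(interval=0.5))
                peak_rss = max(peak_rss, proc.memory_info().rss)
            except psutil.NoSuchProcess:         # child exited between calls
                break
        avg_cpu = sum(cpu_samples) / len(cpu_samples) if cpu_samples else 0.0
        print(f"exit code : {proc.returncode}")
        print(f"avg CPU   : {avg_cpu:.1f} %")
        print(f"peak RSS  : {peak_rss / 1024**2:.1f} MiB")

    if __name__ == "__main__":
        main()

For CPU time and peak memory alone, GNU time (/usr/bin/time -v) already prints a report on Linux; the disk and network columns are the part you would have to build yourself, e.g. from psutil's io_counters().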

Python - Running multiple Python scripts that use multiprocessing affects performance and sometimes errors out

I have a Python script that uses multiprocessing to extract data from a DB2/Oracle database to CSV and ingest it into Snowflake. When I run this script on its own, performance is good (it extracts a large source table in 75 seconds). So I made copies of this script and changed the input parameters (basically different source tables). When I run all these scripts together, performance takes a hit (the same table now takes 100 seconds to extract) and sometimes I see the error 'Cannot allocate memory'.
I am using Jupyter Notebook, and all these different scripts extract different source tables to CSV files saved in the same server location.
I am also investigating on my own, but any help will be appreciated.
Thanks
Bala
If you are running multiple scripts that use multiprocessing and write to the same disk at the same time, you will eventually hit a bottleneck somewhere.
It could be concurrent access to the database, the writing speed of the disk, the amount of memory used, or CPU cycles. Which of these is the problem here is impossible to say without taking measurements.
But, for example, writing to an HDD is very slow compared to current CPU speeds.
Also, when you run multiple scripts that each use multiprocessing, you can end up with more worker processes than the CPU has cores, in which case some worker processes will always be waiting for CPU time.
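One concrete way to avoid that oversubscription is to size each script's pool explicitly instead of letting multiprocessing default to one worker per core in every script. A minimal sketch, where extract_table and the table list are hypothetical stand-ins for the real DB2/Oracle-to-CSV code:

    import os
    from multiprocessing import Pool

    N_CONCURRENT_SCRIPTS = 4        # assumption: how many script copies run at once

    def extract_table(table_name: str) -> str:
        """Placeholder for the real DB2/Oracle -> CSV extraction."""
        return f"{table_name}.csv"

    if __name__ == "__main__":
        TABLES = ["ORDERS", "CUSTOMERS", "LINEITEMS"]   # hypothetical table list
        # Give each script only its share of the cores, so the scripts
        # together do not spawn more workers than the machine has cores.
        workers = max(1, (os.cpu_count() or 1) // N_CONCURRENT_SCRIPTS)
        with Pool(processes=workers) as pool:
            for csv_path in pool.map(extract_table, TABLES):
                print("wrote", csv_path)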

100% CPU usage profile output: what could cause this, based on our profile log?

We have a massively scaled Node.js project (~1M+ users) that is suddenly taking a massive beating on our CPU (EPYC, 24 cores @ 2 GHz).
We've been trying to debug what's using all our CPU with a profiler (I can show you the output down below), and whatever it is, it's behaving really weirdly.
We have a master process that spawns 48 clusters; after they're all loaded, CPU usage slowly grows to the max. After killing a cluster, the load average doesn't dip at all. However, after killing the master process, everything goes back to normal.
The master process obviously isn't maxing all threads, and killing a cluster should REALLY do the trick?
We even stopped the application's user input and an entire cluster, and it didn't reduce CPU usage at all.
We've got plenty of log files we could send if you want them.
Based on the profile, it looks like the code is spending a lot of time getting the current time from the system. Do you maybe have Date.now() (or old-school, extra-inefficient +new Date()) calls around a bunch of frequently used, relatively quick operations? Try removing those; you should see a speedup (or a drop in CPU utilization, respectively).
As for stopping user input not reducing CPU load: are you maybe scheduling callbacks? Or promises, or other async requests? It's not difficult to write a program that only needs to be kicked off and then keeps the CPU busy forever on its own.
Beyond these rough guesses, there isn't enough information here to dig deeper. Is there anything other than time-related stuff on the profile, further down? In particular, any of your own code? What does the bottom-up profile say?
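For illustration only (the question is about Node.js, but the pattern is language-agnostic), here is the "hoist the clock call out of the hot path" fix sketched in Python; in Node the equivalent is reusing one Date.now() result per batch or tick instead of calling it per operation:

    import time

    items = range(1_000_000)

    # Slow shape: one system clock call per item, like Date.now() per operation.
    stamped = [(time.time(), x * 2) for x in items]

    # Faster shape: read the clock once per batch and reuse the value;
    # fine whenever per-item timestamp precision is not actually needed.
    now = time.time()
    stamped = [(now, x * 2) for x in items]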

Script to spawn a list of commands up to n% of CPU utilization and/or m% of memory consumption

I am looking for a script that will spawn each command in a list until CPU/memory/network reaches a specified bound.
Some commercial scheduling tools will run as many jobs as they can until CPU utilization hits 90%. At that point, they wait until CPU utilization drops below a specified point and then start another job. This maximizes utilization so the whole set finishes faster.
An obvious example is copying files. With 100+ files to copy, it is ludicrous to copy them one at a time. So little CPU time is used that many copies should run at once; I/O bandwidth and network bandwidth become the constraints to manage.
I would like to not reinvent the wheel if there is already something available. Anyone know of something like this?
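I don't know of a standard tool that gates on CPU and memory together (GNU parallel's --load option covers the load-average part), but the gating loop itself is small. A rough sketch using psutil, with placeholder thresholds and a hypothetical command list:

    import shlex
    import subprocess
    import time
    import psutil

    CPU_LIMIT = 90.0      # percent; placeholder bound (the "n%" in the question)
    MEM_LIMIT = 85.0      # percent; placeholder bound (the "m%")

    commands = [          # hypothetical job list
        "cp bigfile1 /backup/",
        "cp bigfile2 /backup/",
        "cp bigfile3 /backup/",
    ]

    running = []
    for cmd in commands:
        # Hold back the next launch until both CPU and memory are under bounds.
        while (psutil.cpu_percent(interval=1.0) > CPU_LIMIT
               or psutil.virtual_memory().percent > MEM_LIMIT):
            time.sleep(1.0)
        running.append(subprocess.Popen(shlex.split(cmd)))

    for p in running:     # wait for everything to finish
        p.wait()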

C# - multithreaded processing on a machine with far fewer CPU cores is faster than on one with far more

Our application currently processes a large number of files, over 1000 XML files in the same directory. Each file is read, parsed, and updated/saved to the database.
When we tested our application on a 12-core machine, the total process was much slower than on a 4-core machine.
What we observed is that the thread count produced by our application climbs into the range of 30 to 90 threads, and the number of context switches increases massively. This is possibly caused by the large amount of parallel work being spawned, but all of it is important.
Is the context switching the culprit? Or the parallel reading/writing of files? Or should we reduce the number of parallel tasks?
The bottleneck here is disk access. No matter how many threads you start, the file system can only read one file at a time. Starting more threads will only make them fight over this single resource, increasing both the context switching and the disk seek times.
At the other end of the process there is also a limitation, in that only one thread at a time can update a table in the database, but the database is designed to handle multiple concurrent clients.
Make a single thread responsible for the disk reads; once a file has been read, it can hand it to a thread that processes it. That way you read from the disk in the most efficient way, and the multithreaded part of the operation sits behind the bottleneck.
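The question is about C#, but the shape this answer recommends, one reader thread that owns the disk plus a small pool of processing workers, is easy to show in any language. A minimal Python sketch, with the directory name and the parse/save step as placeholders:

    import queue
    import threading
    from pathlib import Path

    work: "queue.Queue[bytes | None]" = queue.Queue(maxsize=8)
    N_WORKERS = 4

    def reader(paths: list) -> None:
        """Single thread owns the disk: sequential reads, no seek thrashing."""
        for path in paths:
            work.put(path.read_bytes())
        for _ in range(N_WORKERS):     # one stop sentinel per worker
            work.put(None)

    def worker() -> None:
        """Placeholder for the real XML parse + database update/save."""
        while (data := work.get()) is not None:
            _ = len(data)              # stand-in for the actual processing

    paths = sorted(Path("xml_dir").glob("*.xml"))   # hypothetical directory
    threads = [threading.Thread(target=reader, args=(paths,))]
    threads += [threading.Thread(target=worker) for _ in range(N_WORKERS)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

The bounded queue (maxsize=8) keeps the reader from racing ahead of the workers and filling memory, which also addresses the 'more threads than cores' symptom from the question.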
