I am using the DebugDiag Analysis tool on the server to generate a report for a dump file that is 1.74 GB. It has been running for more than a day and still has not produced a report. I also tried running the tool on a separate (local) computer.
https://www.iis.net/learn/troubleshoot/performance-issues/troubleshooting-high-cpu-in-an-iis-7x-application-pool
It stops on "Preloading symbol files (this may take a while)"
We have a set of utility programs that read an .xlsx file for some input data and generate reports; Apache POI is used for this purpose. The Excel file has 8 sheets, each with an average of 50 rows and 20 columns of data. Everything was working fine on a normal Windows 7 box (read: a developer's machine), where reading the file finishes in a few seconds.
Recently we moved these jobs to a Windows Server 2012 R2 box and noticed that the last sheet in the Excel file takes a very long time to read. To confirm this is not a data issue, I duplicated the last sheet and re-ran the job: the second-to-last sheet (which was the last one in the previous run) finished reading in milliseconds, while the new last sheet (the duplicate) again got stuck for 15 minutes. My best guess is that the time taken to close the file is getting too high, but that is just a guess with no concrete evidence to back it up, and if that is the case I am not sure why. The only difference between the working and non-working Windows boxes is the OS; all other configuration is similar. I have analyzed heap and thread dumps and found no issues.
Are there any known compatibility issues between POI and Windows Server boxes, or is this something related to the code? We are using the POI XSSF implementation.
OK, we finally found the problem: the issue is with the VM itself. Disk I/O is constantly at 100%, and file reads/writes were taking a very long time to complete, which caused the program to get stuck there. However, we couldn't identify why the disk I/O was so high; we tried a few suggestions from blogs, but nothing worked, so we downgraded the OS to Windows Server 2008 and it worked well.
Note that this had nothing to do with POI; it was purely a VM/OS issue.
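For reference, a quick way to confirm that kind of disk saturation from the Windows command line is the built-in typeperf tool (the counter names below are the English ones, and the 5-second interval is arbitrary):
typeperf "\PhysicalDisk(_Total)\% Disk Time" "\PhysicalDisk(_Total)\Avg. Disk Queue Length" -si 5
Sustained values near 100% for % Disk Time, or a queue length well above the number of spindles, point at the storage layer rather than at the application.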
For the development of an object recognition algorithm, I need to repeatedly run a detection program on a large set of volumetric image files (MR scans).
The detection program is a command line tool. If I run it on my local computer on a single file and single-threaded it takes about 10 seconds. Processing results are written to a text file.
A typical run would be:
10,000 images at 300 MB each = 3 TB
10 seconds each on a single core = 100,000 seconds ≈ 28 hours
What can I do to get the results faster? I have access to a cluster of 20 servers with 24 (virtual) cores each (Xeon E5, 1TByte disks, CentOS Linux 7.2).
Theoretically the 480 cores should only need 3.5 minutes for the task.
I am considering using Hadoop, but it is not designed for processing binary data, and it splits input files, which is not an option.
I probably need some kind of distributed file system. I tested NFS, and the network becomes a serious bottleneck. Each server should only process its locally stored files.
The alternative might be to buy a single high-end workstation and forget about distributed processing.
I am not certain whether we need data locality, i.e. each node holding part of the data on a local HD and processing only its local data.
I regularly run large scale distributed calculations on AWS using Spot Instances. You should definitely use the cluster of 20 servers at your disposal.
Your servers are Linux-based (CentOS 7.2), so your best friend is bash. You're also lucky that it's a command line programme. This means you can use ssh to run commands directly on the servers from one master node.
The typical sequence of processing would be:
run a script on the Master Node which sends and runs scripts via ssh on all the Slave Nodes
Each Slave Node downloads a section of the files from the master node where they are stored (via NFS or scp)
Each Slave Node processes its files, saving the required output via scp, MySQL or plain text files
To get started, you'll need ssh access to all the Slaves from the Master. You can then scp files to each Slave, such as the processing script. If you're running on a private network, you don't have to be too concerned about security, so just set the ssh passwords to something simple.
In terms of CPU cores, if the command line program you're using isn't designed for multi-core, you can simply run several ssh commands against each Slave. The best thing to do is run a few tests and see what the optimal number of processes is, given that too many processes might be slow due to insufficient memory, disk access or similar. Say you find that 12 simultaneous processes gives the fastest average time; then run 12 scripts via ssh simultaneously, as in the sketch below.
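A minimal sketch of that fan-out, assuming ssh access is already set up; hosts.txt, the per-host file lists, detect_worker.sh and the ./detect binary are all placeholder names for your own tool and files:

# master_run.sh -- on the Master: give each Slave its file list and start its workers
for host in $(cat hosts.txt); do               # hosts.txt: one Slave hostname per line
    scp filelist_${host}.txt detect_worker.sh ${host}:
    ssh ${host} "nohup ./detect_worker.sh filelist_${host}.txt 12 > worker.log 2>&1 &"
done

# detect_worker.sh -- on each Slave: keep 12 single-threaded detections running at once
#!/bin/bash
filelist=$1; nprocs=$2
xargs -a "$filelist" -n 1 -P "$nprocs" ./detect
# ./detect writes its result file next to each input; collect the results back with scp afterwards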
It's not a small job to set all of this up, but once it's done you will forever be able to process in a fraction of the time.
You can use Hadoop. Yes, the default implementations of FileInputFormat and RecordReader split files into chunks and chunks into lines, but you can write your own implementations of FileInputFormat and RecordReader. I've created a custom FileInputFormat for another purpose (I had the opposite problem: splitting input data more finely than the default), but there are good-looking recipes for exactly your problem: https://gist.github.com/sritchie/808035 plus https://www.timofejew.com/hadoop-streaming-whole-files/
On the other hand, Hadoop is a heavy beast. It has significant overhead for starting each mapper, so the optimal running time for a mapper is a few minutes, and your tasks are too short. It might be possible to create a cleverer FileInputFormat that treats a bunch of files as a single input and feeds the files as records to the same mapper, but I'm not sure.
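For what it's worth, here is a hedged sketch of the whole-file idea via Hadoop Streaming rather than a custom Java InputFormat (the streaming jar path, the linespermap property name and the ./detect tool are assumptions that vary with your Hadoop version and setup): the job input is a text file listing one scan path per line, and each mapper fetches and processes one scan.

# detect_mapper.sh -- NLineInputFormat hands each mapper "offset<TAB>path" lines on stdin
#!/bin/bash
while IFS=$'\t' read -r offset path; do
    name=$(basename "$path")
    hdfs dfs -get "$path" "$name"      # copy the scan into the task's working directory
    ./detect "$name" > "$name.result"  # hypothetical detection tool, installed on every node
    cat "$name.result"                 # whatever goes to stdout ends up in the job's output files
done

hadoop jar "$HADOOP_HOME"/share/hadoop/tools/lib/hadoop-streaming-*.jar \
    -D mapreduce.job.reduces=0 \
    -D mapreduce.input.lineinputformat.linespermap=1 \
    -inputformat org.apache.hadoop.mapred.lib.NLineInputFormat \
    -input /jobs/scan_list.txt -output /jobs/detect_out \
    -mapper detect_mapper.sh -file detect_mapper.sh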
I'm a bio major who has only recently started doing serious coding for research. To support research, our campus has an on-campus supercomputer for researchers to use. I work with this supercomputer remotely, accessing it through a Linux shell to submit jobs. I'm writing a job submission script to align a number of genomes using a program installed on the machine called Mauve. I've run a Mauve job successfully before and have adapted the script from that job for this one, but this time I keep getting this error:
Storing raw sequence at
/scratch/addiseg/Elizabethkingia_clonalframe/rawseq16360.000
Sequence loaded successfully.
GCA_000689515.1_E27107v1_PRJEB5243_genomic.fna 4032057 base pairs.
Storing raw sequence at
/scratch/addiseg/Elizabethkingia_clonalframe/rawseq16360.001
Sequence loaded successfully.
e.anophelisGCA_000496055.1_NUH11_genomic.fna 4091484 base pairs.
Caught signal 11
Cleaning up and exiting!
Temporary files deleted.
So I've got no idea how to troubleshoot this. I'm sorry if this is super basic, but I don't know how to troubleshoot it at a remote site; all the possible solutions I've seen so far require access to the hardware or software, neither of which I control.
My current submission script is this:
module load mauve
progressiveMauve --output=8elizabethkingia-alignment.etc.xmfa --output-guide-tree=8.elizabethkingia-alignment.etc.tree --backbone-output=8.elizabethkingia-alignment.etc.backbone --island-gap-size=100 e.anophelisGCA_000331815.1_ASM33181v1_genomicR26.fna GCA_000689515.1_E27107v1_PRJEB5243_genomic.fna e.anophelisGCA_000496055.1_NUH11_genomic.fna GCA_001596175.1_ASM159617v1_genomicsrr3240400.fna e.meningoseptica502GCA_000447375.1_C874_spades_genomic.fna e.meningoGCA_000367325.1_ASM36732v1_genomicatcc13253.fna e.anophelisGCA_001050935.1_ASM105093v1_genomicPW2809.fna e.anophelisGCA_000495935.1_NUHP1_genomic.fna
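A hedged diagnostic sketch, using only shell access and the file names already in the script above: signal 11 is a segmentation fault inside progressiveMauve itself, so one way to narrow it down without touching the hardware is to run the genomes pairwise and see which input (or combination) triggers the crash.

module load mauve
# Align each pair of genomes in turn; the pair that dies with "Caught signal 11"
# points at the problematic input. Only the first three genomes are listed here.
genomes=(
  e.anophelisGCA_000331815.1_ASM33181v1_genomicR26.fna
  GCA_000689515.1_E27107v1_PRJEB5243_genomic.fna
  e.anophelisGCA_000496055.1_NUH11_genomic.fna
)
for ((i = 0; i < ${#genomes[@]}; i++)); do
  for ((j = i + 1; j < ${#genomes[@]}; j++)); do
    echo "=== ${genomes[i]} vs ${genomes[j]} ==="
    progressiveMauve --output="pair_${i}_${j}.xmfa" "${genomes[i]}" "${genomes[j]}" \
      || echo "pair ${i},${j} exited with status $?"
  done
done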
How can you profile a very long-running script that spawns lots of other processes?
We have a job that takes a long time to run - 11 or more hours, sometimes more than 17 - so it runs on an Amazon EC2 instance.
(It is doing cufflinks DNA alignment and stuff.)
The job is executing lots of processes, scripts and utilities and such.
How can we profile it and determine which component parts of the job take the longest time?
A simple CPU utilisation per process per second would probably suffice. How can we obtain it?
There are many solutions to your question:
munin is a great monitoring tool that can scan almost everything in your system and make nice graphs of it :). It is very easy to install and use.
atop could be a simple solution: it can sample CPU, memory and disk regularly, and you can store all that information in raw log files (the -w option), which you can then analyze to find the bottleneck.
sar can record just about everything on your system, but it is a little harder to interpret (you'll have to make the graphs yourself, with RRDtool for example).
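A minimal sketch of the per-process-per-second idea with atop (the log path, the 1-second interval and the run_long_job.sh wrapper are placeholders; pidstat, from the same sysstat package as sar, also prints per-process CPU each interval):

# record one snapshot per second for the whole job, then replay the raw log afterwards
atop -w /tmp/job_profile.raw 1 &
ATOP_PID=$!
./run_long_job.sh                  # the 11-17 hour pipeline (placeholder name)
kill "$ATOP_PID"
atop -r /tmp/job_profile.raw       # replay interactively, stepping through the samples
# cruder alternative while the job is running: top -b -d 1 > cpu_per_process.log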
I am working to reduce the build time of a large Visual C++ 2008 application. One of the worst bottlenecks appears to be the generation of the PDB file: during the linking stage, mspdbsrv.exe quickly consumes available RAM, and the build machine begins to page constantly.
My current theory is that our PDB files are simply too large. However, I've been unable to find any information on what the "normal" size of a PDB file is. I've taken some rough measurements of one of the DLLs in our application, as follows:
CPP files: 34.5 MB, 900k lines
Header files: 21 MB, 400k lines
Compiled DLL: 33 MB (compiled for debug, not release)
PDB: 187 MB
So, the PDB file is roughly 570% the size of the DLL. Can someone with experience with large Visual C++ applications tell me whether these ratios and sizes make sense? Or is there a sign here that we are doing something wrong?
(The largest PDB file in our application is currently 271 MB, for a 47.5 MB DLL. Source code size is harder to measure for that one, though.)
Thanks!
Yes, .pdb files can be very large, even of the sizes you mention. Since a .pdb file contains the data to map source lines to machine code, and you compile a lot of code, there's a lot of data in the .pdb file, and you likely can't do much about that directly.
One thing you could try is to split your program into smaller parts - DLLs. Each DLL will have its own independent .pdb. However, I seriously doubt it will decrease the build time.
Do you really need full debug information at all times? You could create a configuration with less debug info in it.
But as sharptooth already said, it is time to refactor and split your program into smaller, more maintainable parts. That will do more than just reduce the build time.
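For reference, a hedged sketch of the knobs involved (the file names are placeholders; the flags are standard VC++ compiler/linker options): the compiler's debug-format switch and the linker's PDB options are where the debug-info size and mspdbsrv.exe time trade-offs live.

REM /Zi writes full debug info into a program database via mspdbsrv.exe
cl /c /Zi /Fd"big_module.pdb" big_module.cpp
REM /Z7 embeds the debug info in the .obj instead, bypassing the PDB server at compile time
cl /c /Z7 big_module.cpp
REM at link time, /DEBUG produces the full PDB; /PDBSTRIPPED additionally emits a much
REM smaller public-symbols-only PDB that can be archived or shipped instead of the full one
link /DEBUG /PDB:big_module_full.pdb /PDBSTRIPPED:big_module_public.pdb big_module.obj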