We've been encountering an issue with a particular website hosted in IIS, for which I've not managed to get much information from the event log. I'm a bit out of my league with these low-level 'raw' diagnostic tools, and I'm not sure if I'm barking up the wrong tree (in which case please tell me, e.g. IIS is just broken) or whether I'm following the correct path to try to locate the issue.
A process serving application pool 'MyWebsite' suffered a fatal communication error with the Windows Process Activation Service. The process id was '4372'. The data field contains the error number.
Running with DebugView open, I reliably see these lines whenever I encounter the issue:
[5904] 4692 iisutil!ReadMultiStringParameterValueFromAnyService [helpfunc.cxx # 490]:Inetinfo: Failed reading registry value
[5904] Error(80070002): The system cannot find the file specified.
I therefore tried installing DebugDiag and looking for any exceptions, which created a number of full dumps for me. Once I'd analyzed them I got a report out the other end, shown below, but I'm not sure how to analyse it further. It tells me the type and message were NOT_FOUND and suggests contacting Microsoft. While this is one route, I'd like to know if there are further things that can be done before considering that approach:
I managed to find the cause of my problem, a StackOverflowException: a local reproduction was possible, and it was quite apparent once the debugger was attached.
I've therefore got to assume that a StackOverflowException is similar to an OutOfMemoryException in that it makes IIS unstable and completely unable to keep running (even to the point of being unable to provide/log exception information).
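For anyone else chasing this: a minimal sketch of the failure mode, assuming a WebForms-style handler (the handler and method names here are illustrative, not from the original site). Since .NET 2.0 a StackOverflowException can't be caught by user code; the CLR tears down the whole w3wp.exe worker process, which is why WAS logs a fatal communication error instead of ASP.NET logging the exception.

    // Minimal repro sketch (illustrative names): unbounded recursion
    // in a page handler. The CLR cannot unwind this; the exception
    // terminates w3wp.exe outright, before any error page or log
    // entry can be written.
    protected void Page_Load(object sender, EventArgs e)
    {
        Recurse(0);
    }

    private static void Recurse(int depth)
    {
        // No base case: each call eats stack until the thread's
        // stack is exhausted and the process is torn down.
        Recurse(depth + 1);
    }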
Related
I was looking for a program in C/C++ that gives the process and thread information for whatever accesses files within a given folder. When my application is running, all of a sudden I get "Reason = 13 (Permission denied)" from an fopen() call. I have tried googling this, but I didn't find relevant information. There is a chance that other applications also access the file, so I would like to log the process and thread information when I get the above-mentioned error. I am unaware of anything which does this and would like to know if such a tool exists.
The error says that you don't have permission to access the file. See Traditional Unix permissions for more details.
This error is not related to the file being used by other processes.
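As an aside on the logging part of the question: the failing process can at least record its own identity at the moment fopen() fails. A minimal sketch, assuming Linux/glibc (the wrapper name is made up for illustration):

    #include <errno.h>
    #include <pthread.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    /* fopen() wrapper that logs errno plus the calling process and
       thread whenever the call fails, e.g. on EACCES (errno 13). */
    static FILE *fopen_logged(const char *path, const char *mode)
    {
        FILE *fp = fopen(path, mode);
        if (fp == NULL) {
            /* pthread_t is opaque; the cast is for logging only. */
            fprintf(stderr,
                    "fopen(\"%s\", \"%s\") failed: %s (errno=%d), "
                    "pid=%ld tid=%lu\n",
                    path, mode, strerror(errno), errno,
                    (long)getpid(), (unsigned long)pthread_self());
        }
        return fp;
    }

To see which other processes have files under a folder open at a given moment, lsof +D /path/to/folder (or fuser -v /path/to/file) can be run from a shell; there is no portable way to get that from inside a C program without walking /proc yourself.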
2.5 months ago, I was running a website on a Linux server to do a user study on 3 variations of a tool. All 3 variations ran on the same website. While I was conducting my user study, the website (i.e., the process hosting the website) crashed. In my sleep-deprived state, I unfortunately did not record when the crash happened. However, I now need to know a) when the crash happened, and b) for how long the website was down until I brought it back up. I only have a rough timeframe for when the crash happened and for how long it was down, but I need to pinpoint this information as precisely as possible to do some time-on-task analyses with my user study data.
The server runs Ubuntu 16.04.4 LTS (GNU/Linux 4.4.0-165-generic x86_64) and has been minimally set up to run our website. As such, it is unlikely that any utilities aside from those that came with the OS have been installed. Similarly, no additional setup has likely been done. For example, I tried looking at the history of commands used, in hopes that HISTTIMEFORMAT had previously been set so that I could see timestamps. This ended up not being the case; while I can now see timestamps for commands, setting HISTTIMEFORMAT is not retroactive, meaning I can't get accurate timestamps for the commands I ran 2.5 months ago. That all being said, if you have an idea that you think might work, I'm willing to try it (as long as it doesn't break our server)!
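(For reference, the setting mentioned above is just an environment variable; setting it now only affects entries recorded from this point on:)

    # Makes `history` show timestamps for commands recorded from now on:
    export HISTTIMEFORMAT="%F %T "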
It is also worth mentioning that I currently do not know if it's possible to get a remote desktop or something of the like; I've just been SSHing in and using the terminal to interact with the server.
I've been bouncing ideas off friends and colleagues, and we all feel that there must be SOMETHING we could use to pinpoint when the server went down (e.g., network activity logs showing spikes around the time the user study began as well as when the website was revived, a log of previous/no-longer-running processes, etc.). Unfortunately, none of us know enough about Linux logs or commands to really dig deep into this very specific issue.
In summary:
I need a timestamp for either when the website crashed or when it was revived. It would be nice to have both (or to otherwise determine for how long the website was down), but this is not completely necessary
I'm guessing only a "native" Linux command will be useful, since nothing new/special has been installed on our server. Otherwise, any additional command/tool/utility will have to work retroactively.
It may or may not be possible to get a remote desktop working with the server (e.g., to use some tool that has a GUI you interact with to help get some information)
My colleagues and I have that sense of "there must be SOMETHING we could use" among the various logs or system information, such as network activity, process start times, etc., but none of us know enough about Linux to do deep digging without some help
Any ideas for what I can try to help figure out at least when the website crashed (if not also for how long it was down)?
A friend of mine pointed me to the journalctl command, which keeps timestamped system logs entirely independently of HISTTIMEFORMAT; for me these went as far back as October 7. It contained enough information for me to determine both when I revived my Node.js server and when it initially went down.
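For anyone else spelunking in the journal, a few starting points (the unit name is an assumption; adjust it to however your site is actually launched, and the dates are illustrative):

    # Everything the journal recorded in a given window (local time):
    journalctl --since "2019-10-07 00:00" --until "2019-10-08 00:00"

    # Only entries for one systemd unit, if the site ran as a service:
    journalctl -u mywebsite.service

    # Kernel messages only, e.g. to spot OOM-killer activity near a crash:
    journalctl -k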
I've been working with Arango for a few months now within a local, single-node development environment that regularly gets restarted for maintenance reasons. About 5 or 6 times now, my development database has become corrupted after a controlled restart of my system. When it occurs, the corruption is subtle, in that the Arango daemon seems to start OK and the database structurally appears as expected through the web interface (collections and documents are there). The problems have included the Foxx microservice system failing to upload my validated service code (generic 500 service error) as well as queries using filters not returning expected results (damaged indexes?). When this happens, the only way I've been able to recover is by deleting the database and rebuilding it.
I'm looking for advice on how to debug this issue - such as what to look for in log files, server configuration options that may apply, etc. I've read most of the development documentation, but only skimmed over the deployment docs, so perhaps there's an obvious setting I'm missing somewhere to adjust reliability/resilience? (this is a single-node local instance).
Thanks for any help/advice!
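(In case it helps others who land here, two starting points worth trying; the log path assumes a default Linux package install, and the option name is from the 3.x daemon:)

    # Scan the daemon log for trouble around your restarts:
    grep -iE "error|warn" /var/log/arangodb3/arangod.log

    # Restart the daemon with more verbose logging to catch the next failure:
    arangod --log.level debug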
Please note that issues like this should rather be discussed on GitHub.
I read all the rules on asking good questions here; I hope this will suffice.
I am having problems with an Access 2016 .ACCDE database.
The program runs fine on my machine. When I try to run it on my friends' machines (either the .ACCDE or .ACCDB version) it won't load and pops Out Of Stack Space errors and the Security Notice instead.
So, here's the set up:
The program was written in Access 2016. It is a front-end/back-end design. It's not a very big program: 16 tables, 41 forms, and 51 code modules.
I use the FMS Access Analyzer to help make sure my code is clean so the quality of the program is good to very good.
PRIOR versions of the program ran fine on all machines. I made several changes and improvements, and moved it to the \Documents folder. Now we are having problems.
Machine 'A' (Development PC): New Win 10, 8GB RAM, Full MS Access (not runtime).
Machine 'B': Newish laptop, 2GB RAM, lots of disk, Access 2016 Runtime. It ran prior versions of the program fine but is now throwing errors.
Machine 'C': Newish desktop, 8GB RAM, lots of free disk, full Access (not runtime). It also ran prior versions of the program fine but is now throwing errors.
Initially, the opening form would pop an error saying the On Load event caused an Out Of Stack Space error. The user says,
"Still happens after a fresh reboot. It does NOT happen with other .accde files." Both A and B machines are showing the same errors.
I made many changes but could not cure the Out Of Stack Space error. Finally, I switched to an AutoExec macro instead of a startup form. The AutoExec macro caused Error 3709 and aborted. Machine B had CPU at 49% and memory at 60%. The micro SD drive had 5.79GB used and 113GB free.
I deleted the macro and went back to the startup form; still no luck.
I asked if he got an MS Security error; he said, "Yes, Microsoft Access Security Notice. Figuring just a general warning since it lets me go ahead and open the file. The directory where we have the program (C:\Documents\Condor) was already a Trusted Location on my work machine."
So, does this sound like a Security error?
Is it a problem to have the program in the \Documents folder?
Okay, there's a lot going on in this post, so to sanity-check I would suggest getting back to basics: working just with the .accdb and a full license, does it throw any errors at all?
An aside: with the runtime, an error = crash... usually it just rolls over and closes without any message.
Another aside: you don't need .accde for the runtime, since the runtime can't affect design anyway; you only need .accde if there are full-license users you want to keep out of design view.
You have to be sure that the runtime/.accde machines have the exact same path to the back end as your full-license machine, as the path is stored in the front end.
But sanity-checking the .accdb on the full-license machine is the first step in debugging this... if that is not all okay, it must be dealt with first.
I'm sorry, I thought I had posted that the problem was resolved. The table links broke because, as you pointed out, one person's This PC\Documents\whatever folder is different from anyone else's (C:\Users\KentH\Documents\whatever vs. C:\Users\JohnT\Documents\whatever).
Thank you for your time and suggestions. Broken table links can cause the stack error, for sure, and that can be caused by trying to put programs someplace other than the C:\Programs folder.
D'oh!
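For anyone hitting the same wall: a minimal VBA sketch of the usual fix, re-linking the back end relative to wherever the front end lives instead of a hard-coded per-user Documents path (the back-end file name here is an assumption):

    ' Re-point every linked table at a back end sitting next to the
    ' front end, instead of a path baked in from the developer's PC.
    Public Sub RelinkBackEnd()
        Dim db As DAO.Database
        Dim tdf As DAO.TableDef
        Dim strBackEnd As String
        Set db = CurrentDb
        strBackEnd = CurrentProject.Path & "\Condor_be.accdb" ' assumed name
        For Each tdf In db.TableDefs
            If Len(tdf.Connect) > 0 Then ' linked tables only
                tdf.Connect = ";DATABASE=" & strBackEnd
                tdf.RefreshLink
            End If
        Next tdf
    End Sub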
Hoping some SysAdmins can weigh in here, because I am most assuredly not one.
OS: Ubuntu Server 14.04
CMS: Expression Engine 2.9 (with extras, such as Expresso Store)
Server type: Virtual
Error: Segmentation fault (11)
Unable to load the webpage because the server sent no data. Error code: ERR_EMPTY_RESPONSE
We do not believe it is a code issue on the ExpressionEngine side of things, and my research indicates it is normally something awry on the server itself or externally (browser, ISP, etc.). The issue is, no matter where in the country one accesses this particular page on the site, it will routinely fail, specifically in Chrome.
The client cannot launch the site in its present state, so we have been scrambling to find the cause.
While playing detective, certain facts became known to me.
The virtual server is owned by the client themselves, and the physical boxes are located at their facility. Their lead IT professional, who has no real experience with Linux, has been maintaining the box and the OS. This last point is critical, because he has been updating anything and everything on the server the second it appears on the update list. They have indicated that, for them, this is normal procedure for their Windows servers.
This set off a few alarm bells.
The IT professional has been doing this for many weeks without us knowing, and the error started happening on the 5th of September. This coincided with two updates made by him, one of which was libgcrypt11 amd64 1.5.3-2ubuntu4.1. This has remained unchanged since September 5th.
Could this be causing the issue? Does anybody know of any problems afflicting specifically Chrome regarding the server sending no data?
An aside: I have attempted to use GDB to get a backtrace of the problem, but I cannot get Apache to actually generate a dump file in the folder I created under /tmp. When I look at the logs, it does say that a dump file could be located there, so the directive I placed in apache2.conf is clearly being read. Permissions and ownership have been set on the folder.
I made the following changes to try and get it to work:
/etc/apache2/apache2.conf (file location)
CoreDumpDirectory /tmp/apache2-gdb-dump (code added)
/etc/sysctl.conf (file location)
kernel.core_uses_pid = 1 (code added)
kernel.core_pattern = /tmp (code added)
fs.suid_dumpable = 2 (code added)
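For completeness, the follow-up steps I would expect after those edits (the core file name below is illustrative; note that kernel.core_pattern normally wants a file name pattern, not a bare directory):

    # Reload the sysctl settings without rebooting:
    sudo sysctl -p

    # A pattern like this writes cores as /tmp/apache2-gdb-dump/core.<exe>.<pid>:
    sudo sysctl -w kernel.core_pattern=/tmp/apache2-gdb-dump/core.%e.%p

    # Apache only dumps core if its core size limit is non-zero; depending on
    # how it is started, this may need to go in its init/defaults file:
    ulimit -c unlimited
    sudo service apache2 restart

    # Once a core appears, load it against the apache binary and get a trace:
    gdb /usr/sbin/apache2 /tmp/apache2-gdb-dump/core.apache2.12345
    (gdb) bt full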
There are so many things that could be happening that I just don't know where to start with this. This isn't my area of expertise.