I have several virtual IIS servers (2016) with multiple applications.
26 GB RAM
8 vCPU
Im getting lots of Event ID: 2004 in event viewer on several servers:
Log Name: System
Source: Microsoft-Windows-Resource-Exhaustion-Detector
Event ID: 2004
Task Category: Resource Exhaustion Diagnosis Events
Level: Warning
Keywords: Events related to exhaustion of system commit limit (virtual memory).
User: SYSTEM
Windows successfully diagnosed a low virtual memory condition. The following programs consumed the most virtual memory: w3wp.exe (47500) consumed 1342939136 bytes, w3wp.exe (18704) consumed 1042980864 bytes, and w3wp.exe (10612) consumed 1007996928 bytes.
I moved the paging file to a separate 75GB size partition
however I still see the commit limit is ~35GB
as mentioned by Jamie Hanrahan, commit limit = current pagefile size + RAM size
my questions are:
why is the commit limit not growing to its full potential(26GBRAM+75GB PF)?
why does the w3wp.exe applications not consuming more memory if they need it ?
For this problem, I think you should concern why w3wp.exe consumed much more memory. Rather than move the paging file to a separate 75GB size partition.
It may be a problem about IIS high memory usage, or your IIS is supposed to consume so much memory.
Related
I have a websocket server that hoards memory during days, till the point that Kubernetes eventually kills it. We monitor it using prometheous-net.
# dotnet --info
Host (useful for support):
Version: 2.1.6
Commit: 3f4f8eebd8
.NET Core SDKs installed:
No SDKs were found.
.NET Core runtimes installed:
Microsoft.AspNetCore.All 2.1.6 [/usr/share/dotnet/shared/Microsoft.AspNetCore.All]
Microsoft.AspNetCore.App 2.1.6 [/usr/share/dotnet/shared/Microsoft.AspNetCore.App]
Microsoft.NETCore.App 2.1.6 [/usr/share/dotnet/shared/Microsoft.NETCore.App]
But when I connect remotely and take a memory dump (using createdump), suddently the memory drops... without the service stopping, restarting or loosing any connected user. See the green line in the picture.
I can see in the graphs, that GC is collecting regularly in all generations.
GC Server is disabled using:
<PropertyGroup>
<ServerGarbageCollection>false</ServerGarbageCollection>
</PropertyGroup>
Before disabling GC Server, the service used to grow memory way faster. Now it takes two weeks to get into 512Mb.
Other services using ASP.NET Core on request/response fashion do not show this problem. This uses Websockets, where each connection last usually around 10 minutes... so I guess everything related with the connection survives till Gen 2 easily.
Note that there are two pods, showing the same behaviour, and then one (the green) drops suddenly in memory ussage due the taking of the memory dump.
The pods did not restart during the taking of the memory dump:
No connection was lost or restarted.
Heap:
(lldb) eeheap -gc
Number of GC Heaps: 1
generation 0 starts at 0x00007F8481C8D0B0
generation 1 starts at 0x00007F8481C7E820
generation 2 starts at 0x00007F852A1D7000
ephemeral segment allocation context: none
segment begin allocated size
00007F852A1D6000 00007F852A1D7000 00007F853A1D5E90 0xfffee90(268430992)
00007F84807D0000 00007F84807D1000 00007F8482278000 0x1aa7000(27947008)
Large object heap starts at 0x00007F853A1D7000
segment begin allocated size
00007F853A1D6000 00007F853A1D7000 00007F853A7C60F8 0x5ef0f8(6222072)
Total Size: Size: 0x12094f88 (302600072) bytes.
------------------------------
GC Heap Size: Size: 0x12094f88 (302600072) bytes.
(lldb)
Free objects:
(lldb) dumpheap -type Free -stat
Statistics:
MT Count TotalSize Class Name
00000000010c52b0 219774 10740482 Free
Total 219774 objects
Is there any explanation to this behaviour?
The problem was the connection to RabbitMQ. Because we were using sort lived channels, the "auto-reconnect" feature of the RabbitMQ.Client was keeping a lot of state about dead channels. We switched off this configuration, since we do not need the "perks" of the "auto-reconnect" feature, and everything start working normally. It was a pain, but we basically had to setup a Windows deploy and do the usual memory analysis process with Windows tools (Jetbrains dotMemory in this case). Using lldb is not productive at all.
Disclaimer: I am no .NET Wizard.
But you should do two things to go with Kubernetes best practices:
Define sensible resource limits for your app. If the app does not need more than 200MB memory define a resource limit to prevent the app from consuming all available host memory. But be aware that the Unix API to get available memory is not capable of processing the cgroup the process has and always outputs the host memory no matter what your cgroup says.
Tell your app what this resource limit is. It seems like your app does not "feel the need" to free memory as there is plenty. Almost all applications, and frameworks as well, have a switch to define the maximum memory to be consumed. Tell your app this limit, and it will "see" memory pressure and perform a full GC (what I guess could be the problem here)
I'm currently prototyping a very light weight TCP server based on a custom protocol. It's written in C++ and using Boost Asio for cross-platform sockets. When I monitor the process on Windows it only eats about <3MB in memory, barely grows with many concurrent connections (I tested up to 8).
I built the same server for Linux and put it on a 128MB + 64MB swap VPS for testing. It runs fine and my testings are successful, but the process gets killed in the middle of the night by kernel. I checked the logs and it was out of memory (OOM score was 0).
I highly doubt my process has memory leaks. I checked my server logs and only 1 person has connected to it the previous night, which should not result in OOM. The process sleeps for majority of the time as it only does processing if Boost's async handler wakes up the main thread to process the packet.
What I did notice is that the default VM allocation for the process is a whooping 89MB (using top command). And as soon as I make a connection it is doubled to about 151MB. My VPS has about 100MB free ram and all 64MB swap while running the server, so the the only thing I could think of is that the process tried to allocate more virtual memory going over the ~164MB remaining and went beyond the physical limit and triggered the OOM-Killer.
I've since used the ulimit command to limit the VM allocation to 30MB and it seems to be working fine, but I'll have to wait a while to see if it actually helps the issue.
My question is how does Linux determine how much VM to allocate for a process? Is there a compiler/linker setting I can use to reduce the default VM reservation? Is my reasoning correct or are there other reasons for the OOM?
I'm running Node.js on a server with only 512MB of RAM. The problem is when I run a script, it will be killed due to out of memory.
By default the Node.js memory limit is 512MB. So I think using --max-old-space-size is useless.
Follows the content of /var/log/syslog:
Oct 7 09:24:42 ubuntu-user kernel: [72604.230204] Out of memory: Kill process 6422 (node) score 774 or sacrifice child
Oct 7 09:24:42 ubuntu-user kernel: [72604.230351] Killed process 6422 (node) total-vm:1575132kB, anon-rss:396268kB, file-rss:0kB
Is there a way to get rid of out of memory without upgrading the memory? (like using persistent storage as additional RAM)
Update:
It's a scraper which uses node module request and cheerio. When it runs, it will open hundreds or thousands of webpage (but not in parallel)
If you're giving Node access to every last megabyte of the available 512, and it's still not enough, then there's 2 ways forward:
Reduce the memory requirements of your program. This may or may not be possible. If you want help with this, you should post another question detailing your functionality and memory usage.
Get more memory for your server. 512mb is not much, especially if you're running other services (such as databases or message queues) which require in-memory storage.
There is the 3rd possibility of using swap space (disk storage that acts as a memory backup), but this will have a strong impact on performance. If you still want it, Google how to set this up for your operating system, there's a lot of articles on the topic. This is OS configuration, not Node's.
Old question, but may be this answer will help people. Using --max-old-space-size is not useless.
Before Nodejs 12, versions have an heap memory size that depends on the OS (32 or 64 bits). So, following documentations, on 64-bit machines that (the old generation alone) would be 1400 MB, far away from your 512mb.
From Nodejs12, heap size take care of system RAM; however Nodejs' heap isn't the only thing in memory, especially if your server isn't dedicated to it. So set the --max-old-space-size permit to have a limit regarding the old memory heap, and if your application comes closer, the garbage collector will be triggered and will try to free memory.
I've write a post about how I've observed this: https://loadteststories.com/nodejs-kubernetes-an-oom-serial-killer-story/
I have recently installed NewRelic server monitoring to our Azure web role. The role is a small instance. We are on OSv4 (Win 2012 R2) using 2.2 Service Runtime.
Looking at memory usage I notice that WallSHost.exe (which I understand to be Azure related) it reported as consuming 219Mb (down from a peak of 250Mb) via NewRelic. Is that a lot of memory for it? Can I reduce it? Just seemed like a lot to be taking up.
CPU usage seems to aperiodically spike at about 4% for it. However CPU isn't really an issue as my instance rarely goes above 50%
First off, why do you care how much memory a process is taking up? All of that memory will be paged out to disk, and assuming it isn't being paged back in regularly then all it does is take up page file size which is usually irrelevant.
The WaIISHost process runs your role entry point code (OnStart, Run, StatusCheck, Changing, etc) and is typically implemented in WebRole.cs. If you want to reduce the memory size of this process then you can reduce the amount of memory being loaded by your role entry point code.
See http://blogs.msdn.com/b/kwill/archive/2011/05/05/windows-azure-role-architecture.aspx for more information about the WaIISHost.exe process and what it does.
I'm new to IIS. I have several questions about recycling application pool:
Private memory limit and Vitual memory limit are all 0 by default. I read the official document of IIS 7.0 (We are using IIS 8.0 and Windows Server 2012 but I think they should be the same in this respect). So there is really no limit for the memory usage? It's just waiting 1740 minutes (by default) for the application to recycle? It won't recycle even if the total memory usage is very high until 1740 minutes later? I searched very hard for the answers but couldn't find any...
I read an article which said application pool should never be recycled. So what is the mechanism of memory management for IIS? When would the memory not being used be released? Is it similar to Java? I think no one said that full GC in Java is no good practice...
Thanks.