Using CLR 4.0 Background GC on a Server - garbage-collection

We're building an MMO server, highly optimized for latency.
So, with CLR 4.0 and its newly introduced background GC for workstation mode, is it now possible to use background garbage collection on a Windows Server?

Apparently not. See this article, which specifically states that Microsoft is not offering background GC for server GC in V4.0 (though it looks like this is under consideration).

You might also find this essay (PDF) interesting, given what you're trying to do.

Related

.Net Core 2.0 Web API - Servers crashing because of issue with Destructors

Ok, so I have an app with pretty heavy traffic (about 17 requests per second). The app is a REST API built with .Net Core 2.0 (just recently upgraded).
The app is hosted on Azure and we are having a problem that looked like a memory leak in that the servers would very slowly (over a week) eat up all the handlers and resources and eventually crash.
I have spoken a good bit to MS Support and they helped me narrow down the problem. Here is their last email:
"We are seeing a high amount of large objects (strings and arrays over
85000 bytes) can lead to GC Heap fragmentation and thus higher memory
usage in your application. We were investigating how to manage the
destructor and I can provide you the following documentation:
Why does the Finalize/Destructor example not work in .NET Core?
(not a Microsoft official documentation but it can be use as reference )
ASP.NET Case Study: Bad perf, high memory usage and high
CPU in GC – Death By ViewState:
https://blogs.msdn.microsoft.com/tess/2006/11/24/asp-net-case-study-bad-perf-high-memory-usage-and-high-cpu-in-gc-death-by-viewstate/
Finalizers (C# Programming Guide):
https://learn.microsoft.com/en-us/dotnet/csharp/programming-guide/classes-and-structs/destructors
I will continue looking for more documentation related with the
destructor in .NET Core."
After this they basically said that Azure was not to blame and I needed to open up a "code" support ticket that costs about $500...
So I am coming here instead. :)
While I have been a .Net developer for over 15 years, this was my first time using .Net Core. I found this great article and used it as the backbone to my API (https://chsakell.com/2016/06/23/rest-apis-using-asp-net-core-and-entity-framework-core/).
When I compared it to other .Net Core examples it seemed to fall in line with those so I am reasonably confident that I am following "best practices", but I could be wrong.
My fear is that there is a fundamental problem with .Net Core (which those articles that MS referred to kinda suggest), but I am not sure how to find the answer. I don't want to have to rewrite my code because of this, but aside from occasionally rebooting the servers I am not sure what my options are.
Thoughts?
Ok... for posterity... my eventual solution turned out to be a configuration setup issue... the destructor issue with Core wasn't a factor for me because we weren't sending strings large enough to trigger it.
You can see my approach and the eventual answer (using a singleton) in this question:
ASP.Net Core 2 configuration taking up a lot of memory. How do I get config information differently?

Is using Node.js or Ringojs safe for live websites?

As stated in the title, I would like to know if it's safe to develop a website using one of the current "omg" platforms, Node.js and RingoJS, at their current versions.
Also, I would like to know if they support cookies/sessions and how they deal with multi-field POSTs (fieldname[] in PHP).
Thank you
--Edit--
Thanks for all the links guys.
What can you tell me about Ringojs ?
I haven't figured out which platform to start playing with yet. I must admit that the fact it can use Java seamlessly really impresses me. The only available XSLT 2.0 library is in Java. I could use it as a templating system.
Is there anyone who had the chance to play with Ringojs?
From my experience using both, Ringo is more stable and "safer" for production use but you can comfortably deploy both. In addition to the ability to wrap existing Java libraries that you mention, you also get the benefit of being able to run it in an existing webapp container which manages the lifecycle of the application for you and ensures its availability.
That being said, it doesn't have to be an either or decision. By using my common-node package and assuming you don't use any Java libraries, it's perfectly feasible to maintain a project that runs on both without any changes to the code.
I've also included benchmarks that test the performance of Node.js vs. RingoJS, the results of which you can find in the common-node/README.md. To summarize: RingoJS has slightly lower throughput than Node.js, but much lower variance in response times, while using six times the RAM with default Java settings. The latter can be tweaked and brought down to as little as twice the memory usage of Node with e.g. my ringo-sunserver, but at the expense of decreased performance.
Node.js is stable, so yes, it's safe to use. Node.js is capable of handling cookies, sessions, and multiple fields, but these are not as easy to manage on their own. Web frameworks solve this problem.
I recommend Express.js. It's an open-source web framework for Node.js which handles all of this and more.
You can download it here:
https://github.com/visionmedia/express
I hope this helped!
Examples of some of the bigger sites running Node.js
https://www.learnboost.com/
http://ge.tt/
https://gomockingbird.com/
https://secured.milewise.com/
http://voxer.com/
https://www.yammer.com/
http://cloud9ide.com/
http://beta.etherpad.org/
http://loggly.com/
http://wordsquared.com/
Yes. It is. https://github.com/joyent/node/wiki/Projects,-Applications,-and-Companies-Using-Node and https://github.com/joyent/node/wiki/modules
cookies/sessions/forms etc http://expressjs.com/ makes it easier
RingoJS is a framework developed by Hannes Wallnöfer and uses Rhino as its scripting engine. There are web frameworks, templating engines, ORM packages and many more things already available. Have a look at the tutorial featuring a good subset of packages you may use for a simple web application. It's not too long and is straightforward.
Even though some of the packages used within the tutorial (e.g. ringo-sqlstore) are marked as 0.8 and come with the hint "consider this being beta", they are already very stable, and bugs, if you find one, get fixed or commented on very fast.
And the power of countless Java libraries out there is at your fingertips, so if you already have Java knowledge, this knowledge isn't wasted. Rhino, the scripting engine, even enables you to implement interfaces and extend classes. It is possibly a little more advanced, but I've done it, and I know of packages taking advantage of such features (like ringo-ftpserver, which is a wrapper around Apache FtpServer written in Java).
Another pro for me is - because ringojs is based on java - it works fairly well with multithreading with ringo/worker for example.

What are Managed objects and unmanaged objects in C++/CLI?

What are managed objects and unmanaged objects in C++/CLI?
Managed objects are a feature of the .NET framework and its implementation of a C++-like language, and have their memory managed for you by the .NET garbage collector. C++ itself has no such concept; instead it has a (generally) better way of managing all resources, not just memory, called RAII.
The concept Managed/Unmanaged is not typically C++. It is Microsoft .Net technology speak.
In normal, plain C++ applications, the application itself is responsible for deleting all the memory it has allocated. This requires the developer to be very careful about when to delete memory. If memory is deleted too soon, the application may crash if it still has a pointer to it. If memory is deleted too late, or not deleted at all, the application has a memory leak.
Environments like Java and .Net solve this problem by using garbage collectors. The developer should not delete memory anymore, the garbage collector will do this for him.
In the 'native' .Net languages (like C#), the whole language works with the garbage collector concept. To make the transition from normal, plain C++ applications to .Net easier, Microsoft added some extensions to its C++ compiler, so that C++ developers could already benefit from the advantages of .Net.
Whenever you use normal, plain C++, Microsoft talks about unmanaged, or native C++. If you use the .Net extensions in C++, Microsoft talks about managed C++. If your application contains both, you have a mixed-mode application.
Managed objects do not exist in C++.
They exist in Microsoft's .NET extensions to C++, and a complete explanation would be a bit long, sorry.

Managing multiple-processes: What are the common strategies?

While multithreading is faster in some cases, sometimes we just want to spawn multiple worker processes to do work. This has the benefits that the main app does not crash if one of the workers crashes, and that the user doesn't need to worry much about locking.
COM+'s Application Pooling seems like a good way to achieve this on Windows. The downside is that we need to write a COM+ wrapper for the worker process.
However, when I search for Application Pooling on Google, it seems like most of its usages are related to IIS. Don't other applications (such as scientific/graphics) find it useful to spawn multiple worker processes?
So there are several questions:
Why isn't COM+ more popular in areas other than IIS? If I write a non-IIS application and want to use process management on Windows, should I go with COM+ or are there better alternatives out there?
What would be the cross platform way to do it? Are there libraries out there that give me a "process pool" (worker processes will intelligently pick up work, can be managed, etc.)
I can't offer any answers to the COM aspect of your question, but it's worth noting there's another world (besides HPC MPI) where multi-processing (rather than the more common multi-threading approach) is apparently alive, well and thriving: Python.
Why? Python's GIL ("global interpreter lock") cripples most attempts to multithread Python code so badly that multiprocessing is the generally recommended approach to parallelising Python on SMP. The standard library includes process pools; there are various other options too.
Python certainly ought to satisfy any multi-platform requirement!
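As a minimal sketch of the standard-library process pool mentioned above (the function name parallel_sqrt and the worker count are illustrative assumptions, not something from the original answer):

```python
import math
from multiprocessing import Pool


def parallel_sqrt(values, workers=4):
    """Compute square roots of `values` in a pool of worker processes."""
    # Each worker is a separate OS process with its own interpreter and GIL.
    with Pool(processes=workers) as pool:
        # map() distributes the inputs across the workers and
        # returns the results in the same order as the inputs.
        return pool.map(math.sqrt, values)


if __name__ == "__main__":
    # The guard matters on platforms that start workers by
    # re-importing the main module (e.g. Windows).
    print(parallel_sqrt([1, 4, 9, 16]))
```

The pool picks an idle worker for each chunk of input, which is roughly the "worker processes pick up work" behaviour the question asks about. It does not cover restarting or monitoring long-lived workers.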
You might want to investigate how the Apache web server manages process pools. From version 2.0 it runs natively on Windows, and one of the multi-processing models it supports is process pools. Part of Apache is also APR (Apache Portable Runtime), which handles platform-specific issues.
No one can answer why something is not popular; maybe nobody is looking for what you are looking for. After .NET came into the picture, people shifted from COM to the managed environment. Before .NET, COM, ATL and related technologies were quite painful to implement, crashed, and were quite difficult to debug.
That is the reason the managed environment came into existence.
However, from .NET 4 onwards, the parallel libraries give the user much more power for parallel programming, and you can also spawn and control other processes.
For multiplatform, you can look for zvrba's answer.
Yes, other applications--especially science applications--find it useful to spawn multiple processes. Since few super-computers run Microsoft Windows, scientists generally avoid using anything that ties them to a Microsoft platform. Nothing related to COM will help scientists leverage their enormous existing code base written in Fortran.
People who choose to run IIS have generally already drunk the Microsoft Koolaid, so they have fewer inhibitions to tying themselves to Microsoft's proprietary platforms, which is why COM-specific terminology will get lots of hits related to IIS.
One of the open standards for doing what you want is the Message Passing Interface. Several implementations exist and some of them run on supercomputers using Fortran. Some of them run on cheaper computers using sexier languages.
See http://en.wikipedia.org/wiki/Message_Passing_Interface
There hasn't been a mob rushing through the doors of COM application pooling primarily because of two factors:
COM is a pain in the ass to deal with compared to just about anything else
Threading can be a headache, but it's a lot easier and more convenient to manage than inter-process communication
COM application pooling was essentially created for IIS. It has one very specific benefit over normal multithreading: the multiple processes are fully isolated from each other. This is important for data security and for app stability when dealing with third party plugins of questionable stability.
Scientific computing generally doesn't need strong data security isolation between operations, and I would venture to guess that scientific computing doesn't rely much on third party plugins of questionable stability. When doing big math operations, you're either using a sexy numerics library that had better be rock solid to be taken seriously, or you're using your own code, in which case crashes should be fixed and repeat offenders should be spanked.
Oh, and all crashes except stack overflow can be trapped and dealt with within a multithreaded app, especially if it's your own code.
In short, COM app pooling is overkill for just about anything other than IIS.
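The process-isolation benefit described above is not specific to COM. Here is a rough cross-platform sketch in Python, assuming a hypothetical helper run_isolated that runs a snippet in a child interpreter; the crashing snippet is made up for illustration:

```python
import subprocess
import sys


def run_isolated(code):
    """Run a Python snippet in a separate process and return its exit code."""
    # sys.executable is the interpreter running this script, so the
    # child is a completely separate process with its own memory.
    result = subprocess.run([sys.executable, "-c", code], capture_output=True)
    return result.returncode


# A worker that dies abruptly does not take the parent down with it;
# the parent only observes a non-zero exit code.
crashed = run_isolated("import os; os._exit(3)")
finished = run_isolated("print('work done')")
print(crashed, finished)
```

The parent can decide what to do with a failed worker (retry, log, give up), which is the stability argument made for application pooling.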
Google's web browser Chrome uses a multi-process architecture. It is open source, so you can check out its code and see how it manages processes.

Garbage Collector in Real-Time System

I'm new to C#/Java and plan to prototype a soft real-time system with it.
If I wrote a C#/Java app just as I would in C++ in terms of memory management, that is, I explicitly "delete" the objects that I no longer use, would the app still be affected by the garbage collector? If so, how does it affect my app?
Sorry if this sounds like an obvious answer, but being new, I want to be thorough.
Take a look at IBM's Metronome, their garbage collector for hard real-time systems.
Your premise is wrong: you cannot explicitly “delete” objects in either Java or C#, so your application will always be affected by the GC.
You may try to trigger a collection by calling GC.Collect (C#) with an appropriate parameter (e.g. GC.MaxGeneration) but this still doesn’t guarantee that the GC won’t be working at other moments during execution.
By explicitly "delete" if you mean releasing the reference to the object then you are reliant on the garbage collector in C# managed code - see the System.GC class for ways of controlling it.
If you choose to write unmanaged C# code then you will have more control over memory, akin to C++, and will be responsible for deleting your instantiated objects, able to use pointers, etc. For more info see MSDN doc - Unsafe Code and Pointers (C# Programming Guide).
In unmanaged code you will not be at the mercy of the garbage collector and its indeterminate cleanup algorithms.
I don't know if Java has an equivalent unmanaged mode, but this Microsoft info might help provide some direction on C#/.NET to use its available features for your requirement of dealing with the garbage collector.
In C# or Java you can't delete an object. All you can do is make it eligible for collection; the memory is freed by the garbage collector. The garbage collector might never run during the lifetime of your application, but it is likely to. It is most likely to run when your system is running short of resources, and when resources are low the GC becomes the highest-priority thread, so your application does get affected. You can minimize the effect by calculating the correct load and required resources for your application's lifetime and making sure to buy hardware that is good enough for that. Even so, you can't simply benchmark your performance.
Besides the GC, a managed application also has a slight overhead compared with a traditional C++ application because of the extra delegation layer involved, plus a slight first-run performance penalty since the runtime needs to be up and running before your application starts.
Here are some references for developing real-time systems with the .net compact framework:
IEEE - C# and the .NET Framework: Ready for Real Time?
MSDN - Real-Time Behavior of the .NET Compact Framework
They both talk about the memory requirements of using the .net framework.
C# and Java are not for hard real-time development. Soft real-time is attainable, however, as you note.
For C#, the best you can do is implement the finalize/dispose pattern:
http://msdn.microsoft.com/en-us/library/b1yfkh5e(VS.71).aspx
You can request a collection, but the GC is typically much better at determining when and how to do this.
http://msdn.microsoft.com/en-us/library/system.gc(VS.71).aspx
For Java, there are many options to optimize it:
http://java.sun.com/docs/hotspot/gc5.0/gc_tuning_5.html
Along with third party solutions like IBM Metronome as noted above.
This is a real science within CS itself.

Resources