I develop a WPF application which uses NLog.
When I profile it using dotMemory I can see ~300 KB of memory used by a Dictionary which NLog creates during configuration.
I do not know what the ObjectReflectionCache and MruCache are used for, and whether their memory will be freed at some point. Maybe someone can clarify the purpose of these classes and the huge capacity used for the Dictionary.
Thank you.
[Screenshot: stack trace of how NLog creates the Dictionary]
[Screenshot: memory usage of the Dictionary]
NLog version: 4.7.2
Platform: .NET Framework 4.6.1
Current NLog config
LoggingConfiguration config = new LoggingConfiguration();
DebuggerTarget debuggerTarget = new DebuggerTarget { Name = "vs", Layout = DebuggerLayout };
DebuggerLoggingRule = new LoggingRule(nlogLoggerNamePattern, debuggerTarget);
config.LoggingRules.Add(DebuggerLoggingRule);
LogManager.Configuration = config;
Hats off to someone who cares about 300 KB. It has been a long time since I focused on overhead of this size (but it is still important).
From your screenshot it appears to be this collection:
https://github.com/NLog/NLog/blob/29879ece25a7d2e47a148fc3736ec310aee29465/src/NLog/Internal/Reflection/ObjectReflectionCache.cs#L52
The dictionary capacity is 10103 entries, which is probably the nearest prime number above 10000.
I'm guessing the combined size of TKey + TValue in the Dictionary is close to 30 bytes; 10103 entries × ~30 bytes gives the total of roughly 300 KB, though probably unused. I guess NLog could reduce its initial overhead by not allocating 300 KB upfront.
The dictionary is used for caching object reflection for the object types logged through structured logging. Before NLog 4.7 the dictionary was only allocated when structured logging was actually used, but this changed with https://github.com/NLog/NLog/pull/3610
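To put that math in concrete terms, here is a minimal standalone sketch (the key/value types are stand-ins, not NLog's actual cache entry types) showing how much memory a Dictionary with a requested capacity of 10000 reserves before a single item is added:

using System;
using System.Collections.Generic;

class DictionaryCapacityDemo
{
    static void Main()
    {
        long before = GC.GetTotalMemory(forceFullCollection: true);

        // Dictionary rounds the requested capacity up to the next prime (10103)
        // and allocates its buckets and entries arrays immediately.
        var cache = new Dictionary<Type, object>(10000);

        long after = GC.GetTotalMemory(forceFullCollection: true);
        Console.WriteLine("~{0} KB allocated upfront, Count = {1}",
            (after - before) / 1024, cache.Count);
    }
}

On a 64-bit runtime each slot costs roughly 28 bytes here (a 4-byte bucket plus a 24-byte entry holding two ints and two references), which lands close to the ~300 KB from the screenshot.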
Update: the memory footprint has been reduced with NLog ver. 4.7.3.
Presently I'm working on updating a Windows 11 DX12 desktop app to take advantage of the technologies introduced by DX12 Ultimate (i.e. mesh shaders, VRS & DXR).
All the official samples for Ultimate compile and run on my machine (Core i9/RTX 3070 laptop), so as a first step I wish to begin migrating as much static (i.e. unskinned) geometry as possible from the conventional (IA-vertex shader) rendering pipeline over to the Amplification->Mesh shader pipeline.
I'm naturally using code from the official samples to facilitate this, and in the process I've encountered a very strange issue which only triggers in my app, but not in the compiled source project.
The specific problem relates to setting up meshlet instancing culling & dynamic LOD selection. When setting descriptors into the mesh shader SRV heap, my app was failing to create a CBV:
// Mesh Info Buffers
D3D12_CONSTANT_BUFFER_VIEW_DESC cbvDesc{};
cbvDesc.BufferLocation = m.MeshInfoResource->GetGPUVirtualAddress();
cbvDesc.SizeInBytes = MeshletUtils::GetAlignedSize<UINT>(sizeof(MeshInfo)); // 256 bytes which is correct
device->CreateConstantBufferView(&cbvDesc, OffsetHandle(i)); // generates error
A CBV into the descriptor range couldn't be generated because the resource's GPU address range was created with only 16 bytes:
D3D12 ERROR: ID3D12Device::CreateConstantBufferView:
pDesc->BufferLocation + SizeInBytes - 1 (0x0000000008c1f0ff) exceeds
end of the virtual address range of Resource
(0x000001BD88FE1BF0:'MeshInfoResource', GPU VA Range:
0x0000000008c1f000 - 0x0000000008c1f00f). [ STATE_CREATION ERROR
#649: CREATE_CONSTANT_BUFFER_VIEW_INVALID_RESOURCE]
What made this frustrating was that the code is identical to the official sample, yet the sample ran without issue. After many hours of trying dumb things, I finally decided to examine the size of the MeshInfo structure, and therein lay the solution.
The MeshInfo struct is defined in the sample's Model class as:
struct MeshInfo
{
uint32_t IndexSize;
uint32_t MeshletCount;
uint32_t LastMeshletVertCount;
uint32_t LastMeshletPrimCount;
};
It is 16 bytes in size, and passed to the resource's description prior to its creation:
auto meshInfoDesc = CD3DX12_RESOURCE_DESC::Buffer(sizeof(MeshInfo));
ThrowIfFailed(device->CreateCommittedResource(&defaultHeap, D3D12_HEAP_FLAG_NONE, &meshInfoDesc, D3D12_RESOURCE_STATE_COPY_DEST, nullptr, IID_PPV_ARGS(&m.MeshInfoResource)));
SetDebugObjectName(m.MeshInfoResource.Get(), L"MeshInfoResource");
But clearly I needed a 256-byte range to conform to D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT, so I changed meshInfoDesc to:
auto meshInfoDesc = CD3DX12_RESOURCE_DESC::Buffer(sizeof(MeshInfo) * 16u);
And the app now runs without the error.
So my question is: why isn't this GPU virtual address error also occurring in the sample?
PS: It was necessary to rename Model.h/Model.cpp to MeshletModel.h/MeshletModel.cpp for use in my project, which is based on the DirectX Tool Kit framework, where Model.h/Model.cpp files already exist for the DXTK rigid body animation effect.
The solution was explained in the question, so I will summarize it here as the answer to this post.
When creating a constant buffer view on a D3D12 resource, make sure to allocate enough memory to the resource upon creation.
This should be at least 256 bytes to satisfy D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT.
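The required rounding can be done with a small helper along the lines of what the sample's MeshletUtils::GetAlignedSize presumably does; this is a sketch written out for illustration:

#include <cstdint>

// Round size up to the next multiple of alignment (a power of two).
// 256 is D3D12_CONSTANT_BUFFER_DATA_PLACEMENT_ALIGNMENT.
constexpr uint64_t AlignUp(uint64_t size, uint64_t alignment = 256)
{
    return (size + alignment - 1) & ~(alignment - 1);
}

// AlignUp(sizeof(MeshInfo)) == 256, so creating the resource with the
// aligned size keeps the whole CBV inside the resource's GPU VA range:
// auto meshInfoDesc = CD3DX12_RESOURCE_DESC::Buffer(AlignUp(sizeof(MeshInfo)));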
I still don't know why the sample code on GitHub could run without satisfying this requirement. Without having delved into the sample's project configuration in detail, it's possible that D3D12 debug layer errors are being dealt with differently there, but that's purely speculative.
In 2018 a Tech Lead at Google said they were working to "support buffers way beyond 4GiB" in V8 on 64-bit systems. Did that happen?
Trying to load a large file into a buffer like:
const fileBuffer = fs.readFileSync(csvPath);
in Node v12.16.1 and getting the error:
RangeError [ERR_FS_FILE_TOO_LARGE]: File size (3461193224) is greater than possible Buffer: 2147483647 bytes.
and in Node v14.12.0 (latest) getting the error:
RangeError [ERR_FS_FILE_TOO_LARGE]: File size (3461193224) is greater than 2 GB
This looks to me like a limit set by the use of 32-bit integers for addressing the buffers. But I don't understand why this would be a limitation on 64-bit systems... Yes, I realize I can use streams or read from the file at a specific address, but I have massive amounts of memory lying around, and I'm limited to 2147483647 bytes because Node is limited to 32-bit addressing?
Surely having a high-frequency random-access data set fully loaded into a single buffer rather than streamed has performance benefits. The code involved in directing requests to pull from a multiple-buffer alternative structure is going to cost something, however small...
I can use the --max-old-space-size=16000 flag to increase the maximum memory used by Node, but I suspect this is a hard limit based on the architecture of V8. However, I still have to ask, since the tech lead at Google did claim they were increasing the maximum buffer size past 4 GiB: is there any way in 2020 to have a buffer beyond 2147483647 bytes in Node.js?
Edit: here is the relevant tracker on the topic by Google, where apparently they have been working on fixing this since at least last year: https://bugs.chromium.org/p/v8/issues/detail?id=4153
Did that happen?
Yes, V8 supports very large (many gigabytes) ArrayBuffers nowadays.
Is there any way to have a buffer beyond 2147483647 bytes in Node.js?
Yes:
$ node
Welcome to Node.js v14.12.0.
Type ".help" for more information.
> let b = Buffer.alloc(3461193224)
undefined
> b.length
3461193224
That said, it appears that fs.promises.readFile has its own limit: https://github.com/nodejs/node/blob/master/lib/internal/fs/promises.js#L5
I have no idea what it would take to lift that. I suggest you file an issue on Node's bug tracker.
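In the meantime, one possible workaround (a sketch, not an official API) is to allocate the large Buffer yourself and fill it with chunked reads, keeping each individual fs.readSync call under the 2 GiB per-call cap:

const fs = require("fs");

// Read a file of arbitrary size into a single Buffer by looping fs.readSync
// with a chunk size below the per-call 2^31 - 1 limit.
function readLargeFileSync(path) {
  const { size } = fs.statSync(path);
  const buf = Buffer.alloc(size); // Buffer.alloc itself accepts sizes beyond 2^31 - 1
  const fd = fs.openSync(path, "r");
  try {
    const CHUNK = 1 << 30; // 1 GiB per read call
    let pos = 0;
    while (pos < size) {
      const bytesRead = fs.readSync(fd, buf, pos, Math.min(CHUNK, size - pos), pos);
      if (bytesRead === 0) break; // unexpected end of file
      pos += bytesRead;
    }
  } finally {
    fs.closeSync(fd);
  }
  return buf;
}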
FWIW, Buffer has yet another limit:
> let buffer = require("buffer")
undefined
> buffer.kMaxLength
4294967295
And again, that's Node's decision, not V8's.
As part of an app that allows auditors to create findings and associate photos with them (saved as Base64 strings due to a limitation of the web service), I have to loop through all findings and their photos within an audit and set their sync value to true.
Whilst I perform this loop I see a memory spike jumping from around 40 MB up to 500 MB (for roughly 350 photos and 255 findings), and this number never goes down. On average our users create around 1000 findings and 500-700 photos before attempting to use this feature. I have attempted to use @autoreleasepool blocks to keep the memory down, but it never seems to get released.
for (Finding * __autoreleasing f in self.audit.findings) {
    @autoreleasepool {
        [f setToSync:@YES];
        NSLog(@"%@", f.idFinding);
        for (FindingPhoto * __autoreleasing p in f.photos) {
            @autoreleasepool {
                [p setToSync:@YES];
                p = nil;
            }
        }
        f = nil;
    }
}
The relationships and retain cycles look like this:
Audit has a strong reference to Finding
Finding has a weak reference to Audit and a strong reference to FindingPhoto
FindingPhoto has a weak reference to Finding
What am I missing in terms of being able to effectively loop through these objects and set their properties without causing such a huge spike in memory? I'm assuming it has something to do with so many Base64 strings being loaded into memory during the loop and never being released.
So, first, make sure you have a batch size set on the fetch request. Choose a relatively small number, but not too small because this isn't for UI processing. You want to batch a reasonable number of objects into memory to reduce loading overhead while keeping memory usage down. Try 50 or 100 and see how it goes, then consider upping the batch size a little.
If all of the objects you're loading are managed objects then the correct way to evict them during processing is to turn them into faults. That's done by calling refreshObject:mergeChanges: on the context. BUT - that discards any changes, and your loop is specifically there to make changes.
So, what you should really be doing is batch saving the objects you've modified and then turning those objects back into faults to remove the data from memory.
So, in your loop, keep a counter of how many you've modified and save the context each time you hit that count and refresh all the objects that were processed so far. The batch on the fetch and the batch size to save should be the same number.
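Here is a rough sketch of that pattern (the context accessor and the batch size are assumptions; match them to your fetch request):

NSManagedObjectContext *context = self.audit.managedObjectContext; // assumed accessor
static const NSUInteger kBatchSize = 100; // keep equal to NSFetchRequest.fetchBatchSize

NSMutableArray *pending = [NSMutableArray arrayWithCapacity:kBatchSize];
for (Finding *finding in self.audit.findings) {
    [finding setToSync:@YES];
    [pending addObject:finding];

    if (pending.count == kBatchSize) {
        NSError *error = nil;
        [context save:&error]; // persist this batch of changes first
        for (NSManagedObject *object in pending) {
            // then turn each saved object back into a fault, evicting its data
            [context refreshObject:object mergeChanges:NO];
        }
        [pending removeAllObjects];
    }
}

// Save and refault whatever is left over
if (pending.count > 0) {
    NSError *error = nil;
    [context save:&error];
    for (NSManagedObject *object in pending) {
        [context refreshObject:object mergeChanges:NO];
    }
}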
There's probably a big difference in size between your "Finding" objects and the associated images. So your primary aim should be to redesign your database so that unfaulting (loading) a Finding object does not automatically load the Base64-encoded image.
That's actually one of the major strengths of Core Data: loading only part of an object hierarchy. Just try to move the Base64-encoded data to its own managed object so that Core Data does not load it eagerly. It will still be loaded as needed when the reference is touched.
In Node.js 0.12.x the maximum size of a buffer was limited by the amount of allocatable memory, whose size could be obtained with:
require('smalloc').kMaxLength;
The actual value of kMaxLength was hardcoded in old versions of V8 and was equal to 0x3fffffff.
The problem is that there is no smalloc module in io.js >= 3.x (including Node.js 4.x). It was mentioned that the Buffer implementation was rewritten for V8 4.4.x.
So, my question is: is there a way to get the maximum size of the Buffer (and/or allocatable memory) in io.js >= 3.x?
This file "calculated" (https://github.com/v8/v8-git-mirror/blob/4.4.63/src/objects.h) also has fixed constant for the external array.
4642 // Maximal acceptable length for an external array.
4643 static const int kMaxLength = 0x3fffffff;
EDIT:
It looks like you can use
require('buffer').kMaxLength;
That was the change in 3.0, and it is still there in 4.0:
b625ab4242 - buffer: fix usage of kMaxLength (Trevor Norris) #2003
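For example, from the command line (the value shown is what a 64-bit Node 4.x build reports; later versions raised it again):

$ node -p "require('buffer').kMaxLength"
2147483647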
I always thought that the default constructor for List<T> would initialize a list with a capacity of 4 and that the capacity would be doubled when adding the 5th element, etc.
In my application I create a lot of lists (a tree-like structure where each node can have many children), and some of these nodes won't have any children. Since my application was fast but also using a bit too much memory, I decided to use the constructor where I can specify the capacity, and set it to 1.
The strange thing now is that the memory usage when I start with a capacity of 1 is about 15% higher than when I use the default constructor. It can't be because of a better fit with 4, since the doubling would go 1, 2, 4. So why this extra increase in memory usage? As an extra test I tried starting with a capacity of 4. Again the memory usage was about 15% higher than with no specified capacity.
Now this really isn't a problem, but it bothers me that a pretty simple data structure that I've used for years has some extra logic that I didn't know about. Does anyone have an idea of the inner workings of List<T> in this respect?
That's because if you use the default constructor, the internal storage array is set to a shared empty array, but if you use the constructor with a capacity, an array of that exact size is allocated immediately instead of being created on the first call to Add.
You can see this using a decompiler like JustDecompile:
public List(int capacity)
{
    if (capacity < 0)
    {
        ThrowHelper.ThrowArgumentOutOfRangeException(ExceptionArgument.capacity, ExceptionResource.ArgumentOutOfRange_NeedNonNegNum);
    }
    this._items = new T[capacity];
}

public List()
{
    this._items = List<T>._emptyArray;
}
If you look at the Add function, you'll see it calls EnsureCapacity, which enlarges the internal storage array if required. Obviously, if the array is initially set to the empty array, the first Add will create an array of the default size.
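A quick sketch that makes this observable through the Capacity property (the values shown are from the .NET Framework implementation discussed here):

using System;
using System.Collections.Generic;

class ListCapacityDemo
{
    static void Main()
    {
        var byDefault = new List<int>();
        Console.WriteLine(byDefault.Capacity); // 0 - shared empty array, nothing allocated

        byDefault.Add(1);
        Console.WriteLine(byDefault.Capacity); // 4 - default size created on first Add

        var preSized = new List<int>(1);
        Console.WriteLine(preSized.Capacity);  // 1 - one-element array allocated upfront

        for (int i = 0; i < 5; i++) preSized.Add(i);
        Console.WriteLine(preSized.Capacity);  // 8 - grown by doubling: 1 -> 2 -> 4 -> 8
    }
}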