Golang : fatal error: runtime: out of memory - linux

I am trying to use this package from GitHub for string matching. My dictionary is 4 MB. When creating the Trie, I get fatal error: runtime: out of memory. I am using Ubuntu 14.04 with 8 GB of RAM and Go version 1.4.2.
The error seems to come from line 99 (now) here: m.trie = make([]node, max)
The program stops at this line.
This is the error:
fatal error: runtime: out of memory
runtime stack:
runtime.SysMap(0xc209cd0000, 0x3b1bc0000, 0x570a00, 0x5783f8)
/usr/local/go/src/runtime/mem_linux.c:149 +0x98
runtime.MHeap_SysAlloc(0x57dae0, 0x3b1bc0000, 0x4296f2)
/usr/local/go/src/runtime/malloc.c:284 +0x124
runtime.MHeap_Alloc(0x57dae0, 0x1d8dda, 0x10100000000, 0x8)
/usr/local/go/src/runtime/mheap.c:240 +0x66
goroutine 1 [running]:
runtime.switchtoM()
/usr/local/go/src/runtime/asm_amd64.s:198 fp=0xc208518a60 sp=0xc208518a58
runtime.mallocgc(0x3b1bb25f0, 0x4d7fc0, 0x0, 0xc20803c0d0)
/usr/local/go/src/runtime/malloc.go:199 +0x9f3 fp=0xc208518b10 sp=0xc208518a60
runtime.newarray(0x4d7fc0, 0x3a164e, 0x1)
/usr/local/go/src/runtime/malloc.go:365 +0xc1 fp=0xc208518b48 sp=0xc208518b10
runtime.makeslice(0x4a52a0, 0x3a164e, 0x3a164e, 0x0, 0x0, 0x0)
/usr/local/go/src/runtime/slice.go:32 +0x15c fp=0xc208518b90 sp=0xc208518b48
github.com/mf/ahocorasick.(*Matcher).buildTrie(0xc2083c7e60, 0xc209860000, 0x26afb, 0x2f555)
/home/go/ahocorasick/ahocorasick.go:104 +0x28b fp=0xc208518d90 sp=0xc208518b90
github.com/mf/ahocorasick.NewStringMatcher(0xc208bd0000, 0x26afb, 0x2d600, 0x8)
/home/go/ahocorasick/ahocorasick.go:222 +0x34b fp=0xc208518ec0 sp=0xc208518d90
main.main()
/home/go/seme/substrings.go:66 +0x257 fp=0xc208518f98 sp=0xc208518ec0
runtime.main()
/usr/local/go/src/runtime/proc.go:63 +0xf3 fp=0xc208518fe0 sp=0xc208518f98
runtime.goexit()
/usr/local/go/src/runtime/asm_amd64.s:2232 +0x1 fp=0xc208518fe8 sp=0xc208518fe0
exit status 2
This is the content of the main function (taken from the test file in the same repo):
var dictionary = InitDictionary()
var bytes = []byte(""Partial invoice (€100,000, so roughly 40%) for the consignment C27655 we shipped on 15th August to London from the Make Believe Town depot. INV2345 is for the balance.. Customer contact (Sigourney) says they will pay this on the usual credit terms (30 days).")
var precomputed = ahocorasick.NewStringMatcher(dictionary)// line 66 here
fmt.Println(precomputed.Match(bytes))

Your structure is awfully inefficient in terms of memory; let's look at the internals. But before that, a quick reminder of the space required for some Go types on a platform with 4-byte pointers (on a 64-bit build, int, uintptr and pointers are 8 bytes and a slice header is 24 bytes, so everything below roughly doubles):
bool: 1 byte
int: 4 bytes
uintptr: 4 bytes
[N]type: N*sizeof(type)
[]type: 12 + len(slice)*sizeof(type)
Now, let's have a look at your structure:
type node struct {
root bool // 1 byte
b []byte // 12 + len(slice)*1
output bool // 1 byte
index int // 4 bytes
counter int // 4 bytes
child [256]*node // 256*4 = 1024 bytes
fails [256]*node // 256*4 = 1024 bytes
suffix *node // 4 bytes
fail *node // 4 bytes
}
OK, you can probably guess what happens here: each node weighs more than 2 KB, which is huge! Finally, let's look at the code you use to initialize your trie:
func (m *Matcher) buildTrie(dictionary [][]byte) {
max := 1
for _, blice := range dictionary {
max += len(blice)
}
m.trie = make([]node, max)
// ...
}
You said your dictionary is 4 MB. If that is 4 MB in total, then at the end of the for loop max = 4M. If it holds 4M different words, then max = 4M * avg(word_length).
We'll take the first scenario, the nicest one. You are initializing a slice of 4M nodes, each of which uses more than 2 KB. Yup, that makes a nice 8 GB necessary. (And since your stack trace comes from a 64-bit build, where int and pointers are 8 bytes, each node is really about 4 KB, which is why the runtime actually tried to allocate roughly 16 GB.)
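If you want to check the per-node cost on your own machine, a minimal sketch is enough; the node definition below is simply a copy of the one quoted above, and the 4 MiB figure is the dictionary size from the question:

package main

import (
    "fmt"
    "unsafe"
)

// node mirrors the matcher's internal layout quoted above.
type node struct {
    root    bool
    b       []byte
    output  bool
    index   int
    counter int
    child   [256]*node
    fails   [256]*node
    suffix  *node
    fail    *node
}

func main() {
    per := unsafe.Sizeof(node{}) // ≈2 KB with 4-byte pointers, ≈4 KB on amd64
    fmt.Println("bytes per node:", per)

    // Projected size of m.trie = make([]node, max) when max is roughly the
    // total number of bytes in a 4 MiB dictionary.
    const max = 4 << 20
    fmt.Printf("projected trie size: %.1f GB\n", float64(per)*max/1e9)
}

With 4-byte pointers this reproduces the ~2 KB-per-node figure used above; on a 64-bit build each node is roughly twice as large, which is why the allocation in the trace is far beyond what an 8 GB machine can satisfy.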
You should review how you build your trie. From the Wikipedia page on the Aho-Corasick algorithm, each node contains one character, so there are at most 256 edges leaving the root, not 4M.
Some material to make it right: https://web.archive.org/web/20160315124629/http://www.cs.uku.fi/~kilpelai/BSA05/lectures/slides04.pdf
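For illustration only (this is a sketch, not the package's actual implementation, and it leaves out the failure links and the matching logic), a per-byte trie node can allocate its children lazily so that each node only pays for the transitions it really has:

package main

import "fmt"

// trieNode is a sketch of a leaner layout: children live in a small map
// keyed by the next byte instead of two fixed [256]*node arrays.
type trieNode struct {
    children map[byte]*trieNode
    output   bool // a dictionary word ends at this node
}

// insert adds one word, creating nodes only along its path.
func insert(root *trieNode, word []byte) {
    n := root
    for _, c := range word {
        if n.children == nil {
            n.children = make(map[byte]*trieNode)
        }
        next, ok := n.children[c]
        if !ok {
            next = &trieNode{}
            n.children[c] = next
        }
        n = next
    }
    n.output = true
}

func main() {
    root := &trieNode{}
    for _, w := range []string{"fizz", "buzz", "123"} {
        insert(root, []byte(w))
    }
    fmt.Println("children of root:", len(root.children)) // 3: 'f', 'b', '1'
}

The number of nodes is still bounded by the total number of characters in the dictionary, but each node is a few dozen bytes instead of a couple of kilobytes.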

The node type has a memory size of 2084 bytes.
I wrote a little program to demonstrate the memory usage: https://play.golang.org/p/szm7AirsDB
As you can see, the three strings in dictionary := []string{"fizz", "buzz", "123"} (11(+1) bytes in total) require about 24 KB of memory (12 nodes * 2084 bytes).
If your dictionary has a length of 4 MB, you would need about 4 * 1024 * 1024 * 2084 ≈ 8.1 GB of memory.
So you should try to decrease the size of your dictionary.

Setting the resource limit to unlimited worked for me:
if ulimit -a reports 0, run ulimit -c unlimited.
Maybe set a real size limit to be safer.

Related

NtQueryObject returns wrong insufficient required size via WOW64, why?

I am using the NT native API NtQueryObject()/ZwQueryObject() from user mode (and I am aware of the risks in general and I have written kernel mode drivers for Windows in the past in my professional capacity).
Generally when one uses the typical "query information" function (of which there are a few) the protocol is first to ask with a too small buffer to retrieve the required size with STATUS_INFO_LENGTH_MISMATCH, then allocate a buffer of said size and query again -- this time using the buffer and previously returned size.
In order to get the list of object types (67 on my build) on the system I am doing just that:
ULONG Size = 0;
NTSTATUS Status = NtQueryObject(NULL, ObjectTypesInformation, &Size, sizeof(Size), &Size);
And in Size I get 8280 (WOW64) and 8968 (x64). I then proceed to allocate the buffer with calloc() and query again:
ULONG Size2 = 0;
BYTE* Buf = (BYTE*)::calloc(1, Size);
Status = NtQueryObject(NULL, ObjectTypesInformation, Buf, Size, &Size2);
NB: ObjectTypesInformation is 3. It isn't declared in winternl.h, but Nebbett (as ObjectAllTypesInformation) and others describe it. Since I am not querying for a particular object's traits but the system-wide list of object types, I pass NULL for the object handle.
Curiously on WOW64, i.e. 32-bit, the value in Size2 upon return from the second query is 16 Bytes (= 8296) bigger than the previously returned required size.
As far as alignment is concerned, I'd expect at most 8 Bytes for this sort of thing and indeed neither 8280 nor 8296 are at a 16 Byte alignment boundary, but on an 8 Byte one.
Certainly I can add some slack space on top of the returned required size (e.g. ALIGN_UP to the next 32 Byte alignment boundary), but this seems highly irregular to be honest. And I'd rather want to understand what's going on than to implement a workaround that breaks, because I miss something crucial.
The practical issue for the code is that in Debug configurations it tells me there's a corrupted heap somewhere, upon freeing Buf. Which suggests that NtQueryObject() was indeed writing these extra 16 Bytes beyond the buffer I provided.
Question: Any idea why it is doing that?
As usual for NT native API the sources of information are scarce. The x64 version of the exact same code returns the exact number of bytes required. So my thinking here is that WOW64 is the issue. A somewhat cursory look into wow64.dll with IDA didn't reveal any immediate points for suspicion regarding what goes wrong in translating the results to 32-bit here.
PS: Windows 10 (10.0.19043, ntdll.dll "timestamp" 77755782)
PPS: this may be related: https://wj32.org/wp/2012/11/30/obquerytypeinfo-and-ntqueryobject-buffer-overrun-in-windows-8/ Tested it, by checking that OBJECT_TYPE_INFORMATION::TypeName.Length + sizeof(WCHAR) == OBJECT_TYPE_INFORMATION::TypeName.MaximumLength in all returned items, which was the case.
The only part of ObjectTypesInformation that's public is the first field defined in winternl.h header in the Windows SDK:
typedef struct __PUBLIC_OBJECT_TYPE_INFORMATION {
UNICODE_STRING TypeName;
ULONG Reserved [22]; // reserved for internal use
} PUBLIC_OBJECT_TYPE_INFORMATION, *PPUBLIC_OBJECT_TYPE_INFORMATION;
For x86 this is 96 bytes, and for x64 this is 104 bytes (assuming you have the right packing mode enabled). The difference is the pointer in UNICODE_STRING which changes the alignment in x64.
Any additional memory space should be related to the TypeName buffer.
UNICODE_STRING accounts for 8 bytes of the difference between 8280 and 8296. The function uses the sizeof(ULONG_PTR) for alignment of the returned string plus an extra WCHAR, so that could easily account for the remaining 8 bytes.
AFAIK: The public use of NtQueryObject is supposed to be limited to kernel-mode use which of course means it always matches the OS native bitness (x86 code can't run as kernel in x64 native OS), so it's probably just a quirk of using the NT functions via the WOW64 thunk.
Alright, I think I figured out the issue with the help of WinDbg and a thorough look at wow64.dll using IDA.
NB: the wow64.dll I have has the same build number, but differs slightly in data only (checksum, security directory entry, pieces from version resources). The code is identical, which was to be expected, given deterministic builds and how they affect the PE timestamp.
There's an internal function called whNtQueryObject_SpecialQueryCase (according to PDBs), which covers the ObjectTypesInformation class queries.
For the above wow64.dll I used the following points of interest in WinDbg, from a 32 bit program which calls NtQueryObject(NULL, ObjectTypesInformation, ...) (the program itself is irrelevant, though):
0:000> .load wow64exts
0:000> bp wow64!whNtQueryObject_SpecialQueryCase+B0E0
0:000> bp wow64!whNtQueryObject_SpecialQueryCase+B14E
0:000> bp wow64!whNtQueryObject_SpecialQueryCase+B1A7
0:000> bp wow64!whNtQueryObject_SpecialQueryCase+B24A
0:000> bp wow64!whNtQueryObject_SpecialQueryCase+B252
Explanation of the above points of interest:
+B0E0: computing length required for 64 bit query, based on passed length for 32 bit
+B14E: call to NtQueryObject()
+B1A7: loop body for copying 64 to 32 bit buffer contents, after successful NtQueryObject() call
+B24A: computing written length by subtracting current (last + 1) entry from base buffer address
+B252: downsizing returned (64 bit) required length to 32 bit
The logic of this function in regards to just ObjectTypesInformation is roughly as follows:
Common steps
Take the ObjectInformationLength (32 bit query!) argument and size it up to fit the 64 bit info
Align the retrieved size up to the next 16 byte boundary
If necessary, allocate the resulting amount from some PEB::ProcessHeap and store it in TLS slot 3; otherwise reuse the buffer already stored there as scratch space
Call NtQueryObject() passing the buffer and length from the two previous steps
The length passed to NtQueryObject() is the one from step 1, not the one aligned to a 16 byte boundary. There seems to be some sort of header to this scratch space, so perhaps that's where the 16 byte alignment comes from?
Case 1: buffer size too small (here: 4), just querying required length
The up-sized length in this case equals 4, which is too small and consequently NtQueryObject() returns STATUS_INFO_LENGTH_MISMATCH. Required size is reported as 8968.
Down-size from the 64 bit required length to 32 bit and end up 16 bytes too short
Return the status from NtQueryObject() and the down-sized required length from the previous step
Case 2: buffer size supposedly (!) sufficient
Copy OBJECT_TYPES_INFORMATION::NumberOfTypes from queried buffer to 32 bit one
Step to the first entry (OBJECT_TYPE_INFORMATION) of source (64 bit) and target (32 bit) buffer, 8 and 4 byte aligned respectively
For each entry up to OBJECT_TYPES_INFORMATION::NumberOfTypes:
Copy UNICODE_STRING::Length and UNICODE_STRING::MaximumLength for TypeName member
memcpy() UNICODE_STRING::Length bytes from source to target UNICODE_STRING::Buffer (target entry + sizeof(OBJECT_TYPE_INFORMATION32))
Add terminating zero (WCHAR) past the memcpy'd string
Copy the individual members past the TypeName from 64 to 32 bit struct
Compute pointer of next entry by aligning UNICODE_STRING::MaximumLength up to an 8 byte boundary (i.e. the ULONG_PTR alignment mentioned in the other answer) + sizeof(OBJECT_TYPE_INFORMATION64) (already 8 byte aligned!)
The next target entry (32 bit) gets 4 byte aligned instead
At the end compute required (32 bit) length by subtracting the value we arrived at for the "next" entry (i.e. one past the last) from the base address of the buffer passed by the WOW64 program (32 bit) to NtQueryObject()
In my debugged scenario these were: 0x008ce050 - 0x008cbfe8 = 0x00002068 (= 8296), which is 16 bytes larger than the buffer length we were told during case 1 (8280)!
The issue
That crucial last step differs between merely querying and actually getting the buffer filled. There is no further bounds checking in that loop I described for case 2.
And this means it will just overrun the passed buffer and return a written length bigger than the buffer length passed to it.
Possible solutions and workarounds
I'll have to approach this mathematically after some sleep; the workaround is obviously to top up the required length returned from case 1 in order to avoid the buffer overrun. The easiest method is to apply my up_size_from_32bit() from the example below to the returned required size. This way you are allocating enough for the 64 bit buffer while querying the 32 bit one, so the copy loop should never overrun.
However, the fix in wow64.dll is a little more involved, I guess. While adding bounds checking to the loop would help avert the overrun, it would mean that the caller would have to query for the required size twice, because the first time around it lies to us.
Which means the query-only case (1) would have to allocate that internal buffer after querying the required length for 64 bit, then get it filled and then walk the entries (just like the copy loop), skipping over the last entry to compute the required length the same as it is now done after the copy loop.
Example program demonstrating the "static" computation by wow64.dll
Build for x64, just the way wow64.dll was!
#define WIN32_LEAN_AND_MEAN
#include <Windows.h>
#include <cstdio>
typedef struct
{
ULONG JustPretending[24];
} OBJECT_TYPE_INFORMATION32;
typedef struct
{
ULONG JustPretending[26];
} OBJECT_TYPE_INFORMATION64;
constexpr ULONG size_delta_3264 = sizeof(OBJECT_TYPE_INFORMATION64) - sizeof(OBJECT_TYPE_INFORMATION32);
constexpr ULONG down_size_to_32bit(ULONG len)
{
return len - size_delta_3264 * ((len - 4) / sizeof(OBJECT_TYPE_INFORMATION64));
}
constexpr ULONG up_size_from_32bit(ULONG len)
{
return len + size_delta_3264 * ((len - 4) / sizeof(OBJECT_TYPE_INFORMATION32));
}
// Trying to mimic the wdm.h macro
constexpr size_t align_up_by(size_t address, size_t alignment)
{
return (address + (alignment - 1)) & ~(alignment - 1);
}
constexpr auto u32 = 8280UL;
constexpr auto u64 = 8968UL;
constexpr auto from_64 = down_size_to_32bit(u64);
constexpr auto from_32 = up_size_from_32bit(u32);
constexpr auto from_32_16_byte_aligned = (ULONG)align_up_by(from_32, 16);
int wmain()
{
wprintf(L"32 to 64 bit: %u -> %u -(16-byte-align)-> %u\n", u32, from_32, from_32_16_byte_aligned);
wprintf(L"64 to 32 bit: %u -> %u\n", u64, from_64);
return 0;
}
static_assert(sizeof(OBJECT_TYPE_INFORMATION32) == 96, "Size for 32 bit struct does not match.");
static_assert(sizeof(OBJECT_TYPE_INFORMATION64) == 104, "Size for 64 bit struct does not match.");
static_assert(u32 == from_64, "Must match (from 64 to 32 bit)");
static_assert(u64 == from_32, "Must match (from 32 to 64 bit)");
static_assert(from_32_16_byte_aligned % 16 == 0, "16 byte alignment failed");
static_assert(from_32_16_byte_aligned > from_32, "We're aligning up");
This does not mimic the computation that happens in case 2, though.

Is there any limit in setting zram device disksize on Linux?

I'm trying to create a zram device on my target device. My target can not allocate memory if the zram disksize is above 100GB, but it's okay with the disksize of 50GB or less.
Is there any limit in setting zram device disksize on Linux? My target device only has 2GB of RAM memory.
I guess you can give a number up to UINT64_MAX - 4095 = 18446744073709547520 on a 64-bit platform.
https://github.com/torvalds/linux/blob/master/drivers/block/zram/zram_drv.h#L101
https://github.com/torvalds/linux/blob/master/drivers/block/zram/zram_drv.c#L1506
https://github.com/torvalds/linux/blob/master/drivers/block/zram/zram_drv.c#L901
So what we have:
... disksize_store(...) {
u64 disksize;
...
// memparse() returns an unsigned long long, so we can pass in at least UINT64_MAX here.
disksize = memparse(...);
// PAGE_ALIGN, PAGE_SIZE = 1<<12
disksize = PAGE_ALIGN(disksize)
= (((disksize)+((PAGE_SIZE)-1))&(~((typeof(disksize))(PAGE_SIZE)-1)))
= (disksize + ((1<<12)-1))&(~((1<<12)-1))
= (disksize + 4095) & 0xfffffffffffff000
// ^^^^^^^^^^^^^^^ this can overflow
// so max number is UINT64_MAX - 4095 so it doesn't overflow
// otherwise this macro will return 0
...
if (!zram_meta_alloc(..., disksize) {
...
return ...;
}
...
zram->disksize = disksize;
...
}
So let's look into zram_meta_alloc:
... zram_meta_alloc(..., disksize) {
...
num_pages = disksize >> PAGE_SHIFT;
// max num_pages = 0xfffffffffffff = UINT64_MAX >> PAGE_SHIFT
... = vzalloc(num_pages * sizeof(*zram->table));
// ^^^^^^^^^^^^^^^ this can overflow
...
}
vzalloc() takes an unsigned long argument, and ULONG_MAX should equal UINT64_MAX on a 64-bit platform. sizeof(*zram->table) is equal to sizeof(unsigned long) + sizeof(unsigned long) [optionally + sizeof(ktime_t)] + padding (see here). Without padding, assuming a 64-bit platform where sizeof(unsigned long) = 8, that is 16 or 24 bytes. In any case, the maximum num_pages is UINT64_MAX >> 12, so to overflow the 64-bit multiplication we would need sizeof(*zram->table) to exceed 2^PAGE_SHIFT = 4096, and that shouldn't happen (unless the compiler decides to add over 4000 bytes of padding to the zram->table struct). So we are left with UINT64_MAX - 4095.
So the maximum disksize is UINT64_MAX - 4095. If you give a disksize equal to UINT64_MAX - x, where 0 <= x < 4095, then because of the PAGE_ALIGN macro the disksize will effectively be set to 0. This should probably be brought up with a kernel developer so the PAGE_ALIGN macro can be made to handle such values.
Six days ago (at the time of writing), a call to array_size() was added to the vzalloc() call to protect against overflow, with this commit.
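The wraparound itself is plain 64-bit unsigned arithmetic, so it can be reproduced outside the kernel. A small Go sketch (not kernel code) that mimics the PAGE_ALIGN macro, assuming PAGE_SIZE = 4096 as above:

package main

import (
    "fmt"
    "math"
)

const pageSize = 1 << 12 // 4096, i.e. PAGE_SHIFT = 12

// pageAlign mimics the kernel's PAGE_ALIGN macro on a 64-bit value.
func pageAlign(x uint64) uint64 {
    return (x + pageSize - 1) &^ (pageSize - 1)
}

func main() {
    ok := uint64(math.MaxUint64) - 4095  // largest disksize that survives
    bad := uint64(math.MaxUint64) - 4094 // one byte more and the addition wraps
    fmt.Println(pageAlign(ok))  // 18446744073709547520
    fmt.Println(pageAlign(bad)) // 0: the effective disksize becomes zero
}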
There is no limit but there is an overhead.
"Note that zram uses about 0.1% of the size of the disk when not in use so a huge zram is wasteful."
https://www.kernel.org/doc/Documentation/blockdev/zram.txt
Also, disksize is a virtual size that depends purely on the input and on the compression ratio achieved by the chosen algorithm; disksize is the maximum uncompressed size and sets the general disk parameters.
The only 'actual' control is via mem_limit, which is the compressed size plus disk and zram overheads.
The compression ratio depends entirely on the compression algorithm chosen from /proc/crypto: zlib and zstd are far more effective but far slower. It also depends heavily on the input; with text, zlib and zstd can achieve more than double what lzo and lz4 will.
If the input is already compressed, any algorithm might achieve little to zero compression, and without a mem_limit it could grab much precious memory from the system.
mem_limit is the maximum you are prepared to let zram grab from the system, and a disksize any larger than the expected compression ratio applied to mem_limit is likely a waste.
It will never get used, but it will be part of the 0.1% empty-creation overhead.
Maybe try https://github.com/StuartIanNaylor/zram-config

Different memory allocations in linux and windows?

I have a tree (T*Tree: a binary tree with many elements per node) implemented in C++.
I want to insert around 5,000,000 integer values into it (say, from 1 to 5,000,000). The tree size should be around 8 * 5,000,000 bytes, or 41 MB, in memory (according to my implementation, which is reasonable).
When I display the size of the tree (computed in my program by summing the size of every node), it is 41 MB as expected. However, when I checked the 32-bit Windows "Task Manager", I found the memory taken was 732 MB!!
I checked that there is no extra malloc in my code. Even after I freed the tree, by traversing from node to node and deleting them (and the keys inside), the size in "Task Manager" only drops to 513 MB!!
After that I compiled the same code on 32-bit Ubuntu Linux (a virtual machine on another PC) and ran the program. Again the tree size reported by my program does not change, i.e. 41 MB as expected, but "System Monitor" shows 230 MB, and when I free the tree nodes the memory in "System Monitor" stays at 230 MB.
In both Windows and Linux, if I free and reinitialize the tree and insert 5,000,000 integer values again, the memory doubles, as if the previous space was not freed and is still used somewhere (and I cannot find where).
The questions:
1) Why are there such huge memory differences between Windows and Linux although the code and input data are the same?
2) Why doesn't freeing the tree nodes reduce the memory to some reasonable value like 10 MB?
code: https://drive.google.com/open?id=0ByKaCojxzNa9dEt6cEJNeDI4eXc
below are some snippets:
struct Keylist {
unsigned int k;
struct Keylist *next_ptr;
};
typedef struct Keylist Keylist;
struct TstarTreeNode {
//Binary Node specific
struct TstarTreeNode *left;
struct TstarTreeNode *right;
//Bool rightVisitedDuringInsert;
//AVL Node specific
int height;
//T Node specific
int length; //length of keys array for easy locating
struct Keylist *keys; //later you deal with it like a one-dimensional array
int max; //max key
int min; //min key
//T* Node specific
struct TstarTreeNode *successor;
};
typedef struct TstarTreeNode TstarTreeNode;
/*****************************************************************************
* *
* Define a structure for binary trees. *
* *
*****************************************************************************/
struct TstarTree {
int size; //number of element(not number of nodes) in a tree
int MinCount; //Min Count of elements in a Node
int MaxCount; //Max Count of elements in a Node
TstarTreeNode *root;
//Function pointers for comparing elements and checking ranges
int (*compare)(int key1, int key2); // -1 smaller, 0 equal, 1 bigger
int (*inRange)(int key, int min, int max); // -1 smaller, 0 in range, 1 bigger
};
typedef struct TstarTree TstarTree;
The tree's insert function uses dynamic allocation, i.e. malloc().
Update
Following what John Zwinck pointed out (thanks, John), I now have two things:
1) The huge memory usage on Windows was caused by the compile options in Visual Studio, which I think enabled debugging and a lot of extra bookkeeping. When I compiled on Windows using Cygwin without those options, i.e. "gcc main.c tstarTree.c -o main", I got the same result as on Linux: the size shown in Windows "Task Manager" is now 230 MB.
2) If the OS is 64-bit, the size is calculated as follows (as John said, with my modifications):
5 million unsigned int k. 20 MB.
5 million 4-byte pads (after k to align next_ptr). 20 MB.
5 million 8-byte next_ptr. 40 MB.
5 million times the overhead of malloc(). I think for a 64-bit OS it is 32 bytes each (according to the link John provided), so 160 MB.
N TstarTreeNodes, each of which is 48 bytes in the full code.
N times the overhead of malloc() (I think, 32 bytes each).
N is the number of nodes. The resulting tree is a balanced, complete tree of height 16, so I assume the number of nodes is 2^17 - 1. The last two items therefore come to 6.2 MB (i.e. 2^17 * 48) + 4.1 MB (i.e. 2^17 * 32) = 10 MB.
So the total is 20 + 20 + 40 + 160 + 10 = 250 MB, which is reasonable and close to 230 MB.
However, since my Windows/Linux is 32-bit, it will be (I think):
5 million unsigned int k. 20 MB.
5 million 4-byte next_ptr. 20 MB.
5 million times the overhead of malloc(). I think for a 32-bit OS it is 16 bytes each, so 80 MB.
N TstarTreeNodes, each of which is 32 bytes in the full code.
N times the overhead of malloc() (I think, 16 bytes each).
N is the number of nodes. The resulting tree is a balanced, complete tree of height 16, so I assume the number of nodes is 2^17 - 1. The last two items therefore come to 4.1 MB (i.e. 2^17 * 32) + 2 MB (i.e. 2^17 * 16) = 6 MB.
So the total is 20 + 20 + 80 + 6 = 126 MB, which is rather far from the 230 MB I see in "Task Manager" (if you know why, please tell me).
Currently the remaining important question is: why isn't the tree's memory released when I free all the nodes and keys in the tree using this code:
void freekeys(struct Keylist ** keys){
if ((*keys) == NULL)
{
return;
}
freekeys(&(*keys)->next_ptr);
(*keys)->next_ptr = NULL;
free((*keys));
(*keys) = NULL;
}
void freeTree(struct TstarTreeNode ** tree){
if ((*tree) == NULL)
{
return;
}
freeTree(&(*tree)->left);
freeTree(&(*tree)->right);
freekeys(&(*tree)->keys);
(*tree)->keys = NULL;
(*tree)->left = NULL;
(*tree)->right = NULL;
(*tree)->successor = NULL;
free((*tree));
(*tree) = NULL;
}
and in main():
TstarTree * tree;
...
freeTree(&tree->root);
free(tree);
Note:
The tree works perfectly (insert, update, delete, lookup, display, ...), but when I try to free the tree, the reported memory size does not change.
You say your data takes:
8 * 5,000,000 byte or 41MB in memory
But that is not correct. Looking at your code there are two main structures:
struct Keylist {
unsigned int k;
Keylist *next_ptr;
};
struct TstarTreeNode {
TstarTreeNode *left, *right;
Keylist *keys;
TstarTreeNode *successor;
};
Let's say we have 5 million integers to store, as in your example. What will we need?
5 million unsigned int k. 20 MB.
5 million 4-byte pads (after k to align next_ptr). 20 MB.
5 million 8-byte next_ptr. 40 MB.
5 million times the overhead of malloc(). Likely 16 bytes each. 80 MB.
N TstarTreeNodes, each of which is 48 bytes in the full code.
N times the overhead of malloc() (again, 16 bytes each).
If N is 500,000 (for example, I don't know the real value but you do), those last two items add up to 32 MB. That brings the total to at least 192 MB as a bare minimum. Therefore, seeing 230 MB of memory usage in Linux is not surprising.
Some systems, especially when optimization is not fully enabled at build time, will add more bookkeeping and debugging information to each block allocated with malloc(). Are you building with optimization fully enabled?
One way you can save a lot of overhead is to stop using Keylist and just store the integers in plain arrays (created with malloc(), but only one per TstarTreeNode).

Can the logical erase block size of an MTD device be increased?

The minimum erase block size for jffs2 (mtd-utils version 1.5.0, mkfs.jffs2 revision 1.60) seems to be 8KiB:
Erase size 0x1000 too small. Increasing to 8KiB minimum
However I am running Linux 3.10 with an at25df321a,
m25p80 spi32766.0: at25df321a (4096 Kbytes),
and the erase block size is only 4KiB:
mtd5
Name: spi32766.0
Type: nor
Eraseblock size: 4096 bytes, 4.0 KiB
Amount of eraseblocks: 1024 (4194304 bytes, 4.0 MiB)
Minimum input/output unit size: 1 byte
Sub-page size: 1 byte
Character device major/minor: 90:10
Bad blocks are allowed: false
Device is writable: true
Is there a way to make the mtd system treat multiple erase blocks as one? Maybe some ioctl or module parameter?
If I flash a jffs2 image with a larger erase block size, I get lots of kernel error messages, missing files, and sometimes a panic.
workaround
I found that flash_erase --jffs2 results in a working filesystem in spite of the 4 KiB erase block size. So I hacked the mkfs.jffs2.c file and the resulting image seems to work fine. I'll give it some testing.
diff -rupN orig/mkfs.jffs2.c new/mkfs.jffs2.c
--- orig/mkfs.jffs2.c 2014-10-20 15:43:31.751696500 +0200
+++ new/mkfs.jffs2.c 2014-10-20 15:43:12.623431400 +0200
@@ -1659,11 +1659,11 @@ int main(int argc, char **argv)
}
erase_block_size *= units;
- /* If it's less than 8KiB, they're not allowed */
- if (erase_block_size < 0x2000) {
- fprintf(stderr, "Erase size 0x%x too small. Increasing to 8KiB minimum\n",
+ /* If it's less than 4KiB, they're not allowed */
+ if (erase_block_size < 0x1000) {
+ fprintf(stderr, "Erase size 0x%x too small. Increasing to 4KiB minimum\n",
erase_block_size);
- erase_block_size = 0x2000;
+ erase_block_size = 0x1000;
}
break;
}
http://lists.infradead.org/pipermail/linux-mtd/2010-September/031876.html
JFFS2 should be able to fit at least one node into an eraseblock. The
maximum node size is 4 KiB plus a few bytes. This is why the minimum
eraseblock size is 8 KiB.
But in practice, even 8 KiB is bad because you end up wasting a
lot of space at the end of eraseblocks.
You should join several eraseblocks into one virtual eraseblock of 64 or
128 KiB and use that - this will be more optimal.
Some drivers have already implemented this. I know about the
MTD_SPI_NOR_USE_4K_SECTORS
Linux configuration option. It has to be set to "n" to enable large erase sectors of size 0x00010000.

RES != CODE + DATA in the output information of the top command, why?

What 'man top' says is: RES = CODE + DATA
q: RES -- Resident size (kb)
The non-swapped physical memory a task has used.
RES = CODE + DATA.
r: CODE -- Code size (kb)
The amount of physical memory devoted to executable code, also known as the 'text resident set' size or TRS.
s: DATA -- Data+Stack size (kb)
The amount of physical memory devoted to other than executable code, also known as the 'data resident set' size or DRS.
But when I run 'top -p 4258', I get the following:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ CODE DATA COMMAND
258 root 16 0 3160 1796 1328 S 0.0 0.3 0:00.10 476 416 bash
1796 != 476+416
why?
ps:
linux distribution:
linux-iguu:~ # lsb_release -a
LSB Version: core-2.0-noarch:core-3.0-noarch:core-2.0-ia32:core-3.0-ia32:desktop-3.1-ia32:desktop-3.1-noarch:graphics-2.0-ia32:graphics-2.0-noarch:graphics-3.1-ia32:graphics-3.1-noarch
Distributor ID: SUSE LINUX
Description: SUSE Linux Enterprise Server 9 (i586)
Release: 9
Codename: n/a
kernel version:
linux-iguu:~ # uname -a
Linux linux-iguu 2.6.16.60-0.21-default #1 Tue May 6 12:41:02 UTC 2008 i686 i686 i386 GNU/Linux
I'll explain this with the help of an example of what happens when a program allocates and uses memory. Specifically, this program:
#include <stdio.h>
#include <stdlib.h>
#include <errno.h>
#include <string.h>
int main(){
int *data, size, count, i;
printf( "fyi: your ints are %d bytes large\n", sizeof(int) );
printf( "Enter number of ints to malloc: " );
scanf( "%d", &size );
data = malloc( sizeof(int) * size );
if( !data ){
perror( "failed to malloc" );
exit( EXIT_FAILURE );
}
printf( "Enter number of ints to initialize: " );
scanf( "%d", &count );
for( i = 0; i < count; i++ ){
data[i] = 1337;
}
printf( "I'm going to hang out here until you hit <enter>" );
while( getchar() != '\n' );
while( getchar() != '\n' );
exit( EXIT_SUCCESS );
}
This is a simple program that asks you how many integers to allocate, allocates them, asks how many of those integers to initialize, and then initializes them. For a run where I allocate 1250000 integers and initialize 500000 of them:
$ ./a.out
fyi: your ints are 4 bytes large
Enter number of ints to malloc: 1250000
Enter number of ints to initialize: 500000
Top reports the following information:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ SWAP CODE DATA COMMAND
<program start>
11129 xxxxxxx 16 0 3628 408 336 S 0 0.0 0:00.00 3220 4 124 a.out
<allocate 1250000 ints>
11129 xxxxxxx 16 0 8512 476 392 S 0 0.0 0:00.00 8036 4 5008 a.out
<initialize 500000 ints>
11129 xxxxxxx 15 0 8512 2432 396 S 0 0.0 0:00.00 6080 4 5008 a.out
The relevant information is:
DATA CODE RES VIRT
before allocation: 124 4 408 3628
after 5MB allocation: 5008 4 476 8512
after 2MB initialization: 5008 4 2432 8512
After I malloc'd 5MB of data, both VIRT and DATA increased by ~5MB, but RES did not. RES did increase after I touched 2MB of the integers I allocated, but DATA and VIRT stayed the same.
VIRT is the total amount of virtual memory used by the process, including what is shared and what is over-committed. DATA is the amount of virtual memory used that isn't shared and that isn't code-text, i.e. the virtual stack and heap of the process. RES is not virtual: it is a measurement of how much memory the process is actually using at that specific time.
So in your case, the large inequality CODE + DATA < RES is most likely due to the shared libraries loaded by the process. In my example (and yours), SHR + CODE + DATA is a closer approximation to RES.
Hope this helps.
There's a lot of hand-waving and voodoo associated with top and ps. There are many articles (rants?) online about the discrepancies. E.g., this and this.
This explanation resolved several of my questions, thanks!
Meanwhile, let me try to add some things I picked up while learning about Linux memory management. Please correct me if I have misunderstood anything!
1) Modern OS process concepts are based on virtual memory, and the virtual memory system includes RAM + swap;
so I think most process-related memory concepts refer to virtual memory, with a few exceptions noted below.
2) Any virtual address (page) allocated to a process is in one of the states below:
a) allocated, but not mapped to any physical memory (something like COW)
b) allocated and already mapped to physical memory
c) allocated and already mapped to swapped-out memory.
3) The fields output by the top command:
a) VIRT -- all virtual memory that the process has the right to access, whether it is already mapped to physical memory, swapped out, or not mapped at all.
b) RES -- the virtual addresses already mapped to physical memory and still resident in RAM.
c) SWAP -- the virtual addresses that were mapped but have been swapped out to swap space.
d) SHR -- the shared memory available to the process (virtual?).
e) CODE + DATA -- CODE can be in state 2.b or 2.c, and DATA can be in any of the three states 2.a/2.b/2.c; the sum of 3.b and 3.c is also reported in a field called "USED".
4) So the calculation might look like:
a) VIRT = RES (VM in memory) + SWAP (VM in swap) + unmapped VM (DATA, SHR?)
b) USED = RES + SWAP
c) SWAP = CODE (VM in swap) + DATA (VM in swap) + SHR (VM in swap?)
d) RES = CODE (VM in memory) + DATA (VM in memory) + SHR (VM in memory?)
At the very least, the DATA segment also has an unmapped-VM component ("DATA (VM unmapped)"), which can be observed in the malloc example above. That is a little different from the top man page, which says "DATA: The amount of physical memory devoted to other than executable code, also known as the Data Resident Set size or DRS". Thanks again.
So the sum CODE + DATA + SHR is usually larger than RES, because at least the unmapped part of DATA is counted in "DATA", unlike what the man page claims.
Regards,
