I am trying to get rid of a memory leak, but my understanding in this area is pretty limited and I have nobody to ask for help except you guys. My script is killing the server's RAM and I can't figure out what is wrong with my approach.
I have this function:
function getPages(params){
  gmail.users.messages.list(params, (err, resp) => {
    for (var message of resp.messages) {
      message['ownerEmail'] = currentUser;
      getMessage(message); // this does something with it later
      var message = null;
    }
    if (resp.nextPageToken) {
      params.pageToken = resp.nextPageToken;
      getPages(params);
    } else {
      // resolve end here...
    }
  }) // gmail.users.messages.list
} // fetchPages
getPages(params);
Basically it gets messages from the API and should do something with them afterwards. It calls itself as long as there is more data to fetch (i.e. as long as nextPageToken exists in the response).
Now I ran this command:
$ free -lm
              total        used        free      shared  buff/cache   available
Mem:          11935        1808        7643         401        2483        9368
Low:          11935        4291        7643
High:             0           0           0
Swap:          6062           0        6062
While the script is running, buff/cache keeps increasing.
What is the buff/cache thing actually and how is it related to my Node script?
How do I manage what is buffered/cached and how do I kill/clear such stuff?
How do I optimize the function above so it forgets everything that has already been processed?
How do I make sure that script takes absolutely zero resources once it is finished? (I even tried process.exit at the end of the script)
How do I debug and monitor RAM usage from my Node.js script?
I don't think there is a memory leak. I think you are in an infinite loop with the recursion. The gmail.users.messages.list call returns a response with resp.nextPageToken present (I suppose), and then you call getPages(params) again. Can you put a console.log just before the getPages(params) call? Something like this:
if (resp.nextPageToken) {
  params.pageToken = resp.nextPageToken;
  console.log('token', params.pageToken);
  getPages(params);
}
and check how many times this is printed and whether you ever get out of the recursion. Also, why do you set message to null inside the iteration? That re-declares the variable.
You can use N|Solid (it's free for development); you launch your app inside its wrapper. It's quite easy to use and it lets you take a full profile of where the leak occurs.
You can also do it manually with the built-in debugger and check memory consumption at each step.
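To answer the last question in the post (monitoring RAM from the script itself), here is a minimal sketch using Node's built-in process.memoryUsage(); the 5-second interval is arbitrary:
// Print the resident set size and heap figures every 5 seconds.
const monitor = setInterval(() => {
  const { rss, heapTotal, heapUsed } = process.memoryUsage();
  console.log(
    `rss=${(rss / 1048576).toFixed(1)}MB ` +
    `heapTotal=${(heapTotal / 1048576).toFixed(1)}MB ` +
    `heapUsed=${(heapUsed / 1048576).toFixed(1)}MB`
  );
}, 5000);

// Call clearInterval(monitor) when the work is done, otherwise the timer
// keeps the process alive.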
Just to answer one of the questions within the post:
How do I make sure that script takes absolutely zero resources once it
is finished? (I even tried process.exit at the end of the script)
There has been a misunderstanding:
http://www.linuxatemyram.com/
Don't Panic! Your ram is fine!
What's going on? Linux is borrowing unused memory for disk caching.
This makes it look like you are low on memory, but you are not!
Everything is fine!
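In other words, the growing buff/cache number is the kernel's page cache, not memory held by your Node process, and the kernel gives it back automatically when applications need it. If you still want to clear it by hand (normally unnecessary), you can drop the caches:
sync                                        # flush dirty pages to disk first
echo 3 | sudo tee /proc/sys/vm/drop_caches  # drop page cache, dentries and inodes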
I get this every time I try to create an account to ask this on Stack Overflow:
Oops! Something Bad Happened!
We apologize for any inconvenience, but an unexpected error occurred while you were browsing our site.
It’s not you, it’s us. This is our fault.
That's why I'm posting it here. I literally cannot ask it on Stack Overflow, even after spending hours of my day (on and off) repeating my attempts and solving a million reCAPTCHA puzzles. Can you maybe fix this error soon?
With no meaningful or complete examples, and basically no documentation whatsoever, I've been trying on and off for years to use the "shmop" part of PHP. Now I need a way to send data between two different CLI PHP scripts running on the same machine, without abusing the database for it. It must work without database support, which is why I'm trying to use shmop, but it doesn't work at all:
// I have no idea what the "key" is supposed to be. The manual says:
// "System's id for the shared memory block. Can be passed as a decimal or hex.",
// so I've given it a 1 and also tried 123. It gave an error when I set the
// size to 64, so I increased it to 99999. That's when the error changed to
// the one I now face below.
$shmopid = shmop_open(1, 'w', 0644, 99999);

shmop_write($shmopid, 'meow 123', 0); // Write "meow 123" to the shared variable.

while (1)
{
    // Read the "meow 123", even though it's the same script right now
    // (since this is an example and a minimal test).
    $shared_string = shmop_read($shmopid, 0, 8);
    var_dump($shared_string);
    sleep(1);
}
I get the error for the first line:
shmop_open(): unable to attach or create shared memory segment 'No error':
What does that mean? What am I doing wrong? Why is the manual so insanely cryptic for this? Why isn't this just a built-in "superarray" that can be accessed across the scripts?
About CLI:
It cannot work in standalone CLI processes, as an answer here says:
https://stackoverflow.com/a/34533749
The master process is the one to hold the shared memory block, so you will have to use php-fpm or mod_php or some other web/service-running version, and maybe even start/request/stop it all from a CLI php script.
About shmop usage itself:
Use "c" mode in shmop_open() for creating the segment before it can be used with "a" or "w".
I stumbled on this error in a different scenario, where shared memory is completely optional and only used to speed up some repeated operations. So I wanted to try reading first, without knowing the memory size, and then allocate from the actual data when needed. In my case I had to call it as @shmop_open() to suppress this error output.
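For illustration, a minimal sketch of the create-then-attach flow (the ftok() key derivation and the sizes are just assumptions for the example; error handling omitted):
<?php
// Create the segment with "c" (use "n" if it must not already exist).
$key  = ftok(__FILE__, 'm');   // any agreed-upon integer key works as well
$size = 1024;

$seg = shmop_open($key, 'c', 0644, $size);
shmop_write($seg, 'meow 123', 0);

// Attach again (shown here in the same script) and read the data back.
$seg2  = shmop_open($key, 'a', 0, 0);
$value = shmop_read($seg2, 0, 8);
var_dump($value); // string(8) "meow 123"

// Mark the segment for deletion once it is no longer needed.
shmop_delete($seg);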
About shmop on Windows:
PHP 7 crashed the Apache worker process (causing a restart with status 3221225477) when trying to reallocate a segment with the same predefined (arbitrary) key but a different size, even after shmop_delete(). As a workaround, I took the filemtime() of the source file containing the data to be stored in memory and used that timestamp as the key in shmop_open(). It still wasn't flawless, IIRC, and I don't know whether it causes memory leaks, but it was enough to test my code, which would mainly run on Linux anyway.
Finally, as of PHP version 8.0.7, shmop seems to work fine with Apache 2.4.46 and mod_php in Windows 10.
On Linux, you can read /proc/sys/fs/aio-nr, which returns the total number of events allocated across all active AIO contexts in the system. The maximum is controlled by /proc/sys/fs/aio-max-nr.
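For example (plain shell, nothing specific to any one tool):
$ cat /proc/sys/fs/aio-nr              # events currently allocated, system-wide
$ cat /proc/sys/fs/aio-max-nr          # the system-wide limit
$ watch -n1 cat /proc/sys/fs/aio-nr    # watch allocations change over time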
Is there a way to tell which process is responsible for allocating these aio contexts?
There isn't a simple way. At least, not that I've ever found! However, you can see them being consumed and freed using systemtap.
https://blog.pythian.com/troubleshooting-ora-27090-async-io-errors/
Attempting to execute the complete script in that article produced errors on my CentOS 7 system. But if you just take the first part of it, the part that logs allocations, it may give you enough insight:
stap -ve '
global allocated, allocatedctx
probe syscall.io_setup {
    allocatedctx[pid()] += maxevents; allocated[pid()]++;
    printf("%d AIO events requested by PID %d (%s)\n",
           maxevents, pid(), cmdline_str());
}
'
You'll need to coordinate things such that systemtap is running before your workload kicks in.
Install systemtap, then execute the above command. (Note, I've altered this slightly from the linked article to remove the unused freed symbol.) After a few seconds, it'll be running. Then, start your workload.
Pass 1: parsed user script and 469 library scripts using 227564virt/43820res/6460shr/37524data kb, in 260usr/10sys/263real ms.
Pass 2: analyzed script: 5 probes, 14 functions, 101 embeds, 4 globals using 232632virt/51468res/11140shr/40492data kb, in 80usr/150sys/240real ms.
Missing separate debuginfos, use: debuginfo-install kernel-lt-4.4.70-1.el7.elrepo.x86_64
Pass 3: using cached /root/.systemtap/cache/55/stap_5528efa47c2ab60ad2da410ce58a86fc_66261.c
Pass 4: using cached /root/.systemtap/cache/55/stap_5528efa47c2ab60ad2da410ce58a86fc_66261.ko
Pass 5: starting run.
Then, once your workload starts, you'll see the context requests logged:
128 AIO events requested by PID 28716 (/Users/blah/awesomeprog)
128 AIO events requested by PID 28716 (/Users/blah/awesomeprog)
So, not as simple as lsof, but I think it's all we have!
Given an untrusted memory address, is there a way in Linux to test whether it points to valid, accessible memory?
For example, in mach you can use vm_read_overwrite() to attempt to copy data from the specified location. If the address is invalid or inaccessible, it will return an error code rather than crashing the process.
Write from that memory (into /dev/null, for example; EDIT: with /dev/null it might not work as expected, so use a pipe instead), and you'll receive an EFAULT error if the address is inaccessible.
I have no idea how to test for writable memory without destroying its content if it is writable.
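To make the pipe trick concrete, here is a minimal sketch in C (is_readable is a hypothetical helper name, not something from the original post):
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <unistd.h>

/* Returns true if at least one byte at addr is readable. */
static bool is_readable(const void *addr)
{
    int fds[2];
    if (pipe(fds) == -1)
        return false;                    /* could not create the probe pipe */

    ssize_t n = write(fds[1], addr, 1);  /* kernel copies one byte from addr */
    bool ok = (n == 1);                  /* write fails with EFAULT if unreadable */

    close(fds[0]);
    close(fds[1]);
    return ok;
}

int main(void)
{
    int x = 42;
    printf("&x readable: %d\n", is_readable(&x));     /* expect 1 */
    printf("NULL readable: %d\n", is_readable(NULL)); /* expect 0 */
    return 0;
}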
This is a typical case of TOCTOU (time-of-check to time-of-use): you check at some point that the memory is writable, then later you try to write to it, and somehow (e.g. because the application deallocated it) the memory is no longer accessible.
There is only one valid way to actually do this: trap the fault you get from writing to it when you actually need to use it.
Of course, you can use tricks to try to figure out if the memory "may be writeable", but there is no way you can actually ensure it is writeable.
You may want to explain slightly more what you are actually trying to do, and maybe we can have some better ideas if you are more specific.
You can try msync:
int page_size = getpagesize();
void *aligned = (void *)((uintptr_t)p & ~(page_size - 1));
if (msync(aligned, page_size, MS_ASYNC) == -1 && errno == ENOMEM) {
// Not accessible: no mapping exists at this address
}
But this function may be slow and should not be used in performance-critical circumstances.
I'm using IPython.parallel to process a large amount of data on a cluster. The remote function I run looks like:
def evalPoint(point, theta):
    # do some complex calculation
    return (cost, grad)
which is invoked by this function:
def eval(theta, client, lview, data):
    async_results = []
    for point in data:
        # evaluate current data point
        ar = lview.apply_async(evalPoint, point, theta)
        async_results.append(ar)

    # wait for all results to come back
    client.wait(async_results)

    # and retrieve their values
    values = [ar.get() for ar in async_results]

    # unzip data from original tuple
    totalCost, totalGrad = zip(*values)

    avgGrad = np.mean(totalGrad, axis=0)
    avgCost = np.mean(totalCost, axis=0)

    return (avgCost, avgGrad)
If I run the code:
client = Client(profile="ssh")
client[:].execute("import numpy as np")
lview = client.load_balanced_view()
for i in xrange(100):
    eval(theta, client, lview, data)
the memory usage keeps growing until I eventually run out (76GB of memory). I've simplified evalPoint to do nothing in order to make sure it wasn't the culprit.
The first part of eval was copied from IPython's documentation on how to use the load balancer. The second part (unzipping and averaging) is fairly straightforward, so I don't think that's responsible for the memory leak. Additionally, I've tried manually deleting objects in eval and calling gc.collect(), with no luck.
I was hoping someone with IPython.parallel experience could point out something obvious I'm doing wrong, or could confirm that this is in fact a memory leak.
Some additional facts:
I'm using Python 2.7.2 on Ubuntu 11.10
I'm using IPython version 0.12
I have engines running on servers 1-3, and the client and hub running on server 1. I get similar results if I keep everything on just server 1.
The only thing I've found similar to a memory leak for IPython had to do with %run, which I believe was fixed in this version of IPython (also, I am not using %run)
Update:
Also, I tried switching logging from memory to SQLiteDB in case that was the problem, but I still see the same issue.
Response to the answer (1):
The memory consumption is definitely in the controller (I could verify this by (a) running the client on another machine and (b) watching top). I hadn't realized that the non-SQLiteDB backends would still consume memory, so I hadn't bothered purging.
If I use DictDB and purge, I still see the memory consumption go up, but at a much slower rate. It was hovering around 2GB for 20 invocations of eval().
If I use MongoDB and purge, it looks like mongod is taking around 4.5GB of memory and ipcluster about 2.5GB.
If I use SQLite and try to purge, I get the following error:
File "/usr/local/lib/python2.7/dist-packages/IPython/parallel/controller/hub.py", line 1076, in purge_results
self.db.drop_matching_records(dict(completed={'$ne':None}))
File "/usr/local/lib/python2.7/dist-packages/IPython/parallel/controller/sqlitedb.py", line 359, in drop_matching_records
expr,args = self._render_expression(check)
File "/usr/local/lib/python2.7/dist-packages/IPython/parallel/controller/sqlitedb.py", line 296, in _render_expression
expr = "%s %s"%null_operators[op]
TypeError: not enough arguments for format string
So, I think if I use DictDB, I might be okay (I'm going to try a run tonight). I'm not sure if some memory consumption is still expected or not (I also purge in the client like you suggested).
Is it the controller process that is growing, or the client, or both?
The controller remembers all requests and all results, so the default behavior of storing this information in a simple dict will result in constant growth. Using a db backend (sqlite or preferably mongodb if available) should address this, or the client.purge_results() method can be used to instruct the controller to discard any/all of the result history (this will delete them from the db if you are using one).
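For example, a minimal sketch of that controller-side purge (whether purge_results() accepts 'all' may depend on your IPython version, so treat this as an assumption):
# Ask the controller (hub) to forget its stored request/result history.
# With a db backend configured, the records are removed from the db as well.
client.purge_results('all')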
The client itself caches all of its own results in its results dict, so this, too, will result in growth over time. Unfortunately, this one is a bit harder to get a handle on, because references can propagate in all sorts of directions, and is not affected by the controller's db backend.
This is a known issue in IPython, but for now you should be able to clear the references manually by deleting the entries in the client's results/metadata dicts and, if your view is sticking around, in its own results dict:
# ...
# and retrieve their values
values = [ar.get() for ar in async_results]

# clear references to the local cache of results:
for ar in async_results:
    for msg_id in ar.msg_ids:
        del lview.results[msg_id]
        del client.results[msg_id]
        del client.metadata[msg_id]
Or, you can purge the entire client-side cache with simple dict.clear():
view.results.clear()
client.results.clear()
client.metadata.clear()
Side note:
Views have their own wait() method, so you shouldn't need to pass the Client to your function at all. Everything should be accessible via the View, and if you really need the client (e.g. for purging the cache), you can get it as view.client.
I have a curious memory leak; it seems that the library function To_Unbounded_String is leaking!
Code snippets:
procedure Parse (Str : in String;
   ... do stuff ...
   declare
      New_Element : constant Ada.Strings.Unbounded.Unbounded_String :=
        Ada.Strings.Unbounded.To_Unbounded_String (Str); -- this leaks
   begin
valgrind output:
==6009== 10,276 bytes in 1 blocks are possibly lost in loss record 153 of 153
==6009== at 0x4025BD3: malloc (vg_replace_malloc.c:236)
==6009== by 0x42703B8: __gnat_malloc (in /usr/lib/libgnat-4.4.so.1)
==6009== by 0x4269480: system__secondary_stack__ss_allocate (in /usr/lib/libgnat-4.4.so.1)
==6009== by 0x414929B: ada__strings__unbounded__to_unbounded_string (in /usr/lib/libgnat-4.4.so.1)
==6009== by 0x80F8AD4: syntax__parser__dash_parser__parseXn (token_parser_g.adb:35)
Where token_parser_g.adb:35 is listed above as the "-- this leaks" line.
Other info: gnatmake version 4.4.5, gcc version 4.4, valgrind version valgrind-3.6.0.SVN-Debian; valgrind options: -v --leak-check=full --read-var-info=yes --show-reachable=no
Any help or insights appreciated,
NWS.
Valgrind says there is possibly a memory leak; that doesn't necessarily mean there is one. For example, if the first call to that function allocates a pool of memory that is reused during the lifetime of the program but never freed, Valgrind will report it as a possible leak even though it is not: this is a common practice, and the memory is returned to the OS upon process termination.
Now, if you think there really is a memory leak, call this function in a loop and see if memory continues to grow. If it does, file a bug report, or even better, try to find and fix the leak and send a patch along with the bug report.
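For example, a minimal sketch along those lines (the procedure name, test string and iteration count are arbitrary):
--  leak_check.adb: call To_Unbounded_String repeatedly and watch the
--  process in top (or run it under valgrind) to see whether memory grows.
with Ada.Strings.Unbounded;

procedure Leak_Check is
   use Ada.Strings.Unbounded;
   S : Unbounded_String;
begin
   for I in 1 .. 1_000_000 loop
      S := To_Unbounded_String ("some reasonably long test data");
   end loop;
end Leak_Check;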
Hope it helps.
Was trying to keep this to comments, but what I was saying got too long and started to need formatting.
In Ada, string objects are generally assumed to be perfectly sized. The language provides functions to return the size and bounds of any string. Because of this, string handling in Ada is very different from C's, and in fact more resembles how you'd do it in a functional language like Lisp.
But the basic principle is that, except in some very unusual situations, if you find yourself using Ada.Strings.Unbounded, you are going about things the wrong way.
The one case where you really can't get around using a variable-length string (or perhaps a buffer with a separate valid_length variable), is when reading strings as input from some external source. As you say, your parsing example is such a situation.
However, even here you should only have that situation on the initial buffer. Your call to your Parse routine should look something like this:
Ada.Text_IO.Get_Line (Buffer, Buffer_Len);
Parse (Buffer(Buffer'first..Buffer'first + Buffer_Len - 1));
Now inside the Parse routine you have a perfectly-sized constant Ada string to work with. If for some reason you need to pull out a subslice, you would do the following:
... -- Code to find start and end indices of my subslice
New_Element : constant String := Str(Element_Start .. Element_End);
If you don't actually need to make a copy of that data for some reason though, you are better off just finding Element_Start and Element_End and working with a slice of the original string buffer. Eg:
if Str(Element_Start..Element_End) = "MyToken" then
I know this doesn't answer your question about Ada.Strings.Unbounded possibly leaking. But even if it doesn't leak, that code is relatively wasteful of machine resources (CPU and memory), and probably shouldn't be used for string manipulation unless you really need it.
Are bound[ed] strings scoped?
Expanding on @T.E.D.'s comments, Ada.Strings.Bounded "objects should not be implemented by implicit pointers and dynamic allocation." Instead, the maximum size is fixed when the generic is instantiated. As an implementation detail, GNAT uses a discriminant to specify the maximum size of the string and a record to store the current size and contents.
In contrast, Ada.Strings.Unbounded requires that "No storage associated with an Unbounded_String object shall be lost upon assignment or scope exit." As an implementation detail, GNAT uses a buffered implementation derived from Ada.Finalization.Controlled. As a result, the memory used by an Unbounded_String may appear to be a leak until the object is finalized, for example when the code returns to an enclosing scope.
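To make the contrast concrete, here is a small sketch of instantiating a bounded string type (the Max value of 80 is an arbitrary choice for the example):
with Ada.Strings.Bounded;

procedure Bounded_Demo is
   package B_80 is new Ada.Strings.Bounded.Generic_Bounded_Length (Max => 80);
   use B_80;
   --  No heap allocation: the object carries its own fixed-size buffer.
   S : constant Bounded_String := To_Bounded_String ("fits in 80 characters");
begin
   null;
end Bounded_Demo;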