Rust application VSZ keeps growing

I have a Rust application that receives uploaded files from users and saves them to disk.
I am using axum and tokio. Here is the part that I think has the problem.
In my request handler I receive the uploaded file and write it to disk like this:
// `field` is a multipart field (a stream of byte chunks) and `f` is an async
// file handle (e.g. tokio::fs::File) opened for the uploaded file.
while let Some(chunk) = field.next().await {
    let chunk = chunk.map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
    f.write_all(&chunk)
        .await
        .map_err(|_| StatusCode::INTERNAL_SERVER_ERROR)?;
}
My app allocates memory for this operation, but it looks like the memory does not go back to the OS. Here is my docker top result after 5 days of the server running:
I ran heaptrack and it reported no leak.
I also ran valgrind and it reported no leak.

Related

Node.js read big file with fs.readFileSync

I am trying to load a big file (~6 GB) into memory with fs.readFileSync on a server with 96 GB of RAM.
The problem is that it fails with the following error message:
RangeError: Attempt to allocate Buffer larger than maximum size: 0x3fffffff bytes
Unfortunately I didn't find a way to increase the Buffer limit; it seems to be a constant.
How can I overcome this problem and load a big file with Node.js?
Thank you!
I also hit the same problem when trying to load a 6.4 GB video file to create a file hash.
I read the whole file with fs.readFile() and it caused a RangeError [ERR_FS_FILE_TOO_LARGE]. Then I used a stream to do it:
const crypto = require('crypto');
const fs = require('fs');

const hash = crypto.createHash('md5');
const stream = fs.createReadStream(file_path);

stream.on('data', _buff => { hash.update(_buff); });
stream.on('end', () => {
    const hashCheckSum = hash.digest('hex');
    // Save the hashCheckSum into the database.
});
Hope it helped.
From a joyent FAQ:
What is the memory limit on a node process?
Currently, by default V8 has a memory limit of 512 MB on 32-bit systems and 1 GB on 64-bit systems. The limit can be raised by setting --max_old_space_size to a maximum of ~1024 (~1 GiB) on 32-bit and ~1741 (~1.7 GiB) on 64-bit, but it is recommended that you split your single process into several workers if you are hitting memory limits.
If you show more detail about what's in the file and what you're doing with it, we can probably offer some ideas on how to work with it in chunks. If it's pure data, then you probably want to use a database and let the database handle fetching things from disk as needed and manage the memory.
Here's a fairly recent discussion of the issue: https://code.google.com/p/v8/issues/detail?id=847
And here's a blog post that claims you can edit the V8 source code and rebuild Node to remove the memory limit. Try this at your own discretion.

How to dump the heap of running C++ process to a file under Linux?

I've got a program that is running on a headless/embedded Linux box, and under certain circumstances that program seems to be using up quite a bit more memory (as reported by top, etc) than I would expect it to use.
Since the fault condition is difficult to reproduce outside of the actual working environment, and since the embedded box doesn't have niceties like valgrind or gdb installed, what I'd like to do is simply write out the process's heap-memory to a file, which I could then transfer to my development machine and look through at my leisure, to see if I can tell from the contents of the file what kind of data it is that is taking up the bulk of the heap. If I'm lucky there might be a smoking gun like a repeating string or magic-number that comes up a lot, that points me to the place in my code that is either leaking or perhaps just growing a data structure without bounds.
Is there a good way to do this? The only way I can think of would be to force the process to crash and then collect a core dump, but since the fault condition is rare it would be preferable if I could collect the information without crashing the process as a side effect.
You can read the entire memory space of the process via /proc/pid/mem, and you can read /proc/pid/maps to see what is where in the memory space (so you can find the bounds of the heap and read just that). You can attempt to read the data while the process is running (in which case it might be changing while you are reading it), or you can stop the process with a SIGSTOP signal and later resume it with SIGCONT.
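If it helps, here is a rough sketch of that approach as a small Python script (assuming Python is available on the box; the dump_heap helper is just for illustration). It assumes the allocations of interest live in the single [heap] mapping rather than in anonymous mmap regions, and that you have permission to read /proc/<pid>/mem (root, or the same user with ptrace allowed):

import os
import signal
import sys

def dump_heap(pid, out_path):
    # Stop the process so the heap is not changing while we read it.
    os.kill(pid, signal.SIGSTOP)
    try:
        # Find the address range of the [heap] mapping.
        with open('/proc/%d/maps' % pid) as maps:
            heap_line = next(line for line in maps if '[heap]' in line)
        start, end = (int(x, 16) for x in heap_line.split()[0].split('-'))
        # Copy that range out of /proc/<pid>/mem in 1 MiB chunks.
        with open('/proc/%d/mem' % pid, 'rb') as mem, open(out_path, 'wb') as out:
            mem.seek(start)
            remaining = end - start
            while remaining > 0:
                chunk = mem.read(min(remaining, 1 << 20))
                if not chunk:
                    break
                out.write(chunk)
                remaining -= len(chunk)
    finally:
        os.kill(pid, signal.SIGCONT)  # let the process continue

if __name__ == '__main__':
    dump_heap(int(sys.argv[1]), sys.argv[2])

You can then copy the dump file to your development machine and run strings or a hex viewer over it to look for that smoking gun.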

Node JS, Highcharts Memory usage keeps climbing

I am looking after an app built with Node JS that's producing some interesting issues. It was originally running on Node JS v0.3.0 and I've since upgraded to v0.10.12. We're using Node JS to render charts on the server and we've noticed the memory usage keeps climbing chart after chart.
Q1: I've been monitoring the RES column in top for the Node JS process. Is this correct, or should I be monitoring something else?
I've been setting variables to null to try to release memory back to the system (I read this somewhere as a solution) and it makes only a slight difference.
I've pushed the app all the way to 1.5 GB, at which point it ceases to function, yet the process doesn't appear to die. No error messages, which I found odd.
Q2: Is there anything else I can do?
Thanks
Steve
That is a massive jump in versions. You may want to share what code changes you made to get it working on the latest stable. The API is not the same as back in v0.3, so that may be part of the problem.
If not, then the issue you see is more likely heap fragmentation than an actual leak. In later V8 versions garbage collection is more liberal with cleanup to improve performance (see http://code.google.com/p/chromium/issues/detail?id=112386 for some discussion of this).
You may try running the application with --max_old_space_size=32, which will limit the amount of memory V8 can use to around 32 MB. Note the docs say "max size of the old generation", so it won't be exactly 32 MB, just around it, for lack of a better technical explanation.
Also, you can track the amount of external memory usage with --trace_external_memory. This will let you know whether external memory (i.e. Buffers) is being retained in your application.
Your note about the application hanging around 1.5 GB tells me you're probably on a 64-bit system. You only mentioned that it ceases to function, but didn't note whether the CPU is spinning during that time. Also, since I don't have example code, I'm not sure what might be causing this to happen.
I'd try running on the latest development release (v0.11.3 at the time of this writing) and see if the issue is fixed. A lot of performance/memory enhancements are being worked on that may help your issue.
I guess you have a memory leak somewhere (in the form of a closure?) that keeps the (no longer used?) charts somewhere in memory.
V8 sometimes needs a bit of tweaking when it comes to more than 1 GB of memory. Try out --noincremental_marking and/or --max_old_space_size=8192 (if you have 8 GB available).
Check for more options with node --v8-options and go through the --trace* parameters to find out what slows down or stops Node.

Why does Node.js serve a file with 80x more CPU usage than Nginx?

Take the same code that sits on the nodejs.org home page, serve a static file that is 1.8 MB, then do the same with Nginx and watch the difference.
Code : http://pastie.org/3730760
Screencast : http://screencast.com/t/Or44Xie11Fnp
Please share if you know anything that'd prevent this from happening, so we don't need to deploy nginx servers and complicate our lives.
PS1: This test was done with Node 0.6.12. Out of curiosity, I downgraded to 0.4.12 just to check whether it's a regression; on the contrary, it was worse: the same file used 25%, twice.
PS2: This post is not Node.js hate. We use Node.js and we love it, except for this glitch, which actually delayed our launch (made us really sad) and seemed quite serious to me, and which I've never read about, heard of, seen, or expected to come across.
The problem with your Node benchmark is that you store the static file in a variable inside the V8 heap. Due to the way V8 handles memory, it can't directly send data contained in JavaScript variables to the network, because the addresses of allocated objects may change at runtime. Therefore V8 has to make a copy of your 1.8 MB string on every request, and that certainly kills performance.
What you can do is use a Buffer:
replace: longAssString = fs.readFileSync(pathToABigFile, 'utf8');
with: longAssString = fs.readFileSync(pathToABigFile);
That way your static file is held in a Buffer. Buffers are stored outside of V8's heap and require no copy when sent to the network, so this should be much faster.

Finding out memory footprint size

I would like to be able to restart a service when it is using too much memory (this is related to a bug in a third-party library).
I have used this to limit the amount of memory that can be requested:
resource.setrlimit(resource.RLIMIT_AS, (128*1024*1024, 128*1024*1024))
But the third-party library gets stuck in a busy loop of failing to allocate memory and re-requesting it. So I want to be able to poll the current memory size of the process from a thread.
The language I'm using is Python, but a solution in any programming language can be translated into Python code, provided it's viable and sensible on Linux.
Monit is a service you can run to monitor external processes. All you need to do is dump your pid to a file for monit to read. People often use it to monitor their web server. One of the tests monit can do is for total memory usage: you can set a limit, and if your process uses too much memory it will be restarted. Here's an example monit config:
check process yourProgram
    with pidfile "/var/run/YOUR.pid"
    start program = "/path/to/PROG.py"
    stop program = "/script/to/kill/prog/kill_script.sh"
    restart if totalmem is greater than 60.0 MB
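For the pid file itself, a minimal sketch of what the monitored Python program might do at startup (the path is just the one from the example config above and must match it; the process also needs permission to write there):

import os

# Write our pid where monit expects to find it (must match the monit config).
with open('/var/run/YOUR.pid', 'w') as f:
    f.write(str(os.getpid()))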
This is the code that I came up with. It seems to work properly and avoids too much string parsing. The variable names I unpack come from the proc(5) man page, and this is probably a better way of extracting the OS information than string-parsing /proc/self/status.
import os
import signal
import time

def get_vsize():
    # Field names come from the proc(5) man page; vsize is reported in bytes.
    # Note: a plain split() assumes the process name (comm) contains no spaces.
    parts = open('/proc/self/stat').read().split()
    (pid, comm, state, ppid, pgrp, session, tty, tpgid, flags, minflt, cminflt,
     majflt, cmajflt, utime, stime, cutime, cstime, counter, priority, timeout,
     itrealvalue, starttime, vsize, rss, rlim, startcode, endcode, startstack,
     kstkesp, kstkeip, signal, blocked, sigignore, sigcatch, wchan,
     ) = parts[:35]
    return int(vsize)

def memory_watcher():
    while True:
        time.sleep(120)
        if get_vsize() > 120*1024*1024:
            # Terminate the whole process group so the service can be restarted.
            os.kill(0, signal.SIGTERM)
You can read the current memory usage using the /proc filesystem. The path is /proc/[pid]/status. In that virtual file you can see the current VmRSS (resident memory).
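For example, a minimal sketch of reading that field from Python (the helper name is just for illustration; values in /proc/[pid]/status are reported in kB, and you can read VmSize the same way if you want the virtual size):

def read_status_field(field, pid='self'):
    # Return the value of a field such as 'VmRSS' from /proc/<pid>/status, in bytes.
    with open('/proc/%s/status' % pid) as f:
        for line in f:
            if line.startswith(field + ':'):
                return int(line.split()[1]) * 1024  # values are listed in kB
    raise KeyError(field)

print(read_status_field('VmRSS'))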
