I have serious issue with custom foxx application.
About the app
The application is customized algorithm for finding path in graph. It's optimized for public transport. On init it loads all necessary data into javascript variable and then it traverse through them. Its faster then accessing the db each time.
The issue
When I access through api the application for first time then it is fast eg. 300ms. But when I do absolutely same request second time it is very slow. eg. 7000ms.
Can you please help me with this? I have no idea where to look for bugs.
Without knowing more about the app & the code, I can only speculate about reasons.
Potential reason #1: development mode.
If you are running ArangoDB in development mode, then the init procedure is run for each Foxx route request, making precalculation of values useless.
You can spot whether or not you're running in development mode by inspecting the arangod logs. If you are in development mode, there will be a log message about that.
Potential reason #2: JavaScript variables are per thread
You can run ArangoDB and thus Foxx with multiple threads, each having thread-local JavaScript variables. If you issue a request to a Foxx route, then the server will pick a random thread to answer the request.
If the JavaScript variable is still empty in this thread, it may need to be populated first (this will be your init call).
For the next request, again a random thread will be picked for execution. If the JavaScript variable is already populated in this thread, then the response will be fast. If the variable needs to be populated, then response will be slow.
After a few requests (at least as many as configured in --server.threads startup option), the JavaScript variables in each thread should have been initialized and the response times should be the same.
Related
I have a Node.js (Express.js) server for my React.js website as BFF. I use Node.js for SSR, proxying some request and cache some pages in Redis. In last time I found that my server time to time went down. I suggest an uptime is about 2 days. After restart, all ok, then response time growth from hour to hour. I have resource monitoring at this server, and I see that server don't have problems with RAM or CPU. It used about 30% of RAM and 20% of CPU.
I regret to say it's a big production site and I can't make minimal reproducible example, cause i don't know where is reason of these error :(
Except are memory and CPU leaks, what will be reasons for Node.js server might go went down?
I need at least direction to search.
UPDATE1:
"went down" - its when kubernetes kills container due 3 failed life checks (GET request to a root / of website)
My site don't use any BD connection but call lots of 3rd party API's. About 6 API requests due one GET/ request from browser
UPDATE2:
Thx. To your answers, guys.
To understand what happend inside my GET/ request, i'm add open-telemetry into my server. In longtime and timeout GET/ requests i saw long API requests with very big tcp.connect and tls.connect.
I think it happens due lack of connections or something about that. I think Mostafa Nazari is right.
I create patch and apply them within the next couple of days, and then will say if problem gone
I solve problem.
It really was lack of connections. I add reusing node-fetch connection due keepAlive and a lot of cache for saving connections. And its works.
Thanks for all your answers. They all right, but most helpful thing was added open-telemetry to my server to understand what exactly happens inside request.
For other people with these problems, I'm strongly recommended as first step, add telemetry to your project.
https://opentelemetry.io/
PS: i can't mark two replies as answer. Joe have most detailed and Mostafa Nazari most relevant to my problem. They both may be "best answers".
Tnx for help, guys.
Gradual growth of response time suggest some kind of leak.
If CPU and memory consumption is excluded, another potentially limiting resources include:
File descriptors - when your server forgets to close files. Monitor for number of files in /proc//fd/* to confirm this. See what those files are, find which code misbehaves.
Directory listing - even temporary directory holding a lot of files will take some time to scan, and if your application is not removing some temporary files and lists them - you will be in trouble quickly.
Zombie processes - just monitor total number of processes on the server.
Firewall rules (some docker network magic may in theory cause this on host system) - monitor length of output of "iptables -L" or "iptables-save" or equivalent on modern kernels. Rare condition.
Memory fragmentation - this may happen in languages with garbage collection, but often leaves traces with something like "Can not allocate memory" in logs. Rare condition, hard to fix. Export some health metrics and make your k8s restart your pod preemptively.
Application bugs/implementation problems. This really depends on internal logic - what is going on inside the app. There may be some data structure that gets filled in with data as time goes by in some tricky way, becoming O(N) instead of O(1). Really hard to trace down, unless you have managed to reproduce the condition in lab/test environment.
API calls from frontend shift to shorter, but more CPU-hungry ones. Monitor distribution of API call types over time.
Here are some of the many possibilities of why your server may go down:
Memory leaks The server may eventually fail if a Node.js application is leaking memory, as you stated in your post above. This may occur if the application keeps adding new objects to the memory without appropriately cleaning up.
Unhandled exceptions The server may crash if an exception is thrown in the application code and is not caught. To avoid this from happening, ensure that all exceptions are handled properly.
Third-party libraries If the application uses any third-party libraries, the server may experience problems as a result. Before using them, consider examining their resource usage, versions, or updates.
Network Connection The server's network connection may have issues if the server is sending a lot of queries to third-party APIs or if the connection is unstable. Verify that the server is handling connections, timeouts, and retries appropriately.
Connection to the Database Even though your server doesn't use any BD connections, it's a good idea to look for any stale connections to databases that could be problematic.
High Volumes of Traffic The server may experience performance issues if it is receiving a lot of traffic. Make sure the server is set up appropriately to handle a lot of traffic, making use of load balancing, caching, and other speed enhancement methods. Cloudflare is always a good option ;)
Concurrent Requests Performance problems may arise if the server is managing a lot of concurrent requests. Check to see if the server is set up correctly to handle several requests at once, using tools like a connection pool, a thread pool, or other concurrency management strategies.
(Credit goes to my System Analysis and Design course slides)
With any incoming/outgoing web requests, 2 File Descriptors will be acquired. as there is a limit on number of FDs, OS does not let new Socket to be opened, this situation cause "Timeout Error" on clients. you can easily check number of open FDs by sudo ls -la /proc/_PID_/fd/ | tail -n +4 | wc -l where _PID_ is nodejs PID, if this value is rising, you have connection leak issue.
I guess you need to do the following to prevent Connection Leak:
make sure you are closing outgoing API call Http Connection (it depends on how you are opening them, some libraries manage this and you just need to config them)
cache your outgoing API call (if it is possible) to reduce API call
for your outgoing API call, use Connection pool, this would manage number of open HttpConnection, reuse already-opened connection and ...
review your code, so that you can serve a request faster than now (for example make your API call more parallel instead of await or nested call). anything you do to make your response faster, is good for preventing this situation
I solve problem. It really was lack of connections. I add reusing node-fetch connection due keepAlive and a lot of cache for saving connections. And its works.
Thanks for all your answers. They all right, but most helpful thing was added open-telemetry to my server to understand what exactly happens inside request.
For other people with these problems, I'm strongly recommended as first step, add telemetry to your project.
https://opentelemetry.io/
I took over a project where the developers were not fully aware of how Node.js works, so they created code accessing MongoDB with Mongoose which would leave inconsistent data in the database whenever you had any concurrent request reaching the same endpoint / modifying the same data. The project uses the Express web framework.
I already instructed them to implement a fix for this (basically, to use Mongoose transaction support with automatically managed retriable transactions), but due to the size of the project they will take a lot of time to fix it.
I need to put this in production ASAP, so I thought I could try to do it if I'm able to guarantee sequential processing of the incoming requests. I'm completely aware that this is a bad thing to do, but it would be just a temporary solution (with a low count of concurrent users) until a proper fix is in place.
So is there any way to make Node.js to process incoming requests in a sequential manner? I just basically don't want code from different requests to run interleaved, or putting it another way, I don't want non-blocking operations (.then()/await) to yield to another task and instead block until the asynchronous operation ends, so every request is processed entirely before attending another request.
I have an NPM package that can do this: https://www.npmjs.com/package/async-await-queue
Create a queue limited to 1 concurrent user and enclose the code that calls Mongo in wait()/end()
Or you can also use an async mutex, there are a few NPM packages as well.
I'd like to store some info. in a node.js array variable (to be a local cache) that my middleware would check before making a database query.
I know that I can do this w/redis and it's generally the preferred method b/c redis offers snapshots for persistence and is quite performant, but I can't imagine anything being more performant than a variable stored in-memory.
Every time someone brings up this topic, however, folks say "memory leaks" make this a bad idea. But why? Why is node.js bad at managing server-side vars?
Is there a preferred method (outside of an external k/v db store) of managing a server-side array/cache through node.js?
The problem with using a node variable as storage is that by using it you have made your application unable to scale. Consider a large application which serves thousands of requests per second, and cannot be run on a single machine. If you spin up a second node process, it has different values for your node storage variable.
Let's say a user making an API call to your application hits machine 1, and stores a session variable. They make a second API call and this time are routed by your load balancer to machine 2. Their session variable is not found and you throw an error.
If you are writing a small application and have no expectations of scaling up in the near term, by all means use a node variable - I've done this myself before for auth tokens on small websites. You can always switch to redis later if you need to. Of course, you need to be aware that if your node process restarts, the contents of your variable will be lost.
I know that blocking code is discouraged in node.js because it is single-threaded. My question is asking whether or not blocking code is acceptable in certain circumstances.
For example, if I was running an Express webserver that requires a MongoDB connection, would it be acceptable to block the event loop until the database connection was established? This is assuming that all pages served by Express require a database query (which would fail if MongoDB was not initialized).
Another example would be an application that requires the contents of a configuration file before being initializing. Is there any benefit in using fs.readFile over fs.readFileSync in this case?
Is there a way to work around this? Is wrapping all the code in a callback or promise the best way to go? How would that be different from using blocking code in the above examples?
It is really up to you to decide what is acceptable. And you would do that by determining what the consequences of blocking would be ... on a case-by-case basis. That analysis would take into account:
how often it occurs,
how long the event loop is likely to be blocked, and
the impact that blocking in that context will have on usability1.
Obviously, there are ways to avoid blocking, but these tend to add complexity to your application. Really, you need to decide ... on a case-by-case basis ... whether that added complexity is warranted.
Bottom line: >>you<< need to decide what is acceptable based on your understanding of your application and your users.
1 - For example, in a game it would be more acceptable to block the UI while switching "levels" than during active play. Or for a general web service, "once off" blocking while a config file is loaded or a DB connection is established during webserver startup is more acceptable that if this happened on every request.
From my experience most tasks should be handled in a callback or by returning a promise. You DO NOT want to block code in a Node application. That's what makes it so nice! Mostly with MongoDB it will crash before it has a chance to connect if there is no connection. It won't' really have an effect on an API call because your server will be dead!
Source: I'm a developer at a bootcamp that teaches MEAN stack.
Your two examples are completely different. The distinction actually answers the question in and of itself.
Grabbing data from a database is dependent on being connected to that database. Any code that is dependent upon that data is then dependent upon that connection. These things have to happen serially for the app to function and be meaningful.
On the other hand, readFileSync will block ALL code, not just code that is reliant on it. You could start reading a csv file while simultaneously establishing a database connection. Once both are done, you could add that csv data to the database.
I am developing an application that allows users to run AI algorithms on the server remotely. Some of these algorithms take a VERY long time. It is set up such that AJAX calls supply the algorithm parameters and launch a C++ algorithm on the server. The results and status of the computation are tracked via AJAX calls polling status files. This solution seems to work well for multiple users concurrently using the service, but I am now looking for a way to cancel the computation from the user's browser. I have a stop button that stops the AJAX updating service and ceases any communication between the browser and the running process on the server. The problem is that the process still runs, and I would like to free up the server resources when the user cancels the operation. Below are some more details.
The web service where the AJAX calls hit are run under the user 'tomcat' and can be listed by ps -U tomcat. The algorithm executions are all child processes of 'java' and can be listed by ps --ppid ###.
The browser keeps a record of the time that the current computation began (user system time, not server system time).
Multiple computations may be going on at once from users connected from different locations, resulting in many processes under the same name and parent process.
The restful service executes terminal commands via java runtime.exec().
I am not so knowledgeable about shell scripting, so any help would be greatly appreciated. Can anyone think of a way to either use java process object or shell script/awk to locate a process via timestamp (maybe the closest timestamp to user system time..?) or some other way?
Thanks in advance.
--edit
Is there even a way in java to get a handle for a given process if you have the pid...? Doesn't seem like it.
--edit
I cannot change the source code of the long running process on the server. :(
Your AJAX call should be manipulating some sort of a resource (most conveniently a text file) that acts as a semaphore to the process, which in every iteration of polling checks whether that semaphore file has been set to the stop status. If the AJAX changes the semaphore file to stop, then the process stops because your application checks it and responds accordingly. Which in turn means that the functionality needs to be programmed into your Java AI application rather than figuring out what the PID is and then killing it at the OS level. That, of course, assumes you have access to the source code of the app.
Of course, the semaphore does not have to be a file but can be a value in the DB etc., whichever suits your taste and configuration.
I have finally found a secure solution. From the restful java service, using Process p = Runtime.getRuntime().exec() gives you a handle on the running process. The only way, however, to get the pid is through a technique called reflection.
Field f = p.getClass().getDeclaredField();
f.setAccessible(true);
String pid = Integer.toString(f.getInt(p));
How unbelievably awkward...
Anyways, due to the passing of p from the server to the client being impossible, and the insecurity of allowing a remote call to kill an arbitrary server process by a pid passed by parameter, the only logical strategy I could come up with was to write the obtained pid to a process-unique file indicated by the initial client timestamp, and to delete this file upon restful service function return. This unique file can be used as a termination handle via yet another restful service which reads the file, and terminates the process with pid equal to the contents of the file. This
You could keep the Process instance returned by runtime.exec and invoke Process.destroy to kill the subprocess. Not knowing much about your webservice application I would assume you can keep the process instances in a global session map that maps users to process lists. Make sure access to this map is thread-safe. Also it only works if you have one webservice process that allows to share such a global session map across different requests.
Alternatively take a look at Get subprocess id in Java.