I'm new to node.js (and JavaScript in general), so I thought I would learn by creating a simple weather app utilizing YQL. Overall, the app is working, but the request is extremely slow: it takes about 6 seconds to return the JSON. On the other hand, I created the same app using jQuery's getJSON and I get results almost immediately.
Is this the best way to fetch and parse JSON from a URL in node.js?
var request = require('request');
var url = 'http://query.yahooapis.com/v1/public/yql?q=SELECT%20*%20FROM%20weather.forecast%20WHERE%20location%3D96720&format=json&diagnostics=true&callback=';

request(url, function (error, response, body) {
    console.log(body);
});
I'd appreciate any feedback and/or suggestions.
Thanks in advance.
I just tried to open that URL in my browser and after more than 10 seconds it was still loading. (Update: it timed out.)
My guess is that Yahoo is setting caching headers that your browser is respecting, so your first request was probably cached and then every jQuery request after that loaded from your browser cache nearly instantly. request() does not have a local cache, so it downloads a fresh copy each time, which takes much longer.
Try clearing your browser cache and see if it runs as quickly - if the first request is slow then that would confirm my suspicion.
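To time it on the node side as well, here's a minimal sketch using node's built-in console.time (same URL as in your question):
var request = require('request');
var url = 'http://query.yahooapis.com/v1/public/yql?q=SELECT%20*%20FROM%20weather.forecast%20WHERE%20location%3D96720&format=json&diagnostics=true&callback=';

console.time('yql');
request(url, function (error, response, body) {
    console.timeEnd('yql'); // prints how long the round trip took
    if (error) return console.error(error);
    console.log(response.statusCode, body.length + ' bytes');
});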
Update: here's some more info about request speeds:
My home internet connection: mostly timeouts. (Time Warner Cable / Road Runner; Western Ohio)
My prgmr.com server in California: fast - the first request took 1.151 seconds in node.js; after that, every request was < 0.1 seconds. (Node v0.8.21)
node.js web proxy running on heroku: fast - 0.28 seconds including the time to proxy it back to my laptop - http://www.nodeunblocker.com/proxy/http://query.yahooapis.com/v1/public/yql?q=SELECT%20*%20FROM%20weather.forecast%20WHERE%20location%3D96720&format=json
So, I'm starting to think that maybe it's location-dependent, and I can say at the very least that node isn't slow. If nothing else, feel free to proxy requests through my heroku server. Note that it's a free account, so when heroku "idles" it, the next request is slow or sometimes errors out.
I'm writing an API that returns an array of redirects for any given page:
var request = require('request');

router.post('/trace', function (req, res) {
    if (!req.body.link)
        return res.status(405).send(""); // error: no link provided!

    console.log("\tapi/trace()", req.body.link);
    var redirects = [];

    function exit(goodbye) {
        if (goodbye)
            console.log(goodbye);
        res.status(200).send(JSON.stringify(redirects)); // end
    }

    function getRedirect(link) {
        request({ url: link, followRedirect: false }, function (err, response, body) {
            if (err)
                exit(err);
            else if (response.headers.location) {
                redirects.push(response.headers.location);
                getRedirect(response.headers.location);
            }
            else
                exit(); // all done!
        });
    }

    getRedirect(req.body.link);
});
and here is the corresponding browser request:
$.post('/api/trace', { link: l }, cb);
A page will make about 1,000 POST requests very quickly and then wait a very long time to get each response back.
The problem is that the response to the nth request is very slow. Each individual request takes about half a second, but as best I can tell the Express server is processing each link sequentially. I want the server to make all the requests and respond as it receives each response.
Am I correct in assuming the Express POST router is processing requests sequentially? How do I get it to fire off all the requests and pass the responses back as it gets them?
My question is: why is it so slow? Is POST an async process on an "out of the box" Express server?
You may be surprised to find out that this is probably a browser issue first, not a node.js issue.
A browser has a maximum number of simultaneous requests it will allow your JavaScript Ajax code to make to the same host; it varies slightly from one browser to the next, but is around 6. So, if you're making 1000 requests, only around 6 are being sent at a time. The rest go into a queue in the browser, waiting for prior requests to finish. So your node server likely isn't getting 1000 simultaneous requests. You should be able to confirm this by logging incoming requests in your node.js app, as in the sketch below. You will probably see a long delay before it receives the 1000th request (because it's queued by the browser).
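A minimal sketch of that logging (assuming an Express app object like the one in your router code):
// Log every incoming request with a counter and timestamp,
// so you can see how quickly the browser is actually delivering them.
var count = 0;
app.use(function (req, res, next) {
    console.log(++count, new Date().toISOString(), req.method, req.url);
    next();
});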
Here's a run-down of how many simultaneous requests to a given host each browser supported (as of a couple of years ago): Max parallel http connections in a browser?
My first recommendation would be to package up an array of requests on the client (perhaps 50 at a time) and send that to the server in one request, as sketched below. That will give your node.js server plenty to chew on and won't run afoul of the browser's connection limit to the same host.
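A minimal sketch of that idea (the /trace-batch endpoint and the traceRedirects helper - a callback-taking version of your getRedirect logic - are hypothetical names):
// Client side: one POST carrying many links instead of many POSTs.
$.post('/api/trace-batch', { links: JSON.stringify(links) }, cb);

// Server side: fan out over the batch, reply once every link is resolved.
router.post('/trace-batch', function (req, res) {
    var links = JSON.parse(req.body.links);
    var results = {};
    var pending = links.length;
    links.forEach(function (link) {
        traceRedirects(link, function (redirects) { // hypothetical helper
            results[link] = redirects;
            if (--pending === 0) res.json(results); // respond when the whole batch is done
        });
    });
});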
As for the node.js server, it depends a lot on what you're doing. If most of what the node.js server is doing is just networking and not a lot of processing that requires CPU cycles, then node.js is very efficient at handling lots and lots of simultaneous requests. If you start engaging a bunch of CPU (processing or preparing results), then you may benefit from either adding worker processes or using node.js clustering. In your case, you may want to use worker processes. You can examine your CPU load while your node.js server is processing a bunch of work and see whether the one CPU that node.js is using is anywhere near 100%. If it isn't, then you don't need more node.js processes. If it is, then you do need to spread the work over more node.js processes to go faster.
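If you do find a single CPU pegged, here's a minimal sketch of node's built-in cluster module (./app is a hypothetical module that creates the Express app and calls listen()):
var cluster = require('cluster');
var os = require('os');

if (cluster.isMaster) {
    // Fork one worker per CPU core; the workers all share the same listening port.
    os.cpus().forEach(function () { cluster.fork(); });
} else {
    require('./app'); // each worker runs the Express server
}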
In your specific case, it looks like you're really only doing networking to collect 302 redirect responses. Your single node.js process should be able to handle a lot of those requests very efficiently so probably the issue is just that your client is being throttled by the browser.
If you want to send a lot of requests to the server (so it can get to work on as many as feasible), but want to get results back immediately as they become available, that's a little more work.
One scheme that could work is to open a webSocket or socket.io connection. You can then send a giant array of URLs that you want the server to check for you in one message over the socket.io connection. Then, as the server gets a result, it can send back each individual result (tagged with the URL that it corresponds to). That way, you can somewhat get the best of both worlds with the server crunching on a long list of URLs, but able to send back individual responses as soon as it gets them.
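A minimal sketch of that scheme (event names and the traceRedirects helper are hypothetical):
// Server side, assuming `io` from socket.io:
io.on('connection', function (socket) {
    socket.on('trace-all', function (links) {
        links.forEach(function (link) {
            traceRedirects(link, function (redirects) {
                // Tag each result with its URL so the client can match them up.
                socket.emit('trace-result', { link: link, redirects: redirects });
            });
        });
    });
});

// Client side: send the whole list once, then collect results as they arrive.
socket.emit('trace-all', links);
socket.on('trace-result', function (result) {
    console.log(result.link, result.redirects);
});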
Note, you will probably find that there is an upper limit to how many outbound http requests you want to run at the same time from your node.js server too. While modern versions of node.js don't throttle you the way the browser does, you probably also don't want your node.js server attempting to run 10,000 simultaneous requests, because you may exhaust some sort of network resource pool. So, once you get past the client bottleneck, you will want to test your server at different levels of simultaneous open requests to see where it performs best. This is both to optimize its performance and to protect your server against overextending its use of networking or memory resources and running into error conditions.
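Here's a minimal sketch of a simple concurrency limiter you could use for that, with no external library:
// Run tasks with at most `limit` in flight at once. Each task is a
// function that calls `done` when its request finishes.
function runLimited(tasks, limit) {
    var active = 0, index = 0;
    function next() {
        while (active < limit && index < tasks.length) {
            active++;
            tasks[index++](function done() {
                active--;
                next(); // start another task as each one completes
            });
        }
    }
    next();
}
Each task would wrap one request() call and invoke done from its callback; you can then benchmark different limit values to find the sweet spot.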
Best way to respond RESTfully to a time-consuming calculation (Node.js with Express)
I'm building a REST API using Node.js with Express.
Now I have to design the response for a time-consuming calculation on the server side, triggered by an HTTP request and ending up in a ZIP file of a few megabytes.
At the moment the calculation is done in a few seconds, but if the data grows it could sometimes take minutes. So I don't think the best approach is to wait in Express until the data is ready and then res.sendfile() it back to the client. But what is the best solution?
Should I implement a delayed response with long polling, like http-delayed-response?
What I don't like here is that I immediately have to report HTTP 202. If the generation of the ZIP file fails, I have to deal with that in the response body and cannot report it via the HTTP status code. Also, I cannot respond with the file directly - as far as I know.
Or something with job IDs and a state for each job, like express-delayed-response?
This way my API is not stateless anymore, and I have to build a status handler.
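For reference, here is roughly the job-ID variant I have in mind (a minimal sketch; the in-memory jobs store and the buildZip task are hypothetical):
var jobs = {};
var nextId = 1;

app.post('/api/zips', function (req, res) {
    var id = nextId++;
    jobs[id] = { status: 'pending' };
    buildZip(req.body, function (err, path) { // hypothetical long-running task
        jobs[id] = err ? { status: 'failed', error: err.message }
                       : { status: 'done', path: path };
    });
    res.status(202).json({ id: id }); // accepted; work continues in the background
});

app.get('/api/zips/:id', function (req, res) {
    var job = jobs[req.params.id];
    if (!job) return res.status(404).end();
    if (job.status === 'done') return res.sendfile(job.path);
    res.json(job); // still pending, or failed with an error message
});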
Are there better and more established solutions to this problem?
I am running an Express application which uses ImageMagick to take a screenshot of a webpage, upload the result to S3, and insert a row into our database.
The service works well; however, after a few requests the server eventually just hangs on a request. There is no response, but the server is technically not 'down'.
Is there any way for me to see logs, or to find out what could be causing this, such as a memory leak?
You should profile your application to check where the bottleneck occurs in your code and when it happens. A good starting point is node's integrated V8 stack profiler.
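As a quick first check for a leak, here's a minimal sketch using node's built-in process.memoryUsage():
// Log heap usage every 30 seconds. A heapUsed value that grows steadily
// across requests and never comes back down is a hint that something is leaking.
setInterval(function () {
    var mem = process.memoryUsage();
    console.log('rss:', mem.rss, 'heapUsed:', mem.heapUsed);
}, 30000);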
I have built a node.js app that establishes keep-alive, event-stream POST connections with an external server. I send 400 of them and keep acting on received data using the data event from the request node package. I also listen to the end, response, and error events.
When I run the application from localhost, everything works perfectly according to plan. However, when I push it to OpenShift, only the first 5 requests work as intended; the rest just... disappear. I don't get any error, response, or end event. I tried sending the requests with some delay between them, I tried looking for information about a maximum number of requests, and I debugged it thoroughly - nothing worked. Does anybody have an idea, based on this description of the problem, how to make all 400 requests work (or an answer as to why they won't)?
I found the solution to the problem myself. It turned out that the request library couldn't establish more than 5 connections due to the default agent.maxSockets property of the http module, which was set to that number in Node.js 0.10. Although I was not requiring the http package directly, the request library used it. All I had to do to fix it was put the following code in the main application module:
var http = require('http');
http.globalAgent.maxSockets = 500; //Or any other number you want
Btw, the default value was already changed to Infinity in Node.js 0.12.
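For completeness, the request library also accepts a per-request pool option, so (as I read its docs) you can raise the limit without touching the global agent:
var request = require('request');
var url = 'http://example.com/'; // whatever you're requesting

request({
    url: url,
    pool: { maxSockets: 500 } // per-request agent pool instead of the global default
}, function (err, response, body) {
    if (err) return console.error(err);
    console.log(response.statusCode);
});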
In our ASP.NET application we have an upload feature. When Fiddler is running on the client (with "Act as system proxy" enabled), the upload is quick (10 MB in 20 seconds). However, when Fiddler is not running on the client, it takes about 5 minutes. Anyone have any suggestions?
Converting my comment to an answer:
Fiddler isn't replacing that setting; rather, as a local proxy, it buffers the complete POST request locally (so the small send buffer doesn't matter) and then blasts it to the server as quickly as the server will take it. The send buffer size was increased in later browser versions.
For IE6, see the steps in http://support.microsoft.com/kb/329781 to adjust the buffer size.