I do not understand why resp.json() needs to be awaited. From my understanding, async/await is useful when dealing with I/O. But when I call resp.json() in the example below, hasn't the web request already been processed by session.get() on the line above?
async with session.get('https://api.github.com/events') as resp:
    print(await resp.json())
But when I call resp.json() in the example below, hasn't the web request already been processed by session.get() on the line above?
No. session.get() reads only the HTTP status line and headers; to get the response body you need to read the rest of the response, which is why resp.json() also performs I/O and must be awaited.
This split is pretty useful, since you can check the HTTP headers and avoid reading the rest of the response if, say, the server returned an unexpected status code.
Another example: if you expect the response body to be big, you can read it in chunks to avoid excessive RAM usage (see the note on streaming responses in the aiohttp documentation).
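A rough sketch putting both ideas together (the status check, then a chunked read); the URL, chunk size, and destination file are arbitrary assumptions:

import asyncio
import aiohttp

async def download(url, dest):
    async with aiohttp.ClientSession() as session:
        async with session.get(url) as resp:
            # at this point only the status line and headers have been read
            if resp.status != 200:
                return False  # bad status: skip reading the body entirely
            with open(dest, 'wb') as f:
                # read the body in fixed-size chunks instead of all at once
                async for chunk in resp.content.iter_chunked(64 * 1024):
                    f.write(chunk)
            return True

asyncio.run(download('https://api.github.com/events', 'events.json'))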
badStream.pipe(res)
When badStream throws an error, the response never terminates and the request in the browser is stuck in a pending state.
badStream.on('error', function() { this.end(); }).pipe(res)
I've tried the above to no avail. What's the proper way to handle the error in this case? Thanks for any help.
In node.js, an error on a readstream that is piped to the http response stream just unpipes it from the response stream, but does not otherwise do anything to the response stream it was piped to. That leaves the response hanging as an open socket, with the browser still waiting for it to finish (as you observed). So you have to handle the error manually and do something to the target stream.
badStream.pipe(res);
badStream.on('error', err => {
    // log the error and prematurely end the response stream
    console.log(err);
    res.end();
});
Because this is an http response and you are already in the middle of sending the response body, the http status and headers have already been sent, so there aren't a lot of things you can do mid-response.
Ultimately, you're going to have to call res.end() to terminate the response so the browser knows the request is done. If there's a content-length header on this response (the length was known ahead of time), then just terminating the response stream before it's done will cause the browser to see that it didn't get the whole response and thus know that something went wrong.
If there's no content-length on the response, then it really depends upon what type of data you're sending. If you're just sending text, then the browser probably won't know there's an error because the text response will just end. If it's human readable text, you could send "ERROR, ERROR, ERROR - response ended prematurely" or some visible text marker so perhaps a human might recognize that the response is incomplete.
If it's some particular data format such as JSON or XML or any multi-part response, then hanging up the socket prematurely will probably lead to a parsing error that the client will notice. Unfortunately, http just doesn't make provisions for mid-response errors, so it's left to individual applications to detect and handle them.
FYI, here's a pretty interesting article that covers a lot about error handling with streams. And, note that using stream.pipeline() instead of .pipe() also does a lot more complete error handling, including giving you one single callback that will get called for an error in either stream and it will automatically call .destroy() on all streams. In many ways, stream.pipeline(src, dest) is meant to replace src.pipe(dest).
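For illustration, a minimal self-contained sketch of the pipeline() approach (the file path is hypothetical, standing in for whatever readable produced badStream):

const http = require('http');
const fs = require('fs');
const { pipeline } = require('stream');

http.createServer((req, res) => {
    const badStream = fs.createReadStream('/some/file'); // may error mid-read
    pipeline(badStream, res, (err) => {
        if (err) {
            // pipeline has already destroyed both streams;
            // headers may already be sent, so just log it
            console.error('response failed:', err);
        }
    });
}).listen(3000);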
So I've got working node.js code that processes data from a website's API. I'd like to speed it up a bit, and I figured the best way to do that would be to send a request and, while waiting for the response, execute some other code rather than just waiting as it does now. Right now my code is essentially this:
var requestSync = require('sync-request'); // presumably the sync-request package

function httpGet(url) {
    var response = requestSync('GET', url);
    return response.body;
}
var returnCode;
var getUrl = "url";
returnCode = httpGet(getUrl);
var object = JSON.parse(returnCode);
//Some code executes
As you can see, with this approach some time is lost waiting for the response. I'm looking for something along these lines (pseudocode):
Send a request
Some code that's not related to the request is executed right after the request is sent
After the part above is done the request result is parsed
In conclusion, I'm looking for a way to send a request and not waste time waiting for the response. If you have any other ideas on how to speed up the code, please let me know :)
You are looking for asynchronous code. When you use a function like requestSync, it "blocks" until it's done; it's synchronous. When you use something asynchronous, you will usually do so with a callback (a function to call when the desired action is completed) or a promise (an abstraction over callbacks). There are lots of questions about using those on SO. This post: How do I return the response from an asynchronous call? has a bunch of info related to your question.
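A minimal sketch of the asynchronous shape, using node's built-in https module (the URL is a placeholder):

const https = require('https');

function httpGet(url, callback) {
    https.get(url, (res) => {
        res.setEncoding('utf8');
        let body = '';
        res.on('data', (chunk) => body += chunk);
        res.on('end', () => {
            try {
                callback(null, JSON.parse(body));
            } catch (err) {
                callback(err);
            }
        });
    }).on('error', (err) => callback(err));
}

httpGet('https://api.github.com/events', (err, object) => {
    if (err) return console.error(err);
    // the response is parsed and handled here, once it arrives
    console.log(object);
});

// code that's not related to the request runs here,
// immediately after the request is sent

The same shape can be wrapped in a Promise so the result can be awaited alongside your other work.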
When writing asynchronous crawlers with asyncio and aiohttp in Python, I have always had a question: why must you use async with? It's easy to run into errors if you don't.
aiohttp also has a request method that supports a simpler API, and I want to know what the difference is. I still like the requests module very much; I don't know whether aiohttp can be used as simply as requests.
why you must use async with
It's not like you must use async with, it's just a fail-safe device for ensuring that the resources get cleaned up. Taking a classic example from the documentation:
async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()
You can re-write it as:
async def fetch(session, url):
    response = await session.get(url)
    return await response.text()
This version appears to work the same, but it doesn't close the response object, so some OS resources (e.g. the underlying connection) may continue to be held indefinitely. A more correct version would look like this:
async def fetch(session, url):
    response = await session.get(url)
    content = await response.text()
    response.close()
    return content
This version would still fail to close the response if an exception gets raised while reading the text. It could be fixed by using finally - which is exactly what with and async with do under the hood. With an async with block the code is more robust because the language makes sure that the cleanup code is invoked whenever execution leaves the block.
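Spelled out with try/finally, the robust version looks roughly like this:

async def fetch(session, url):
    response = await session.get(url)
    try:
        return await response.text()
    finally:
        # runs whether text() succeeds or raises, which is
        # effectively what async with arranges for you
        response.close()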
I know that using GET for this is not advisable; however, I am not in control of how this server works and have very little experience with requests.
I'm looking to send a dictionary via a GET request and was told that the server has been set up to accept this, but I'm not sure how that works. I have tried using
import requests
r = request.get('www.url.com', data = 'foo:bar')
but this leaves the webpage unaltered, any ideas?
To send a request body with a GET request, one workaround is to send a POST and override the method with the X-HTTP-Method-Override header (the server must be set up to honor it), e.g.
request_header = {
    'X-HTTP-Method-Override': 'GET'
}
response = requests.post(request_uri, request_body, headers=request_header)
Use requests like this, passing the data in the data field of the request:
requests.get(url, headers=head, data=json.dumps({"user_id": 436186}))
It seems that you are using the wrong parameters for the GET request; see the documentation for requests.get().
You should use params instead of data as the parameter.
You are also missing the http:// scheme in the URL.
The following should work:
import requests
r = requests.get('http://www.url.com', params={'foo': 'bar'})
print(r.content)
The actual request URL can be inspected via r.request.url; it should look like this:
http://www.url.com?foo=bar
If you're not sure about how the server works, you should send a POST request, like so:
import requests
data = {'name': 'value'}
requests.post('http://www.example.com', data=data)
If you absolutely need to send data with a GET request, make sure the data is in a dictionary and pass it with the params keyword instead.
You may find the requests documentation helpful.
A simple question: if I have a route set up with a chain of callbacks, e.g.
app.route('/myroute').post(callback1, callback2, callback3);
I call next() in each of my callbacks, except the last one.
Suppose I use my 'res' object to render and send back a response in callback2, but I still want to do some processing in callback3, which does not need to interact with the client or return anything.
Will my callback3 be always executed even if callback1 or callback2 uses the res object to return a response?
Some tests show that it does call callback3, but some say expressjs will terminate the call chain once res returns a response. I don't want to have any doubts: is there a clear answer on the behaviour here?
If you are calling next() in callback2, then the code in callback3 will be executed. The only thing that has "finished" in callback2 is the HTTP request: the connection will be closed, as you've sent a response already. Any further attempt to send a response afterwards will result in: Error: Can't render headers after they are sent to the client.
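A minimal sketch of that behaviour (the handler bodies are made up for illustration):

const express = require('express');
const app = express();

app.route('/myroute').post(
    (req, res, next) => { next(); },                   // callback1
    (req, res, next) => { res.send('done'); next(); }, // callback2: responds, then continues
    (req, res) => {
        // callback3 still runs; just don't touch res here
        console.log('post-response processing');
    }
);

app.listen(3000);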