StreamingResponse FastAPI returns strange file name - python-3.x

I have an API that outputs StreamingResponse (https://fastapi.tiangolo.com/advanced/custom-response/?h=fileresponse#streamingresponse) as zip/gz.
When I download the file via Swagger, I get a very strange name, for example:
application_gz export something=1&something=1&something=Example&archive_type=gz blob https __<ip_address>_aaaaaaaa-aaaa-aaaa-aaaa-aaaaaaaaaaaaa
so basically, it contains the server's IP address, a UUID, and some names. Is there any way to change this to something I decide, or at least something more elegant?
thanks!

You can use the Content-Disposition HTTP header to give an alternative file name for the resource. Since StreamingResponse is a subclass of Response, you can set this by using the headers parameter:
return StreamingResponse(fp, headers={'Content-Disposition': 'attachment; filename="yourfilename.zip"'})
You can also use inline instead of attachment if you don't want to force a download and would rather let the client display the file directly (for example, PDF files).

Related

Downloading Binary File from OneDrive API Using Node/Axios

I am using the OneDrive API to grab a file with a Node application using the axios library.
I am simply trying to save the file to the local machine (Node is running locally).
I use the OneDrive API to get the download document link, which does not require authentication (with https://graph.microsoft.com/v1.0/me/drives/[location]/items/[id]).
Then I make this call with the download document link:
response = await axios.get(url);
I receive a JSON response, which includes, among other things, the content-type, content-length, content-disposition and a data element which is the contents of the file.
When I display the JSON response to the console, the data portion looks like this:
data: 'PK\u0003\u0004\u0014\u0000\u0006\u0000\b\u0000\u0000\u0000!\u...'
If the document is simply text, I can save it easily using:
fs.writeFileSync([path], response.data);
But if the file is binary, like a docx file, I cannot figure out how to write it properly. Every time I try it seems to have the wrong encoding. I tried different encodings.
How do I save the file properly based on the type of file retrieved?
Have you tried explicitly setting the encoding option of fs.writeFileSync to null, signifying that the data is binary?
fs.writeFileSync([path], response.data, {
    encoding: null
});
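For a fuller picture, here is a rough end-to-end sketch (my own illustration, not part of the original answer) that asks axios for raw bytes via its responseType option and then writes the resulting buffer to disk; the URL and file name are placeholders:
const axios = require("axios");
const fs = require("fs");

async function downloadBinary(url, destPath) {
    // Request the body as raw bytes so axios does not decode it as text.
    const response = await axios.get(url, { responseType: "arraybuffer" });

    // response.data is now raw binary data; write it without re-encoding.
    fs.writeFileSync(destPath, Buffer.from(response.data));
}

// Hypothetical usage with the download document link obtained earlier:
// downloadBinary(downloadUrl, "document.docx");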

How to re-order HTTP headers?

I was wondering if there is any way to re-order the HTTP headers being sent by our browser before they reach the web server?
Since the order of the headers leaves some kind of "fingerprinting", see this post and this post, I was thinking about using MITMProxy (with Inline Scripting, I guess) to modify headers on-the-fly. Is this possible?
How would one achieve that?
Note: I'm looking for a method that can be scripted, not a method using a graphical tool like Burp Suite (although Burp is known to be able to re-order headers).
I'm open to suggestions. Perhaps NGINX might come to the rescue as well?
EDIT: I should be more specific, by giving an example...
Let's say I'm using Firefox. With the use of a funky add-on, I'm spoofing my user-agent to "look" like a Chrome browser. But if I then test my browser with ip-check.info, the "signature" of my browser remains that of Firefox, even though my spoofed user-agent shows "Chrome".
So the solution, in this specific case, should be to re-order the HTTP headers in the same manner as Chrome does.
How can this be done?
For the record, the order of the HTTP headers should not matter at all according to RFC 7230. But now that you have asked... this can be done in mitmproxy as follows:
import random

def request(context, flow):
    # flow.request.headers.fields is a tuple of (name, value) header tuples.
    h = list(flow.request.headers.fields)
    random.shuffle(h)
    flow.request.headers.fields = tuple(h)
See the mitmproxy documentation on netlib.http.Headers for more details.
There are tons of ways to reorder them however you wish, for example:
def reorder(headers, header_order=["Host", "User-Agent", "Accept"]):
    lines = []
    for name in header_order:  # add existing headers in the specified order
        if name in headers:
            lines.extend(headers.get_all(name))
            del headers[name]
    lines.extend(headers.fields)  # all other headers
    return lines

request.headers.fields = reorder(request.headers)

Node.js web crawler file extension handling

I'm developing a web crawler in Node.js. I've created a unique list of the URLs found in the crawled website's body, but some of them have extensions like jpg, mp3, mpeg... I want to avoid crawling the ones that have such extensions. Is there any simple way to do that?
Two options stick out.
1) Use path to check every URL
As stated in the comments, you can use path.extname to check for a file extension. For example:
var path = require("path");

var test = "http://example.com/images/banner.jpg";
path.extname(test); // '.jpg'
This would work, but it feels like you'll wind up having to maintain a list of file types you can crawl or must avoid. That's work.
Side note -- be careful using path. Typically, url is your best tool for parsing links because path is aimed at files/directories, not urls. On some systems (Windows), using path to manipulate a url can result in drama because of the slashes involved. Fair warning!
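Building on that side note, here is a small sketch (my own illustration, not from the original answer) that parses the link with the url module first and only applies path.extname to the pathname, so query strings and hashes don't get in the way:
var url = require("url");
var path = require("path");

function hasFileExtension(link) {
    // Parse the URL so the query string and hash don't end up inside extname's input.
    var pathname = url.parse(link).pathname || "";
    return path.extname(pathname) !== "";
}

hasFileExtension("http://example.com/images/banner.jpg?size=large"); // true
hasFileExtension("http://example.com/articles/some-post");           // false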
2) Get the HEAD for each link & see if content-type is set to text/html
You may have reasons to avoid making more network calls. If so, this isn't an option. But if it is OK to make additional calls, you could grab the HEAD for each link and check the MIME type stored in content-type.
Something like this:
var http = require("http");

var headersOptions = {
    method: "HEAD",
    host: "example.com", // host name only, without the protocol
    path: "/articles/content.html"
};

var req = http.request(headersOptions, function (res) {
    // you will probably need to also do things like check
    // HTTP status codes so you handle 404s, 301s, and so on
    if (res.headers['content-type'].indexOf("text/html") > -1) {
        // do something like queue the link up to be crawled
        // or parse the link or put it in a database or whatever
    }
});

req.end();
One benefit is that you only grab the HEAD, so even if the file is a gigantic video or something, it won't clog things up. You get the HEAD, see the content-type is a video or whatever, then move along because you aren't interested in that type.
Second, you don't have to keep track of file names because you're using a standard MIME type to differentiate html from other data formats.

Express res.download() not actually downloading file

I'm attempting to return generated files to the front end through Express' res.download function. I'm using Chrome, but whenever I call the API that executes the following code, all I get back is the same values that the Express res.sendFile() function returns.
I know that res.download uses res.sendFile, but I would like the download function to actually save to the file system instead of just returning the file in the body of the response.
This is my code.
exports.download = function (req, res) {
    var filePath = // some file that I want to download
    res.download(filePath, 'response.txt', function (err) {
        if (err) throw err;
    });
};
I know that the above code at least partly works because I'm getting back, in the response, the contents of the file. However, I want it to be saved onto the file system.
Am I misunderstanding what the download function is supposed to do? Do I just need to take the response data and write it to the file system manually?
res.download adds headers that suggest to the browser that the file should be downloaded rather than opened. However, there's no way to force the browser to do this; typically, it's ultimately the user's choice whether to download a particular file.
If you're triggering this request with AJAX, well, that's not going to cause it to be downloaded, because your JavaScript is requesting that it get the data.
Do I just need to take the response data and write it to the file system manually?
You don't have file system access in browser-side JavaScript. I'm not sure how you intend to do this.
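To illustrate the point about AJAX, here is a small client-side sketch (my own illustration, not part of the original answer; the button id and route path are assumptions): let the browser itself navigate to the route so the attachment headers set by res.download can take effect and the browser saves the file.
// Client side: navigate to the download route instead of fetching it with AJAX,
// so the browser honors the attachment headers and saves the file itself.
document.getElementById("download-btn").addEventListener("click", function () {
    window.location.href = "/api/download"; // hypothetical route handled by exports.download
});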

Node.js (sails.js) Find total number of files sent in a file upload

I have a file upload system in a sails.js app. I want to process the uploads before saving them on the server. My form on the client side allows multiple file uploads. Now, on the server side, how do I know how many files were sent?
For example I can find the total bytes to be expected from the upload using the following:
req._fileparser.form.bytesExpected
However, I couldn't find something similar that helps me find the total number of files sent to the server.
Also, regarding the above code (req._fileparser.form.bytesExpected), is there a better way to get the total combined size of the files sent through the upload form by the client?
In the GitHub repository for Skipper there is a file, index.js.
Line 92 from the above file, which appears to deal with multipart file uploads, contains the following:
var hasUpstreams = req._fileparser && req._fileparser.upstreams.length;
You should check the length of upstreams in your code, and see if that contains the number of files you sent.
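As a rough sketch of how that check might look inside a controller action (my own illustration; the field name "files" is a placeholder, and _fileparser/upstreams are internal, undocumented properties, so treat this as an assumption rather than a supported API):
// api/controllers/UploadController.js (hypothetical)
module.exports = {
    upload: function (req, res) {
        // Internal skipper state; guard against it being absent.
        var upstreams = (req._fileparser && req._fileparser.upstreams) || [];
        var upstreamCount = upstreams.length; // one upstream per file field detected so far

        req.file('files').upload(function (err, uploadedFiles) {
            if (err) return res.serverError(err);
            // uploadedFiles.length is the number of files actually received
            return res.json({ upstreams: upstreamCount, files: uploadedFiles.length });
        });
    }
};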
Another option: send a parameter in your request from the client with the number of files uploaded.
See the skipper ReadMe section about Text Parameters.
Skipper allows you to access the other non-file metadata parameters (e.g. "photoCaption" or "commentId") in the conventional way. That includes url/JSON-encoded HTTP body parameters (req.body), querystring parameters (req.query), or "route" parameters (req.params); in other words, all the standard stuff sent in standard AJAX uploads or HTML form submissions. And helper methods like req.param() and req.allParams() work too.
I've just found a previous question/answer on Stack Overflow.
You might try using var upload = req.file('file')._files[0].stream to access and validate, as shown in the above answer.
