How to download XHR files from chromedriver using python? - python-3.x

As shown in the screenshot, I need to download all the XHR files that appear in the network tab whenever a call is made, using Python.

First, load a URL on the domain you're targeting the file download from. This lets you perform an AJAX request on that domain without running into cross-origin restrictions.
Next, inject some JavaScript into the DOM that fires off an AJAX request. Once the AJAX request returns a response, load the response into a FileReader object. From there you can extract the base64-encoded content of the file by calling readAsDataURL(). Then take the base64-encoded content and append it to window, a globally accessible variable.
Finally, because the AJAX request is asynchronous, enter a Python while loop that waits for the content to be appended to the window. Once it's there, decode the base64 content retrieved from the window and save it to a file.
This solution should work across all modern browsers supported by Selenium, for both text and binary content, and for all MIME types.
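Below is a minimal sketch of that approach, assuming Selenium with chromedriver; the page URL, file URL, and output file name are placeholders:

import base64
import time

from selenium import webdriver

driver = webdriver.Chrome()

# Step 1: load any page on the same domain as the file so the AJAX call is same-origin.
driver.get("https://example.com/some-page")  # hypothetical URL

# Step 2: inject JavaScript that fetches the file, runs it through FileReader,
# and stores the base64 data URL on window so Python can poll for it.
driver.execute_script("""
    var xhr = new XMLHttpRequest();
    xhr.responseType = 'blob';
    xhr.onload = function () {
        var reader = new FileReader();
        reader.onloadend = function () {
            window._downloadedContent = reader.result;  // "data:<mime>;base64,<payload>"
        };
        reader.readAsDataURL(xhr.response);
    };
    xhr.open('GET', arguments[0]);
    xhr.send();
""", "https://example.com/api/report")  # hypothetical XHR endpoint

# Step 3: the request is asynchronous, so poll until the content shows up on window.
content = None
while content is None:
    content = driver.execute_script("return window._downloadedContent || null;")
    time.sleep(0.5)

# Strip the data-URL prefix and decode the base64 payload to a file.
with open("downloaded_file.bin", "wb") as f:
    f.write(base64.b64decode(content.split(",", 1)[1]))

driver.quit()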

Related

How to use azure logic app action to download files in browser

I originally created a logic app that would, given a JSON payload, run a stored procedure, transform the results into a CSV table, and then email the CSV to a specified email account. Unfortunately, requirements changed slightly: instead of emailing the CSV, they want it to download directly in the browser.
I am unable to get the HTTP Response action to tell the browser to download the file using the Content-Disposition header. It looks like this header is stripped out by design. Is anyone aware of another action (perhaps a Function?) that could be used in place of the HTTP Response to get a web browser to download the file rather than returning it as text in the response body?
It does indeed seem to be the case that the Response action doesn't support the Content-Disposition header for some reason. Probably the easiest workaround is to proxy the request through a simple HTTP-triggered Azure Function with CORS enabled (or an API on your own server) that just fetches the file from the Logic App and then returns it with the Content-Disposition header attached.
NB. Don't rely on <a download="filename"> - most browsers that support the download attribute only respect it for same-origin requests.
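As a rough illustration of that proxy (my sketch, not part of the original answer), assuming the Python programming model for Azure Functions; the Logic App callback URL and file name are placeholders:

import urllib.request

import azure.functions as func


def main(req: func.HttpRequest) -> func.HttpResponse:
    # Hypothetical Logic App callback URL -- substitute your workflow's trigger URL.
    logic_app_url = "https://prod-00.westus.logic.azure.com/workflows/<id>/triggers/manual/paths/invoke"
    csv_bytes = urllib.request.urlopen(logic_app_url).read()
    # Returning the content with a Content-Disposition header makes the browser download it.
    return func.HttpResponse(
        body=csv_bytes,
        mimetype="text/csv",
        headers={"Content-Disposition": 'attachment; filename="report.csv"'},
    )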

Can I make a browser extension return a custom response for a web request?

In a Firefox or Chrome(1) extension (using WebExtensions), is it possible to interrupt a request and return an alternate response instead, preventing the network request? What I'd like to do is store some html data (dynamically) using the storage API, and then return that html when the browser attempts to send off a specific request.
webRequest.onBeforeRequest seems to only support cancelling the request or returning a redirect. Is there a way to redirect to something inside the extension that returns the data? Or a way to craft and return a response directly?
(1) The Chrome docs for webRequest seem to reflect the same reality as the Firefox docs, and it seems that Firefox deliberately adopted much of WebExtensions to be consistent with Chrome.

Load urls with aiohttp, wait a few seconds, refresh the page, then read the contents of the page

As the title mentions, I'm attempting to grab data from several pages using aiohttp and asyncio. However, the problem I'm having is that the program grabs the info from the pages too quickly and then exits. The webpage needs to update its contents first (which can take a couple of seconds) and then refresh to display the properly updated contents, which are what I want to collect.
Is there a way I can load the page, wait a few seconds, refresh the page, and then read the contents of it? This is what my current fetch method looks like:
async def fetch(session, url):
    with aiohttp.Timeout(10):
        async with session.get(url) as response:
            return await response.text()
When you load a URL in a browser tab, the browser first sends a request for that URL's content (in our case, just the HTML text). It then scans the HTML for links - to images, CSS, and scripts - and sends requests to load those too. As each of these loads, the browser updates the rendered page; in particular, once a JavaScript file has loaded, the browser executes it, which may modify the page's HTML content. Only when all of the linked resources have loaded and all scripts have run is the page fully loaded.
Of that whole process, a request library like aiohttp does only the first step: it sends a request and returns the URL's content (response.text()). It won't fetch the script links inside that content, and it won't execute them to modify the content.
So what you're asking for can't be done with aiohttp.
If you need the content as it looks after JavaScript has executed, you need a much more complicated browser-based solution, like PyQt.
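As an illustration of the load / wait / refresh / read pattern with a browser-based tool, here is a Selenium sketch (Selenium is my substitution here, not part of the answer above; the URL and wait times are placeholders):

import time

from selenium import webdriver

driver = webdriver.Chrome()
driver.get("https://example.com/data-page")  # hypothetical URL

time.sleep(5)      # give the page time to update its contents
driver.refresh()   # reload so the updated contents are rendered
time.sleep(2)      # let the refreshed page finish loading

html = driver.page_source  # the fully rendered HTML, after scripts have run
driver.quit()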

How the browser treats static files, downloadable files, JSON, and XML

Very basic question. Here it goes.
The client hits a URL on the server. The server can send content in one of these forms:
static files (JavaScript/HTML);
XML/JSON (predominantly, the purpose of this is to return some DATA to the client);
downloadable files, such as zip files - for this part the server needs to set the content type property to something that lets the client know it wants this file to be downloaded or something.
My question is: how does the browser differentiate between static files and API responses (in the form of XML/JSON/string)?
Thanks,
Gully
HTTP Headers.
There's no such thing as a "file" in HTTP. There are requests and responses, each of which consist of headers and content. The response content may be the contents of what was a "file" on the server, and may be intended to be treated as a "file" on the client (such as downloading a .zip file), but the response itself is not a file. The way that the server indicates to the client that something should be a file is through the HTTP headers.
Specifically the two headers you're talking about are:
Content-Type
Content-Disposition
The first tells the client (browser) what kind of data it's receiving. There are lots of examples, and most browsers understand what to do with most common types. The second can be used to suggest to the client that the content should be saved as a file rather than displayed. For example, the Content-Type might be for an image, and by default a browser will just display an image. But you can add a Content-Disposition header to indicate that the image is an "attachment" and even suggest a file name for it, instructing the browser to save the file (or prompt the user asking to save the file) instead of displaying it.
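To make that concrete, here's a small Flask sketch (my own example; the routes and file name are placeholders) showing the same content served once with only a Content-Type and once with a Content-Disposition header added:

from flask import Flask, Response

app = Flask(__name__)


@app.route("/image")
def show_image():
    with open("photo.png", "rb") as f:  # hypothetical file
        data = f.read()
    # Content-Type alone: the browser just displays the image inline.
    return Response(data, mimetype="image/png")


@app.route("/image/download")
def download_image():
    with open("photo.png", "rb") as f:
        data = f.read()
    resp = Response(data, mimetype="image/png")
    # Content-Disposition: attachment tells the browser to save it as a file instead.
    resp.headers["Content-Disposition"] = 'attachment; filename="photo.png"'
    return resp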

How can you send html and other files from a node.js server

I am running (and learning from) a basic node.js program that returns unformatted numeric data in the response object. The Chrome browser just shows that data unformatted on screen. However, now I want to return HTML files, so I read an HTML file using the fs module and returned it in the response object. The browser shows the entire HTML source instead of interpreting it. Here's what I want to do:
Send an HTML file with JavaScript code in it.
Connect the JavaScript on the client with that on the server to exchange more HTML, CSS, JavaScript, JSON, or other objects like image files, etc.
How can I establish this architecture? I am completely new to web development.
1) Set the Content-Type header properly.
2) Use frameworks like Express.js, Socket.IO, etc.
