How does the browser handle TCP out-of-band data?

I am interested in understanding whether browsers can handle TCP out-of-band (OOB) data sent to them from a server. For instance, I request a web page (from my web server) using Firefox, and I inject some OOB data in the middle of the response stream from the web server. It seems that if the OOB data is sent after an entire object (e.g. image, stylesheet, JavaScript) is returned, the browser has no trouble reading and loading that object. However, if the OOB data is sent in the middle of an object's byte stream, the browser may fail to load that object. I am not sure whether the browser fails to receive the remainder of the object after the OOB data, or receives it but fails to load the object.
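A minimal sketch of the kind of mid-stream injection described above, assuming a loopback connection and Python's standard socket module. The function names (send_with_oob, read_in_band, demo) are hypothetical, and the payload is a stand-in for a real HTTP object. It only illustrates what a receiver that ignores urgent data might see; it says nothing about how any particular browser behaves, and OOB handling differs between operating systems.

```python
import socket
import threading

def send_with_oob(listener: socket.socket) -> None:
    """Accept one connection and inject a single urgent byte mid-stream."""
    conn, _ = listener.accept()
    with conn:
        conn.sendall(b"HEL")              # first part of the "object"
        conn.send(b"!", socket.MSG_OOB)   # one byte of TCP urgent (OOB) data
        conn.sendall(b"LO")               # rest of the "object"

def read_in_band(port: int) -> bytes:
    """Read the stream to EOF like a client that never asks for OOB data."""
    chunks = []
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        while True:
            chunk = client.recv(1024)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)

def demo() -> bytes:
    """Run sender and receiver over loopback and return the in-band bytes."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))       # port 0: let the OS pick a free port
    listener.listen(1)
    port = listener.getsockname()[1]
    sender = threading.Thread(target=send_with_oob, args=(listener,))
    sender.start()
    try:
        return read_in_band(port)
    finally:
        sender.join()
        listener.close()

if __name__ == "__main__":
    print(demo())
```

Comparing the returned bytes with the bytes that were sent in-band shows whether the remainder of the object arrived after the urgent byte, which is the distinction the question is asking about.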
Apart from looking at the source code of a browser, is there any other reference to understand this aspect of the browser behaviour?
Thanks and regards

Related

If the server response is just simple JSON, why are there 5 different resources showing in my network tab?

I am learning Node and wrote a simple app which returns just JSON data as its response.
There is no error, but I don't understand why there are 5 different resources (content.min.css, favicon.ico, jsonview-core.css, options.png, etc.) showing in my network tab. Can anyone help me understand this?
If possible, how can I avoid sending these extra resources and just send the JSON data?
It's because of the Chrome extension that I am using.
I found something interesting about the favicon.ico file:
Modern browsers will show an icon to the left of the URL. This is known
as the 'favicon.ico' and is typically fetched from
website.com/favicon.ico. Your browser will automatically request it
when browsing to different sites. If your browser receives a valid
favicon.ico file, it will display this icon. If it fails, it will not
display a special icon.

Is my picture of a website correct?

I tried to analyze what a website is in essence by deconstructing, or reverse engineering, one. I speculate that the following sequence of events takes place during interaction with a website.
1. Every website is basically a set of computer programs, which get executed when the system where they are stored is contacted.
2. Depending on the type of request sent by the sender, some XML files, files containing the code to be executed in response to different events, and some scripts intended for dynamic alteration of the XML files are sent.
3. Out of these XML files, one contains the information about the initial appearance of the page and the placement of different controls or event generators on the screen.
4. When some activity, like a mouse click, is done in the vicinity of an event generator, an event is generated.
5. The code snippet to respond to the event is executed. If the code involves contacting the server and sending some request, then the server is pinged again.
6. When the server is pinged again, depending on the request sent, it again executes some code and in response transfers more code files, XML files, and scripts to dynamically change the appearance of the page.
Is my understanding of the flow of a website correct?
A web server is basically just a program sitting on a computer that listens on some TCP port (usually 80 for HTTP, 443 for HTTPS).
Clients (such as browsers) can connect and send a request (in HTTP format) to the server.
The server then sends an HTTP response back.
That's it. That's the basic flow: Connect, request, response.
The response contains a "type" field that tells the client what to do with the data. E.g. it could send an image (which is usually displayed on screen), an audio file (which is played), or a "normal" web page in HTML format.
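The connect, request, response flow and the "type" field (the Content-Type header) can be sketched with Python's standard library. This is only an illustration under assumptions: the Handler class, the fetch helper, and the /data path are made up for the example, and the server listens on a throwaway local port rather than 80 or 443.

```python
import http.client
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        # Pick a body and the matching Content-Type for the requested path.
        if self.path == "/data":
            body = json.dumps({"ok": True}).encode("utf-8")
            content_type = "application/json"
        else:
            body = b"<html><body><h1>Hello</h1></body></html>"
            content_type = "text/html; charset=utf-8"
        self.send_response(200)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the demo quiet

def fetch(path: str):
    """Start a local server, make one request, return (status, type, body)."""
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: any free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    try:
        conn = http.client.HTTPConnection("127.0.0.1", server.server_port, timeout=5)
        conn.request("GET", path)                    # connect + request
        resp = conn.getresponse()                    # response
        result = (resp.status, resp.getheader("Content-Type"), resp.read())
        conn.close()
        return result
    finally:
        server.shutdown()
        server.server_close()

if __name__ == "__main__":
    print(fetch("/"))
    print(fetch("/data"))
```

The client decides how to treat the body purely from the Content-Type value: the same bytes sent with a different type would be handled differently.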
HTML contains structured information about page content and layout, and may contain references to other resources such as images, style sheets, and scripts. A browser automatically fetches these resources (another HTTP request/response) and processes them.
Scripts can be used to customize the behavior on the client side. These are typically written in JavaScript and make use of an API exposed by the browser for interacting with the current page. They can e.g. register "click" handlers to define what happens when the user clicks on some page element.
XML may or may not be used internally by the web server. It doesn't really matter as far as clients are concerned.
If you want to learn more about this, I suggest researching HTTP, HTML, CSS, and JavaScript. MDN has some good articles, for example.

What computer languages does my browser understand?

I am trying to understand computer languages at the client-side and server-side level a little better. From what I understand, client-side languages (HTML, CSS and JavaScript) are all built into the browser and can be understood without an internet connection.
However, let's say I build a simple blog application in Python or Ruby. Would my server be the ONLY one that knows how to process the Python or Ruby code before sending a response back to the client? If so, how does the server compile/interpret the code before sending something back that the client/browser can understand?
Please help me understand this.
This is a very general and broad response:
A web server (server) and a browser (client) like Firefox communicate by sending text to each other. How this text is sent is described by a set of rules, a protocol called Hypertext Transfer Protocol (HTTP).
HTTP responses contain a 'body' field. This body contains text. The server may send any text it wants to the client. How the client renders that text is up to the client. The text could take the form of HTML, CSS, JavaScript, Chinese, numbers...
So if the text the server sends to the client is in HTML format, the client will render it as HTML. The same is true for CSS and JavaScript.
But how does the server know what to send to the client?
Simply put, the person who built the website and owns the server put the code onto the server and said 'respond to clients with this stuff when you receive a request.'
Wait, so what's the deal with Python/Ruby/Java etc and those languages being used to write servers?
Servers are programs that take 'requests' and handle the logic that decides how to respond, and what to respond with. The actual content of what the response contains however, has nothing to do with the language that is used to handle the response.
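As a rough illustration of that last point, here is a sketch of server-side Python that produces the text of a response. The POSTS dictionary and the render_post function are hypothetical, not part of any framework. The Python code runs only on the server; what reaches the browser is the returned HTML string.

```python
from html import escape

# Hypothetical in-memory "database" for the blog.
POSTS = {1: {"title": "Hello & welcome", "body": "My first post."}}

def render_post(post_id: int) -> str:
    """Build the HTML text the server would put in the response body."""
    post = POSTS[post_id]
    title = escape(post["title"])   # escape so the text is safe inside HTML
    body = escape(post["body"])
    return (
        "<!DOCTYPE html>\n"
        f"<html><head><title>{title}</title></head>\n"
        f"<body><h1>{title}</h1><p>{body}</p></body></html>"
    )

if __name__ == "__main__":
    print(render_post(1))
```

The same HTML could just as well be produced by Ruby or Java; the browser has no way to tell, and no need to.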
The browser natively understands HTML, CSS, and JavaScript. It parses HTML markup into DOM elements, and it parses CSS into style objects that JavaScript can also read and modify.

How web browsers decide which resource should be requested

I have a fundamental question that I have been researching for a long time, but I still don't know the exact answer.
I am working with browsers and web applications. I am wondering how, and based on what, a web browser decides to send a particular request to the web server.
For example, when you enter http://www.google.com in the address bar of your web browser, the browser will send a bunch of requests to the web server in order to render the web page properly.
My question is: how does the web browser decide which requests it needs to send to the web server?
Is it related to tags like 'link' or 'script' inside the body of the responses?
Does the browser parse the JavaScript functions to see if it should send a request based on those functions?
Let's take an example to explain this.
Suppose you want to search for something and you open http://www.google.com in your browser. These are the events that unfold to fetch the page that lets you type in your query.
First, the networking stack on your machine will try to figure out which actual internet address matches www.google.com. This is called a DNS lookup. Once it receives a response for this lookup in form of an IP address, it can make a connection to the actual server that is serving google.com.
The machine makes a socket connection and uses the HTTP protocol to communicate with the server. It queries for the resource at / (which is the root) of the address you are trying to reach. This is called a GET request. The request is normally described like so: GET /
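The lookup and the GET request described so far can be sketched with Python's socket module. The helper names (resolve, build_request, fetch_raw) are made up for this example, and the request uses plain HTTP on port 80 with a minimal set of headers, which is far simpler than what a real browser sends.

```python
import socket

def resolve(host: str, port: int = 80) -> list[str]:
    """DNS lookup: return the IP addresses the name resolves to."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    return sorted({info[4][0] for info in infos})

def build_request(host: str, path: str = "/") -> bytes:
    """Build the text of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",   # ask the server to close after responding
        "",
        "",                    # blank line ends the headers
    ]
    return "\r\n".join(lines).encode("ascii")

def fetch_raw(host: str, path: str = "/", port: int = 80) -> bytes:
    """Open a socket, send the request, and read the raw response to EOF."""
    with socket.create_connection((host, port), timeout=10) as sock:
        sock.sendall(build_request(host, path))
        chunks = []
        while True:
            chunk = sock.recv(4096)
            if not chunk:
                break
            chunks.append(chunk)
    return b"".join(chunks)

if __name__ == "__main__":
    print(resolve("www.google.com"))
    print(fetch_raw("www.google.com")[:200])
```

Running the script needs network access; the first lines of the raw response are the status line and headers, followed by the body.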
Google will respond with an HTML page, normally "index.html", which gets downloaded by the browser.
Once the HTML is downloaded, all linked resources, such as the images needed to render the HTML as well as JavaScript referenced by the HTML page, get downloaded.
The downloaded HTML page is parsed and an in-memory tree is created called the "DOM Tree". This tree contains the elements of the HTML page in a hierarchy. Once the DOM is created, you can see the page being rendered on the browser.
During this parsing, the browser discovers more resources to be downloaded, such as images, stylesheets, javascript files. The HTML page references these resources via different tags such as <img> for images, <script> for javascript.
All detected resources are downloaded. Browsers download many of these resources in parallel, but apply them (JavaScript and stylesheets) sequentially in the order they were found on the page.
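The discovery step can be approximated with Python's built-in HTML parser. This is a simplified sketch, not how a real browser engine works: the ResourceFinder class and find_resources function are hypothetical, and only img, script, and stylesheet link tags are considered.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

class ResourceFinder(HTMLParser):
    """Collect URLs of resources that the page's tags refer to."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.resources = []   # (tag, absolute URL) in document order

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        url = None
        if tag in ("img", "script"):
            url = attrs.get("src")            # inline scripts have no src
        elif tag == "link":
            rel = (attrs.get("rel") or "").lower().split()
            if "stylesheet" in rel:
                url = attrs.get("href")
        if url:
            # Resolve relative references against the page's own URL.
            self.resources.append((tag, urljoin(self.base_url, url)))

def find_resources(html: str, base_url: str):
    finder = ResourceFinder(base_url)
    finder.feed(html)
    finder.close()
    return finder.resources

if __name__ == "__main__":
    page = (
        '<html><head><link rel="stylesheet" href="/s.css">'
        '<script src="app.js"></script></head>'
        '<body><img src="img/logo.png"></body></html>'
    )
    for tag, url in find_resources(page, "http://example.com/dir/page.html"):
        print(tag, url)
```

Each URL found this way corresponds to one extra request the browser would make after receiving the HTML.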
Stylesheets are parsed, and the styles are applied to the DOM of the HTML page. Sometimes, if stylesheets take longer to download, you can see the "raw" HTML page being rendered before the styles are applied. This happens sometimes over a slow connection.
Once the HTML page and its dependent resources (such as scripts, stylesheets, and images) have been downloaded, the browser fires the "load" event and calls the JavaScript "onload" callback function. Most JavaScript-heavy applications are started at this time.
Once onload is called, Javascript takes over and can attach handlers for different elements on the web page. Once the handlers have all been installed, interacting with the webpage could call one or more javascript functions that are listening for these events.
Javascript can also manipulate the DOM (the elements on the page), which results in UI updates (what the user sees) and therefore can be used to build a complete app on a single page.
Here is some more reading on the process: http://friendlybit.com/css/rendering-a-web-page-step-by-step/
The best way to examine this interaction is to use the developer tools in Chrome, Firefox, or IE and view the network activity when you visit a web page.

Scraping an Oracle ADF Faces rich client

I am trying to scrape an Oracle ADF Faces rich client web page, but I am not having much luck. I log in automatically using the Node.js request module, but after that I can't get to any other page with request. I get stuck on redirects or the loop script, or simply don't get the information I expect.
I am using Wireshark to view every page and the way it is handled. I recreate each request to match the headers and even the size, but every time the framework denies me access.
Before you ask, it's legal and I am not breaking any terms of service. I am just trying to make a web API to speed up a process. I have used PhantomJS with CasperJS but get stuck on AJAX calls that don't show on the page, and PHP cURL, but it's much easier with Java.
Any suggestions are really really appreciated.
My bad on this one: Wireshark was displaying fields as truncated. If you want to see the full field, you need to right-click the packet and click Follow TCP Stream. Rich clients have very long POSTs generated by the framework behind the rich client, and it appears I was missing about half of them when I made the calls.
