I'm trying to get Wireshark output from wget or urllib that is as close as possible to what I see when using a browser manually.
The output is different, and I was wondering why that is and how I can overcome it.
Thanks!
wget is used primarily to grab whole or partial web sites for offline viewing, or for fast download of single files from HTTP or FTP servers.
A browser request contains HTTP headers like User-Agent, Referer, etc.
If you want wget to mimic a browser-like request, you can pass those HTTP headers along with your wget request.
Something like this-
# wget http://www.remote.co.in/images/myimage.jpg --header="User-Agent: Mozilla/5.0 (Windows NT 5.1; rv:23.0) Gecko/20100101 Firefox/23.0" --header="Accept: image/png,image/*;q=0.8,*/*;q=0.5" --header="Accept-Language: en-US,en;q=0.5" --header="Accept-Encoding: gzip, deflate" --header="Referer: http://www.mywebsite.com"
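Since the question mentions urllib as well, here is a rough Python equivalent of the same idea (it simply reuses the example URL and header values from the wget command above):

# A minimal sketch with Python's urllib, mirroring the wget example above.
# The URL and header values are illustrative, not specific to any real site.
import urllib.request

url = "http://www.remote.co.in/images/myimage.jpg"
req = urllib.request.Request(url, headers={
    "User-Agent": "Mozilla/5.0 (Windows NT 5.1; rv:23.0) Gecko/20100101 Firefox/23.0",
    "Accept": "image/png,image/*;q=0.8,*/*;q=0.5",
    "Accept-Language": "en-US,en;q=0.5",
    "Referer": "http://www.mywebsite.com",
})

with urllib.request.urlopen(req) as resp:
    print(resp.status, len(resp.read()))   # status code and size of the body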
There are a couple of things...
A browser:
May send several specific headers (User-Agent, cookies, Referer, misc. plugin headers, do-not-track)
Requests all child elements/scripts/resources, possibly on the same connection (keep-alive)
May request gzipped datastream in return
wget:
Sends minimal headers by default (User-Agent), but can add or alter others via parameters
Is generally a one-off, requesting only the main HTML and not its child resources
If you are seeing different main HTML, it may be that the site's server-side scripting tailors the content based on the user agent and/or cookies (e.g. "logged in").
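If you want to confirm that, a quick check is to fetch the same URL twice, once with urllib's default headers and once with browser-like ones, and compare the responses. A minimal Python sketch (the URL and header values are placeholders):

import urllib.request

URL = "http://example.com/"
BROWSER_HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 6.1; rv:50.0) Gecko/20100101 Firefox/50.0",
    "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
    "Accept-Language": "en-US,en;q=0.5",
}

def fetch(url, headers=None):
    req = urllib.request.Request(url, headers=headers or {})
    with urllib.request.urlopen(req) as resp:
        return resp.read()

plain = fetch(URL)                        # urllib's default User-Agent
browser_like = fetch(URL, BROWSER_HEADERS)
print("identical response:", plain == browser_like)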
Related
I recently noticed that some sites block node-libcurl but not curl. Is there an API I can call that returns the actual curl response? I used ReqBin and tried to mimic the requests it sent, but to no avail. Thanks
Many websites (for example wikipedia.org) block requests lacking a User-Agent header. curl has a default user agent (it looks like curl/7.73.0), but libcurl does not, so when you make a request with libcurl it sends no user agent and will be blocked in many places. The solution is CURLOPT_USERAGENT.
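For illustration, this is what CURLOPT_USERAGENT looks like from pycurl (Python's libcurl binding); node-libcurl should expose the same underlying option. The user agent string and URL are just examples:

# Setting CURLOPT_USERAGENT via pycurl. Without this option,
# libcurl sends no User-Agent header at all.
import pycurl
from io import BytesIO

buf = BytesIO()
c = pycurl.Curl()
c.setopt(pycurl.URL, "https://en.wikipedia.org/wiki/Main_Page")
c.setopt(pycurl.USERAGENT, "curl/7.73.0")   # or any browser-like UA string
c.setopt(pycurl.FOLLOWLOCATION, True)
c.setopt(pycurl.WRITEDATA, buf)
c.perform()
print(c.getinfo(pycurl.RESPONSE_CODE), len(buf.getvalue()))
c.close()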
Hi, my question is about transferring videos directly to my web server so that I don't have to upload them myself (as that takes time).
I am using the wget tool via SSH on the server for this task. wget downloads videos that have static links like this one:
http://pmln-hec.site88.net/CrackMD5SHA1hash.mp4
But when I try to download a video from a website like openload.co, I get a 403 Forbidden error (even when I pass a user agent to wget).
This is the command which I am using:
wget -U 'Mozilla/5.0 (Windows NT 6.1; WOW64; rv:50.0) Gecko/20100101 Firefox/50.0' 'https://ph2dos.oloadcdn.net/dl/l/lTmwcDbNky0pMsKN/pM06kIyNq7M/Barbie+Princesha+dhe+Kengetarja+e+Popit+%282013%29.mp4?mime=true'
Output
--2016-12-29 13:27:58-- http://ph2dos.oloadcdn.net/dl/l/lTmwcDbNky0pMsKN/pM06kIyNq7M/Barbie+Princesha+dhe+Kengetarja+e+Popit+%282013%29.mp4?mime=true
Resolving ph2dos.oloadcdn.net (ph2dos.oloadcdn.net)... 91.207.102.204, 2a04:9dc0:3:51::
Connecting to ph2dos.oloadcdn.net (ph2dos.oloadcdn.net)|91.207.102.204|:80... connected.
HTTP request sent, awaiting response... 403 Forbidden
2016-12-29 13:27:58 ERROR 403: Forbidden.
Can you tell me whether it's even possible to download something from this kind of dynamically generated link (I hope I am making sense), or not? If it is, where am I going wrong? If this is not the way, can you please suggest an alternative?
Note: If I paste the above URL into a download manager like IDM, it works fine.
I'm trying to protect a Classic ASP web application from HTTP Header Injected XSS attacks and am having trouble finding a solution that stops scripts found in the User Agent String.
Here is an example HTTP request to the web application:
HTTP Request
GET /WebApp/Login.aspx HTTP/1.1
Host: WebServer.Webapp.Com
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:41.0) Gecko/20100101 Firefox/41.0<script>alert(1)</script>
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-US,en;q=0.5
Accept-Encoding: gzip, deflate
Cookie: ASP.NET_SessionId=foobarID
Connection: keep-alive
Basically what we're trying to do is keep that alert script in the User Agent String from firing off when the page is loaded. I've been doing a lot of research and haven't been able to find too much help for this old app.
We do have validateRequest and EnableHeaderChecking set to true, but this script still executes. Any help is really appreciated.
The issue was that the user agent string (with the malicious script) was being rendered at the bottom of the page for debug purposes. If you're having this issue, please check that you aren't writing the object containing the bad script onto the page.
If you are, then remember to use HTML encoding to render it safely.
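Just to illustrate what the encoding does to the injected value, here is the idea sketched in Python (in the ASP page itself the equivalent call is Server.HTMLEncode; the user agent string is the one from the example request):

# HTML-encoding turns the injected markup into harmless text.
import html

user_agent = "Mozilla/5.0 (X11; Linux x86_64; rv:41.0) Gecko/20100101 Firefox/41.0<script>alert(1)</script>"
print(html.escape(user_agent))   # ...Firefox/41.0&lt;script&gt;alert(1)&lt;/script&gt;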
Thanks to the_lotus and Lankymart for the quick answers.
I googled a lot and many answers say Yes. For example: Is GET data also encrypted in HTTPS? But the senior security engineer in our company told me the URL would not be encrypted.
Imagine that: if the URL were encrypted, how would the DNS server find the host and connect?
I think this is a very strong point, although it goes against most of the answers. So I'm really confused, and my questions are:
Does HTTPS encrypt everything in the request (including the URL, host, path, parameters, headers)?
If yes, how does the DNS server decrypt the request and send it to the host server?
I tried to access https://www.amazon.com/gp/css/homepage.html/ref=ya_surl_youracct and my IE sent two requests to the server:
First:
CONNECT www.amazon.com:443 HTTP/1.0
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
Host: www.amazon.com
Content-Length: 0
DNT: 1
Connection: Keep-Alive
Pragma: no-cache
Second:
GET /gp/css/homepage.html/ref=ya_surl_youracct HTTP/1.1
Accept: text/html, application/xhtml+xml, */*
Accept-Language: en-US,zh-CN;q=0.5
User-Agent: Mozilla/5.0 (Windows NT 6.1; WOW64; Trident/7.0; rv:11.0) like Gecko
Accept-Encoding: gzip, deflate
Host: www.amazon.com
DNT: 1
Connection: Keep-Alive
It seems my browser sent two requests: the first one to establish the connection with the host (without encryption), and the second one carrying the encrypted request over HTTPS? Am I right? If I am understanding this correctly, does a client calling a RESTful API over HTTPS send the requests (connect, then get/post) twice every time?
The URL IS encrypted from the time it leaves the browser until it hits the destination server.
What happens is that the browser extracts the domain name and the port from the URL and uses that to resolve DNS itself. Then it starts an encrypted channel to the destination server IP:port. Then it sends a HTTP request through that encrypted channel.
The important part is that anyone other than you and the destination server can only see that you're connecting to a specific IP address and port. They can't tell anything else (like specific URLs, GET parameters, etc.).
Attackers can't even see the domain in most cases (though they can infer it if there is actually a DNS lookup - if it wasn't cached).
The big thing to understand is that DNS (Domain Name System) is a completely different service, with a different protocol, from HTTP. The browser makes DNS lookup requests to convert a domain name into an IP address. Then it uses that IP address to issue an HTTP request.
But at no time does the DNS server receive an HTTP request, and at no time does it do anything other than provide a domain-name-to-IP mapping for users.
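To make that split concrete, here is a bare-bones Python sketch of what the browser effectively does (simplified: no caching, proxies or keep-alive; the host and path are the ones from the question):

import socket, ssl

host = "www.amazon.com"

# 1. DNS: only the host name goes to the resolver, never the full URL.
ip = socket.getaddrinfo(host, 443)[0][4][0]

# 2. TLS: connect to ip:443 and encrypt everything from here on.
ctx = ssl.create_default_context()
with socket.create_connection((ip, 443)) as sock:
    with ctx.wrap_socket(sock, server_hostname=host) as tls:
        # 3. HTTP: the path, query and headers travel inside the encrypted channel.
        tls.sendall(b"GET /gp/css/homepage.html/ref=ya_surl_youracct HTTP/1.1\r\n"
                    b"Host: www.amazon.com\r\nConnection: close\r\n\r\n")
        print(tls.recv(200).decode(errors="replace"))   # start of the response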
While the other responses are correct as far as they go, there are many considerations beyond just the encryption between the browser and the server. Here are some things to think about...
The IP address of the server is resolved.
The browser makes a TCP socket connection to the server's IP address using TLS. This is the CONNECT you see in your example.
The request is sent to the server over the encrypted session.
If this was all there is to it, you are done. No problem.
But wait, there's more!
Having the fields in a GET instead of a POST reveals sensitive data when...
Someone looks in the server logs. This might be a snoopy employee, but it can also be the NSA or other three-letter government agency, or the logs might become public record if subpoenaed in a trial.
An attacker causes the web site encryption to fall back to cleartext or a broken cipher. Have a look at the SSL Server Test from Qualys SSL Labs to see if a site is vulnerable to this.
Any link on the page to an external site will show the URI of the page as the referrer. User ID and passwords are unintentionally yet commonly given away in this fashion to advertising networks. I sometimes spot these in my own blog.
The URL is available in the browser history and therefore accessible to scripts. If the computer is public (someone checks your web site from the guest PC in the hotel or airport lounge) the GET request leaks data to anyone else using that device.
As I mentioned, I sometimes find IDs, passwords and other sensitive info in the referrer logs of my blogs. In my case, I contact the owner of the referring site and tell them they are exposing their users to hacking. A less scrupulous person would add comments or updates to the site with links to their own web site, with the intention of harvesting the sensitive data in their referrer logs.
So your company's senior security engineer is correct that the URL is not encrypted in many places where it is extremely important to do so. You and the other respondents are also correct that it is encrypted in the very narrow use case of the browser talking to the server in context of a TLS session. Perhaps the confusion you mention has to do with the difference in the scope of these two use cases.
Please see also:
Testing for Exposed Session Variables (OTG-SESS-004)
Session Management - How to protect yourself (Note that "always use POST" is repeated over and over on this page.)
Client account hijacking through abusing session fixation on the provider
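To make the "always use POST for sensitive fields" advice concrete, here is a minimal Python sketch (the endpoint and field names are invented for illustration):

import urllib.error, urllib.parse, urllib.request

# Bad:    GET https://example.com/login?user=alice&password=s3cret
#         (the credentials end up in server logs, history and Referer headers)
# Better: keep the URL clean and send the fields in the POST body instead.
creds = urllib.parse.urlencode({"user": "alice", "password": "s3cret"}).encode()
req = urllib.request.Request("https://example.com/login", data=creds, method="POST")
try:
    with urllib.request.urlopen(req) as resp:
        print(resp.status)
except urllib.error.HTTPError as e:
    print(e.code)   # the invented endpoint won't actually accept the login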
The URL (also known as "Uniform Resource Locator") contains four parts:
Protocol (e.g. https)
Host name (e.g. stackoverflow.com)
Port (not always included, typically 80 for http and 443 for https)
Path and file name or query
Some examples:
ftp://www.ftp.org/docs/test.txt
mailto:user@test101.com
news:soc.culture.Singapore
telnet://www.test101.com/
The URL as an entire unit is not actually encrypted because it is not passed in its entirety. The URL is actually pulled apart into bits and each part is used in different ways. E.g. the protocol portion will tell your browser how to use the rest of the URL, the host name will tell it how to look up the IP address of the intended recipient, and the port will tell it, well, which port to use. The only portion of the URL that is passed in the payload itself is the path and query, and that portion is encrypted.
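You can see the same decomposition with Python's urllib.parse (the URL below is an invented example reusing the test101.com host from above):

from urllib.parse import urlsplit

parts = urlsplit("https://www.test101.com:443/docs/index.html?lang=en")
print(parts.scheme)    # https            -> tells the client which protocol to speak
print(parts.hostname)  # www.test101.com  -> used for the DNS lookup and the Host header
print(parts.port)      # 443              -> which TCP port to connect to
print(parts.path)      # /docs/index.html -> sent in the request line, encrypted
print(parts.query)     # lang=en          -> sent in the request line, encrypted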
If you take a look at an HTTP request in the raw, it looks something like this:
GET /docs/index.html HTTP/1.1
Host: www.test101.com
Accept: image/gif, image/jpeg, */*
Accept-Language: en-us
Accept-Encoding: gzip, deflate
User-Agent: Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1)
(blank line)
--Body goes here--
What you see in the example above is passed. Notice the full URL appears nowhere. The host header can actually be omitted completely (it is not used for routing). The only portion of the URL that appears here is to the right of the GET verb, and only includes the rightmost portion of the original URL. The protocol and the port number appear nowhere in the message itself.
Short answer: Everything to the right of the port number in the URL is included in the payload of the https request and is in fact encrypted.
I'm attempting to communicate with Domino via REST through a cross-domain request, but I'm encountering an issue. I've set up an Internet Site document with the IP address, localhost and a server name listed as the host names. The Internet Site is working, as a redirect rule I've set up on that Internet Site works. I've also set up a Web Site Rule with the following:
Now when I attempt to hit the rest.xsp page via an HTTP GET request, I get this error:
XMLHttpRequest cannot load
http://192.168.1.104/testing/restService.nsf/rest.xsp/testRest?reqType=UserCanAc…TOP&startId=BA4241EC74912860ED60FD1123473BF7&returnType=ARRAYOBJECTS.
No 'Access-Control-Allow-Origin' header is present on the requested resource. Origin
'http://127.0.0.1:8020' is therefore not allowed access.
Here are the request headers:
Accept:application/json, text/javascript, */*; q=0.01
Cache-Control:max-age=0
Origin:http://127.0.0.1:8020
Referer:http://127.0.0.1:8020/Backbone%20Playground/index.html
User-Agent:Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_3) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/35.0.1916.114 Safari/537.36
I can't for the life of me figure out what I've missed. Can someone point me in the right direction?
The CORS header is part of the response, so you need to check if you get a CORS response header with your page. In any case, for an XPage you can get direct access to the servlet response object and set the header in your XPage:
var externalContext = facesContext.getExternalContext();
var response = externalContext.getResponse();
response.setHeader("Access-Control-Allow-Origin","*");
You will want to replace the * with a more restrictive setting. CORS doesn't work in all browsers, so you need to check that end too.
I think your configuration is fine and you can test it using curl. You should be able to see the custom headers by checking any URL different from the one you're using.
The problem may be due to the REST Service control from the XPages Extension Library that you're using. I think the "HTTP response headers" are not applied for this control. I've tested this on Domino 8.5.3.
I know this is kind of an old thread, but since it hasn't been answered and there is some news, I think it's worth throwing in my own findings.
Mark Leusink dug into this and discovered that you also need to accept return code 204 for GET, and 201 for any write (PUT / POST) operations.
There is now also a way to add a fourth response header to all web site rules by means of the notes.ini parameter "HTTPAdditionalRespHeader="; see this technote.
However, I'm currently also struggling to complete a CORS task, because Domino always responds with a 401 to the preflight (which seems logical, as the preflight comes in unauthenticated, at least in Chrome).
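For anyone debugging the same thing, here is a rough way to reproduce the browser's preflight by hand and see exactly what Domino returns (Python sketch; the URL is the one from the question with its query string left out, and the origin is the one from the request headers above):

import urllib.error, urllib.request

url = "http://192.168.1.104/testing/restService.nsf/rest.xsp/testRest"
req = urllib.request.Request(url, method="OPTIONS", headers={
    "Origin": "http://127.0.0.1:8020",
    "Access-Control-Request-Method": "GET",
})
try:
    with urllib.request.urlopen(req) as resp:
        status, headers = resp.status, resp.headers
except urllib.error.HTTPError as e:        # a 401/403 response still carries headers
    status, headers = e.code, e.headers

print(status)
for name in ("Access-Control-Allow-Origin", "Access-Control-Allow-Methods"):
    print(name, "=", headers.get(name))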