Hide referral information when my site users click on external links

Hide referral information when my site users click on external links - security

I apologize for my lack of knowledge on how the intricacies of the web work ahead of time.
I run a fairly large deal site (lets call it dealsite.com) and we send a lot of traffic to Amazon.com. Is there anyway for me to hide from Amazon that the users are are coming from dealsite.com? I do not want Amazon to know that we (dealsite.com) are the ones sending the traffic.
Maybe strip certain cookies?
Send outbound traffic through a proxy?
I am not doing anything illegal and these are real users not bots.

By using the noreferrer tag on your links, you can prevent Amazon from learning their traffic is coming from your site, and you don't need to set up a proxy, vpn, or cookie redirects.
HTTP generally sends the referring page along with its request for the new page as part of the HTTP referer section of the request header, and that's how sites track where their visitors come from. So for example, a user would click through to Amazon.com from Dealsite.com, and the request would include an HTTP referer telling Amazon.com that the user was linked from Dealsite.com.
To prevent web sites like Amazon from learning that their traffic came from your site, prevent your links from sending the HTTP referer. In HTML5, just add rel="noreferrer" to your links, and then referral information will not be sent to the site that was linked. The noreferrer link type is only suppported in new browsers, so I suggest using the knu's noreferrer polyfill to make sure it works on older browsers too.
So far this will prevent referrer information from being sent from 99.9% of your users - the only users that will send referral information will be users that are both using old browsers and have JavaScript disabled. To make it 100%, you could require users have JavaScript enabled to be able to click on those particular links.

Disclaimer: This is not the thorough idea you're looking for. I ran out of space in the comments so posted it as an answer. A couple of possible solutions come to my mind.
Proxy servers: Multiple distributed proxy servers to be specific. You can round robin your users through these servers and and hit Amazon so that the inbound traffic to Amazon from dualist.com keeps revolving. Disadvantage is that this will be slow depending on where the proxy server resides. So not the most ideal solution for an Ecommerce site but it works. And the major advantage is that implementation will be very simple.
VPN tunneling: Extremely similar to proxy server. VPN tunnel to another server and send redirect to Amazon from there. You'll get a new (non dealsite.com) IP from the VPN server of this network and your original IP will be masked
Redirects from user (Still in works) For this one I was thinking of if you could store the info you need from dealsite.com in a cookie and then instruct the host to redirect to Amazon by itself. Hence the inbound traffic to Amazon will be from the users IP and not dealsite.coms. If you need to get back to the dealsite session from Amazon, you could use the previously saved cookie to do so.
Ill add to this answer if I find something better.
Edit 1 A few hours more hours researching brought me to the Tor project. This might be useful but be wary, Many security experts advise against using Tor. See here

Related

Detect that a Browser is on the Intranet

I've got a requirement to detect if a webpage is being served on the internet or intranet, i.e. assuming a url of https://accessibleanyway.com, is the phone connected to the work wifi or to something else like their home wifi or the phone network?
What different ways are there to do this?
(1) Use WebRTC to get the local ip address. Not widely supported
(2) Try to access a local web page using jsonp/cors/iframe
The problem with 2 is that the webpage is https and the local resource is likely to be http which you can't do in IE afaik. If I make the local resource https then it's via a self cert which means installing CAs on the phones (can you buy certificates for the intranet anymore?)
Any suggestions?

The problem with (2) was that the same page was trying to use http and https, and even with an iframe you get issues.
What you could do instead is start on a http loading page, use an iframe to access a local resource which you can only access if you are on the intranet, jsonp will work fine for this. Once that's worked or failed, redirect to your start page with some token in the querystring to indicate that you are on the intranet or not
NB jumping from http to https would probably have some security issues if you are on the same website (authentication cookies being initially visible), but I would have thought it would be fine if you are going to a different one
Obviously there'll be some security needed around the token as otherwise the user could just generate their own but that's a different matter which depends on individual setups. It would obviously have to be generated by a server call, otherwise someone could just read the client code.
NB I think the IP address approach is never going to work as you have no way of knowing what a companies intranet setup looks like until you go there, so it's not a generic answer

Is is possible to restrict a requesting domain at the application level?

I wonder how some video streaming sites can restrict videos to be played only on certain domains. More generally, how do some websites only respond to requests from certain domains.
I've looked at http://en.wikipedia.org/wiki/List_of_HTTP_header_fields and saw the referrer field that might be used, but I understand that HTTP headers can be spoofed (can they?)
So my question is, can this be done at the application level? By application, I mean, for example, web applications deployed on a server, not a network router's operating system.
Any programming language would work for an answer. I'm just curious how this is done.
If anything's unclear, let me know. Or you can use it as an opportunity to teach me what I need to know to clearly specify the question.

HTTP Headers regarding ip-information are helpful (because only a smaller portion is faked) but is not reliable. Usually web-applications are using web-frameworks, which give you easy access to these.
Some ways to gain source information:
originating ip-address from the ip/tcp network stack itself: Problem with it is that this server-visible address must not match the real-clients address (it could come from company-proxy, anonymous proxy, big ISP... ).
HTTP X-Forwarded-For Header, proxies are supposed to set this header to solve the mentioned problem above, but it also can be faked or many anonymous proxies aren't setting it at all.
apart from ip-source information you also can use machine identifiers (some use the User-Agent Header. Several sites for instance store this machine identifiers and store it inside flash cookies, so they can reidentify a recalling client to block it. But same story: this is unreliable and can be faked.
The source problem is that you need a lot of security-complexity to securely identify a client (e.g. by authentication and client based certificates). But this is high effort and adds a lot of usability problem, so many sites don't do it. Most often this isn't an issue, because only a small portion of clients are putting some brains to fake and access server.
HTTP Referer is a different thing: It shows you from which page a user was coming. It is included by the browser. It is also unreliable, because the content can be corrupted and some clients do not include it at all (I remember several IE browser version skipping Referer).

These type of controls are based on the originating IP address. From the IP address, the country can be determined. Finding out the IP address requires access to low-level protocol information (e.g. from the socket).
The referrer header makes sense when you click a link from one site to another, but a typical HTTP request built with a programming library doesn't need to include this.

SaaS DNS Records Design

This question is an extension to previously answered question
How to give cname forward support to saas software
Sample sites -
client1.mysite.com
client2.mysite.com
...
clientN.mysite.com
Create affinity by say client[1-10].mysite.com to be forwarded to europe.mysite.com => IP address.
Another criteria is it should have little recourse to proxy, firewall and network changes. In essence the solution I am attempting is a Data Dependent Routing (based on URL, Login Information etc.).
However they all mean I have a token based authentication system to authenticate and then redirect the user to a new URL. I am afraid that can be a single point of failure and will need a seperate site from my core app to do such routing. Also its quite some refactoring to existing code. Another concenr is the solution also may not be entirely transparent to the end user as it will be a HTTP Redirect 301.
Keeping in mind that application can be served from Load Balanced Web Servers (IIS) with LB Switch and other Network appliances, I would greatly appreciate if someone can simplify and educate me how this should be designed.
Another resource I have been looking up is -
http://en.wikipedia.org/wiki/DNAME#DNAME_record

You could stick routing information into a cookie, so that the various intermediary systems can then detect that cookie and redirect the user accordingly without there being a single point of failure.
If the user forges a cookie of his own, he might get redirected to a server where he does not belong, but that server would then check whether the cookie is indeed valid, and prevent unauthorized access.

What are the pros and cons of a 100% HTTPS site?

First, let me admit that what I know about HTTPS is pretty rudimentary. I don't know much about session security, encryption, or how either of those things is supposed to be done.
What I do know is that web security is important; that horror stories of XSS, CSRF, and database injections pop up over and over again. I know that a preventative stance against such exploits is better than a reactive one.
But the motivation for this question comes from a different point of view. I work at a site that regularly accepts payment from users. Obviously, the payments are sent over a secure channel (HTTPS). I mainly work on the CSS, HTML, and JavaScript of the site. What I've been told is that it is necessary to duplicate CSS, JavaScript, and image files before they can be called over HTTPS. So assume I have the following files:
css/global.css
js/global.js
images/
logo.png
bg.png
The way I understand it, these files need to be duplicated before they can be "added" to the HTTPS. So a file can either be under security (HTTPS) or not.
If this is true, then this is a major hindrance. In even the smallest site, it would be a major pain to duplicate files and then have to maintain them every time you make a CSS or JS change. Obviously this could be alleviated by moving everything into the HTTPS.
So what I want to know is, what are the pros and cons of a site that is completely behind HTTPS? Does it cause noticeable overhead? Is it just foolish to place the entire site under encryption? Would users feel safer seeing the "secure" notifications in their browser during their entire visit? And last but not least, does it truly make for a more secure site? What can HTTPS not protect against?

You can serve the same content via HTTPS as you do via HTTP (just point it to the same document root).
Cons that may be major or minor, depending:
serving content over HTTPS is slower than serving it via HTTP.
certificates signed by well-known authorities can be expensive
if you don't have a certificate signed by a trusted authority (eg, you sign it yourself), visitors will get a warning
Those are pretty basic, but just a few things to note. Also, personally, I feel much better seeing that the entire site is HTTPS if it's anything related to financial stuff, obviously, but as far as general browsing, no, I don't care.

Noticeable overhead? Yes, but that matters less and less these days as clients and servers are much faster.
You don't need to make a copy of everything, but you do need to make those files accessible via HTTPS. Your HTTPS and HTTP services can use the same doc root.
Is it foolish to put the whole site under encryption? Typically no.
Would users feel safer? Probably.
Does it truly make for a more secure site? Only when dealing with the communication channel between the client and the server. Everything else is still up for grabs.

You've been misinformed. The css, js, and image files need not be duplicated assuming you've set up the http and https mapping to point to the same physical website on the server. The only important thing is that these files are referenced with https when the page you're looking at is also under https. This will prevent the dreaded security message that says that some objects on the page are not secured.
For every other page where you're running the site under http (unsecured) you can reference those same files in the same locations, but with an http address.
To answer your other question, there would indeed be a performance penalty to put the entire site under https. The server has to work hard to encrypt everything it sends over the wire. And then some not-so-old browsers won't cache https content to disk by default, which of course will result in an even heavier load on the server.
Because I like my sites to be as responsive as possible, I'm always selective about which sections of a site I choose to be SSL-encrypted. In most typical e-commerce sites, the only pages that need SSL encryption are the login, registration, and checkout pages.

The traditional reason for not having the entire site behind SSL is processing time. It does take more work for both the client and the server to use SSL. However, this overhead is fairly small compared to modern processors.
If you are running a very large site, you may need to scale slightly faster if you are encrypting everything.
You also need to buy a certificate, or use a self signed one which may not be trusted by your users.
You also need a dedicated IP address. If you are on a shared hosting system, you need to have an IP that you can dedicate to only having SSL on your site.
But if you can afford a certificate and private ip and don't mind needed a slightly faster server, using SSL on your entire site is a great idea.
With the number of attacks that SSL mitigates, I would say do it.

You do not need multiple copies of these files for them to work with HTTPs. You may need to have 2 copies of these files if the hosting setup has been configured in such that you have a separate https directory. So to answer your question - no duplicate files are not required for HTTPs but depending on the web hosting configuration - they may be.
In regards to the pros and cons of https vs http there are already a few posts addressing that.
HTTP vs HTTPS performance
HTTPS vs HTTP speed comparison
HTTPs only encrypts the data between the client computer and the server. It does not software holes or issues such as remote javascript includes. HTTPs doesn't make your application better - it only helps secure the data between the user and your app. You need to make sure your app has no security holes, practice filtering all data, SQL, and review security logs frequently.
However if you're only responsible for the frontend part of the site I wouldn't worry about it but would bring up concerns of security with the main developer for the backend.

One of the concerns is that https traffic could be blocked, for example on Apple computers if you set parental control on it blocks https traffic because it can't read the encrypted content, you can read here:
http://support.apple.com/kb/ht2900
https note: For websites that use SSL
encryption (the URL will usually begin
with https), the Internet content
filter is unable to examine the
encrypted content of the page. For
this reason, encrypted websites must
be explicitly allowed using the Always
Allow list. Encrypted websites that
are not on the Always Allow list will
be blocked by the automatic Internet
content filter.

An important "pro" for more https at your site is the following:
a user connecting thru an unencrypted WiFi, like at an airport, can give their password in https, but if the site then switches back to http after the password page, the session cookie becomes exposed and can be immediately used by an eavesdropper.
See this article http://steve.grc.com/2010/10/28/why-firesheeps-time-has-come/#comment-2666

How to prevent SSL urls from leaking info?

I was using google SSL search (https:www.google.com) with the expectation that my search would be private. However, my search for 'toasters' produced this query:
https://encrypted.google.com/search?hl=en&source=hp&q=toasters&aq=f
As you can see, my employer can still log this and see what the search was. How can I make sure that when someone searches on my site using SSL (using custom google search) their search terms isn't made visible.

The URL is sent over SSL. Of course a user can see the URL in their own browser, but it isn't visible as it transits the network. Your employer can't log it unless they are the other end of the SSL connection. If your employer creates a CA certificate and installs it in your browser, they could use a proxy to spoof Google host names, but otherwise, the traffic is secure.

HTTPS protects the entire HTTP exchange, including the URL, so the only thing someone intercepting network traffic will be able to determine is that there was communication between the browser and your site (or Google in this case). Even without the innards, that information can be useful.
Unless you have full administrative control over the systems making the queries, you should assume that anything transpiring on them can be intercepted or logged. Browsers typically store history and cache pages in files on the local disk which can be read by administrators. You also can't verify that the browser itself hasn't been recompiled with code to log sites that were visited, even in "private" mode.

Presumably your employer provides you with a PC, the software on it, the LAN connection to its own corporate network, the internet proxy and corporate firewall, maybe DNS servers, etc etc.
So you are exposed to traffic sniffing and tracing at many different levels. Even if you browse to a url over SSL TLS, you have to assume that the contents of your http session can be recorded. Do you always check that the cert in your browser is from google and not your employer's proxy? Do you know what software sits between your browser and your network card, etc.
However, if you had complete control over the client, then you could be sure that no-one external to your https conversation with google would be able to see the url you are requesting.
Google still knows what you're up to, but that's a private matter between your search engine and your conscience ;)

to add to what #erickson said, read this. SSL will protect the data between the connected parties. If you need to hide that link from the boss then disable the browser caching of the sites visited, i.e. disable or delete the history data.

Develop Reference

node.js excel linux python-3.x azure haskell apache-spark rust .htaccess string