How can URL shorteners (service providers) be more secure?

As a short-URL service provider, what safety checks should I perform on URLs to keep the risk to my users at a minimum? For example, nobody should be able to use my service to shorten links used for hacking, spamming, phishing, and so on.
One option is domain whitelisting, so that only URLs on trusted domains can be shortened through my service. What other safety checks like that should I perform before shortening a URL?

As a URL shortener (service provider), the following safety checklist should be applied before shortening any URL to keep the risk at a minimum. Please note that safety on the internet cannot be achieved by following a single checklist or a fixed set of rules; we have to keep trying new things and learning every day. This checklist is the result of my own research on the internet.
Domain or IP Whitelisting/Blacklisting - This is one of the most effective safety mechanisms we can implement: allow shortening only for URLs from trusted domains/IPs, and block known malicious domains and IPs. Blacklists are often easy to bypass, while whitelists are much harder to get around, so whitelisting is recommended; both can also be combined according to your needs.
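For illustration, a minimal Python sketch of such a check (the allow and block lists below are placeholders, not recommendations):

    from urllib.parse import urlparse

    ALLOWED_DOMAINS = {"example.com", "trusted-news.example"}   # placeholder allowlist
    BLOCKED_DOMAINS = {"malicious.example"}                     # placeholder blocklist

    def is_domain_permitted(url):
        """Return True if the URL's host passes the block list and matches the allow list."""
        host = (urlparse(url).hostname or "").lower()
        # The blocklist always wins; otherwise require an exact or subdomain allowlist match.
        if any(host == d or host.endswith("." + d) for d in BLOCKED_DOMAINS):
            return False
        return any(host == d or host.endswith("." + d) for d in ALLOWED_DOMAINS)

    print(is_domain_permitted("https://example.com/article"))   # True
    print(is_domain_permitted("http://malicious.example/x"))    # False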
Domain/IP Reputation Check - We can also check the reputation of a domain or IP before shortening it. Malicious IPs and domains tend to have a bad reputation in public threat-intelligence databases, so we can simply refuse to shorten URLs pointing at such addresses; a simple lookup is sketched after the reference links below.
References:
https://talosintelligence.com/reputation_center/
https://www.ipqualityscore.com/ip-reputation-check
https://ipremoval.sms.symantec.com/lookup
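One simple way to get started on reputation is a DNS blocklist (DNSBL) lookup: the IP's octets are reversed and queried against a blocklist zone such as zen.spamhaus.org. A minimal sketch using only the Python standard library (note that public blocklists have their own usage terms and may not answer queries from large shared resolvers):

    import socket

    def is_ip_listed_on_dnsbl(ip, dnsbl="zen.spamhaus.org"):
        """Check an IPv4 address against a DNS blocklist by reversed-octet lookup."""
        reversed_ip = ".".join(reversed(ip.split(".")))
        query = f"{reversed_ip}.{dnsbl}"
        try:
            socket.gethostbyname(query)   # any answer means the IP is listed
            return True
        except socket.gaierror:           # NXDOMAIN means not listed (or the lookup failed)
            return False

    print(is_ip_listed_on_dnsbl("127.0.0.2"))   # Spamhaus test address, normally listed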
Check DNS Records (MxToolbox) - MxToolbox supports global internet operations by providing free, fast, and accurate network diagnostic and lookup tools, including blacklist checks for any IP/host, WHOIS lookups, and more.
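A comparable lookup can also be scripted directly against DNS; a rough sketch assuming the third-party dnspython package is installed:

    import dns.resolver   # third-party package: dnspython

    def print_dns_records(domain, record_types=("A", "MX", "TXT")):
        """Print a few basic DNS record types for a domain, skipping missing ones."""
        for rtype in record_types:
            try:
                answers = dns.resolver.resolve(domain, rtype)
            except (dns.resolver.NoAnswer, dns.resolver.NXDOMAIN):
                continue
            for rdata in answers:
                print(f"{domain} {rtype}: {rdata.to_text()}")

    print_dns_records("example.com")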
VirusTotal IP/Domain Check - VirusTotal is an online service that analyzes suspicious files and URLs to detect malware and other malicious content using many antivirus engines and website scanners.
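VirusTotal also exposes a REST API; a hedged sketch against what I understand to be the v3 domain-report endpoint (an API key is required, and the response layout may change):

    import requests

    VT_API_KEY = "YOUR_API_KEY"   # placeholder; a real VirusTotal API key is required

    def domain_looks_malicious(domain, threshold=1):
        """Ask VirusTotal for a domain report and count the engines flagging it."""
        resp = requests.get(
            f"https://www.virustotal.com/api/v3/domains/{domain}",
            headers={"x-apikey": VT_API_KEY},
            timeout=10,
        )
        resp.raise_for_status()
        stats = resp.json()["data"]["attributes"]["last_analysis_stats"]
        return stats.get("malicious", 0) + stats.get("suspicious", 0) >= threshold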
HTTPS Protocol Use - Before shortening any URL we should verify that it uses the HTTPS protocol with a valid SSL/TLS certificate. Malicious sites and IPs are more likely to communicate over plain HTTP, so we should avoid shortening plain-HTTP URLs.
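A minimal sketch of such a check with the Python standard library (it requires an https scheme and a certificate the default trust store accepts):

    import socket
    import ssl
    from urllib.parse import urlparse

    def uses_valid_https(url):
        """Return True only for https URLs whose certificate verifies successfully."""
        parsed = urlparse(url)
        if parsed.scheme != "https" or not parsed.hostname:
            return False
        context = ssl.create_default_context()   # verifies the chain and the hostname
        try:
            with socket.create_connection((parsed.hostname, parsed.port or 443), timeout=10) as sock:
                with context.wrap_socket(sock, server_hostname=parsed.hostname):
                    return True
        except (ssl.SSLError, OSError):
            return False

    print(uses_valid_https("https://example.com/page"))   # True if the certificate is valid
    print(uses_valid_https("http://example.com/page"))    # False: plain HTTP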
Disallow Known Shortened Links - Some users try to outsmart the security checks by shortening a URL multiple times. As a short-link service provider we know the security risks behind short links, so we should refuse to shorten links that have already been shortened by another service, or even by our own.
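A sketch of that check; the shortener list here is a tiny illustrative sample and would need to be much longer and kept up to date in practice:

    from urllib.parse import urlparse

    KNOWN_SHORTENERS = {"bit.ly", "tinyurl.com", "goo.gl", "t.co", "ow.ly", "is.gd"}

    def is_already_shortened(url, own_domain="myshort.example"):
        """Reject URLs that already point at our own service or a known shortener."""
        host = (urlparse(url).hostname or "").lower()
        return host == own_domain or host in KNOWN_SHORTENERS

    print(is_already_shortened("https://bit.ly/abc123"))        # True: refuse to shorten
    print(is_already_shortened("https://example.com/article"))  # False: fine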
Compare URLs to Lists of Known Badware - There are many online databases that publish lists of suspected malicious IPs and domains. Use these databases to prevent shortening of such hosts; a sketch follows the reference links below.
References:
https://zeltser.com/malicious-ip-blocklists/
https://www.trendmicro.com/vinfo/us/threat-encyclopedia/malicious-url
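A rough sketch of consuming such a feed; the feed URL is a placeholder, and a real feed should be refreshed on a schedule rather than fetched per request:

    import requests

    BLOCKLIST_FEED = "https://blocklist.example/feed.txt"   # placeholder feed URL

    def load_blocklist():
        """Fetch a newline-separated list of bad hosts/IPs and return it as a set."""
        resp = requests.get(BLOCKLIST_FEED, timeout=10)
        resp.raise_for_status()
        return {line.strip().lower() for line in resp.text.splitlines()
                if line.strip() and not line.startswith("#")}

    def is_known_bad(host, blocklist):
        return host.lower() in blocklist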
URL Filtering - We can implement firewall-style policies that block certain content categories, for example spam, nudity, violence, weapons, drugs, or hacking.
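Real URL filtering usually relies on curated category databases or dedicated appliances; purely as an illustration of the idea, a toy keyword-based sketch:

    import requests

    # Toy category keywords; a production filter would use maintained category data instead.
    CATEGORY_KEYWORDS = {
        "weapons": ["firearm", "ammunition"],
        "drugs": ["narcotic", "opioid"],
    }

    def flag_categories(url):
        """Fetch the page and return the policy categories whose keywords appear in it."""
        text = requests.get(url, timeout=10).text.lower()
        return [cat for cat, words in CATEGORY_KEYWORDS.items()
                if any(word in text for word in words)]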

Related

How does Google force HTTPS on their .app TLD?

At I/O 2018 Google announced their new .app TLD and said that it will be HTTPS-only.
I thought that DNS just maps domain names to IPs.
How are they forcing HTTPS?
(a little off-topic here)
It is called HSTS Preloading, see https://hstspreload.org/
HSTS (HTTP Strict Transport Security) is a way for servers to tell clients: please contact me over HTTPS only (see https://www.troyhunt.com/the-6-step-happy-path-to-https/ for examples). It enhances security but still does not cover one case: the first connection to a given server can happen over HTTP before the browser learns it should have used HTTPS instead.
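The policy itself is just a response header; for instance, a quick way to look at it (assuming the Python requests package) is:

    import requests

    # Any site that has deployed HSTS sends this header on HTTPS responses.
    resp = requests.get("https://hstspreload.org/", timeout=10)
    print(resp.headers.get("Strict-Transport-Security"))
    # Typically something like: max-age=31536000; includeSubDomains; preload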
Hence comes the "preloading" of HSTS.
Basically this is a hardcoded list shipped in the code of all major browsers (see https://caniuse.com/#feat=stricttransportsecurity for compatibility depending on browser and version, or see the links to the source code at the bottom[1]) that says which domains/TLDs are HSTS-enabled, which means no HTTP connection to them is allowed at all.
Note that:
Anyone can submit names to this list by following some requirements, see https://hstspreload.org/#submission-requirements
Google (it started with Chrome, but the list is now shared among browsers) welcomes the inclusion of entire TLDs and not only hostnames; see the end of the document at https://hstspreload.org/ ("TLD Preloading").
They already added .DEV in the past (the TLD itself is not live yet, but Google will launch it "soon"), which broke many developers' setups where a .DEV domain name was (wrongly) used to name local resources: as soon as their browsers were updated with the newer HSTS preload list, they refused to connect to the local .DEV hosts without HTTPS. You can find here and elsewhere (e.g. https://ma.ttias.be/chrome-force-dev-domains-https-via-preloaded-hsts/) many horror stories of developers up in arms about that, and also many people offering bad workarounds (like disabling HSTS preloading, which is a very bad idea).
Also, when you buy a .APP domain name (and it will be the same for .DEV), Google (as the registry of .APP) made sure contractually that all registrars, during checkout of a .APP purchase, display a prominent message along the lines of: ".APP is a secure TLD and websites will only work with an SSL certificate(sic); make sure to buy an SSL certificate". ("SSL certificate" is straight out of Google's documentation, and it is sad to read coming from them since it is a doubly wrong term; it should have been an "X.509 certificate" or, in order not to frighten anyone, at least a "certificate used for TLS communications", as no one should use SSL anymore nowadays.)
By the way, .APP opened for the public at standard prices yesterday, May 8th.
Of course all of that only applies to web browsing. You could set up any other kind of service, like email, on top of a .APP domain name without any mandatory TLS (which of course is not a good idea nowadays, but nothing will prevent you from doing it). For email, there is an ongoing discussion to have basically the equivalent of HSTS for MTAs; see https://datatracker.ietf.org/doc/draft-ietf-uta-mta-sts/
[1] Some source code containing the HSTS preload list:
https://chromium.googlesource.com/chromium/src/net/+/master/http/transport_security_state_static.json
https://dxr.mozilla.org/mozilla-central/source/security/manager/ssl/nsSTSPreloadList.inc
or you can use the API at https://hstspreload.com/ to learn if a name is on the list
It's just a policy. A domain name is a domain name, and DNS only cares about how the name is translated to other resources, for example an IP address. Technically any IP address can be used together with any IP protocol (there are 256 to choose from, one of which is TCP) and, where applicable, any port number (there are 65536 to choose from, two of which are conventionally used for HTTP and HTTPS). There is no way to place such restrictions via DNS, but of course the TLD registry can attempt to do this via policy rules.
By trial and error I easily found a .app domain where HTTPS is not enforced:

    curl -v -L http://foo.app/

This results in a couple of redirects, but none of them redirect to HTTPS, and the final response is an HTTP response from a GoDaddy address.

Is it possible to restrict a requesting domain at the application level?

I wonder how some video streaming sites can restrict videos to be played only on certain domains. More generally, how do some websites respond only to requests from certain domains?
I've looked at http://en.wikipedia.org/wiki/List_of_HTTP_header_fields and saw the Referer field, which might be used for this, but I understand that HTTP headers can be spoofed (can they?).
So my question is, can this be done at the application level? By application, I mean, for example, web applications deployed on a server, not a network router's operating system.
Any programming language would work for an answer. I'm just curious how this is done.
If anything's unclear, let me know. Or you can use it as an opportunity to teach me what I need to know to clearly specify the question.
HTTP headers carrying IP information are helpful (because only a small portion of them are faked) but not reliable. Web applications are usually built on web frameworks, which give you easy access to these headers.
Some ways to gain source information:
The originating IP address from the IP/TCP network stack itself: the problem is that the address the server sees need not match the real client's address (it could come from a company proxy, an anonymous proxy, a big ISP, and so on).
The HTTP X-Forwarded-For header: proxies are supposed to set this header to address the problem above, but it can also be faked, and many anonymizing proxies don't set it at all.
Apart from IP-based source information you can also use machine identifiers (some use the User-Agent header). Several sites, for instance, store such machine identifiers inside Flash cookies so they can re-identify a returning client and block it. But it is the same story: this is unreliable and can be faked.
The root problem is that you need a lot of security machinery to reliably identify a client (e.g. authentication and client certificates). That is a lot of effort and adds usability problems, so many sites don't do it. Most of the time this isn't an issue, because only a small portion of clients put any effort into faking requests to a server.
The HTTP Referer is a different thing: it shows you from which page a user came, and it is added by the browser. It is also unreliable, because its content can be tampered with and some clients do not include it at all (I remember several IE versions skipping the Referer).
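To make it concrete, here is a minimal sketch with Flask showing where these values surface in a typical web framework; none of them should be treated as trustworthy on their own:

    from flask import Flask, request

    app = Flask(__name__)

    @app.route("/")
    def show_source_info():
        # Peer address from the TCP connection (may be a proxy, not the real client).
        peer_ip = request.remote_addr
        # Proxy-supplied chain of client addresses; trivially spoofable by the client.
        forwarded_for = request.headers.get("X-Forwarded-For", "")
        # Page the browser claims it came from; optional and also spoofable.
        referer = request.headers.get("Referer", "")
        return f"peer={peer_ip} xff={forwarded_for} referer={referer}"

    if __name__ == "__main__":
        app.run()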
These types of controls are based on the originating IP address, from which the country can be determined. Finding out the IP address requires access to low-level protocol information (e.g. from the socket).
The Referer header makes sense when you click a link from one site to another, but a typical HTTP request built with a programming library doesn't need to include it.

SaaS DNS Records Design

This question is an extension to a previously answered question:
How to give cname forward support to saas software
Sample sites -
client1.mysite.com
client2.mysite.com
...
clientN.mysite.com
Create affinity so that, say, client[1-10].mysite.com is forwarded to europe.mysite.com, which maps to an IP address.
Another criterion is that it should require few changes to proxies, firewalls, and the network. In essence the solution I am attempting is data-dependent routing (based on URL, login information, etc.).
However, the approaches I have seen all imply a token-based authentication system that authenticates the user and then redirects them to a new URL. I am afraid that can be a single point of failure and would need a site separate from my core app to do such routing. It also means quite a bit of refactoring of existing code. Another concern is that the solution may not be entirely transparent to the end user, as it relies on an HTTP 301 redirect.
Keeping in mind that the application can be served from load-balanced web servers (IIS) behind an LB switch and other network appliances, I would greatly appreciate it if someone could simplify this and educate me on how it should be designed.
Another resource I have been looking up is -
http://en.wikipedia.org/wiki/DNAME#DNAME_record
You could put routing information into a cookie, so that the various intermediary systems can detect that cookie and redirect the user accordingly without there being a single point of failure.
If the user forges a cookie of his own, he might get redirected to a server where he does not belong, but that server would then check whether the cookie is actually valid and prevent unauthorized access.
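One way to make such a routing cookie tamper-evident is to sign its value server-side; a minimal sketch using only the Python standard library (the secret and the shard name are placeholders):

    import hashlib
    import hmac

    SECRET_KEY = b"replace-with-a-real-secret"   # placeholder shared secret

    def make_routing_cookie(shard):
        """Return a 'shard.signature' string suitable for a cookie value."""
        sig = hmac.new(SECRET_KEY, shard.encode(), hashlib.sha256).hexdigest()
        return f"{shard}.{sig}"

    def verify_routing_cookie(value):
        """Return the shard name if the signature checks out, otherwise None."""
        shard, _, sig = value.partition(".")
        expected = hmac.new(SECRET_KEY, shard.encode(), hashlib.sha256).hexdigest()
        return shard if hmac.compare_digest(sig, expected) else None

    cookie = make_routing_cookie("europe")
    print(verify_routing_cookie(cookie))        # "europe"
    print(verify_routing_cookie("europe.bad"))  # None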

How to prevent SSL URLs from leaking info?

I was using Google SSL search (https://www.google.com) with the expectation that my search would be private. However, my search for 'toasters' produced this query:
https://encrypted.google.com/search?hl=en&source=hp&q=toasters&aq=f
As you can see, my employer can still log this and see what the search was. How can I make sure that when someone searches on my site using SSL (using custom Google search), their search terms aren't made visible?
The URL is sent over SSL. Of course a user can see the URL in their own browser, but it isn't visible as it transits the network. Your employer can't log it unless they are the other end of the SSL connection. If your employer installs its own CA certificate in your browser, they could use a proxy to spoof Google host names, but otherwise the traffic is secure.
HTTPS protects the entire HTTP exchange, including the URL, so the only thing someone intercepting network traffic will be able to determine is that there was communication between the browser and your site (or Google in this case). Even without the innards, that information can be useful.
Unless you have full administrative control over the systems making the queries, you should assume that anything transpiring on them can be intercepted or logged. Browsers typically store history and cache pages in files on the local disk which can be read by administrators. You also can't verify that the browser itself hasn't been recompiled with code to log sites that were visited, even in "private" mode.
Presumably your employer provides you with a PC, the software on it, the LAN connection to its own corporate network, the internet proxy and corporate firewall, maybe DNS servers, etc etc.
So you are exposed to traffic sniffing and tracing at many different levels. Even if you browse to a URL over SSL/TLS, you have to assume that the contents of your HTTP session can be recorded. Do you always check that the certificate in your browser is from Google and not your employer's proxy? Do you know what software sits between your browser and your network card, etc.?
However, if you had complete control over the client, then you could be sure that no one external to your HTTPS conversation with Google would be able to see the URL you are requesting.
Google still knows what you're up to, but that's a private matter between your search engine and your conscience ;)
To add to what @erickson said, read this. SSL will protect the data between the connected parties. If you need to hide that link from the boss, then disable the browser's caching of visited sites, i.e. disable or delete the history data.

What are potential issues with allowing clients to have CNAME / DNS Masking support in a web application?

Our company develops a web application that other companies can license. Typically, our application runs on:
www.company.example
And a client's version of the application is run on:
client.company.example
Usually, a client runs their own site at:
www.client.example
Sometimes, clients request to have their version of the application available from:
application.client.example
This kind of setup is often seen with blogs (Wordpress, Blogger, Kickapps).
Technically, achieving this "DNS masking" with a CNAME/A record and some application configuration is straightforward. I've thought through some potential issues related to this, however, and wonder if you can think of any others that I've missed:
1) Traffic statistics (as measured by external providers, e.g., compete.com) will be lower since the traffic for company.example won't include that of application.client.example. (Local stats would not be affected, of course)
2) Potential cookie disclosure from application.client.example to company.example. If the client is setting cookies at .client.example, those cookies could be read by the company.example server.
3) Email Spoofing. Email could be sent from company.example with the domain application.client.example, possibly causing problems with spam blacklisting due to incompatible SPF records.
Thanks for any thoughts on this.
CNAME has been widely used for a long time, especially by hosting companies. There are no major issues.
The biggest problem for us is when you have to use HTTPS. It's very difficult to support multiple CNAMEs on the same server: we use aliases in the certificate (the SAN extension), and we have to get a new cert every time a new CNAME is added in DNS. Other than that, everything works great for us.
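If it helps, you can check which host names a certificate already covers by reading its SAN entries; a small sketch with the Python standard library:

    import socket
    import ssl

    def get_san_entries(host, port=443):
        """Return the DNS names listed in the server certificate's SAN extension."""
        context = ssl.create_default_context()
        with socket.create_connection((host, port), timeout=10) as sock:
            with context.wrap_socket(sock, server_hostname=host) as tls:
                cert = tls.getpeercert()
        return [value for kind, value in cert.get("subjectAltName", ()) if kind == "DNS"]

    print(get_san_entries("www.google.com"))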
As to the issues you mentioned,
1) This should be an advantage. It's a lot easier to combine the stats than to separate them, so we prefer granular reports.
2) Cookies are not shared between domains, even if they point at the same IP. As long as the apps are properly sandboxed on the server, they can't read each other's cookies.
3) You should rate-limit your own outgoing SMTP traffic on the server end so you don't get blacklisted.
