Are there any limits in modern browsers regarding how much data I can store in one single cookie?
I found an article stating that a cookie should hold a minimum of 4 KB, but it mentions no upper limit.
http://www.faqs.org/rfcs/rfc2965.html
Here are the limits according to 'browser cookie limits':
Chrome & Firefox - no limit on the number of cookies; limit of 4096 bytes per cookie.
IE8-10 - 5117 characters per cookie; total limit of 10234 characters.
Safari on Mac - 4093 bytes per cookie; total limit of 4093 bytes.
Safari Mobile - 4093 bytes per cookie; total limit of 4093 bytes.
Android - 4093 bytes per cookie; no total limit.
Source:
http://browsercookielimits.x64.me/#limits
I'd say that for a desktop application you should stay within the limits of IE, or even Safari if you have Mac users. If it is on the phone, then definitely just 4093 bytes. If you need more, you will have to save to the server or create two different experiences for Safari/IE and Firefox/Chrome users.
To comply with the standard, you should store no more than 4096 bytes per cookie.
Another point worth remembering is that cookies are sent on EVERY request to the matching domain, which is very significant overhead in the case of a sizable cookie (upload speeds are often 10x slower than download speeds).
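If you want to enforce that budget programmatically, here is a minimal sketch, assuming a Java servlet environment and the conservative 4093-byte figure from the table above (the class and method names are made up for illustration):

import java.nio.charset.StandardCharsets;
import javax.servlet.http.Cookie;
import javax.servlet.http.HttpServletResponse;

public final class CookieLimits {
    // 4093 bytes for "name=value" is the most conservative per-cookie budget listed above
    private static final int MAX_COOKIE_BYTES = 4093;

    // Adds the cookie only if it fits the budget; returns false so the caller can fall back to server-side storage.
    public static boolean addIfWithinLimit(HttpServletResponse response, String name, String value) {
        int size = (name + "=" + value).getBytes(StandardCharsets.UTF_8).length;
        if (size > MAX_COOKIE_BYTES) {
            return false;
        }
        response.addCookie(new Cookie(name, value));
        return true;
    }
}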
As to specific limits per browser, I defer to another response.
What solution are people using to enable users to download large files in application running on IIS?
For a web application there is a cap on the maximum request size and also a request timeout; these limits can be used to prevent denial-of-service attacks caused by users who post very large files to the server.
As far as I know, the maximum length of content in a request can be specified via maxAllowedContentLength. However, maxAllowedContentLength has type uint, so its maximum value is 4,294,967,295 bytes = 4 GB, which does not meet your requirements. Alternatively, you could consider splitting the large file into several parts.
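For reference, this is roughly where that setting lives in web.config (a sketch only, assuming IIS 7+ request filtering; the value shown is the uint maximum mentioned above):

<configuration>
  <system.webServer>
    <security>
      <requestFiltering>
        <!-- maxAllowedContentLength is in bytes; as a uint it caps out at 4,294,967,295 (~4 GB) -->
        <requestLimits maxAllowedContentLength="4294967295" />
      </requestFiltering>
    </security>
  </system.webServer>
</configuration>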
I've got an image generation servlet that renders an image of the text given in a query string, and I use it like this:
<img src="myimage.jpg.jsp?text=hello%20world">
Below are my security measures:
Urlencoding of querystring parameter
Domain whitelist
Querystring parameter length check
My questions:
Any security measure I'm forgetting there?
How does the above increase DOS attack risks compared to a standard:
<img src="myimage.jpg">
Thanks for helping.
Things to check would be,
Use the HTTP Referer header to verify that requests originate from your pages. This is only relevant if you are going to use these images solely on pages of your own site: you can verify that the images are loaded from your pages, are not being included directly in pages on some other site, and that the image URL is not being accessed directly. Bear in mind that the header can easily be forged, so it won't stop a determined DoS attack (see the sketch after this list).
Check the underlying library you are using to generate the images: which parameters you pass it, and which of those can potentially be controlled by the user in a way that affects the size of the image or the processing time. I am not sure how the font and font size are provided to the image generator, whether they are hard-coded or derived from user input.
Since this URL pattern generates an image, I assume every call is CPU-intensive and also involves some data transfer for the actual image. You may want to control the rate at which these URLs can be hit if you are really worried about DoS.
As I already mentioned in my comment, a URL can only be 1024 characters long, so there is an inherent limit on the number of characters the text parameter can have. You can enforce an even smaller limit with an additional check.
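Putting the Referer check and the length check together, a rough sketch assuming a Java servlet (the servlet name, the allowed-referer list and the 100-character cap are placeholders, not values from your question):

import java.io.IOException;
import java.util.Set;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class TextImageServlet extends HttpServlet {
    private static final int MAX_TEXT_LENGTH = 100; // hypothetical cap, tune to your layout
    private static final Set<String> ALLOWED_REFERERS = Set.of("https://www.example.com/"); // placeholder domain

    @Override
    protected void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        String referer = req.getHeader("Referer");
        boolean refererOk = referer != null
                && ALLOWED_REFERERS.stream().anyMatch(referer::startsWith);
        String text = req.getParameter("text");

        if (!refererOk || text == null || text.isEmpty() || text.length() > MAX_TEXT_LENGTH) {
            resp.sendError(HttpServletResponse.SC_FORBIDDEN); // reject before any image work happens
            return;
        }
        // ... render the image from 'text' with hard-coded font and font size ...
    }
}

As noted above, the Referer check is only a speed bump; a determined attacker can forge the header.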
For DoS prevention, rate-limit how many requests can be received per IP per X seconds.
To implement this, before doing any processing, log the remote IP address of each request and then count the number of previous requests in the last, say, 30 seconds. If the count for that IP address is greater than, say, 15, reject the request with an "HTTP 500.13 Web Server Too Busy".
This is on the assumption that your database logging and lookup are less processor intensive than your image generation code. This will not protect against a large scale DDoS attack but it will reduce the risks considerably.
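A minimal in-memory version of that counter (a sketch only: the answer above assumes database logging, so treat this as a single-server simplification using the 15-requests-per-30-seconds numbers):

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

public final class IpRateLimiter {
    private static final int MAX_REQUESTS = 15;          // requests allowed per window
    private static final long WINDOW_MILLIS = 30_000L;   // 30-second window

    private static final class Window {
        final long start;
        final AtomicInteger count = new AtomicInteger();
        Window(long start) { this.start = start; }
    }

    private final ConcurrentHashMap<String, Window> windows = new ConcurrentHashMap<>();

    // Returns true if the request from this IP should be processed, false if it should be rejected.
    public boolean allow(String ip) {
        long now = System.currentTimeMillis();
        Window w = windows.compute(ip, (key, old) ->
                (old == null || now - old.start > WINDOW_MILLIS) ? new Window(now) : old);
        return w.count.incrementAndGet() <= MAX_REQUESTS;
    }
}

In the image code you would call allow(request.getRemoteAddr()) before doing any work and send the "too busy" response when it returns false.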
Domain whitelist
I assume this is based on the "referer" header? Yes, this would stop your image from being included directly on other websites, but it could be circumvented by proxying the request via the other site's server. The DoS protection above would help alleviate this, though.
Querystring parameter length check
Yes that would help reduce the amount of processing that a single image request could do.
My questions:
Any security measure I'm forgetting there?
Probably. As a start I would verify you are not at risk to the OWASP Top 10.
How does the above increase DOS attack risks compared to a standard
A standard image request simply serves the static image off your server, and the only real overhead is IO. Processing through your JSP means that it is possible to overload your server by executing multiple requests at the same time, since the CPU is doing more work.
My website collects user comments about some images. Non-registered users can click a "Good" button. What is the best way for the system to remember the user's choice? One person can click "Good" only once. Cookies? Session? Some other way?
First you have to realise that all session techniques are cookie based (at least all the good ones). That means they all share the same downside: where cookies do not work, user choices will be forgotten. In those (hopefully rare) cases you could store the choices either in the URL or as a CGI parameter. In any case, you cannot make it really secure.
That being said, you have tradeoffs to consider.
Cookies
If you use purely cookie-based storage then you could be limited in the number of user choices that can be stored in cookies under a single domain name. RFC 6265 states some SHOULDs regarding those limits, and an implementation matching them will give you at most about 200 KB, which should be quite enough. The older RFC 2965 says implementations should give you 80 KB. Also remember that the browser will send the cookies with every request to your website, which could mean slow browsing for your users.
Assuming a 24-bit image ID (16 million possible images), base64-encoded to 4 bytes, you can pack close to 20,000 choices into cookies. With a 32-bit image ID, encoded to 6 bytes, you still get more than 10,000 choices into your cookies.
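To illustrate the packing arithmetic, a small sketch (assuming the 24-bit image IDs above; the class and method names are invented):

import java.util.Base64;

public final class VotePacker {
    // Encodes the low 24 bits of an image ID as 4 URL-safe base64 characters.
    public static String encode(int imageId) {
        byte[] id = { (byte) (imageId >> 16), (byte) (imageId >> 8), (byte) imageId };
        return Base64.getUrlEncoder().withoutPadding().encodeToString(id);
    }

    // Appends one more vote to an existing cookie value (4 characters per vote).
    public static String append(String cookieValue, int imageId) {
        return cookieValue == null ? encode(imageId) : cookieValue + encode(imageId);
    }
}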
When cookies prove too cumbersome, say after 1,000 votes, you could switch the user over to the session technique… Or consider that they will never get that far without having registered ;-)
Sessions
If you decide to store the user choices in the session then you will have to dedicate some storage area on the server. The downsides are that:
you have no safe way to know when a session is no longer in use, so you need some mechanism to reclaim unused sessions, typically by expiring them after a fixed amount of inactivity,
it is more difficult to scale if and when you want to distribute the load among multiple HTTP servers.
You create a unique "token" that you save as a cookie (a hash of IP + timestamp, for example). This value is also saved to the database in conjunction with the vote.
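A rough sketch of issuing that token, assuming a Java servlet (it hashes IP + timestamp as suggested; a purely random value from SecureRandom would work just as well):

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Base64;
import javax.servlet.http.Cookie;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public final class VoterToken {
    // Issues a voter token cookie; the returned value is also stored in the database with each vote.
    public static String issue(HttpServletRequest request, HttpServletResponse response)
            throws NoSuchAlgorithmException {
        String input = request.getRemoteAddr() + System.currentTimeMillis();
        byte[] digest = MessageDigest.getInstance("SHA-256").digest(input.getBytes(StandardCharsets.UTF_8));
        String token = Base64.getUrlEncoder().withoutPadding().encodeToString(digest);

        Cookie cookie = new Cookie("voter_token", token);
        cookie.setMaxAge(60 * 60 * 24 * 365); // keep it for a year
        cookie.setPath("/");
        response.addCookie(cookie);
        return token;
    }
}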
I know it as a secure-token URL; maybe there is another name out there, but I think you all know it.
It's a technique mostly applied when you want to restrict content delivery to a certain client to whom you have handed a specific URL in advance.
You take a secret token, concatenate it with the resource you want to protect, and hash it. When the client requests this URL on one of your servers, the hash is reconstructed from the information in the request and compared; if it is the same, the content is delivered, otherwise the user gets redirected to your website or something else.
You can also put a timestamp in the hash to give the URL a time to live, or include the user's IP address to restrict delivery to their connection.
This technique is used by Amazon (S3 and CloudFront), Level 3 CDN, Rapidshare and many others. It is also a basic part of HTTP digest authentication, although there it is taken a step further with link invalidation and other things.
Here is a link to the Amazon docs if you want to know more.
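To make this concrete, here is a bare-bones version of the scheme I mean (a sketch, not Amazon's actual algorithm; the secret, the parameter names and the choice of HMAC-SHA256 are my own placeholders):

import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.util.Base64;
import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

public final class SignedUrls {
    private static final String SECRET = "change-me"; // placeholder secret token

    // Signs "path + expiry" with HMAC-SHA256; the result goes into the URL.
    public static String sign(String path, long expiresEpochSeconds) throws Exception {
        Mac mac = Mac.getInstance("HmacSHA256");
        mac.init(new SecretKeySpec(SECRET.getBytes(StandardCharsets.UTF_8), "HmacSHA256"));
        byte[] sig = mac.doFinal((path + "\n" + expiresEpochSeconds).getBytes(StandardCharsets.UTF_8));
        String token = Base64.getUrlEncoder().withoutPadding().encodeToString(sig);
        return path + "?expires=" + expiresEpochSeconds + "&sig=" + token;
    }

    // Recomputes the signature from the request data and compares it in constant time.
    public static boolean verify(String path, long expiresEpochSeconds, String sig) throws Exception {
        if (expiresEpochSeconds < System.currentTimeMillis() / 1000) {
            return false; // link has expired
        }
        String expected = sign(path, expiresEpochSeconds);
        String expectedSig = expected.substring(expected.indexOf("&sig=") + 5);
        return MessageDigest.isEqual(
                expectedSig.getBytes(StandardCharsets.UTF_8),
                sig.getBytes(StandardCharsets.UTF_8));
    }
}

Using an HMAC keyed with the secret, rather than hashing secret + data directly, is the usual way to build this kind of signature.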
Now my concern with this method is that if somebody cracks the hash on one of your links, the attacker gets your secret token in plain text and can sign any URL in your name himself.
Or worse, in the case of Amazon, access your services with administrative scope.
Granted, the string hashed here is usually pretty long, and you can include a lot of material, or even force the data to have a minimum length by adding some unnecessary data to the request. Maybe some pseudo-variable in the URL that is not used, filled with random data.
That makes brute-force attacks against the SHA-1/MD5 (or whatever hash you use) pretty hard. But the protocol is open, so an attacker only has to fill in the gap where the secret token goes and fill up the rest with the data known from the request. And today's hardware is fast and can calculate MD5s at a rate of multiple tens of megabytes per second. This sort of attack can be distributed across a computing cloud, and you are not limited by something like "10 tries per minute enforced by a login server", which is what usually makes hashing approaches reasonably secure. And now with Amazon EC2 you can even rent the hardware for a short time (beat them with their own weapons, haha!).
So what do you think? Do my concerns have a basis, or am I being paranoid?
However, I am currently designing an object storage cloud for special needs (integrated media transcoding and special delivery methods like streaming and so on).
Now Level 3 has introduced an alternative approach to secure-token URLs. It is currently in beta and only open to clients who specifically request it. They call it "proxy authentication".
What happens is that the content-delivery server makes a HEAD request to a server specified in your (the client's) settings and mimics the user's request, so the same GET path and IP address (as x_forwarder) are passed along. You respond with an HTTP status code that tells the server whether or not to go ahead with the content delivery.
You can also introduce a secure-token process into this, and you can put more restrictions on it, such as letting a URL be requested only 10 times or so.
It obviously comes with a lot of overhead, because an additional request and extra calculations take place, but I think it is reasonable and I don't see any caveats with it. Do you?
You could basically reformulate your question as: how long does a secret token need to be in order to be safe?
To answer this, consider the number of possible characters (alphanumeric plus uppercase already gives 62 options per character). Secondly, ensure that the secret token is random and not in a dictionary or anything similar. Then, for instance, if you take a secret token 10 characters long, it would take 62^10 (= 839,299,365,868,340,224) attempts to brute-force it in the worst case (the average case would be half of that, of course). I wouldn't really be scared of that, but if you are, you can always make the secret token at least 100 characters long, in which case it takes 62^100 attempts to brute-force (a number that spans three lines in my terminal).
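A quick way to reproduce those counts (plain arithmetic, nothing more):

import java.math.BigInteger;

public final class TokenSpace {
    public static void main(String[] args) {
        BigInteger alphabet = BigInteger.valueOf(62);
        System.out.println(alphabet.pow(10));  // 839299365868340224 possible 10-character tokens
        System.out.println(alphabet.pow(100)); // the "three lines in my terminal" number
    }
}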
In conclusion: just take a token big enough, and it should suffice.
Of course, proxy authentication does offer your clients extra control, since they can control far more directly who gets access and who doesn't, and it would, for instance, defeat email sniffing as well. But I don't think brute-forcing needs to be a concern, given a long enough token.
It's called a MAC (message authentication code), as far as I understand.
I don't understand what's wrong with hashes. A simple calculation shows that a SHA-1 hash (160 bits) gives very good protection. E.g. if you have a super-duper cloud that performs a billion billion (10^18) attempts per second, you would still need on the order of 10^22 years to brute-force it.
There are many ways to secure a token:
Block the IP after X failed token decodings.
Add a timestamp to your token (hashed or encrypted) so you can revoke it after X days or X hours.
My favourite: use a fast data store such as Memcached, or better, Redis, to store your tokens (a quick sketch follows below).
Like Facebook: generate a token with a timestamp, IP, etc., and encrypt it!
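As a sketch of the Redis option (assuming the Jedis client; the key prefix and the one-hour TTL are made up):

import redis.clients.jedis.Jedis;

public final class TokenStore {
    private final Jedis jedis = new Jedis("localhost", 6379); // assumed local Redis instance

    // Stores the token with a TTL so it revokes itself after an hour.
    public void save(String token, String userId) {
        jedis.setex("token:" + token, 3600, userId);
    }

    // Returns the owner if the token is still valid, or null if it expired or never existed.
    public String lookup(String token) {
        return jedis.get("token:" + token);
    }
}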
My website uses obscure, random URLs to provide some security for sensitive documents. E.g. a URL might be http://example.com/<random 20-char string>. The URLs are not linked to by any other pages, have META tags to opt out of search engine crawling, and have short expiration periods. For top-tier security some of the URLs are also protected by a login prompt, but many are simply protected by the obscure URL. We have decided that this is an acceptable level of security.
We have a lockout mechanism implemented where an IP address will be blocked for some period of time following several invalid URL attempts, to discourage brute-force guessing of URLs.
However, Google Chrome has a feature called "Instant" (enabled in Options -> Basic -> Search), that will prefetch URLs as they are typed into the address bar. This is quickly triggering a lockout, since it attempts to fetch a bunch of invalid URLs, and by the time the user has finished, they are not allowed any more attempts.
Is there any way to opt out of this feature, or ignore HTTP requests that come from it?
Or is this lockout mechanism just stupid and annoying for users without providing any significant protection?
(Truthfully, I don't really understand how this is a helpful feature for Chrome. For search results it can be interesting to see what Google suggests as you type, but what are the odds that a subset of your intended URL will produce a meaningful page? When I have this feature turned on, all I get is a bunch of 404 errors until I've finished typing.)
Without commenting on the objective, I ran into a similar problem (unwanted page loads from Chrome Instant) and discovered that Google does provide a way to avoid it:
When Google Chrome makes the request to your website server, it will send the following header:
X-Purpose: preview
Detect this, and return an HTTP 403 ("Forbidden") status code.
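A sketch of that check as a servlet filter (assuming a Java web stack; the header name and value are exactly as described above):

import java.io.IOException;
import javax.servlet.Filter;
import javax.servlet.FilterChain;
import javax.servlet.FilterConfig;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;

public class PrefetchFilter implements Filter {
    @Override
    public void doFilter(ServletRequest request, ServletResponse response, FilterChain chain)
            throws IOException, ServletException {
        HttpServletRequest req = (HttpServletRequest) request;
        HttpServletResponse resp = (HttpServletResponse) response;

        // Chrome Instant / preview prefetches announce themselves with this header.
        if ("preview".equalsIgnoreCase(req.getHeader("X-Purpose"))) {
            resp.sendError(HttpServletResponse.SC_FORBIDDEN); // 403; do not count it as a failed attempt
            return;
        }
        chain.doFilter(request, response);
    }

    @Override
    public void init(FilterConfig filterConfig) { }

    @Override
    public void destroy() { }
}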
Or is this lockout mechanism just stupid and annoying for users without providing any significant protection?
You've potentially hit the nail on the head there: Security through obscurity is not security.
Instead of trying to "discourage brute-force guessing", use URLs that are actually hard to guess: the obvious example is using a cryptographically secure RNG to generate the "random 20 character string". If you use base64url (or a similar URL-safe base64) you get 64^20 = (2^6)^20 = 2^120 possibilities, i.e. 120 bits of entropy. Not quite 128 (or 160 or 256) bits, so you can make it longer if you want, but also note that the expected bandwidth cost of a correct guess is going to be enormous, so you don't really have to worry until your bandwidth bill becomes huge.
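For instance, a sketch of generating such a string with a CSPRNG (15 random bytes base64url-encode to exactly 20 characters, i.e. 120 bits):

import java.security.SecureRandom;
import java.util.Base64;

public final class SecretUrls {
    private static final SecureRandom RANDOM = new SecureRandom();

    // 15 random bytes = 120 bits of entropy, which base64url-encodes to a 20-character string.
    public static String randomPathSegment() {
        byte[] bytes = new byte[15];
        RANDOM.nextBytes(bytes);
        return Base64.getUrlEncoder().withoutPadding().encodeToString(bytes);
    }
}

The document URL would then be something like "http://example.com/" + randomPathSegment().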
There are some additional ways you might want to protect the links:
Use HTTPS to reduce the potential for eavesdropping (they'll still be unencrypted between SMTP servers, if you e-mail the links)
Don't link to anything, or if you do, link through a HTTPS redirect (last I checked, many web browsers will still send a Referer:, leaking the "secure" URL you were looking at previously). An alternative is to have the initial load set an unguessable secure session cookie and redirect to a new URL which is only valid for that session cookie.
Alternatively, you can alter the "lockout" to still work without compromising usability:
Serve only after a delay. Every time you serve a document or an HTTP 404, increase the delay for that IP (a small sketch of the delay schedule follows after this list). There's probably an easy algorithm to asymptotically approach a rate limit while being more "forgiving" for the first few requests.
For each IP address, only allow one request at a time. When you receive a new request while one is in flight, return an HTTP 5xx on it (503 Service Unavailable is the usual choice for "server too busy").
Even if the initial delays are exponentially increasing (i.e. 1s, 2s, 4s), the "current" delay isn't going to be much greater than the time taken to type in the whole URL. If it takes you 10 seconds to type in a random URL, then another 16 seconds to wait for it to load isn't too bad.
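The delay schedule itself is simple to compute; a sketch (the base delay and the cap are arbitrary choices, not values from the answer):

public final class Backoff {
    private static final long BASE_MILLIS = 1_000L;  // first delay: 1s
    private static final long CAP_MILLIS = 60_000L;  // never wait more than a minute

    // 1s, 2s, 4s, 8s, ... capped so a legitimate user is never locked out outright.
    public static long delayFor(int previousMisses) {
        long delay = BASE_MILLIS << Math.min(previousMisses, 16); // clamp the shift to avoid overflow
        return Math.min(delay, CAP_MILLIS);
    }
}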
Keep in mind that someone who wants to get around your IP-based rate limiting can just rent a (fraction of the bandwidth of a) botnet.
Incidentally, I'm (only slightly) surprised by the view from an Unnamed Australian Software Company that low-entropy randomly-generated passwords are not a problem, because there should be a CAPTCHA in your login system. Funny, since some of those passwords are server-to-server.