Adobe Analytics - Moving from 1st Party cookies back to 3rd Party Cookies for Security Reasons

I work for a bank and security is a major concern. We are currently using a CNAME pointing to Adobe's collection servers (e.g. stats.bank.com) so that Adobe serves first party cookies on the bank.com domain. Our security council now says we shouldn't provide Adobe with a new SSL cert for stats.bank.com because it is too risky: if stats.bank.com is compromised and someone attacks our customers, then we are liable because it is our brand, all the cookie data is exposed, and customers are left open to malware attacks. So we have the following options:
1) Bring reporting in-house
2) Set up a filtering proxy operating as “stats.bank.com” that front-ends the relevant Adobe service
3) Go back to Adobe's 3rd Party solution 2o7.net namespace
4) Use a different 3rd party namespace on Adobe's servers (e.g. stats.bk.com)
Here are our thoughts:
1) Too expensive
2) We thought it was a good solution but then the cost came up. It seems like it would be very costly to build that type of infrastructure due to the volume of calls.
3) Adobe's 3rd party namespace blocked too much.
4) Seems to maybe be a solution but still concerned about 3rd party being blocked.
I was wondering if anyone has had to deal with these types of security concerns and what the solution was. Also, what are the drawbacks of solution #4 in particular?

There is no personally identifiable or personal information at all in Adobe's tracking cookie.
Before I say anything else, based on what you have said, let me just say that I think your security council is either misinformed about Adobe's tracking cookie or else blowing things unrealistically out of proportion.
The visitor id (s_vi) cookie is just that: a cookie that contains a visitor id value. Here is an example of what the cookie value looks like:
[CS]v1|2A933F6C05079103-6000110EA000D3F3[CE]
The value has nothing to do with a visitor's personal information or data or anything like that. It is a randomly generated value that sticks to the visitor for as long as the cookie persists.
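To make that concrete, here is a rough sketch that pulls the example value apart. The pattern is inferred only from the single example shown above, not from Adobe documentation, and the field names high and low are just labels made up for the illustration.

```javascript
// Sketch only: the pattern is guessed from the one example value above.
// "high" and "low" are hypothetical labels, not Adobe terminology.
function parseVisitorIdCookie(value) {
  const match = /^\[CS\]v(\d+)\|([0-9A-F]+)-([0-9A-F]+)\[CE\]$/i.exec(value);
  if (!match) return null; // not in the expected shape
  return { version: Number(match[1]), high: match[2], low: match[3] };
}

const parsed = parseVisitorIdCookie("[CS]v1|2A933F6C05079103-6000110EA000D3F3[CE]");
console.log(parsed);
```

All that comes out is a version marker and two hex strings; there is no field in there that could hold a name, email or account number.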
Cookies that are created for any custom coding you do are NOT the same thing
See, this is where I think some people may be confused. Here is a common scenario to explain: member id tracking. A visitor when they first come to your site is anonymous. They login to your site and now your site knows who they are.
From a tracking perspective, it is common to have a prop and/or eVar that reflects this. So on pages/hits where you don't know the visitor, you wouldn't pop anything, or maybe you'd pop some default "anonymous" or "unknown" or "logged out" value. Then when the visitor logs in, you pop the prop/eVar with a value that your site recognizes as a member or account id.
Maybe this id is their email address. Maybe it's a randomly generated value. Maybe it's a username. Point is, it's something to uniquely identify the visitor within your own site's system.
So let's say you write code where upon login, you pop prop1 with the value and then you decide to make use of Adobe's getAndPersist plugin. This plugin basically takes a value and puts it into a cookie and then retrieves the value each time the plugin is called. The idea here is that you only have to do the work to come up with the value from your end one time and then Omniture will persist it from there. This is particularly useful for when you want a value to pop for each page/hit but may not have easy access to replicate or scope the logic to all areas of your site, particularly across subdomains.
So now you have a cookie set by Adobe Analytics code from this. This has nothing to do with the s_vi cookie at all.
Firstly, it is something you explicitly set, even if it is just to get the ball rolling. Secondly, the value is not stored in the s_vi cookie; it is stored in a separate, 1st party cookie.
Even if you have FPC tracking, it is still set in a separate cookie. The actual cookie name depends on what plugin you are using (or using Adobe's s.c_w cookie write function yourself), and also whether or not you are using the combined cookie plugin (in which case it will be put in s_sess or s_pers, depending on what you set the expiration to be)
Now.. if you do have FPC implemented, you can obviously overwrite that cookie with your own value. And you can obviously make that value whatever you want it to be, including something personal to the visitor. But that's not Adobe's doing; that's your doing.
The overall point here is that whether you make the visitor tracking 1st party or 3rd party, that's a completely separate cookie that has nothing to do with personal data.
You may have custom coding that contains personal data and you may put that data into cookies, even using Adobe Analytics functions, but that is not the same thing. It will always be first party cookies (impossible for js to write 3rd party cookies), and the cookies will always be separate.
Nonetheless, the s_vi visitor id may be used to indirectly get personal data
I'm sure the next thing heard will be something along the lines of "But it doesn't matter, it's a unique id for the visitor, and it's in Adobe, and so is this other data, and you can use the visitor id to find the data within Adobe!"
And this is true. However...
Firstly, in order for there to be personally identifiable data within Adobe Analytics, you have to explicitly put it there. For example, you have to set stuff like:
s.prop1='jon doe'; // name
s.prop2='4321 1111 1111 1111'; // credit card #
s.prop3='04/2020'; // exp date
s.prop4='123'; // security number
I don't think I should have to tell you that this is a supremely bad idea, but point is, this isn't Adobe collecting that info, it is you doing it. And it's not in the s_vi visitor id cookie, nor can it ever be (again, unless you have fpc imp and decide to explicitly overwrite the cookie with those values..).
So that data, along with the visitor id, goes off to Adobe servers. So there's the next road block: getting access to the data within Adobe. The bad guy would have to have an Adobe Analytics user account under your company, and it would have to have proper permissions to gain access to that data.
And even then, Adobe doesn't actually expose the visitor id value in the reports. So in order to get the data associated with a certain visitor id, you need access to data warehouse, or to be listed as a supported user and request raw hit logs from ClientCare.
I guess the overall point here is that all by itself, that visitor id isn't really the dangerous thing. It's not the personal data, and being able to make use of it to find specific data associated with it would involve acts of extreme foolishness about storing personal data on Adobe servers in the first place, as well as gaining access to said servers/interfaces.
All that aside..
Okay, so maybe you don't care about all of that stuff above. Or maybe none of that convinced your security council to budge. You're moving away from Adobe FPC imp and that's all there is to it. So let's talk about the options you listed and your concerns about them.
Bring reporting in-house
You said this is "too expensive." You know, I gotta be honest here.. this is a bit laughable, coming from a bank! But seriously..
Perhaps you thought it too expensive from a building-from-the-ground-up-from scratch perspective? If this is the case, have you considered options for ones that have already been built, that you can put on your own server and customize or build off of from there?
Webtrends offers this. Frankly, I loathe Webtrends as a tracking solution, but it does offer ability to put it on your own server (last I heard, anyways). Also, Piwik is a really good open source solution.
Filtering proxy
I'm not quite sure what you mean by this. This sounds a lot like FPC tracking.. except having a means to scrub all requests of personal data before it goes to Adobe? Well if that is the case, I'd go back to the point about sending personal data to Adobe in the first place. But okay, maybe you aren't doing that, but want to have an extra measure of precaution just in case; fair enough.
So maybe you setup a service on your end that sends all requests to stats.bank.com and it scrubs stuff and maybe even has a mapping of values (like visitor id). In principle, this isn't really a complex script, so again I have to wonder why cost is an issue, especially coming from a bank.. but whatever..
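For what it's worth, the scrubbing part of such a proxy could look roughly like the sketch below. Everything in it is an assumption for illustration: the two patterns, the "redacted" placeholder, the /collect path and the collector.example.net upstream host are made up, and a real proxy would also have to handle POST bodies, headers and cookies.

```javascript
// Hypothetical patterns for values that should never leave the bank.
// They are deliberately crude and will produce false positives
// (any long run of digits looks like a card number here).
const SENSITIVE_PATTERNS = [
  /[^\s@]+@[^\s@]+\.[^\s@]+/, // looks like an email address
  /\b(?:\d[ -]?){13,19}\b/,   // looks like a payment card number
];

// Rewrites a tracking request URL: redacts suspicious query values and
// points the request at the upstream collection host.
function scrubBeaconUrl(rawUrl, upstreamHost) {
  const url = new URL(rawUrl);
  for (const [name, value] of [...url.searchParams]) {
    if (SENSITIVE_PATTERNS.some((re) => re.test(value))) {
      url.searchParams.set(name, "redacted");
    }
  }
  url.host = upstreamHost;
  return url.toString();
}

const scrubbed = scrubBeaconUrl(
  "https://stats.bank.com/collect?c1=jon%40example.com&c2=4321%201111%201111%201111&pageName=home",
  "collector.example.net"
);
console.log(scrubbed);
```

The logic itself is small; the cost the question mentions would come from running it reliably at the volume of calls a bank site generates.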
Sticking with Adobe's 3rd party cookie implementation
If you want to go back to 3rd party cookie tracking using a domain owned by Adobe, instead of using the default 2o7.net domain, I suggest you consider their new(er) 3rd party cookie implementation for Regional Data Collection.
Rolling your own 3rd party cookie implementation
As far as I am aware, Adobe does not offer any kind of service involving you specifying a domain name for them to purchase/own and collect data from as a 3rd party implementation.
The closest service to this is the first party cookie tracking. So, if you have www.bank.com, normally you'd specify something like stats.bank.com (something on the root domain) and that's FPC tracking.
However, you can tell Adobe to use for example stats.someotherdomain.com (assuming you own and control it) and they can implement FPC tracking for that domain. Then, when you implement tracking on www.bank.com, that effectively becomes 3rd party cookie tracking.
The caveat though is that you still own that domain, so I can only assume that on some level, you will still be liable for it (I'm not a lawyer). However, maybe this will be enough to appease your security council, worth bringing it up to them.

I'd add that, under the Adobe General Terms of Service, "customer agrees not to collect, process, or store any Sensitive Personal Data using the on-demand or managed services." Hence, if you are collecting any data that can be traced back to an individual -- e.g., email address or phone number -- you are violating the TOS. Therefore, the response to security concerns can be, "Exposing customer PII is a violation of our terms of service and so we don't do it."

Related

How do services/platforms uniquely identify users?

If you log into a platform (Twitch, Blizzard, Steam, Most Crypto exchanges, Most Banks) from a new device you'll typically get an email stating so.
As far as my knowledge goes, the only information you can get on a request is
IP address
Device Operating system & version
Browser type & version
Are these platforms basing their "unique" users off of this information alone, or is there more information that can be gathered?
From a security perspective the largest thing is your identity or how you authenticate. That's king. The email stating "hey this is a new device" I've seen handled differently from site to site. Most commonly it's actually browser cache and I see banks specifically use browser cache to store these kinds of tokens. Otherwise every time your cellphone connected to a new cell tower you'd likely be flagged as different. They're not necessarily the same as an authentication token, rather it just says hey I've authenticated as this user to this site before. Since it's generated by the service provider, the service provider knows to trust it, and it's nearly impossible to hack (assuming it's implemented correctly).
From my own experience the operating systems and browser types, that's more record keeping than actionable insights, however you could build a security system that takes into account an IP address from very different geo-locations. I.e. why is this guy from the US logging in from China. They just logged in from California 3 hours ago, this is impossible. I don't believe most sites really go to that extent though. I do see MFA providers saying "hey there's a login from china, do you want to approve?". That workflow makes a lot more sense.
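The "logged in from California 3 hours ago" check can be sketched as below. It assumes the IP addresses have already been mapped to coordinates by some geolocation lookup (not shown), and the 1000 km/h threshold is an arbitrary number picked for the example.

```javascript
const EARTH_RADIUS_KM = 6371;
const MAX_SPEED_KMH = 1000; // hypothetical threshold, roughly airliner speed

// Great-circle distance between two { lat, lon } points (haversine formula).
function distanceKm(a, b) {
  const rad = (deg) => (deg * Math.PI) / 180;
  const dLat = rad(b.lat - a.lat);
  const dLon = rad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(rad(a.lat)) * Math.cos(rad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// prev and next are logins: { lat, lon, time } with time in milliseconds.
function isImpossibleTravel(prev, next) {
  const hours = (next.time - prev.time) / 3_600_000;
  if (hours <= 0) return distanceKm(prev, next) > 0;
  return distanceKm(prev, next) / hours > MAX_SPEED_KMH;
}

const california = { lat: 37.77, lon: -122.42, time: 0 };
const beijing = { lat: 39.9, lon: 116.4, time: 3 * 3_600_000 };
console.log(isImpossibleTravel(california, beijing));
```

VPNs and mobile carriers make IP geolocation unreliable, so a result like this is better used to trigger an extra confirmation step than to block the login outright.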
The last part of your question is tricky, regarding "unique users." Most calculate that based off the number of sessions opened (tabs), or in the case of Twitch (since you mentioned them specifically), the number of tabs that are streaming that video in. These open platforms where anyone without an account can stream the content obviously treat this differently than say Netflix that makes you authenticate and each account has a limited number of sessions that can be open.
AFAIK, most systems like this store a cookie in your browser when you log in (not the session cookie, just a random ID) that is also associated with your account in the provider's database. When you come back and log in, they check whether you have that cookie set and whether the ID matches.
They can probably do some more advanced stuff with that ID, like deriving the value from the browser, OS, expiry date and so on.
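A minimal sketch of that device-cookie idea, assuming an in-memory Map in place of the provider's database and a made-up function name checkDeviceOnLogin:

```javascript
const crypto = require("node:crypto");

// Stand-in for the provider's database: account id -> Set of known device ids.
const knownDevices = new Map();

// Called after a successful login. deviceCookie is whatever the browser sent
// in its (hypothetical) device cookie, or undefined if none was set.
function checkDeviceOnLogin(accountId, deviceCookie) {
  const devices = knownDevices.get(accountId) ?? new Set();
  knownDevices.set(accountId, devices);
  if (deviceCookie && devices.has(deviceCookie)) {
    return { isNewDevice: false, deviceId: deviceCookie };
  }
  // Unknown browser: mint a random id, remember it, and let the caller
  // set it as a cookie and send the "new device" email.
  const deviceId = crypto.randomBytes(16).toString("hex");
  devices.add(deviceId);
  return { isNewDevice: true, deviceId };
}

const first = checkDeviceOnLogin("alice", undefined);
const second = checkDeviceOnLogin("alice", first.deviceId);
console.log(first.isNewDevice, second.isNewDevice);
```

Note that the ID only counts when it is paired with the account it was issued for, so copying someone's device cookie does not help with logging in to a different account.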

Is it safe to use a UUID in a URL for semi-private data?

I run a landscaping company and have multiple crews. I want to provide each one with a custom URL (like mysite.com/xxxx-xxxx-xxxx) that shows their daily schedule. Going to the page will list the name, address and phone number of 5-10 customers for the day.
Is it safe/wise to use a UUID in a URL for semi-private data?
Depends on how safe you want it to be.
Are the UUIDs used for anything else? If not, they are fine for creating random URLs.
But, browser history would allow anyone using the same machine to find the URLs. Also, unless using https, a network sniffer could easily see the requested URLs and go to the same page.
Another concern is spider bots. Make sure nothing links to those pages, use a robots.txt to prevent indexing the site, but you still might find that some of the pages show up on search engines. It might be better to have the UUID set in a cookie and check that for determining which employee it is, lest your semi-private pages start showing up on google.
Whether or not that schema would work for you, depends on your threat model (as well as some implementation details). Without a concrete threat model, it is not possible to give a definitive answer to your question.
I can, however, give you some ideas about potential issues with the solution, so you can determine if they are relevant for your application. This is not a complete list.
On the implementation side of things:
Not all UUID generators are created equal. Ideally, you want to use a generator based on a cryptographically secure RNG, providing an UUID where every byte is chosen at random.
Using the UUID for a database lookup or similar operation is not necessarily a constant-time operation (and thus there might be side-channel attacks unless you implement the lookup by yourself)
Make sure your URI does not leak via referrer
Some tools attempt to detect 'secret' URLs to protect them from history synchronization or other automatic features. Your schema will most likely not be detected as 'secret'. It might be better to artificially lengthen your URI and to move your UUID into a query parameter.
You can further reduce attack surface with the usual methods (rate limiting, server hardening, etc.)
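A few of the implementation points above can be sketched in Node.js as follows. The /schedule path, the token parameter name and the function names are made up for the example. crypto.randomUUID() returns a version 4 UUID from a cryptographically secure generator (122 random bits; the remaining 6 are fixed by the format).

```javascript
const crypto = require("node:crypto");

// Random identifier for one crew's schedule page.
function newScheduleToken() {
  return crypto.randomUUID();
}

// Comparison that does not return early on the first differing character.
// Hashing first gives both inputs the same length, which timingSafeEqual requires.
function tokensMatch(presented, expected) {
  const a = crypto.createHash("sha256").update(String(presented)).digest();
  const b = crypto.createHash("sha256").update(String(expected)).digest();
  return crypto.timingSafeEqual(a, b);
}

// Puts the token in a query parameter rather than in the path.
function scheduleUrl(baseUrl, token) {
  const url = new URL("/schedule", baseUrl);
  url.searchParams.set("token", token);
  return url.toString();
}

// Response header that asks browsers not to send the page URL as a referrer.
const RESPONSE_HEADERS = { "Referrer-Policy": "no-referrer" };

const token = newScheduleToken();
console.log(scheduleUrl("https://mysite.com", token), RESPONSE_HEADERS);
```

The constant-time comparison only helps if the token is compared in application code; if the lookup is a database query keyed on the token, the timing behaviour is whatever the database does.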
On the conceptual side of things:
A single identifier for both identification and authentication is not necessarily a bad thing. However, in most cases there is a need for an identification-only identifier – you must not use the 'secret' UUID in those scenarios
If a 'crew' consists of multiple people: you cannot revoke access for a single crew member
Some software (antivirus, browser, etc.) treats information in URLs as public information, and might upload them without user interaction

What are the dangers of storing plaintext data in cookies?

I see many sites that store cookie data as garbled text, for example: a cookie named aASFaewqWDRE#fr with an equally unreadable value. I've always kept my cookies human readable, but never keep critical data within them. For example, I'd make a cookie called favorite_items with a string like so 14,73,7, each number being a reference to something like a product.
If my cookie were to be stolen, the attacker would immediately know that this user had items 14, 73, and 7 in their favorites. This doesn't compromise the users account in any way, as far as I know (assuming that my site is well built and an account can't be accessed with solely this information).
Are there other security concerns with this practice that I haven't thought of?
How do I really know that this question is really from the legitimate user Brian? How do I know that it's not someone trying to trip people up? It would be in the general interest of security (for whatever reason) to encode ('garble') your data, simply because you do not know who or what is monitoring your data. Consider an account with Amazon or a major retailer where a customer's credit card information is on file. If the data is being monitored, it would be very simple for a potential hacker/malicious program to extract the information he needs. He can either get the credit card details directly or simply acquire the username/password combination. When it comes to banks, this becomes extremely important.
But even outside of financial transactions, it is good to encrypt your details to prevent you from being spammed and or to prevent the illegal use of your account - imagine your boss getting an email from you with stuff that he might not like. The list is endless. The bottom line really is that there are a lot of messed up people out there and if you can do something to get that extra level of protection for not much additional cost, then why not?
What are the dangers of storing plaintext data in cookies?
The "danger" is obvious. The user (and potentially others!) can read the information, and potentially "fiddle" with it.
Whether it matters depends what the information is, how you handle the cookies and what you are worried about. For example ...
If you are using the cookie content for implementing your site security / user access control, then passing the information in the clear could give the user some extra knowledge to subvert your scheme ... depending on how you implemented it.
If you are using the cookie content for information that the user might consider sensitive, then passing the "clear" cookies over an HTTP connection makes it vulnerable to some bad guy who can snoop the packets. (Actually, given that HTTPS is "not really as secure as we were led to believe" ... this probably applies across the board!)
If you are using the cookies for tracking the user ... or something else that the user would probably not like you doing ... well, go figure!
But seriously, your question strongly suggests that you need to learn a lot more about how to address security and privacy concerns in website / webtool implementation.
For a start, simply "garbling" the information is insufficient. Any light-weight "garbling" scheme can easily be reverse engineered. If you care about security / privacy, the information should be encrypted using strong encryption with properly handled keys ... or not stored in cookies in the first place. (Read up on schemes for storing session-related information on the server side.)
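As a rough illustration of "strong encryption" for a cookie value, here is a sketch using AES-256-GCM from Node's crypto module. The choice of AES-256-GCM, the function names and the byte layout (IV, then auth tag, then ciphertext) are my assumptions, not something the question or answer specifies, and the key is generated on the spot only so the example runs.

```javascript
const crypto = require("node:crypto");

// Assumption: a real deployment loads this key from a secrets store
// and keeps it stable across restarts.
const key = crypto.randomBytes(32);

// Encrypts a cookie value; the output is safe to put in a Set-Cookie header.
function sealCookie(plaintext) {
  const iv = crypto.randomBytes(12); // fresh nonce for every cookie
  const cipher = crypto.createCipheriv("aes-256-gcm", key, iv);
  const body = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return Buffer.concat([iv, cipher.getAuthTag(), body]).toString("base64url");
}

// Decrypts a sealed cookie; throws if the value was modified in any way.
function openCookie(sealed) {
  const raw = Buffer.from(sealed, "base64url");
  const decipher = crypto.createDecipheriv("aes-256-gcm", key, raw.subarray(0, 12));
  decipher.setAuthTag(raw.subarray(12, 28));
  return Buffer.concat([decipher.update(raw.subarray(28)), decipher.final()]).toString("utf8");
}

const sealed = sealCookie("14,73,7");
console.log(sealed, openCookie(sealed));
```

Because GCM is authenticated, this covers both concerns at once: the user cannot read the value and cannot "fiddle" with it without the server noticing.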

Possible solutions for keeping track of anonymous users

I'm currently developing a web application that has one feature which allows input from anonymous users (no authorization required). I realize that this may carry security risks such as repeated arbitrary inputs (e.g. spam), or users posting malicious content. So to remedy this I'm trying to create a sort of system that keeps track of what each anonymous user has posted.
So far all I can think of is tracking by IP, but it seems as though it may not be viable due to dynamic IPs, are there any other solutions for anonymous user tracking?
I would recommend requiring them to answer a captcha before posting, or after an unusual number of posts from a single IP address.
"A CAPTCHA is a program that protects websites against bots by generating and grading tests that humans can pass but current computer programs cannot. For example, humans can read distorted text as the one shown below, but current computer programs can't"
That way the spammers are actual humans. That will slow the firehose to a level where you can weed out any that does get through.
http://www.captcha.net/
There's two main ways: clientside and serverside. Tracking IP is all that I can think of serverside; clientside there's more accurate options, but they are all under user's control, and he can reanonymise himself (it's his machine, after all): cookies and storage come to mind.
Drop a cookie with an ID on it. Sure, cookies can be deleted, but this at least gives you something.
My suggestion is:
Use cookies for tracking of user identity. As you yourself have said, due to dynamic IP addresses, you can't reliably use them for tracking user identity.
To detect and curb spam, use IP + user browser agent combination.
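A small sketch of the IP + user agent idea, using a sliding window held in memory. The window length, the limit and the allowPost name are placeholders picked for the example.

```javascript
const WINDOW_MS = 60_000; // hypothetical: look at the last minute
const MAX_POSTS = 5;      // hypothetical: posts allowed per window

// key "ip|userAgent" -> timestamps of recent posts
const recentPosts = new Map();

// Returns true if the post may go through, false if it should be
// rejected (or answered with a captcha challenge).
function allowPost(ip, userAgent, now = Date.now()) {
  const key = `${ip}|${userAgent}`;
  const fresh = (recentPosts.get(key) ?? []).filter((t) => now - t < WINDOW_MS);
  if (fresh.length >= MAX_POSTS) {
    recentPosts.set(key, fresh);
    return false;
  }
  fresh.push(now);
  recentPosts.set(key, fresh);
  return true;
}

for (let i = 0; i < 6; i++) {
  console.log(allowPost("203.0.113.7", "ExampleBrowser/1.0", 1000 + i));
}
```

The user agent is trivially changed by a determined spammer, so this is a speed bump rather than an identity; the cookie ID stays the better signal for "who posted what".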

What is a simple and secure way to transmit a login key from one website to another while redirecting a user?

I want to create a portal website for log-in, news and user management. And another web site for a web app that the portal redirects to after login.
One of my goals is to be able to host the portal and web-app on different servers. The portal would transmit the user's id to the web-app, once the user had successfully logged in and been redirected to the web app. But I don't want people to be able to just bypass the login, or access other users accounts, by transmitting user ids straight to the web app.
My first thought is to transmit the user id encrypted as a post variable or query string value, using some kind of public/private key scenario, and adding a DateTime stamp to the key to make it vary every time.
But I haven't done this kind of thing before, so I'm wondering if there aren't better ways to do this.
(I could potentially communicate via database, by having the portal store the user id with a key in a database and passing that key to the web app which uses it to get the user id from that database. But that seems crazy.)
Can anyone suggest a way to do this, or offer advice? Or is this a bad idea altogether?
Thanks for your time.
Basically, you are asking for a single-sign-on solution. What you describe sounds a lot like SAML, although SAML is a bit more advanced ;-)
It depends on how secure you want this entire thing to be. Generating an encrypted token with an embedded timestamp still leaves you open to spoofing: if somebody steals the token (e.g. through network sniffing), he will be able to submit his own request with the stolen token. Depending on the time to live you give your token, that window can be limited, but a determined hacker will still be able to do this. Besides, you cannot make the time to live too small, or you will be rejecting valid requests.
Another approach is to generate "use once" tokens. This is 'bullet proof' in terms of spoofing, but it requires coordination among all the servers within the server farm servicing your app, so that if one of them processed the token the other ones would reject it.
To make it really secure for the failover scenarios, etc. it would require some additional steps, so it all boils down to how secure you need it to be and how much you want to invest in building it up
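A bare-bones sketch of the "use once" token idea. The Map stands in for whatever store all servers in the farm share (a database or cache), and the 30 second lifetime and function names are made up for the example.

```javascript
const crypto = require("node:crypto");

const TOKEN_TTL_MS = 30_000; // hypothetical lifetime

// Stand-in for a store shared by the portal and every web-app server.
const pendingTokens = new Map(); // token -> { userId, expiresAt }

// Portal side: called after a successful login, just before the redirect.
function issueToken(userId, now = Date.now()) {
  const token = crypto.randomBytes(32).toString("base64url");
  pendingTokens.set(token, { userId, expiresAt: now + TOKEN_TTL_MS });
  return token;
}

// Web-app side: returns the user id, or null if the token is unknown,
// already used, or expired.
function redeemToken(token, now = Date.now()) {
  const entry = pendingTokens.get(token);
  if (!entry) return null;
  pendingTokens.delete(token); // a token is gone after the first attempt
  return now <= entry.expiresAt ? entry.userId : null;
}

const loginToken = issueToken("user-42", 0);
console.log(redeemToken(loginToken, 1000), redeemToken(loginToken, 2000));
```

With a real shared store, the lookup and delete have to happen atomically, otherwise two servers could both accept the same token; that is the coordination cost mentioned above.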
I suggest looking at SAML
PGP would work but it might get slow on a high-traffic site
One thing I've done in the past is use a shared secret method: take some token that only I and the other website operator know, concatenate it to something identifying the user (like their user name), then hash that with a hash algorithm such as SHA256 (you can use MD5 or SHA1, which are usually more widely available, but they are much easier to break).
The other end should do the same thing as above. Take the passed identifying information and checksum it. Compare that to the passed checksum, if they match the login is valid.
For added security you could also concat the date or some other rotating key. Helps to run SSL on both sides as well.
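Here is a sketch of the shared secret approach with the timestamp added. One deliberate change from the description above: instead of hashing secret + user name with plain SHA256, it uses HMAC-SHA256, which is the standard keyed construction for this purpose. The secret value, the 60 second lifetime and the function names are placeholders.

```javascript
const crypto = require("node:crypto");

const SHARED_SECRET = "replace-with-a-long-random-secret"; // placeholder value
const MAX_AGE_MS = 60_000; // hypothetical lifetime of a signed login

// Portal side: produce the values to pass along with the redirect.
function signLogin(userName, now = Date.now()) {
  const mac = crypto
    .createHmac("sha256", SHARED_SECRET)
    .update(`${userName}|${now}`)
    .digest("hex");
  return { userName, issuedAt: now, mac };
}

// Web-app side: recompute the MAC and compare it with the one received.
function verifyLogin({ userName, issuedAt, mac }, now = Date.now()) {
  const expected = crypto
    .createHmac("sha256", SHARED_SECRET)
    .update(`${userName}|${issuedAt}`)
    .digest();
  const given = Buffer.from(String(mac), "hex");
  const fresh = now - issuedAt >= 0 && now - issuedAt <= MAX_AGE_MS;
  return fresh && given.length === expected.length && crypto.timingSafeEqual(given, expected);
}

const signed = signLogin("jdoe", 1_000);
console.log(verifyLogin(signed, 2_000));
```

A stolen value can still be replayed until it expires, so this is best combined with SSL on both sides, as noted above.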
In general, the answer resides somewhere in SHA256 / MD5 / SHA1 plus shared secret based on human actually has to think. If there is money somewhere, we may assume there are no limits to what some persons will do - I ran with [ a person ] in High School for a few months to observe what those ilks will do in practice. After a few months, I learned not to be running with those kind. Tediously avoiding work, suddenly at 4 AM on Saturday Morning the level of effort and analytical functioning could only be described as "Expertise" ( note capitalization ) There has to be a solution else sites like Google and this one would not stand the chance of a dandelion in lightning bolt.
There is a study in the mathematical works of cryptography whereby an institution ( with reputable goals ) can issue information - digital cash - that can exist on the open wire but does not reveal any information. Who would break them? My experience with [ person ] shows that it is a study in socialization, depends on who you want to run with. What's the defense against sniffers if the code is already available more easily just using a browser?
<input type="hidden" value="myreallysecretid">
vis a vis
<input type="hidden" value="weoi938389wiwdfu0789we394">
So which one is valuable against attack? Neither, if someone wants to snag some Snake Oil from you, maybe you get the 2:59 am phone call that begins: "I'm an investor, we sunk thousands into your website. I just got a call from our security pro ....." all you can do to prepare for that moment is use established, known tools like SHA - of which the 256 variety is the acknowledged "next thing" - and have trace controls such that the security pro can put in on insurance and bonding.
Let alone trying to find one who knows how those tools work, their first line of defense is not talking to you ... then they have their own literature - they will want you to use their tools.
Then you don't get to code anything.
