Storing partial credit card numbers - payment

Possible Duplicates:
Best practices for taking and storing credit card information with PHP
Storing credit card details
Storing Credit Card Information
I need to store credit card numbers within an e-commerce site. I don't intend on storing the whole credit card number, as this would be highly risky. I would like to store at least the first five digits so I can later identify the financial institution that issued the card. Ideally, I would like to store as much of the credit number as I safely can, to aid any future cross-referencing etc.
How many digits, and which particular digits, can I safely store?
For example, I imagine this would not be safe enough:
5555 5555 555* 4444
Because you could calculate the missing digit.
Similarly, this would be safe, but not be as useful:
5555 5*** **** ****
Is there a well accepted pattern for storing partial credit numbers?

The Payment Card Data Security Standard states that if you are handling cardholder data, then you are subject to the constraints of the PCI DSS (which is very comprehensive and a challenge to comply with). If you want to store part of a card number, and don't want to have to deal with the Standard, then you need to make sure that a) you store NO MORE THAN the first 6 and last 4 digits; b) you don't ever store, process or transmit more than this. That means that the truncation has to be carried out before the data enters your control.
Given that you are talking about an ecommerce site, I think you'll have to deal with the PCI DSS sooner or later (since if you're not taking full PANs, you can't process transactions). Realistically, then, you should avoid storing more than the first 6 and last 4 digits of a PAN; the Standard then does not 'care' about this data, and you can store it in whatever form you see fit. If you store, say, the first 7 digits, then Requirement 3 of the Standard kicks in (and you start having to really understand key management in encryption).
I hope that this is of use.

March 2013 Edit:
A very pertinent resource is the PCI Security Standards Council, an organisation founded in 2006 by five of the biggest global Credit Card brands (AmEx, Visa, MasterCard, JCB International and Discovery) and which is the de facto authority on Security matters for the Payment Card Industry (PCI).
This organization publishes in particular the PCI Data Security Standard, currently in its version 2.0 edition which covers issues such as the management of complete or partial credit card numbers. This document if freely available but requires a simple registration and acknowledgment of license terms.
The following is the original, c. 2009 answer, mostly correct but apocryphal.
A common practice (whether legal or not I do not know) is to store the last 4 digits, as this may be used to help the customer confirm which of his/her credit cards were used for a particular transaction.
Without significantly improving the odds of a malicious person guessing the complete number, one can store the first 4 digits which are representative of the financial institution which issued the card, as mentioned in the question.
Do NOT, save many more digits than these 8 digits because otherwise, given the LUHN-10 checksum, you may provide enough info to make guessing the complete number more plausible (if still relatively hard, even with insight from the series used by a given issuer, in a given time period, but one should be careful...)
To make this whole thing safer, technically and legally, you may consider only storing such info if the customer explicitly allows it. You should also consider masking this info with a simple hash for storing in the database.
Also, what you can / should store following a particular transaction, is the transaction ID supplied by the Credit Card Processor, at the time the transacton is submitted. This ID is the key that allows locating most (all?) of the info you would even need, would there be any issue with a particular transaction. This type of info can typically be queried from a secure web site maintained by the Processing company, along with some aggregate reports which may include a grouping by card-type (Amex, Visa...) if that is why you are thinking of storing the first four.

If you don't need to store the whole credit card number, why do you need to store it at all? If you want to save the financial institution that issued the card, why don't you store the financial institution that issued the card?

Your specific question is answered in sec 3.3 of the PCI/DSS document.
First six and last four are max for display. Customer (paper?) receipts are more restrictive. Those with a legitimiate need to know can see full card data.
My recommendation is to contact your merchant provider and see what options are available to you. A number of the modern transaction gateways have "vault" features where sensitive information is stored at the provider and you simply reference customers by a token number when you want to bill them or check account information.
Along the same lines use of transaction specific tokens can be used to reference needed data stored on the providers system.
However I can't stress enough the importance of reading and understanding PCI DSS. Simply punting secure storage does not magically obsolve you from being subject to PCI compliance requirements!! This is only possible when your system never touches full card data.

The accepted pattern is don't store them at all.
In certain jurisdictions you may be breaking the law by storing them or any part of them.
You could instead, store a one-way (and therefore unrecoverable) hash of the credit card number.

The credit card companies have a standard for this. You'll probably find it buried somewhere in the terms of service of your payment processor that you will obey this standard. It answers you questions. You can find the standard here

Here in Canada, the usual way is to store the first 4 digit ( to identify the financial institution) and the 4 last digit to identify the credit card.
But be sure that you didn't break any laws.

Related

UML Exercise ATM Use-case diagram

I have an assignment in which the following conditions are set forth:
Brief Statement of Purpose:
An ATM is an electronic device designed for automated dispensing of money. A user can withdraw money quickly and easily after authorization. The user interacts with the system through a card reader and a numerical keypad. A small display screen allows messages and information to be displayed to the user. Bank members can access special functions such as ordering a statement
Brief Summary of Requirements: The ATM is required 1. To allow authorized card holders to make transactions
Card holders shall view and/or print account balances
Card holder shall make cash withdrawals
Card holder shall make cash or check deposits
Card holder shall quit session
To allow bank members to access additional, special services
A bank member shall be able to order a statement
A bank member shall be able to change security details (e.g. PIN number)
To allow access to authorized bank staff
Authorized staff can gain access to re-stock the machine
Authorized staff are able to carry out routine servicing and maintenance
To keep track of how much money it contains and alert bank staff when stocks are getting low
Additional Notes:
Users shall be able to access the ATM by punching in their account number and PIN. Once the system has verified that the account is active and the PIN matches with the account number, the system offers the users four choices. Users can withdraw money, deposit money, check balance or quit the session.
The user must have a minimum of $100 in his / her account. At the end of any transaction a printed copy of the transaction is provided to the user. A transaction could be - withdraw money, deposit money or check balance. Once the user has completed a transaction, the system offers the user the same four choices, until the user decides to quit.
The system shall interface with the device to dispense cash, the device to accept cash or check and the printer. Since we have not studied databases in this course, the system will keep all the information in two RandomAccess files. One file will hold the passwords and the other account balances.
I have built the following use-case diagram but am confused about how detailed it's supposed to be and what should be an extension/inclusion and what should just be a base case. Any feedback would be welcomed.
Should bank members and card holders be separate or together? Technically bank members can do more than a card holder like update security details or order statements, but aren't all bank members card holders?
Here is the other version I have, minus quit not being a use case, are there other factors that are incorrect?
Here are a couple of observations:
As commented, Quit is no use case as it does not add any value to the actor. Rather it looks like post-conditions for other use cases.
Generalizing use cases is a bad idea (although allowed per UML). A use case shows a single unique piece of added value for a actor using the system under consideration. In that context grouping serveral use cases under the hat of a "common" use case is not helpful. Rather connect the specializations directly with the actor and remove that Transaction.
Instead of duplicating the Bank Members's associations to Quit/Transaction you can draw a generalization to Auth Card Holders.
Better use the singular for an actor. It's a general role name and per se covers any amount of real persons/machines.
A part of the requirement/description are scenarios which go into use cases. It's common mistake to try to expose these in functional decomposition in use cases.
Your attempt is not bad. But creating use cases from requirements is not easy and needs a lot of experience. So (like always) I recommend to read Bittner/Spence about use cases. (Do not read the UML specs to get an idea of UC synthesis. They are at best able to give the syntax how to use bubbles and stick men.)
As commented by www.admiraalit.nl there "might be" applications for generalizing use cases and you can discuss that controversially. It's my own preference to not use them since using it wrongly is more easy than using it right. The same goes for export/import. Avoid it as long as you don't know exactly what it's good for. You tend to starting functional decomposition which is not what UCs are good for.

Is usage analysis based on HyperLogLog compliant with GDPR?

Context: we have telemetry system for our service and would like to track retention, how many users use various features, etc.
There are two options to deal with user identifiable information and be GDPR compliant:
Support deleting user information based on request
Keep data for less than 30 days
Option #1 is hard to implement (for telemetry system). Option #2 doesn't allow answering questions such as "what is 6-month retention for feature X?".
One idea how to get answers for above question is to calculate HyperLogLog blobs per feature every week/day and store them separately forever. This will allow moving forward to merge/dcount/calculate retention based on these blobs.
Assuming that any user identifiable information is gone after 30 days (after user account gets deleted), will HyperLogLog blobs still allow to track users or not (i.e. to answer whether a particular user used feature X two years ago)?
If it allows then it is not compliant (doesn't mean that it is compliant if it doesn't allow).
In general HLLs are not GDPR compliant. This issue was somewhat addressed in a recent Google paper (see Section 8: 'Mitigation strategies').
The hash function used in HLL are usually not cryptographically secure (usually MurmurHash), hence even with salting you might still be able to answer the question "is a user part of a HLL data structure or not" and that's a no no.
Afaik you would be in compliance if you keep HLLs around for longer than 30 days iff you apply a salted crypto hash prior to HLL aggregation (i.e. a salted SHA-2 or BLAKE2b, BLAKE3) and you destroy the salt after each <30 day period. This would allow you to keep <30 day intervals. You would not be able to merge HLLs over several intervals but only over 28 day chunks, but that can still be super valuable dependent on your business needs.

RFID Limitations

my graduate project is about Smart Attendance System for University using RFID.
What if one student have multiple cards (cheating) and he want to attend his friend as well? The situation here my system will not understand the human adulteration and it will attend the detected RFID Tags by the reader and the result is it will attend both students and it will store them in the database.
I am facing this problem from begging and it is a huge glitch in my system.
I need a solution or any idea for this problem and it can be implemented in the code or in the real live to identify the humans.
There are a few ways you could do this depending upon your dedication, the exact tech available to you, and the consistency of the environment you are working with. Here are the first two that come to mind:
1) Create a grid of reader antennae on the ceiling of your room and use signal response times to the three nearest readers to get a decent level of confidence as to where the student tag is. If two tags register as being too close, display the associated names for the professor to call out and confirm presence. This solution will be highly dependent upon the precision of your equipment and stability of temperature/humidity in the room (and possibly other things like liquid and metal presence).
2) Similar to the first solution, but a little different. Some readers and tags (Impinj R2000 and Indy Readers, Impinj Monza 5+ for sure, maybe others aswell) have the ability to report a response time and a phase angle associated with the signal received from an interrogated tag. Using a set up similar to the first, you can get a much higher level of reliability and precision if you use this method.
Your software could randomly pick a few names of attending people, so that the professor can ask them to identify themselves. This will not eliminate the possibility of cheating, but increase the risk of beeing caught.
Other idea: count the number of attendiees (either by the prof or by camera + SW) and compare that to the number of RfID tags visible.
There is no solution for this RFID limitation.
But if you could then you can use Biometric(fingerprint) recognition facility with RFID card. With this in your system you have to:
Integrate biometric scanner with your RFID reader
Store biometric data in your card
and while making attendance :
Read UID
Scan biometric by student
Match scanned biometric with your stored biometric(in the card :
step 2)
Make attendance (present if biometric matched, absent if no match)
Well, We all have that glitch, and you can do nothing about it, but with the help of a camera system, i think it would minimise this glitch.
why use a camera system and not a biometric fingerprint system? lets re-phrase the question, why use RFID if there is biometric fingerprint system ? ;)
what is ideal to use, is an RFID middleware that handle the tag reading.
once the reader detects a tag, the middleware simply call the security camera system and request for a snapshot, and store it in the db. I'm using an RFID middleware called Envoy.

How do netbank login dongles work?

This is a question purely to satisfy my own curiosity.
Here in Norway it's common for netbanks to use a calculator-like (physical) dongle that all account holders have. You type your personal pin in the dongle and it generates an eight-digit code you can use to login online. The device itself is not connected to the net.
Anyone knows how this system works?
My best guess is that each dongle has a pregenerated sequence of numbers stored. So the login process will fail if you type an already used number or a number that is too far into the future. It probably also relies on an internal clock to generate the numbers. So far none of my programmer peers have been able to answer this question.
[Edit]
In particular I'm curious about how it's done here in Norway.
Take a look here: http://en.wikipedia.org/wiki/Security_token. If you are interested in the algorithms, these might be interesting: http://en.wikipedia.org/wiki/Hash_chain and http://en.wikipedia.org/wiki/HMAC.
TOKENs have very accurate real-time clock, and it is synced with same clock on the auth server. Real time is used as a seed along with your private key and your unique number is generated and verified on the server, that has all the required data.
One major one-time password system is Chip and PIN, in which bank cards are inserted into special, standalone card readers that accept a PIN and output another number as you describe. It is widely deployed in the UK.
Each bank card is a smart card. The card's circuitry is what checks the PIN and generates the one-time password. Cryptographic algorithms that such cards can use include DES, 3DES (Triple DES), RSA, and SHA1.
I recently went overseas and used the dongle there with no problems.
It is a sealed battery powered dongle. One pushes the button and a code number appears.
The only way it could work is that it is time synchronised to the bank.The number that is recruited only lasts for a minute if that.
A random number generator is used to create the stream of numbers recorded in the memory of the device.
It therefore becomes unique for the user and only the bank 'knows' what that random number generator produced for that particular user and dongle.
So there can only be one next number .
If the user makes a mistake, the bank 'knows' they are genuine because the next try is the next sequential number that is in the memory.
If the dongle is stolen the thief also has to have the other login details to reach the account.

Best Practices / Patterns for Enterprise Protection/Remediation of SSNs (Social Security Numbers)

I am interested in hearing about enterprise solutions for SSN handling. (I looked pretty hard for any pre-existing post on SO, including reviewing the terriffic SO automated "Related Questions" list, and did not find anything, so hopefully this is not a repeat.)
First, I think it is important to enumerate the reasons systems/databases use SSNs: (note—these are reasons for de facto current state—I understand that many of them are not good reasons)
Required for Interaction with External Entities. This is the most valid case—where external entities your system interfaces with require an SSN. This would typically be government, tax and financial.
SSN is used to ensure system-wide uniqueness.
SSN has become the default foreign key used internally within the enterprise, to perform cross-system joins.
SSN is used for user authentication (e.g., log-on)
The enterprise solution that seems optimum to me is to create a single SSN repository that is accessed by all applications needing to look up SSN info. This repository substitutes a globally unique, random 9-digit number (ASN) for the true SSN. I see many benefits to this approach. First of all, it is obviously highly backwards-compatible—all your systems "just" have to go through a major, synchronized, one-time data-cleansing exercise, where they replace the real SSN with the alternate ASN. Also, it is centralized, so it minimizes the scope for inspection and compliance. (Obviously, as a negative, it also creates a single point of failure.)
This approach would solve issues 2 and 3, without ever requiring lookups to get the real SSN.
For issue #1, authorized systems could provide an ASN, and be returned the real SSN. This would of course be done over secure connections, and the requesting systems would never persist the full SSN. Also, if the requesting system only needs the last 4 digits of the SSN, then that is all that would ever be passed.
Issue #4 could be handled the same way as issue #1, though obviously the best thing would be to move away from having users supply an SSN for log-on.
There are a couple of papers on this:
UC Berkely
Oracle Vault
I have found a trove of great information at the Securosis site/blog. In particular, this white paper does a great job of summarizing, comparing and contrasting database encryption and tokenization. It is more focused on the credit card (PCI) industry, but it is also helpful for my SSN purpose.
It should be noted that SSNs are PII, but are not private. SSNs are public information that be easily acquired from numerous sources even online. That said if SSNs are the basis of your DB primary key you have a severe security problem in your logic. If this problem is evident at a large enterprise then I would stop what you are doing and recommend a massive data migration RIGHT NOW.
As far as protection goes SSNs are PII that is both unique and small in payload, so I would protect that form of data no differently than a password for one time authentication. The last four of a SSNs is frequently used for verification or non-unique identification as it is highly unique when coupled with another data attribute and is not PII on its own. That said the last four of a SSN can be replicated in your DB for open alternative use.
I have come across a company, Voltage, that supplies a product which performs "format preserving encryption" (FPE). This substitutes an arbitrary, reversibly encrypted 9-digit number for the real SSN (in the example of SSN). Just in the early stages of looking into their technical marketing collateral...

Resources