Tokenization — Is It the Right Strategy to Protect Your PII?
If you’re old enough, can you remember back to those endless days spent stuffing tokens into Mortal Kombat and Street Fighter 2 at the arcade?
Arcades used tokens in place of money, with the intention of keeping otherwise bored kids hooked while reducing the chances of employees pocketing loose change. The idea behind tokens is that they replace something valuable, i.e. the money used to buy them in the first place, with something that is not inherently valuable, i.e. a similarly sized metal disk, that nonetheless stands in for the full value of the original. The token at the arcade is basically a reference to the original money used to buy it.
For similar reasons, tokenization is common practice in securing digital data.
Tokenization replaces sensitive items, i.e. the data organizations hold, with items that are not sensitive: random strings of letters, numbers, and symbols. The new token stands in for the original data and acts as a map back to it, rather than carrying the data itself. And since the token is just random information, there’s no point in stealing it, because it has no value of its own. The goal is to reduce the amount of sensitive data that can be exposed in breaches and leaks.
How Does Tokenization Work?
Tokenization maps a sensitive data element, like an account number, credit card number, or email address, to a substitute value that cannot be used to derive the original data on its own. Often, the format or structure of the data needs to be preserved. For example, a token email address needs to “look” like an email address, including the @ sign and a domain. Similarly, a credit card number token has to be in the right format to be stored, but is not the actual credit card number associated with an individual. In all cases, the only way to recover the original data is through the tokenization manager, where the original mapping is kept.
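To make this concrete, here’s a minimal Python sketch of format-preserving tokenization. It’s an illustration under loose assumptions, not a production design: the in-memory dictionary standing in for the vault and the token.example domain are both hypothetical, and a real tokenization manager would keep the mapping in a hardened, centrally governed service.

```python
import secrets
import string

# Hypothetical in-memory "vault" for illustration only; a real tokenization
# manager keeps this mapping in a hardened, access-controlled service.
_vault = {}

def tokenize_email(email: str) -> str:
    """Replace an email address with a random token that still 'looks'
    like an email, keeping the @ sign and a domain."""
    local = "".join(secrets.choice(string.ascii_lowercase + string.digits)
                    for _ in range(12))
    token = f"{local}@token.example"  # token.example is a placeholder domain
    _vault[token] = email             # only the vault knows the mapping
    return token

def detokenize(token: str) -> str:
    """Recover the original value; possible only through the vault."""
    return _vault[token]

tok = tokenize_email("jane.doe@example.com")
print(tok)               # e.g. 'k3x9q2m1ab7f@token.example'
print(detokenize(tok))   # 'jane.doe@example.com'
```

Notice that the token reveals nothing about the original address; the vault lookup is the only path back.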
Tokenization vs. Encryption
This all may sound reminiscent of encryption, and in truth the two are similar, but tokenization can present less risk. Encryption, which has long been considered the primary method of protecting data, uses a key and an algorithm to alter the data. Its security relies on the strength of the algorithm, on the protection and distribution of the keys, and on the implementation, which can be complicated and error-prone for anyone other than deep experts. By design, encryption is reversible, so data theft and data exposure concerns remain. Encryption is also often very binary; since most algorithms inherently destroy the format and structure of data, ALL of the data has to be encrypted or decrypted for use, which carries high overhead and impacts processing.
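To illustrate the reversibility point, here’s a small sketch using the third-party Python cryptography package (pip install cryptography) and its Fernet recipe; any symmetric scheme would make the same point. Whoever obtains the key can recover the plaintext, and the ciphertext no longer looks anything like the original data.

```python
# Assumes the third-party 'cryptography' package: pip install cryptography
from cryptography.fernet import Fernet

key = Fernet.generate_key()   # anyone holding this key can decrypt
f = Fernet(key)

ciphertext = f.encrypt(b"4111 1111 1111 1111")
print(ciphertext)             # opaque bytes: the original format is destroyed

# Reversible by design: possession of the key recovers the plaintext.
print(f.decrypt(ciphertext))  # b'4111 1111 1111 1111'
```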
Tokenization, on the other hand, uses a different model. The notion of protecting and accessing data is defined centrally, and each token is mapped to a random equivalent known only to the token-mapping system, which is heavily monitored and defended. Each access to data is governed, and there’s no risk of keys being exposed, because the token-to-value mapping is never exposed or accessed outside the tokenization system. The token itself is simply a randomly generated string of characters and cannot be reversed to get at the original data.
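As a sketch of that model, here’s a toy vault in Python where every detokenization request is logged. The TokenVault class, its caller parameter, and the logging are illustrative assumptions about how governed access might look, not any real product’s API.

```python
import logging
import secrets

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("token-vault")

class TokenVault:
    """Hypothetical central token-mapping system. Tokens are random, so
    there is no key to steal: the mapping itself is the only way back."""

    def __init__(self) -> None:
        self._mapping = {}  # token -> original value; never leaves the vault

    def tokenize(self, value: str) -> str:
        token = secrets.token_urlsafe(16)  # random, not derived from value
        self._mapping[token] = value
        return token

    def detokenize(self, token: str, caller: str) -> str:
        # Every access is governed and audited in this sketch.
        log.info("detokenize requested by %s", caller)
        return self._mapping[token]

vault = TokenVault()
t = vault.tokenize("4111 1111 1111 1111")
print(t)  # random string with no relationship to the card number
print(vault.detokenize(t, caller="billing-service"))
```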
Tokenization for Data Privacy
When it comes to adhering to privacy regulations, as we’ve mentioned in the past, there are two types of organizations: those that simply want to check compliance off their list, and those that understand that privacy, when taken to heart, can become a catalyst for organizational growth and for solidifying relationships with customers.
Every day, customers entrust organizations with their personal data, or PII. And though encryption has been the de facto standard for data storage for years and has many compelling use cases, such as protecting data in transmission, it has proven to be less than foolproof. This is evident from the multiple breaches of organizations whose data was, in fact, encrypted at the time of the breach. For example, retail giant Target was certified as PCI compliant shortly before it was hacked. Encrypted debit card PIN data was reportedly found for sale on the dark web in the aftermath of the incident, and although the company stated that the data was safe because it was ciphertext, as we’ve seen, when the price is right, attackers will invest in reversing encryption.
This is why organizations are beginning to see the benefits of tokenization. Stolen, lost, leaked, or breached tokenized data has no value and is therefore not a liability. Moreover, tokenizing data reduces the scope of your sensitive data, thereby reducing the effort and cost required to protect it. This is also why it’s important to tokenize PII as early in its lifecycle as possible: it shrinks the chances of exposure as the data inevitably gets copied around the enterprise. Tokenization takes securing sensitive data beyond the level of “what’s necessary for compliance”; it means the organization understands that the risks of relying on encryption alone are too great.
As the amount of PII organizations hold continues to pile up like discarded cans of Crystal Pepsi after a post-arcade binge, how we store all that data is becoming an all-important consideration. Though still in its relative infancy, tokenization is proving to be an effective way to simplify compliance and minimize the risk of data breaches.
And don’t forget: you can’t tokenize data you don’t know about. For info on the best way to do sensitive data discovery, across both structured and unstructured data, at rest and in motion, let’s set up a time to talk.