Part 1: On the Convergence of Data Privacy and Data Security

data privacy security

If you’re fairly new to this ‘privacy stuff’, you might be wondering why I used the phrase ‘data privacy’, not ‘data protection’. Well, unlike the security industry where we can’t even agree on when to use ‘cybersecurity’, ‘data security’, or ‘information security’, the privacy world has its act together. Hell, security folk can’t even agree […]

Part 5: Machine Learning Methods to Process Datasets With QI Values

sensitive data, machine learning, qi values, privacy, security

Differential Privacy (DP): This mathematical framework makes it possible to control the extent to which the model ‘remembers’ and ‘forgets’ potentially sensitive data, which is its big advantage. The most popular concept in DP is ‘noisy counting’, which draws samples from the Laplace distribution and uses them to make the dataset represent augmented (noise-perturbed) values […]
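
To make ‘noisy counting’ concrete, here is a minimal sketch of a counting query with Laplace noise. The function names, the `epsilon` parameter and the sample data are illustrative assumptions, not taken from the article. The sketch relies on two standard facts: a counting query changes by at most 1 when one record is added or removed, and the difference of two exponential samples follows a Laplace distribution.

```python
import random

def laplace_sample(scale, rng):
    # The difference of two i.i.d. exponential samples with mean `scale`
    # is distributed as Laplace(0, scale).
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)

def noisy_count(records, predicate, epsilon, rng=None):
    # Hypothetical helper: count matching records, then add Laplace noise.
    rng = rng or random.Random()
    true_count = sum(1 for r in records if predicate(r))
    # A count has sensitivity 1, so the noise scale is 1 / epsilon.
    return true_count + laplace_sample(1.0 / epsilon, rng)

ages = [23, 35, 41, 52, 67, 29]  # made-up example data
print(noisy_count(ages, lambda a: a >= 40, epsilon=1.0))
```

A smaller `epsilon` means more noise and therefore more privacy, at the cost of a less accurate count.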

Part 4: Standard Ways to Process Datasets with QI Values

process, qi values, personal data, datasets

K-anonymity: This approach is quite different from the one described earlier. With k-anonymity, the aim is not to ‘hide’ any data, but rather to softly ‘mask’ the QI values. The most popular techniques used in k-anonymity are purging and generalization. Purging simply replaces QI values with placeholder strings like ‘-’ (similar to suppression). Generalization does […]
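
As a rough illustration, the sketch below purges one QI column and generalizes two others, then checks whether every combination of QI values occurs at least k times. The excerpt is cut off before it describes generalization, so the code assumes the usual meaning (coarsening a value into a range or prefix). All function names and the sample rows are hypothetical.

```python
from collections import Counter

def purge(value):
    # Replace the QI value with a fixed placeholder.
    return "-"

def generalize_age(age, width=10):
    # Coarsen an exact age into a range such as "30-39".
    low = (age // width) * width
    return f"{low}-{low + width - 1}"

def generalize_zip(zip_code, keep=3):
    # Keep the leading digits and mask the rest.
    return zip_code[:keep] + "*" * (len(zip_code) - keep)

def is_k_anonymous(rows, qi_columns, k):
    # Every combination of QI values must be shared by at least k rows.
    groups = Counter(tuple(row[c] for c in qi_columns) for row in rows)
    return all(count >= k for count in groups.values())

rows = [  # made-up example data
    {"name": "Alice", "age": 34, "zip": "90210", "diagnosis": "flu"},
    {"name": "Bob", "age": 37, "zip": "90213", "diagnosis": "asthma"},
    {"name": "Carol", "age": 52, "zip": "10001", "diagnosis": "flu"},
    {"name": "Dave", "age": 58, "zip": "10003", "diagnosis": "diabetes"},
]
masked = [
    {
        "name": purge(r["name"]),
        "age": generalize_age(r["age"]),
        "zip": generalize_zip(r["zip"]),
        "diagnosis": r["diagnosis"],
    }
    for r in rows
]
qi = ["name", "age", "zip"]
print(is_k_anonymous(rows, qi, 2))    # False
print(is_k_anonymous(masked, qi, 2))  # True
```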

Part 3: Machine Learning Ways to De-Identify Personal Data (Homomorphic Encryption)

gdpr compliance discover unknown sensitive data

Homomorphic Encryption: The main idea behind homomorphic encryption is that the inferences we make based on computations on encrypted data should be as accurate as if we had used decrypted data. Homomorphic encryption is an evolving field that currently has certain limitations. For example, only polynomial functions can be computed, and only additions and multiplications […]
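
To show what ‘computing on encrypted data’ can look like, here is a toy sketch of the Paillier cryptosystem, which supports addition of ciphertexts only. It is chosen here purely for illustration and is not necessarily the scheme the article has in mind. The primes are far too small to be secure, and the function names are hypothetical.

```python
import secrets
from math import gcd

def keygen(p, q):
    # Public key is n = p * q; the private key is (lam, mu).
    n = p * q
    lam = (p - 1) * (q - 1) // gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    mu = pow(lam, -1, n)  # modular inverse, needs Python 3.8+
    return n, (lam, mu)

def encrypt(n, m):
    # With generator g = n + 1, g^m mod n^2 equals 1 + m * n.
    n_sq = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if gcd(r, n) == 1:
            break
    return (1 + m * n) * pow(r, n, n_sq) % n_sq

def decrypt(n, private_key, c):
    lam, mu = private_key
    x = pow(c, lam, n * n)
    return ((x - 1) // n) * mu % n

def add_encrypted(n, c1, c2):
    # Multiplying ciphertexts adds the underlying plaintexts (mod n).
    return c1 * c2 % (n * n)

n, private_key = keygen(293, 433)  # toy primes, insecure
total = add_encrypted(n, encrypt(n, 17), encrypt(n, 25))
print(decrypt(n, private_key, total))  # 42
```

The sum is computed without ever decrypting the two inputs, and the decrypted result matches the plaintext sum exactly.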

Part 2: Standard Ways to De-Identify Personal Data

data integrity, data breach, de-identify personal data

Generally, database administrators try to eliminate all channels that could help an attacker leverage queries to gain personal and sensitive information about a specific person. Here are a few examples: Pseudonymization: This method of processing personal data is based on replacing values that contain personal information with pseudorandom strings. De-identified data is stored […]
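
A minimal sketch of pseudonymization as described above: each personal value is swapped for a pseudorandom string, and the same original value always maps to the same pseudonym. The excerpt is cut off before it says where the de-identified data is stored, so this sketch simply returns the lookup table alongside the records. The function name, field names and sample data are assumptions.

```python
import secrets

def pseudonymize(records, fields):
    # Returns the de-identified records plus the lookup table that
    # links each (field, original value) pair to its pseudonym.
    mapping = {}
    result = []
    for record in records:
        masked = dict(record)
        for field in fields:
            key = (field, record[field])
            if key not in mapping:
                mapping[key] = secrets.token_hex(8)  # 16 hex characters
            masked[field] = mapping[key]
        result.append(masked)
    return result, mapping

people = [  # made-up example data
    {"name": "Alice", "email": "alice@example.com", "plan": "pro"},
    {"name": "Bob", "email": "bob@example.com", "plan": "free"},
    {"name": "Alice", "email": "alice@example.com", "plan": "pro"},
]
masked_people, lookup = pseudonymize(people, ["name", "email"])
print(masked_people[0])
```

Anyone holding the lookup table can reverse the substitution, so in practice it would need to be protected separately from the de-identified records.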

Part 1: Introduction and Resources of the Data Breach

data breach, hacker, gdpr, ccpa

Terms like ‘sensitive data’ and ‘personal data’ have been floating in the air ever since GDPR, CCPA, and similar privacy acts were introduced to companies across the globe. One challenge they present is that the complexity of the federal laws and complicated terminology used to identify the corresponding subjects make it difficult for those in […]