Data Obfuscation: The Linchpin of Enterprise Data Security 

Recent reports confirm that data breaches are surging in frequency. According to Imperva, the volume of compromised records globally has increased by an average of 224% each year since 2017. Furthermore, research by IBM has found that it takes 280 days, on average, to identify a data breach.

In an effort to better protect sensitive data from exposure and deter cyber criminals, security organizations are increasingly leveraging data obfuscation to “hide data in plain sight.”

What is Data Obfuscation?

Data obfuscation, also known as data anonymization or pseudonymization, is a data security technique whereby data is purposely scrambled, hidden or modified so that it is useless to malicious actors. 

Though recent data privacy regulations do not require the use of data obfuscation, many do promote it as an acceptable data protection strategy. For example, GDPR Article 6(4)(e) calls for “the existence of appropriate safeguards, which may include encryption or pseudonymization.”

The most commonly used subtypes of data obfuscation are data masking, data encryption and data tokenization. 

What is Data Masking?

Data masking refers to the use of algorithms to replace sensitive information with functional fictitious data - data that looks real but is useless to an attacker. The goal is to obscure the data while retaining the format and characteristics of the original, which preserves its analytical value so that it can still be used, for example, for software testing and general business analysis.

Broadly speaking, there are two types of data masking:

  • Static data masking (SDM): Data is masked in the original database and then duplicated into a test environment, so that businesses can share the test data environment with third-party vendors.
  • Dynamic data masking (DDM): The original sensitive data remains in the repository and is accessible to an application only when authorized by the system. Data is never exposed to unauthorized users; instead, the contents are masked or shuffled in real time, on demand, as they are requested. A reverse proxy is often used for access control.
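The two approaches can be illustrated with a minimal Python sketch. It is not how any particular product works: the helper names (`mask_name`, `mask_card`, `static_mask`, `dynamic_view`), the `billing_admin` role, and the list of fictitious names are all assumptions made up for this example.

```python
import hashlib

# Hypothetical pool of fictitious replacement names (an assumption for this sketch).
FAKE_NAMES = ["Alex Morgan", "Sam Rivera", "Jordan Lee", "Casey Kim"]

def mask_name(real_name: str) -> str:
    """Map a real name to a fictitious one; the same input always gets the same fake."""
    digest = hashlib.sha256(real_name.encode("utf-8")).digest()
    return FAKE_NAMES[digest[0] % len(FAKE_NAMES)]

def mask_card(card_number: str) -> str:
    """Replace every digit except the last four with 'X', keeping separators intact."""
    total_digits = sum(ch.isdigit() for ch in card_number)
    seen = 0
    out = []
    for ch in card_number:
        if ch.isdigit():
            seen += 1
            out.append(ch if seen > total_digits - 4 else "X")
        else:
            out.append(ch)
    return "".join(out)

def static_mask(rows):
    """Static masking: build a masked copy of the data set for a test environment."""
    return [
        {**row, "name": mask_name(row["name"]), "card": mask_card(row["card"])}
        for row in rows
    ]

def dynamic_view(row, role: str):
    """Dynamic masking: the stored row is unchanged; masking is applied per request."""
    if role == "billing_admin":  # hypothetical authorized role
        return dict(row)
    return {**row, "card": mask_card(row["card"])}

production = [{"name": "Jane Doe", "card": "4111-1111-1111-1234"}]
test_copy = static_mask(production)                 # safe to hand to a vendor
support_view = dynamic_view(production[0], "support")
print(test_copy[0]["card"], support_view["card"])
```

In the static case the masked copy is what leaves the organization, while in the dynamic case the same stored record yields different views depending on who asks.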

What is Data Encryption? 

Data encryption converts readable, plaintext data into an unreadable, encoded format known as ciphertext. Authorized users can unscramble encrypted data using a key. However, this key must also be protected and obfuscated. 

Broadly speaking, there are two types of data encryption:

  • Symmetric Encryption (a.k.a. Private Key): The same key is used to both encrypt and decrypt data. Though more efficient, it requires the sender to share the encryption key with the receiver before the receiver can decrypt the message. It also requires users to distribute and securely manage a large number of keys.
  • Asymmetric Encryption (a.k.a. Public Key Cryptography): Produces two mathematically related keys, a public key and a private key. Data encrypted with the public key can only be decrypted with the corresponding private key, so the public key can be shared openly while the private key stays secret.
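The difference between the two key models can be sketched with toy ciphers built only on the Python standard library. Both are illustrative assumptions chosen for brevity, not production cryptography: the symmetric half is a one-time pad (XOR with a random key as long as the message), and the asymmetric half is textbook RSA with tiny primes and no padding. Real systems would typically use a vetted library and algorithms such as AES and properly padded RSA or elliptic-curve schemes.

```python
import secrets

# --- Symmetric: one shared key both encrypts and decrypts (one-time pad) ---
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte; applying it twice restores the input."""
    if len(key) != len(data):
        raise ValueError("one-time pad key must be as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))

message = b"4111 1111 1111 1234"
shared_key = secrets.token_bytes(len(message))  # must reach the receiver securely
ciphertext = xor_bytes(message, shared_key)     # sender encrypts
recovered = xor_bytes(ciphertext, shared_key)   # receiver decrypts with the same key

# --- Asymmetric: textbook RSA with tiny primes (illustration only, NOT secure) ---
p, q = 61, 53
n = p * q                   # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                      # public exponent: (e, n) is the public key
d = pow(e, -1, phi)         # private exponent: (d, n) is the private key (Python 3.8+)

def rsa_encrypt(m: int) -> int:
    """Anyone holding the public key can encrypt."""
    return pow(m, e, n)

def rsa_decrypt(c: int) -> int:
    """Only the private key holder can decrypt."""
    return pow(c, d, n)

secret_number = 65
rsa_ciphertext = rsa_encrypt(secret_number)
rsa_plain = rsa_decrypt(rsa_ciphertext)

print(recovered == message, rsa_plain == secret_number)
```

Note how the symmetric example needs `shared_key` on both sides, whereas the RSA example only ever exposes `e` and `n` to the sender.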

What is Data Tokenization? 

Tokenization replaces sensitive information with randomly generated alphanumeric values called tokens. The original, confidential data is then securely stored on a well-protected server dubbed the “token vault.” Only someone with access to the token vault can make the connection between the token and the original data it represents.

The advantage of tokens is that they have no mathematical relationship to the real data they represent, so a token cannot be reversed into its true value without access to the vault. If tokenized data is breached, it is truly meaningless to the attacker.
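As a rough illustration of the idea, the sketch below models a token vault as an in-memory Python class. The `TokenVault` name and its methods are hypothetical, and a real vault would be a hardened, access-controlled service rather than a dictionary in application memory.

```python
import secrets

class TokenVault:
    """Toy token vault: the only place where tokens and real values are linked."""

    def __init__(self):
        self._token_to_value = {}
        self._value_to_token = {}

    def tokenize(self, value: str) -> str:
        """Return a random token for the value, reusing the token if one already exists."""
        if value in self._value_to_token:
            return self._value_to_token[value]
        token = secrets.token_hex(8)          # 16 random hex characters
        while token in self._token_to_value:  # regenerate on the unlikely collision
            token = secrets.token_hex(8)
        self._token_to_value[token] = value
        self._value_to_token[value] = token
        return token

    def detokenize(self, token: str) -> str:
        """Look up the original value; raises KeyError for unknown tokens."""
        return self._token_to_value[token]

vault = TokenVault()
card_token = vault.tokenize("4111-1111-1111-1234")
print(card_token)                    # random each run; safe to store downstream
print(vault.detokenize(card_token))  # only possible with access to the vault
```

Because the token comes from a random generator rather than from a function of the card number, there is nothing to invert: the lookup table inside the vault is the only link back to the original value.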

Data Obfuscation Vendors to Know

Oracle Data Masking and Subsetting

Oracle Data Masking and Subsetting enables entire copies or subsets of application data to be extracted from the database, obfuscated, and shared with partners inside and outside of the business. The integrity of the database is preserved, assuring the continuity of the applications.

Delphix Data Masking

Delphix profiling identifies sensitive information, such as names, email addresses, and credit card numbers, across a range of data sources including relational databases and files. Delphix provides over 50 out-of-the-box profile sets covering 30 types of sensitive data, as well as the ability to define custom profiling expressions.

IBM InfoSphere Optim Data Privacy

Provides extensive capabilities to effectively mask sensitive data across non-production environments, such as development, testing, QA or training. Prepackaged data masking routines enable the user to transform complex data elements such as credit card numbers, email addresses, and national identifiers, while retaining their contextual meaning.

Informatica

De-identifies data and controls unauthorized access to production environments, such as customer service, billing, order management, and customer engagement. Dynamic Data Masking masks or blocks sensitive information from users based on their role, location, and privileges, can alert on unauthorized access attempts, and provides logs for compliance and audit.