Confidentiality in Data Access Solutions

January 1, 2017 Security, Big Data

One of the key selling points of data access systems with a huge single namespace is that they are well suited to data aggregation. Aggregation and find capabilities are essential for companies that want to base important strategic decisions on data and so reduce their risk.

The (black) market for Big Data

Aggregated data also makes the systems that provide access to it a desirable target for criminal organizations. These organizations can pursue such targets because older security measures, like passwords and MD5 hashes, have become very easy to crack with rented computing power from services like AWS. For example, a collision attack against the MD5 algorithm takes just 10 hours and costs only 65 cents on a GPU instance using open source software.
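
To illustrate the point about fast hashes, here is a minimal sketch of storing a password with a salted, deliberately slow key derivation function instead of a bare MD5 digest. It uses only Python's standard library. The function names and the iteration count are assumptions made for this illustration and are not taken from any particular product.

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

# Assumed work factor for this sketch; tune it to your own hardware and policy.
ITERATIONS = 600_000


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a 32-byte key from the password with PBKDF2-HMAC-SHA256.

    A fresh random salt is generated unless one is supplied, so identical
    passwords do not produce identical stored values.
    """
    if salt is None:
        salt = os.urandom(16)
    key = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, key


def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    """Recompute the key with the stored salt and compare in constant time."""
    _, key = hash_password(password, salt)
    return hmac.compare_digest(key, expected)
```

The salt and derived key are stored together. The high iteration count makes each guess expensive for an attacker who rents GPU time, while costing a legitimate login only a fraction of a second.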

In the U.S. alone, more than 1,000 companies legally offer aggregated personal data, with some claiming over 2,000 data points for every person living in the country. These companies sell their data quite aggressively, making it very easy for criminal organizations to steal identities through social engineering or to use spearphishing to gain access to data centers managing critical information. The file system’s FUSE client that is supposed to be running on the CIO’s computer could very well be running on a yacht at the antipodes of Silicon Valley, halfway between Madagascar and Port-aux-Français. A mai-tai-sipping pirate could be stealing the latest design of a missile to sell it to a competitor, deleting sequencing data from an expensive experiment, or wiping the parameters of a deep-sea drilling ship’s positioning system when a ransom is not paid.

In addition to the large legal market for aggregated personal data, there is also a vibrant underground black market for data stolen in security breaches. Both markets enjoy brisk business, so the question is not whether a data access system will be penetrated but when. The best response is to contain the damage and make it hard for intruders to move around, in the hope that they give up and attack a competitor’s easier target instead.

Making Big Data access secure

We made this sound scary because, in the cybersecurity arena, a little bit of paranoia is healthy. The important lesson is that the organization has to create a culture of security in which it stays vigilant and is always ready to take action. In addition to firewalls and network traffic analysis, the data access system itself has to be very secure, because it is a big target.

Aureum provides five levels of security measures. You can use them as needed, but we recommend implementing the highest level:

  1. No security other than login
  2. Public key infrastructure (PKI) and client authentication
  3. Integrated Kerberos 5 authentication
  4. Command integrity through cryptographic signing
  5. Full confidentiality through encryption of all traffic on the wire
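
As an illustration of the idea behind level 4, here is a minimal sketch of command integrity checking. It is not Aureum's actual protocol, which is not described here. It assumes a shared secret key and uses HMAC-SHA256 as a stand-in for whatever signing scheme a real system uses, which may well be an asymmetric signature. The command fields and function names are hypothetical.

```python
import hashlib
import hmac
import json
import os


def sign_command(key: bytes, command: dict) -> dict:
    """Serialize a command deterministically and attach an HMAC-SHA256 tag."""
    payload = json.dumps(command, sort_keys=True, separators=(",", ":"))
    tag = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    return {"payload": payload, "tag": tag}


def verify_command(key: bytes, envelope: dict) -> dict:
    """Recompute the tag over the exact bytes received; reject on mismatch."""
    payload = envelope["payload"]
    expected = hmac.new(key, payload.encode("utf-8"), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(expected, envelope["tag"]):
        raise ValueError("command signature mismatch")
    return json.loads(payload)


# Hypothetical usage: the key would be established during authentication.
session_key = os.urandom(32)
envelope = sign_command(session_key, {"op": "unlink", "path": "/data/run42"})
command = verify_command(session_key, envelope)
```

An intruder who can alter traffic but does not hold the key cannot change the path or the operation without the tag check failing. Signing alone does not hide the command, which is why level 5 adds encryption of all traffic.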

To understand these levels, think of your own home security. The lowest level is locking your front door, while at the higher levels you also lock all the doors inside the house. When a burglar gets through the front door, they cannot move around freely, and an alarm system alerts the police.

Any personal data you store should also be encrypted at the application level. Database management systems typically offer encryption at the field level. For files, if an application does not provide encryption, you can use a utility like Pretty Good Privacy (PGP), which has been widely used since 1991 and is very mature.

Almost all storage media sold today offer hardware self-encryption in the form of self-encrypting drives (SEDs). This encryption is extremely difficult to crack, but it does not give you the protection you get with Aureum, because once the drive is unlocked, a hacker who has penetrated the system has access to all data. Still, some applications require SED, and you should enable it if you can, because it gives you two important benefits:

  1. If someone physically breaks into your data center and removes the drives, with SED activated they become bricks.
  2. When a drive is decommissioned, you can crypto-erase it in an instant.

Authentication, integrity checking, confidentiality: you cannot live without them.