DLP is always a strange thing for me to talk about, since I remember the first round of solutions, which caused more headaches than they solved when it came to data leakage. But with cloud, it’s suddenly a new conversation, and DLP is right at the forefront in classifying the types of data that sit at the centre of the cloud design. DLP can no longer be ignored; it has become a critical part of the new cloud landscape.
The transition of DLP from a “find unwanted data” tool to a core function for classifying data by importance is a key first step in planning a cloud strategy. Security engineers almost have to become data specialists: they need to treat DLP classification as a way to ensure that resources are only accessible by specific groups of users, in order to minimize risk. It also requires a balance between over-controlling data (making it accessible to very few people) and not putting enough controls in place. Lean too far toward over-control and you just end up with a lot of frustrated users who are tired of seeing messages pop up asking them if they need access to the data, and administrators who are spammed with event messages notifying them of all the unsuccessful attempts.

Off the top, this doesn’t seem like anything new, but think of it in terms of cloud. Suddenly you’re dealing with a HUGE amount of data that is only going to grow. You have to manually tag tons of data, which not only takes a lot of time but is made harder by the distributed environment, where the data may not be easy to find. You also have to ensure that whatever policies you set up to classify the data can be used across the entire environment to maintain consistency, or else you’ll just end up with headaches due to misaligned policies. Oh, and then there is data that’s encrypted, compressed, mislabelled, etc.

So is there a way to manage this without needing an army of data experts? Yes and no. There are automated tools that can help with a preliminary classification of data, but their output can’t simply be applied and assumed to be good to go. Each organization is different, so there is a lot of fine-tuning to be done on the back end to ensure that the right controls are in place to minimize risk. A good place to start is with tools that integrate with Active Directory, so that existing management structures can be used to define who has access to what.
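
To make that a bit more concrete, here is a rough Python sketch of what a first automated pass might look like: a few patterns assign a label, and each label maps to the directory groups allowed to see the data. Everything in it is an assumption for illustration only: the label names, the regex patterns, and the group names (such as `Finance-Managers`) are hypothetical, and it does not talk to Active Directory or any real DLP product.

```python
import re

# Hypothetical labels, ordered from least to most sensitive.
LEVELS = ["public", "internal", "confidential"]

# Illustrative patterns only; each organization would need to tune its own.
PATTERNS = {
    "confidential": [
        re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),    # US SSN-like number
        re.compile(r"\b(?:\d[ -]?){13,16}\b"),   # card-like run of digits
    ],
    "internal": [
        re.compile(r"\binternal use only\b", re.IGNORECASE),
    ],
}

# Hypothetical directory groups allowed per label; None means no restriction.
ALLOWED_GROUPS = {
    "public": None,
    "internal": {"Staff", "Finance-Managers"},
    "confidential": {"Finance-Managers"},
}


def classify(text):
    """Return the most sensitive label whose patterns match the text."""
    for level in reversed(LEVELS):
        for pattern in PATTERNS.get(level, []):
            if pattern.search(text):
                return level
    # Assumption: unmatched data falls back to "public". A real rollout
    # would more likely queue it for human review.
    return "public"


def can_access(user_groups, text):
    """Check whether any of the user's groups is allowed for the text's label."""
    allowed = ALLOWED_GROUPS[classify(text)]
    return allowed is None or bool(set(user_groups) & allowed)
```

For example, `can_access({"Staff"}, "SSN 123-45-6789")` is denied, while the same call with `{"Finance-Managers"}` is allowed. The fine-tuning mentioned above is exactly the work of adjusting those patterns and group mappings until the denials stop frustrating the people who legitimately need the data.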

So while DLP will always be a bit of a pain to set up and manage, it’s not going away. In fact, it’s critical to successfully managing cloud environments. The best way to deal with it is to start the classification process as soon as possible, before moving to distributed environments, and scale as you go. It’s much easier to add new data to existing classification methods than to classify everything at once. The key is to make sure the right policies are in place and that they are designed to scale with cloud environments, which may have unique characteristics that weren’t considered during the first round of classification.

