CompTIA CASP+ CAS-004 – Data Security (Domain 1) Part 4
February 12, 2023

7. Data Loss Prevention (DLP) (OBJ 1.4)

In this lesson, we’re going to talk about data loss prevention, also known as DLP. Now, data loss prevention solutions are used to detect and prevent sensitive information from being stored on unauthorized systems or from being transmitted over unauthorized networks. Essentially, with data loss prevention systems, our goal is to protect our important data and stop it from leaving our network or leaving our control. There are three main components to a data loss prevention solution: a policy server, an endpoint agent, and a network agent. The first component of a DLP solution is a policy server. A policy server is going to be used to configure the rule sets that are used by data loss prevention systems based upon a certain classification, confidentiality, or privacy level.

This policy server is also configured to log any incidents that match the given rule sets and compile results of the violations that may have occurred. The second component of a DLP solution is an endpoint agent. An endpoint agent is used to enforce the rule sets and policies on a specific client computer, even if that computer is no longer connected to your corporate network. For example, if I’m using my corporate laptop and I’ve disconnected it from the network because I’m now on an airplane flying over the Atlantic Ocean, the endpoint agent on that laptop can still stop me from copying protected files onto an external hard drive or USB drive, because the endpoint agent can still detect and stop it from happening even when the laptop is not actively on the corporate network at the time.

The third component of a DLP solution is a network agent. A network agent is a specifically configured network appliance that’s placed at the network boundary, and it’s going to be used to scan different web, email, and messaging protocols as the messages attempt to leave the network. These network agents have the ability to provide DLP functionality for both structured and unstructured data formats. A structured data format includes data messages that follow a specific format, such as JSON formatted data, a CSV file with specific column headers, or a structured database. An unstructured data format includes data messages that don’t follow a specific format, things like Word documents, PowerPoints, chat messages, emails, and other things where data can be entered in any format and in any order.

By using a data loss prevention system, you’re going to be able to block any data that doesn’t conform to a predetermined and acceptable policy, which is known as whitelisting. Or you could configure the DLP to allow everything and only block things that match your predetermined sets of conditions, which is known as blacklisting. Most DLP systems will also include a dashboard or reporting capability that will indicate the number of matches that have been found, as well as the rate of false positives and false negatives as you continue to improve and train your DLP systems. When a DLP system finds a matching item in its rule set, it can perform one of four functions. It can alert, it can block, it can quarantine, or it can tombstone. The first action a DLP system can take is to alert when it finds a rule set match.
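The difference between the two modes can be sketched in a few lines. This is only an illustration, assuming a rule is any function that returns True when data matches it; the `Policy` class and its `allows` method are hypothetical names, not part of any DLP product.

```python
from dataclasses import dataclass, field
from typing import Callable, List

# A rule is any function that returns True when the data matches it.
Rule = Callable[[str], bool]


@dataclass
class Policy:
    mode: str  # "whitelist" or "blacklist"
    rules: List[Rule] = field(default_factory=list)

    def allows(self, data: str) -> bool:
        matched = any(rule(data) for rule in self.rules)
        if self.mode == "whitelist":
            # Only data that matches an approved rule may pass.
            return matched
        if self.mode == "blacklist":
            # Everything passes unless a rule matches it.
            return not matched
        raise ValueError(f"unknown mode: {self.mode}")
```

With a whitelist policy, data that matches no rule is blocked by default; with a blacklist policy, data that matches no rule is allowed by default.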

If the DLP is set to alert only, it’s going to allow the data to still transmit and go on its way to its destination, but the event will be logged and an alert will be raised when that happens. For example, let’s pretend I wanted to copy a file from the corporate shared drive onto an external hard drive. I could plug in the hard drive, find the file, and drag and drop it from the shared drive over to my external drive to begin the copying process. If this system is set to alert, it may detect the copy and match it against a rule that states employees cannot copy this data from the shared drive to an external drive. But it’s still going to let me copy that file, and then it’s going to send an alert to the DLP dashboard or reporting tool so an administrator can look at what I copied and decide if they need to take follow-on action. Basically, in this alert mode, the DLP becomes a detective control, not a preventive control.

The second action a DLP system can take is to block the activity when it finds a matching rule set. This would actually stop the user from being able to copy that file from the shared drive to the external hard drive. In this case, though, the user could still open the file and read it from the shared drive. Even though they’re not allowed to copy it, they could still see it on the screen and maybe pull out their cell phone and take a picture of it. That specific action wasn’t blocked, but copying was. So copying would get blocked, but that file may still be at risk. Now, the third action that a DLP system can take is to quarantine the file when it finds that matching rule. So again, let’s use the example of the user trying to copy the file from the shared drive to the external drive.

If that file falls under a quarantine rule, the DLP system will block the user from copying that file and then take away the user’s access to even read or open that file. This is because the system is under the assumption that if you tried to copy that file and failed, you might try to do something else, like open the file and print it out, or open it and copy and paste it into an email, or use some other mechanism to move the contents of that protected file off of the system and out of the network. To quarantine the file, many DLPs will simply encrypt the file and turn it into ciphertext that the end user can’t read or comprehend.

But this really does depend on your DLP system and how you have it configured. The fourth action a DLP system can take is to tombstone the file. Now, when a file is tombstoned, the original file on the shared drive is going to get replaced by a file that simply contains a message stating that a policy violation has occurred. So using the example of the end user trying to copy the file from the shared drive to the hard drive, again, that DLP system might prevent the file from being copied.

It’s going to encrypt that file, and then it’s going to move that file to another location for safekeeping and replace it with a Word document that says there has been a policy violation and states the reason under that DLP rule set, such as an attempt to copy a file to an external device. Additionally, this policy violation file will often include instructions for how to get the file reinstated, such as calling your IT security team, taking remedial training, or other actions. So when it comes to data loss prevention systems, remember there are four actions they can take, from least severe to most severe. They can alert, they can block, they can quarantine, and they can tombstone.
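To keep the four actions straight, here is a minimal sketch that models them as an ordered enumeration and summarizes what each one does to the transfer and to the original file, following the descriptions in this lesson. The names `DlpAction` and `outcome` are hypothetical, and real products will differ in the details.

```python
from enum import IntEnum


class DlpAction(IntEnum):
    # Ordered from least severe to most severe.
    ALERT = 1
    BLOCK = 2
    QUARANTINE = 3
    TOMBSTONE = 4


def outcome(action: DlpAction) -> dict:
    """Summarize the effect of an action on a matched transfer."""
    return {
        # Every rule match is logged for the dashboard or reporting tool.
        "logged": True,
        # Only alert mode lets the data continue to its destination.
        "transfer_allowed": action == DlpAction.ALERT,
        # Alert and block leave the original file readable in place.
        "file_still_readable": action <= DlpAction.BLOCK,
        # Tombstoning swaps the file for a policy violation notice.
        "replaced_with_notice": action == DlpAction.TOMBSTONE,
    }
```

For example, `outcome(DlpAction.BLOCK)` reports that the transfer is stopped but the file can still be opened and read, which is the residual risk described above.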

These four actions are all forms of DLP remediation, and they can occur in different places within the system too. The actions can occur on the client or the server, depending on where the DLP agent is installed, or they can occur at the network boundary if you’re using a network-based appliance with a network agent installed. After all, there are a lot of different ways to implement and configure your data loss prevention systems. For the exam, you’re not going to be asked technically how to configure a DLP system, but you do need to understand the different actions that a DLP system can take and why a DLP system would be used in an enterprise network. So, now that we’ve covered the four actions used by a data loss prevention system, let’s take a look at some common data security features that may be implemented through the configuration of a DLP.

This includes blocking the use of external media, blocking printing, blocking the use of Remote Desktop Protocol, or RDP, implementing clipboard privacy controls, restricting the implementation of Virtual Desktop Infrastructure, or VDI, and blocking data based on its classification. If removable or external media blocking is enabled, your DLP system is going to prevent the user from being able to read or write to an external device, such as a CD, a DVD, or a USB storage device like a flash drive or a hard drive. This is used to prevent a user from conducting mass data exfiltration using an external device, because a single USB thumb drive can hold over a terabyte of data these days on a device that is smaller than your typical car key.

Next, we need to talk about print blocking. Now, if print blocking is enabled, the DLP system will block the ability for a user to print to a networked or USB-connected printer. This is a good practice to use because a lot of sensitive and confidential documents over the years have been stolen from organizations simply by people printing them out and carrying them out the front door in their briefcase or in their pockets. Another thing we can block is RDP. If RDP blocking is enabled, the DLP system will prevent the user from conducting a copy and paste between the remote client PC that they’re connected to over the Remote Desktop Protocol and their normal host machine.

The issue here is that when an employee is connected over a VPN using RDP, they’re often using their insecure home computer to make that connection. So this type of DLP rule will prevent the corporate data from being copied or moved out of the RDP session and onto the home PC. Another control that’s often implemented to better secure your data is clipboard privacy controls. Now, this control is enabled to prevent a user from opening a document, selecting all of its contents, and then attempting to copy it and paste it into an email, an online platform like Pastebin, or another file type that may not be protected by your DLP system.

This type of control is used to protect the data while it’s in use on a system, regardless of whether that’s on a corporate or home computer that you’re using. Many organizations have also moved into the world of VDI, or virtual desktop infrastructure. This migration to VDI has caused a large increase in the overall security of these organizations, but it is still important to include a good DLP solution in your VDI implementations. To best secure your VDI environment, you need to ensure that it has a DLP agent installed and configured to protect the VDI-hosted image when it’s being used by your end users, just like you would for a traditional desktop, laptop, or server. Finally, we can use DLP to also block the movement of data based upon its classification level.

This is usually done by comparing a data classification label or tag against a given rule set and then allowing or blocking that movement of the data based on the associated classification level. To successfully use DLP with classifications, though, it’s important that you create the proper labels, tags, and categories within your systems and their protected data sets. This can be done manually or using automation. If you rely on automation, though, you must be aware that any gaps in its accuracy and consistency may cause false positives or false negatives, which could either block data that should be allowed or allow data that should be blocked. So you have to be careful when using that automation.

8. DLP Detection (OBJ 1.4)

In this lesson, we’re going to discuss data loss prevention detection techniques that can be used to ensure the security of our enterprise data. Now, there are six main techniques used by DLP systems to detect data loss based upon rule sets. These include classification, dictionary, policy templates, exact data match or EDM, document matching, and statistical or lexicon. You don’t need to be an expert in all six of these techniques, but you should at least have a basic knowledge of what they are. So we’re going to do a quick look at each of them. First, we have classification. Now, classification is the action or process of classifying something according to a shared quality or characteristic. Now, for example, let’s say you’re in the military.

You’re used to classifying documents as either unclassified, confidential, secret, or top secret. You do this based on the type of data or information contained in that document. And when you label it as a secret document, for example, you’re essentially stating that this document shares the same level of data as other secret documents, and therefore it needs to be protected at that same level. Within a DLP system, classification is normally verified using labels or tags associated with the document, the data, and those files. Within a DLP system, if you tried to send a secret file to somebody else who wasn’t authorized to access that file, that DLP system could block that attempt based upon the labels associated with the file and the intended receiver.
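A label check like this can be sketched as a simple comparison. This is a minimal illustration, assuming the four levels above form one ordered list and that the file and the receiver each carry a single label; `LEVELS` and `may_transfer` are hypothetical names, and failing closed on an unknown label is an assumption of this sketch.

```python
# Classification levels from lowest to highest.
LEVELS = ["unclassified", "confidential", "secret", "top secret"]


def may_transfer(file_label: str, receiver_clearance: str) -> bool:
    """Allow the transfer only if the receiver's clearance is at least
    as high as the label attached to the file."""
    try:
        return LEVELS.index(receiver_clearance) >= LEVELS.index(file_label)
    except ValueError:
        # Unknown label on either side: fail closed (an assumption here).
        return False
```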

For example, it’s a secret file and you’re sending it to an unclassified receiver. It’s going to block that. Second, we have a dictionary. Now, a DLP system may contain a dictionary that acts as a set of patterns that should be matched by the system. When considering these dictionaries, I want you to remember that they’re not necessarily going to contain words in them like a dictionary used in grade school. Instead, they contain a list of words, phrases, numbers, or even regular expressions that should be alerted on, blocked, quarantined, or tombstoned when detected by the DLP. For example, let’s say you work for my company and you’re beginning a new project called Titan Cipher, but we’re worried our competitors might find out about it before we’re ready to launch it.

So we want to protect anything labeled with Titan Cipher. I’m going to then take the phrase Titan Cipher and add it to my DLP system’s dictionary. And anytime the system sees that keyword Titan Cipher within an email or document, it’s going to block that from being sent to anyone who doesn’t have a deontraining.com email address or access, because we don’t want anybody gathering information on our cool, super new project known as Titan Cipher. This is how a dictionary works inside a DLP system. Third, we’re going to have a policy template. Now, a policy template is essentially a dictionary, but it’s a very specialized dictionary.
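The Titan Cipher dictionary rule can be sketched roughly as follows. This is only an illustration: the dictionary holds one case-insensitive regular expression, and `ALLOWED_DOMAIN` is a placeholder set to `example.com` rather than a real company domain. The function name `violates_dictionary` is hypothetical.

```python
import re

# Dictionary entries can be words, phrases, numbers, or regular expressions.
DICTIONARY = [
    re.compile(r"titan\s+cipher", re.IGNORECASE),
]

# Placeholder for the company's own email domain (an assumption).
ALLOWED_DOMAIN = "example.com"


def violates_dictionary(body: str, recipient: str) -> bool:
    """Return True if the message contains a dictionary entry and is
    addressed to someone outside the allowed domain."""
    domain = recipient.rsplit("@", 1)[-1].lower()
    if domain == ALLOWED_DOMAIN:
        return False
    return any(pattern.search(body) for pattern in DICTIONARY)
```

A message mentioning the project and addressed outside the allowed domain would be flagged, while the same message sent internally would pass.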

This is a template that contains dictionaries that are optimized for data points in a regulatory or legislative schema. For example, if you need to ensure you’re in compliance with PCI DSS, there’s a policy template available that you can download and install into your DLP system to make sure that anything that matches the formats contained within that policy template, like credit card numbers, is going to be blocked from leaving your network. Policy templates are very helpful when you’re dealing with governance requirements and your ability to prove compliance with them, things like PCI DSS, HIPAA, GDPR, and others. Different policy templates are going to contain different formats that are used to match individual taxpayer identification numbers, Social Security numbers, passport numbers, or whatever else you need to protect to be in line with that specific regulation or requirement.

Fourth, we have EDM, or the exact data match. EDM is a structured database of string values that will be searched for by the DLP until it finds a matching entry with an exact data match. We aren’t actually searching for that value, like your Social Security number, but instead we’re searching for the hash of that value. Essentially, with exact data matches, we’re going to create a database of all the known hashes for the items we want to protect, and if we see data whose hash matches one of those, we’re going to be alerted to that, or we’re going to block it from leaving our network. This is actually a very popular way of doing DLP because we’re not compromising the confidentiality or the privacy of the data itself, because we’re only looking at the hash value to determine if they match and not the actual value of the data itself.

For example, let’s pretend I have a list of all my customers’ credit card numbers, and I want to configure my DLP system to ensure that nobody could download that list or email that list to their personal email. To best protect my customers’ data, I don’t want to create a list of all their credit card numbers and upload that to my DLP. That would be pretty dangerous. If I did that, an attacker could get all of those credit card numbers just by exploiting my DLP. So instead, I’m going to hash each of those credit card numbers individually and then store the results in a structured database that contains all the known hashes that I want to protect. Now, when the DLP sees something that looks like a credit card number going out in an email, it’s going to hash that number, compare that hash against the database of known hashes, and if it matches, it’s considered an exact data match, and it would alert or block that data.
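The credit card example can be sketched like this. It is a simplified illustration, assuming 16-digit card numbers with optional spaces or hyphens. One deliberate substitution: instead of a bare hash, it uses a keyed hash (HMAC-SHA-256), because card numbers come from a small enough space that unkeyed hashes can be guessed by brute force. `SECRET_KEY`, `CARD_PATTERN`, and the function names are all hypothetical.

```python
import hashlib
import hmac
import re

# Demo key only; a real deployment would keep this secret (an assumption).
SECRET_KEY = b"demo-key-change-me"

# Sixteen digits, optionally separated by single spaces or hyphens.
CARD_PATTERN = re.compile(r"\b(?:\d[ -]?){15}\d\b")


def fingerprint(value: str) -> str:
    """Normalize to digits only, then compute a keyed hash."""
    digits = re.sub(r"\D", "", value)
    return hmac.new(SECRET_KEY, digits.encode(), hashlib.sha256).hexdigest()


def build_index(values):
    """Store only fingerprints, never the card numbers themselves."""
    return {fingerprint(v) for v in values}


def find_exact_matches(text: str, index) -> list:
    """Return card-like strings in text whose fingerprint is protected."""
    return [
        m.group()
        for m in CARD_PATTERN.finditer(text)
        if fingerprint(m.group()) in index
    ]
```

Because the index holds only fingerprints, someone who reads it does not directly obtain the protected numbers, which is the point made above about not uploading the raw list to the DLP.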

Fifth, we have document matching. Now, document matching involves matching an entire or partial document against a set of known hashes that we want to protect. Let’s pretend I’m working on a top secret program, and it’s known as Titan Cipher again. Now, I create a detailed PowerPoint with all the information in it, and I want to ensure that this PowerPoint doesn’t leave my corporate network. Well, I can set up full document matching on it by creating a hash of that document and having it recorded inside the DLP’s database of my protected documents.

Now, anytime a file is going to be sent over the network, the DLP is going to create a hash of that file and compare it against a list of protected hashes. If there’s a match, again, we’re going to alert or block it. Now, additionally, many DLP systems will allow for partial document matching as well. This is where they look at certain pieces of a file and see if those match. Maybe there’s just one slide in my PowerPoint that I’m really worried about. If this is the case, I could configure my DLP to look for that one slide in any PowerPoints that are leaving my network. Now, anytime there’s a PowerPoint sent over the network, it’s going to create a hash of each slide, looking to see if it matches my protected slide. Our sixth and final method is known as statistical or lexicon.

Now, statistical or lexicon is a further refinement of partial document matching that uses machine learning to also analyze a range of data sources. Now, we aren’t just using a standard document match based on what I loaded up into the system, but instead, we can use machine learning to more intelligently identify and protect our data from loss. Remember, there are six main techniques used by DLP systems to detect any data loss based upon the given rule sets. These are known as classification, dictionary, policy template, exact data match or EDM, document matching, and statistical or lexicon.
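To make the document matching technique from this lesson more concrete, here is a minimal sketch of full and partial matching using SHA-256 hashes. It assumes something else has already split a file into parts, such as individual slides, and passes them in as bytes; the `DocumentMatcher` class and its methods are hypothetical names, not any vendor's API.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


class DocumentMatcher:
    def __init__(self):
        self.full_hashes = set()  # hashes of whole protected files
        self.part_hashes = set()  # hashes of protected pieces, e.g. slides

    def protect_document(self, content: bytes) -> None:
        self.full_hashes.add(sha256_hex(content))

    def protect_part(self, part: bytes) -> None:
        self.part_hashes.add(sha256_hex(part))

    def check(self, content: bytes, parts=()):
        """Return "full", "partial", or None for an outgoing file."""
        if sha256_hex(content) in self.full_hashes:
            return "full"
        if any(sha256_hex(p) in self.part_hashes for p in parts):
            return "partial"
        return None
```

Note that an exact hash only matches byte-for-byte identical content, so a lightly edited slide would slip past this sketch.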
