Transcript for:
Overview of Cybersecurity Threat Detection

Well, threat research is all about understanding what is threatening your organization from a cybersecurity perspective: how it works, what it does, and what kind of resources it attacks. In order to get hold of this information, we need two things. First, we need a way to detect those threats. Second, we need a common language to describe those findings, so we can publish that information, share it, and learn from each other. How exactly are we going to do this? That's precisely what we'll cover in the next few minutes. Now, for a long time we've relied on signatures to identify malware, and that's exactly how your antivirus works by default right now. This means that for malware that we've seen before, or for attacks that somebody else has seen before, there should be some specific strings or byte patterns that can help us recognize it. Where do we find these patterns, you might ask? Well, you can find them in files on our disks, you can look inside processes that are currently running in our operating systems, or you can look inside the network packets that go in and out of our network interface cards. Now, keep in mind that when it comes to analyzing network traffic, you might not always be able to inspect the content, because it might be encrypted. You might be transferring encrypted files, sending an encrypted email, using a VPN, or just using a regular HTTPS connection to a website, which is also encrypted, right? Still, even with encrypted traffic, we can look at some observable details: How big are the packets? How often do they come in? Do they show any protocol anomalies, things like requests without a response, misplaced flags, or a packet structure that doesn't look exactly right? But signatures are still very widely used because, well, they're efficient and easy to update.
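To make the idea of signature matching a bit more concrete, here is a minimal Python sketch. The signature names and byte patterns are invented for illustration and do not come from any real antivirus database; real engines use much richer rule formats than a plain substring search.

```python
from typing import Dict, List

# Hypothetical signature database: signature name -> byte pattern.
# Both entries are made up for this example.
SIGNATURES: Dict[str, bytes] = {
    "demo-marker-string": b"EVIL_PAYLOAD_MARKER",
    "demo-byte-pattern": b"\xde\xad\xbe\xef\x13\x37",
}


def scan_bytes(data: bytes, signatures: Dict[str, bytes] = SIGNATURES) -> List[str]:
    """Return the sorted names of all signatures whose pattern occurs in data."""
    return sorted(name for name, pattern in signatures.items() if pattern in data)


def scan_file(path: str) -> List[str]:
    """Read a file from disk and scan its full contents."""
    with open(path, "rb") as handle:
        return scan_bytes(handle.read())


if __name__ == "__main__":
    sample = b"header" + b"\xde\xad\xbe\xef\x13\x37" + b"trailer"
    print(scan_bytes(sample))
```

The same scan_bytes function could in principle be pointed at file contents, a process memory dump, or a reassembled network payload, which are the three places mentioned above.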
It usually takes you just a click or two to update your antivirus software, or Windows Defender, or whatever you're using. You are using an antivirus, right? So assume you're an attacker and you want your attack to succeed. Well, you could try doing one of the following. First, you can create a piece of malware that doesn't have a signature just yet and cannot be found in any signature database. But it's probably going to be just a matter of time before some antivirus vendor detects your malware and builds a signature for it, so your virus is going to be just a short-lived fad. Or you can create malware that is so advanced that it's next to impossible to create a signature for it. Well, how do you do this? You make it behave like a legitimate application. You make it act very slowly and stealthily. And you also make sure that if an antivirus asks your malware what it's doing, it reports back that nothing malicious is happening in there. As you can guess, the second method is much more difficult, but the odds will be forever in the attacker's favor. All right, so let's switch our perspective for a minute, stop thinking like a hacker, and start thinking like a cybersecurity analyst. What can you do if you're facing malware that doesn't get detected by normal antivirus signatures? And actually, ask yourself this: if it's not being detected by any tools out there, how do I know it's present in my network? How do I find out that I've been infected? That's the most difficult part, honestly, because a lot of attacks, a lot of breaches, are detected very late, or sometimes never. But it just might happen that some unexpected connection, some abnormal traffic, or some errors get caught by a monitoring tool someday, or even by the admin. And that's precisely the first thing that we do when we're facing malware that is so advanced that it cannot be detected by signatures.
We stop looking for signatures and we start looking at behavior. We start looking for proof that the malware is in there and that it's doing something nasty. Those proofs are called IOCs, that is, indicators of compromise. Well, what exactly is an IOC? Just like the name says, it's an indication, some artifact left behind, some proof that you have been breached sometime in the past, that an intrusion has happened, and that malware has been executed. Generally, these IOCs are artifacts that you might encounter at some point, and they make you think, well, is that thing supposed to behave like that? They might be URLs in an access log, a DNS log, a firewall log, a proxy log, or even in your browser history. It might be that some unexpected file is located on some disk, or that a file gets executed in some unlikely scenario. It might be a process that someday you find running in memory, and nobody knows how it got there or what it's doing. It might be a file signature indicating that there's a remote access tool controlling that machine, when you know that the machine doesn't have any remote access utilities installed on it. And don't forget that we can also look for hashes of files associated with a bad reputation. That's still kind of like a signature, but we have to look for them because otherwise we risk missing the low-hanging fruit. We can look for unexpected, foreign, or new entries in the Windows Registry, especially those entries that dictate which programs run automatically whenever you start your system. We might see at some point some excessive resource use, like too much CPU usage, too much memory usage, an overloaded disk, or an overloaded network interface. You can use netstat, for example, or other utilities that show you active connections. You can look for open ports
from undocumented applications, for example applications that you've never seen before in your network. You might be able to detect some unexpected protocols that shouldn't be found in your network, because you know that there's no application actually using them, or at least no legitimate application. You could also look inside Device Manager on each system and find some new and unexpected devices that nobody knows how they got there. Or you might be able to see some weird usage of a normal user account: some changes in permissions, some unexpected changes to security policies, network connections, file executions, anything happening with that system that is somehow beyond what you expect to be normal. And the list doesn't end here. There are actually thousands of potential indicators of compromise, simply because computers are very complex systems. Not to mention that we're going to have a completely different set of indicators when we move away from computers towards mobile devices, cloud environments, wearables, IoT devices, industrial systems, drones, I don't know what else. So there's so much more that you can look for, and so many possible indicators that might show you that you've been compromised. Well, we have a lot of things to look for because malware can be extremely complex, so we need to change our way of thinking about malware. And while we look for IOCs, there are a couple of things that we should keep in mind. First, while you could manually inspect all those IOCs that we've mentioned so far, you'll get much better performance and visibility, and you'll analyze much more information, if you use a dedicated automated tool for this. You'll find tools like these under the names HIPS or HIDS, which stand for host-based intrusion prevention system and host-based intrusion detection system, or as endpoint security suites from a lot of vendors out there.
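As a rough illustration of what such an automated check does at its simplest, here is a Python sketch that searches log lines for known-bad indicators. The indicator values and the log lines are made up (they use names and address ranges reserved for documentation), and the matching is a naive substring search rather than what a real HIDS would do.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical indicators of compromise. The ".example" domain and the
# 203.0.113.0/24 range are reserved for documentation purposes.
KNOWN_BAD: Set[str] = {"bad-domain.example", "203.0.113.66"}


def find_ioc_hits(
    log_lines: Iterable[str], indicators: Set[str] = KNOWN_BAD
) -> List[Tuple[int, str]]:
    """Return (line_number, indicator) for every indicator seen in the logs."""
    hits: List[Tuple[int, str]] = []
    for line_no, line in enumerate(log_lines, start=1):
        lowered = line.lower()
        # Sort so that the output order is stable across runs.
        for indicator in sorted(indicators):
            if indicator.lower() in lowered:
                hits.append((line_no, indicator))
    return hits


if __name__ == "__main__":
    logs = [
        "proxy GET http://intranet.example/home 200",
        "dns query bad-domain.example type A",
        "firewall ALLOW tcp 10.0.0.5 -> 203.0.113.66:443",
    ]
    for line_no, indicator in find_ioc_hits(logs):
        print(f"line {line_no}: matched {indicator}")
```

The same loop could be fed DNS, proxy, firewall, or web access logs, since all it assumes is one event per line.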
Second, sometimes a single IOC might just not be enough to prove that you've been breached, because one single anomaly might be open to interpretation, or it might not even raise an alarm at all. But correlating multiple IOCs can show a pattern. This is also one of the reasons why, for some specific attacks, we simply cannot create one specific signature to identify them. The good news is that we also have dedicated solutions for this, and we call them SIEMs. That's Security Information and Event Management. We're going to talk about them later in another video. Third, we still have the problem of identifying each indicator of compromise and determining if it's good or bad. Computers create files, processes, and network connections all the time. Sometimes they can generate thousands of events every single second, and of course not every event is a security event. There's actually a lot of effort involved in this triage process, where you try to determine which of those events are malicious and which are just, you know, regular daily business. Now, to help us with this last issue, to help us determine what's bad, what's good, what's ugly, or what's just simply weird, we have two methods. The first method we call reputational, which means that we can associate an IOC with some reputational data. Reputation means being able to answer the question: has this indicator ever been connected in some way to a previous attack or a piece of malware? Do we know anything bad about it? Reputation is kind of like when you crash your car one day, and then for the next few months none of your friends want you to drive them anywhere, because you just got the reputation of a bad driver. This reputation data can come from a lot of places: a database, an online feed, a paid subscription. Basically, we're looking for IOCs that were previously detected by somebody else and identified as malicious.
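The correlation idea from the second point can be sketched in a few lines of Python. This is only a toy model of what a SIEM rule might do: the event format, the five-minute window, and the threshold of three distinct IOC types are all assumptions chosen for the example.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Set, Tuple

# Assumed event shape: (timestamp_in_seconds, host, ioc_type).
Event = Tuple[float, str, str]


def correlate(
    events: Iterable[Event], window: float = 300.0, min_types: int = 3
) -> Dict[str, Set[str]]:
    """Flag hosts showing at least min_types distinct IOC types within window seconds."""
    by_host: Dict[str, List[Tuple[float, str]]] = defaultdict(list)
    for timestamp, host, ioc_type in events:
        by_host[host].append((timestamp, ioc_type))

    alerts: Dict[str, Set[str]] = {}
    for host, items in by_host.items():
        items.sort()
        start = 0
        for end in range(len(items)):
            # Shrink the window from the left until it spans at most `window` seconds.
            while items[end][0] - items[start][0] > window:
                start += 1
            types = {ioc_type for _, ioc_type in items[start : end + 1]}
            if len(types) >= min_types:
                alerts[host] = types
                break
    return alerts


if __name__ == "__main__":
    sample = [
        (0.0, "pc-01", "dns"),
        (40.0, "pc-01", "file"),
        (90.0, "pc-01", "registry"),
        (10.0, "pc-02", "dns"),
    ]
    print(correlate(sample))
```

A single DNS anomaly on pc-02 stays below the threshold, while three different kinds of anomaly on pc-01 within a few minutes are reported together, which is the pattern-over-single-event point made above.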
Just to give you some examples here: IP addresses known to generate spam or denial of service attacks. URLs where we know that malware was previously found or hosted, or that were used in some command and control scenario, right? In a distributed denial of service attack, for example, or in a botnet. Even files, usually identified by their hashes, can be submitted to a reputational database. Again, hashing kind of works like signatures, but instead of looking for a byte pattern inside a file, reputation databases store the hashes of all the files ever detected as being malicious. We can even look for things like the layout of an email. This is also one of the heuristic methods used all over the world to detect spam. What does the email look like? How is it structured? Of course, to be useful, all these databases need to be kept up to date, so they're going to require real-time intelligence fed to them from all over the world. That's why you have to pay to access some of them: there's a lot of effort involved in maintaining such databases. Large vendors usually do this. Fortinet does it. Cisco does it with their AMP for Endpoints solution, for example, and their centralized intelligence and reputation center called Talos, Cisco Talos. And, well, the second method for deciding if what a computer does is good or bad is the one we call behavioral. As we mentioned, advanced threats are usually not easy to detect because there isn't just one single proof involved. There isn't just one single IOC that you can put your finger on and say, well, this is malicious. So behavioral methods do something smart: they correlate IOCs with attack patterns. What does that mean? Well, let me explain. Instead of looking at one or more specific files, URLs, or IP addresses, we look at what's happening inside our system. Then we decide if what's happening looks like an attack. Just a second, let me give you some examples here.
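The hash-based reputation lookup described above can be sketched in Python as follows. The reputation database here is a local dictionary with a single made-up entry whose digest is computed from placeholder bytes; a real deployment would query a vendor feed or service instead, and nothing here reflects any particular vendor's API.

```python
import hashlib
from typing import Dict, Optional


def sha256_hex(data: bytes) -> str:
    """Return the SHA-256 digest of data as a lowercase hex string."""
    return hashlib.sha256(data).hexdigest()


# Hypothetical reputation data: SHA-256 digest -> label.
# The digest is derived from placeholder bytes, not from a real sample.
REPUTATION_DB: Dict[str, str] = {
    sha256_hex(b"placeholder bytes standing in for a malicious file"): "demo-trojan",
}


def lookup_reputation(
    data: bytes, database: Dict[str, str] = REPUTATION_DB
) -> Optional[str]:
    """Return the label for known-bad content, or None if the hash is unknown."""
    return database.get(sha256_hex(data))


if __name__ == "__main__":
    print(lookup_reputation(b"placeholder bytes standing in for a malicious file"))
    print(lookup_reputation(b"an ordinary document"))
```

Note that changing even one byte of the file changes the whole digest, which is why hash lookups catch only exact copies of known files and need to be combined with the other methods.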
Think about calculator.exe, your calculator application. If you see it attempting to connect to the internet, then it's pretty sure to be infected. If you detect an attempt to change hundreds or thousands of files on your disk at the same time, then you might be looking at a cryptolocker. If, for example, changes to system settings are detected while a user without any administrative privileges is logged in, then that host might be compromised, and so on and so forth. And one more mention here. These behavioral methods work so much better if the software that you're using to monitor for these behaviors can learn what your typical, non-malicious activity looks like during a workday, outside work hours, or on specific weekdays. This is called creating a baseline, and once we have this information, it gets so much easier to detect unexpected behaviors, changes, or events that stand out from what we call normal. All right, so now that we know how to look for some of these IOCs, let's try putting this information into a larger context, and that's going to be the context of an attack. To have a context, the first thing that we have to define is something called TTPs, which stands for the tactics, techniques, and procedures used by a hacker to conduct an attack. As you can probably expect, we can only define TTPs for known attacks, for attacks that we have previously seen and documented. So let's see some examples of TTPs. For a distributed denial of service, we should be able to see unexpectedly high traffic, for example, and perhaps even a random distribution of connections from all over the world. For malware like viruses or worms, we might be able to look for high CPU usage, high memory consumption, high disk activity, and, in the case of self-propagating worms, abnormal local connections. To detect if we are being scanned as part of a reconnaissance process, we can look for a lot of scanning traffic
like half-open or very short-lived connections. If connections are attempted at the same time on a large number of ports, especially in sequence, you're probably looking at an attacker doing reconnaissance on your network. Now, attacks coming from APTs are usually going to be much more advanced, but regardless of how advanced they are, you should always be able to identify the remote control traffic used for communicating with the attacker's command and control center. Some of the TTPs here would be something like this. Port hopping, a technique that looks like a quick and apparently random change of ports for otherwise identical connections. You can also look for the hosts that are attempting such connections. Another one is fast flux DNS, that is, quickly changing the IP address that a DNS entry points to. The reason to look for this is that these mappings should be generally stable. DNS records shouldn't fluctuate so much, but hackers often change them to evade IP blacklists. Data exfiltration, stealing data from your company, can be detected in a number of ways. None of them is perfect, but you can look for alerts generated by access control systems protecting sensitive information, for example. You can also look for high bandwidth usage and high database usage, although these don't necessarily indicate exfiltration, and they don't necessarily happen every time an exfiltration attack happens. Sometimes just small data sets are exfiltrated. Protocol anomalies: this means looking for anything that doesn't play by the rules of the RFC of that specific protocol, so protocols that don't exactly follow their standard definition. We can also look here for a high volume of normally low-payload protocols, so protocols that we shouldn't expect to generate so much traffic in a normal environment, in a normal scenario. Things like ICMP, DNS, or NTP, that's right, the Network Time Protocol. These can also be used to conduct attacks and to exfiltrate data.
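To illustrate one of these TTPs, here is a small Python sketch that flags possible fast flux DNS by counting how many distinct IP addresses each domain resolved to. It assumes the observations were all collected within one short monitoring window, and the threshold of five addresses is an arbitrary value picked for the example. Legitimate services such as content delivery networks also rotate addresses, so a hit is a hint to investigate, not proof.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple


def detect_fast_flux(
    answers: Iterable[Tuple[str, str]], max_ips: int = 5
) -> Dict[str, int]:
    """Return domains that resolved to more than max_ips distinct addresses.

    answers is an iterable of (domain, resolved_ip) pairs, assumed to come
    from DNS logs covering a single short observation window.
    """
    ips_by_domain: Dict[str, Set[str]] = defaultdict(set)
    for domain, ip_address in answers:
        ips_by_domain[domain.lower()].add(ip_address)
    return {
        domain: len(ips)
        for domain, ips in ips_by_domain.items()
        if len(ips) > max_ips
    }


if __name__ == "__main__":
    # Made-up observations using documentation-only names and address ranges.
    observations = [("flux.example", f"198.51.100.{i}") for i in range(1, 9)]
    observations += [("stable.example", "192.0.2.10")] * 8
    print(detect_fast_flux(observations))
```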
So that's what we call TTPs: tactics, techniques, and procedures. All right, so we already have so much information about attacks, about reputation, about malware signatures and IOCs. And by the way, all this information is part of one big concept called threat modeling. But how do we describe all this information in a structured and meaningful way? How do you write this stuff down? How do you share it? How do you leave this wisdom behind? And how do you communicate it to your loved ones? Well, for describing, storing, and sharing this type of threat information, we have a language called STIX and a protocol called TAXII. Well, at least the names are easy to remember, right? So remember when we first talked about information sharing, indicators of compromise, and dissemination? Well, STIX is one standardized way of describing these findings and the relationships between them. STIX stands for Structured Threat Information Expression. At the moment of this recording, the current version of STIX is version 2, based on JSON. Version 1 was based on XML, so make sure you remember this for the exam, just in case they ask you. The structure of STIX is, of course, not arbitrary. It's a standard after all, so it's made up of something called SDOs, or STIX Domain Objects. You just have to know about them; you don't have to memorize them all. So, just to give you some examples of SDOs, they can be things like observed data: an IP address, a file property, something that would generally be detected and recorded by some monitoring system, automatically or manually. Indicators: these are the patterns of observables that can be identified and might be relevant to cybersecurity. Attack patterns: these are the known behaviors of an attacker. These are the TTPs, the tactics, techniques, and procedures that we covered just a minute ago. They're basically one way of saying, if this is happening to your system, then you're most likely under attack.
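As a sketch of what such objects can look like in JSON, the Python snippet below builds an indicator, a malware object, and a relationship tying them together, and wraps them in a bundle. It assumes the STIX 2.1 layout and uses only the standard library rather than a dedicated STIX package. The URL and the malware name are placeholders, and the helper function names are hypothetical.

```python
import json
import uuid
from datetime import datetime, timezone


def stix_id(object_type: str) -> str:
    """Build an identifier of the form '<type>--<UUID>'."""
    return f"{object_type}--{uuid.uuid4()}"


def stix_timestamp() -> str:
    """Current UTC time with millisecond precision and a trailing 'Z'."""
    now = datetime.now(timezone.utc)
    return now.strftime("%Y-%m-%dT%H:%M:%S.%f")[:-3] + "Z"


def build_bundle(url: str, malware_name: str) -> dict:
    """Describe a malicious URL that indicates a piece of malware.

    Assumes url contains no single quotes or backslashes, which would need
    escaping inside the pattern string.
    """
    now = stix_timestamp()
    common = {"spec_version": "2.1", "created": now, "modified": now}
    indicator = {
        "type": "indicator",
        "id": stix_id("indicator"),
        **common,
        "name": "Malicious URL",
        "pattern": f"[url:value = '{url}']",
        "pattern_type": "stix",
        "valid_from": now,
    }
    malware = {
        "type": "malware",
        "id": stix_id("malware"),
        **common,
        "name": malware_name,
        "is_family": False,
    }
    relationship = {
        "type": "relationship",
        "id": stix_id("relationship"),
        **common,
        "relationship_type": "indicates",
        "source_ref": indicator["id"],
        "target_ref": malware["id"],
    }
    return {
        "type": "bundle",
        "id": stix_id("bundle"),
        "objects": [indicator, malware, relationship],
    }


if __name__ == "__main__":
    # Placeholder values, not real threat data.
    bundle = build_bundle("http://malicious.example/payload", "demo-backdoor")
    print(json.dumps(bundle, indent=2))
```

The relationship object is what expresses that the indicator points at this particular malware, which is the "relationships between them" part of the STIX description above.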
Campaigns and threat actors are the actual adversaries or attackers that are launching the attack, along with the scope of the attack. The scope refers to whether it's just you being targeted or multiple organizations similar to yours all over the world, in which case we're going to call it a campaign. COA, or course of action: these are the instructions, or what you can do in order to minimize the damage of an attack or to resolve an incident. For example, if we wanted to find out more about how to describe a malicious URL, we could access this link, Indicator for Malicious URL, where we get a nice description of how this indicator can be modeled. We can also see that it is used in conjunction with a relationship that points the URL to a specific piece of malware, because that's why the URL was malicious in the first place. Now, if you scroll further, we can also find the JSON definition of this threat, which describes, in a list of objects, an indicator, which is the URL, a malware entry with a description and a unique identifier, and finally a relationship between these two entities. So if we have STIX to describe this data, we also have TAXII as a protocol to transport this threat intelligence. Basically, it's just a REST API with well-known API operations, just like any API, that transports threat information over HTTPS. Now, TAXII defines two types of services, or methods, for sharing this information, depending on how you want to implement it. The first one is a collection. This is an interface to a threat information database or repository. In the collection model, this data can be directly requested by the clients, and responses are of course returned from the server. This is the most basic description of pretty much any client-server model out there.
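The collection model just described can be sketched as a plain HTTPS GET request. The Python snippet below only builds the request and does not send it. It assumes a TAXII 2.1 server, and the host name, API root, and collection ID are placeholders, so a real server will have its own values and will normally require authentication as well.

```python
import urllib.parse
import urllib.request
from typing import Optional

# Media type used by TAXII 2.1 for requests and responses.
TAXII_MEDIA_TYPE = "application/taxii+json;version=2.1"


def build_objects_request(
    api_root: str, collection_id: str, added_after: Optional[str] = None
) -> urllib.request.Request:
    """Build (but do not send) a GET request for the objects in a collection."""
    url = f"{api_root.rstrip('/')}/collections/{collection_id}/objects/"
    if added_after is not None:
        # Optional filter: only objects added after the given timestamp.
        url += "?" + urllib.parse.urlencode({"added_after": added_after})
    return urllib.request.Request(
        url, headers={"Accept": TAXII_MEDIA_TYPE}, method="GET"
    )


if __name__ == "__main__":
    # Placeholder server and collection ID, for illustration only.
    request = build_objects_request(
        "https://taxii.example.com/api1/",
        "11111111-2222-4333-8444-555555555555",
    )
    print(request.get_method(), request.full_url)
    print("Accept:", request.get_header("Accept"))
```

Passing the request to urllib.request.urlopen against a real server would return a JSON envelope containing STIX objects, which is the request-and-response pattern of the collection model.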
Now, the channel implementation works slightly differently, in that it relies on data being pushed to the clients instead of being requested. Clients just subscribe to a TAXII data feed, and whenever there's new information available out there, it gets automatically pushed to them, kind of like a file synchronization service. Think about Dropbox, for example. Well, is there any place where you can see it in action, where you can access some STIX files or TAXII feeds right now? For some freely available IOC definitions, there is the open project called OpenIOC. It uses an XML format, and the definition includes logical statements about file names, DNS domains, IP addresses, string patterns, and so on. Now, FireEye also provides a Windows utility compatible with OpenIOC, which is called the IOC Editor. You can use it to visualize those files in a much nicer interface. You can see here all the observables that are found within the file and the logic between them, and of course you can also edit your own IOC definitions. You can add your own logic here and your own observables. For example, a network DNS name like malware.com, or let's say you would also want to customize it with a specific port number, like a local port with any number you want, and you can see the definition gets updated automatically. Of course, at the very end you can just save this and export it. MISP, or the Malware Information Sharing Platform, should sound just a bit familiar to you, because we've talked about it in the video about intelligence sharing, but MISP can also work with STIX SDOs and OpenIOC. Also in the same video, if you remember, we've covered IBM's X-Force Exchange, which is a cloud application that works with TAXII and can automatically export known threats in this STIX language. All right, so for the exam, make sure you understand clearly what an IOC, an indicator of compromise, is, what exactly we should look for, and the methods that we have for detecting malware and such indicators, and also be sure
that you are able to explain what the STIX language and the TAXII protocol are. So thanks for watching. Don't forget to subscribe to Certified Breakfast, and see you in the next video.