Transcript for:
Understanding System Logging and Analysis

A very important source of information is the actual logs generated by your endpoints, your servers, your virtual machines, basically anything that runs a variation of Windows, Linux, macOS, or any other operating system and is connected to your network. And we have a number of categories when it comes to logging. First of all, we have event logs for an operating system. An event log is generated when something happens. That's it. When it happens, just write it down.

Now, for Windows we do have a couple of event log categories. First of all, we have the application logs; these are generated by applications and services, for example when a service cannot start or returns an error when it attempts to start. We also have security logs; these are events that deal with authentication processes, privileged access, or someone attempting to gain more privileged access on that system. And we also have system logs, which address operating system internals, device components, hardware failures, and other issues like these. We also have setup logs, generated during installation of specific software packages. And finally, we have forwarded logs. These are events that don't belong to us, but we are collecting them because they have been forwarded to us by some other computer out there.

Now, most logging also includes a severity scale to be able to quickly jump to what is really critical, and Windows has a very simple scale. It starts from the informational level, basically just things that happened successfully, new things that don't really warrant any kind of attention. The warning level is not necessarily an error, but covers events that might cause you some concern or some problems later on. Errors are important problems that usually mean that something doesn't work or has failed. And audit success or audit failure events, these are just for the security log; they tell us that a user or a device passed or failed some sort of security policy.

Now, on Linux, the system will by default store most of its logs in /var/log, and they're going to be in text format. Some newer distributions using systemd require the journalctl command to view the systemd logs, because nowadays they are stored in binary format. On macOS, we have the somewhat awkwardly named Console app, which is just an interface to view the logs. In most operating systems, the console means something completely different; on macOS, it's just the interface to view the system logs. This changes quite often, and there's no consistent formatting over there on macOS. Most of them are just text logs, so they're very similar to the Linux ones.

And after collecting all this information, here comes the question: how do we use these logs to determine if something bad has happened? What should we be monitoring, or what should we be looking for? Well, this all falls under the term log analysis, and we basically call this correlation. That is, we're not just looking at individual events, but we try to connect information from multiple events happening at the same time. For example, 10 failed login events are not the same as 10 failed login events followed by one successful one, all right? We also have to look for configuration changes in applications or appliances and ask ourselves, or have some smart tool out there that determines: was this event supposed to happen? Did we expect this to happen? Was this part of a change management process?
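To make the correlation idea concrete, here is a minimal Python sketch, assuming a simplified, made-up event stream of (timestamp, user, outcome) tuples rather than any real Windows or Linux log format: it flags a user who logs in successfully right after a burst of failures, which is exactly the "10 failed plus 1 successful" pattern mentioned above.

```python
# Minimal correlation sketch: flag a burst of failed logins followed by a success
# for the same user. The event format here is hypothetical: (timestamp, user, outcome).
from datetime import datetime, timedelta

events = [
    (datetime(2024, 5, 1, 9, 0, s), "alice", "failure") for s in range(10)
] + [(datetime(2024, 5, 1, 9, 0, 11), "alice", "success")]

WINDOW = timedelta(minutes=5)
THRESHOLD = 10

def correlate(events):
    """Report users with THRESHOLD+ failures inside WINDOW followed by a success."""
    failures = {}  # user -> list of failure timestamps
    for ts, user, outcome in sorted(events):
        if outcome == "failure":
            failures.setdefault(user, []).append(ts)
        elif outcome == "success":
            recent = [t for t in failures.get(user, []) if ts - t <= WINDOW]
            if len(recent) >= THRESHOLD:
                print(f"ALERT: {user} succeeded after {len(recent)} recent failures at {ts}")

correlate(events)
```

Taken one by one, each failed login is unremarkable; only by keeping state across events does the successful login become suspicious, and that is the whole point of correlation.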
We should also look for gaps in time, because hackers usually take care to cover their tracks by deleting any logs that were generated while they were acting upon their targets. So if we are not able to find the logs that were generated during the attack, we should at least be able to spot a gap, a lack of logs for a specific time frame. And we can even look at trend analysis from a high-level perspective when collecting logs, for example how many logs are generated per second, per minute, per hour on a normal work day or outside working hours. Any deviation from these statistics might indicate that something abnormal is happening.

And we simply cannot mention logging without mentioning the syslog protocol, which is a very old one but still very useful, usable with pretty much any network device out there. It was actually designed to work in a client-server environment, where a client that generates the log data packages it and sends it to a log collection device or appliance over the network. Nowadays, it is a standard for logging in data center hardware and for Linux; you're not going to get it out of the box with Windows, so it's mostly associated with Linux systems. Now, just like most network protocols, syslog was originally designed without security in mind. By default, it's going to work on UDP port 514, which provides no confidentiality, no integrity, and due to the limitations of UDP, not even acknowledgements for receiving and processing that information. Fortunately, newer versions of syslog can use TCP for reliable delivery with acknowledgements, and can implement TLS (typically on port 6514) for encryption, along with MD5 or SHA hashing for message integrity. Now, it is a very good idea to use syslog to store logs somewhere else, maybe even archived off-site. And this is a very important topic for the exam as well, archiving logs off-site, as it makes it more difficult for the attacker to cover their traces. They might clear the logs on the local machine, but they will not be able to do so on the remote collection device as well.

A syslog entry, a syslog message, kind of looks like what you can see here on the screen. And we do have a specific structure to it, starting with a header, which includes a timestamp and perhaps the IP address that the syslog message was generated from. Then there is a facility, that is, the entity that created the entry. It can be between 0 and 23, and usually we're going to see some predefined numbers: 0 means the kernel, 1 means user-level processes, 2 is the mail system, 3 is for system daemons, but these are not exactly standardized across all the vendors out there. We also have the severity, ranging from 0, which is an emergency message, to 7, which is a debug message. So 0 is the worst and 7 is the one we can usually ignore, or use it as a filter to surface the critical messages. And finally, there's the message itself. This is the only non-standard part; depending on the vendor and the device, you might find clear text in here, JSON, XML, CSV data, whatnot. You can also easily view these fields in Wireshark, because syslog is clear text, so there's nothing hidden here.
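As a small illustration of how the facility and severity travel together in a raw syslog message: the header starts with a <PRI> number where PRI = facility × 8 + severity, so both values can be recovered with a single divmod. The sketch below assumes that standard encoding; the sample message itself is made up.

```python
# Sketch: decode the <PRI> value at the start of a raw syslog message into
# facility and severity (PRI = facility * 8 + severity).
import re

FACILITIES = {0: "kernel", 1: "user", 2: "mail", 3: "daemon"}   # first few of 0-23
SEVERITIES = ["emergency", "alert", "critical", "error",
              "warning", "notice", "informational", "debug"]     # 0-7

def decode_pri(raw: str):
    match = re.match(r"<(\d{1,3})>", raw)
    if not match:
        raise ValueError("no <PRI> field found")
    pri = int(match.group(1))
    facility, severity = divmod(pri, 8)
    return FACILITIES.get(facility, f"facility {facility}"), SEVERITIES[severity]

# <86> = facility 10 (authpriv) * 8 + severity 6 (informational)
print(decode_pri("<86>May  1 09:00:11 host sshd[1234]: Accepted password for alice"))
```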
On the last line here on the screen, you can see the actual syslog message interpreted by Wireshark. If you expand this entry, you're going to see that the first part belongs to the facility, then comes the level, the severity, which in this case is just informational, level 6, followed by the message, which in this case includes a timestamp, a hostname, a process ID, and the actual message, which, as I said before, can be absolutely anything.

Don't forget that we also have firewall logs, which are often enough the most important type of logs, because the firewall keeps track of all the connections: who is sending, who is receiving, which protocols and ports are being used, how much data is being transferred, how much bandwidth is used. And if address translation, NAT, is performed on the firewall, it can also help us with identifying internal devices that were involved in a security incident, because we can actually pinpoint the private IP address that was involved. So firewalls are a very important line of defense, which means that the logs generated by them can be very useful, especially because we should be interested in the traffic that was dropped, first of all, and secondly, in the traffic that was permitted. That's what makes sense to log as far as the firewall is concerned. We also get the chance to look at statistical information, like the top protocols that were observed or detected in the network, or the number of protocol anomalies, which might indicate attacks. Higher-level firewalls, like next-generation firewalls or application-level firewalls, can also log application-layer information, like the websites that were accessed during a connection. Another useful type of statistic is how much traffic goes through the network, inbound or outbound, and part of those statistics should also be the address translation logs, for tracing the internal hosts under investigation.

Now, of course, centralized storage of logs is recommended, because at some point the devices that generate them will run out of space and will start deleting, replacing, and rotating old log files. Centralized storage is very useful if you need to investigate an incident some time after it happened. But be aware that not everyone uses the same logging format, and be aware that sometimes operating systems or appliances don't even generate logs by default. For example, the Windows Firewall doesn't generate logs by default; it doesn't write anything to a log file. You can enable this, but it is disabled by default.

We also have to be concerned with proxy logs, and we have two types of proxies. By the way, most of the time proxies are used for HTTP, so web traffic. The first type is the forward proxy. These are for outbound traffic, and they are used by our internal users: their HTTP requests are intercepted and then forwarded to some outside destination. They're useful in environments where traffic must comply with some strict security policies. We also have two subtypes here as well. We have transparent proxies, which are able to intercept the client traffic without any configuration required on the client side, so they're basically invisible, or transparent. And finally, we have non-transparent proxies, which means that the client must be configured with the proxy's address in their browser or in their operating system in order to be able to communicate with it.
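Here is a rough sketch of the kind of summary you might pull out of firewall logs, assuming a simplified, invented field layout (real firewalls each have their own format): it counts permitted versus dropped connections and tallies the top destination ports.

```python
# Sketch: summarize a few firewall log entries. The field layout here is invented;
# real firewalls each use their own format. Counts actions and top destination ports.
from collections import Counter

log_lines = [
    "2024-05-01T09:00:01 DENY  tcp 203.0.113.7:51514 -> 10.0.0.5:3389",
    "2024-05-01T09:00:02 DENY  tcp 203.0.113.7:51515 -> 10.0.0.5:3389",
    "2024-05-01T09:00:03 ALLOW tcp 10.0.0.21:44322   -> 93.184.216.34:443",
    "2024-05-01T09:00:04 ALLOW udp 10.0.0.21:5353    -> 224.0.0.251:5353",
]

actions, dst_ports = Counter(), Counter()
for line in log_lines:
    _, action, proto, src, _, dst = line.split()
    actions[action] += 1
    dst_ports[(proto, dst.rsplit(":", 1)[1])] += 1

print("actions:", dict(actions))               # how much was dropped vs permitted
print("top ports:", dst_ports.most_common(3))  # which services attract the traffic
```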
So reviewing forward proxy logs, why is this useful? Well, you will get to see which websites your users are accessing, and even the contents of each request and reply. Of course, this can also point out any malicious traffic that might be trying to reach some outside destination.

And then we have reverse proxies. These are for inbound traffic, and they're going to listen for requests coming from the outside, intercept them, and forward them to an internal server in your network. Now, in real life, reverse proxies are actually sold and marketed as load balancers, because they usually serve a cluster of servers for a specific website or application. You don't think that Netflix runs on a single server, right? Of course, intercepting inbound traffic is definitely going to imply some security scanning and decisions: look at the traffic, try to determine if the request is malicious or not, or even try to detect a denial-of-service attack. There is your reason for logging reverse proxy traffic as well, to get this level of visibility into potential security threats.

And we have even more logs, starting with web application firewall logs. Now, a web application firewall is just like the name says, a firewall for a web application. Obvious, right? Well, the name is actually a pretty exact definition. Think of a normal firewall: it looks at the TCP/IP traffic and, based on some rules, permits or denies it. Well, a web application firewall does the same thing for web applications, but looks at the application-level request to decide if that request is allowed or denied. So it's basically also a firewall, just focused on layer 7, a firewall that only looks for web attacks: things like malformed inputs, SQL injection attempts, buffer overflows and buffer overruns, brute-force login attempts, XML or JSON or any other type of injection attacks, anything that looks suspicious in a request that is about to reach a web application, including cross-site scripting and cross-site request forgery, pretty much all the web attacks that can be detected over the network. We have some sort of signature inside of our web application firewall in order to catch those before they reach the web application behind it. And just like with any other kind of firewall, it is useful to log any denied requests. Let's see what the firewall decides is malicious, what we don't want in our network or in our web application; it might be an indicator of an attack.

Now, here's a nice example of a web application firewall log. If we scroll down here to the right as well, we should be able to find some entries that point to specific attacks, like an SQL injection attack or a cross-site scripting one. Let's drill down into one of these and see which request was the critical one. There it is, right here. You can see we had a GET request that included, as a parameter, something that looks like an SQL injection attempt. So this was immediately detected by the web application firewall, along with the page it was detected on, the timestamp, the website, and so on. So just like a normal firewall, but the rules apply only to web applications and web attacks.

And of course we need to mention IDSs and IPSs, because they're a special kind of security device. They are basically an analysis engine that receives packets or summaries of traffic, but most of the time they will receive the actual packets.
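To illustrate the kind of matching a WAF rule performs on a request before it reaches the application, here is a naive Python sketch. The two regexes and the request lines are illustrative assumptions, nowhere near a real WAF rule set, but they show the idea of inspecting request parameters at layer 7 and denying anything that matches a signature.

```python
# Sketch: the kind of naive pattern matching a WAF rule performs on request parameters.
# Real WAF rule sets are far more elaborate; these two regexes are illustrative only.
import re
from urllib.parse import urlparse, parse_qs

SIGNATURES = {
    "sql-injection":        re.compile(r"('|%27).*(or|union|select)", re.IGNORECASE),
    "cross-site-scripting": re.compile(r"<\s*script", re.IGNORECASE),
}

def inspect(request_line: str):
    """Return (verdict, matched signature) for a logged GET request line."""
    params = parse_qs(urlparse(request_line).query)
    for name, values in params.items():
        for value in values:
            for sig, pattern in SIGNATURES.items():
                if pattern.search(value):
                    return "DENY", f"{sig} in parameter '{name}'"
    return "ALLOW", None

print(inspect("/products?id=10' OR '1'='1"))            # DENY, sql-injection in 'id'
print(inspect("/search?q=<script>alert(1)</script>"))   # DENY, cross-site-scripting in 'q'
print(inspect("/products?id=10"))                       # ALLOW
```

The DENY lines are the ones worth logging and reviewing, for the same reason dropped traffic matters most in a regular firewall log.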
And they use a set of rules to match suspicious traffic patterns, and they will either generate an alert, in the case of an IDS, or even attempt to block that traffic or reset the connection, in the case of an IPS. So traffic matching based on rules is not a new thing. How is this IPS/IDS stuff any different from a firewall? Well, an IDS or an IPS is focused on determining malicious intent, and they are usually placed behind the firewall, so they analyze traffic that was already signed off as being okay, as being acceptable from the firewall's perspective. Instead of looking at TCP or UDP sessions or at specific IP addresses, IPSs and IDSs try to detect signatures that might indicate attacks: things like malformed packet headers, buffer overflows, specific abnormal sequences of packets, contents that might prove malicious, and so on. IPS functionality is nowadays often no longer implemented as a dedicated appliance or physical equipment, but is included as part of the functionality built into next-generation firewalls, so we get the IPS functionality right within our traditional, let's say, network firewalls.

Now, they're most likely going to generate a lot of logging activity, especially IDSs, which can only alert. And in most cases, they're going to need some rule tweaking to match your environment, because out of the box they're going to alert about everything. So we need somebody to look at those rules and start teaching the IPS or IDS device: this kind of behavior is acceptable, don't go crazy if you see something like this in our network, this is normal for us, okay? And there's also a saying here, which I personally heard a couple of years ago from someone from Cisco: an IPS without eyes is useless. Someone has to monitor that IPS or that IDS; otherwise, it just sits there and nobody knows what it finds inside of that network traffic. Usually this information is ingested by SIEMs, security information and event managers, and we're going to have a special discussion about SIEM appliances, I believe right in the next video.

Now, one of the most well-known open-source IDS solutions out there is called Snort, and the Snort engine is actually now embedded in a number of commercial IPS solutions. So many, many vendors are basing their solutions on the Snort engine and then enriching it with additional signatures and additional functionality. But for the exam, you should be able to identify at least a couple of items within a sample Snort IDS signature. A Snort signature kind of looks like this. Don't worry if it looks complicated; it's much easier to read than it seems. It's basically stating that it is supposed to generate an alert whenever traffic between the external network and the home network on port 143, the IMAP port, matches the parameters below. So normally, as part of the Snort configuration, we would define what the external network looks like and which IP addresses or network interfaces belong to the internal network, so that Snort knows exactly what the outside-to-inside flow of packets is.
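Since the actual rule is only shown on screen, here is a rough sketch built around a representative Snort-style rule of the same shape as the one being described; the rule text, SID, and options below are illustrative reconstructions, not the exact rule from the video. The Python snippet simply pulls out the handful of fields worth recognizing for the exam.

```python
# Rough sketch: pull a few notable fields out of a Snort-style rule.
# The rule text below is a representative IMAP brute-force rule; the SID and exact
# option values are illustrative, not the rule shown in the video.
import re

rule = ('alert tcp $EXTERNAL_NET any -> $HOME_NET 143 '
        '(msg:"PROTOCOL-IMAP login brute force attempt"; '
        'content:"LOGON"; '
        'detection_filter:track by_dst, count 30, seconds 30; '
        'classtype:suspicious-login; sid:1000001; rev:1;)')

header, options = rule.split("(", 1)
action, proto, src, src_port, _, dst, dst_port = header.split()

print("action:", action)                 # alert -> IDS behaviour, not a block
print("flow:", f"{src}:{src_port} -> {dst}:{dst_port}")
for opt in ("msg", "content", "detection_filter", "classtype", "sid"):
    match = re.search(rf"{opt}:(.*?);", options)
    if match:
        print(f"{opt}: {match.group(1).strip()}")
```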
Now, in this case, the signature is designed to detect IMAP brute-force attempts on logon. It's going to match on the keyword LOGON under the content section. The detection filter says that we're going to track this by destination, so we're going to count all the packets headed to the same destination, because that's the server the brute-force attack is most likely being attempted against. We're going to fire up this alert if we detect at least 30 attempts in 30 seconds, so we have a count of 30 over 30 seconds. The rest of the information here basically says that the rule set belongs to the community package, so you do have access to some default community rules for free if you want to use Snort inside of your own network. The service is, of course, IMAP, as we saw before. There's also a reference URL in there from MITRE, with the attack technique documented online, so you're going to find this IMAP brute-force attempt as one of the techniques documented in the MITRE database. The class type is considered to be a suspicious logon attempt, of course. And we also have an SID, which is the signature identifier, because this is going to show up in pretty much all the log messages that are generated whenever traffic matches this signature. So not so hard to read, actually, right? It kind of tells a story if you pay close attention to it.

Okay, a lot of content in this video, but make sure you understand all the potential sources of logs, the potential formats, how they should be stored, and how they should be transferred. Know just a bit about the syslog protocol; the exam is most likely going to ask you at least one question about it. Make sure you are able to explain the difference between a firewall log, a proxy log, an IDS log, and a web application firewall log, and what they should look like, just by thinking about what type of device that is and what kind of traffic or what type of malicious intent it is designed to detect. All right. So I hope you found this useful and informative. Thank you so much for watching. See you next time, when we're going to talk about SIEMs, that is, security information and event management solutions. Like and subscribe and see you next time.