Cryptography is a technology we use many times a day, but we often don't even think about what's happening behind the scenes. If we encrypt data and we send it out to someone else, is that information really secure? And how would we know if that information is secure?
We need to investigate how the cryptography works to be able to understand how secure it might be. With most cryptography, the piece of information that is the difference between something being secure and insecure is the key that's being used during the encryption process. But very often, the attackers don't have access to your encryption or decryption key, and they have to find other means to gain access to that data. Instead of trying to find the key to the safe, they'll start attacking the safe itself.
So we have to examine the cryptography that we're using to make sure that it is secure and can't be attacked in any other way. One interesting part of cryptography is we tend to make these protocols and algorithms public so that everyone can examine exactly the way this cryptography works. This also means that we should be able to find weaknesses or workarounds within the cryptography that would allow us access to the data without the key.
Of course, if we do find a workaround, we immediately choose not to use that cryptography any longer. This means that the cryptography we use today has withstood the test of time, and we can continue to trust these algorithms to protect our data.
Since we know the algorithms are secure, the attackers now focus on the implementation of the cryptography. And very often, it's the incorrect implementation of the cryptography that ultimately provides the weakest link that allows the attacker to gain access to our data. Let's first look at attacks that we can do against the algorithms themselves. The first attack type we'll look at is a birthday attack. And here's the question.
In a classroom of 23 students, what is the chance of two students sharing a birthday? The answer is about 50%. That means if you've got 23 students in a room, you've got a 50-50 chance of someone in that room sharing their birthday with someone else.
Remember, we're not asking if one particular student is sharing their birthday with someone else in the room. We're trying to find out if any student is sharing their birthday with any other student. That means if you increase the number of students to about 30, the chance goes up to 70%. In the world of cryptography, the same thing applies except on a much larger scale.
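To see where those percentages come from, here is a minimal sketch of the classroom math. It assumes 365 equally likely birthdays and ignores leap years, and the function name is just an illustration. It multiplies out the chance that everyone's birthday is different and subtracts that from 1.

```python
def shared_birthday_probability(people: int, days: int = 365) -> float:
    """Chance that at least two of `people` share a birthday.

    Assumes every day is equally likely (an idealization).
    """
    p_all_distinct = 1.0
    for i in range(people):
        # The (i+1)-th person must avoid the i birthdays already taken.
        p_all_distinct *= (days - i) / days
    return 1.0 - p_all_distinct


for n in (23, 30):
    print(n, round(shared_birthday_probability(n), 3))
# 23 0.507
# 30 0.706
```

The same function works for any number of possible values, so you can swap `days` for the number of possible hash outputs to get a feel for the cryptographic version of the problem.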
In cryptography, the birthday attack is about finding a hash collision, where you have two different plaintexts and they both result in exactly the same hash. This is usually found through a brute force process, which means that you would have to try a very large number of plaintexts and compare the resulting hashes to see if you ever have any duplicates. One way to prevent the attacker from using this method of finding multiple identical hashes is to use a very large hash output size.
The larger the hash, the more difficult it will be to duplicate that specific hash. Ideally, if we have two different plaintexts, we should have two different hashes. If we have a single hash that is exactly the same across two different plaintexts, then we have a collision.
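Here is a rough sketch of that brute force search. A real hash output is far too large to search this way, so this example assumes a deliberately weakened hash: it keeps only the first few bits of SHA-256. The helper names `truncated_hash` and `find_collision` are made up for this illustration.

```python
import hashlib
from itertools import count


def truncated_hash(data: bytes, bits: int) -> int:
    """Keep only the top `bits` bits of SHA-256 (a toy, weakened hash)."""
    digest = hashlib.sha256(data).digest()
    return int.from_bytes(digest, "big") >> (256 - bits)


def find_collision(bits: int):
    """Hash the messages b"0", b"1", ... until two share a truncated hash.

    Returns (first_message, second_message, messages_tried).
    """
    seen = {}  # truncated hash -> message that produced it
    for i in count():
        msg = str(i).encode()
        h = truncated_hash(msg, bits)
        if h in seen:
            return seen[h], msg, i + 1
        seen[h] = msg


for bits in (8, 16, 24):
    a, b, tried = find_collision(bits)
    print(f"{bits}-bit hash: {a!r} and {b!r} collide after {tried} messages")
```

If you run this, the number of messages needed should climb quickly as the output size grows, which is the reason a larger hash is harder to attack.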
Unfortunately, this is exactly the situation we found with the MD5 hashing algorithm. This is the Message Digest Algorithm 5. This was first published in April of 1992, and researchers found collisions in 1996. This became much more important in December of 2008, when researchers created a certificate that appeared legitimate when the MD5 hash was validated. This means they were able to create a certificate that appeared to be digitally signed by a certificate authority, but in reality, it had never been signed by the CA.
So the industry very quickly realized that MD5 would not be a good hashing algorithm to use in the future. And we very quickly transitioned to using different types of hashing algorithms. Here's what a hash collision really looks like. I have two types of plaintext that I'm going to put into an MD5 hashing algorithm. And you can see that the plaintexts are almost identical.
You can see they both start with d131dd, and everything shown in black letters is identical between both of those plaintexts. But there are minor differences between the plaintexts.
You can see that the letters that have a bold red are different between both of those plaintexts. Because these are different, we would expect the resulting hash to also be very different. But if you put both of these plaintexts into the MD5 algorithm, you end up with exactly the same hash, which means we found a collision. In our previous example, it was the MD5 algorithm itself that had shortcomings that would create these hash collisions. But earlier, I also mentioned that it's the implementation of the cryptography that can often create these types of attacks.
For example, the downgrade attack is an attack type that targets a perfectly secure algorithm, but it's the implementation of the algorithm that makes the attack possible. The purpose of a downgrade attack is to get the two devices that are trying to send encrypted data to either use a weaker encryption algorithm or not encrypt any of the data at all. One common form of downgrade attack is SSL stripping. This is a combination of an on-path attack, where you're sitting in the middle of a conversation, and a downgrade attack that you're able to perform because you're sitting in the middle of that conversation.
Like most on-path attacks, this can be challenging to implement because the attacker really does need to be in the middle of the conversation.
But if the attacker is able to sit in the middle of this conversation, it can tell the victim's browser that the page they're trying to visit really is not encrypted, so there's no need to request an encrypted form of the page. Simply send all of the data across the network without any type of encryption. This means the victim will use the non-encrypted HTTP protocol rather than the more secure encrypted HTTPS. Here's how SSL stripping works.
Normally, you would have a website visitor that communicates directly to a web server. But with SSL stripping, there is an on-path attack. So there's an attacker that sits in the middle of this conversation that normally would not be there.
It's this attacker in the middle who's stripping out the encryption part of the HTTPS to provide simply HTTP to the victim. Let's go through the steps that a website visitor might use to log into a web server. But let's include the attacker in the middle to do the SSL stripping.
The first step is for the website visitor to send a GET request to the web server. This GET request is made over HTTP instead of HTTPS, which starts the process of the SSL stripping. Although normally this is changed by the web server, the initial request being made with HTTP means that the attacker can now take advantage of SSL stripping.
The attacker will act as a proxy in the middle of this conversation. And for the very first request, which is that HTTP request, it will simply pass through that request to the web server. The web server recognizes that this request is being made in the clear. There's no encryption because the user is using HTTP.
So it will send a message back to the user saying, let's use a different web page that includes HTTPS. Obviously, the attacker doesn't want that information to get to the user, so it simply doesn't send that answer back to the user's workstation. Instead, the attacker sends a second request to the web server, but this one has the HTTPS included as part of the conversation.
Since the attacker is the one initiating this conversation, it has complete access to all of the data that would go back and forth over that encrypted connection. So the web server sees the proper request is being made to the HTTPS page, and it sends back an OK message saying, everything looks great. The attacker then sends the OK message to the website visitor.
That's the response to the initial GET that occurred. And notice that it is sent over HTTP and not the HTTPS that the web server used. The second step would be for the user to log in.
So they'll send their username and password information. But again, they're sending it over HTTP because they were never redirected to HTTPS. The attacker sees this information being sent, can read everything in the request, including the username and password, and simply uses that username and password to log into the web server with the appropriate HTTPS. The web server responds with an acknowledgment that the login was successful, and the attacker simply passes that acknowledgment back down to the visitor.
From this point on, all subsequent communication will take place in the clear between the website visitor and the attacker. And of course, all of the information between the attacker and the web server will continue to run over the encrypted HTTPS. This means that anything that is sent back and forth between the visitor and the web server can be captured, viewed, and in some cases modified by the attacker sitting in the middle of this SSL stripping.
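The steps we just walked through can be sketched as a small simulation. This is only a toy model with no real networking: the `WebServer` and `StrippingProxy` classes, the `example.test` host name, and the login details are all hypothetical, and the point is just to show which side of the conversation sees what.

```python
from dataclasses import dataclass, field


@dataclass
class Response:
    status: int
    body: str = ""
    location: str = ""


class WebServer:
    """Toy server: redirects plain HTTP to HTTPS, serves pages over HTTPS."""

    def handle(self, scheme: str, method: str, path: str, body: str = "") -> Response:
        if scheme == "http":
            # "Let's use a different web page that includes HTTPS."
            return Response(301, location="https://example.test" + path)
        if method == "POST" and path == "/login":
            return Response(200, body="login successful")
        return Response(200, body="login page")


@dataclass
class StrippingProxy:
    """Toy on-path attacker: HTTP toward the victim, HTTPS toward the server."""

    server: WebServer
    use_https: bool = False
    captured: list = field(default_factory=list)

    def handle(self, method: str, path: str, body: str = "") -> Response:
        # Everything the victim sends arrives in the clear and can be read.
        self.captured.append((method, path, body))
        scheme = "https" if self.use_https else "http"
        reply = self.server.handle(scheme, method, path, body)
        if reply.status == 301:
            # Hide the redirect from the victim and repeat the request over HTTPS.
            self.use_https = True
            reply = self.server.handle("https", method, path, body)
        return reply


proxy = StrippingProxy(WebServer())
page = proxy.handle("GET", "/login")  # victim's first request, made over HTTP
login = proxy.handle("POST", "/login", "username=alice&password=example")

print(page.status, page.body)    # the victim never sees the 301 redirect
print(login.status, login.body)
print(proxy.captured[-1])        # the credentials the attacker read in the clear
```

Notice that the victim only ever receives OK responses, so nothing in the conversation itself tells them the redirect to HTTPS was removed.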