Transcript for:
Cracking Caesar Cipher with Frequency Analysis

In this lecture we are going to talk about how to crack the Caesar cipher with the help of frequency analysis. Basically, we are going to use the same source code we have seen in the previous lecture: we iterate through the text and count the frequency, so the number of occurrences, of every single letter. As you can see, we keep incrementing the value. We use a dictionary where the key is the letter itself, so the given character, and the value is the frequency, so the number of occurrences in the given text. Then we plot this distribution with the help of matplotlib's pylab.

And we use this message: this is the ciphertext we would like to decrypt. So what do we have to do? This is what we have been talking about in the theoretical section. First, we have to calculate the relative frequency distribution of the ciphertext letters. Then we have to get the second most frequent letter, because the first one is going to be the whitespace. Finally, we can get the key based on a very simple formula: the key is the value of the second most frequent ciphertext letter minus the value of E.

This is what we have been talking about: the Caesar cipher shifts all letters with the same key, which means that this operation doesn't alter the distribution of the letters. The distribution is going to be approximately the same, but it is going to be shifted by a given number of letters. So we are going to analyze the frequency distribution in order to find the key for the Caesar cipher.

So this is the ciphertext, and we just have to calculate the relative frequency of the letters within it. If we run the Caesar crack frequency algorithm, we get the frequency distribution of the letters. As we have discussed, the most frequent character is the whitespace, so we are after the second most frequent one, and that is the letter I. What does it mean? In the original English text the most frequent letter is E, and in the ciphertext the most frequent letter is I. So we can come to the conclusion that E is transformed into I, F is transformed into J, and so on: we shift every single letter one, two, three, four steps to the right. It means that the value of the key in the Caesar cipher is equal to 4.

So if we decrypt this ciphertext with key 4, then we should get the original plaintext. Let's test whether it's working fine or not. As you can see, it is working fine, so this is the original text: "My name is Balazs Holczer, I'm from Budapest, Hungary." You may pose the question: why are there Z values? Basically, it is because of the special characters, such as the exclamation mark and the comma. As you can see, these special characters are transformed into a letter, but the text itself is going to be the same: "My name is Balazs Holczer, I'm from Budapest, Hungary. I'm qualified as a physicist", and so on. So this is the plaintext.

With the help of frequency analysis we can find out which letter is the most frequent in the ciphertext. If we know which ciphertext letter stands for E, the most frequent letter in English, then we can make a good guess at how many characters to shift. In this case the distance between letter I and letter E is 4, so we can come to the conclusion that the key in the Caesar cipher is equal to 4, and this is exactly what's happening.

So we are able to crack the Caesar cipher in a brute-force manner, and we are able to crack the Caesar cipher with the help of frequency analysis. This is exactly what we have been talking about: we are able to crack the Caesar cipher because of information leaking. What is the information? It is the relative frequency distribution of the letters in the ciphertext. This distribution is the information leak itself.

What's extremely important is that the Caesar cipher will not be more secure if we repeat the operation. For example, if we use Caesar encryption with key 2 and then Caesar encryption with key 3, the cryptosystem is not going to be more secure, because it is the same as using a single Caesar encryption with key 5. So using Caesar encryption several times is not going to make the cryptosystem more secure. There are other approaches, such as the Data Encryption Standard (DES) or the Advanced Encryption Standard (AES); for those algorithms, if we use encryption several times, the cryptosystem will be more secure. As far as the Caesar cipher is concerned, it will not be more secure if we repeat the operation several times.

So that's all about the Caesar cipher and cracking the Caesar cipher. Thanks for watching.
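
As a recap, the steps from this lecture can be sketched in a few lines of Python. This is only a minimal sketch, not the source code shown in the video. The names `ALPHABET`, `caesar_shift`, `letter_frequencies` and `guess_key` are made up for illustration, and so is the sample sentence. It also assumes a 26-letter uppercase alphabet where whitespace and punctuation are left unchanged, which may differ from the lecture's implementation. Plotting is left out to keep the example self-contained.

```python
from collections import Counter

# Assumption: a 26-letter uppercase alphabet; every other character
# (whitespace, punctuation) passes through unchanged.
ALPHABET = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"


def caesar_shift(text, key):
    """Shift every letter by `key` positions; a negative key decrypts."""
    shifted = []
    for ch in text.upper():
        if ch in ALPHABET:
            index = (ALPHABET.index(ch) + key) % len(ALPHABET)
            shifted.append(ALPHABET[index])
        else:
            shifted.append(ch)
    return "".join(shifted)


def letter_frequencies(text):
    """Count the occurrences of letters and whitespace in the text."""
    return Counter(ch for ch in text.upper() if ch in ALPHABET or ch == " ")


def guess_key(cipher_text):
    """Guess the key as (most frequent ciphertext letter) minus E."""
    frequencies = letter_frequencies(cipher_text)
    # Skip the whitespace entry, which usually tops the list, so we end up
    # with what the lecture calls the second most frequent letter.
    letters_only = [ch for ch, _ in frequencies.most_common() if ch != " "]
    most_frequent_letter = letters_only[0]
    distance = ALPHABET.index(most_frequent_letter) - ALPHABET.index("E")
    return distance % len(ALPHABET)


# Hypothetical sample text in which E is clearly the most frequent letter.
plain_text = "MEET ME NEAR THE GREEN TREE WHERE THE RIVER BENDS"
cipher_text = caesar_shift(plain_text, 4)

key = guess_key(cipher_text)
recovered = caesar_shift(cipher_text, -key)
print(key, recovered)  # prints: 4 MEET ME NEAR THE GREEN TREE WHERE THE RIVER BENDS

# Repeating the cipher adds no security: key 2 followed by key 3 is key 5.
twice = caesar_shift(caesar_shift(plain_text, 2), 3)
once = caesar_shift(plain_text, 5)
print(twice == once)  # prints: True
```

The guess only works when E really is the most frequent letter of the plaintext, so on short or unusual texts it can fail, and then the brute-force approach is the fallback.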