Transcript for:
Understanding HTTP and HTTPS Fundamentals

Hey everyone, welcome back to another video here on TryHackMe. I'm John, and today we're going to be taking a look at the room HTTP in Detail: learn about how you request content from a web server using the HTTP protocol. If you watched the DNS in Detail room, I spoke a little bit to this previously. However, we don't have to worry about that too much quite yet. Let's go ahead and move into task one: what is HTTP or HTTPS? What is HTTP? Hypertext Transfer Protocol. HTTP is what's used whenever you view a website. Developed by Tim Berners-Lee and his team between 1989 and 1991, HTTP is a set of rules used for communicating with web servers for the transmitting of web page data, whether it is HTML, images, videos, and so on and so forth. How about the secure version? HTTPS is the secure version of HTTP. HTTPS data is encrypted, so it not only stops people from seeing the data you are receiving and sending, but it also gives some assurance that you're talking to the correct web server and not something impersonating it, because there's a signed certificate involved. Don't worry too much about that, just know that there are systems built in there to make sure that this is as safe as it can be. Answer the questions below. What does HTTP stand for? That stands for Hypertext Transfer Protocol. And there we go. What does the S in HTTPS stand for? That'll be secure. On the mock webpage on the right, there's an issue. Once you've found it, click on it. What is the challenge flag? So let's see. The biggest thing that's jumping out here is this lock. That's indicating that it's not secure because of the HTTP. That is a security issue. So we can go ahead and highlight this, and don't worry if this is a little bit small. We'll paste that in there and we can see that we have the flag THM invalid HTTP cert. And there we go. Let's move into task two, requests and responses.
When you access a website, your browser will need to make requests to a web server for assets such as HTML and images, and download the responses. Before that, you need to tell the browser specifically how and where to access these resources. This is where URLs will help. What is a URL? Short for Uniform Resource Locator. If you've used the internet, you've used a URL before. A URL is predominantly an instruction on how to access a resource on the internet. The below image shows what a URL looks like with all of its features. It does not use all features in every single request though. First we have the scheme here at the start. This instructs on what protocol to use for accessing the resource, such as HTTP, HTTPS, FTP, and so on and so forth. We have the user, and then we have the password here, separated by a colon. Some services require authentication to log in, and you can put a username and password into the URL to log in. Then we have the host, which is the domain name or IP address of the server you wish to access. That's going to be right here. We have the actual port that we're going to connect to. Usually, this is going to be 80 for HTTP and 443 for the secure version. This can be hosted on any port between 1 and 65,535, but traditionally those are the ports they live on. Either that or 8080 is another common port to see. But we won't go too far into that. That's something that's a little bit more in the realm of CTFs. Then we have the path, which comes after this forward slash. That's the file name or location of the resource you're trying to access. We have the actual query string, which is the extra bits of information that can be sent to the requested path. For example, /blog?id=1 would tell the blog path that you wish to receive the blog article with the ID of one. And then we have the fragment.
This is a reference to a location, and we can see it up here, on the actual page requested. This is commonly used for pages with long content, where certain parts of the page can be linked to directly, so that part is viewable to the user as soon as they access the page. Don't worry too much about this, just know that we need to know this part primarily, as well as this part. Let's go into making the request. It is possible to make a request to a web server with just one line: GET / HTTP/1.1. So we have the request method right here with GET, the page being requested, so the actual route, and then the protocol and the actual version there. But for a much richer web experience, you'll need to send other data as well. This other data is sent in what's called the headers, where headers contain extra information to give the web server you're communicating with. But we'll go into this in a little bit more detail in the actual headers task. And here we have an example request below. And here's a couple of those headers. To break down each line of this request: on line one, this request is sending the GET method (more on this in the HTTP methods task), requesting the homepage with the forward slash, and telling the web server we are using HTTP protocol version 1.1. Pretty common web request, pretty straightforward. On line two, we tell the web server we want the website tryhackme.com. We can see that right there with the Host header. And then on line three, we tell the web server we are using Firefox version 87 for the web browser. This is our User-Agent. On line four, we are telling the web server that the web page that referred us to this one is https://tryhackme.com. And then line five, it looks like it might not be there. Well, rather it is there. HTTP requests always end with a blank line to inform the web server that the request is finished. So it doesn't display right here, but there's an implied blank line.
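To make the URL parts and the request lines a bit more concrete, here is a minimal sketch in Python. It is only an illustration under a few assumptions: the sample URL, the helper name build_get_request, and the User-Agent value are made up for this example, and the Host header here leaves out the port for simplicity. Nothing is sent over the network; the code only splits a URL and builds the text of a GET request, ending with the blank line.

```python
from urllib.parse import urlsplit

def build_get_request(url: str, user_agent: str = "example-client/0.1") -> bytes:
    """Build the raw text of a minimal HTTP/1.1 GET request (hypothetical helper)."""
    parts = urlsplit(url)
    # The request line carries the path plus the query string.
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    # The fragment stays in the browser; it is not part of the request.
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {parts.hostname}",
        f"User-Agent: {user_agent}",
        "",  # blank line that ends the request
        "",
    ]
    return "\r\n".join(lines).encode("ascii")

# Illustrative URL that uses every feature described above.
url = "http://user:password@tryhackme.com:80/view-room?id=1#task3"
parts = urlsplit(url)
print("scheme:  ", parts.scheme)
print("user:    ", parts.username)
print("password:", parts.password)
print("host:    ", parts.hostname)
print("port:    ", parts.port)
print("path:    ", parts.path)
print("query:   ", parts.query)
print("fragment:", parts.fragment)
print(build_get_request(url).decode("ascii"))
```

Running it prints each URL component on its own line, followed by the request text, which is a quick way to see which pieces of the URL end up in the request and which do not.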
And here we have an example response. This is what's going to come back after we make that request. To break down each line of the response: on line one, we have the version and then we have our status code. In this case, it's 200 OK, which tells us the request completed successfully. 200 is pretty common. 200s and 403s are probably going to be some of the most common codes that you see, along with 302. Don't worry too much about the other types of codes. We'll go into those a little bit later, I believe. Line two tells us the web server software and version number. This won't always come back. Line three has the current date, time, and time zone of the web server. On line four, the Content-Type header tells the client what sort of information is going to be sent, such as HTML, images, videos, PDFs, XML, and so on and so forth. This just says, hey, I'm sending this, this is what you should expect. And then we have the actual length of the content that's enclosed, which is going to be on line five. On line six, the HTTP response contains a blank line to mark the end of the response headers. And then lines 7 through 14 have the actual information that we requested. So there's the website. Pretty straightforward here. What HTTP protocol is being used in the example above? That is going to be HTTP/1.1. And there we go. What response header tells the browser how much data to expect? That is going to be what we saw on line five here, Content-Length. And if you're interested in learning a little bit more about this, there is a web hacking technique called request smuggling. James Kettle, I believe, I might have just messed up his last name. He works with PortSwigger, the people that make Burp Suite. He has a very, very well done talk on Content-Length and how you can mess with this and how you can do a lot of really cool things. Again, that is request smuggling. I definitely recommend looking into it.
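As a rough illustration of the response layout just described (status line, headers, blank line, then the content), here is a small Python sketch. The function name parse_response and the sample values, including the server string and the HTML body, are assumptions made up for this example and are not taken from the room. It parses a hard-coded response rather than talking to a real server.

```python
def parse_response(raw: bytes):
    """Split a raw HTTP/1.1 response into status line, headers, and body."""
    # The first blank line separates the headers from the body.
    head, _, rest = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    # Content-Length tells the client how many bytes of body to expect.
    length = int(headers.get("content-length", len(rest)))
    return version, int(code), reason, headers, rest[:length]

# Made-up sample response, built so Content-Length matches the body.
sample_body = b"<html><body>Welcome</body></html>"
raw = b"\r\n".join([
    b"HTTP/1.1 200 OK",
    b"Server: nginx/1.15.8",
    b"Content-Type: text/html",
    b"Content-Length: " + str(len(sample_body)).encode("ascii"),
    b"",
    sample_body,
])

version, code, reason, headers, body = parse_response(raw)
print(version, code, reason)
print(headers["content-type"], headers["content-length"])
print(body.decode("ascii"))
```

Real responses have more cases than this handles (chunked encoding, repeated headers, and so on), so treat it as a reading aid for the example rather than a usable parser.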
If request smuggling goes a little bit over your head, don't worry about it. Just knowing about it is a good thing in the first place. That being said, let's move into task three, HTTP methods. HTTP methods are a way for the client to show their intended action when making an HTTP request. There are a lot of HTTP methods, but we'll cover the most common ones, although mostly you'll deal with the GET and POST methods. First, we have the GET request, which is used for getting information from the web server. It says, hey, I want this, and the web server will typically check, are you authorized for it? And it'll send it back. Then we have the POST request. This is used for submitting data to the web server and potentially creating new records. So we're posting, we're sending data to it. We have the PUT request, which is used for submitting data to a web server to update information. This is more situational. Don't worry too much about this one. And then we have the DELETE request, which is used for deleting information and records from a web server. Again, more situational. The main ones you need to know are these two. What method would be used to create a new user account? That will be the POST method. What method would be used to update your email address? For updates it's going to be that PUT method. What method would be used to remove a picture you've uploaded to your account? That would be DELETE. What method would be used to view a news article? And that will be GET. And there we go. Let's move into task 4, HTTP status codes. Let's go ahead and click on this so we can view the site. In the previous task, you learned that when an HTTP server responds, the first line always contains a status code informing the client of the outcome of their request and potentially how to handle it. These status codes can be broken down into five different ranges, and I recommend adding this to your notes if you're taking notes alongside this.
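One way to keep the five ranges straight in your notes is to look at the first digit of the code. The Python sketch below does that; the names RANGES and status_range are made up for this example, and the list of codes is just a sample of common ones. The reason phrases come from Python's built-in http.HTTPStatus, which may word a few of them differently from the room.

```python
from http import HTTPStatus

# First digit of the status code -> the range it falls in.
RANGES = {
    1: "informational",
    2: "success",
    3: "redirection",
    4: "client error",
    5: "server error",
}

def status_range(code: int) -> str:
    """Return the name of the range a status code belongs to."""
    return RANGES.get(code // 100, "unknown")

# A sample of common codes, printed with their standard reason phrase.
for code in (200, 201, 301, 302, 400, 401, 403, 404, 405, 500, 503):
    print(code, HTTPStatus(code).phrase, "->", status_range(code))
```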
For now, I'm just going to give an overview because this is really the important information. So the 100s are going to be an informational response. It'll tell you something about what comes back. Don't worry too much about this. These are not very common. This is the main one that you're going to see. The 200s are success. This can be success, but it'll have a little bit more information with it. So it might have a caveat of success, but maybe an update or something like that. 300s are going to be redirection, which means that the requested resource has been moved, either to a new location or to a different webpage altogether. 400s are going to be, hey, you did something wrong on your end. There's a client error. And then 500s, which means that something broke on the server's end. If this ever happens, it's probably worth digging into because these are interesting errors. Again, the main ones you're going to see are these. So client errors can range from things like something is actually malformed in the request to you don't have permission to access that. Again, very common. Common HTTP status codes. There are a lot of different HTTP status codes, and that's not including the fact that applications can even define their own. We'll go over the most common HTTP responses you are likely to come across. So first we have 200, which is OK. It just means the request was completed successfully. Then we have 201, which is Created. A resource has been created, for example, a new user or a new blog post. 301 is going to be a permanent redirect. This redirects the client's browser to a new web page or tells search engines that the page has moved somewhere else and to look there instead. Then we have 302, which is a temporary redirect. It's similar to the above permanent redirect, but as the name suggests, this is only a temporary change and may change again in the near future.
Then we have 400, which is an actual bad request, so that's where we get into the actual malformed bit. This tells the browser that something was either wrong or missing in the request that was received. This could sometimes be used if the web server resource that is being requested expected a certain parameter and the client didn't send it. So for example, with the blog, we requested id=1. What if we didn't include that ID? It might send back a, hey, what are you asking for, with a 400. Next we have 401, which is not authorized. You are not currently allowed to view this resource until you have authorized with the web application, most commonly with a username and a password. Then we have 403, very, very similar to 401 because it's in that 400 category. You do not have permission to view this resource whether you are logged in or not. 405, which is method not allowed: this resource does not allow this method request. So for example, you send a GET request to a resource for creating an account when it was expecting a POST request instead. It's going to say, hey, what are you doing? And it'll send back a 405 error. Next, we have the 404 page not found. The page or resource you requested does not exist. Then we have 500, the internal server error. This is a very interesting one, keep an eye out for this one. The server has encountered some kind of error with your request and it doesn't know how to handle it properly. This can usually be indicative of a deeper problem or that something has gone very wrong. And then we have 503, service unavailable. The server cannot handle your request as it's either overloaded or down for maintenance. I recommend taking this block along with the block above it and putting those in your notes.
They're just really good to start memorizing, especially if you're going into web hacking. Click the View Site button on the top right, which we've already done, to see some of these HTTP status messages in the browser. And we can see that we have a couple of example ones here. So first we have 403, which is forbidden. 404, it's not there. And then 503, it's temporarily unavailable. What response code might you receive if you've created a new user or blog post article? That'll be 201 Created. What response code might you receive if you've tried to access a page that doesn't exist? That will be the good old 404 error. What response code might you receive if the web server cannot access its database and the application crashes? Not great. That should be, I believe, a 500 error. Maybe not. Let's see. Let's try 503 for that. And there we go. And what response code might you receive if you try to edit your profile without logging in first? That should be, I believe, a 401. Let's double check. Not authorized, so we need to log in. We'll go ahead and submit that, and there we go. Let's move into task five, headers. Headers are additional bits of data you can send to the web server when making requests. Although no headers are strictly required when making an HTTP request, you'll find it difficult to view the website properly otherwise. Common request headers. These headers are sent from the client, usually your browser, to the server. First we have the Host header. Some web servers host multiple websites, so by providing the Host header you can tell the server which one you require. Otherwise you'll just receive the default website for that server. Next we have the User-Agent. This is your browser software and the version number. Telling the web server your browser software helps it format the website properly for your browser. Also, some elements of HTML, JavaScript, and CSS are only available in certain browsers.
This mostly comes up when Internet Explorer, as much as it shouldn't exist still, gets in the mix. Internet Explorer doesn't support a lot of things and isn't very well kept up. If you look up Internet Explorer compatibility issues, you should find plenty of them. Just look through that. It'll give you a good idea of why this is really important. Next, we have Content-Length. I talked about this quite a bit. It's going to be the length of the actual data that is in there. And this is done as a way to ensure that it's not missing any data, so that the entire request was actually received, or response in that case. And then Accept-Encoding. This tells the web server what types of compression methods the browser supports, so the data can be made smaller for transmitting over the internet. And then last but not least, we have Cookie, which is the data sent to the server to help it remember your information. See the cookies task for more information on this, and we'll talk about this a little bit more later. Common response headers. These are headers that are returned to the client from the server after a request. First, we have the Set-Cookie header: information to store, which gets sent back to the web server on each request. Again, we're going to talk about this a little bit later on. Think of this as sort of your hall pass or your authentication card. You present it every single time you go through the door, and it says, hey, this is valid. It gets checked by the server, hopefully, and it's a way to authenticate. Cache-Control: how long to store the content of the response in the browser's cache before it requests it again. This saves on bandwidth and was much more important in the early days of the internet. Not a big deal at this point, but it still saves a bunch of data. Content-Type. This tells the client what type of data has been returned. We talked about this already in a previous task. This can be a source of error.
Don't worry about Content-Type too much right now, but this is something that, as you get into web hacking, you will become much more used to, because you can do a lot of bad things with that. And then Content-Encoding: what method has been used to compress the data to make it smaller when sending it over the internet. Answer the questions below. What header tells the web server what browser is being used? That is going to be our client, or user agent rather. So User-Agent. And what header tells the browser what type of data is being returned? That is going to be Content-Type. And what header tells the web server which website is being requested? That is going to be the Host header. And there we go. Let's move into task six, cookies. Let's go ahead and click on the View Site button. You've probably heard of cookies before. They're just a small piece of data that is stored on your computer, ironically in a spot that's named the cookie jar. That is a real thing and is something that your browser uses. Cookies are saved when you receive a Set-Cookie header from the web server. Then with every further request you make, you'll send the cookie back to the web server. This is a little bit finer-grained detail, and you'll get into this when you start learning about the finer ins and outs of cookies, but generally speaking it'll be sent back in every request to that same website as long as it matches. Because HTTP is stateless and doesn't keep track of your previous requests, cookies can be used to remind the web server who you are, some personal settings for the website, or whether you've been to the website before. Let's take a look at this as an example HTTP request. So here we can see that we have a GET request and we have just our basic headers here, and we can see that we have our response and a little bit of data. We are going to POST a little bit more data, which is our actual login here.
So we're logging into the website and we have our cookie being returned. It says, hey, you've logged in. Here's a cookie. Put it in your cookie jar. Send it back whenever you're making another request because it says, hey, you are still logged in. And here we can see that we're doing exactly that, where we're making a request to the website and we send our cookie along. And the website responds saying, hey, welcome back. You logged in. You clearly have authenticated because you sent the cookie. Cookies can be used for many purposes, but most commonly are used for website authentication. The cookie value won't usually be a clear text string where you can see the password, but rather a token, a unique secret code that isn't easily guessable by a human. And if you ever want an example of this, take a look at your web browser's cookies. You can download an extension like a cookie editor for Chrome, and you can play around with it a little bit there. And even then, we're going to talk about how to view your cookies here. Now we have viewing your cookies. Just to digress on the topic I just touched on. You can easily view what cookies your browser is sending to a website by using the developer tools in your browser. If you're not sure how to get the developer tools in your browser, click on the View Site button on the top of this task for a guide. So here we can see that we have a guide here. And this will show how to actually get it with Firefox. And it looks like just Firefox, but pretty straightforward. Once you have developer tools open, click on the Network tab. This tab will show you a list of all the resources your browser has requested. You can click on each one to receive a detailed breakdown of the request and response. If your browser sent a cookie, you will see it on the Cookies tab of the request. Let's take a look at the question here. Which header is used to save cookies to your computer?
That is going to be the Set-Cookie header, as long as I typed it correctly. And there we go. Let's go ahead and move into task 7, making requests. We'll go ahead and click on the View Site button there. This is an emulator for making demo HTTP requests. Using what you've learned from the tasks above, you can use it to complete the questions below. Make a GET request to /room. And here we can see that we can change our HTTP method up here, update our parameters, and it looks like there's something else covered. And then we can click here to actually send the request. All right, so we want to make a GET request to /room. And we'll send that now. And we can see that we have our response with this flag. So we'll copy that, paste it there, and move on to the next question. Make a GET request to /blog and, using the gear icon, set the id parameter to 1 in the URL field. So this will be blog, and then we want to use the gear here, where we have the id with a value of 1. We can save that, close out of it, and those parameters will be sent. And there we go. We can see that we've sent it with the parameter id=1, and we have our flag back. We'll copy that, put it in here. Make a DELETE request to /user/1. So we'll do DELETE and then make sure that we aren't sending any of these. And then we need /user/1. And there we go. We have our flag, pretty straightforward. Paste that in there. Make a PUT request to /user/2 with the username parameter set to admin. We'll do PUT and then /user/2, and then let's see, username equals admin. We'll have to save that, send that along, and we can see the username has been changed to admin. We have our flag if I copy the entire thing. Paste that in there and we're on to our final question. POST the username of THM and a password of let me in to /login. So we want to do a POST and we'll change this to login and we'll change our data. Let's see.
We want to get rid of that and we'll do a username of THM and then we want a password of let me in. And there we go. Let's go ahead and send that request. And we can see that we've been logged in, and we can see the request rendered down here below. Copy that, and there we go. That's going to do it for this room. As always, the TryHackMe Discord and subreddit will be linked in the video description below. If you have any questions, feel free to join those and ask in the appropriate help chat. Otherwise, until next time, happy hacking.