Transcript for:
Week 1: Introduction to C Programming

[INTRIGUING MUSIC] DAVID MALAN: All right, so this is CS50. And this is week 1, zero index, so to speak. And it's not every day that you can say that you've learned a new language, but today is that day. Today, we explore a more traditional and older language called C. And rest assured that even if what you're about to see-- no pun intended-- looks very cryptic, very unusual, particularly if you're among those less comfortable, cling to the ideas from last week, week zero, wherein we talked about some of those fundamentals of functions and loops and conditionals, all of which are coming back today. Indeed, whereas last week, and with problem set 0, we focused on learning how to program with Scratch, which, again, you might have played with as a younger student days back. Today, we focus on C instead. But along the way, we're going to focus, as always, frankly, on learning how to solve problems. But among the goals for today and really on an entire class like this is just to give you week after week all the more tools for your toolkit, so to speak, via which to do exactly that. So for instance today, we'll learn how to solve problems all the more so with functions, as per last week. We'll do the same with variables. We'll do the same with conditionals, with loops, and with more. But we'll also learn at the end of today's class really how not to solve problems. It turns out as powerful as Macs, PCs, cell phones are nowadays, there's actually certain things that they can't do very well and information they can't represent very well. And that actually leads to a lot of real-world problems, both past and surely future. So more on what we're not going to be able to do with programming before long. But beyond that, let's come back to this picture here. So this was the very first program that I wrote, that you wrote presumably in some form. And all it does is say "Hello, world." 
But as promised, today, this puzzle piece, or these puzzle pieces together, are going to very quickly start to look more like this. And I've deliberately color coded it in a way so that the text on the screen now kind of resembles the puzzle piece. So if I go back, notice that we had this, when green flag clicked puzzle piece, mostly in yellow with the green flag, that sort of kicks off the whole process once you actually click the button at top right of Scratch's user interface. And then there's the purple block which actually is the verb, the action, the function that does something. So if I bring us back over to what we're about to see today, there's going to be some boilerplate, so to speak, some orange text here on the screen that for now you just type and take for granted, like you need to write your code like that. But more interesting is going to be the purple. And we're going to see today that the function previously called "say" in Scratch is now called "printf" in this language called C. But in white here, you'll see similar text to our white oval last week, whereby that's where user input, like your input as the programmer, can actually go. So there's a lot of distraction. And honestly, it's these kinds of things that tend to distract and get frustrating early on when learning to code for the first time. But the ideas, most importantly, are going to be the same. So how are we going to go about using this. Well, it turns out, like last week, you're going to start writing something called source code. So code as we know it, quote, unquote, is more technically called "source code." That's what you and I as humans actually write. And indeed it might look a little something like we just saw. But unfortunately, computers only speak this, binary-- zeros and ones-- more properly known as machine code, in other words, those same patterns of zeros and ones last week, someone guessed, print out "hello, world" on the screen because one of those patterns is an H. 
Another pattern is an E, an L, an L, and an O, and so forth. And then other patterns of those zeros and ones are commands or instructions to the computer that literally say, show H-E-L-L-O comma "world" on the screen. But machine code would not be nearly as much fun to write if it were indeed in zeros and ones. So for us, ideally, you and I are going to write source code, which conceptually is sort of up here, high level. But we're going to need a program to convert it to the lower-level machine code so that we don't spend our lives actually having to read and write zeros and ones, which back in the day, kind of in yesteryear, you kind of did with things called punch cards and holes on physical sheets of paper. We're beyond that because after years and years of innovation, folks have given us higher-level languages instead. So here's what we're going to need to do today. If at the end of the day you and I are writing source code but we want machine code as output, we need something in the middle that's going to convert that source code to machine code. You and I are not going to have to learn or talk about really any more zeros and ones. And the type of program we're going to start using today and introduce you to is called a compiler. So a compiler is a program that translates one language to another. And it can be any two languages. But today, and often, we'll talk about it in the context of source code to machine code. So this is Apple or Google or Microsoft or folks from other companies or even volunteers who have written software that does this conversion. You and I are essentially going to download a free compiler and use it to actually get our computer to understand the source code that you and I write in these higher-level languages. So where are we going to do that? Well, we could actually give you instructions and you could download the appropriate free open-source software onto your own Mac or PC. 
The reality is that creates so many technical support headaches because we all have slightly different computers. We all have slightly different versions of Windows or macOS or Linux or other operating systems. And that, too, tends to be a distraction at the beginning of any course like this or learning programming. So we're going to use the cloud instead. We're going to use a URL of the form https://cs50.dev. And what this will do for you is put inside of your browser window absolutely everything you need for the course, but it's going to use software, software called Visual Studio code, otherwise known as VS Code, that's actually free itself. It's very popular in industry. It's what "real" programmers use every day. But it's a cloud-based version thereof. And so everything will just work for you out of the box. But toward the end of CS50, the goal is going to be to get you off of CS50's infrastructure, to get you to download this freely available software onto your own Mac or PC if you so choose so that those training wheels, so to speak, can come off. And then even if you never take another class again, you don't need any class's infrastructure moving forward. You'll have everything you want and need on your own Mac or PC. But for now, it'll save us a bit of time. So in just a bit, I'm going to go to that URL myself on my computer. And I and you will see a user interface that looks a little something like this. The colors might be different based on your settings. Fonts might be different, and so forth. But in general, it consists of a few different regions. So over here at the top is where we are going to start writing code today. So it's a tabbed interface like any number of programs nowadays. And this is that same C code we saw a moment ago. So this is where, in a moment, I'm going to start to type it. Over here at the bottom is what we're going to call a terminal window, or a console. 
And the terminal window is where we're going to type commands for compiling our code, for running our code. And we'll see today a contrast between a graphical-user interface, or GUI, which has menus and icons and things you click and are very familiar with, versus a command-line interface, or CLI. And so we're using both of these together. And command-line interface just means, down here, you only use your keyboard. You can click, click, click if you want with your mouse. It's not going to generally do much because a command-line interface takes commands at the keyboard. So in a weird sense, it's going to feel like taking a step backwards from the Macs, the PCs, the iPhones, and Android phones we all have, which are very graphical. But it turns out, once you become a "computer" person or a programmer, you can be a lot more productive, a lot more efficient, I dare say, by learning to harness the command-line interface and using both types of interfaces for what each is good at. So more on that in just a bit. Over here at left, you're going to see soon a folder interface like Mac OS or Windows where any of the files or folders we create in CS50 are going to end up, as well. So it gives you the best of both worlds. You can point and click on the left, or you can type commands at the bottom, as we'll soon see. And then along here is the so-called activity bar, where there's just VS Code-specific features but also CS50-specific features. And if you're in your own version of CS50.dev, you click through in the dot dot dot menu or zoom out so you can see everything. You'll see CS50's own rubber duck, virtually speaking, that will be there throughout the course to answer any and all of your questions, as well. So more on that soon, too. So here's the code that I propose that we write first, just like we wrote our very first Scratch program to say "hello, world." So let's go ahead and do exactly this. 
I'm going to switch over to this screen here, where I've already logged into CS50.dev on my computer. And just to keep the focus on the code, I've hidden the activity bar. I've hidden the File Explorer, so to speak. So you're seeing here the area where all of my tabs are about to go and the terminal window, where all of my commands are going to go. But I've just simplified the UI to keep our focus on the interesting parts for now. So how do I go about actually writing and compiling and running some code? Well, the teaser is going to be these three steps. One of these is a command called, aptly, Code. And Code is just going to let me open or create a new file, like a file called "hello.c." Make is going to be, for now, my compiler that allows me to make the program, that is convert source code into machine code, so from C to zeros and ones. And then weirdly, but we'll soon see why, ./hello is going to be the command to run my actual code, so the textual equivalent of like double-clicking on a Mac or a PC icon or tapping an icon on your phone. So that's it. These three commands are going to allow me to write, to compile, and to run code ultimately. So let's go ahead and do that. I'm back in my VS Code interface. I'm going to go ahead and run "code hello.c." And notice a couple of details here. So one, there's this weird dollar sign, which has nothing to do with currency, but it's just a common convention in the programming world to represent your prompt. So if a TF or I ever say, go to your prompt, we really mean, go to your terminal window. Go to the dollar sign. And the dollar sign is where you type the command. Sometimes it's a different symbol, but a dollar sign is conventional. Now that I've typed "code" space "hello.c," I'm going to go ahead and hit Enter. And maybe not surprisingly, this gives me a brand new tab, a new file if you will, called "hello.c." 
And just like Word documents have their own file extension, like DOC, DOCX, and Excel files have .XLSX and PDFs have .PDF and GIFs have .GIF and so forth, so do C files have a file extension by convention that is .C. Now, a couple of minor points. Notice that, by convention, I'm almost always going to name my files in lowercase. By convention, I'm never going to use spaces in my file names. And my file extension, too, is going to be lowercase. Long story short, accidentally hitting the spacebar or using file names with spaces just tends to make life harder when you're in a command-line environment. So just beware silly, stupid things like that. So all lowercase, no spaces for now. So my cursor is literally blinking because the program wants me to write some code. I'm going to do this from memory. It'll take you presumably some time to acquire the same instincts. But I'm going to go ahead and type this first line here, pronounced "include standard io.h"-- more on that soon-- int main(void), with some parentheses thrown in. Notice what's about to happen here is a little interesting. In the code I want to type, I want what we'll call curly braces, the sort of squiggles that you don't use often in English, at least, but are there on your keyboard somewhere. But notice what VS Code does, and a lot of programming environments, is it finishes part of my thought. So I'm only going to type a left curly brace, but notice I actually get two of them. And if I hit Enter, notice that not only does it scooch one down a bit, it also indents my cursor because, just like with pseudocode last week, whenever you're doing something logically that should only happen if the thing above it happens, similarly is indentation going to be a thing when we actually write code. So VS Code and programs like it just try to save us keystrokes so I don't have to waste time hitting the spacebar or hitting Tab or wasting my human time like that. 
All right, so with that said, I'm going to go ahead and type the last of these lines, "printf," where the F is going to mean "formatted," and then a parenthesis. And notice it gave me two. It gave me the second one for free. Sometimes it will get confused. And you can certainly override this, delete it, and start over. And now, unlike Scratch, in C, it turns out I'm going to need to use double quotes anytime I'm using an English word or phrase or any human language for that matter. "Hello" comma "world." And then at the very end of my line, much like English uses periods, I'm going to use a semicolon in C. So that's a lot of talking, but it's not much coding. It's technically six lines of code. But honestly, the only interesting one intellectually, as we'll soon see, is really line 5. Like, that is the equivalent of that Say block. Now here's where I'll cross my fingers, hoping that I didn't make any typographical errors. It's going to automatically save for me. And I'm going to go back to my terminal window where now I'm going to do that second command, "make" space "hello." Common mistake-- you do not say "make hello.c," because you already made that file. You say "make hello," which is the name of the program that in this case I do want to create. And Make is smart. It's going to look in my folder. And if it sees a file called "hello.c," it's going to convert that source code to machine code and save the results in a simpler shorter-named file just called "hello," like an icon on your desktop. Now, hopefully nothing will happen. And that is a good thing, quite paradoxically. If you do anything wrong when programming, odds are you're going to see one or many more lines of error sort of yelling at you that you made a mistake. Seeing nothing happen is actually a good sign. So the last command, to run my code, recall our three steps here. We've used Code to create the file, Make to compile the file from source code to machine code. So lastly is "./hello." 
So this now is the equivalent of my double-clicking on a Mac or PC or single tapping on a phone. Enter. [SIGHS] So close! All right, it's pretty good. I got the H-E-L-L-O comma space "world." But there's something a little stupid about my output. What might rub some of you aesthetically the wrong way? Yeah? STUDENT: The dollar sign. DAVID MALAN: Yeah, so the dollar sign looks like I was like, "hello, world" dollar sign in my output. But no, that's just kind of a remnant of my prompt starting with a dollar sign. And this is a little nitpicky, but this just doesn't feel right, doesn't look right. It's not quite correct. So how can I go about fixing this? Well, here's where, at least initially, it's going to take some introduction to just new syntax in C to fix this. The simplest instinct might be to do this. Well, let me just hit Enter like that. But this should soon, if not already, rub you the wrong way because in general, we're going to see that programming in C and in Python and other languages tends to be line-based. Like, you should really start and finish your thought on one line. So if you're in the habit of hitting Enter like this and finishing your thought on the next line, generally programming languages don't like that. So this is, in fact, not going to do what we expect. And just to show you as much, I'm going to do this. Let me go back to my terminal window here. I'm going to rerun "make hello" after making that change. Enter. And there we have it, like the first of our erroneous outputs. And it's yelling at me. It's missing a terminating character. And there's some red in here, some green, drawing my attention to it. Sometimes these error messages will be straightforward. Sometimes you're going to rack your brain a bit to figure them out. But for now I've kind of spoiled it. Obviously Enter is not the right solution. So let me clear my terminal window just to hide that error. Let me delete this. And let me propose now that I add this incantation here. 
So backslash n, it turns out, is going to be the sort of magical way of ensuring that you actually get a new line at the end of your output. So let me go ahead now and rerun "make hello," because I've changed my code. I need to now reconvert, recompile the source code to new machine code. "./hello." And now, there is the canonical "hello, world" program that I hoped to write in the first place. So for now, don't worry about the include. Don't worry about the standard io. Don't worry about int or main or void or the curly braces. Focus primarily on line 5 here. And over the course of today and next week, we'll start to tease apart the other characters that, for now, you should take at face value. Questions, though, on any of the steps we've just done? Yeah? STUDENT: Why is the backslash n inside the apostrophes? DAVID MALAN: Sure, why is the backslash n inside of the quotation marks, if you will? So short answer is that's just where it needs to be because inside of the quotes is the input that you want printf to output to the screen. So if you want printf, this function, to output a new line, it must be included in the quoted text that you give it. STUDENT: So the backslash n [INAUDIBLE]. DAVID MALAN: Exactly. Backslash n is a special pattern that printf knows means, OK, I should move the cursor to the next line. Good question. Other questions on any of these steps? Yeah? STUDENT: [INAUDIBLE] DAVID MALAN: A good question. So what if you actually want to print backslash n? Things get a little tricky there. Let me go ahead and propose that we do this. So it turns out-- and this is often the case in programming-- when you want a literal character to appear, you actually put another backslash in front of it. But this is not going to be something we do often. But there is in fact a solution to that. But let me propose that beyond that now we compare it against what we've actually done. So here is the first Scratch program we wrote with the green flag there. 
Here, recall, is the mental model that I proposed we have for almost everything we do whereby functions are just an implementation, say, in code of algorithms, step-by-step instructions for solving problems. The inputs to functions, recall from last week, are called arguments, or in some contexts parameters. And sometimes functions can have side effects. Like last time with Scratch, there was the speech bubble that magically appeared next to the cat's mouth as a sort of side effect of using the Say block. So just like this then, we had the white oval as input. The Say block was the function last week. And then we had this here, side effect. Well, how do we compare these things left to right? Well, here's the Say block at left. Let's compare now to the C code at right. Notice a couple of things to adapt from Scratch to C. Print is almost the name of the function. It is technically "printf," for reasons we'll eventually see. Notice the parentheses in C are kind of evocative of the oval in Scratch. And that's probably why MIT chose an oval, because a lot of languages use parentheses in this way. You still write "hello, world" just as we did last week in Scratch. But per our demo thus far, you do need the double quotes-- and double quotes, not single quotes-- double quotes on the left and right. And in order to get that new line, you need the backslash n. And one more thing is missing. Yeah? STUDENT: Semicolon. DAVID MALAN: The semicolon to finish your thought. So all of these sort of stupid things now that honestly you will forget initially if you've never programmed before, but you'll soon-- within days, within weeks-- develop the muscle memory where all of that stuff just jumps off, jumps off the page right at you. All right, so this backslash n is generally known, just so you know, as an escape sequence. And so backslash n allows us to specify a character that might otherwise be hard to type. 
But let's tease apart some of the other things atop that function already. So include stdio.h. It turns out that in C, a lot of the functionality that comes with the language is tucked away in separate files. So if you want to use certain functions, you have to tell the compiler, hey, I want to do some standard input and output. Like, I want to print some things on the screen. And that's because, for now, you can think of printf as living in this file, stdio.h. That's a bit of a white lie for now. But in stdio.h is essentially a declaration for printf that will teach the compiler how to print things to the screen. So "hash include" here simply tells the compiler before it does anything else essentially go ahead and find on the local hard drive a file called stdio.h and copy/paste it there so I know now about printf. So this thing, this .h file, is what we'll technically call a header file. And if you've ever heard this word, especially if you have programmed before, it represents essentially what we'll start calling a library. So a library in the world of programming is just code that someone else wrote that you can use. It's usually free and open source, which means you can literally see the code that someone else wrote, or sometimes you pay for it. Sometimes it's closed source, which maybe Microsoft wrote it. They won't show you the code, but they will let you use the zeros and ones. So libraries are super useful because honestly even I don't really know how printf works. I've taken for granted for 25 years that if I use printf, stuff prints on the screen. But someone smarter than me had to actually write the code in C that figures out how to get the H, the E, the L-L-O, and so forth onto the Mac screen, the PC screen, the phone screen, or somewhere else. 
So libraries allow us to stand on each other's shoulders so that someone else can do the hard work, and we can now solve problems that are more interesting to us, not the basic commodity stuff that everyone might want in their code. So again, library is code that someone else wrote. A header file in C is just a file ending in ".h" that gives you access to the same. And so for instance, if you want to learn more about these, there are, what are called in the world of programming, manual pages. And these are textual files, like a documentation of sorts, via which you can just learn how a function works or how you can use its inputs or arguments. The reality is they're written for folks who aren't in CS50. They're written for folks who aren't just learning how to program. They're written for and by folks who have been programming for years. And so frankly, they're a little hard to understand. And so CS50 has its own version thereof at this URL, manual.cs50.io, where you'll see not only the official documentation for C, the language, but also staff-written simplifications in layperson's terms of what all of the various popular functions are, what their inputs are, and what their outputs are. So for instance, under stdio.h, you can actually go to that website. You can go to a URL like this, where stdio.h is in there. And you can actually see the documentation for it. So let me go ahead and do this. I'm going to go ahead in my browser here, I'm going to go to manual.cs50.io. And let me go ahead here and select those functions that are frequently used in CS50. And under stdio.h, you'll see a bunch of functions, only one of which we've even discussed called printf. I'm going to click on printf there. And you'll see an interface that at first glance might be a little overwhelming, but it's going to start to look more and more familiar. So first of all, you'll see that if you want to use printf under Synopsis, you need to include this header file. 
Like, you literally copy and paste that line into your own code. You'll also see this, which for now is a bit arcane, but this is kind of a hint as to what the function is going to look like. But more on that soon. But more importantly, you can read a description about it. And because these descriptions, when you're in less comfortable mode, are written by me and the course's teaching fellows, teaching assistants, and course assistants, you'll find them to be much more in layperson's terms. And so long story short, rely on this site once you want to learn how to use some function and also what other functions exist. In fact, if I go back to the main page here, you'll see that there are all of these functions that are frequently used in CS50. And there's hundreds more that come with C. But learning a programming language is not about learning all of those but rather just getting a sense of where you find answers to questions when you do want to try something new. But what is important to know for CS50 today is that we have our own header file called cs50.h which has functions that we have written just to make life easier in the first few weeks of the class. These are training wheels that we'll eventually take off. But it turns out in C, especially if you've programmed before, it's actually really hard and annoying just to get input from users, to get them to type a word or a number or something else. Like, C does not make this easy, in part because it's one of the earliest languages that wasn't zeros and ones. So you have to do a lot of the heavy lifting yourself. But we'll put on these training wheels today and for a few weeks so that we can focus really on the intellectually interesting ideas of C and programming without getting bogged down in certain weeds that we will come back to before long. So for instance, CS50's own documentation is there at that URL. But within the library are these functions, a function called get_string to get a string of text. 
"String" is a synonym for just text in a programming language. So get_string will prompt the human for a string of text. Get_int is shorthand for "get integer," if you want to get a number from the user. Get_float is a little more arcane-- get a floating point number, like a real number with a decimal point in it. And dot, dot, dot, there are others, as well. So this is to say within CS50, we've got some user-friendly functions via which we can actually get some input. And let's go ahead and use one of these, for instance get_string because recall that last week our second program in Scratch was this one here, where we didn't just say "hello, world." We said "hello, David," or "hello, Carter," "hello, Julia," whoever it was typing their name in. But to do that, we needed this Ask block in Scratch. And then we used the Say block. And then we used the Join block to make all of this work. So let's translate this program now into C because it's a little more interesting and representative of the kind of code we'll start to write. But we need a slightly different mental model. Still have a function here, which is the implementation in code of an algorithm. We still have some inputs called arguments. But previously, I said that the Say block and, in turn, printf have side effects, which is just something visually, typically, that happens on the screen. Other functions actually have, what we called last week, return values. And this is kind of analogous to a function maybe doing something for you, writing down the answer on a slip of paper, and then handing you, the programmer, the slip of paper to do whatever you want with it without just broadcasting it to the world with, like, a speech bubble on the screen. So a return value is germane for a program like this because recall when we used the Ask block and I typed in my name, where did my name end up initially? It didn't go on the screen yet. Where did it end up? STUDENT: In an answer. 
DAVID MALAN: In an "answer" puzzle piece. And that special oval puzzle piece I claimed at the time represents a return value, so the metaphorical piece of paper that the answer is written down on so that I can then use it later. So that's what we want to get to now in C, a return value that I can then do anything I want, whether it's print it to the screen, change it in some way, save it in a database, or anything else. So here, for instance, is what we did with Scratch, the input to the Say block-- or the Ask block was "what's your name," quote, unquote. The function, of course, is the Ask function. And the return value was "answer." If we now consider how we might translate this to C, it's going to look a little weird at first. But it's going to follow a pattern today, next week, the week after any time we do code like this. So get_string, I claim, is going to be the most analogous function in C to the Ask block. And to be clear, this is a CS50-specific thing, training wheels of sorts. But we'll show you in a few weeks what this function is doing and how you can do without it moving forward once you're comfortable with the language itself. Notice I've put parentheses, left and right, as sort of a placeholder for user input. And that user input is going to be "what's your name?" But I can't just put "what's your name" in parentheses. What do I minimally need to add in there, too? STUDENT: Quotes. DAVID MALAN: Yeah, so the double quotes, left and right. So let me go ahead and add those in. I left a space here, not for a new line. I could move the cursor to the next line. But I minimally want to move the cursor at least one space over just so it looks pretty, so that when I'm prompted for my name, there's a space between the question and my answer. But it could also be backslash n. That's just an aesthetic choice on my part. But what do I do with the answer that comes back from get_string? This is where the text is going to look different today. 
In C, you start to use an equals sign from left to right respectively. And on the left, you put the name of the variable in which you want to store that return value. So a return value is kind of a conceptual thing. You can do with it what you want. And if I want to store it longer term in a variable, like x or y or z in math class, I can just give it a name here-- x or y or z or, more reasonably, "answer," or any other English word. No spaces, generally lowercase, same heuristics as before, but this means now, ask the user, what's their name? Whatever they type in, go ahead and store it from right to left in this variable called "answer." But C's not done with us yet. If you've learned Python or certain other languages, you'd kind of be done writing code at this point. In C, though, you additionally have to tell the compiler what type of variable you want to use. So if it's a string of text, you say "string." If it's an integer, a number, you say "int," as we might have seen before. So it's a little more pedantic. It's more annoying, frankly, the more onus on you and me, the programmers. But this just helps the compiler know how to store it in the computer's memory. And I'm so close to being done with this line of code, but what's missing? STUDENT: Semicolon. DAVID MALAN: So semicolon. And mark my words, if you've never programmed before, sometime this week, this semester, you will forget a semicolon. You will raise your hand. You'll get frustrated because you can't understand why your code's not working. You will run into stupid issues like that. But do take faith that they are stupid issues. It doesn't mean it's not clicking for you or you're not a programmer. It just takes time to see these things if it's a new language to you. So there now is my semicolon. All right, let's go ahead then and do something with that return value using the second of the big puzzle pieces in Scratch. 
So when I wanted to say, "hello, David," or whatever the human's name is, I kind of stacked my puzzle pieces like this. This is actually similar to Python and maybe some other languages some of you have learned. But C is a little bit different. And the closest analog to this Scratch solution is going to look like this. I still use printf because printf is the equivalent of Say. Inside of my parentheses, I'm going to go ahead and say, weirdly, "hello, %s." So there's no real analog in C of Join. Instead, there's a way to specially format text using printf, hence the F in "printf." And what you do in printf is you type whatever English words or human words that you want. You then use %s as a placeholder. If you want a string of text to be added to your own text, you literally write "%s." And let me anticipate a question from the crowd-- how do you print out %s? There's a solution to that, too, if you literally ever want to print out %s. But it's deliberately a weird choice of characters so that the probability that we ever need to type this ourselves is just so low that no one really worries too much about it. All right, but that's not quite enough. In addition to saying "hello", comma, space, placeholder %s-- and just for vocabulary's sake, that's a format code. Again, "format" being the F in printf. I still need my double quotes around the whole thing. In this case, to match my previous program, I am going to go ahead and add the backslash n to move the cursor to the next line. And now I've left a crazy amount of room here, but that's deliberate. Does anyone have an instinct for what I'm probably going to want to add after the quotes but still inside of the parentheses? STUDENT: The answer. DAVID MALAN: So answer itself. I need to somehow tell printf with a second input, otherwise known as an argument, what I want to substitute for that %s. And so I put a comma and then the name of the variable that I want printf to figure out how to plug in here.
So honestly it's a little annoying, and this is kind of a dated approach. Newer, more modern languages, like we'll see later in the course, Python and JavaScript, actually have much more user-friendly ways of doing it. But once you wrap your mind around the heuristics, the rules here, it's just formatting a string by plugging in whatever you want into this format string, so to speak. And again, the comma here is important. This signifies that it takes one input at left and a second input at right. But notice this comma. There's technically two commas. But I'm not claiming that this function takes three inputs. Why? This comma I'm pointing out doesn't mean the same. STUDENT: It's because that comma's part of the quotation marks and it's been part of the string. DAVID MALAN: Exactly. This comma that I'm pointing to is part of the quotation marks and therefore part of my string of English text. So this is just English grammar. This is sort of C syntax. And again, these are the sort of annoying little details that we're using the same symbol for different things, but context matters. So just stare at your code, look carefully left to right, and generally the answer will pop out, no pun intended. OK, questions now on this syntax before we actually write it and run it? Yeah? STUDENT: Why is the backslash n not after the answer? DAVID MALAN: Why is the backslash n not after the answer? So the way functions work, including printf, is that you pass to them one argument inside of the parentheses. And then if you have a second argument, you put it after this comma here. But the way printf works is that its first argument is always a string that you want to be formatted for you. So anything you want printed on the screen has to go in those quotes. And you can perhaps extrapolate from this. 
If I actually wanted to say multiple things in this sentence, so "hello," maybe first name, last name, I could actually do "hello" comma, %s, space, %s, if I had two variables, one called First Name, one called Last Name. But then I would need another comma for a third input to the function. And so it's very general purpose in that sense. Questions? Yeah? STUDENT: Can you abstract this further by [INAUDIBLE] the "hello" [INAUDIBLE]?? DAVID MALAN: OK, so can you abstract away the format string itself, "hello," comma answer? Short answer, yes, but not nearly as easily in C as you can in other languages. So that's why we're keeping it simple for now. But you're going to love something like Python or JavaScript, where a lot of this complexity goes away. But you'll see also in Python and JavaScript and other languages, they still are inspired by syntax like this. So just understanding it now will be useful for multiple languages down the line. All right, so let's actually do something with this code rather than just talk about what it might be doing for us. Let me go over to, for instance, VS Code again. And I'm going to go ahead now and remove this middle line of printf. I'm still in my same file called "hello.c." I'm going to clear my terminal window just to eliminate distraction. And to do that, I can literally type "clear." But this is just for aesthetic's sake. That's not functionally that useful. Or you can hit Control L to achieve the same on your keyboard. But I'm going to go back to line 5 here, where I previously just said "hello, world." And I'm going to do this instead. I'm going to give myself a variable called string. Sorry, I'm going to give myself a variable called answer, the type of which is string. I'm going to set it equal to whatever the return value is of get_string, asking an English question, "what's your name?" with just a single space just to move the cursor over, followed by a semicolon. 
Then I'm going to go ahead and say printf, quote, unquote, "hello, placeholder, backslash n," comma, and then what goes here again? STUDENT: Answer. DAVID MALAN: This is where answer goes. And then I just need a semicolon on the right of that. But I think now that I'm done. But let me point out a couple of details. This got very colorful, very pretty quickly. And it's not like the black and white code I had on the screen a moment ago. This is because what programs like VS Code do for us is it "pretty" prints, or rather it syntax highlights our code for us. So syntax highlighting means just add some colors to the code so that different ideas pop out. So you'll notice, for instance, that printf here, get_string here are in purple because they represent functions, just like the Say block. Here, "what's your name?", quote, unquote, in VS Code is a light blue instead of white. But it's still going to be consistent if I use strings of text elsewhere, as well. So I didn't type anything special. This isn't like Microsoft Word or Google Docs, where I'm highlighting and changing colors of things. This is all happening automatically. But it's just unicode text. It's just being interpreted automatically and having these colors applied so that things pop out more usefully visually. Now, I've unfortunately made a mistake. But I'm going to deliberately induce this one because you, too, will probably make this mistake. I'm going to go ahead and run "make hello" again, because I've changed my code. So I have to regenerate the machine code from the new source code. But unfortunately, when I hit Enter now, my God, the errors don't even fit on the screen. So let me make this bigger. I'm going to click the little caret symbol here just to make my terminal bigger for just a moment. And you'll see that there's more lines of errors than there are of code that I actually wrote, often which is written pretty arcanely, again, for programmers who've been writing code for 10, 20 years. 
But there are some details that pop out. So notice the problem is definitely with hello.c. So great, it is my fault. This syntax here means that line 5 is the problem. And this next 5 means character 5. So you can literally triangulate your bug, your mistake by going to line 5 and then over five, and it's somewhere in that area. Specifically, the error is "use of undeclared identifier string. Did you mean standard in?" I don't think I did. Like, I do want string, and then there's some other complexity here. But what's important here is not the specifics of this error but really the implication that it doesn't recognize the word "string" or "get_string." Now, why might this be? Yeah? STUDENT: You said that in order to [INAUDIBLE] DAVID MALAN: Exactly. Because we are using get_string, which I claimed is a CS50 thing that we'll use for a few weeks, C does not know about it out of the box, so to speak. I have to teach the compiler that get_string exists, just like I taught the compiler that printf exists by including the appropriate header file. And in this case, quite simply, it's #include <cs50.h>. That now teaches the compiler, oh, someone else wrote this function already, get_string, and with it this type of variable called "string." So now if I go back to my terminal window and rerun the exact same command, "make hello"-- maybe crossing my fingers-- now nothing in fact goes wrong because the compiler has been brought up to speed with all of the functionality it needs. And now if I do ./hello, Enter, there it is, what's my name? And notice the cursor is one space over just because I thought that looked prettier than having the cursor right next to the question mark. D-A-V-I-D as my input, and Enter. And "hello, David." Questions on any of this code thus far? Questions? Any of the code. No? All right, so let's introduce some other functionality into the mix.
It turns out that there are other types of data, other types of variables in the world, not just strings but indeed, per before, we have things called integers, "int" for short, floating point values, "float" for short, and a few others as well. So rather than only focus on string, let's get a little more interesting with numbers here and see what we can do with something like integers, again "int" for short, by taking a look at not get_string, as before, but now how about get_int. And for this, I'm going to give us a few other tools in our toolkit, those format codes to which I alluded earlier, like %s, fortunately are pretty straightforward. And here is a list of most of the popular format codes that you might ever care about with printf. In particular, we saw %s for string. And you can perhaps guess which one we're going to use for integers. STUDENT: %i. DAVID MALAN: Yeah, so %i is what we're going to use for integers. And this is the kind of thing that you can consult in the manual pages or a slide like this. There's only a few of them that you might frequently use. But let's go ahead and use integers in a more interesting context, not just using functions. But let's revisit this idea of conditionals. And conditionals in Scratch were like these proverbial forks in the road. Like, do you want to do this thing or this thing or this other thing? It's a way of making decisions in a program, which is going to be super useful and pretty much omnipresent in any problems that we try to solve. So let me give you a few more building blocks in C by doing the side-by-side comparison again. So here in Scratch is how we might say if two variables, x and y, one is less than the other, then go ahead and say, quote, unquote, "x is less than y." So kind of a stupid program. But just to show you the basic syntax for Scratch, this is how you would ask the question, if x is less than y, then say this. So Say is the function. "If" is the conditional. 
And the green thing here we called, what? What did we call it? Yeah? STUDENT: A Boolean. DAVID MALAN: A Boolean or a Boolean expression, which is just a fancy way of saying a question whose answer is true or false, yes or no, 1 or 0, however you want to think about it. In C, the code is going to look like this. So it'll take a little bit of habit, a little bit of muscle memory to develop. But you're going to say "if," then in parentheses, you're going to say "x less than y," assuming x and y are variables. You're then going to use these curly braces. And then if you want to say, quote, unquote, "x is less than y" in C, what function should we use here presumably? So printf. So printf, quote, unquote, "x is less than y." So it's a bit of a mouthful, but again notice the pattern. Name of the function is printf. In the parentheses, left and right, is the argument to printf, which is, quote, unquote, "x is less than y." And again, just for aesthetics, to move the cursor to the next line, which you don't have to worry about in Scratch because everything's in speech bubbles, we're adding a backslash n, as well. So notice that these curly braces, as they're called, much like the orange puzzle piece here, are kind of hugging the code like this. And I'll note that technically speaking in C, If you only have one line of code inside of your conditional, you can actually omit the curly braces altogether. And the code will still work if you have one single line of code. Why? Just saves people some keystrokes. If you have two lines, three lines, or more in there, you need the curly braces. But I'll always draw it with the curly braces in class so it resembles Scratch as closely as possible. As an aside to some of you who have programmed before, you might be cringing now because like you really like your curly brace to be over here instead of here, that, too, is a stylistic choice. And we'll talk, too, about this in the class. 
Aesthetically, stylistically there are certain decisions we can make. But generally in a class, in a company, you as a student or an employee would simply standardize on one set of rules, so to speak. So we'll use these rules for formatting our code in class consistently. All right, any questions on this snippet of C code? All right, a couple of others then. So here is how, in Scratch, we might have a two-way fork in the road. If x is less than y, say x is less than y, else say x is not less than y. In C, it's going to look pretty much the same. But notice I'm adding an "else" keyword here with another set of curly braces. I'm going to have a couple of more printf's. But in C, even though it's clearly keyboard based, it's just text, no more puzzle pieces, it's kind of the same shape, so to speak, and it's definitely the same idea. So it's following a pattern. What about a three-way fork in the road, if x is less than y, then say x is less than y, else if x is greater than y, say x is greater than y, else if x equals y, then say x is equal to y. Well, you can probably see where this is going. On the right-hand side, it looks almost the same. In fact, if I add in the printf's, it's really almost the same, at least logically. But there is at least one curiosity, seemingly a typo but it's not this time. Yeah? STUDENT: The double equals. DAVID MALAN: Yeah, the double equal sign does not match Scratch, but it's not in fact a bug or a mistake in C. Anyone have an intuition for why I did use two equal signs instead of one here? Yeah? STUDENT: Because otherwise it could be mistaken for a variable. DAVID MALAN: Exactly. Well, otherwise it would be mistaken for a variable, specifically assignment of a variable. So recall that in previous code, when we used the get_string function, we used an equals sign to assign, from right to left, the value of a variable. And that's a reasonable decision.
"Equal" kind of means that the two should ultimately be equal even though you think about it from going right to left. Unfortunately, the authors of C kind of painted themselves into a corner. And presumably, decades ago when they realized, oh, shoot, we've already used a single equal sign, how do we represent equality of two values, the answer they came up with was, all right, we'll just use two instead. And thus was born this decision. Is it the best one? Who knows? Crazy enough, in other languages, like JavaScript, you have not just one or two, but also three equal signs in a row to solve yet another problem. So reasonable people will disagree as to how good or bad these decisions are. But in C, this is what you must do. But there's a bad design decision here, too. It's still correct, the code, left and right. But I bet I could critique the quality of the design of both the Scratch code and the C code for reasons, what? STUDENT: Do we have to do else if x equals [INAUDIBLE]? DAVID MALAN: OK, no, really good intuition. Do we have to ask this third question, "else if x equals y?" So short answer, no, logically, right? Just based on arithmetic, either x is less than y or x is greater than y or, what's the only other possible answer? They must be equal, logically. So technically, you're just kind of wasting the computer's time by asking this question because it already knows, at that point, the answer. And you're wasting your time as the programmer bothering to type out more code or more puzzle pieces than you need because logically one stems from the other. So I can tighten this up, get rid of the "else, if," just use an "else." And I can do the same thing over here in C, thereby avoiding the double equal sign altogether, but not because it's wrong but because you're wasting time, because now you're potentially asking only two questions, two Boolean expressions, instead of 50% more by asking a total of three questions at most.
Other questions then on this kind of code, logically or otherwise? No? All right, so if we have these puzzle pieces, so to speak, at our disposal, how can we go about actually using these? Well, suppose that we actually want to do something with values. Let's introduce variables in C, as well. We saw an example using a string a moment ago. But what about with something like integers? Well, you might not have used this in Scratch. But here's the orange puzzle piece in Scratch via which you can create a variable called counter to count things. And you can set it equal to some value like 0. Now, you can perhaps guess where we're going with this. If I want in C a variable called counter and I want to set it equal to 0, I use a single equal sign because logically you read it from right to left, or technically it's executed from right to left. But that's not enough in C. What's missing from the screen? STUDENT: Data type. DAVID MALAN: I need a what? STUDENT: You need a data type. DAVID MALAN: So we need a data type. And if it's going to be an integer, indeed I'm going to use int. And now the other mistake I keep making is-- STUDENT: Semicolon. DAVID MALAN: So a semicolon at the end of the line. So it's a little more verbose than some languages. But if you read it left to right, this is how you tell C to give you a variable called counter of type int and initialize it to a value of 0. That's all. All right, how about, in Scratch, if you want to change that variable by 1, by adding 1 to it? In Scratch, it's super simple. You just change it by 1 or even negative 1 if you want to go up or down respectively. In C, it turns out you have a few different ways to do this. And this looks like it's not mathematically possible, but that's because equals is assignment, recall. So this line of code is not saying that counter equals counter plus 1, because that's just not possible using typical numbers. 
But this means take counter's value, add 1 to it, and assign it back to the counter variable. So it's like incrementing counter in this way. But this is such a common thing in C and in programming to increase or decrease the values of variables, there's a more succinct syntax. This is identical. And it might take you a little practice to get used to it, but it just saves you some keystrokes. But it similarly adds 1, or whatever number you use there. And this is such a common operation in C especially that there's an even tighter way of executing the same idea. And you can literally just say counter++ and then semicolon in this case. All three are exactly the same. All three are perfectly correct. But you'll learn over time that typing less on the screen is probably going to save you some time. Meanwhile, if we wanted to do the opposite and do something like minus 1 in Scratch, we could similarly do minus minus in C. Or we could do-- yeah, we could do minus minus in C here at right. All right, so just some additional building blocks, translating from Scratch to C. Why don't we go ahead and try using this perhaps in the following way? Let me go ahead and go back to VS Code. And let me propose that we do something like this. In VS Code, I'm going to go ahead and clear my terminal window. I'm going to close "hello.c" by just clicking the X. I'm going to go ahead and create a new file called "compare.c" because the purpose in life of this program is going to be to compare integers on the screen. This time I'm not going to mess up. I'm going to preemptively include cs50.h. I'm going to preemptively include stdio.h. And here, too, is a very common mistake in learning C. It is not "studio.h." So when you email us asking why "studio.h" is not working, that's because that is not the word. It is "standard io.h," meaning standard input and output, stuff involving the screen and the keyboard.
Then I'm going to go ahead and, just as before, int main(void), but we'll come back to that eventually as to what it means. And now inside of "main," which is just where the main part of my program goes, again you can think of this as being analogous to "when green flag clicked." This just kicks everything off. I'm going to go ahead and do two things. I'm going to go ahead and get an integer called x, and I'm going to prompt the user for that int and just say something like, "what's x?", space. Then I'm going to do int y equals get_int, quote, unquote, "what's y?", space. And then let's just do something simple like, if x is less than y, then go ahead and print out, quote, unquote, "x is less than y backslash n," semicolon. So it's not a very deep program. It's just going to do what most any human brain could do pretty quickly. But it's at least demonstrating how we might use now something like a conditional in code. So let me go ahead and re-- let me compile this code for the first time, make compare, enter. Nothing bad happens, which is good. "./compare" is how I run the program. And just to tease this apart, dot, as we'll soon see, essentially means that the file is in your current folder. So dot means in your current folder. And we'll eventually see that dot dot means your parent folder, like the one that contains wherever I am on my computer's hard drive. All right. ./compare. What's x? 1. 2 for y. And hopefully it should say that x is less than y. So pretty straightforward. Proof by example. And hopefully this would work in other cases, too. But if I flip that around and I rerun it, ./compare, and I do 2 and 1, nothing's going to happen. But you would expect that because there's only one Boolean expression deciding whether or not I should actually type this out. So what's going on? Well, if this helps you, you might find it useful to think about the logic of any program, be it in Scratch or C, as kind of a flowchart of sorts. 
And we'll put up a few of these over time just in case you're a particularly visual thinker. And this represents what it is I just did. So here in this picture is where the program starts conceptually. And any time you see a diamond, think of that as a Boolean expression, a question that's being asked. And the question being asked is, is x less than y? That has two possible answers, true or false, yes or no respectively. So let me propose, per the arrow, that if the answer is true, then print out, per this rectangle, "x is less than y," just quote, unquote, and then stop. That's it for the program. But logically, if x is not less than y, that is that question's answer is false, we'll just skip right to the end and stop. So this is a control-flow diagram. It's just a pictorial way that you could write on a piece of paper that just represents what it is the program is doing. And this gets a little more interesting if now we do something else with the code. For instance, instead of just concluding that one is less than the other, let's go back to the code here. Let me clear my terminal window. And let me add an "else." So else, go ahead and print out "x"-- I don't think I want to say this-- "greater than y." It's not quite right. What would be reasonable to say here? STUDENT: "x is not less than y." DAVID MALAN: Yeah, subtle, but "x is not less than y" because it could be equal. We don't know if we're only checking two scenarios here. So if I recompile this, make compare, ./compare. Now if I do 1, comma 2, I still get the same answer. If I rerun ./compare 2, comma 1, I now get the opposite answer. It's not as good as might be ideal. It'd be nice to know if it's equal to or greater than. But at least that's all of the code that we have here. And just to now paint a picture, if I go back to my control-flow diagram, my flow chart here, this is what it looked like before logically.
Now that I've added in a second branch, so to speak, now, if the answer is false, I first print out x is not less than y, and then I stop the program. So same idea, but the decision tree, if you will, if you've taken a 10 or the like, is getting a little bit bigger now conceptually. All right, what if we do something more than this? Let's actually have that third condition. Let me go back into my code here. I'm going to hide the terminal window just to make room for more code. And I'm going to say, "else if x is greater than y," then go ahead and say not "x is not less than y," but rather "x is greater than y." And then down here, I'll do an "else if x equals equals y," then I can go ahead and say printf, "x is equal to y backslash n," close quote. All right, so now if I run it-- let me open my terminal window again. Let me rerun make compare. Let me rerun ./compare. 1 and 2 are the same. Let me rerun it. 2 and 1 are the same. Let me rerun it a third time. 1 and 1 are now in fact equal. So this works correctly. But why did I make a point of using these "else if"s? Put another way, couldn't I just make my life a little simpler and just say, if this, then that? If this, then that. If this, then that. Just ask all three questions. Keep the code simple. Don't bother with these else's. Would this work for me? Yeah? STUDENT: It just seems like [INAUDIBLE] the program doesn't have to run the rest. DAVID MALAN: Yeah, so it saves a little bit of time because in this case, just like in English, this is like asking three separate questions. And it's not harnessing any information from previous questions in order to decide whether you should bother asking that other question. In other words, if x is less than y-- and you already figured that out because it's 1 and 2 respectively-- you're going to print this. Why would you waste time asking this question when it's not going to be true? Why would you waste time asking this question when it's not going to be true? 
And so the point I wanted to make here, which is that if we visualize that particular design, what the flow chart looks like is actually this. And let me zoom in at the top. If you ask the question "is x less than y," well, you're going to go ahead and say, x less than y. Then if you go down to the next question, you're still going to ask is x greater than y. And then below that, you're still going to ask is x equal equal to y? So no matter what x and y are, you're asking one, two, three questions all of the time. But if we actually go in and do what we did the first time, where if I go back to my code and I undo this edit and add back the "else if"s-- and now let me go back to the flow chart, which I claim is bad because it's one, two, three questions, one or two of which might not be necessary-- now if I visualize what I just did, the flow chart gets a little more complicated looking, but it's going to be better designed, more efficient. Why? Well, because if I start at the top here, I ask one question, is x less than y. If the answer is true, OK. I say x less than y. And then, boom, I sort of cheat and go all the way to the end of the program and stop, having asked only one question. If, though, x is not less than y, OK, fine, I'll ask you a second question. But if the answer is true, boom, I print out x is greater than y, and then I stop. And only in a perverse case where x actually equals y, which I'm going to claim is very unlikely or infrequent, only then am I going to ask one, two, three questions to figure out whether or not to print something at all. So this is what we mean by distinguishing between correctness of code-- because it's still correct-- but this version is better designed because hopefully you're going to go down this branch or this branch rather than the longest one frequently. Any questions now about this code or this visualization thereof? Yeah? STUDENT: I don't know if [INAUDIBLE] DAVID MALAN: A perfect segue. 
Why did I bother, though, even asking this question? Don't need to because when I hit this button-- hopefully I have the right slide in place-- this would be even better than that design. So thank you for teeing that up. This is the same picture. It sort of got bigger because there's fewer nodes, fewer shapes in the picture. Notice that if x less than y, boom, we say as much, and we stop. If x is not less than y but it's greater than y, boom, we stop. Or if it's not greater than, we immediately conclude x indeed is equal to y. And again, we stop. So this picture is about as efficient and as well designed as we can make our logic. That's about as good as we can solve this problem. So if I go back to my code now to make my C code match that, the only thing I need to do is stop wasting the computer's time. Don't ask that third question. Just logically, mathematically conclude that of course it's going to be equal at that point in the story. All right, any other questions on this? STUDENT: [INAUDIBLE] DAVID MALAN: Sorry, a little louder? STUDENT: [INAUDIBLE] DAVID MALAN: Really good question. What if I put in something that's not a number? So here, too, is where the CS50 library and the implementation of get_int will be your friend. So for instance, if I run ./compare and I want to compare cats and dogs, I could type in "cats," Enter. It's just going to prompt me again and again. It's not going to let me type in "dogs" either. It's going to force me to give it an integer. C does not do that by default. And in fact, as we'll soon see over the course of CS50, C is actually a very dangerous language at the end of the day because it just trusts that the human is doing what it wants. And as such, a lot of today's software that is hacked in some ways, if it's using C or another language called C++, are actually very vulnerable to certain types of hacking, whereas other languages that we'll get to in the class are less so for reasons like this. 
All right, so besides this, let's consider just one other data type. So besides strings, besides integers, there's this other data type here in C known as a char for a single character. So here, let me just tease this apart. A string is indeed a string of text. It is zero or more characters together. A char is always precisely one character. Not all languages bother distinguishing between a single character and a string of characters. But in C, a string is typically multiple characters, but technically can be zero. Coincidentally, it could be one. But it's capable of being more. But a char is literally, as the word implies, a single character. All right, given that, notice that in the CS50 library, besides get_string, besides get_int, we also have get_char, so another handy function for just getting a single character from the user. Now, why would it be useful to get a single character from the user? Well, what if you're just doing something that you and I do pretty frequently when you install new software or fill out some form? You agree to some form of terms and conditions. So in fact, let me go back over to VS Code here. And let me propose that I create a new program called agree.c, so something akin to asking for the user's agreement. So in VS Code, I'm going to type "code agree.c." And I'm going to do some quick boilerplate. So include cs50.h, include stdio.h, int main(void). And then inside of main, which is like the "green flag clicked," I'm going to do this. Go ahead and get a character from the user, and ask them something simple like, "Do you agree?," expecting a yes/no response. But at the beginning of this line, I need to put the return value somewhere. So I'm going to put it in a variable called c.
And in programming, if you're just getting a single value, it's OK sometimes to use X and Y or C when you're using-- in larger programs, you'll benefit from using actual words like "answer," like we did from the get go. But C has to be a specific type. So I'm going to literally say "char," and then I'm going to finish my thought with a semicolon. And here's now how I could check if the user agrees or not. I could do something like this. If the value of C equals equals, quote, unquote, lowercase 'y,' then go ahead and print out "Agreed backslash n." Else if the variable C has a value equal to lowercase 'n', let's go ahead and print out, say, "Not agreed," as though I'm agreeing or not to some terms and conditions. But notice these are not typos. What did I do ever so subtly different from last time I used text? Yeah? STUDENT: Single quotes instead of double quotes. DAVID MALAN: Single quotes instead of double quotes. So here's the heuristic. When using strings, which are generally multiple characters, have to use double quotes. When using a single character, you should use single quotes around the single character. So let me go ahead now and, make agree. Nothing went wrong, which is good. ./agree, Enter, and let me go ahead and type in y for yes. It seems to work. Let me run it again. ./agree. n for no, and it seems to work. And just if I type in something random like question mark, I don't know, it doesn't crash. It just ignores me because I only had two Boolean expressions there. But notice that it's actually a little buggy arguably. Let me run it again. ./agree, Enter. How about capital Y because, like, my Caps Lock is down. OK, it just ignores me. Let me do it again. ./agree, capital N because my-- oops-- because my Caps Lock is down. OK, it just ignores me. But this should make sense because I'm literally checking for lowercase. So how could I fix this? 
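A rough sketch of the first version of the agreement check, which compares a char against lowercase letters only. The function name `respond` is hypothetical, and returning NULL for "no branch matched" is an assumption made so the behavior can be inspected; the lecture's program calls get_char and printf directly in main.

```c
#include <stddef.h>

// Returns the message for a one-character answer, or NULL when neither
// branch matches (in which case the lecture's program prints nothing).
const char *respond(char c)
{
    // Single quotes denote a single char; double quotes denote a string.
    if (c == 'y')
    {
        return "Agreed.";
    }
    else if (c == 'n')
    {
        return "Not agreed.";
    }
    return NULL;
}
```

With this version, `respond('Y')` returns NULL, which mirrors the capital-letter input being ignored in the demo.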
How could I fix this without just changing lowercase to uppercase, because that would then break it in the other direction? Yeah? STUDENT: [INAUDIBLE] DAVID MALAN: Yeah, let's just add another branch here, so to speak. So if variable C equals equals capital Y, then I can go ahead here and say printf agreed. And then let me close my terminal to make more room. Otherwise, down here, else if C equals equals capital N, let's go ahead and again say printf not agreed. And I claim that this would actually now work. It's a four-way fork in the road, but I'm at least checking for lowercase, uppercase, lowercase, uppercase for y and n respectively. I claim that this is correct, but this too, even if you've never programmed before, should start today to rub you the wrong way. Like, we can do better. This isn't the best design. Why might that be? Yeah? STUDENT: Could you change the character c to be uppercase, like before you even [INAUDIBLE]? DAVID MALAN: Ah, clever. So could we change the variable c to just be forced to uppercase or maybe forced to lowercase? No matter what the human types, we just do that ourselves so that way we can just simplify this again to two possible scenarios. I love that, but we haven't seen any functions yet in C that would let me change things to uppercase or lowercase. So we'll get there, but a good instinct and correct. Other thoughts? STUDENT: Use "or." DAVID MALAN: So we could use "or" in some sense, like a logical "or." What I don't like about this, to be clear, is that it's repeating itself. And there's this principle in programming, and in life in general, like, don't repeat yourself unnecessarily. And by that I mean I literally have the same line 10 as 14. I have the same line 18 as 22. And if anything, one, I literally wasted twice as much time as I needed to. Put another way, per our discussion of Scratch, what if I go in and just change something like, I want to be more excited, like "Agreed!"? 
Well, I might forget to change it in the other place. And let's just claim for today's purposes that that looks stupid, it's a bug, because I want them to be consistent. So don't invite situations where you might change something in one place but not another. Just only write it in one place total. So I like this idea of "or"-ing things together. So let me go ahead and delete what I just did. And just to be clear, too, while this is on the screen, when you highlight code in VS Code based on how we've configured it, these dots just show you how much I've indented because in C, stylistically, the convention is generally to indent four spaces and maybe four more spaces. So those dots just help you count without having to manually eyeball things yourself. But let me delete those lines. Let me delete these lines. And this is going to look a little weird, but the way you can "or" two thoughts together, so to speak, is you don't say "or," but you use two vertical bars, which syntactically means the English word "or." And you can just ask the other question, c equals equals, quote, unquote, capital 'Y.' And then down here, I can say or c equals equals capital 'N.' So it adds a little more code to each of those lines, but it doesn't add redundancy, because I've not duplicated my printf. I've not added more curly braces unnecessarily. Now as an aside, the opposite of "or," logically, is the word "and." Just so you've seen it, I could do this. "&&" in C is how you express that the thing on the left must be true and the thing on the right must be true. But why would this make no sense in this context of line 8? STUDENT: It can't be uppercase and lowercase. DAVID MALAN: Yeah, at least to my knowledge, a character can't be both lowercase and uppercase. That just makes no logical sense. So indeed "or" is what we want in this case. Other questions? STUDENT: In CS50.h, is there a way to directly compare strings [INAUDIBLE]? DAVID MALAN: Good question.
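The "or" version described above could be sketched like this. The function names `agreed` and `declined` are hypothetical; the lecture places these conditions directly inside the if and else-if in main.

```c
#include <stdbool.h>

// True when the user typed y in either case. The two vertical bars mean
// "or": only one side needs to be true.
bool agreed(char c)
{
    return c == 'y' || c == 'Y';
}

// True when the user typed n in either case.
bool declined(char c)
{
    return c == 'n' || c == 'N';
}

// By contrast, (c == 'y' && c == 'Y') can never be true, because a single
// char cannot hold two different values at once.
```

Each printf then needs to appear only once, inside the branch guarded by the corresponding condition.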
Via CS50.h, is there a way to compare strings. Short answer, no. But C is going to give us that capability. And in fact, next week, among the things we'll do is actually compare strings. And if you've programmed before, you'll see in C that it actually doesn't work the way that you might expect. But that's a problem, too, that we will solve. But that transcends CS50. That's a question for C. Other questions on this kind of logic? Just to make this real then, anytime you click one of those EULAs or terms and conditions on a form in a piece of software, odds are there is code as simple as this underneath the hood. Maybe it's graphical. Maybe it's checking for you clicking this button or maybe hitting the Enter key. But underneath the hood is presumably some kind of conditional checking for those kinds of outputs. All right, how about another building block from last time, which we'll now translate to C, namely loops, things that happen again and again? And these, too, are everywhere in code. So in Scratch, here's how we might meow three times, super simple. In C, it's going to look a little weird. But you will get used to this over time if you've never programmed before. It looks like a mouthful, OK. But let's tease it apart line by line. And you'll see that you won't have that reaction frequently because it's all going to start to look very similar to itself. But what are we doing here? In C, you don't have the luxury of these cute and fun puzzle pieces that just do the work for you, repeat three times. In fact, in C and programming in general, sometimes the work is on us to actually figure out, OK, how can I use functions, variables, conditionals, and loops and implement some idea like repetition, like looping? And in C, here's how this might work. How can I go about doing something like printing "meow" three times? Well, I know about variables now. We're about to see loops. And I've seen how I can update variables by plussing or minusing some value to them. 
Let's combine those ideas. So first, I'm doing what with this highlighted line in English? If a friend cared to ask you, like, 'what is this line of code doing' later today, what would you say? STUDENT: It's creating a variable called "counter" and setting it equal to 3. DAVID MALAN: Good, it's creating a variable called "counter" and setting it equal to 3. I'll use slightly new jargon. I'm defining a variable, would be the term of art, called "counter" and setting it equal to 3. So I'll use my hand to represent the counter. And that's all a variable is. It's like storage in some sense, where I'm representing information, using my hand in this case or the computer's memory here. Now what happens when using a loop in C? There's different types of loops, one of which is called a for loop-- oop-- one of which is called a while loop-- spoiler. A while loop works like this. Inside of parentheses is a Boolean expression just like inside of a conditional that asks a question. But this time the question is going to determine, do you keep going through the loop again and again and again? So it's not a one-time thing potentially. It is checked again and again and again to decide when it is time to stop looping, to stop cycling. All right, so it's asking this question first. Is counter greater than 0? OK, obviously the answer is true because I'm still holding up three fingers. So what happens? C goes inside of the curly braces per the indentation and executes printf of "meow," which prints out a "meow" on the screen. The next line of code executes, which, recall, is the same as just subtracting 1 from counter. So I think I take down one finger, so I'm left with two. And what happens next? Well, this you just kind of have to memorize. Once you get to the end of the inside of a loop, you go back to the beginning of a loop here and ask the same question, the same Boolean expression. So is 2 greater than 0? OK, obviously so. So you go into it, you print a "meow."
You go into it and decrement counter further by one. So now my hand is holding up one. Now we wrap back around to the Boolean expression. Is 1 greater than 0? Obviously. We print out a third "meow." We then decrement counter again, and my hand goes to zero. We go back around once more. Is 0 greater than 0? No. And now the program just terminates. Or if there were more code here, it would just jump outside of the curly braces and keep going lower on the screen. So that's all that's happening. And so this is what MIT has the luxury of doing with pictures. But at MIT, someone probably essentially wrote code that looks like this to give us the illusion, the abstraction of this. So what we're learning today is how they invented these puzzle pieces by just using lower-level plumbing, if you will, like this here. Yeah? STUDENT: What would happen if you created the variable "counter" inside of the curly braces? DAVID MALAN: A really good question. What would happen if you created the variable inside of the curly braces? Short answer, it just wouldn't work in C, because if I were to try with my slide here, for instance, to move this line of code here down inside of this, for instance, now the very top line is trying to use counter before it even exists. So C is very literal. It reads top to bottom, left to right. And if it hasn't seen you define or create a variable yet, you're going to get some scary error message on the screen instead. All right, other questions on this here code? No? All right, so if we want to then maybe tighten this up a bit, let me propose that we could do this instead. So besides this version of the code, let me just do something more canonical, more conventional. So you're totally fine with using a variable like counter. It's what Scratch uses by default. It's very verbose. It does what it says.
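The countdown traced above can be sketched as follows. This is an illustration under assumptions: the function name `meow_countdown` is hypothetical, the starting count arrives as a parameter instead of being defined as 3 in main, and the returned tally is added only so the number of passes can be checked.

```c
#include <stdio.h>

// Prints "meow" once per pass while counter is greater than 0 and returns
// how many passes ran (the tally is not part of the lecture's code).
int meow_countdown(int counter)
{
    int meows = 0;
    while (counter > 0)            // checked before every pass
    {
        printf("meow\n");
        counter = counter - 1;     // take one finger down
        meows = meows + 1;
    }
    // When counter reaches 0 the condition is false and the loop ends.
    return meows;
}
```

Starting from 3, the condition is checked four times in total: it is true for 3, 2, and 1, and false for 0.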
Frankly, once you get comfy with programming, like most typical programmers, whenever they have a single integer in a program whose sole purpose in life is to count, they'll just use "i" for integer just like I used "c" for character. When you have larger programs, you don't want to start using A and B and C and D and E and F and so forth for your variables because nothing's going to make any sense. But when you're doing something super simple like counting with an integer, using a short-named variable is totally stylistically reasonable. But I can tighten this up further, not just renaming counter to i. What else can I do, if you recall? Over here? STUDENT: [INAUDIBLE] DAVID MALAN: Sure. STUDENT: A for loop? DAVID MALAN: Oh, OK, for loop. Yes, that was my spoiler. But while in a while loop, I can tighten this up slightly more. Over here? STUDENT: Instead of i equals i minus 1. DAVID MALAN: Yeah, instead of i equals i minus 1, I can actually tighten this up this way. And we didn't see the minus before, but it's the same idea-- i minus equals 1, or even more succinctly, i minus minus. So when you get comfortable with programming, any of these approaches are correct. This would be more conventional at this point. So if you want to write code like most other people write code, adopt ultimately these kinds of conventions. All right, so that just does the exact same thing, though. But let's now put this into practice. Let me go back to VS Code here. Let me go ahead and clear my terminal and close agree.c from before. And let me go ahead and create a file called "meow." So code meow.c. And let me do this the sort of wrong way. Let me include stdio.h. at the top, int main(void) thereafter. Inside of there, let me do printf "meow." And then you know what? I don't want to keep typing that. Let me just go ahead and copy/paste two more times. So I claim this is correct, make meow. ./meow, done. I've got code that prints "meow" three times. 
But this, again, should already rub you the wrong way. Why? Yeah? STUDENT: There's duplication. DAVID MALAN: Because what? STUDENT: There's duplication. DAVID MALAN: Because I have duplication. I mean, I literally copied and pasted it. And that's kind of a good rule of thumb. If you, in the future, start finding yourself copying and pasting code within the same program, you're probably doing something wrong. There's a better way to design it even if it's correct. So this is clearly a candidate for a loop. So let me go ahead and actually do that. Let me just go ahead and remove all of this duplication. Let me give myself a variable called i, set it equal to 3. Let me go ahead and give myself a while loop and check that i is greater than 0. Inside of this loop, let me print out just "meow" once. But I'll reuse that code again and again because here I'm going to do i minus minus. So that's the exact same code, the tight version of it that we saw a moment ago. Let me go ahead and "make meow" again, ./meow, and it still works. Why is this version better? Because if you want the cat to meow five times, you change it in one place. If you want to make the cat a dog, you change the meow to a woof in one place, albeit changing the file name eventually, but changing it in one place, not worrying about changing it again and again and again. But there are other ways to do this. For instance, let me propose that. And actually, let's see, let me propose that instead of just doing it this way, just to be clear-- yeah, let's go ahead and propose that instead of doing this, we can actually count in different directions. I'm kind of forcing this idea of starting at 3, going down to 0. But when normal humans in this room, if you ever count something, you probably do 1, 2, 3, and done. Like, that's how we would count in the real world. Well, we can do that, too, here code-wise. We could initialize i to 1. We could check that i is less than or equal to 3. 
And we've not seen this syntax before, but there's no easy way on a typical keyboard to type a less than or equal sign like in a math book. So we use two characters, a less-than sign and then an equal sign back to back. And that means less than or equal to. And this is the same idea so long as I plus plus i inside of it because that'll start at 1, then 2, but it won't stop then. It will go up until i is equal to 3. Once i becomes 4, then that Boolean expression isn't going to be true. So it stops after three "meow"s total. But there's another way, too, and this is probably the most conventional and the way you should do it even though the others are just as correct. In CS, as you've seen already last week, we almost always start counting from 0. Why? Just because, so we're not wasting a pattern of bits. So generally when you start writing code that counts, you should, quote, unquote, "almost always" start at 0, count up to but not through the total you care about so you don't get one extra by accident. And so this would be the most conventional way of doing what we just described. But they're all correct. You can make an argument that all of them are equally good. This is what most people, quote, unquote, "would do." OK, other questions on this here syntax or logic? No? All right, how about-- we got some cookies on the horizon. But before we get there, let's meow a few more times, if we may. So how about doing a little bit differently versus the while loop. And I think we heard it over here. Turns out there's another type of loop altogether. So this one here. And this one, if you can believe it, is probably even more conventional than the other way. And this is going to be thematic in programming. There's rarely one way, one right way to do things. You're going to have bunches of different tools in your toolkit. And your code might look different from someone else's because each of you tends to reach for a different tool in that toolkit.
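To compare the three counting conventions just described, here is a small sketch. The function names are hypothetical, and each function returns a tally of its passes purely so the three can be compared; the lecture's versions only print.

```c
#include <stdio.h>

// Counts down from n to 1, as in the first while loop.
int meow_counting_down(int n)
{
    int meows = 0;
    int i = n;
    while (i > 0)
    {
        printf("meow\n");
        meows++;
        i--;
    }
    return meows;
}

// Counts 1, 2, ..., n using <= (less than or equal to).
int meow_counting_from_one(int n)
{
    int meows = 0;
    int i = 1;
    while (i <= n)
    {
        printf("meow\n");
        meows++;
        i++;
    }
    return meows;
}

// Counts 0, 1, ..., n - 1: start at 0, go up to but not through n.
int meow_counting_from_zero(int n)
{
    int meows = 0;
    int i = 0;
    while (i < n)
    {
        printf("meow\n");
        meows++;
        i++;
    }
    return meows;
}
```

All three run the loop body the same number of times for a given n; they differ only in where i starts and which comparison ends the loop.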
And here's another tool-- and as you proposed earlier-- a for loop. A for loop is just another way of achieving the exact same idea using slightly different syntax. And it's appealing, frankly in general, because it's a little more succinct. It just saves some keystrokes even though you have to memorize the order in which it works. This code is identical to this code here functionally. But aesthetically, of course, it looks different. How does it work? In a for loop, notice that in the parentheses is not a single simple Boolean expression. There are three things. One, before a semicolon, is a place to initialize a variable to do your counting typically. Second is the Boolean expression. So it's still there. It's just surrounded on the left and the right by two other things. Lastly is the update. What do you want to do at the end of every loop through this block of code? So you can probably imagine where we're going with this. How does this work? The first thing that happens is that a variable called i is defined and initialized to the value of 0. That happens once and only once. Then we check the condition. Is 0 less than 3? Obviously yes. So now we don't do the plus plus yet. We go into the loop. And this is where the for loop's a little confusing at first. We print out "meow." Then what happens? There's no more lines. So we go back to the for loop, and we increment i at that point. So now i is 1. Then we check the condition. i is less than 3? Yes, because 1 is less than 3. We go back into the loop and print "meow." Now we go back to the plus plus, so i is now 2. We check the condition. 2 is less than 3 obviously. So we go back into the loop and print "meow." Then we do the increment. i is now 3. Is 3 less than 3? No, so we exit the loop, and we're done, or we keep going down here if there's more code. But how many times did I say "meow?" 1, 2, 3 total, when my hand was 0, 1, and 2. Questions on this alternative syntax? 
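The order of operations in a for loop, as walked through above, can be observed with a sketch like the following. The function name `meow_for` and the `seen` array are hypothetical additions that record the value of i on each pass; the lecture's loop only prints.

```c
#include <stdio.h>

// Runs for (int i = 0; i < n; i++), printing "meow" on each pass and
// storing that pass's value of i in seen[]. The caller must supply room
// for at least n ints. Returns the number of passes.
int meow_for(int n, int seen[])
{
    int passes = 0;
    // 1. int i = 0 runs once. 2. i < n is checked before every pass.
    // 3. i++ runs after the body, just before the next check.
    for (int i = 0; i < n; i++)
    {
        printf("meow\n");
        seen[passes] = i;
        passes++;
    }
    return passes;
}
```

For n equal to 3, the recorded values are 0, 1, and 2, matching the three passes counted on the hand in the walkthrough.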
It takes some getting used to, but most people would write loops using a for loop, I would say. STUDENT: Could you now in the curly braces, use just one line of code? DAVID MALAN: Yes. If you really want to be cool and save syntax, yes, it is correct and common to eliminate the curly braces if you only have one line of code therein. We in class will always put the curly braces there because this is the kind of thing where, if you get forgetful, you go in later and add a second line. Like, darn it, like you forgot the curly braces, things will not work as expected. So in general, use the curly braces, but you do not have to strictly. Other questions on 6? Yes? STUDENT: [INAUDIBLE] DAVID MALAN: Can be used without, what? STUDENT: [INAUDIBLE] DAVID MALAN: Oh, could you do it without the condition? Yes, there are very fancy things you can do that we won't focus on today. But yes, if you want to get rid of the condition, you could get rid of this here. And that would actually make the loop go forever, which may be a good thing if it's like a clock that you want to tick forever, but often not a good thing in code. Good question, though. All right, so beyond that, let's just go ahead and put this into context. Just in case it helps you to think about this, this is just another flow chart, if you're more of a visual thinker, that represents what it is this loop is now doing. Previously, all of our arrows went from top to bottom and stopped. But now there's an arrow going back, up, and around because of this loop, this cycle. So when we start this program, we set i equal to 0. We then check, is i less than 3? Obviously it is, so we print "meow." We increment i, and then we go back to that same condition. We check the condition. We print "meow." i plus plus, go back, go back. Now, if i equals 3, 3 is not less than 3, so the answer is false. And we stop. 
So again, it's just another way of thinking about how the code in Scratch, how the code in C might alternatively work in each of these contexts. But there's this one other puzzle piece in Scratch, recall, that's not the repeat block, which is for finite numbers of repetitions, but forever. And in C, there is a way to do this, but it's a little weird looking. There's no forever keyword. But you can use the while loop or, as you inferred, you can actually use the for loop without a condition in the middle. So here, I can actually say this. If I want to do something forever, I want to make sure that the answer to my question, the Boolean expression, is always true, always true, always true, the easiest way to achieve that goal is just literally write "true" there because true is true no matter what. And it's a trick for making the loop forever go around and around, as you might if you want the cat to live forever and meow incessantly or if it is a clock that you want to tick forever or the like. So here, for instance, is how we might have a cat meow endlessly, using this so-called while loop instead. But recall that in Scratch, we also had this ability to create some of our own puzzle pieces. And this, too, is something that we're going to be able to do here in C. And let me propose that we do exactly that by introducing the C analog of this. So here, for instance, is, in Scratch, our definition of a function called meow whose sole purpose in life was to just play the sound "meow" until it's done. This is going to look a little weird at first. But you'll notice some similarities with main. So recall this thing I keep typing with main, int main(void), int main(void). That's just the "when green flag clicked" equivalent for today. But if you want to create your own puzzle piece or your own function in C, you, for now, literally do this. You say, void, the name of the function you want to create, and then void in parentheses.
And technically what this means is that this function has no return value. It doesn't hand you anything back like get_string or get_int. And the "void" in parentheses means it takes no inputs. It only meows. You don't have to tell it how to meow. It's just going to meow. So no arguments, so to speak. This literally just prints out "meow." But what this does for me is it abstracts away the idea of meowing. I don't need to know how to use printf or that you're using printf to make the cat meow. I now have a function in life called meow because in Scratch, recall, I used it like this. When the green flag is clicked, I could repeat three times this new custom puzzle piece. But in C, I could now do this. In my main program, I can use a for loop just like we saw a moment ago, copy/pasted from earlier. But now I can call my own C function called meow. And let me go ahead now and do this. If I go over to my C code here, back in VS Code, let me go ahead and delete everything inside main. Let me go ahead and do for int i equals 0, i is less than 3, i++. Inside of my curly braces, let me go ahead and say "meow." But I now need this meow function to exist because if I do "make meow" again, notice error. "Implicit declaration of function meow is invalid in C99"-- the 1999 version of C. What does that mean? Well, it doesn't know what the meow function is. And the meow function is not in CS50.h. It's not in stdio.h. I have to create it. So let me type out or really copy/paste what I had on the screen a moment ago-- "void meow meow"-- [CHUCKLES] "void meow(void)" printf, quote, unquote, "meow," close quote, semicolon. But here, too, let me scooch this down a bit so you can see all the code at once. Let me now do make meow. And unfortunately, I still have an error. If I scroll up, still on line 7 of meow.c, my compiler thinks that meow is invalid, that it does not exist. This too is a common mistake. And as simple as this code might be in spirit, where did I screw up? 
Yeah, in the middle. STUDENT: You need to define the function like above where you use it. DAVID MALAN: Yeah, I need to define the function before I use it. So again, C is going to take you literally. If you try to call meow on line 7, you better not define it on line 11. You better define it higher up. So the simplest fix is going to be just to do this. Let me clear my terminal. Let me highlight and just delete the meow function. And let me just paste it up here. And this will actually solve the problem. Make meow now works. And if I do ./meow, that, too, works. But this isn't really the best solution because if your solution is constantly, oh, well, just put it up there, put it up there, put it up there, I bet we could contrive a situation where one function needs to be above the other, but the other needs to be above it. And that's just not going to work in general. And more importantly, it just pushes main lower and lower and lower in your file. But the whole point of your main function is like, that's the entry point. That is what happens when the green flag is clicked. And so just in terms of user conventions, it's just useful for main to always be at the top of a file because then you can find it fast. Your friends can, your TFs can find it quickly if it's at the top. So the other solution here would be to leave meow at the bottom and leave main at the top. But this is the only time, if I may, that copy/paste is OK. What I've highlighted here in line 11 is what's called the function's prototype. It is enough information to give you the return type, the name of the function, and the return value-- and any arguments. And so if you just copy/paste that one line and end it with a semicolon up there, that's enough of a hint to the compiler that, OK, it doesn't exist yet, but it will. And it will look like that.
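A sketch of the prototype idea, under two labeled assumptions: the function `meow_three_times` is a hypothetical stand-in for main (which sits at the top of the lecture's file), and the `meow_calls` counter exists only so the example can be checked.

```c
#include <stdio.h>

// Counter used only to make this example checkable; not in the lecture.
int meow_calls = 0;

// Prototype: the return type, name, and arguments of meow, ended with a
// semicolon. It tells the compiler the function will exist later.
void meow(void);

// Stands in for main: it calls meow before meow's body has been seen.
void meow_three_times(void)
{
    for (int i = 0; i < 3; i++)
    {
        meow();
    }
}

// The full definition can now live below its caller.
void meow(void)
{
    printf("meow\n");
    meow_calls++;
}
```

Removing the prototype line would bring back the kind of "implicit declaration" error shown in the demo, because the call would then appear before any mention of meow.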
That's the only time it's OK to copy/paste the very first line of a function you've written to the top of the file with a semicolon so that you can make the compiler happy. So if I do make meow now, still no errors. ./meow, and it now works. But let me add one final feature, coming back to Scratch here. And then it's time for a snack. So here, recall, was sort of the last fancy thing we did in Scratch, where we created not only our own custom puzzle piece, but it took an input so that we didn't need to keep using the loop ourself. We could just let the meow function be told how many times do you want the cat to meow. So in C, we don't have to make that many changes except this. We change the prototype to take an argument inside of parentheses. And this is the syntax for that. If you want your own function in C to take one or more arguments, you give the arguments a name, n, or whatever you want to call it. But you have to tell C what the type of that input is. So it's an int n. So it knows it's a number. And then you can just use n in your program. So instead of hard coding, typing manually the number 3, I'm just using n here. So this is equivalent to what I did with Scratch, by just dragging and dropping the n variable there. And then "meow" will get printed that many times. If I want to then use this-- notice, this is the last version of the cat that we did last week-- you just say "meow" this many times. So in C, this is where now the code gets very succinct because all the main part of the program does is meow three times. So this, again, is an abstraction. I don't need to know, care, or remember how meow is implemented. I just need to know what its return value, its name, and any arguments thereto are. So if I make this change, I think we can get the cat to meow any number of times. Let me go back over to my C code here. Let me go back into the file and change "void" here to be int n, where n just means number. 
I could use i, but n tends to be a quantity instead of a counter. I then, inside of this function, am going to do a for loop-- for int i equals 0; i less than n-- instead of 3-- i++. And then inside of here, I'll paste that "meow" again. I need to change my prototype to be identical, so another copy/paste, or just manually edit it. But now notice what's cool about main, is that now I can meow maybe three times. Make meow, Enter, ./meow. OK, or if I really want to be cool, I can change this to 30,000 times. Go back here, make meow. Increase the size of my terminal window for a dramatic pre-break flourish. And there are 30-- that was a fast cat. There are 30,000 meows. I think now let's go ahead and take-- that's a lot-- a 10-minute break. We'll see you in 10. Cookies are now served outside. All right, so we are back. And I realize this has been a lot so far, right? So there's a lot of new syntax. There's a lot of translation of Scratch over to C. But among the goals of having spent last week in Scratch and having spent problem set 0 in Scratch is that none of today's ideas are really all that new. It's just a lot of syntax that will get more comfortable and more in your muscle memory as time passes. Up until now, though, we've focused largely on these side effects, like things happening on the screen. And that was akin to the speech bubble appearing in the world of Scratch. But let's focus for just a bit-- before we then explore things we can't do very well in code-- on return values instead in C. We've seen them already. Like, get_string returns a value. Get_int returns a value, a string and an int respectively. But what if we want to make our own functions that don't just meow and visually have this side effect of meowing on the screen but actually hand us back some value? Well, I bet we can do this in C, as well. Well, let me propose that to go that route-- let me go back to VS Code here.
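Before moving on to return values, the parameterized meow built before the break might look roughly like this. The `meows_printed` tally is an assumption added only so the example can be checked; the lecture's function just prints.

```c
#include <stdio.h>

// Running total used only to make this example checkable; not in the lecture.
int meows_printed = 0;

// Prototype updated to match the definition: meow now takes one int.
void meow(int n);

// Prints "meow" n times. The caller decides how many; the loop that does
// the counting is hidden inside the function.
void meow(int n)
{
    for (int i = 0; i < n; i++)
    {
        printf("meow\n");
        meows_printed++;
    }
}
```

From main, a single call such as `meow(3);` replaces the loop that main previously had to write itself.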
And let's make our very simple calculator that just adds some numbers together. But the same calculator, we'll soon see, is going to get us into trouble if you don't understand what the computer is doing underneath the hood. Let me go ahead and run code of, say, calculator.c. And in here, let me go ahead and give myself access to the CS50 library with CS50.h, the stdio.h library with stdio.h, int main(void), which, again, we'll just take for granted today that we have to include atop any of these programs. And let's just add two numbers together-- super simple calculator. So it gives me a variable called x. Assign it the return value of get_int. And I'll ask the user to give us x. Give me another variable called y. Assign it the return value of get_int again. But this time, ask the user for y. And then, lastly, let's just go ahead and print out the value of x plus y. But I don't think I can get away with something like this, x plus y semicolon, because if I do this, based on what we've seen before, what's actually going to get printed out? STUDENT: x plus y. DAVID MALAN: Right, literally like x plus y. So I think this is where I need the F in "printf" for formatting. What I think I really want to do is print out the value of some placeholder because, what do I want to substitute for percent i maybe as a second argument to printf intuitively? Maybe just x plus y. So indeed, I can get away with this because it turns out in C, there's a bunch of arithmetic operators, all of the ones that you might expect, including addition, subtraction, multiplication, division, and even this one, the so-called modulo operator, which generally gives us the ability to calculate a remainder when you divide one number by another. But I'll keep it simple with addition. And indeed, with printf, if I want to print out the value of x plus y, I can do that. But I have to tell printf what kind of value to expect, an integer, thus the percent i instead of %s for string. 
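As a small illustration of the format code and the operators mentioned above, consider the sketch below. The function names `print_sum` and `remainder_of` are hypothetical, and the values arrive as parameters instead of coming from get_int, so the sketch does not depend on the CS50 library.

```c
#include <stdio.h>

// Prints the sum of x and y. The %i is a placeholder that printf fills
// with the integer passed as its second argument.
void print_sum(int x, int y)
{
    printf("%i\n", x + y);
}

// Returns the remainder when x is divided by y, using the modulo
// operator. The caller must make sure y is not 0.
int remainder_of(int x, int y)
{
    return x % y;
}
```

Writing `printf("x + y\n");` instead would print the literal characters x + y, which is why the placeholder and the second argument are needed.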
And I think this should do the job. So let me go back to my terminal. Make calculator, Enter. All is well so far. ./calculator, and let's keep it simple-- 1 for x, 2 for y. And indeed, I get 3 as the output. It's not very dynamic. It can't do a subtraction or multiplication or much more. But it does at least do those kinds of calculations. But let me propose now that we maybe make a reusable addition function, right, because addition is something I'm going to do a lot. And maybe it should be abstracted away with a function just like meowing was abstracted away a moment ago. So let me go ahead and instead of doing this, let me go ahead and give myself a function called add, but instead of last time where I had a meow function, I'm obviously going to call this "add" instead. And instead of last time, taking in no arguments, I think I want add to work a little differently. I don't want add necessarily to take an argument yet, but I do want add to return some type of value. And just intuitively, what type of value should an addition function return? STUDENT: An integer. DAVID MALAN: An integer, so an int. So I'm going to change void, which means the absence of a return value-- nothing's coming back-- to literally "int." But I'm not going to change the thing inside parentheses yet. I'm going to go ahead and copy my prototype up here. And I'm going to make this change, return x plus y. And then here, instead of printing out x plus y, let's go ahead and do this. Let me give myself a third variable just for now. z equals the return value of this brand-new add function that's going to add x plus y for me. And then let me print out the value of z. Instead of x plus y, I'm outsourcing now to this add function so it will do the addition of x plus y. So similar in spirit to meowing, but the return values, I claim, are about to create an issue. So let me make calculator again. And there's definitely some errors. So here we have, "use of undeclared identifier x." 
And that's on line 17. So that's pretty far down in the file. So specifically, my compiler does not like my use of x on line 17. But wait a minute, x is clearly defined on line 8. What intuitively might explain this issue even if you've never programmed before? Yeah? STUDENT: Well, because x and y are defined in the main function, not the add function. DAVID MALAN: Yeah, because x and y are defined in the main function, not in the add function. So the term of art here that we're about to introduce is something called "scope." So "scope" just refers to the context in which variables exist-- the context in which variables exist. So by that, I mean this. On line 8, I've declared x. On line y, I've declared-- [CHUCKLES] on line 9, I've declared y. But the catch is-- and here's where the curly braces are helpful-- those variables only exist in the context of the outer curly braces that are nearest to them, like this. So I can use x and y on lines 10, 11, 12, and even up to 13, but not thereafter. So I certainly can't use x down here on line 17. But this is a problem, because if add's purpose in life is to add x and y but add can't access x and y, well, we have an issue of scope. Like, x and y are not in scope for this add function. But that's OK because remember that every function we've seen thus far can have maybe a return value or a side effect, but it can also take 0 or one or two or more inputs, known as arguments. So what if I instead do this? Let me clear my terminal window. And let me update add to not take nothing as input but maybe two integers. And I'll call them arbitrarily a and b. But I have to tell the compiler what type of arguments they are-- two integers, one after the other. And now what I can do is this. Let me change this up here, too-- int a, int b-- just so that the prototype is exactly the same. 
And the only purpose of this prototype is just to avoid the previous error, where the compiler didn't realize add was going to exist because it came later in the file. So here on line 11 now, if I want to add two values, x and y, this is now the syntax. We saw syntax in Scratch for passing in inputs to tell it how many times to meow. So this is just telling add what two numbers to add together. So now I have to change this to a plus b, for reasons we'll soon see. And let me see if this is right. Make calculator. So far so good. ./calculator. Let's do 1 and 2 for x and y respectively. And hopefully we should, again, see 3. Now, what's going on? So here, again, if I zoom in on my add function, this "int" here on the left, on line 15, means what about add? STUDENT: [INAUDIBLE] DAVID MALAN: This means that it has a return value, that it's an int. So it's going to hand me back, metaphorically, a slip of paper with an answer on it that is of type integer. It's not a word, like my name. It's a number instead. These mentions of int here and here are inside the parentheses, which means this function, add, takes two inputs. The first is an int. The second is an int. And just so we have something to call them, I call them a and b respectively. So what happens essentially when I call the add function now on line 11, I'm kind of passing in x. I'm passing in y. But the add function is going to think of them as a and b respectively. It could call them anything I want. I could change this to the word "first" and "second." And then I could literally change this to "first + second." Those are perfectly acceptable as argument or variable names. But who really cares? Like, a and b for such a simple function is perfectly reasonable, too. Technically, if your mind is going there, I could even call them the exact same thing. 
But let me propose for today, certainly don't do that because it just confuses things if you've got x's and y's here, x's and y's here, but they're clearly different. Just don't do that. Try to come up with different variables just to keep yourself sane. But here, I have a function that takes now two integers, a and b respectively. It just returns the sum of them so that I can now store the return value of add in a variable called z. And then, quite simply, print it out. But there's one other thing I can do here. Now, if we think about design, even if you've never programmed before, do I really need the variable z? Because I'm defining it on line 11, and then I'm quickly using it on line 12, and that's it? Like, sometimes you don't need variables. And they might make your code more readable. But strictly speaking-- and this is just kind of like substitution in math-- if z is the same thing as "add (x, y)," well, let me go ahead and just delete line 11 altogether. Let me get rid of mention of z. You can actually get away with doing this. And much like the Join block in Scratch, where I kind of overlaid it on the Say block, kind of stacking them, you can stack functions in C, or nest them really, kind of mathematically. Honestly, it makes it a little harder to read because your mind has to dive in conceptually deeper and deeper into this second argument. But it's perfectly acceptable, too. And just to connect the dots to maybe something from high school, this is kind of analogous to a function in math class being like f of x, where f is some function name, x is some arbitrary input to that function. And when you start to put functions inside of functions so that the output of one becomes the input to the next, it's like using this syntax, f of g of x and so forth. If you've never seen that before, don't worry. But if you have, it's a way to connect some of these dots. 
Any questions, though, on just this idea of now having a function that doesn't just have a side effect but instead has a return value? Yeah, in back? STUDENT: In our declaration of main, why did we show it as returning an integer instead of void? DAVID MALAN: In our definition of main, why did I do, what? STUDENT: Why do we show it as returning an integer instead of returning as void? DAVID MALAN: Oh, a really good question that I was trying to sweep under the rug for today. But in every one of our programs thus far, I have indeed said "int main(void)." Technically speaking, whenever you write a program and it finishes running, it actually returns a value somewhat secretly. It returns the number 0 by convention, which means all is well. And it can return any other integer if something goes wrong. In fact, on your Mac, PC, or even phone, if you've ever gotten like a weird message on the screen, like something went wrong and it's like a weird numeric code, like error negative 129, or something arbitrary like that, that tends to mean that some program running on your Mac, PC, or phone had something go wrong with the main function. And that is the number that was returned. But that's more than we want to talk about today. But we'll come back to this. But main always returns a number. By default, it is 0. More on that soon. All right, so with that said, let's actually tease apart what it is we've been using underneath the hood here a little bit by returning to VS Code's interface itself. It turns out that all this time, even though I keep alluding to macOS and Windows, which like 99% of us are probably running on our laptops or desktops, there's actually other very popular operating systems in the world, among which is Linux. So Linux is a very popular operating system, the thing that turns on-- the thing that boots up when you first turn on a computer. And it's very commonly used for servers nowadays. All of CS50's own servers run some version of Linux. 
Those students more comfortable sometimes run Linux on their own Macs or PCs even. So Linux is a very popular operating system. And it's particularly characterized by its textual interface, its command-line interface, even though it also comes with graphical ones, as well. So again, this term we started today with, a graphical user interface is a thing with menus and buttons. It's literally what you and I use every day on our devices, otherwise known as a GUI. But today onward, you'll get more comfortable with and more practice with this terminal window down here, which represents a command-line interface, or CLI. And just so you have a mental model of what's going on in the cloud here, when you access cs50.dev, you are accessing this version of VS Code in the cloud, a piece of software just running in a browser. But that piece of software is automatically connected to your very own personal server in the cloud, so to speak. Technically speaking, it's a "docker container." But it means that you have essentially your own mini server in the cloud that only you have access to. And that server or container is running an operating system called Linux. And in fact, every time I've been running a command down here, whether it's code or make or ./hello or anything else, I've been running commands from here in Sanders Theatre on a server somewhere in the cloud, my own Linux container or server. And you'll have the same yourself. This is the thing that we have pre-configured for you by installing the compiler and so many other pieces of software you'll soon see in the class. But underneath the hood, then, of Linux is a soon-to-be familiar environment that allows you to run different types of commands. And those commands include things like this. And this is something you'll develop muscle memory for over time. 
But I wanted to give you a sense of some of the most popular textual commands because we're essentially about to take away muscle memory you have from a GUI world and have you type out words that represent double-clicking on things, dragging on other things, and other such commands that you and I take for granted. It'll be a little painful at first in the first days or weeks, but it will make you far more productive long term so that even after CS50, if you start using your programming skills in some other domain, class, or real-world job, you'll just be a lot faster at the keyboard and able to do more work more quickly. So with that said, let me go back to VS Code over here. I'm going to go ahead and open up my File Explorer over here. And you'll see at left all of the files that I've created thus far in class and all of the programs that I've compiled thus far in class. I also have this source 1 directory, which you can download from the course's website, which has all of today's code pre-written in advance, so you don't have to type everything that I literally type. But all of these files are things that I've created. And you'll see that in white are the C files. And grayed out are actually the binary files, the machine code that I created that I was running. So you can click on any of these files in VS Code to open them. For instance, here is hello.c. And voila, it opens in the text editor. But if I try to open "hello," that's not going to work, because that's zeros and ones. And frankly, the computer could show me all those zeros and ones, but it's just not going to be useful. And honestly, it's too easy to make one mistake and break the whole thing. So instead, VS Code says that it can't display the text, because it's binary or maybe unsupported more generally. So know that you want to only click on the .c files when writing C code. But let me go ahead and do something else. Suppose that I decide that, wait a minute, we're nearing the end of class. 
And we're not done yet, but what if I want to change hello.c to goodbye.c or if I want to change meow.c to woof.c and turn it into a dog? Well, let's actually do that. I could go over here and right click or Control click on the file, just like on a Mac or PC. I can find the Rename option. And I can do it all via the GUI. But you should get more comfortable using commands like these here. And among the commands on this list are "mv" for move, a.k.a. rename. So for instance, if I want to change meow.c to be woof.c instead, I literally type "mv" space, the original file name, space, and the new file name. So this is very similar to what I've already been doing with the code program or the make program. I not only type the name of the command but also the thing that I want to code or the thing that I want to make. In this case, I type the thing that I want to move from old to new. Now, if I hit Enter in a moment, watch on the left-hand side, meow.c in the GUI should automatically change even though I'm doing this all via the command-line keyboard interface. And now it becomes woof.c. I mean, it's not all that exciting. But this is just to say that they are one and the same. One is a GUI, one is a CLI, but it's the same exact thing. Moreover, let me go ahead and close now the GUI at left, the so-called explorer. And in my terminal window alone, now I'm kind of out of my element, like wait a minute, what was the file I created earlier? Well, there's other commands as well. On this list is, coincidentally, "ls," which lists the files in your current folder. And as you might have gleaned here, "mv" for move, "ls" for list, CS people like to be succinct, terse, and type the minimal number of keystrokes. That's why these are all abbreviated commands instead of full words. But if I go back to my terminal window and type "ls," voila, there is exactly the same contents of my server but displayed textually. And there's some heuristics here. 
In green with an asterisk are all of the programs that I made with make that are executable. So the asterisk just means this is executable with dot-slash. Meanwhile, the source 1 directory, which only I have because I downloaded it in advance, has a slash to indicate that it's a folder instead of a file. But all of these white files ending in .c we created together here today. Now, what if I really am embarrassed by my very first program, hello.c? Well, I can very destructively go and use the rm command for remove. And rm hello.c is going to prompt me, a little cryptically, "remove regular file 'hello.c'?" And amazingly, this rm program has code just like we wrote earlier for agree.c, where I can type 'y' to delete it. I can type 'n' not to delete it. But let's delete it. Let's go ahead and hit y, enter. Nothing seems to happen. But in general, that's a good thing. But if I type "ls" again, notice what is now missing? And in fact, the list is a little shorter. So it's one line instead of two. "hello.c" is now gone. Now, if you do that, there are not easy ways to get the file back. So don't do that unless you really want to. But there are backups maintained of these files, as well. Well, what else is there, too? Well, there's all of these other commands. And you'll experience them over time. Like, "cp" is copy. "mkdir" is make directory. "rmdir" is remove directory. And for instance, let me just show you one folder. If I type "ls," there's that source 1 folder that I claimed I downloaded in advance. If you want to see what's there, you can type "cd" for change directory, source 1, enter. And voila, notice that your prompt has now changed. And let me clear the screen. Just as a visual reminder of where you are, you can see before the dollar sign now the name of the folder that you're inside of. So in Mac or Windows, you'd see obviously a graphical folder. Here, you just see a little textual reminder of where you now are. 
And if I type "ls," you'll see that I wrote a crazy number of files before class. And each of these represents different versions of the files that we've been coding here in real time that I usually have printouts of just to go through things in series so you have copies online, as well. So in short, all of these commands, if you've never used them before, they will soon become like muscle memory. And they do the most basic of operations. But there will be other commands that we'll see over time that do even much more than that. But let's go ahead now and solve some actual problems. And it's no coincidence that we keep showing or alluding to Super Mario Brothers in some form, an older game from the Nintendo Entertainment System, that allows you ultimately to have this two-dimensional world, where Mario moves up and down and side scrolls from left to right. But you'll see we can distill even some aspects of "Mario" into some fairly representative programming problems. And in fact, let me propose that we consider this screen from the original Super Mario Brothers. So there's these four blocks in the sky, each with a question mark. And if you click on one of these-- or if Mario jumps up underneath each of these question marks, he gets like a coin or something else that pops out. Let's distill this, though, into its essence and consider in C, how can we make, not a blue sky yet, not a green grassy hill, and so forth, but how can we just make four question marks in a row, because I dare say that we do have the building blocks via which to do this. Well, the simplest way might be to go over here and run code of mario.c. And then in mario.c, let's include some stdio.h so we have printf. Let's do int main(void), as we keep doing. And inside of main, let's keep it super simple-- 1, 2, 3, 4, backslash n. Doesn't get much simpler than that. This is not going to be the prettiest of games. But if I make Mario now, ./mario, I get my four question marks in the sky. 
All right, so it's not all that interesting. But this is clearly a candidate for what type of programming feature. STUDENT: Scratch. DAVID MALAN: Not to Scratch, though Scratch would make it more interesting. Yeah? STUDENT: A loop. DAVID MALAN: So some kind of loop, right? So print the thing out iteratively instead. So let me do that. Instead of just printing this out all at once, let me go ahead and remove this and do for int i gets zero; i less than 4; i++. And then in here, let me go ahead and print out just one question mark instead. And now let's run this. So "make mario" to recompile it, ./mario. And does anyone not want me to hit Enter yet? Why? STUDENT: Because it's gonna print a new line. DAVID MALAN: Yeah, it's going to print out a new line every time. So notice it's four question marks, but they're each on its own line. All right, well, let me fix this. It's obviously because of the backslash n. So let me remove that. Let me rerun make mario, ./mario. And it's better in one way but worse in another. So wait, but now the dollar sign is doing that thing where it's on the same line, which just looks stupid if nothing else. So how can I fix that? Yeah? STUDENT: [INAUDIBLE] DAVID MALAN: Yeah, so logically, we don't have that many building blocks today. It's a lot of new syntax, but it's not that many new ideas. Let's just use printf to print out literally one and only one of these backslash n's, but outside of the loop so it happens after all four of those have been printed. All right, let me do make mario again, ./mario. And OK, now we're back in business. So sort of silly syntactical details, but if you reduce the problem to its essence, it should hopefully, logically, become clear over time. All right, well, how about not just something like that but vertical? Well, we've done something vertical already. And so I can imagine we could change the program to very simply print out three bricks instead of four question marks. 
But what if we consider a two-dimensional world? And later on in this game if you go underground, everything looks like this with lots of bricks. And let me propose, for the sake of discussion, that this big wall here is like a 3-by-3 grid of bricks. So it's not just a single brick. It's like three by three, or nine total. Now things get interesting. And let me go back to mario.c. I could take the easy road out and just say, all right, well, let's printf, how about 1, 2, 3, backslash n, close quote. And then, OK, let me just copy/paste. And I'm using hashes instead of the actual bricks. But aesthetically, it's pretty close. Let me now go ahead and "make mario" again, ./mario. And it doesn't quite look like a square. But that's just because the hashes are a little taller than they are wide. But it is correct, but not well designed. So here, too, what would be better designed than just hardcoding, typing literally all of these hashes? Yeah? STUDENT: We could use maybe two loops. DAVID MALAN: Interesting, two loops. And why two loops instead of one? STUDENT: Oh, wait, nevermind. Well, I was going to say you could do it one for vertical and one for horizontal. DAVID MALAN: OK, it's the right instinct. So one for vertical, one for horizontal. And even though these predate most of us, old-school typewriters you might know or might recall that if you feed a piece of paper into it, you can print like line, then it scrolls, line, then it scrolls, line, then it scrolls. This is kind of how the terminal window works, too. You can print rows and columns, but you have to print one row at a time, one row at a time, one row at a time. It's not easy, but it is possible to go backwards and go up and down. But just going row by row by row is more typical. So how can I do this? Well, I could use at least one and maybe even indeed two loops. And this is where we're just now composing different ideas from today and even last week. 
So let me go ahead and say, for int i gets 0; i less than 3-- for a 3-by-3 grid-- i++. And now let me cheat slightly. Let me print out just three of these here, and that's it. So I'm kind of cheating. I'm printing out rows dynamically, but I'm still printing three columns all in one breath. But let's see what happens. Make mario, ./mario, and it does work. But what if you said, no, I want 4 by 4 or 5 by 5 or 6 by 6? Now I have to change the 3 to a 6, and I have to add another three hashes here. Things get messy if we don't do this mathematically. So let me now do this instead. Why don't I go ahead and print out every row at a time. But for each row, let me use another loop to decide, like, rat-a-tat-tat, from left to right, how many do I want to print. So to do this, I could do another for loop. I could call this variable something different. j is pretty common. We start at i, we go to j. If you go past k, maybe l, you're probably doing something wrong. You don't want nested, nested, nested loops, but two is OK. j equals 0; j is less than 3; j++. And then here, I can print out a single one of these and no new line. I don't want to screw up like I did before. So I'll just do one. Let me go ahead and do make mario now, ./mario. But when I hit Enter, this is not correct yet. What's it going to look like? STUDENT: A single line? DAVID MALAN: A single line of nine hashes, I think, because I never used a single backslash n. So that looks wrong. So between what line number should I insert a printf of backslash n? Let me look a little farther back if I can. How about over here? Yeah? STUDENT: 10 and 11. DAVID MALAN: Between 10 and 11. So I'm going to go in here. I'm going to add printf, quote, unquote, "backslash n" semicolon. Let me go back and recompile mario-- ./mario. And crossing fingers-- voila, perfect. I printed out now a 3-by-3. Now, it's correct. It's not, if we want to be really nitpicky, maybe still not the best design. Where am I perhaps repeating myself? 
Yeah? STUDENT: [INAUDIBLE] DAVID MALAN: Yeah, I mean, it's not a huge deal. But now I have two, people would call these magic numbers. "Magic" in the sense of, where did that come from? You just randomly put it in the middle of your code. And you also put the same thing here. Now I have to make sure I don't screw up and make one change but not the other. So it turns out we can factor these out. I can actually do something like this, int n equals 3. And then I can just change this to n and this to n, which is marginally better because now I only have to change n in one place if I want to make this thing bigger or smaller. It's still going to work the same. So make mario, ./mario. There's our 3-by-3. But if I want to make a 5-by-5, let me change the n to 5, rerun make mario, ./mario. And now it's a bigger grid, 5-by-5. But this is a little fragile. And it turns out there's another trick we should introduce. It turns out that C supports what are called constants, whereby if you have a variable that you want to exist because it's useful but you don't want to accidentally change it, or if you're working with a partner in class or a colleague at work, you don't want your partner or colleague to accidentally change that value with their own code, you can go into your code and tell C, this is actually a constant integer, a const, so to speak. And this will just prevent you or someone else from doing something stupid by accidentally changing it elsewhere. The code is still going to work the same, ./mario, but you won't be accidentally able to change it very easily to something else. And honestly, what we've now done, too, is set ourselves up to make this more dynamic. Let me go up here, and let me add the CS50 library so that we have access to get_int because now we could do something fancy like ask the get_int function for the size of this brick wall. And then we can use n dynamically. So for instance, let me increase the size of my terminal, make mario, ./mario, size 3. 
Gives me a 3-by-3. ./mario size 5 gives me a 5-by-5. ./mario, how about 50, gives me a crazy big one, but it's all dynamic. And now I don't have to even change the code. It just now works. As an aside, if you're wondering how I type so darn fast, sometimes it's just because I'm hitting the up arrow. It turns out that Linux will remember, if you configure it this way, all of your previous commands. So if you hit up, up, up, I can go through the past couple of hours of commands that I've typed, which is useful sometimes-- not for hours of commands but the past few-- just to save yourself some keystrokes. And another trick in a terminal window is to do this. If I do ./ma and I get bored and I don't want to type out "rio," I can also just hit Tab, and it will autocomplete based on the characters that do match. So those kinds of tricks, too, will save you time over time. But let's do this. It's kind of broken, arguably, if I do this. How about "cat?" All right, well, that works. That prevents me from doing something stupid because get_int only accepts integers. But it will accept 0, which does nothing. It will accept negative 1, which does nothing. And that's not bad. It's not doing something weird. But it would be nice to catch that and force the user to give us a positive integer instead so we at least see something on the screen. So let me go back into my code, and let me propose that now that we have the CS50 library, why don't we do something like this? I'm going to change this. I'm going to get rid of the constant just in case the user needs to type it again. And what if I do this? While n is less than 1-- so if it's 0, negative 1, negative 2, or whatever, let's go ahead and again ask the user for an int, and ask them for the size again. And therefore, only once n is not less than 1 will this loop break out and will proceed with the rest of the code. So now let me try this. Make mario, ./mario 0-- didn't like that. Negative 1-- didn't like that. 
Negative 2-- didn't like that. 3-- it did like that. So using a loop now, I can ensure that the human is providing me with input that I actually want. So this is correct. But I dare say 6 through 10 could be done better. Why is this poorly designed instinctively? Yeah? STUDENT: There's repetition. DAVID MALAN: What's the repetition, to be clear, what lines? STUDENT: Lines 6 and 9. DAVID MALAN: 6 and 9. OK, so they're literally the same, and that's generally not a good thing. And maybe I could change this one to remind the user like, hey, that's not a positive number. So you might want to customize the message. But just having copy/paste here for the most part is not a good thing. So it turns out-- and there's just one feature of C we wanted to introduce you to today-- it turns out there's one other way that would actually help us eliminate this redundancy of using get_int twice and particularly asking literally the same question-- size-- twice in duplicate. So I'm actually going to go into my code here, and I'm going to delete the loop as we've written it thus far. And instead of using a while loop, I'm going to introduce instead something that we typically call a do while loop, which is a little bit different. Indeed, we begin with the keyword "do," and then inside of the curly braces, what I'm going to do here is that thing I might want to do once and more times thereafter. So for instance, I'm going to say n equals get_int quote, unquote, "size." And then at the bottom of this block of code, then I'm going to use the keyword "while," as well as parentheses as always for a Boolean expression. And here, I'm going to ask the question, do this while n is less than 1. But there's one fix I still need to do here because notice on the current line 8, I actually haven't given n a type. I haven't declared n yet. But it would not be correct to declare n here, inside of that do block. But why might that be? 
Why would it not be a good thing to declare n inside of these curly braces? Yeah, so recall that this is an issue of scope. Recall that the scope of a variable is generally confined to the most recently opened curly braces in which that variable is declared. And so if I declare this variable on line 8, I'm not going to be able to use it on line 10. But there is a fix, even though it might look a little strange. I'm going to go above my do block here. And before I go into this loop, I'm actually going to declare n to be an integer, but semicolon, end of thought. I'm not going to bother giving it a value, because I know logically I'm going to end up giving it a value anyway now on line 9. And so what's different about this version of the code is that the do while loop ensures that we prompt the user for input at least once. And then while that input is not what we expect, for instance less than 1, then it's going to execute again, again, again. And indeed, the semantics are just that. Do the following while this Boolean expression is true. So if I go ahead now and rerun make mario, compiles OK-- ./mario. And now I'll go ahead and input something that's not correct, like 0. But I'm prompted again. I'll input something like negative 1, and I'm prompted again. But if I go ahead and input, for instance, 10, now, because that's a positive integer, I indeed get a 10-by-10 grid of bricks. And there's one other thing we should introduce here, too, in C, too. C supports comments. And a couple of you have asked about this if you come from other programming languages. Suppose I want to remember what it is I just did with this program. Let me go in between lines 5 and 6 here and do "// prompt user for positive integer." This is what's known as a comment. And it's grayed out only in the sense that the compiler is not going to care about this. The computer is not going to care about this. This is a note to self, like a sticky note in the context of Scratch. 
And it starts with "//," which essentially tells the compiler ignore this, this is for the human, not for the computer. But this comment, so to speak, is a way of just reminding yourself, reminding your colleague, reminding your TF what it is a few lines of code are meant to do. And now this comment might be print, and how about n-by-n grid of bricks? And what's nice about comments is that theoretically you can get away with, or someone else can get away with, just reading this comment and then not even have to look at the rest of the code. They can look at this comment and not have to look at the rest of the code because you've described for them what it's meant to do. Yeah? STUDENT: I just had a question about the hashtag [INAUDIBLE] DAVID MALAN: Sure. STUDENT: That's for [INAUDIBLE] DAVID MALAN: Correct, the hash sign in Python is a comment, is not the same thing in C. In C, hash include means to include the library's header files in that way. Other questions on these here tricks? No? All right, so as promised, what is maybe C not actually good at? Well, let me propose that we consider what's actually inside of your computer. At the end of the day, whether it's a Mac, PC, iPhone, Android, phone, or some other computer device, there's something that looks like this. And this is memory, otherwise known as RAM, or random access memory, for reasons we'll get to in a few weeks. But this is where data is stored. This is where "hello, world" is stored. This is where 1 and 2 and all of those numbers are stored. Any data in your program is stored ultimately in the computer's memory. And the most important takeaway for today is that all of us only have a finite amount of memory in our devices. You might have a high-end device which has a lot of memory, but it's still finite, which means you can only count so high with that device. You can only store so many files with that device. 
There are fundamental physical limitations even though mathematically, theoretically, we should be able to count toward infinity. So what are the implications of this? Well, consider this. In the world of numbers, as per week 0, if you're only using three digits-- and I've grayed out the fourth one just to make the point-- if you're only using three digits, we can count from 0 to 1 in decimal, to 2, to 3, to 4, to 5, to 6, to 7. And as soon as you count to 8, you technically, per last week, need a fourth bit. But if you don't have it, the number 7 might seem to be followed by what number instead? STUDENT: 0. DAVID MALAN: 0. The number overflows, so to speak, right? You carry the 1. But if there's no place to put the 1, because there's no fourth light bulb, if there's no fourth transistor, if there's no fourth bit, the lower bits, the zeros, are going to be mistaken for the number you and I know as 0. So integer overflow is a thing in computers whereby if you don't have enough memory, if you count high enough, the number will wrap around back to 0. Or sometimes it will wrap around to a negative number, depending on whether the code supports negative and positive numbers and 0 alike. So that has some very real world implications in integer overflow that's sort of a fundamental limitation of how numbers are typically stored. Now, thankfully, we typically don't store things based on number of digits but number of bits. And a bit is just a 0 or 1. And recall from last week that a common unit of measure is minimally eight bits, or a byte, but even more commonly is 32. So for instance, here are 32 bits, all zeros. And if you do out the math, this is the number you and I know in decimal as, of course, 0. But if I change all 32 zeros to ones, this is a really big number now. If we're only using positive numbers, not negatives, what number roughly is this, 32 ones? It's roughly 4 billion in total-- roughly 4 billion in total. Why? 
Well, if you've got 32 bits, each can be two possible values, 0 or 1, that's 2 to the 32nd power, which is roughly-- I'll stipulate-- roughly 4 billion total. The problem is, what if you want to count to 4,000,000,001? That's a bit of a white lie. It's not precisely that. But what if you want to count just higher than that? You'd need a 33rd bit because all of the others are going to go to 0 at that point, and you might count from 1 to 2 to 3 to 4 billion back to 0, or worse if you're dealing with negative numbers, too. So the fact that there are finitely many bits used in computers is a problem. And negative numbers do add a complexity because this is specifically the 4 billion in question-- 4,294,967,295. That is as high as you can count with 32 bits if you don't bother with negatives. But if you want negative numbers, you've got to halve that because you've got to save half of them for negative, half of them for positive, give or take. And so if you're supporting negative numbers, as you probably should for a calculator, for Microsoft Excel, Google Spreadsheets, you can only count as high up as 2 billion roughly, or negative 2 billion roughly instead. So it turns out that when you are using data types in C, you have some control over how many bits are actually used. And this list is longer than we've covered today, but we did talk about integers for a while. Those are, by convention nowadays, 32 bits. If that's not enough, you can upgrade your variables to longs, which tend to be 64 bits instead, which isn't just twice as big as an integer, it's actually 64 bits, which is exponentially more. It's an unpronounceable number, at least for me. That's a crazy big number, but it's available to you. Moreover, we can see this if we actually are a little reckless with how we're using code. And just so you know too, though, there are functions even in CS50's library that let you use these larger values. get_long, of course, will get you a long. 
And this one's a little non-obvious, but "%li" is the format code for printf, just so you know, for printing a long integer and not just an integer. So it's two characters instead of one. Suppose, though, we actually want to use code involving some large numbers. It turns out that certain bad things can happen. So let me go ahead and do this. I'm going to go back over to VS Code here, and I'm going to modify my calculator to do something that, at a glance, should be perfectly reasonable. Let me go ahead and open up calculator.c, as before. And where we left off, we had this add function. And you know what? I'm going to simplify it back to its very original version. I'm going to go ahead and get rid of the add function and just distill it to its essence, which is not to add any more. But let's just do division. I want to print out this time maybe x divided by y. So here we go. x divided by y is a nice simple program in my calculator. Let me do make calculator again, ./calculator. And let's divide something like 1 divided by 3, which should-- hm, OK, weird. It gave me 0 instead of probably 0.3333333, as you might have expected for 1/3. So what might the takeaway there be? Why am I seeing zero perhaps? Yeah? STUDENT: If 0's the integer [INAUDIBLE], then if you want decimals [INAUDIBLE]. DAVID MALAN: Yeah, so 0 is an integer. And indeed, that's what I'm telling the thing to print. And in fact, if we go over to my little cheat sheet here of format codes, I'm currently using %i. I should actually, when I do division of numbers that might have floating point values, a decimal point that floats left to right, otherwise known as a real number, I want to use %f for float instead. So I'm actually going to go back to my code here. And let's try this-- %f instead of %i. And let me go ahead and "make calculator." Huh, all right. Well, this didn't work then. "Format specifies type 'double,' but the argument has type 'int.'" All right, so that, too, is not quite working. 
So I think I actually need to make a change here further. Let me actually go ahead and do this. It turns out that besides integers, there are these things called floats, and also doubles. A float uses 32 bits, and a double uses 64 bits. And that doesn't necessarily mean you can count higher as much as it means you can have more numbers after the decimal point. So if you want a more precise value, you throw memory at it by using a double and 64 bits instead of a float. But we'll keep it simple, and let me go ahead and do this. Let me just do the math using the type of variable that I should be here. Let me do not int, but float z equals x divided by y. And now let me go ahead and print out the value of z. Strictly speaking, I don't need the variable. But I'm trying to be pedantic and actually use a float explicitly this time so we see a real number. All right, let me go ahead and do make calculator, ./calculator. 1 divided by 3 equals-- damn, now it's just showing me more zeros, which clearly isn't the case. Well, this is because of an issue that we'll generally call truncation. So truncation is just a term of art that means if you take an integer and you divide it by an integer, even if you get a fractional value, the fraction just gets thrown away because you're only doing integer-based math. So if there's anything after the decimal point, it just gets truncated, literally discarded. So what should 1 divided by 3 be? Obviously, 0.33333333-- ad nauseam. Unfortunately, you throw away everything after the decimal point, which leaves you still with just 0. And even though I'm seeing more zeros, that's because we threw away all of the 3's. That is just what happens when you use integers and do any kind of division like that. But there is a solution. We can actually convert, or cast, integers to floating point values. So we can tell C, I know this is an integer now. But go ahead and treat it as though it has a decimal point, even if it's .0 at the end of the number. 
So I can go into this, and I can use parentheses and literally write "float" in parentheses. And over here for y, I can literally use parentheses and convert y to a float. The term of art here is type casting. You're converting one type to another effectively, or technically treating one type as though it's another even if it doesn't necessarily have a mathematical impact. But what it means now is that z will be defined by dividing one float by another. So truncation will not now happen. So let me do make calculator, ./calculator. And now 1 divided by 3, there it is. Now the math is actually correct. But my God, we had to jump through hoops just to get this to work. So to be clear, the two issues we-- well, the issue we encountered was truncation. If you divide an int by an int, you will get an int no matter what it should be mathematically. But if you instead type cast the values, the variables to a floating point value, or a double for that matter, then a float divided by a float will give you a float and preserve all of those 3's. But here's another catch, or at least a limitation potentially with computers. What if I go ahead here and do-- let me do this. Let me show you one trick here, even though the syntax is a bit weird. Instead of printing out %f alone, let me print out % dot, maybe 5f So this is weird syntax. And it's only specific to printf. "%.5f" means 'show me five decimal places specifically.' So if I do make calculator, ./calculator, 1, 3, voila, I get five decimal places. If I want six, let's do this. I'll change the code to 6. Make calculator, ./calculator, 1, 3, and now I get six 3's instead. All right, well, wouldn't it be nice to be even more precise? Let's give me 20 significant digits after the decimal point. So make calculator, ./calculator, 1 divided by 3, and-- woo. So your middle school teacher seems to have lied to you at this point. 1 divided by 3 is apparently not 0.33333 with a line over it, or just infinite number of 3's. 
OK, that's not quite the right conclusion, though. Oops. Why might I be seeing these weird numbers instead of just lots of 3's, intuitively? Why this rounding error? Yeah? STUDENT: The computer just has a [? limited ?] [? memory. ?] So there's [INAUDIBLE] DAVID MALAN: Exactly. The computer only has limited memory, finite memory. So it just can't represent every possible number in the universe because we know from grade school there are infinitely many of those numbers. So what you're essentially seeing is the closest it can actually get. It's rounding to the nearest floating point value, if you will. And it also relates to how the numbers themselves are represented in memory underneath the hood. I can do a little better, though. Let me zoom out. And let me upgrade, so to speak, from 32 bits to 64 bits and use doubles instead. I can still use %f. You don't use %d for double. Let me do make calculator again, ./calculator, 1, 3. I get more 3's but still some rounding. It's more precise, but it's not 100% accurate, because that's just not going to be possible in terms of the computer's memory. So this is a whole other issue known as floating point imprecision, which is another type of limitation. We saw integer overflow, if integers can only count so high before you run out of bits and things wrap around. Floating point imprecision means that you can't possibly represent the infinite number of real numbers that exist in the universe if you only have a finite amount of memory. You would need an infinite number of bits, it would seem. So these are two issues that actually fundamentally can influence the correctness not only of your code but code in the real world. And case in point, back in my day-- I graduated in 1999-- and a lot of the world thought the world was going to end around then because around the time the years rolled over from 1999 to 2000, there was a lot of old software still running in the world. 
And in fact, that old software, reasonably so, only used two digits to represent years. Why? Memory was very expensive early on. And if you could use half as much memory to store a year, that was a win. That saved you money. That saved you memory. The problem though, of course, is that a lot of old software from the '70s and prior was still running in 1999. And unless companies or individuals updated that software, the year after 1999 might be mistaken for the year 1900 instead of 2000, because all of the code just assumed that, of course, we're talking about the 1900s. The assumption was that this code is not going to be running 50 years later, but in that case it still was. So people had to scramble, and they essentially had to solve this by using more digits, so upgrading from two to four. Nowadays, and really since the '70s too, we've used 32-bit integers to keep track of time, specifically keeping track of the number of seconds using an integer from January 1, 1970, the so-called epoch whereby that's just an arbitrary date early on where we just started counting time. So all of the clocks in your Macs, PCs, and phones pretty much just have a single integer that gets updated every second, but it's just keeping track not of absolute time per se, but how many seconds have passed since January 1, 1970, just because that's the date humans chose. The problem is you can only count as high as 4 billion, give or take, with 32 bits and actually 2 billion, give or take, if you support negative numbers, as well. And the problem with that is that we're about to trip over the same issue again in not too long from now. This is the 2038 problem because in the year 2038, on that date, mark my words, things could break again. Why? Because that 32-bit value is going to accidentally wrap around back to a 0 or a negative value. So we're going to go through the whole darn process again. 
Now, thankfully the solution, as you might expect, is kind of just to kick the can even further down the road and use 64 bits, which I think will get us another 290 billion years of runway. It's more than twice. So it's not our problem anymore at that point. But that's fundamentally going to be the solution. But it will still be finite. So we're just deferring to our descendants to actually deal with the issue some billions of years from now if these things are still running. So if that does happen, here's the specific date in 2038 on which, all of a sudden, our clocks will go wrong, because a negative number of seconds will get applied to the epoch. So they will think we're back in 1901. So this has had some fun and very real world implications. So for instance, this is the game Pac-Man, which you might have played. It kind of came out around my day, back in time. And if you get to the 256th level, this unfortunately is what happens because they didn't really expect that players would spend all this much time playing Pac-Man apparently. And they didn't really have a condition saying, you win if you get to the 255th or 256th level. And so what happens here essentially is that the whole screen gets very garbled because there's an integer in the original Pac-Man that counts to 256, but that's too big, so it wraps back around to 0. And it doesn't know when to stop printing fruits on the screen, as in this case, to collect. Another example of this is actually from the original Donkey Kong game, which looks something like this in my day, too, whereby in Donkey Kong, there was this mathematical formula, whereby the number of seconds you have to solve the game was a function of 10 times the sum of your current level number and the number 4. That dictated how many seconds you get. So of course, the higher the level, you get more and more time as the level climbs. Unfortunately, once you hit level 22, the math ends up being 10 times the sum of 22 and 4, which gives you the number 260. 
And they, too, were using 8-bit values, a single byte to represent numbers, which means 260 is bigger than 256. And the way that math worked out was, well, 260 minus 256, if it wraps back around, gave people four seconds to solve level 22, which is just impossible. Like, Mario can't even get up a couple of levels or so from where he actually was. So that, too, was sort of a well-known bug, since that's how the math works out. Lastly, and this one is all the more real, in 2015, the Boeing 787 was documented as having not a hardware bug but a software bug in the following sense. "A model 787 airplane that has been powered continuously for 248 days can lose all of its power due to the control unit simultaneously going into failsafe mode. This condition was caused by a software counter that will overflow after 248 days of continuous power. Boeing at the time was in the process of developing a CPU software upgrade that will remedy the unsafe condition." And people did the math. It turns out that Boeing was probably using an integer that was 32 bits. And they were keeping track of time not in seconds but hundredths of seconds, because if you do out the math, after a 32-bit value has reached 4 billion-- or 2 billion one hundredths of a second, the number wraps around back to 0, or negative 2 billion. And the implication was literally the plane's power would stop. And if you can believe it, if you grew up with Windows, macOS, or whatnot, anyone want to conjecture what the solution was until Boeing updated their software? STUDENT: Turn it off, turn it back on. DAVID MALAN: Turn the plane off, and turn it back on because that has the effect of resetting its memory and therefore all of its variables back to 0. So this is ultimately to say as you dive into problem set 1, your first in C, you, too, will make quite a few mistakes when it comes to correctness. You, too, will encounter opportunities for better design and better style. 
In the real world, there are very much these issues. So even if you struggle, know that for better or for worse, you're in very good company. But some three months from now, you will be in much better shape because this was week 1, and this is CS50. [APPLAUSE] [INTRIGUING MUSIC]