Transcript for:
Key Concepts in Computer Science

Computers make no sense. Throw some metal in a box and boom. What the heck is going on here? Inside your PC is a Central Processing Unit, or CPU. It's basically just a piece of silicon with billions of microscopic switches called transistors. Depending on the flow of electricity, they can be on or off, kind of like a light bulb, which gives us two states: 1 and 0.

The value at one of these switches is called a "bit". One bit by itself doesn't really do much. But put them together, and magic starts to happen. A group of 8 bits is called a "byte" and can have 256 different combinations of 0s and 1s. Congratulations! We can now store information by counting in a system called "binary". Every bit represents a power of 2, 1 meaning the power is included and 0 meaning it's not. So the binary number 1000101 has 1 times 64, 1 times 4, and 1 times 1, which adds up to 69.

This is nice, but for humans, hexadecimal is even better. It's often denoted by the prefix 0x and is simply a more readable format than binary: four binary bits can take any value from 0 to 15, and hexadecimal uses 0-9 and a-f to represent exactly those values, so a group of four bits can be replaced by one hexadecimal digit.

Okay. Now that we can store numbers, we just need computers to actually, you know, do something with them. Using transistors, you can make logic gates, which are electronic circuits that encapsulate logical statements. You can think of one as a lightbulb with two switches, where the light only turns on under certain conditions, for example, only if A AND B are on. By combining logic gates in a clever way, you can build circuits that perform calculations according to Boolean algebra, which is a system formalizing mathematical operations in binary.

But even though computers understand 0s and 1s, for humans that's not really all that useful. So, using a character encoding like ASCII, we can assign a binary number to each character. When you type an A on your keyboard, it gets translated into its binary code, and as soon as the computer sees this, it says: "Ah yes, that is a capital A", and slaps it on the screen.
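If you want to check those exact numbers yourself, here's a minimal Python sketch of the same examples (69 in binary, 0x45 in hex, and the ASCII code for a capital A), using only built-ins:

```python
# The binary, hex and ASCII examples above, verified with Python built-ins.

n = 0b1000101            # 69 in binary: 1*64 + 1*4 + 1*1
print(n)                 # 69
print(bin(69))           # 0b1000101
print(hex(69))           # 0x45, one hex digit per group of four bits
print(int("45", 16))     # 69, parsed back from hexadecimal

# ASCII: a capital A is stored as the number 65 (0b1000001).
print(ord("A"))          # 65
print(bin(ord("A")))     # 0b1000001
print(chr(65))           # 'A'
```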
How these devices fit together is handled by an operating system kernel, like Windows, Linux or macOS, which sits between the computer hardware and applications and manages how they all work together, for example with device drivers.

Input devices allow you to give the computer instructions with the press of a button, but at the lowest level, computers only understand instructions in machine code, which is binary code telling the CPU what to do, and which data to use.

When it comes to following these instructions, the CPU is kind of like a genius, just with the memory of a demented goldfish. It can handle any instruction, but it cannot store any data, so it's only really useful together with random access memory, or RAM. You can imagine RAM like a grid, where every box can hold one byte of information, which can be data or instructions, and has an address. The CPU works through it in four steps: fetch from memory, decode the instruction and data, execute it, and finally store the result. This is one machine cycle.

Since a program is basically just a list of instructions in memory, to run it, the CPU executes them one by one in machine cycles until it's complete. Oh yeah, did I mention that this happens, like, really fast? Modern CPUs can do billions of cycles every second, which are coordinated and synchronized by a clock generator. The speed of this clock is measured in GHz, and people often overclock their CPUs to improve performance, which is nice, but might just set your PC on fire.

What's even crazier, though, is that a CPU has multiple cores, which can all execute different instructions in parallel, so at the same time. Each core can also be split into multiple threads, which allows every single core to handle multiple instructions concurrently, switching between them really quickly.

Okay, that's cool, but it doesn't matter how powerful a computer is if you have no way to give it instructions in the first place. Typing machine code by hand would probably make you go insane, but luckily, you don't have to. The kernel is wrapped in a shell, which is just a program that exposes the kernel to the user, allowing for simple instructions in a command line interface with text inputs.

But the best way to make a computer do something useful is with a programming language, which uses abstraction, so that instead of raw machine code, you can write code that reads almost like English, which is then converted into machine code for you. Some languages like Python use an interpreter, which directly tries to execute the source code line by line. Other languages like C or Go use a compiler, which converts the entire program into machine code before putting it in a file the CPU can execute.

Now, every programming language has different syntax, but there are some basic tools almost all of them have. The most basic way to use data is with variables, which assign a value to a name, which can then be reused and modified. Depending on the value, variables can have different data types. For text, there are single characters and strings of multiple characters. For numbers, there are integers, which are signed if they need to represent negative values, and floating point numbers for decimal values. They're called floating point because the decimal point can float around to trade off precision with range. This is possible because they use scientific notation: some number times a power that tells you where to put the decimal point, which is exactly what they look like under the hood.

The actual number is stored with binary fractions. Some fractions, like 1/3, can only be approximated in binary with an infinite sum, but since memory is not infinite, you have to cut it off at some point, which leads to rounding errors, causing pretty weird calculations sometimes. If these types are not enough, long and double use twice the amount of memory to extend the range of ints and the precision of floats.

Some languages like Python automatically figure out which type a variable is, but in a language like C, you have to explicitly declare the type of a variable.
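You can trigger one of those rounding errors in a single line of any language with standard floating point; a minimal Python sketch, with the usual workaround:

```python
# Binary fractions can't represent most decimal fractions exactly,
# so tiny rounding errors sneak into float arithmetic.

print(0.1 + 0.2)         # 0.30000000000000004, not 0.3
print(0.1 + 0.2 == 0.3)  # False

# The usual fix: compare with a tolerance instead of exact equality.
import math
print(math.isclose(0.1 + 0.2, 0.3))  # True
```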
The value of a variable is stored at some address in memory. Pointers are variables whose value is the memory address of another variable, which in a language like C you get with the ampersand operator. So really, a pointer is just some chunk of memory pointing to another chunk of memory. Since a memory address is just a number, you can add to and subtract from it to navigate through individual bytes of memory. This is called "pointer arithmetic".

In some low-level languages like C, you have to manually allocate and free up memory once it's no longer used. This all happens in the heap, which is a part of memory that can dynamically grow and shrink as the program demands, which allows for more control but makes it incredibly easy to completely break your code. You could touch memory you're not supposed to, or memory that simply doesn't exist, which is known as a "segmentation fault". But also, if there's some chunk of memory that's no longer used, and you forget to free it or have no way to access it anymore, that memory is no longer usable. This is called a "memory leak" and will make the program slow down and eventually crash. To avoid this mess, high-level languages like Python have built-in garbage collectors that manage memory for you.

Different data types take up different amounts of memory. Integers are most often 4 bytes. A single character is most often one byte, and a string is just multiple character bytes, with a "NUL" character to signal the end of the string.

Storing multiple items in a contiguous chunk of memory like this is the idea of an array. More generally, it's a list of items with the same data type, with each item having a numerical index, most often starting at 0. Since the items are next to each other in memory, by knowing the address of the first item, you can quickly index to any item in the array using pointer arithmetic. Arrays are what's known as a data structure, which is just a way to organize data to make it easier to work with.

Retrieving values from an array is blazingly fast, but the size of an array is often fixed when creating it, so when it's full, you can't add anything, and if you don't use all of it, it's just wasted memory. A more flexible option is a linked list, sketched in code below. It uses nodes containing a value and a pointer to the next node, which allows them to be spread apart in memory. Also, it can grow and shrink dynamically, as you can add and remove any node, and you can reorder it by simply rearranging the pointers. This is great, but linked lists can be impractical, as you have no way to access the last node without traversing every single one before it.

Still, both arrays and linked lists are useful, as they allow you to create queues and stacks. A stack follows the last in, first out principle, just like taking a pancake from a stack. Practically, just imagine a pointer that always points to the item that was last added to the structure. Then you can pop, so remove, the last item, which moves the pointer back to the previous one. A queue follows the first in, first out principle and uses two pointers: one for the first item that was added, and one for the last. Any new item gets added after the last pointer, which is then updated, and dequeuing starts at the first pointer.

Another useful data structure is a hash map, which is just a collection of key-value pairs. It works kind of like an array but uses a hash function to take a key and assign it to an index, where its value is then stored. But sometimes, two different keys can map to the same index, which is called a "collision". There are different ways to deal with this, but one way is to create a linked list at that position in the array, which makes it a little slower to look up that value. Still, hash maps are incredibly useful, because you can define the keys that point to every value, and since they're based on arrays, retrieving them is blazingly fast. All three of these structures show up in the sketches below.
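Here's the linked list as a minimal Python sketch (the Node class and the sample values are mine, just for illustration; Python references play the role of pointers):

```python
class Node:
    """One node: a value plus a pointer (reference) to the next node."""
    def __init__(self, value):
        self.value = value
        self.next = None

# Build 1 -> 2 -> 3 by hand, then traverse it.
head = Node(1)
head.next = Node(2)
head.next.next = Node(3)

node = head
while node is not None:   # follow the pointers until the end
    print(node.value)
    node = node.next      # this is all "traversal" really is
```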
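Stacks and queues don't need anything fancy; a sketch using a plain Python list and collections.deque:

```python
from collections import deque

# Stack: last in, first out.
stack = []
stack.append("pancake 1")
stack.append("pancake 2")
print(stack.pop())        # 'pancake 2', the last one added comes off first

# Queue: first in, first out.
queue = deque()
queue.append("first customer")
queue.append("second customer")
print(queue.popleft())    # 'first customer', served in arrival order
```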
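Python's built-in dict is already a hash map, but to make the collision handling concrete, here's a toy version of my own (a simplified sketch, not how real hash maps are tuned) that chains colliding keys in a list per slot:

```python
class ToyHashMap:
    def __init__(self, size=8):
        self.slots = [[] for _ in range(size)]   # one chain per index

    def _index(self, key):
        return hash(key) % len(self.slots)       # hash function -> array index

    def set(self, key, value):
        chain = self.slots[self._index(key)]
        for pair in chain:                       # overwrite if the key exists
            if pair[0] == key:
                pair[1] = value
                return
        chain.append([key, value])               # collision? just chain it

    def get(self, key):
        for k, v in self.slots[self._index(key)]:
            if k == key:
                return v
        raise KeyError(key)

m = ToyHashMap()
m.set("duck", "quack")
print(m.get("duck"))   # 'quack'
```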
For many problems, it can be very useful to represent the relationships between different datapoints as a data structure. If you take the nodes of a linked list, but allow any node to point to any other node, you get a graph, where the nodes are connected by edges, which can be directed or undirected and can even carry a weight, which can stand for any metric, such as distance or cost.

Graphs are useful for analyzing groups inside networks or finding the shortest path between two points, for example in Google Maps. There are two main ways to search a graph. Breadth first search starts at one node and moves out layer by layer until it first reaches the target node. Depth first search explores every single path fully until it reaches a dead end, then backtracks to the last node with a different path and continues from there in the same way until it finds the target node.

A graph where any two nodes are connected by exactly one path is called a tree, and represents a hierarchy, for example the file system on your computer. A tree starts at the root and branches out into subtrees, which end in leaf nodes. Parent nodes can have any number of child nodes, but oftentimes, binary trees are the most useful. For example, a binary tree where all the values left of any node are smaller, and all the values right of it are greater, is called a "binary search tree", which makes finding specific values super fast. If you want to find a target value, just start at the root. If the target is smaller than that node, go left; if it's greater, go right, and repeat this until you find the target.

What we just used is a very simple algorithm, which is just a set of instructions that solves a problem step by step. The simplest way to write an algorithm is in the form of a function. It takes inputs, does something with them, and returns an output. Just like with variables, you can then call a function by its name and pass in different arguments. When calling a function, the function call gets pushed onto the call stack, which is short-term memory used for executing code, and as the name implies, it's based on the stack data structure, which means last in, first out.

To implement algorithms, you often have to compare two values, which you can do with operators like greater-than or equality, and logical expressions such as AND, OR and NOT. Expressions like these are simple: they can be true or false, which are the possible values of the Boolean data type, and they allow you to write conditional statements: if some condition is true, do this, else do that.

Booleans can also be used to loop over certain parts of code. One way is with a while loop: while this condition is true, this code will execute. Another way is a for loop, which can iterate over every element inside a data structure like an array, but can also loop for a specific number of iterations, by setting a starting value, incrementing it after each iteration, and setting an upper bound with a condition.

Functions can also call themselves, which is known as "recursion". This is useful when a problem can be broken down into smaller identical problems, such as calculating 5 factorial, which is just 5 times 4 factorial, which is just 4 times 3 factorial, and so on. But, by default, a recursive function will just keep on calling itself forever. This means it keeps pushing more function calls onto the call stack until the stack memory is exceeded in a "stack overflow". To stop this, you have to add a base condition to a recursive function, which defines when to stop. Only then will the function calls be executed without crashing your PC. All three ideas (graph search, tree search and recursion) are sketched in code below.
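Here's a minimal sketch of both graph searches, on a small made-up graph stored as an adjacency list:

```python
from collections import deque

# A small made-up graph as an adjacency list.
graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["E"],
    "D": [],
    "E": ["F"],
    "F": [],
}

def bfs(start, target):
    """Breadth-first: explore layer by layer using a queue."""
    queue, seen = deque([start]), {start}
    while queue:
        node = queue.popleft()
        if node == target:
            return True
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return False

def dfs(node, target, seen=None):
    """Depth-first: follow one path fully, then backtrack (via recursion)."""
    if seen is None:
        seen = set()
    if node == target:
        return True
    seen.add(node)
    for neighbour in graph[node]:
        if neighbour not in seen and dfs(neighbour, target, seen):
            return True
    return False

print(bfs("A", "F"), dfs("A", "F"))  # True True
```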
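And the binary search tree lookup, using the same left-smaller / right-greater rule (the tree shape and values are arbitrary):

```python
class TreeNode:
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

# A tiny binary search tree:   8
#                             / \
#                            3   10
root = TreeNode(8, TreeNode(3), TreeNode(10))

def bst_contains(node, target):
    if node is None:           # fell off the tree: not found
        return False
    if target == node.value:
        return True
    if target < node.value:    # smaller values live on the left
        return bst_contains(node.left, target)
    return bst_contains(node.right, target)

print(bst_contains(root, 10))  # True
print(bst_contains(root, 7))   # False
```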
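The factorial example with its base condition, as a sketch:

```python
def factorial(n):
    if n <= 1:                    # base condition: stops the recursion
        return 1
    return n * factorial(n - 1)   # 5! = 5 * 4!, 4! = 4 * 3!, and so on

print(factorial(5))  # 120

# Without the base condition, each call would push another frame onto the
# call stack until Python raises RecursionError (its flavour of stack overflow).
```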
Recursion is cool, but it can be pretty expensive, time- and space-wise. So, to minimize the amount of computation needed, past results can be saved in a cache, so that if they come up again, the computer doesn't have to recompute them from scratch. This is called "memoization".

Speaking of performance, to judge how good an algorithm is, you can look at time and space complexity, so how much time or space is required to run it. This is measured in Big O notation, which describes the relationship between the growth of the input size and the number of operations needed to execute the algorithm. For example, adding 1 to every number inside an array is O(n), because the number of operations increases in a linear way as the array grows. What's relevant is not the exact number of operations, but rather the trend as the input size goes to infinity. You see, something like n! grows way faster than any linear function ever could, so as long as the time complexity is some kind of linear relation, it's simplified down to O(n), and the same thing goes for any other group of functions.

When writing algorithms, there are some general approaches. For example, when searching for an item in a list, a brute force approach would be to simply check every item until we find the target, but a more sophisticated approach would be divide and conquer. In binary search, for example, you cut the problem in half each time by checking the middle element of a list and seeing on which side the target is, until you land on the target.

Now, when solving a problem with code, there are always multiple ways to achieve the same result, and these general styles of coding are called programming paradigms. Let's try to find the sum of all elements in a list. "Declarative programming" describes what the code does, but not how exactly the computer should do it, whereas "imperative programming" explicitly describes how the computer should achieve a result with detailed instructions.

An extension of imperative programming is object-oriented programming, where you can define classes as blueprints for objects, which are single units consisting of data in the form of properties and behaviours in the form of methods. To code a class, you begin by defining the properties as variables and the methods as functions. After encapsulating properties and methods in a class, you can instantiate an object and use the dot notation to work with its properties and methods.

Classes make it easy to organize and reuse code, because you can define subclasses that inherit properties and behaviours of a superclass, but can also extend and override them. For example, a RubberDuck subclass might override quack() with a squeak() instead. As a result, rubber ducks can be treated as objects of the Duck class, but behave differently when quack() is called, which is the concept of "polymorphism". All four ideas (memoization, binary search, the two paradigms and polymorphism) show up in the sketches below.
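In Python, memoization is one decorator away; Fibonacci (my example, not the video's) shows the effect dramatically:

```python
from functools import lru_cache

@lru_cache(maxsize=None)     # cache past results: memoization for free
def fib(n):
    if n < 2:
        return n
    return fib(n - 1) + fib(n - 2)

# Without the cache this takes exponential time; with it,
# each fib(n) is computed exactly once.
print(fib(80))  # 23416728348467685, instantly
```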
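Binary search as a sketch, assuming the input list is already sorted:

```python
def binary_search(sorted_list, target):
    """Divide and conquer: halve the search range each step, O(log n)."""
    lo, hi = 0, len(sorted_list) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_list[mid] == target:
            return mid
        if sorted_list[mid] < target:   # target is in the right half
            lo = mid + 1
        else:                           # target is in the left half
            hi = mid - 1
    return -1                           # not found

print(binary_search([1, 3, 5, 8, 13, 21], 13))  # 4
```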
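Summing a list both ways, as a sketch of the two paradigms:

```python
numbers = [1, 2, 3, 4]

# Imperative: spell out every step the computer should take.
total = 0
for n in numbers:
    total += n
print(total)         # 10

# Declarative: say what you want, not how to compute it.
print(sum(numbers))  # 10
```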
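And the Duck example as a Python sketch (the method bodies are my guess; only the class names and quack/squeak behaviour come from the video):

```python
class Duck:
    def quack(self):
        return "Quack!"

class RubberDuck(Duck):   # inherits from Duck...
    def quack(self):      # ...but overrides its behaviour
        return "Squeak!"

# Polymorphism: both can be treated as Ducks, yet behave differently.
for duck in [Duck(), RubberDuck()]:
    print(duck.quack())   # Quack! then Squeak!
```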
But for some problems, none of these traditional paradigms will work. Let's say you want to make a computer recognize which of a set of images is a bee. The problem is, you can't really describe what a bee looks like with code. This is where machine learning comes in, aka teaching a computer to do a task without explicitly programming it to do that task.

First you need a lot of data, which you split into training and test data. Next you choose an algorithm that can change its parameters over time, for example a neural network, where the weights can be updated to achieve a different result. By feeding lots and lots of training data into this algorithm you can build a model, whose accuracy you can then check with the test data. If it's not quite right, the model can improve over time by comparing the output to what it should have been, capturing the difference in an error function, and tweaking its parameters to minimize the difference.

But no matter how futuristic, bleeding-edge, blazingly fast and optimized it is, if you want people to actually use the application you wrote, you should probably know about the internet. It's a network of computers from all around the globe connected by wires. Like, literally, the internet is a bunch of thicc cables that run along the bottom of the ocean, along with facilities like Internet Service Providers that connect you to your destination.

These computers communicate with the Internet Protocol Suite. Every computer on the network has a unique IP address. Two computers can then transfer data with the Transmission Control Protocol. It breaks messages into a bunch of packets and sends them through a network of wires, before the receiving end puts the message back together. If you have a poor internet connection, you might have experienced "packet loss", which is just when some of these packets get lost along the way.

If the internet is the hardware, then the web is the software, which you can use with a browser. Every page on the web has a URL. When you type it into your browser, it looks up the IP address of the server hosting the website with the Domain Name System, which is like a dictionary mapping domain names to IP addresses of actual servers. After connecting to the server via TCP, your browser, called the client, uses the Hypertext Transfer Protocol to send an HTTP request to it, which then gives a response, ideally containing the contents of the webpage (the whole round trip is sketched in code below).

The actual website most often consists of three parts. An HTML file contains all the content of a website and is basically just a collection of elements, which can be text, links, buttons and so on. A CSS file controls the visuals and makes a website look nice. But a website is useless if pressing nice-looking buttons does nothing, so a language like JavaScript adds functionality.

Sometimes things can go wrong, though. With every HTTP response comes a response code, which carries information about the status of the response. For example, 200 means "OK", and anything starting with 4 is an error, the most famous one being "404 - Not Found". HTTP requests can carry different methods, for example GET, POST, PUT and DELETE, so retrieve, add, update and delete information. These are often used by Application Programming Interfaces, which connect two applications and allow them to interact with each other, for example to store and retrieve data from a database.

The most common type of database is a relational database, which uses tables to store data. Columns of a table contain different attributes, and rows represent individual datapoints. Also, each table has one unique attribute called the primary key, which identifies each row. A foreign key is the primary key of another table stored alongside a row, establishing a relationship between the two tables. This way, each book can be connected to an author.

With a language like SQL, you can write statements to work with data from these tables. You could look up the titles and authors of all books whose title starts with "H", but for that, we have to join the authors table with the books table on the matching key, to combine attributes from both tables into one output.

These statements are useful, but you've got to be careful, or you might just delete an entire database with one line of code. But, don't worry. That's never happened before.

Behind every login page is a database with usernames and passwords. When a user tries to log in, an SQL query is often used to check if the user input matches an entry in the database. That's good, but a devious actor could type something sneaky into the username field, changing the query by terminating the string early and commenting out the rest, which means that as long as this username exists in the database, access is granted. This is called an SQL injection attack and is one of the easiest ways hackers get places they're not supposed to. Both the join and the standard defense against injection are sketched in code below.
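The request/response dance fits in a few lines of Python's standard library; a minimal sketch using urllib (it really hits the network, and example.com stands in for any site):

```python
from urllib.request import urlopen

# Send an HTTP GET request and read the response, just like a browser does.
with urlopen("https://example.com") as response:
    print(response.status)           # 200 means "OK"
    html = response.read().decode()  # the HTML content of the page
    print(html[:60])                 # first few characters of that HTML
```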
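Here's a sqlite3 sketch of both the join and the parameterized-query defense against injection (table layout and sample rows are made up for illustration):

```python
import sqlite3

db = sqlite3.connect(":memory:")   # throwaway in-memory database
db.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE books   (id INTEGER PRIMARY KEY, title TEXT,
                          author_id INTEGER REFERENCES authors(id));
    INSERT INTO authors VALUES (1, 'Shakespeare'), (2, 'Frank Herbert');
    INSERT INTO books   VALUES (1, 'Hamlet', 1), (2, 'Dune', 2);
""")

# Join books to authors on the matching key; keep titles starting with 'H'.
rows = db.execute("""
    SELECT books.title, authors.name
    FROM books JOIN authors ON books.author_id = authors.id
    WHERE books.title LIKE 'H%'
""").fetchall()
print(rows)   # [('Hamlet', 'Shakespeare')]

# SQL injection: never splice user input into the query string yourself.
user_input = "admin' --"   # a classic payload: quote ends the string, -- comments out the rest
# unsafe = f"SELECT * FROM authors WHERE name = '{user_input}'"   # DON'T do this
safe = db.execute("SELECT * FROM authors WHERE name = ?", (user_input,))
print(safe.fetchall())     # [] because the input is treated as plain data, not SQL
```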
Hearing about all these concepts is one thing, but to really learn them, you have to see them in action and use them yourself, which is exactly what you can do with Brilliant, which has thousands of interactive lessons for everything from math and data science to programming and AI.

They make knowledge stick with visual lessons and interactive problems, which is not only fun and builds intuitive problem-solving skills, but is also proven to be 6x more effective than simply watching hours of lectures. I know that making time to learn new skills can be difficult, but Brilliant makes it so easy: you can build valuable knowledge from the ground up in just a couple of minutes a day with bite-sized lessons from any device, anywhere, anytime.

You'll go from the basics of data science to analysing real datasets from Spotify in absolutely no time, which you can supplement with math fundamentals and programming courses in Python to build one of the most in-demand skill sets of our time.

The best part? You can try everything Brilliant has to offer for free for a full 30 days by visiting brilliant.org/WackyScience/. You'll also get 20% off an annual premium subscription. Thanks to Brilliant for sponsoring this video.