Transcript for:
Understanding Computers and Programming Basics

Computers make no sense. Throw some metal in a box and boom [monke]. What the heck is going on here? Inside your PC is a Central Processing Unit, or CPU. It's basically just a piece of silicon with billions of microscopic switches called transistors. Depending on the flow of electricity, they can be on or off, kind of like a light bulb, which gives us two states: 1 and 0.

The value at one of these switches is called a "bit". One bit by itself doesn't really do much, but put bits together, and magic starts to happen. A group of 8 bits is called a "byte" and can have 256 different combinations of 0s and 1s. Congratulations! We can now store information by counting in a system called "binary". Every bit represents a power of 2, 1 meaning the power is included and 0 meaning it's not. So the binary number 1000101 has 1 times 64, 1 times 4, and 1 times 1, which adds up to 69.

This is nice, but for humans, hexadecimal is even better. It's often denoted by a leading 0x and is simply a more readable format than binary: four binary bits can take any value from 0 to 15, and hexadecimal uses the digits 0-9 and a-f to represent exactly those values, so a group of four bits can be replaced by one hexadecimal digit.

Okay. Now that we can store numbers, we just need computers to actually, you know, do something with them. Using transistors, you can make logic gates, which are electronic circuits that encapsulate logical statements. You can think of one as a lightbulb with two switches, where the light only turns on under certain conditions, for example only if A AND B are on. By combining logic gates in a clever way, you can build circuits that perform calculations according to Boolean algebra, which is a system formalizing mathematical operations in binary.

But even though computers understand 0s and 1s, raw binary isn't all that useful for humans. So, using a character encoding like ASCII, we can assign a binary number to each character. When you type an A on your keyboard, it gets translated into its binary code, 1000001, and as soon as the computer sees this, it says: "Ah yes, that is a capital A", and slaps it on the screen.

How all these devices fit together is handled by an operating system kernel, like those of Windows, Linux or macOS, which sits between the computer's hardware and its applications and manages how they all work together, for example with device drivers.

Input devices let you give the computer instructions with the press of a button, but at the lowest level, computers only understand instructions in machine code, which is binary code telling the CPU what to do, and which data to use.

When it comes to following these instructions, the CPU is kind of like a genius, just with the memory of a demented goldfish. It can handle any instruction, but it can't store any data, so it's only really useful together with random access memory, or RAM. You can imagine RAM like a grid, where every box can hold one byte of information, which can be data or instructions, and has an address, so the CPU can work with it in four steps: fetch from memory, decode the instruction and data, execute, and finally store the result. This is one machine cycle.

Since a program is basically just a list of instructions in memory, to run it, the CPU executes them one by one in machine cycles until it's complete. Oh yeah, did I mention that this happens, like, really fast? Modern CPUs can do billions of cycles every second, all coordinated and synchronized by a clock generator. The speed of this clock is measured in GHz, and people often overclock their CPUs to improve performance, which is nice, but might just set your PC on fire.
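By the way, all the binary, hex and ASCII stuff from above is easy to poke at yourself. Here's a minimal sketch in Python (just one convenient language choice; everything below uses standard built-ins):

```python
# Binary: every bit stands for a power of two.
n = 0b1000101            # 64 + 4 + 1
print(n)                 # 69

# Hexadecimal: one digit (0-9, a-f) per group of four bits.
print(hex(n))            # 0x45

# ASCII: each character maps to a number, and therefore to binary.
print(ord("A"))          # 65
print(bin(ord("A")))     # 0b1000001

# Boolean algebra, like a logic gate: the light is only on if A AND B are on.
A, B = True, False
print(A and B)           # False
print(A or B)            # True
```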
What's even crazier, though, is that a CPU has multiple cores, which can all execute different instructions in parallel, so at the same time. Each core can also be split into multiple threads, which allows every single core to handle multiple instructions concurrently, by switching between them really quickly.

Okay, that's cool, but it doesn't matter how powerful a computer is if you have no way to give it instructions in the first place. Typing machine code by hand would probably make you go insane, but luckily, you don't have to. The kernel is wrapped in a shell, which is just a program that exposes the kernel to the user, allowing for simple instructions in a command line interface with text inputs.

But the best way to make a computer do something useful is with a programming language, which uses abstraction, so that instead of raw machine code, you can write code that actually looks readable, which is then converted into machine code for you. Some languages like Python use an interpreter, which directly tries to execute the source code line by line. Other languages like C or Go use a compiler, which converts the entire program into machine code before putting it in a file the CPU can execute.

Now, every programming language has different syntax, but there are some basic tools almost all of them share.

The most basic way to use data is with variables, which assign a value to a name that can then be reused and modified. Depending on the value, variables can have different data types. For text, there are single characters and strings of multiple characters. For numbers, there are integers, which can be signed to allow negative values, and floating point numbers for decimal values. They're called floating point because the decimal point can float around to trade off precision with range. This is possible because they use scientific notation: some number times a power that tells you where to put the decimal point, which is exactly what they look like under the hood.

The actual number is stored with binary fractions. Some fractions, like 1/3, can only be approximated in binary with an infinite sum, but since memory is not infinite, you have to cut it off at some point, which leads to rounding errors, sometimes causing pretty weird calculations. If these types are not enough, long and double use twice the amount of memory to extend the range of ints and the precision of floats.

Some languages like Python automatically figure out which type a variable is, but in a language like C, you have to explicitly declare the type of a variable.

The value of a variable is stored at some address in memory. Pointers are variables whose value is the memory address of another variable, which is denoted by an ampersand. So really, a pointer is just some chunk of memory pointing to another chunk of memory. Since a memory address is just a number, you can add to it and subtract from it to navigate through individual bytes of memory. This is called "pointer arithmetic".

In some low-level languages like C, you have to manually allocate memory and free it once it's no longer used. This all happens in the heap, which is a part of memory that can dynamically grow and shrink as the program demands, which allows for more control but makes it incredibly easy to completely break your code. You could touch memory you're not supposed to, or memory that simply doesn't exist, which is known as a "segmentation fault". But also, if there's some chunk of memory that's no longer used, and you forget to free it or have no way to access it anymore, that memory is no longer usable. This is called a "memory leak", and it will make the program slow down and eventually crash. To avoid this mess, high-level languages like Python have built-in garbage collectors that manage memory for you.
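You can actually see both the rounding errors and the pointer idea from Python. Everyday Python hides pointers completely, so this sketch leans on the standard ctypes module purely as an illustration, not as something you'd normally write:

```python
import ctypes

# Rounding errors: 0.1 and 0.2 have no exact binary fraction, so...
print(0.1 + 0.2)                       # 0.30000000000000004

# A pointer's value is just the address of another variable.
x = ctypes.c_int(69)
p = ctypes.pointer(x)                  # p points to x
print(ctypes.addressof(x))             # the address: just a (big) number
print(p.contents.value)                # dereferencing p gives back 69

# Pointer arithmetic: step exactly one int (usually 4 bytes) forward.
arr = (ctypes.c_int * 3)(10, 20, 30)   # three ints, contiguous in memory
addr = ctypes.addressof(arr)
second = ctypes.cast(addr + ctypes.sizeof(ctypes.c_int),
                     ctypes.POINTER(ctypes.c_int))
print(second.contents.value)           # 20
```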
Different data types take up different amounts of memory. Integers are most often 4 bytes. A single character is most often one byte, and a string is just multiple character bytes, with a "NUL" character to signal the end of the string.

Storing multiple items in a contiguous chunk of memory like this is the idea of an array. More generally, an array is a list of items with the same data type, where each item has a numerical index, most often starting at 0. Since the items sit next to each other in memory, by knowing the address of the first item, you can quickly index to any item in the array using pointer arithmetic. Arrays are what's known as a data structure, which is just a way to organize data to make it easier to work with.

Retrieving values from an array is blazingly fast, but the size of an array is often fixed when creating it, so when it's full, you can't add anything, and if you don't use all of it, it's just wasted memory. A more flexible option is a linked list. It uses nodes containing a value and a pointer to the next node, which allows the nodes to be spread apart in memory. Also, it can grow and shrink dynamically, as you can add and remove any node, and you can reorder it by simply rearranging the pointers.

This is great, but linked lists can be impractical, as you have no way to access the last node without traversing every single node before it. Still, both arrays and linked lists are useful, as they allow you to create queues and stacks.

A stack follows the last in, first out principle, just like taking a pancake from a stack. Practically, just imagine a pointer that always points to the item that was last added to the structure. Then you can pop, so remove, the last item, which moves the pointer back to the item before it.

A queue follows the first in, first out principle and uses two pointers: one for the first item that was added, and one for the last. Any new item gets added after the last pointer, which is then updated, and dequeuing starts at the first pointer.

Another useful data structure is a hash map, which is just a collection of key-value pairs. It works kind of like an array, but uses a hash function to take a key and assign it to an index, where its value is then stored. Sometimes, though, two different keys map to the same index, which is called a "collision". There are different ways to deal with this, but one is to create a linked list at that position in the array, which makes it a little slower to look up that value. Still, hash maps are incredibly useful, because you can define the keys that point to every value, and since they're based on arrays, retrieving them is blazingly fast.
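Here's roughly what these structures look like in Python. The Node class is a made-up minimal linked list, while list, deque and dict are the stack, queue and hash map Python already ships with:

```python
from collections import deque

# A minimal singly linked list: each node holds a value and a reference
# (Python's version of a pointer) to the next node.
class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node

head = Node(1, Node(2, Node(3)))   # 1 -> 2 -> 3
node = head
while node is not None:            # no random access: just follow the pointers
    print(node.value)
    node = node.next

# A stack: last in, first out, like a stack of pancakes.
stack = []
stack.append("pancake 1")          # push
stack.append("pancake 2")
print(stack.pop())                 # pancake 2 comes off first

# A queue: first in, first out.
queue = deque()
queue.append("first")              # enqueue at the back
queue.append("second")
print(queue.popleft())             # "first" is dequeued first

# A hash map: Python's dict hashes each key to find where its value lives.
ages = {"alice": 30, "bob": 25}    # key-value pairs
print(ages["bob"])                 # 25, in near-constant time
```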
For many problems, it can be very useful to represent the relationships between different data points as a data structure. If you take the nodes of a linked list, but allow any node to point to any other node, you get a graph, where the nodes are connected by edges. Edges can be directed or undirected, and can even carry a weight, which can stand for any metric, such as distance or cost.

Graphs are useful for analyzing groups inside networks or finding the shortest path between two points, for example in Google Maps. There are two main ways to search a graph. Breadth-first search starts at one node and moves out layer by layer until it finds the target node for the first time. Depth-first search explores every single path fully until it reaches a dead end, then backtracks to the last node with a different path and continues from there in the same way until it finds the target node.

A graph where any two nodes are connected by exactly one path is called a tree, and it represents a hierarchy, for example the file system on your computer. A tree starts at the root and branches out into subtrees, which end in leaf nodes. Parent nodes can have any number of child nodes, but oftentimes, binary trees are the most useful. For example, a binary tree where all the values left of any node are smaller, and all the values right of it are greater, is called a "binary search tree", which makes finding specific values super fast. If you want to find a target value, just start at the root. If the target is smaller than that node, go left; if it's greater, go right; and repeat this until you find the target.

What we just used is a very simple algorithm, which is just a set of instructions that solves a problem step by step. The simplest way to write an algorithm is in the form of a function. It takes inputs, does something with them, and returns an output. Just like with variables, you can then call a function by its name and pass in different arguments. When calling a function, the function call gets pushed onto the call stack, which is short-term memory used for executing code, and as the name implies, it's based on the stack data structure, which means last in, first out.

To implement algorithms, you often have to compare two values, which you can do with operators like greater than or equality, and with logical expressions such as AND, OR and NOT. Expressions like these are simple: they can be true or false, which are the possible values of the Boolean data type, and they allow you to write conditional statements: if some condition is true, do this, else do that.

Booleans can also be used to loop over certain parts of code. One way is with a while loop: while this condition is true, this code will execute. Another way is a for loop, which can iterate over every element inside a data structure like an array, but can also loop for a specific number of iterations, by setting a starting value, incrementing it after each iteration, and setting an upper bound with a condition.

Functions can also call themselves, which is known as "recursion". This is useful when a problem can be broken down into smaller identical problems, such as calculating 5 factorial, which is just 5 times 4 factorial, which is just 4 times 3 factorial, and so on. But by default, a recursive function will just keep on calling itself forever. This means it keeps pushing more function calls onto the call stack, until the stack memory is exceeded in a "stack overflow". To stop this, you have to add a base condition to a recursive function, which defines when to stop. Only then will the function calls be executed without crashing your PC.
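Both of those ideas fit in a short Python sketch. The graph with nodes A to F is made up for illustration, and bfs and factorial are just illustrative helper names:

```python
from collections import deque

# A graph as a dictionary: each node maps to the nodes its edges lead to.
graph = {
    "A": ["B", "C"],
    "B": ["D"],
    "C": ["E"],
    "D": ["F"],
    "E": ["F"],
    "F": [],
}

def bfs(start, target):
    """Breadth-first search: move out from the start, layer by layer."""
    queue = deque([start])
    visited = {start}
    while queue:
        node = queue.popleft()
        if node == target:
            return True                    # found it
        for neighbour in graph[node]:
            if neighbour not in visited:   # don't visit a node twice
                visited.add(neighbour)
                queue.append(neighbour)
    return False

print(bfs("A", "F"))    # True

# Recursion with a base condition, so the calls eventually stop.
def factorial(n):
    if n <= 1:                        # base condition: stop at 1
        return 1
    return n * factorial(n - 1)       # 5! = 5 * 4!, and so on

print(factorial(5))                   # 120
```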
Recursion is cool, but it can be pretty expensive time- and space-wise. So, to minimize the amount of computation needed, past results can be saved in a cache; if they come up again, the computer doesn't have to recompute them from scratch. This is called "memoization".

Speaking of performance, to judge how good an algorithm is, you can look at its time and space complexity, so how much time or space is required to run it. This is measured in Big O notation, which describes the relationship between the growth of the input size and the number of operations needed to execute the algorithm. For example, adding 1 to every number inside an array is O(n), because the number of operations increases in a linear way as the array grows. What's relevant is not the exact number of operations, but rather the trend as the input size goes to infinity. You see, something like n! grows way faster than any linear function ever could, so as long as the time complexity is some kind of linear relation, it's simplified down to O(n), and the same goes for any other family of functions.

When writing algorithms, there are some general approaches. For example, when searching for an item in a list, a brute force approach would be to simply check every item until we find the target, but a more sophisticated approach would be divide and conquer. For example, in binary search, you cut the problem in half each time by checking the middle element of a sorted list and seeing on which side the target is, until you land on the target.

Now, when solving a problem with code, there are always multiple ways to achieve the same result, and the broad styles are called programming paradigms. Let's try to find the sum of all elements in a list. "Declarative programming" describes what the code does, but not how exactly the computer should do it, whereas "imperative programming" explicitly describes how the computer should achieve a result, with detailed instructions.

An extension of imperative programming is object-oriented programming, where you can define classes as blueprints for objects, which are single units consisting of data in the form of properties and behaviours in the form of methods. To code a class, you begin by defining the properties as variables, and the methods as functions. After encapsulating properties and methods in a class, you can instantiate an object and use the dot notation to work with its properties and methods.

Classes make it easy to organize and reuse code, because you can define subclasses that inherit properties and behaviours of a superclass, but can also extend and override them. For example, a RubberDuck subclass might implement quack() with a squeak instead. As a result, rubber ducks can be treated as objects of the Duck class, but behave differently when quack() is called, which is the concept of "polymorphism".
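Here's one Python sketch covering memoization, binary search, the two paradigms, and the ducks. fib and binary_search are illustrative names, and the Fibonacci numbers stand in as the classic memoization demo (the factorial from above works too):

```python
from functools import lru_cache

# Memoization: lru_cache saves past results, so repeated subproblems
# aren't recomputed from scratch.
@lru_cache(maxsize=None)
def fib(n):
    if n <= 1:                        # base condition
        return n
    return fib(n - 1) + fib(n - 2)

print(fib(80))                        # instant; uncached, this would take ages

# Divide and conquer: binary search halves a sorted list each step, O(log n).
def binary_search(items, target):
    low, high = 0, len(items) - 1
    while low <= high:
        mid = (low + high) // 2
        if items[mid] == target:
            return mid                # found the target's index
        elif items[mid] < target:
            low = mid + 1             # target must be in the right half
        else:
            high = mid - 1            # target must be in the left half
    return -1                         # not found

print(binary_search([2, 3, 5, 7, 11, 13], 11))   # 4

# Imperative vs declarative: spell out the steps, or just say what you want.
nums = [1, 2, 3, 4]
total = 0
for x in nums:                        # imperative: how to compute the sum
    total += x
print(total, sum(nums))               # declarative: just ask for the sum

# Classes, inheritance and polymorphism, with the ducks from above.
class Duck:
    def quack(self):                  # a method: behaviour of the object
        return "Quack!"

class RubberDuck(Duck):               # subclass inheriting from Duck...
    def quack(self):                  # ...but overriding its behaviour
        return "Squeak!"

for duck in [Duck(), RubberDuck()]:
    print(duck.quack())               # same call, different result
```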
But for some problems, none of these traditional paradigms will work. Let's say you want to make a computer recognize which of these images is a bee. The problem is, you can't really describe what a bee looks like with code. This is where machine learning comes in, aka teaching a computer to do a task without explicitly programming it to do that task.

First, you need a lot of data, which you split into training and test data. Next, you choose an algorithm that can change its parameters over time, for example a neural network, where the weights can be updated to achieve a different result. By feeding lots and lots of training data into this algorithm, you can build a model, whose accuracy you can then check with the test data. If it's not quite right, the model can improve over time by comparing its output to what it should have been, capturing the difference in an error function, and tweaking its parameters to minimize that difference.

But no matter how futuristic, bleeding-edge, blazingly fast and optimized it is, if you want people to actually use the application you wrote, you should probably know about the internet. It's a network of computers from all around the globe connected by wires. Like, literally, the internet is a bunch of thicc cables that run along the bottom of the ocean, along with facilities like Internet Service Providers that connect you to your destination.

These computers communicate with the Internet Protocol Suite. Every computer on the network has a unique IP address. Two computers can then transfer data with the Transmission Control Protocol, or TCP. It breaks messages into a bunch of packets and sends them through the network of wires, before the receiving end puts the message back together. If you have a poor internet connection, you might have experienced "packet loss", which is just when some of these packets get lost along the way.

If the internet is the hardware, then the web is the software, which you can use with a browser. Every page on the web has a URL. When you type one into your browser, it looks up the IP address of the server hosting that website with the Domain Name System, or DNS, which is like a dictionary mapping domain names to IP addresses of actual servers. After connecting to the server via TCP, your browser, called the client, uses the HyperText Transfer Protocol to send an HTTP request to the server, which then gives a response, ideally containing the contents of the webpage.

The actual website most often consists of three parts. An HTML file contains all the content of the website and is basically just a collection of elements, which can be text, links, buttons and so on. A CSS file controls the visuals and makes the website look nice. But a website is useless if pressing nice-looking buttons does nothing, so a language like JavaScript adds functionality.

Sometimes, though, things can go wrong. Every HTTP response comes with a response code, which carries information about the status of the response. For example, 200 means "OK", and anything starting with 4 is a client-side error, the most famous one being 404, page not found.

HTTP requests can carry different methods, for example GET, POST, PUT and DELETE, so retrieve, add, update and delete information. These are often used by Application Programming Interfaces, which connect two applications and allow them to interact with each other, for example to store and retrieve data from a database.

The most common type of database is a relational database, which uses tables to store data. Columns of a table contain different attributes, and rows represent individual data points. Also, each table has one attribute that uniquely identifies each row, called the primary key. A foreign key is the primary key of another table, establishing a relationship between the two. In this case, each book is connected to an author through the author's primary key.

With a language like SQL, you can write statements to work with data from these tables. You could look up the titles and authors of all books whose title starts with "H", but for that, we have to join the authors table with the books table on the matching key, combining attributes from two different tables into one output.

These statements are useful, but you've got to be careful, or you might just delete an entire database with one line of code. But, don't worry. That's never happened before.

Behind every login page is a database with usernames and passwords. When a user tries to log in, an SQL query is often used to check if the user input matches an entry in the database. That's good, but a devious actor could type an input that changes the query by terminating the string early and commenting out the rest, which means that as long as the username exists in the database, access is granted. This is called an SQL injection attack and is one of the easiest ways hackers get places they're not supposed to.
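Both halves of this story, the web request and the database, fit in one Python sketch. The URL, the books/authors schema and the sample rows are all made up for illustration (and the request assumes you're online); the last query shows the parameterized style that defuses injection:

```python
import sqlite3
from urllib.request import urlopen

# One web round trip: DNS finds the server's IP, TCP connects to it,
# and HTTP carries the request and the response.
response = urlopen("https://example.com")   # an HTTP GET request
print(response.status)                      # 200 means "OK"
print(response.read(60))                    # the first bytes of the HTML

# A tiny in-memory relational database with a books/authors schema.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT)")
db.execute("CREATE TABLE books (title TEXT, author_id INTEGER)")
db.execute("INSERT INTO authors VALUES (1, 'Frank Herbert')")
db.execute("INSERT INTO books VALUES ('Heretics of Dune', 1)")

# Join books to authors on the matching key; keep titles starting with "H".
rows = db.execute("""
    SELECT books.title, authors.name
    FROM books JOIN authors ON books.author_id = authors.id
    WHERE books.title LIKE 'H%'
""").fetchall()
print(rows)     # [('Heretics of Dune', 'Frank Herbert')]

# The injection trick: input like this ends the string early and comments
# out the rest of the query, including the password check...
username = "admin'--"
unsafe = f"SELECT * FROM users WHERE name = '{username}' AND password = ''"
print(unsafe)   # look at what the quote and the -- comment did to the query

# ...so never paste input into SQL. A parameterized query treats it as data:
safe = db.execute("SELECT title FROM books WHERE title = ?",
                  ("Heretics of Dune",)).fetchall()
print(safe)
```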
Hearing about all these concepts is one thing, but to really learn them, you have to see them in action and use them yourself, which is exactly what you can do with Brilliant, which has thousands of interactive lessons for everything from math and data science to programming and AI.

They make knowledge stick with visual lessons and interactive problems, which is not only fun and builds intuitive problem-solving skills, but is also proven to be 6x more effective than simply watching hours of lectures.

I know that making time to learn new skills can be difficult, but Brilliant makes it easy: you can build valuable knowledge from the ground up in just a couple of minutes a day, with bite-sized lessons from any device, anywhere, anytime. You'll go from the basics of data science to analysing real datasets from Spotify in absolutely no time, which you can supplement with math fundamentals and programming courses in Python to build one of the most in-demand skillsets of our time.

The best part? You can try everything Brilliant has to offer for free for a full 30 days, by visiting brilliant.org/WackyScience/. You'll also get 20% off an annual premium subscription. Thanks to Brilliant for sponsoring this video.