Transcript for:
Neural Networks from Scratch Lecture Notes

What's going on everybody welcome to the Neural Networks from Scratch series where we will be creating creating Neural Networks end-to-end so from... the Neurons, the Layers, Activation functions, Optimization,... Back Propagation all this stuff were going to be coding that from scratch in python. Now... everything we do we are going to show first in truly raw python no libraries or no 3rd party libraries and then we are going to use NumPy for multiple reasons NumPy just makes a lot of sense here. It's a extremely useful library and it'll cut our lines of a full application down-a-ton it'll make it much faster and NumPy is a great thing to learn, so... we will show everything from scratch in python first and then we are going to use NumPy. Now... why would anybody wanna do this to themselves? Well, first of all, it's very interesting umm... the idea is not-even though we effectively we are going to be programming our own Neural Networks framework, it's not really the purpose here. The purpose is to actually learn how Neural Networks work at a very deep level so that when we go back to whatever Framework we actually use, be it PyTorch or Tensorflow's Keras or maybe some library that doesn't even exist yet. We actually understand what we are doing. So from myself when I at least when I learned Deep Learning, it was all- -yes it was hard but at the same time everything was kinda solved for me you know, how many layers, how many multiple layers, what Activation functions to use but all that stuff was just fed to me I just kind of-okay and then I just sort of memorized like here's the activation function used for this sort of task But I didn't understand why and this became a problem when I tried to solve problems that had not yet been solved for me. So classifying images of handwritten digits? Pretty darn simple. Taking that a step further and classifying images of cats and dogs? Pretty simple. But then taking that just a tiny step further and classifying images that were instead frames from a video game and trying to map that to actions that I want to take in the video game. Suddenly, I'm lost. and that's..... ...that's no good and there's no way to really know where to go next and you can see that there is a problem but actually trying to solve more custom problems is going to require to have a deeper understanding of how things actually work Unfortunately though deeper understanding Deep Learning can be very very complicated. So you can just-just a simple forward pass of a Neural Network Neural Network looks extremely daunting. So looking at the calculations for a Neural Network, it looks pretty confusing fairly quick You've got your input data and every unique input and every unique neuron. The information coming through has a unique weight associated- with it. Those get summed together per neuron neuron plus a bias. Run through an activation function and then we do that for every single layer, giving us the output information. From there we wanna calculate a loss which is a calculation of how wrong the neural Network is, so that hopefully we can fix it. And at the end of that, even though that was just a forward pass that looks already extremely-extremely daunting Now let's take a look at the exact same formula just in code version. You have all your inputs times your weights (Input x Weights) And you don't have to follow along perfectly here just look at each element and see if any of these elements look like they are over your head. They really shouldn't be. So inputs times weights, you can make this even simpler by doing just a dot product so 'np.dot' y1 is just the max of zero and the output We do this for all the layers. Then we have our softmax (activation function) at the very end. And then the loss turns out to be a negative log (logarithmic) loss due to the nature of neural networks. So this is the entire forward pass and calculation of loss formula I urge you to look at all of the functions and things that we are doing here to determine if any of these is really over your head because it shouldn't be right? We've got very simple functions going on here We are calculating a log, sum-if you don't know what a 'log' means btw we're going to explain it. 'Sum', you should know what that means. 'Exponential', again if you don't know what that means we are going to explain but it's very simple. 'Dot Product', again if you don't know what that means no worries we will explain it, very simple. 'Maximum', is just the max whatever two values you pass here. Again very simple. Some more dot products, 'Transpose', again very simple. If you don't know it we will explain it anyway That's it! None of this is over your head I promise you As far as prerequisites go the only expectation I have from the viewers is that you understand programming and Object-Oriented Programming (OOP) otherwise, you're going to feel kinda lost. If you're coming from a different programming language you will probably find python simple You can probably use the programming language that you are familiar with Everything is so low level here that you should be able to follow along in any other language that you want So feel free to do that, if you wanna do that otherwise, if you wanna follow along in python make sure you know the basics and OOP I'll put links in the description for both of those If you need to brush up Next, is the version of python If you follow along with python we are gonna do things like 'fstring' I'm gonna be on Python 3.7 The NumPy version, I can't think of any reason of why it would matter. I'll put that information in the description just in case any function does change. Like I said, everything's so low level this series should be good for like 10 years. Let's hope for that :) So those are the prerequisites it's really not much. I don't expect anybody to have any background knowledge of Deep Learning So if you do know things about deep learning, yes we are going to cover hopefully quickly the fundamentals, just so people understand like what exactly are we aiming for here? And then the bulk of your understanding of how Neural Networks work is gonna come from us just building these Neural Networks If things feel a little fuzzy to you it's probably normal, to be honest. I think once you build a Neural Network from scratch that is all the understanding you are gonna need So you're not expected to know math or anything like that. For me in college, the only math class I took was 'Math Fundamentals' And I don't think we did any math calculation at all in that class. It was definitely a joke. So if I can do this, I know you can do this too. So If you wanna brush up on Math There is Khan academy for Linear Algebra and the calculus stuff But I wouldn't even suggest you go through the full series on either of those topics You can use those to kinda like a spot check issues that you still find confusing With that in mind, this series is also provided in conjunction with the 'Neural Networks from Scratch' book We are going to be covering the same material for the most part. The book might be a little more verbose. The series is obviously free but the book ha various prices depending on what you want. The E-book, Softcover, Hardcover. We ship everywhere in the world Access to the book gives you access to the E-book. So whichever version you buy you always have access to the E-book. That gives you the opportunity to access the google docs draft, currently it's in a draft form at some point, it won't be a draft anymore but you can highlight, post comments, ask questions inline with the text Also if you're impatient you can access even though its a draft right now it's complete from basically end-to-end of training the Neural Network and we're doing testing right now. So, we are already obviously quite a bit ahead of where the videos are so if you're impatient you can also access information earlier with the book I would use the book as either reviews or the videos as reviews So I would maybe read the book before watching the video and then use the video as a review of what I learned in the book or vice versa This is a topic that you're not gonna blow through this in a weekend. It's gonna require multiple sittings multiple environments and ideally multiple mediums so, If you're interested in the book you can get that at 'nnfs.io'. So we call these Neural Networks because they look visually like a network. You've got your Neurons which in this case are the blue circles. They're connected via those orange lines and in this case, we have basically the input layer, two hidden layers of 4 neurons each and then your output layer. Now data is gonna get past forward through this, so it starts at your input layer, so in this case, we only have two pieces of data they are gonna come in. That gets past forward to that first hidden layer Then that hidden layer passes data to the second layer and then finally to the output layer where we hope that it will output something that we want, for example: based on some sensor data maybe we wanna predict Failure or not failure. You could either have one neuron or in this case we have two neurons. So the top neuron might be a failure neuron, the bottom neuron is a not failure neuron. And depending on which one has the higher value that's the supposed prediction. Now the end goal of neural networks like most machine learning is to take some input data and produce output data that is desired. In this case, we've got images of cats and dogs. We hope that we can pass it through in pixel form to our neural network and if its a dog then that final output neuron on top is going to be the strongest, if it's a cat then that final output neuron on the bottom will be the strongest. And we can do this by tuning the weights and biases So all those unique weights and biases, we do that by tuning those and that is the actual training process. It's tuning those in such a way so that hopefully, we can take data that this neural network has never seen, give it pictures of cats and dogs that it has never seen and have it accurately predict those So how and why do neural networks work? Well, if you just look at them and really consider what's going on every neuron is connected to the subsequent layer of neurons in folds. So each of those orange lines, that connection that's a unique weight and then every neuron is a unique bias. So what this ends up giving us is a huge number of uniquely tunable parameters that go into this GIGANTIC function. So for example, with 64 x 3 hidden layers here we have 9164 tunable parameters in this gigantic function. And each of those parameters impacts the output of the next neurons and so on. And so what we end up having are these complex relationships that can in theory be mapped. So to me the impressive thing of neural networks is not necessarily all of those connections its not really complex to understand. What the hard part of neural networks and deep learning is figuring out how to tune such a thing. Alright! since I think it would be lame to not post any code at all in this first video we are going to begin to code neuron. But first I wanna go over the version numbers real quick because quite frankly I am gonna forget to put it in the description So I'm using Python 3.7.7 NumPy 1.18.2 Matplotlib 3.2.1 Again all of this should work very far into the future But just in case not, there are the exact versions if anybody needs to have the exact version to follow along Now, let's go ahead and begin to code just to say I am using Sublime Text You can use whatever editor you want. Everyone's got a really strong opinion on editors. I'm gonna be using Sublime text. There might be a time where we use Jupyter Notebook. Who knows, I'm gonna use whatever makes the most sense to me at the time which may not make sense to you or me later Anyway, continuing on, so every neuron basically- Let's pretend we are coding a neuron that's somewhere in this- densely connected feed-forward multilayer perceptron model. Using all the big words. Don't worry about knowing what those mean, by the end of this you will know all of those words. But part the problem with learning deep learning is people use the same 3-4 words for the exact same thing and it can be very daunting. So for now, we are gonna called this a neuron, it's somewhere in our neural network. Now in this fully connected neural network every neuron has a unique connection to every single previous neuron. So, let's say there are 3 neurons that are feeding into this neuron that we are gonna build. So, we don't know much about those neurons but we know that they are outputting some value So first, their output becomes the neurons that we are coding's inputs We are gonna just make up some numbers, we're just gonna say 1.2, 5.1, 2.1 Those are the unique inputs So these are outputs from the three neurons in the previous layer Every unique input is also going to have a unique weight associated with it. So we are going to say weights and you should know how many weights we are going to have Well since we have three inputs we know we are going to have three weights (3.1, 2.1, 8.7) I'm just making up these numbers. It's just for, beginning to code how neurons are gonna work So you've got your inputs, your weights and then every unique neuron has a unique bias So bias equals to 3 So now, the first step to for a neuron is to add up all the inputs times weights plus the bias So this is relatively simple. In very raw python no loops required at this stage. We're just going to say basically output so far of this neuron is going to be look at the screen now You need to have some basic knowledge of 'list' here That is basically so far... There will be other things that will happen soon enough, but so far that is the output to our neural network So we'll just print the output, run it. 35.7 If you're new to Sublime Text for some reason 'ctrl + b' to run. But you'll have to set up the build system to run with python. But I would expect how that people know how to work with a programming language But if not feel free to either comment below, join us in 'discord.gg/sentdex And we'll be happy to help. Anyway that's it for now if you've got questions, comments, concerns, whatever moving forward feel free to ask them below otherwise, we're just gonna keep slowly shipping away at Neural Networks from Scratch As you can see pretty darn simple so far and for the most part, it's just adding a bunch of- as we've broken it down so far We are gonna break it down to the point where every little additional step is gonna look an awful lot like that There's a couple of points that may be a little more challenging but the goal is to break it down so much that it is painfully simple So I will see you guys in the next video :)