Transcript for:
Understanding Tensors in Neural Networks

Tensors do it for the data, tensors do it for the speed, tensors do it whenever you feel the need... StatQuest!

Hello, I'm Josh Starmer, and welcome to StatQuest. Today we're going to talk about tensors for neural networks, and they're going to be clearly explained.

This StatQuest is sponsored by Lightning and Grid.ai. With Lightning, you can design, build, and scale models with ease; focus on the business and research problems that matter to you, and Lightning takes care of everything else. And with Grid, you can use the cloud to seamlessly train hundreds of models from your laptop with a single command, no code change necessary. For more details, follow the links in the pinned comment below.

Note: one thing that makes tensors a little confusing is that different people use the word "tensor" differently. People in the math and physics community define tensor one way, and people in the machine learning community define tensor a different way. In this StatQuest, we're going to focus on the way "tensor" is used in the machine learning community.

Within the machine learning community, tensors are used in conjunction with neural networks, so we need to talk about neural networks. Note: if you're not already familiar with neural networks, feel free to check out the Quests; the links are in the description below.

Anyway, neural networks can do a lot of things. For example, in the StatQuest Neural Networks Part 1: Inside the Black Box, we had a simple neural network that had a single input, drug dosage, and used that single value to predict a single output, the efficacy of the dosage. Then, in the StatQuests on backpropagation, we saw that even with this super simple neural network, and this super simple training data with only three data points, we still had to do a lot of math for the neural network to fit a squiggle to the data. Ugh, math!

Then, in the StatQuest Neural Networks Part 4: Multiple Inputs and Outputs, we had a fancier neural network that had two inputs, corresponding to two different flower measurements, and three outputs that predicted which iris species those two measurements came from. And then we saw how the neural network does a lot of math to make those predictions. Double ugh, more math!

Then, in the StatQuest Neural Networks Part 8: Image Classification with Convolutional Neural Networks, we had a super fancy neural network that took a 6 pixel by 6 pixel image, so 36 pixels in all, as the input, and had two outputs that predicted whether the image was of an X or an O. And we walked through a whole lot of math that we needed to do in order to make those predictions. Triple ugh, so much math!

Note: even though this neural network needs to do a whole lot of math, it's still relatively simple compared to the types of neural networks that are used in practice. For example, the input to this convolutional neural network is a relatively small 6 pixel by 6 pixel black and white image. However, in practice, the input image is usually much larger, like 256 by 256, and that means the input has 65,536 pixels. And usually the image is color instead of black and white, and color images are usually split into three color channels: red, green, and blue. Since the neural network treats each channel separately, that basically triples the number of pixels we have to do math on. So now we're up to 3 × 65,536 = 196,608 pixels that we have to do a lot of math on. And this is just one image; usually we need a ton of images to train the neural network, which means we have to do a ton of math on a ton of images.
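To make that bookkeeping concrete, here's a minimal sketch of the counting above. It assumes Python with PyTorch, which the video itself doesn't specify; the batch size is just for illustration:

```python
import torch

# A 6x6 black and white image: 36 values to do math on.
small_image = torch.zeros(6, 6)
print(small_image.numel())  # 36

# A more realistic 256x256 color image with 3 channels (red, green, blue):
# 3 * 256 * 256 = 196,608 values.
color_image = torch.zeros(3, 256, 256)
print(color_image.numel())  # 196608

# A batch of 1,000 such images for training: nearly 200 million values.
batch = torch.zeros(1000, 3, 256, 256)
print(batch.numel())  # 196608000
```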
And if we want to apply a neural network to video, which is basically a series of images, then we have even more math. The good news is that all this math is what tensors were designed for. Bam!

Now let's talk about what tensors are. From the perspective of someone who is creating a neural network, tensors are ways to store the input data, which in this example consists of three color channels for every single frame. But, as we saw earlier, the input can also be super simple and consist of a single value. And tensors also store the weights and biases that make up the neural network.

So, from the perspective of someone creating a neural network, tensors can seem really boring. For example, the input for this neural network is just a single value, which in most programming languages we'd call a scalar. However, to make things seem more exciting, we can use fancy terminology and call the input, which is just a single value, a zero-dimensional tensor. When a neural network takes two input values, like this one, then in most programming languages we would say we store the inputs in an array; however, using tensor talk, we will call this a one-dimensional tensor. Likewise, when the input is a single image, most programming languages would call it a matrix, but we'll call it a two-dimensional tensor. And when the input is video, most programming languages would call it a multi-dimensional matrix, or a multi-dimensional array, or, for you Python people, an ndarray; however, using tensor talk, we will call it an n-dimensional tensor. (There's a sketch of all four of these after the summary below.)

So, just like I said earlier, from the perspective of someone creating a neural network, tensors can seem really boring, because all we have done is rename things that already exist. So what's the big deal? Well, unlike normal scalars, arrays, matrices, and n-dimensional matrices, tensors were designed to take advantage of hardware acceleration. In other words, tensors don't just hold data in various shapes like these; they also allow all the math that we have to do with the data to be done relatively quickly. Usually tensors, and the math they do, are sped up with special chips called graphics processing units, GPUs, but there are also tensor processing units, TPUs, that are specifically designed to work with tensors and make neural networks run relatively quickly. Double bam!

Note: one thing I hinted at early on, but didn't dive into the details about, is that one of the things we do with neural networks is estimate the optimal weights and biases with backpropagation. And if you saw the StatQuest on backpropagation, you'll know that we have to derive a bunch of derivatives and do a whole lot of the chain rule. Well, one more cool thing about tensors is that they take care of backpropagation for you with automatic differentiation. This means you can pretty much create the fanciest neural network ever, and the hard part, figuring out the derivatives, will be taken care of by the tensors. Triple bam!

In summary, there are two types of tensors. One type is used by mathematicians and physicists; we did not talk about these today. The other type of tensor is used in neural networks, and this is the type we talked about. Tensors for neural networks hold the data and the weights and biases, and are designed for hardware acceleration so that neural networks can do all the math they need to do in a relatively short period of time, and they take care of backpropagation with automatic differentiation. Bam!
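To see the renaming in action, here's a minimal sketch, again assuming PyTorch (the video doesn't name a specific library), of the same old data structures wearing their fancy tensor names:

```python
import torch

# A single value: a scalar, or, in tensor talk, a zero-dimensional tensor.
dosage = torch.tensor(0.5)
print(dosage.ndim)  # 0

# Two input values: an array, or a one-dimensional tensor.
flower_measurements = torch.tensor([0.04, 0.42])
print(flower_measurements.ndim)  # 1

# A 6x6 image: a matrix, or a two-dimensional tensor.
image = torch.zeros(6, 6)
print(image.ndim)  # 2

# Video: a multi-dimensional array (an ndarray, for you Python people), or an
# n-dimensional tensor. Here: frames x channels x height x width.
video = torch.zeros(24, 3, 256, 256)
print(video.ndim)  # 4
```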
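And here's a sketch of the hardware acceleration idea: the same tensor math can be moved onto a GPU when one is available. This assumes PyTorch with CUDA; the shapes and layer size are just for illustration:

```python
import torch

# Use a GPU if one is available; otherwise fall back to the CPU.
device = "cuda" if torch.cuda.is_available() else "cpu"

# A batch of 64 flattened 3x256x256 images and a layer's worth of weights...
inputs = torch.randn(64, 196_608, device=device)
weights = torch.randn(196_608, 10, device=device)

# ...and the big matrix multiplication a neural network layer does with them.
# On a GPU, this math runs in parallel across thousands of cores.
outputs = inputs @ weights
print(outputs.shape)  # torch.Size([64, 10])
```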
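Lastly, here's a sketch of automatic differentiation, again assuming PyTorch: we mark a tensor as needing gradients, do some math with it, and the tensor keeps track of the chain rule for us:

```python
import torch

# A weight we want to optimize. requires_grad=True tells the tensor to record
# the math done with it so derivatives can be computed later.
weight = torch.tensor(2.0, requires_grad=True)

# A tiny "network": prediction = weight * input, with a squared error loss.
x, target = torch.tensor(3.0), torch.tensor(12.0)
loss = (weight * x - target) ** 2

# backward() applies the chain rule automatically...
loss.backward()

# ...so d(loss)/d(weight) = 2 * (weight*x - target) * x = 2 * (6 - 12) * 3 = -36.
print(weight.grad)  # tensor(-36.)
```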
Now it's time for some shameless self-promotion. If you want to review statistics and machine learning offline, check out the StatQuest study guides at statquest.org; there's something for everyone. Hooray! We've made it to the end of another exciting StatQuest. If you like this StatQuest and want to see more, please subscribe. And if you want to support StatQuest, consider contributing to my Patreon campaign, becoming a channel member, buying one or two of my original songs, or a t-shirt or a hoodie, or just donate; the links are in the description below. All right, until next time, Quest on!