Transcript for:
Applying Generative Adversarial Networks (GANs) in Python using Keras

Hey guys, you're watching Python tutorial videos on my YouTube channel, Python for Microscopy. In the last tutorial we talked about the basics of generative adversarial networks, or GANs, and in this tutorial let's actually go ahead and implement one in Python using Keras. We are going to do both the training part and the testing part.

Just to give a quick summary of what we talked about in the last tutorial: the term generative means we are generating something; in this example we generate data, specifically images (an image is a form of data). Adversarial means the generator and the discriminator are each competing to win: the generator is trying to create fake data that looks realistic, and the discriminator is trying to catch that fake data. As we train, both get better at their respective jobs; the generator gets better at faking data, and the discriminator gets better at discriminating. There comes a point where the generator is so good at faking realistic-looking data that we take it as our model and use it to generate further images, and those images can be used for other applications, which I talked about in the last tutorial.

So what we have to implement is: a way of creating random seeds, meaning random noise (that's easy); a generator network (this is not easy, but we are going to copy something that already exists; I've read a whole bunch of papers and looked at a lot of code on GitHub and wherever people published it, and taken something that actually works; most of the code you see out there is probably some incarnation of the architecture from the original paper or other papers published in this field; no one, at least not me, is sitting and putting these networks together from scratch, because I'm not smart enough to do that; I'm smart enough to take something and customize it for my task, and not many people are good at designing networks anyway); a way of actually getting real images and supplying them, along with the fakes, to the discriminator network we are going to define, which will classify them as real or fake; and then we get our discriminator loss and generator loss and keep retraining the network. So that's the plan.

Now, before jumping in, a couple of housekeeping things. In my past videos I should have talked about what version of Python and what versions of everything else I was using, because that's very important for you to repeat this experiment. I just restarted the kernel, so you can see I'm using Python 3.5.5. I know Python 3.7 is out there, but I have an older GPU, an NVIDIA Quadro K5000, and it took me quite a few days to figure out how to get my GPU ready for TensorFlow: which version of the CUDA drivers I had, all those drivers, all that stuff. I don't want to make this a video about my grievances over getting this GPU ready; there's a lot of material out there you can follow to get that done. So, continuing: Python 3.5.5. What else do you need to know? Let's actually import Keras so you can see the version of Keras I'm using, and also TensorFlow; I totally don't remember what they are, so let's go ahead and print the versions.
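If you want to check your own setup the same way, a minimal version check looks like this (using the standalone keras package, which is what I have installed, rather than tf.keras):

    import keras
    print(keras.__version__)   # 2.0.8 on my machine

    import tensorflow as tf
    print(tf.__version__)      # 1.4 on my machine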
The version of Keras I'm using is 2.0.8, and I'm using TensorFlow 1.4. I believe 2.0 is out; I know 2.0 is out, but I'm using the older version because, for me to use this GPU, I have to have this version of TensorFlow, that version of Keras, and so on.

Now that we've covered that, a couple more things. Here you see I'm defining my image as 28 by 28 by 1; that's because I plan on working with the MNIST dataset. You're probably aware of what the MNIST dataset is; if not, let's go ahead and cover that, and let me just show you what I mean. First of all, you can download the MNIST dataset as part of keras.datasets; alternatively, you can download it yourself, leave it on your hard drive locally, and use it from there. Either way is fine. So first let's go ahead and download it; there you go, I have it. Now let's unpack it: there are two tuples in there that we need to unpack. Let me open the tuple: there is X and there is y. I opened X, and you can see X is 60,000, which means we have 60,000 images of handwritten digits, and each image is 28 by 28 pixels. In fact, if I open image number 0 (down here my index is 0), it kind of looks like a 5, written in a very bad way; of course, these are handwritten, so they're not pretty. If I go to index 1, the second digit, it looks like a 0; it appears a bit stretched in one direction because the cell size is stretched in that direction, but it's a 0. If I keep going, the next one almost looks like a 4 (again stretched because of the window), and the next one looks like maybe a 1. So let's just look at the first few: 5, 0, 4. The reason I'm stressing this is that if you look at the other variable here, y, you see 5, 0, 4: these are the labels of the images. So X is all the images, and the second variable is your ground truth: is it a 5, a 0, or a 4? This is the dataset we're going to work with.

Now let's go ahead and continue. First let me explain at a very high level: we are building a generator network and we are building a discriminator network, and then we are going to train these two networks, first on real data (you see we are loading MNIST data right there) and then on fake data; we are going to generate a whole bunch of fake data using our generator. Then we save the images after so many epochs, so we can see how the model evolves as a function of time. And then there are other parameters, hyperparameters like your optimizers and so on, and the combined model; we'll get to what 'combined' means, but it's when you attach the discriminator and generator together. So that's the high level; now let's go through this line by line. I'll share this document as part of my GitHub repo (the link will be in the description), so you can go through it yourself, and I added as much text as possible, so if you find my video, me talking, boring, you can always read the text. So let's look at the imports.
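For reference, here is roughly what the import block and a quick peek at the data look like; treat the exact layer list as a sketch of my setup (it matches the layers I describe next) rather than the only way to write it:

    from keras.datasets import mnist
    from keras.layers import Input, Dense, Reshape, Flatten, BatchNormalization
    from keras.layers.advanced_activations import LeakyReLU
    from keras.models import Sequential, Model
    from keras.optimizers import Adam
    import numpy as np
    import matplotlib.pyplot as plt

    # Quick look at the data: 60,000 handwritten digits, each 28 x 28 pixels
    (X, y), (_, _) = mnist.load_data()
    print(X.shape, y.shape)   # (60000, 28, 28) (60000,)
    print(y[:3])              # [5 0 4] -- the labels of the first three digits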
I don't want to explain the imports in detail; this part is basically importing the MNIST dataset and the required layers: the input layer, dense layer, reshaping and flattening, batch normalization, all of these. This is very standard, as you probably know.

Here I'm defining my image rows as 28 and columns as 28. These images are all grayscale, but I would like to add a channel to my image shape, so it becomes x, y, and number of channels. If, for example, you want to work with color RGB images, you can change the channels to 3.

Now, build the generator. The input to this generator is noise; I define the noise to be a vector of size 100, a one-dimensional array of 100 values. This is the input that goes into the Sequential model we just defined, and this should not be new to you if you watched my U-Net tutorials or one of the previous deep learning tutorials. To this model we are adding a dense layer (you see the dimension, 256), and then we are using LeakyReLU; I talked about this before, and alpha is a hyperparameter. Why 0.2? Don't ask me; I'm just copying it from the published paper, and this is where the sensitivity to hyperparameters really matters. Momentum is not that big of a hyperparameter; it's just how fast it trains, so I'm going to leave it at 0.8. So it's basically three dense layers, going from 256 to 512 to 1024, with batch normalization between each step. The final layer uses a tanh activation, and then I'm reshaping the output into an image. The noise and the image are what my generator function returns. The noise you can call a latent vector if you want (some people do), but it's basically the random input vector that goes into the generator, and the image (I should have labeled it 'generated image' or 'fake image') is the fake image that goes into the discriminator network. So the generator takes a one-dimensional array of 100 values as input and generates a two-dimensional image of 28 by 28 by 1; that's what it returns.

Now, the discriminator. The input to the discriminator is an image. Remember, the discriminator is nothing but a binary classifier: it says either the image is real or it's fake, 0 or 1. So given an input image, it outputs the likelihood of the image being real. Again, I have only three dense layers there, and the activation is sigmoid. What it returns is the image that we supply, plus the validity; you can call it a score, or true/false, whatever; the validity is the value it gives saying whether this is a real image or a fake image. It's basically the probability. So those are the two networks; here is roughly what the two builder functions look like:
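A condensed sketch of the two builders, following the architecture described above; the generator sizes (256, 512, 1024) are the ones I mentioned, while the discriminator's 512 and 256 widths are my assumption for the "three dense layers" (this whole layout follows the commonly published Keras GAN code):

    img_rows, img_cols, channels = 28, 28, 1
    img_shape = (img_rows, img_cols, channels)
    noise_shape = (100,)   # the latent / noise vector size

    def build_generator():
        model = Sequential()
        model.add(Dense(256, input_shape=noise_shape))
        model.add(LeakyReLU(alpha=0.2))
        model.add(BatchNormalization(momentum=0.8))
        model.add(Dense(512))
        model.add(LeakyReLU(alpha=0.2))
        model.add(BatchNormalization(momentum=0.8))
        model.add(Dense(1024))
        model.add(LeakyReLU(alpha=0.2))
        model.add(BatchNormalization(momentum=0.8))
        model.add(Dense(np.prod(img_shape), activation='tanh'))
        model.add(Reshape(img_shape))        # 784 values -> 28 x 28 x 1 image
        noise = Input(shape=noise_shape)
        img = model(noise)
        return Model(noise, img)             # noise in, fake image out

    def build_discriminator():
        model = Sequential()
        model.add(Flatten(input_shape=img_shape))
        model.add(Dense(512))
        model.add(LeakyReLU(alpha=0.2))
        model.add(Dense(256))
        model.add(LeakyReLU(alpha=0.2))
        model.add(Dense(1, activation='sigmoid'))   # probability the image is real
        img = Input(shape=img_shape)
        validity = model(img)
        return Model(img, validity)          # image in, validity (probability) out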
Now, moving down, I'm defining the training part, and the input parameters are the number of epochs, the batch size, and the save interval: how often you want to save. Let's actually change that to 500; we can override this input later anyway, since these are just the default values. First we have to load the real data, and I only care about X_train: those are the real images we're trying to import. Remember, we looked at the dataset: X_train is one thing, y_train is another, and all we're trying to load is the images, all 60,000 of them. That's what that line signifies.

Now, because these are all unsigned integers, let's convert them to floating point and rescale them to minus one to one (you can do zero to one instead). The way we do that is (X minus 127.5) divided by 127.5; if you'd rather go from 0 to 255 down to between 0 and 1, you can do that too. This step is very straightforward; if any of it is confusing, do what I did here: load the dataset, extract this part, run the single line, and see what the output looks like. That's how you learn. Then we expand the dimensions: all I'm doing is adding an extra axis, because otherwise the shape would be 28 by 28 (these are grayscale images), and I'm adding an extra channel to make it 28 by 28 by 1. That's all this is.

Next I define half_batch as batch size divided by 2. That's because I want half real images and half fake images going into the discriminator; I want to mix half and half, so half_batch is half of the batch size. Meaning, in this example, if I have a batch size of 128, 64 would be the half batch.

Now say we run 1,000 epochs. For each epoch, we train the discriminator first, then the generator, then the discriminator, then the generator, and so on. While we are training the discriminator, the generator does nothing, and while we are training the generator, the discriminator does nothing; it just sits there. So let's start by training the discriminator. How do we train it? Select a random half batch of real images: we take the real images and pick a random half batch of them, and that's what I'm calling imgs. And here I'm creating something called noise: Gaussian, normally distributed noise with shape (half_batch, 100). Any time you have a question about what a line does, all you've got to do is come here and run it line by line. Say my half batch is 64; well, you have to import numpy as np first, and then you can see the noise it generated: 64 vectors, each of dimension 100. So I have 64 of these (you see indices 0 to 63), and if I scroll in the other direction, each one has 100 values. So 64 noise vectors of length 100 go into the generator, because the generator predicts on all of them, and what comes out is 64 fake images. That's exactly what this line is doing: gen_imgs is generator.predict(noise), and we already have our generator defined up there. I don't want to dwell on trivial topics, but sometimes not understanding a little topic makes such a huge difference in implementing the code, so I don't mind explaining these little things. Here is a sketch of that preparation step plus the real and fake half batches:
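A minimal sketch, assuming the generator from the builder above has already been built (generator = build_generator()):

    # Load the real images and rescale from unsigned 8-bit [0, 255] to floats in [-1, 1]
    (X_train, _), (_, _) = mnist.load_data()
    X_train = (X_train.astype(np.float32) - 127.5) / 127.5
    X_train = np.expand_dims(X_train, axis=3)   # (60000, 28, 28) -> (60000, 28, 28, 1)

    half_batch = int(128 / 2)                   # 64, for a batch size of 128

    # A random half batch of real images...
    idx = np.random.randint(0, X_train.shape[0], half_batch)
    imgs = X_train[idx]

    # ...and a matching half batch of fakes: 64 noise vectors of length 100 go in,
    # 64 generated images of shape 28 x 28 x 1 come out
    noise = np.random.normal(0, 1, (half_batch, 100))
    gen_imgs = generator.predict(noise)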
So now let's train the discriminator. I'm going to train it on imgs; remember, these are the real images coming straight out of our MNIST dataset, not fakes. I train on those images and get this loss back as the real loss. Then, very similarly, I do the same on the generated images and take that loss as the fake loss. So half of one, half of the other: we train the discriminator and get two loss values, one for real and one for fake. As you can see, later we report a single loss; how do you get a single loss? You just average real and fake, meaning I add those two and divide by 2. So the discriminator loss I'm keeping track of is the average of the losses on real and fake images.

Our discriminator is trained; within this epoch we're done with it. Now go down and train the generator, and don't do anything to the discriminator. To do that, for the generator I'm going to generate noise again, very similar to the definition earlier, and then I'm going to generate the corresponding truth values. The y value here is the image probability; this line generates an array of all ones. If you wonder what this does: I'm pretty sure most of you know, and you're probably getting frustrated because I'm explaining little details, but believe me, there are people who need help, and this is for them. When we do that, valid_y is a whole bunch of ones, in column format rather than row format, meaning we may have to reshape it, but that's okay; it doesn't really matter here. In this case we don't reshape it, though we will at some point.

Now, why am I labeling all my fake images as true? This step is very important for this GAN, so let me dwell on it for a bit. I'm going to generate fake images here, pass the noise to train_on_batch, and claim these are real images. This is my way of fooling the discriminator into thinking a fake image is real. I'm challenging the discriminator: 'hey, this is real, take it,' and the discriminator says 'no, this is not real' and gives it back, and that's how this entire training process moves along. I hope it's making sense to you. So the reason we created this parameter of all ones is so we can look at the generator loss from combined.train_on_batch. What does 'combined' mean? If I go down, combined is the model we create using an input array z (the input vector of the shape we discussed) and valid; sorry for the name, valid is actually the probability there. Let me repeat this so I don't confuse you (read the text if you think it's confusing): we are about to train the generator, and the way we train the generator is via the combined model. In the combined model we attach the network for the generator and the network for the discriminator, but we are not training the discriminator part; we are only training the generator part. Here is roughly what these steps look like inside the loop:
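A sketch of the two training calls, assuming the names from the previous sketches (imgs, gen_imgs, half_batch, a batch_size, and the compiled discriminator and combined models shown further down):

    # Train the discriminator: real images labelled 1, generated images labelled 0
    d_loss_real = discriminator.train_on_batch(imgs, np.ones((half_batch, 1)))
    d_loss_fake = discriminator.train_on_batch(gen_imgs, np.zeros((half_batch, 1)))
    d_loss = 0.5 * np.add(d_loss_real, d_loss_fake)   # single loss = average of real and fake

    # Train the generator through the combined model: fresh noise, with a y of
    # all ones, i.e. we claim the fakes are real to challenge the discriminator
    noise = np.random.normal(0, 1, (batch_size, 100))
    valid_y = np.ones((batch_size, 1))
    g_loss = combined.train_on_batch(noise, valid_y)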
It will probably make more sense once I go down, but that's exactly what we are doing here: train on batch while fooling the model that a noise-generated fake image is a real image, because we are supplying a y value of 1, which is 'true'.

Then, as the epochs go along, I print the discriminator loss, the discriminator accuracy (I'll track accuracy there), the generator loss, the epoch number, and all that; it's just a print statement for all the values we're keeping track of. And after every so many epochs, the save interval, we actually save images; once a certain number of epochs is reached, we run that part of the code and, in this example, plot a 5 by 5 grid: we pick 25 images out of the predicted ones and save them as a grid of 25 images. How often we do this is up to us; if I scroll up to the save interval, in this example it's every 500 epochs: just go ahead and save these images.

Now let's go down to the main part of the code; up to this point these were all individual functions. Let's use Adam as the optimizer; because we use this optimizer multiple times, I just defined it up here, no other special reason. For the discriminator, let's use binary cross-entropy, because remember, the discriminator is a binary classifier, so binary cross-entropy works much better than mean squared error or any other type of loss function for this type of problem. That's why we are using binary cross-entropy loss, and let's keep track of accuracy as the metric. So all we're doing is building and compiling the discriminator right here. The next part is the generator: build the generator, and compile it with binary cross-entropy too, using the same Adam optimizer.

Going down, what else do we have? My z is the input; this is the vector of size 100 we've been talking about, so that's my input shape, and my image, the fake image, is the one the generator generated using z as the input. Why am I doing this? Because I would like to get down to 'combined' in a second, but first let me explain discriminator.trainable = False. This is very, very important, and counterintuitive too: we just said we're not training the discriminator? No, we ARE training the discriminator up there. But down here, before we train our generator, I want to keep my discriminator non-trainable while the generator trains. That's what discriminator.trainable = False means: while the discriminator is being trained, the generator does nothing, and while the generator is being trained, the discriminator should not be trained. Because we are using a combined model where the discriminator and generator are attached, we have to explicitly mark the discriminator as not trainable. The main-part wiring looks roughly like the sketch below:
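A sketch of the main part; note I only say 'Adam' in the video, so the (0.0002, 0.5) learning-rate and beta_1 values below are the common DCGAN-style defaults, an assumption on my part rather than something specified above:

    optimizer = Adam(0.0002, 0.5)   # lr and beta_1 are assumed DCGAN-style settings

    # Discriminator: a binary classifier, so binary cross-entropy,
    # with accuracy tracked as the metric
    discriminator = build_discriminator()
    discriminator.compile(loss='binary_crossentropy',
                          optimizer=optimizer,
                          metrics=['accuracy'])

    # Generator: compiled with the same loss and the same Adam optimizer
    generator = build_generator()
    generator.compile(loss='binary_crossentropy', optimizer=optimizer)

    # Combined model: noise z in -> generator -> discriminator -> validity out.
    # Freezing the discriminator here means that training the combined model
    # only updates the generator's weights.
    z = Input(shape=(100,))
    img = generator(z)
    discriminator.trainable = False
    valid = discriminator(img)
    combined = Model(z, valid)
    combined.compile(loss='binary_crossentropy', optimizer=optimizer)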
Then valid, my probability, equals discriminator applied to this image: we are now running the discriminator on the image coming out of the generator. We built the discriminator up there, and I recommend going through this line by line: you see how we build the discriminator, call that function, and compile it, and down here I'm basically running it by supplying my input as the image that comes out of my generator. So now that I have my valid, which is my probability, I go ahead and create this combined model with z as our input.

I paused there for a second because I felt the need to explain this one more time, and I know this is going to be a long video, but I hope you pay attention here. Generator: what goes in as the input is a vector of size 100, which is what we are calling noise, basically random noise that we supply; what comes out is a fake image. So generator: noise input, fake image output. Discriminator: image input, and the validity score as output, the probability of this being a fake or real image. Now, when we combine them down here, the input is the vector (because our generator's input is the vector) and the output is 'valid or not valid', the output of the discriminator. That's what it means to combine these two models down here. Please, please go through this, and if it doesn't make sense, watch another video; I don't want you to be confused, and this can be a confusing topic initially.

Now that we've done that, we just need to train it: however many epochs, whatever our batch size is, and the save interval, how often we want to save. Because I have a system with one GPU with 4 GB of RAM (not a super-duper GPU, but still better than a CPU), let's train it for 100 epochs and save every 10, so we can see some images. I've already trained this for 100,000 epochs, so I can show you the final result, and eventually I would like to save the generator model; here it is, generator_model.h5, the file I'm saving it to. Let's go ahead and run this, and on the right-hand side you can see the progress. It says 'no such file: images/mnist'; oh, sorry, sorry about this, I was running a different file, the one whose title I left as 'experimenting' (with my misspelling of experimenting). So let's run this one; of course it's exactly the same file. Let's change the epochs to 100 and the save interval to every 10 and run it, and now it should run fine; the previous run just didn't find the 'images' folder, that's all it was. So there's our print statement: discriminator loss and accuracy, and the generator loss (this accuracy of 100, I don't think we should even print it, but anyway). Let me go into the images it just output and look: this is after 0 epochs, this is after 10, after 20, and so on; even after 90 or 100 it's still random noise, nothing there.

So let me just show you my 100,000-epoch run. Here is epoch 0, and here are 2000, 4000, 6000, and so on. As you can see, these images are completely fake: no one wrote these digits; the GAN here is creating these fake images, that's all. As the number of epochs goes up, at about 50,000 epochs I actually got something that looks very realistic, and as I keep going, all the way to 90,000, 92, 94, 96 thousand, it almost looks very realistic over there. Some digits can definitely be improved upon, but that still looks like an 8. So this is how you can actually train your generator.

Just to finish this off, let me show you how you can predict using only the saved model. Here I have the model saved after 100,000 epochs, and this is the kind of thing people share when they publish papers, which you can then use to create your own images. So let's do a single image first. I'm importing the required libraries again, and all we need to do is load the model (generator_model.h5, right there) and supply a random vector; remember, that's the input, and our output will be an image. So I generate a vector of size 100, which needs to be reshaped, because if you just generate it, the vector is in the wrong orientation. Then I generate an image, which is just model.predict (we know this), with the vector as the input value: it takes this model and this input vector and predicts, and then I just do a pyplot imshow to show the image.
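A minimal sketch of that prediction step (the reversed-gray colormap and the exact array indexing are just my display choices):

    from keras.models import load_model
    import numpy as np
    from matplotlib import pyplot as plt

    model = load_model('generator_model.h5')   # the generator saved after training

    # One random latent vector of size 100, reshaped to (1, 100)
    # because predict() expects a batch dimension
    vector = np.random.randn(100).reshape(1, 100)

    X = model.predict(vector)                  # X has shape (1, 28, 28, 1)
    plt.imshow(X[0, :, :, 0], cmap='gray_r')
    plt.show()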
So let's go ahead and run it; there you go, it shows some image. Let's run it one more time; it should show something else. Looks like the number 3. These are all, again, fake images that we are generating. If you would like to do multiple images, then instead of just one vector we define multiple vectors for multiple images; that's pretty much it, and I'll share this file too. The way I'm doing it is with a function I called generate_latent_points, which takes the latent dimension and the number of samples; I think I got this code from one of the blogs I was reading (I should keep better track of these blogs so I can properly acknowledge them). Then I'm just saving these plots, and in this example we have 16 samples, 16 of these latent points, so I'm actually plotting a 4 by 4 grid here; obviously if you generate 25, it's a 5 by 5 grid. So let's go ahead and run this to end the video. There you go: our 4 by 4 grid of realistic-looking images. So now I can use this model to create MNIST-style digits.

I hope you learned something from this video. I know it's a lot, quite a bit, and it's a very long video, but I tried breaking it up; in fact, this is the fourth time I'm recording this video, because I tried breaking it into chunks of ten-minute videos, but you lose the continuity, and I thought this was the best way to communicate what I have in my mind. So hopefully it comes off as effective. Anyway, thank you very much for watching this tutorial, and please subscribe to this channel; like I said, it keeps me encouraged to create more such content. Thank you.