Transcript for:
Deep Learning Application with Stable Diffusion

Folks, today we will make the most advanced deep learning application in the history of this channel. We will use Stable Diffusion to generate images from a piece of text, we will then enlarge them using super sampling techniques, and we will wrap it all in a beautiful GUI application that runs from a Docker container. And the best part is, we will delegate all the generative tasks to our GPU, making our app incomparably faster. It's the most professional and coolest workflow that I've ever simplified, and the results will absolutely blow your mind, so without any further ado, let's roll. So let's begin by installing Docker Desktop. For this we will navigate to docker.com, and from the products menu we will select Docker Desktop. Now, if you are not familiar with Docker, I have an amazing tutorial that covers it in detail, but the general idea is: we want to isolate our application from the rest of the computer. We want to make sure it runs consistently in different conditions, environments, and even operating systems. That way users don't need to install a bunch of libraries trying to recreate the exact same environment that us developers have; they can just run a Docker container with our software and all its dependencies already installed. Now, once Docker Desktop is on our system, we will go ahead and make sure that the engine is running, given this green bar at the bottom. Additionally, if you are planning to use a WSL terminal like myself, we will navigate to the settings and make sure we enable it by checking this lovely box right over here. Once we do so, we can safely navigate to our Windows Subsystem for Linux, and we will use it to clone some starter files from GitHub, where I created a very nice interface that doesn't have any functionality: there are buttons, but nothing happens when you click them, so we are basically building this project almost from scratch. For this we will copy the HTTPS address of my repository and we will paste it inside our terminal following a git
clone command. Then we will navigate to the brand new directory we now have on our Linux drive, with cd followed by the stable diffusion GUI app folder name. In addition, we will navigate to the starter files folder, which from now on I'll refer to as the root directory of our project. Now, to make it work, we will need to set up a Docker container, and the fastest way to do so is by using docker init. Unlike previous tutorials, where we manually typed a bunch of code into our Dockerfile, we are now automating this process: docker init will ask some questions, and based on our answers it will generate all the files that our container needs. In our case it already detected that the platform we use is Python, so let's just press Enter. And while the officially recommended Python version is what you should specify on your end, I will go for Python 3.10 instead; I do it on purpose, and you will see towards the end of the video why. Then the port of our choice would be 8000, and the run command we are planning to use is python3 app.py. And beautiful, Docker successfully initialized all the important files, there you go. We can now run our application by copying this lovely command of docker compose up --build. Once our container is ready, we will open our browser and navigate to localhost at port 8000, and boom, here's our beautiful starter application that doesn't really do much. So let's investigate it a bit, and the first thing we'll do here is enter some kind of prompt, let's say, um, Canadian bear eating fish in the river. Now, once we hit the generate button, our app should produce three different images based on our prompt, but at this point the only thing it does is collect the user input and print it to the console. You can of course view it inside the terminal, or alternatively, if we navigate to Docker Desktop, then to the containers bar, choosing our starter files server, we can find our user input in the logs as well, so the input is already being collected.
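For reference, the compose file that docker init generates for these answers looks roughly like this. Treat it as a sketch, not the literal output — docker init's format changes between versions, and the service name may differ on your end:

```yaml
# Hypothetical sketch of the compose file docker init generates
# for a Python project, answering: port 8000, command "python3 app.py".
services:
  server:
    build:
      context: .
    ports:
      - 8000:8000   # the port we chose during docker init
```

The run command itself ends up in the generated Dockerfile's CMD line rather than in the compose file.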
Then we just need to convert it from a piece of text into an actual image. For this we will need a generative AI model, which in our case is Stable Diffusion version 1.4. But how exactly can we use it? Well, approach number one is what we've done in previous tutorials, where we simply copied this type of code and were pretty much good to go. The only problem with this approach is that it takes way too long to load and operate. So approach number two is downloading this entire model onto our local computer, and it does require some disk space, specifically, I am talking about 40 GB of it, which I know sounds like a lot, but very often when you're dealing with these types of models you'll be looking at hundreds of gigabytes instead, so as far as I'm concerned, it's just a baby. Now let's go ahead and copy the URL of this model card (you can check out the description for all the links). We will then navigate to our terminal, shut down our container with Ctrl+C, and download Stable Diffusion with git clone followed by the URL of our model card. And as you may guess, it will take quite some time to download, and if your download only took a few seconds, it obviously means that something is wrong; I mean, these numbers are not really adding up to 40 GB. So let's fix it. We will need a package called git-lfs, as in Git Large File Storage, which we will install with sudo apt-get install git-lfs. We will hit Enter, then we will call git lfs install, then we'll of course get rid of the incomplete directory we just downloaded, let's delete it, and now we can go ahead and clone our model once again, this time properly: git clone followed by the Hugging Face repository. Great. Now, once our model is done downloading and we can find it inside our root directory, we can go ahead and update our requirements file, adding a library named diffusers, which uses the Transformers architecture, as well as the torch deep learning framework, also known as PyTorch. Lastly, we will add the
accelerate module, which will help us with loading our model on the CPU in a memory-efficient way. Now let's save it, and we can finally start coding. So inside our app.py file, from diffusers we will import StableDiffusionPipeline, in upper camel case. Then, right below our secret key, we will create a new variable named pipeline, which we will assign to the StableDiffusionPipeline class, calling the from_pretrained method on it. Into this method we will pass the path to our model, which in our case is ./ (which represents the root directory) followed by the name of the model repository we cloned, so let's just copy it from our file system, and there you go, this is where our model now lives. Now, this pipeline produces very basic images; they are nice, but they're not as realistic as we'd like them to be. So let's fix it by typing pipeline and calling the enable_freeu method on it. This method allows us to customize four different quality-control parameters, and in my case I'll just use what's recommended for the Stable Diffusion 1.4 model; I'll copy them directly from the FreeU repository, and if you're curious to see some examples, this is the type of quality improvement that you'll get. Awesome. So let's go ahead and paste these parameters inside our enable_freeu method, replacing every colon with an equals symbol. Now, to produce our image, we will find the app route where we collect our user input and print it to the page, sorry, to the console. Right underneath our print statement we will type pipeline, to which we will pass our user input, so let's just copy it from the print statement right above and pass it right over here. Then we can assign this expression to image, and at this stage let's just see what this data structure looks like; we'll have a better idea of what we're dealing with here. Let's save our file, go back to our terminal, and call docker compose up --build once again. We will then navigate to our browser and then to our app.
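As a sketch, the pipeline setup described above looks something like this. The model path and the FreeU numbers are assumptions on my end — take the exact values from the FreeU repository for your model version:

```python
# Sketch of the Stable Diffusion setup from app.py.
# MODEL_PATH and FREEU_PARAMS are illustrative assumptions, not the
# video's literal values -- check the FreeU repository for your model.
MODEL_PATH = "./stable-diffusion-v1-4"   # local clone of the model card
FREEU_PARAMS = {"s1": 0.9, "s2": 0.2, "b1": 1.1, "b2": 1.2}

def build_pipeline():
    """Load the local model and enable FreeU (requires diffusers + torch)."""
    from diffusers import StableDiffusionPipeline  # heavy import, kept local
    pipeline = StableDiffusionPipeline.from_pretrained(MODEL_PATH)
    pipeline.enable_freeu(**FREEU_PARAMS)          # quality-control knobs
    return pipeline
```

Calling the pipeline with a prompt then returns an object whose images list holds the generated pictures, which is exactly what we inspect next.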
We will remove the prompt route from the URL. Then we will enter the same input from before, and after quite some time our image was generated, so we can navigate to our terminal, where we can find our output. Now, this output has two different attributes. The first one is images, and it stores a list with a single image item, which is exactly what we're looking for. The other attribute stores whether our image was flagged as not safe to view, and in our case it is perfectly safe, because we want this value to be False. Great. So now we can go back to our code, focus on the images attribute of our pipeline output, and since it stores a list, we will focus on the first and only item in this list. Then, instead of printing this image, we will go ahead and save it with image.save, and we will call it, at least at first, output.png. Now, currently, if we make changes to our Python file, our container doesn't automatically detect them; we would need to shut it down and rebuild it time and time again, which is probably not ideal. So let's fix it. The first thing we need to make sure of is that our debug mode is set to True, which is of course true in our case. If so, we will navigate to our Docker Compose file, where we will add a new attribute of volumes, and we will use this attribute to mount the local root directory ./ to the container directory of /app. But wait, how come it is /app? Where exactly can we find this detail? Well, let's save this file and navigate to our Dockerfile, where we can see that the working directory of our container is indeed /app, and that's the only reason why we chose it; if it says something else on your end, that's what you should write instead. Now, if we're already here, let's have a look at another detail. By default, docker init creates a special kind of user called appuser, and this user has very limited access: they can use the app, but they cannot make any changes, such as saving images, which is a problem in our case. So let's go ahead and remove this unprivileged app user
replacing it with the system administrator, also known as root; otherwise you will get a permissions error that looks like that. And great, now we can save our file, navigate back to our terminal, and shut down our container for the very last time with Ctrl+C. We will then rebuild it with all the changes we just made, enter our original prompt, and navigate to our root directory, where we will find our lovely Canadian bear, yay! Now let's apply the same logic to three different images, storing them in the appropriate location of static/images, and there you go, we can find our logo and our placeholder question mark image here. So back in our code, inside app.py, right above our image variable, we will go ahead and create a for loop with for i in range(3). Then we will indent our image pipeline as well as the image saving command, and instead of output.png we will navigate to static/images and give unique names to each of our images. We will do so with demo_image, to which we will concatenate the string instance of i, which is our iteration variable from above; it is either 0, 1, or 2, which will be the unique portion of the image names. Then we will wrap it up with the file extension of .png. And if we're already here, we may as well display those images on the page, so let's just copy the names of these images and paste them right underneath, inside the prompt images list comprehension that we are passing to our render_template return statement. So instead of displaying our placeholder image three times in a row, we'll just go ahead and display the actual demo images, and since this list comprehension also uses the iteration variable of i in the range of 3, everything should work like a charm. So let's save it, let's go back to our browser, and let's give our Canadian bear another try. And wow, look at these beautiful, beautiful bears! Pretty impressive, right? But the only problem is, it took way too much time to produce these three images.
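Putting that loop into a compact sketch (the helper names are mine, not from the video):

```python
def demo_image_paths(n=3):
    # demo_image0.png, demo_image1.png, demo_image2.png under static/images.
    return [f"static/images/demo_image{i}.png" for i in range(n)]

def generate_demo_images(pipeline, user_input):
    # Mirrors the for loop described above: one pipeline call per image,
    # saving the first (and only) image of each result under a unique name.
    for path in demo_image_paths():
        image = pipeline(user_input).images[0]
        image.save(path)
```

The same demo_image names can then feed the list comprehension that is passed to the render_template return statement.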
And our CPU sounded a bit like a jet fighter in the process. So how exactly do we solve it? Well, why don't we use our GPU instead? Now, the next trick will only work if you have a CUDA-compatible, Nvidia-based graphics card, so if you can find it somewhere here on the list, then great, you can follow along and continue with the next steps; otherwise, please skip to the super sampling part, and I will see you in about a minute. Now, to access our GPU from Docker, you will need an Nvidia driver, which I assume you already have, because it's very rare if you don't. Additionally, we will need a container toolkit, which we can install using the following guide. We will first copy the configuration command, and we will paste it inside our terminal, shutting down our container of course; once it is shut down, we can paste it. Then we will go ahead and update our package repository, as in apt-get, and now we can finally install the container toolkit. We will then scroll down and configure the container runtime. Lastly, it is recommended to restart Docker, so let's just do it: we will copy this sudo systemctl restart command and paste it back, and if you cannot do it from your terminal, just like in my case, we can do it from Docker Desktop instead. Let's just hit the quit Docker Desktop button, confirm we would like to quit, and open it once again, yay. So now we've set up our GPU, but we also need to notify Docker. To do so, we will navigate to another installation guide, this time the Docker GPU support guide. We will scroll down and copy the Docker Compose code, right from deploy down to the very end. Then we will of course paste it inside our compose file, right underneath volumes, and please make sure that the deploy attribute is at the same indentation level as the volumes attribute; otherwise you will get an error. Great, so we're officially good to go. How do we know? Well, let's save this file and go back to app.py.
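To recap, after both edits the compose file looks roughly like this — the volumes mount from the hot-reload step plus the deploy block from the GPU support guide, at the same indentation level. The service name and exact layout are assumptions on my end:

```yaml
services:
  server:
    build:
      context: .
    ports:
      - 8000:8000
    volumes:
      - ./:/app          # mount the root directory into the container
    deploy:              # same indentation level as volumes
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
```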
At the very top, we will import PyTorch, and we will print whether torch.cuda.is_available(). Let's save this file, navigate back to our terminal, and boom, CUDA is available indeed, given this lovely True statement at the very top, right above loading our pipeline components. Now, if it is False on your end, please leave me a detailed comment below so I can help you with troubleshooting, and the emphasis is on the detailed. Now, to delegate our tasks to the GPU, we will first turn our Stable Diffusion pipeline into a multi-line command, just so it is easier to look at. We will then call the to method on it, and we will send it to a device named cuda. And now, if we save our file and navigate back to our app, our images are ready within seconds, not minutes, yay! And perfect, we are almost done; we just need to find a way of enlarging these images, because right now they're pretty small. For this we will need another deep learning model called EDSR, as in Enhanced Deep Super-Resolution network. From their GitHub repository we will navigate to models, and we will then choose how much we'd like to zoom in; in my case I'll go for x4, turning 512 pixels into 2048 of them. Then, with a right click on the Raw button, we will copy the link address. Now, since we don't need the entire repository, we won't be using git clone this time, but a command called wget, followed by the URL we copied. Once the download is complete, we will take care of the requirements, which in the case of our super sampling model would be opencv-python as well as opencv-contrib-python. But that's not all: if we try to install OpenCV with these requirements only, then we will get the following error. To fix it, we will need to install some operating system requirements that cannot just be installed with pip, so specifying them here is not going to work. Instead, we will navigate to our Dockerfile, copy the following command from the description, and paste it right after we pip install our requirements file now.
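Before moving on, here's a compact sketch of the GPU handoff we just wired up. The helper names are mine, and it assumes torch is installed:

```python
def pick_device(cuda_available):
    # "cuda" delegates the generative work to the GPU; "cpu" is the fallback.
    return "cuda" if cuda_available else "cpu"

def move_pipeline(pipeline):
    """Send the Stable Diffusion pipeline to the GPU when CUDA is visible."""
    import torch  # heavy import, kept local
    print("CUDA available:", torch.cuda.is_available())
    return pipeline.to(pick_device(torch.cuda.is_available()))
```

With the pipeline on the cuda device, each generation drops from minutes to seconds, as shown in the video.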
If we save everything, and we'll make sure I save the requirements as well, because I can't remember, we will then navigate to our terminal and rebuild our container with docker compose up again. And now we can finally navigate to app.py and import cv2, also known as OpenCV. Now, OpenCV is a computer vision library that we've used many times on the channel before, but never in the context of loading neural networks, so let's see how it works. First, we will create a super sampling object with cv2.dnn_superres, as in deep neural network super resolution, followed by a dot and an abbreviation of DNN super resolution implementation: DnnSuperResImpl_create. We will then assign this expression to a variable named super_res. Next, we will load our model into this object with super_res.readModel, in camel case, and we will pass the name of our model as it appears in our file system, oh sorry, into this lovely method of ours, so let's just copy it and specify it in a string. Lastly, we will need to set this model with super_res.
setModel, where the first argument is edsr and the second argument would be the scaling factor of 4, because we are using the x4 model. And great, our model is ready to go, and we can take care of the image that it's supposed to enlarge. For this we will search for the super sampling route, which is the third and last route of our application, right above if __name__ == "__main__", and in this route we will type cv2.imread, as in image read, to which we will pass one of our demo images. But wait a second, how do we know which of our demo images was selected? I mean, we have three of them, right? Well, judging by this print statement above, we see that we need a bit of investigating, so let's comment out this incomplete line of code, save our file, and navigate back to our app. So what happens if we click on the leftmost save button? Well, first of all, our demo images disappear and are replaced by the placeholder, which is something we'll need to fix, but second of all, if we navigate to our terminal, we can see a message that save button 0 was clicked. If we click on the center button, then we see that button number 1 was clicked instead, and same goes for the rightmost button, which returns button number 2, which is a perfect match to the way we named our images: we have image 0, image 1, and image 2, which allows us to load each of our images one at a time. So back in our code, we will go ahead and remove our hash sign from earlier, uncommenting the line, and we will pass the following string into our image read command: it will start with static, followed by a slash, then images, another slash, and demo_image, to which we will concatenate the identification number of our image, which we'll just copy from the print statement, and just to be on the safe side, we will convert it into a string. Then we will wrap it up with a .png extension, saving this very long line of code as demo_image. And now that we have a loaded image, we can pass it into our super sampling model. We will do this with
super_res.upsample, to which we will pass our demo image. Then we will assign this expression to xl_image, as in extra large image. Now, the only problem is, the upsample method returns a NumPy array, which is a mathematical structure rather than an actual image, so let's quickly convert it into one by using the Image class, with a capital I, from the Pillow module, and calling the fromarray method on it, passing our NumPy array of xl_image into it. We can then assign this expression back to xl_image, overwriting the original value with the new one. And now that we're dealing with an actual image, we can save it with xl_image.save, and at first we will call it xl_output.png; we just want to make sure it works. So let's save our file, and before we forget, let's go ahead and take care of our placeholder images. To do so, we'll just copy our list comprehension from the prompt route and paste it inside the super sample route, easy peasy. Let's save everything, go back to our browser, and generate some bear images once again. We will then pick our favorite image, in my case this guy over here, let's save it, and back in our file system, specifically our starter files folder, we can find our extra large image, and it has the right size; I mean, it's clearly bigger than 512 pixels. But the only problem is, our colors are way off. Now, the thing to remember with OpenCV is that instead of using the common RGB format, it uses the much less common BGR format, so our color channels are completely flipped. To fix it, we will close this image, navigate back to our code, and convert our image to the right format right before we pass it into our model. To do so, we will type cv2.cvtColor, where the first argument is our demo image and the second argument is the type of conversion we'd like to perform, in our case cv2.COLOR_BGR2RGB, in all caps. Now let's assign this expression back to demo_image, overwriting our blue bear with the right colors.
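The whole super sampling step, including the color fix, can be sketched like this. The file name is an assumption on my end, and it requires opencv-contrib-python plus Pillow:

```python
EDSR_MODEL = "EDSR_x4.pb"   # the x4 weights fetched with wget (assumed name)
SCALE = 4                   # must match the downloaded model variant

def upscale(path_in, path_out):
    """Enlarge one demo image with EDSR and save it as an RGB PNG."""
    import cv2                      # heavy imports, kept local
    from PIL import Image
    super_res = cv2.dnn_superres.DnnSuperResImpl_create()
    super_res.readModel(EDSR_MODEL)
    super_res.setModel("edsr", SCALE)
    image = cv2.imread(path_in)                      # OpenCV loads as BGR
    image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)   # flip channels first
    xl_image = super_res.upsample(image)             # returns a NumPy array
    Image.fromarray(xl_image).save(path_out)         # array -> actual image
```

Doing the BGR-to-RGB conversion before saving through Pillow is what keeps the bear from coming out blue.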
So let's save it, and if we go back to our app and try to save this image once again, then back in our file system our bear image now has the right size and the right colors. Now, the only problem is, we need to save a collection of images rather than overwriting the same file time and time again, so instead of extra large output we need to give our images unique names. One way to do so is by using date and time, so let's navigate to the top of our code, and from datetime we will import datetime. Then, right below, we will assign datetime.today() to a variable named image_id, and just to stay on the safe side, let's also convert it into a string. At first, let's go ahead and print it with "image ID", to which we will concatenate our variable image_id. Let's save it and navigate to our terminal, where we can find our unique combination of digits, counting the seconds as well as the microseconds, so it's very unlikely that one image will override the other. Now, the only thing I don't like here is all those punctuation symbols, as well as the spaces, so let's get rid of them, and one way to do so is by using a chain of replace methods, where at first we'll replace a colon with an empty string, and we will do the same for the dash symbol, the space symbol, and lastly a dot. If we save our file once again and navigate to the terminal, our ID only stores a combination of digits, still unique, but no punctuation detected. So let's use it as part of our image name. For this we will copy these two lines of code, actually, let's cut them, and paste them inside our super sampling route at the very top; we don't really need the print statement, so let's just delete it. Now, one more thing we can do is store our extra large images in a special directory, so let's create it first: let's navigate to static/images, and we will create a new folder, and let's call it saved. Then, instead of extra large output, we will first navigate to our new saved directory, and we will call our images image
underscore, to which we will concatenate our image_id, and we will finish this string with a .png extension. Now, if we save our file, we can then see our complete app in action, so let's navigate there, let's create some more Canadian bears, let's save one of our images in super resolution, let's save another one, and if we check our file system, we can see that our images are saved in a special directory under unique names, yay! Now, to be fair, we can make the super sampling process a bit faster by connecting OpenCV to CUDA, but it's quite the process, so if you'd like to see how, please leave me a comment below. Now, before we can share our app, we need to take care of licensing, and the idea is, because we are using models that somebody else made, we also need to follow their rules. In the case of our app we have three different licenses, where the first one is the EDSR license, which you can find on GitHub, so let's click on the license and copy the raw link, just like earlier, and then paste it in our terminal following a wget command. Then we will go ahead and rename our new license file, and we will call it EDSR license, just for extra clarity. We will do the same for our Stable Diffusion license, clicking on this license tab on Hugging Face, expanding it by clicking on read more, and then clicking on this files button; from this repository we can go ahead and expand the license.txt file, copying this beautiful URL from up there and wget-ing it. Lastly, we will navigate to the FreeU repository, which also requires a license, and let's wget it as well, and just like before, we will rename it to FreeU license, beautiful. Additionally, we will get rid of both of the output images that we were testing with, and same goes for our three demo images, because they'll be automatically generated by the app, so let's delete them. And lastly, we will delete our saved image collection, but make sure you keep the saved directory itself, because our app will not run without it.
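While we're tidying up, here's the unique-naming trick from earlier as a compact sketch (the helper names are mine):

```python
from datetime import datetime

def unique_image_id(moment=None):
    # Digits-only timestamp, down to the microseconds, so two saves
    # are extremely unlikely to collide.
    moment = moment if moment is not None else datetime.today()
    return (str(moment).replace(":", "").replace("-", "")
                       .replace(" ", "").replace(".", ""))

def saved_image_path(image_id):
    # Final destination inside the special "saved" directory.
    return f"static/images/saved/image_{image_id}.png"
```

The chain of replace calls is exactly the one built in the video: it strips the colons, dashes, spaces, and the dot from the stringified timestamp, leaving only digits.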
It's very, very important. Now, once we've made a few changes to our files, let's go ahead and rebuild our container with docker compose up --build for the very last time. Once our image is built and updated, we can then publish our app on Docker Hub. For this we will go ahead and create a new repository, which in my case I will call diffuse-me; we will also fill in some basic information, I'll just do it off camera. And once we have a remote repository name, we will navigate to our terminal and find out the local name of our image. We will do so with docker images, and in my case the name is starter-files-server. Now, we need this name to be a perfect match to our remote repository name, so let's go ahead and rename it using docker image tag, followed by the local repository name, we'll just copy it, and with a colon we will indicate the local repository tag, which in my case is latest. Then we will specify the remote name, in my case mariasha/diffuse-me, and lastly we will give it a tag of 1.0, because it is the first version of my image. Let's go ahead and hit Enter, let's type docker images once again, with no typos, and now we see a new instance of our image that matches the name of our remote repository. In this case we can just type docker push, followed by the remote repository name and the tag of 1.0, boom. Now, if your access to Docker Hub is denied, you'll just need to log in with docker login, specifying your username, mariasha, as well as your password, and once your login is successful, you can then go ahead and call your push command once again; this might take you a bit of time. Now, once our image is pushed, we will navigate to Docker Hub, refresh the page, and our app is officially public. But what happens if somebody else wants to use it? Let's test it. For this we will get rid of our local image instances by navigating to Docker Desktop and then to this lovely troubleshoot section, where we will purge our WSL 2 data, and once our purge is completed, we'll then go ahead
and get back to our terminal, and let's test our docker images once again, where we get nothing in return, which means that we get the same starting point as our users. Next, we will go ahead and docker pull our remote repository at the tag of 1.0. To make sure this image is safe to use and doesn't have any vulnerabilities, we will use a static analysis tool called Docker Scout. We get it automatically through Docker Desktop, and we can access it with docker scout quickview, followed by the name of the image and the tag of 1.0. Once our image is done scanning, we can see the total number of high, medium, and low vulnerabilities, where in my case the vast majority of them have to do with the version of Python I chose. Now, if you'd like to see some detailed information, I recommend copying this docker scout cves command, but on my end I only care about the recommendations, so let's copy this one instead, paste it in our terminal, and hit Enter. And as you may guess, my recommendation is to change the base image, either to python:3.12-slim, or to python:3.11-slim, or to an Alpine version. Now, on my end I'm not too worried about it, but if you are more picky and would like to fix it, please navigate to your Dockerfile, scroll up until you find your base image command, and change 3.10 to 3.12, as well as slim to alpine, and boom, now Docker Scout will be extra happy, and your app will be more secure than before. Now, once we've scanned our image and found that it's safe, we can finally run our application. To do so, we will create a new directory that has nothing inside it, with mkdir followed by the name of test; we will of course navigate there with cd test, and we can finally run our app with docker run, setting the --gpus flag to all. Then we will set the ports flag to expose the local port of 8000 to the container's port of 8000 as well. We will also use the volumes flag, which sets the current local directory of ./ to receive information from the
container's directory of /app/static/images/saved, and that way we are only storing our high-definition images on our local file system; everything else will remain inside the container. We don't need our Docker files, our Python files, or our 40 GB of Stable Diffusion tools, no thank you, we only care about the high-definition images. Okay, now, lastly, we'll specify the name of our image, which in my case is mariasha/diffuse-me at the tag of 1.0. Let's run it, let's fix my typo, because --gpus requires two dashes and not one, and this time let's go for something a bit less realistic: let's go for a unicorn cat on the beach, a high definition photo at dusk. Oh wow, look at these beautiful kitties! So let's go ahead and save one of them on our system, let's navigate to our new test directory, where we can find our lovely unicorn kitty image in high definition, and yay, everything worked, it looks amazing, good job! And thank you so much for watching. If you found this video helpful, please share it with the world, and don't forget to leave it a huge thumbs up and all kinds of comments. If you'd like to see more videos of this kind, you can always subscribe to my channel and turn on the notification bell. I'll see you very soon in some more amazing videos, and in the meanwhile, bye-bye!