Hello and welcome, everybody, to Cloud Fitness. In today's video we are going to continue our Databricks series, and I am going to show you how you can parameterize your jobs and productionalize them in Databricks using the Databricks utilities. Let me go to the Databricks portal and show you how you can actually do this.

For example, I have this notebook already; I ran it just before making this video. In this notebook I am reading a Databricks open data set, doing some transformations on top of it, and then writing the result back to a table in a Snowflake data warehouse. Now, if this is a job for you and you want to move it to production, how do you productionalize and parameterize it? That is what we are going to learn in this video.

If you look at command number one, it is just importing some PySpark modules that I am going to use in this notebook.

The best way to productionalize a notebook like this is to use widgets: dbutils.widgets. dbutils is nothing but the Databricks utilities, and through it you can use widgets. Widgets are like what you see on a website, where you have a drop-down or a box where you can type text; those are your widgets. These widgets are of different types: you can use a text widget, where you just type in the text, you can use a combo box, and so on. We are using a drop-down here, which is why we have written dropdown. Now, in
this call, the first thing you do is name your widget. The name of my first widget is database, because this is the database where I want my table to be created. Then there is the schema where I want the table to be created, and the name of the table I want to create. These are the names of my widgets, which I will be using throughout. And this is the value of the widget: test database is the value currently selected in the widget, and it is the value that is used by my code. That is how it works. Since it is a drop-down widget, I can click on the drop-down and select whichever value I want. I will also show you how you can create a job out of it and use it.

So you create the widgets like this. Once the widgets are created, I connect to Snowflake. We have seen all these connection details in my previous video as well: I have a URL, a username and a password, and similarly I have a database. But instead of typing the database name directly, I am calling the widget.

So the very first thing I have done is create a widget. To create a widget you use the dbutils.widgets API: you say dbutils.widgets.dropdown, then you give the widget name, and then you give the values. The first value after the name is the default value, so test database is the default, and after it you list the values to choose from; you can type in any number of values there. Let me just remove this part, because we don't need it. Now, these are just the connection details.
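The widget setup described so far can be sketched roughly as follows. This is an illustration under assumptions, not the notebook from the video: the widget values (TEST_DATABASE, PUBLIC, AIRLINE_2, COMPUTE_WH), the choice lists, and the account URL and credentials are placeholders, and the small _LocalWidgets class is a hypothetical stand-in so the snippet also runs outside Databricks. Inside a Databricks notebook the real dbutils object is picked up instead. The sfUrl, sfUser, sfPassword, sfDatabase, sfSchema and sfWarehouse keys are option names of the Snowflake Spark connector.

```python
class _LocalWidgets:
    """Hypothetical stand-in for dbutils.widgets (illustration only)."""

    def __init__(self):
        self._values = {}

    def dropdown(self, name, defaultValue, choices, label=None):
        # Same argument order as dbutils.widgets.dropdown:
        # widget name, default value, list of choices, optional label.
        self._values.setdefault(name, defaultValue)

    def get(self, name):
        # Returns the current value of the named widget.
        return self._values[name]


class _LocalDbutils:
    """Hypothetical container exposing .widgets like the real dbutils."""

    def __init__(self):
        self.widgets = _LocalWidgets()


try:
    dbutils  # provided automatically inside a Databricks notebook
except NameError:
    dbutils = _LocalDbutils()  # local fallback, assumed for this sketch

# One widget per value we want to change without editing the code.
# All values below are assumed placeholders.
dbutils.widgets.dropdown("database", "TEST_DATABASE", ["TEST_DATABASE", "PROD_DATABASE"])
dbutils.widgets.dropdown("schema", "PUBLIC", ["PUBLIC", "STAGING"])
dbutils.widgets.dropdown("table", "AIRLINE_2", ["AIRLINE_2"])
dbutils.widgets.dropdown("warehouse", "COMPUTE_WH", ["COMPUTE_WH"])

# Connection options read from the widgets instead of hard-coded strings.
sf_options = {
    "sfUrl": "<account>.snowflakecomputing.com",  # placeholder
    "sfUser": "<user>",                            # placeholder
    "sfPassword": "<password>",                    # placeholder
    "sfDatabase": dbutils.widgets.get("database"),
    "sfSchema": dbutils.widgets.get("schema"),
    "sfWarehouse": dbutils.widgets.get("warehouse"),
}

print(sf_options["sfDatabase"], sf_options["sfSchema"], sf_options["sfWarehouse"])
```

Because every environment-specific name is read through dbutils.widgets.get, switching database or schema means changing a widget value rather than editing the connection code.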
Now, in these connection details, when I have to get the value of a particular widget, how do I get it? I get it using the dbutils.widgets.get API. I call get and mention the name of my widget; the name of my widget is database, which I defined up here. The moment I do that, I get the value test database. I could also simply hard-code this value as test database, but we do not want to do that, because we are trying to productionalize our job. Let me just put it back. So this is how you create a widget and then use it.

After that, I take a data set and create a data frame; this is how my data frame looks. Then I do a filter operation on the data frame, and after the filter I do some group-by and aggregate operations. We have already talked about all of this in detail in my previous videos. Next we use a window function to get the top five airlines that have been delayed the most. This is how my end result looks after the ETL operations: these are the airline names, this is the count of how many times the departure was delayed, and these are the top five airlines with the most delayed departures.

After this, I write the result to a Snowflake table. We have discussed all the Snowflake details before as well, since this video is a continuation of the previous ones. I am writing to a table called airline 2. I have already run this, and you can see that in the test database, in this schema, the airline 2 table got created. If I do preview data, you can see my data there.
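The write step at the end of the notebook could look roughly like this. It is a sketch under assumptions: write_to_snowflake is a hypothetical helper name, df stands for whatever Spark DataFrame holds the top-five result, and sf_options is a dictionary of Snowflake connector options such as sfUrl and sfDatabase. The format name snowflake is the short name available on Databricks; the notebook in the video may be written differently.

```python
def write_to_snowflake(df, sf_options, table_name, mode="overwrite"):
    """Write a Spark DataFrame to a Snowflake table (hypothetical helper).

    df:         a pyspark.sql.DataFrame, for example the top-five result
    sf_options: dict of connector options (sfUrl, sfUser, sfDatabase, ...)
    table_name: target table; can come from a widget instead of a literal
    mode:       Spark save mode; "overwrite" replaces existing contents
    """
    (
        df.write.format("snowflake")    # Snowflake data source on Databricks
        .options(**sf_options)          # connection details
        .option("dbtable", table_name)  # target table
        .mode(mode)
        .save()
    )


# In a Databricks notebook the call might look like this (names assumed):
# write_to_snowflake(top5_df, sf_options, dbutils.widgets.get("table"))
```

Taking the table name as an argument means the same write code serves any table, so nothing in this step has to be edited when the target changes.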
This is how my data looks after I run this. So that is how we can parameterize our job, which was the main goal. But now let's say I want to create a job out of it. How are these widgets helpful there? They have to be useful for something; it is not just about writing them in the code.

If I click create job, I first have to specify the job name; I'm just giving a random name. Then you select the task type, which is notebook, and after that you select the notebook itself. Let me select the notebook that I have. Okay, let me go back and create the job again: let me give a job name and a task name here, and then select the notebook and confirm it.

There is a cluster option here, and you can select whichever cluster you want. Ideally you should use a job cluster for this, but since I already have a cluster, I'm going to use that one.

Now, you see this parameters option here: this is where the widgets you have created come into play. Let me go back to the notebook and open it in another tab. You can see the widget names and their values in the notebook; these are the parameters I have to specify. So let me copy the name database, and say the first parameter is database, the second parameter is schema, the third parameter is table, and the fourth parameter is warehouse. That is how you add a parameter, and then you have to define its value. For database, let me type in test database. For schema, let me name it schema. For the table, let me give it a new name, or let me
just keep the same name, airline 2, and the warehouse is compute_wh. Let me check the name of my table in the notebook, airline 2, and put it in here. So this is how you pass the parameters, and then you simply click create. The moment you click create, your job gets created. There is no schedule on it yet; you can add a schedule for your job. Let me go back to the job: this is how the job is created, and it has all those parameters.

Now let me go back to the command where I am writing to Snowflake. While writing to Snowflake I am specifying the table name directly as airline 2, with the mode set to overwrite. Instead of hard-coding it, you can use dbutils.widgets.get here too, just like before, and get the table widget. In fact, let me do that right now: here it will be table.

So this is how you productionalize your job. Now you have this job, and whenever you want to run it with a different table name, or load into a completely different database or schema, you can just change the values here in the job itself, rather than going and changing them in the code, and you can run it directly from here. Whatever values you have specified here are the values that will be used by the job run. They always take priority over whatever you have specified in your notebook. Always: whatever you have specified in
the parameter section, those values are always going to take priority. So this is the job that is going to run. I have already started the job run, and you can see that it is running. Let me open it and see how it goes: it is still running, it has already run the first few commands, and it is working through the rest.

What I wanted to show you here is how you can use the utilities, how you can use widgets for your purpose. You don't need to go into the code and change the values again and again. If you have a table name, a database name, or any other such name, you can simply create a widget for it, add it as a parameter to your job, and change that parameter through the job itself rather than in the code.

As for the widgets themselves, they come in different types: text, drop-down, and so on. Usually you do not need all that variety; a text widget or a drop-down widget is more than enough. How do you create a widget? You say dbutils.widgets; if you are creating a drop-down widget you say dropdown, and if you are creating a text widget you say text. Here, database is the name of the widget, this is the default value the widget will take, and these are the values the widget can have. When you want the value from the widget, you use dbutils.widgets with the get API and pass the widget's name. That is how you use it.

This command will run, and eventually it is going to write to the table that we have specified, which is airline 2, and in the
option we have set mode equal to overwrite, so our mode is overwrite. That means the data already present in the table will be replaced, and the table will be reloaded.

I hope you understood how we can use widgets in Databricks and how they can be put to good use in your workloads. Do let me know in the comment section if you have any doubts, and do remember to like, share and subscribe to my channel. Thank you so much for being here.
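For reference, the job set up through the UI in this video can also be written down as data. The sketch below builds a settings dictionary in the shape used by the Databricks Jobs API, where a notebook task carries its parameters under base_parameters. Everything specific in it is an assumption for illustration: the job name, task key, notebook path, cluster id, and the parameter values are placeholders, and the snippet only prints the payload rather than sending it anywhere.

```python
import json

# Parameter names must match the widget names used in the notebook.
# The values are assumed placeholders.
notebook_parameters = {
    "database": "TEST_DATABASE",
    "schema": "PUBLIC",
    "table": "AIRLINE_2",
    "warehouse": "COMPUTE_WH",
}

job_settings = {
    "name": "airline-delays-to-snowflake",  # hypothetical job name
    "tasks": [
        {
            "task_key": "load_top_delayed_airlines",  # hypothetical task name
            "existing_cluster_id": "<cluster-id>",    # placeholder
            "notebook_task": {
                "notebook_path": "/Users/<you>/<notebook>",  # placeholder
                # These values override the widget defaults at run time.
                "base_parameters": notebook_parameters,
            },
        }
    ],
}

print(json.dumps(job_settings, indent=2))
```

Keeping the parameters in one dictionary makes it easy to see which values a run will use, and changing the target table or schema is a one-line change to the job rather than to the notebook.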