Transcript for:
Azure Databricks Notebooks Overview

Hello and welcome, everybody, to Cloud Fitness. I hope you are liking this playlist series, and do remember to subscribe to my channel. In today's video we are going to talk about notebooks in Azure Databricks. I have already given you a portal walkthrough and explained what exactly Azure Databricks is. In that walkthrough I explained each and every component, but now we are going to discuss notebooks in much more detail. For that, let's move on to the Azure portal.

Let me go back to the portal. If you remember, this is the Databricks workspace that we created as part of our demo in the previous video. I don't need two, so I'll just delete one of them. In fact, the reason I did not delete it earlier is to show you how to delete a workspace in case you have to. If you click on this workspace, you will see that it gives you an overview of your workspace: the details, the pricing tier, the URL, and everything. There is an option to delete as well. If you click on Delete, it will ask, "Are you sure you want to delete?" In that case we just click OK, and it will start deleting this workspace.

Do remember that whenever you have created any resources in Azure, if you're not using them, just delete them, because otherwise you'll get a long bill for all those resources. And do remember to always check the pricing before moving on to any particular resource that you might be using.

So now I'll go back to my workspace. This is the workspace that I created; the deletion of the previous one is in progress, as you can see over here. Now, in the portal, if I click on Launch Workspace, you will see
that I am logged into the Azure Databricks portal, which I showed you some time back. On the left-hand side, if you scroll, you can see you are in the Data Science & Engineering workspace. I discussed each of these in my previous video, but right now we will be focusing on the Data Science & Engineering workspace.

Click on this Workspace part. I explained that this is the place where you actually code, where you actually play with the data. If you click on Workspace, you will see that it says Shared and Users; I explained these in my previous video. Now if I click on Users, my user is bhavnapedi15 at gmail.com. If I click on it, these are the notebooks that I created quite some time back. These are different pieces of my work.

Now if I click on this drop-down, you will see that we have different options: Create, Import, Export, Permissions, and Copy Link Address. So what is each of them?

Click on Create. What does it do? It creates a notebook. If I click on Create and then Notebook, I have to give a name, so let me say "test youtube notebook". Then I have to select the default language: Scala, Python, SQL, or R. You can choose whichever language you want; I will talk more about how you can switch between languages. Let's click on Python. If you have a cluster (I'll talk about clusters as well), just select a cluster and click on Create. The moment you click on Create, it opens like this. This is called a notebook, and this is where you are actually going to work.

Before that, I'll go back to the drop-down again. Here you have the option of creating a notebook. Similarly, you
have an Import option. With Import you can import a notebook. Let me do one thing. This is a notebook I have already created, so let me click on it and click on Export. If I export it and click on Source File, the moment I do it you will see that it gets downloaded. So I have a downloaded version. Now if I go back, click on Import, and click on Browse, I can go to my Downloads folder. This is the notebook which I downloaded. I can click on it, open it, and simply import it. The moment I import it, you will see it got imported.

Now if I click on the drop-down again, I'll just do a recap of what I explained just now. Create: you can create a notebook from here. Import: I just showed you how you can import an already downloaded notebook. For example, if you are working on some project and your friend gave you a downloaded version of a notebook and you want to import it, you can import it from here.

Also remember that it lists the accepted formats. You can download a notebook in all of these formats. For us it was a .py version, a Python version. If it is a Scala notebook, it will be a .scala version; if it is a SQL notebook, it will be .sql, and so on. As for .dbc: if you go to the notebook and click on Export, there is a DBC Archive, there is an IPython Notebook, and there is HTML as well. Source File basically means that if your notebook is in Python, it will download as a .py file; if it is in
Scala, it will download .scala; if it is in SQL, it will download .sql. DBC Archive will give you a .dbc extension, HTML will give you .html, and then you have IPython Notebook. Similarly, if you want to import, you will see here that it accepts the .dbc, .scala, .py, .sql, and .r formats, and then this last one. Like this, you can import.

Similarly, you have a URL option as well. For this, let me just google "sample Databricks notebook" and open one; I'll show you how to use the URL option. I'm just opening a quick-start notebook to show you how you can use it. This is a random notebook; it doesn't have any meaning for me right now. I'm just clicking on this Import Notebook option. The moment you click on Import Notebook, you get this URL. Copy this URL, go back to your workspace, go to the drop-down, and click on Import. The first import option we have already seen. For the URL, click on URL, paste the URL here, and click on Import. And you will see... OK, it has failed to import. I don't know the reason; maybe it's not allowed, or something like that. Usually it gets imported. Let me try again in case there's a problem with the URL. OK. Usually, if you do it like this, you can import the notebook. Don't bother about this error; usually you will not get it. You just need to remember that you have this option. I'm just checking whether there is a problem with the copy. I don't think there's a problem with the
copy, but anyway, maybe the import is not allowed, or something like that. I don't want to debug this error; it's not that important right now. But you should remember that there is an option to import both via URL and from a downloaded file on your system.

Then you have Export as well. Export means you already have a notebook and you just want to download it. You can download it as a .dbc file, or as a source file in whichever language it is in, and then you have HTML.

Similarly, you have a Permissions option over here. And if I go inside, these are the notebooks; if I click here, I have a Permissions option too. The Permissions option basically lets you give access to other users. In your project you might have around 30 members. Out of those 30 members, apart from admins, if you want someone else to view your file, you need to give them permission. You just select the name from here. I don't have anybody added as of now, so I don't see any names, but if you have users you will see the user names, and if you have groups of users you will see the groups as well. Then you can give them access: Read, Run, Edit, or Manage. If they only need to read, just choose Read; if you want them to run your notebook, give them Run; if you want to give them all the permissions for your notebook, you can choose Manage. This is how you do it.

You can set permissions at two levels. You can set them at the folder level: in this folder I might have a hundred notebooks, so if you want to give someone access to all 100 notebooks, you give it at the folder level. Otherwise, for a single notebook, if you want to give
them access, then in that case you need to go to that particular notebook and select the Permissions option here. At the folder level, these are pretty much the options that you have, and then you have something called Copy Link Address for the folder path.

On the notebook, you also have a Clone option. Let me just clone this one, "bucket by". I can give it any name, so I'll just say "delete it". If I want to clone it to any other path, any other folder, I can do that. If I want to clone it into Shared, I can. Cloning basically means duplicating. If I go back to the Shared folder now, you will see it has this notebook, and it is the same notebook that I copied.

In the same way, I'll go back and show you the other options over here. Clone we have seen. Rename is very simple; you already know what a rename is. Move you also know: you just want to move it from one location to another. Move to Trash basically means you want to delete it. Let me go back to the Shared folder; I want to delete this one, so I'll say Move to Trash. Then you get a prompt asking whether you are sure you want to move it to trash, and saying that items in trash are permanently deleted after 30 days. You can simply click on Confirm and Move to Trash. Now you see it has been deleted.

If I click on this again, you have Export, which I've already defined, and Permissions, which I have already explained. Then there are Copy File Path and Open in New Tab, which are pretty common. For Copy File Path, let me just click on it. The moment you click on it, let me open Notepad++ and paste. This is what the file path is. It just copies the
relative path of your notebook. Inside your Databricks workspace you might have any number of folders. Let's create another folder, say "youtube folder". The moment I click on Create Folder, you will see that it creates a "youtube folder" over here. Now let's say I have a notebook over here. I click on it and clone it, and I want to clone it to the new folder. I click over here, this is the path, "Cloning to" this folder, and then I click on Clone. The moment I do that and open the folder, you will see that it has this notebook. Now if I click here and say Copy File Path, and paste it over here (just increasing the font), you will see Users, then my user, and then inside the "youtube folder" I have this "windows function" notebook. This is the path. So this is an overview of how the folder path and the drop-down work. Let me click on the drop-down and move this to trash, because I don't need it; it was just for your demo. Let me move this one to trash as well.

So that is the overview of the notebook options, but now let me tell you how each notebook works. We created a new notebook, "test youtube notebook"; this is the notebook that I created in front of you. If I open it, this is how my notebook looks. The very first thing that you see over here is the name of the notebook; whatever name you give will appear here. Then there is an option that says Python. Whichever language you selected while creating the notebook will appear here. If I click on this Python option, you will see that the default language is Python. Now if I click on this drop-down, I have Scala, SQL, and R.
So if I want to switch from Python to any other language, I can do it after creating the notebook, from here. It also shows a warning: changing the default language may make existing commands invalid, and it mentions %python. Let me talk about that later.

Whatever command I write, say print("good"), is a Python command. Why? Because you have defined your notebook to run in Python by default.

Now, suppose I want to create a new cell. This is called a cell. You write each command in a cell. If you have a piece of code that you want to run together, you write that piece of code in one cell. Command 1 is one cell, command 2 is another cell. If you want to create a cell, just hover here and you will see this plus symbol; click on that. If you want to create a number of cells, just keep clicking on the plus. The moment you click on it and start writing something, you will see that there is a language option showing Python here as well. Let me start the cluster in the meantime, because this takes some time.

Everything that I write here will run in Python by default. But what if I want to run it in another language, say SQL? I can simply click here and choose SQL. Now you will see that it writes %sql. What does that mean? It means that all the other commands will run in Python by default, and this command, since I have specified SQL, is going to run in SQL, and it adds this %sql command automatically. These are called magic commands. Now if I click
on this plus sign, you will see that the new cell comes up as Python again, because I chose SQL only for that particular cell. Now let me change it to Scala and see how it works. I click on SQL and choose Scala. The moment I click on Scala, you will see that it removes the %sql and writes %scala. These are called magic commands.

My notebook is in Python, so this command will run in Python. Now if I write select * from diamonds (let's say there is some table), since it is a SQL command it is not going to run; it is going to fail, because the native language of this notebook is Python, not SQL. So if I want to run any kind of SQL command, I have to write %sql. Similarly, if I want to write any kind of Scala command, I have to write %scala. In Scala, printing is println, so if I write println("good"), this will run in Scala. And you see, I am inside command 3 and the language has automatically changed to Scala.

So what are magic commands? Magic commands are %sql, %scala, %python, and %r. Within a notebook, you have selected a default language. Now you want to change the language for one particular command only. In that case you just write a magic command: %sql if you want to write SQL, %scala if you want to write Scala, %r if you want to run it in R. These commands are called magic commands.

Now, this is a piece of code; let me run it. For running it there are two options. First, you click on this particular option, this
drop-down here, where you have an option to run the cell. Run Cell will run only this command 1. Run All Above will run all the commands above it; right now there are no commands above it, so it will do nothing. Run All Below will run all the cells below it. That is one way. The second way is to click on the command 1 cell and press Shift and Enter on your keyboard. The moment you press Shift+Enter, it runs this particular command. Now you see it has run.

Now if I remove this %sql from here and try to execute it, it throws an error. It says invalid syntax, because Python does not have anything like select * from diamonds. But if I put %sql back and press Shift+Enter to run the command, you will see the output. There is a table already here (I'll talk about tables later), so it takes the data from that table and shows the output, because it has run a SQL command over it.

The same goes for a Scala command. Let me remove this %scala from here and run it. println is not present in Python, which is why it says println is not defined. But the moment I write %scala and run it, you will see that it runs and prints the output.

These are called magic commands. Within a notebook you can change the language, but do remember that it is not a very good practice to keep each cell in a different language. Try to keep your whole notebook in one language, or a maximum of two, for easier understanding. That is the best option.

Now, this is how you
write your code. If you want to write one piece of code, you write it here; if you want to run the next lines of code as another command, you write them over here; if you want a next set of commands, you write them here. Whatever output a command produces, you can see it right here. If you click on this output, you can scroll down. By default it shows only the first 1,000 rows of any output, because it's a UI. So this is how your notebook looks.

Now look at the File option at the top. Let me click here. New Notebook, if you want to create a new notebook, is just a shortcut, nothing else. Clone, if you want to clone this notebook, is the same option you saw earlier; it's kept here as well just for ease of use. Rename, Move, Move to Trash, Upload Data, Export, Change Default Language: all these are the same things, which you can do from multiple places. Change Default Language you can do even from the top. This is just for ease of use.

Then there is the Edit option. You don't need to use it much; it is not something that you will be using often, so I'll just skip it. Then View. This is also something that you mostly will not use; there is not much use in changing the views. There is something called Dark Theme. If you like using a dark theme, this is how it will look; otherwise this is the normal light theme, the one you are using.

Then there is Run All. Right now, when I press Shift+Enter, I'm running only this cell. If you want to run all the cells together, you just click on the Run All option. This
option will run all the cells. If I click on Run All, it runs all the cells one by one, and if there is any error it just stops. Here there is an error because this command, "shift plus enter", doesn't mean anything, so it gives you a name error over here.

Now, Clear. If I click on Clear Cell Outputs, it clears whatever outputs are here, so now your notebook looks a little cleaner, with no outputs. Then there are Clear State, Clear State and Cell Outputs, and Clear State and Run All. These are different options which might come in handy if you want to clear the whole state of the notebook and start all over again.

If you want to schedule the notebook, you have an option to schedule it as well, but I'm not going to talk about it now. We are going to make a separate video on it; it's a big topic with a lot of different concepts.

Also, Comments. If you want to add some comments, you can start commenting. But in fact you can also comment on your own in the code; you know how to put comments. Just give me a minute. Yes, I just forgot this is Python. In Python you write a comment by using the hash (#) symbol.
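
As a small illustration of both hash comments and per-cell magic commands, the sketch below mimics how the three cells from this demo might look when a Python notebook is exported as a source file. It assumes the usual layout of such an export: a "# Databricks notebook source" first line, "# COMMAND ----------" between cells, and "# MAGIC " in front of every line of a cell written in another language. The helper name cell_languages is hypothetical, made up for this example, and is not part of Databricks.

```python
# Assumed layout of a Python source export; the cell contents follow the demo.
NOTEBOOK_SOURCE = '''# Databricks notebook source
print("good")  # a hash starts a Python comment

# COMMAND ----------

# MAGIC %sql
# MAGIC select * from diamonds

# COMMAND ----------

# MAGIC %scala
# MAGIC println("good")
'''

HEADER = "# Databricks notebook source"
SEPARATOR = "# COMMAND ----------"
MAGIC_PREFIX = "# MAGIC "


def cell_languages(source, default="python"):
    """Return the language each cell would run in (hypothetical helper)."""
    languages = []
    for cell in source.split(SEPARATOR):
        # Ignore blank lines and the export header line.
        lines = [ln for ln in cell.splitlines()
                 if ln.strip() and ln.strip() != HEADER]
        if not lines:
            continue
        first = lines[0]
        if first.startswith(MAGIC_PREFIX + "%"):
            # "# MAGIC %sql" -> "sql": the magic command overrides the default.
            languages.append(first[len(MAGIC_PREFIX) + 1:].split()[0])
        else:
            # No magic command, so the cell uses the notebook's default language.
            languages.append(default)
    return languages


print(cell_languages(NOTEBOOK_SOURCE))
```

Because the non-Python cells are stored behind hash comments, the exported file stays a valid Python file even though two of its three cells hold SQL and Scala.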