Transcript for:
Power Query Workshop Recap

so uh quick question everybody where are you joining from that's the question number one which city which country are you joining from and uh do you work with power query using the user interface or do you actively write M code answer that in the comments Bangalore nice weather I really like Bangalore hey Deva what's up Emma Dubai what's up my man odia from Portugal Philippines Darren Fort Collins excellent okay the chat is a bit too fast for me to catch up Spain with what's up guys thank you so much for taking out the time um I don't know what time are you joining in your location but it's such a privilege to have all of you uh how many do we have like 80 odd people excellent Zambia Colombia joining in from Dubai again South Africa Gujarat in India okay from Mumbai uh Google is joining from Chennai from London from Hyderabad from Pune from Puerto Rico from south of France wow and of course the second question you forgot do you actively work with power query user interface or do you write the M code okay Detroit Philadelphia Kuwait a lot of people excellent about 95 odd people wow UI mostly okay interface um do you write cement code okay hang on slow down the chat slow down the chat okay that and you're not a noob uh M language I've used UI uh both okay M code but also the user interface okay both depends upon the task okay uh all right yeah that's right it allows for ambiguities that's right okay you are you are UI a lot of people UI UI UI UI um the more people say UI the more it's my responsibility to go slower today and funny thing enough when I was when I was preparing the case today in the middle of the day when I was kind of redoing the exercise myself just to be clear in my head that the concept goes well in your mind I thought maybe I have picked up a case a bit too complex for a live session um complex to explain for sure and I might also get stuck while solving the case myself so let's see which of the two happens foreign a lot of people on UI a lot of people on UI okay not bad not bad I started with UI still do a lot of things on UI so that's not bad at all okay all right folks we would not waste any more time and we will jump on to the case study uh we have like a hundred people now on the stream so just uh allow me to share my screen also in case you would like to try things along in during the session and you would like to have the access to the files there is a pinned comment here please go take a look at the pinned comment in the chat and just download the files all right I will transition and let me know two things again you can see my screen you can see me top right corner this time I'm not going to be coming in between the steps of UI and can you hear me as well just let me know in the chat okay do people can see me okay this window okay can you see me can you hear me the screen is also visible okay all right fellas so this question that I'm gonna solve today comes from one of um like I don't know maybe he followed some YouTube channel or uh wrote me an email that he wants to clean this particular data and I will show you what the data looks like I'm gonna go slow so if you take a look at this data there are two simple columns in this data the first one is transaction number and the second one is the payment column right here and the way we need to process this data is create column so there is going to be like a totals column there is going to be a pass due column there is going to be a discount column and an account column a revenue column and revenue Ava columns so on and so forth if you want to take a look at how this data is going to look because all of these transactions are the same transactions all of the columns which are written here are going to be forming a single row of the data something like this so this is transaction number one zero one and these are all the columns of the data and these These are the values of the data right here that's how we need to kind of clean up the data now just to add a little bit of complexity what I did was mostly when I got the data then the The Columns were the same so if you take a look at the first three rows and all of the columns which are mentioned here are the same for the next three rows are the same for the next three rows so on and so forth but in the last transaction I added these two additional columns amount and cost right just two uh add more complexity to the case so if you think about it this particular problem the problem that we have is that we want to create multiple columns but this is not a straight delimiter kind of a problem so we can't really go ahead and say Hey I want to delimit here then I want to delimit here because we don't really have a pattern to delimit uh nor do we know that where the column names end how long is the column name and things like that so how do we kind of solve this particular problem um what I'm going to do is I'm going to outline the solution um in your mind probably and then we'll go ahead and start to build the solution practically in power query because a lot of people have come from the UI I will try to go slow uh try to give you as much visual explanation as possible so that you can visualize what how the code is working the code is not that long but maybe a bit too complicated but anyways we'll see about that don't get scared about what I'm saying here all right so um the way that we are going to solve the problem is that there is a bit of manual work that we'll have to do to just figure out that what is the list of the columns that we have so all of the columns that we have which is total past due discount paid and all of these columns that we have I have just mentioned all of the columns on the far right here this is just like a table with hard-coded values of the columns this is something that we are going to use and what I'm going to do is I will essentially take a look that hey I'm looking for the words total which is technically a column header and where does it come in this particular piece of text so does it come on the first position does it come on the 10th position so on and so forth so let's say for example total comes on the first position right here past due which is the second column comes maybe on the 10th position right here I mean this position maybe it's let's say 10th position once I get these position numbers that where is the column starting that's when I will get to know that this is where I have to apply some sort of delimiter so I would start to add the delimiters here once I get to know the positions of the column so maybe delimiter could be anything which is not being used at the moment so this is where I will add a delimiter this is where I will add a delimiter this is where I will add a delimiter so on and so forth once I have added the delimiters then I'm going to break the columns now the delimiter is in my control because I have added it and it's a unique character which is not there in the data once I have added the delimiter then I'm going to split it out into multiple columns I'm sure that's a very easy operation and everybody can do that once you've split that out into multiple columns then all that I would want to do is perhaps just clean the column because even if you let's say even if you kind of break these two I mean you break in between here the problem is that I only want the value here just the value I don't want the text passed due because pass due is going to be the column I don't really want the text pass to so I just want the value so then a few steps are going to be extraction of the value and things like that and hopefully we'll be able to solve this by the end of this okay there are not too many steps but let's just see how do we go I could safely say this is going to be like a master class on list or accumulate so if you've never used that functionalist or accumulate you can just Google search it hopefully my video is going to show up and that is going to give you a good really good understanding of how does list.accumulate function work I will lay some grounds about list.accumulate but I will not go into depth of explaining how does this function work but hopefully we will do a good job of solving this problem let's move to power query and load this data up so um here is a table I've already converted that into a table I'm going to go to data and say from table range and I am right here in my power query window all right let me just uh go ahead and start all right okay um let me also zoom in the screen a bit okay cool first things first the problem is that at the moment all of this text the first line of text the second line of text and the third line of text is split between three lines I don't want that I want all of the text to be grouped into a single line so that I can get to know its positions so that's a simple grouping operation it's not that complicated I'm going to go to the transform Tab and then hit the group by option in the group by option I'm going to say hey I want to group my transaction number because transaction number is something that is getting repeated so I can group it by that grouping simply means summarization of the data so I'm going to say something like hey I would like to have advanced grouping transaction number is the column that I'd like to group it by the column that I'm wanting to create is all I just like to call it all and then the operation is all rows essentially what I'm trying to do is trying to get a single Row for one zero one and all the rows which are matching one zero one they are going to form a table so take a look I'm going to say okay and we have a table for one zero one which is where you know all the rows that matched 1 0 1 are together all the rows that matched 1 0 2 are together all zero all the rows that match one zero three are together so and so forth now I'm not interested in the table however I'm interested in this this very column and this very column at the moment is currently splitting into three different rows I do not want that I want somehow this column to kind of come into a single row right so the first row plus concatenate the second row concatenate the set third row so on and so forth so that I get to have a single row so what do I do about that so I'm gonna start to make changes in the uh in the M code that got generated with stable.group function so right now this was the name of the column if you remember that we created and the underscore here each simply means this particular table that we have received from this table that we have if I preview the the third table right here I can see in the third table in fact in every single table there is a column called payment column that I would like to combine together so I'm going to maybe delete all of this part of the M code right here and I'm going to say hey underscore gives me a table why don't you just from that table extract the payment column which contains all the details and I'll close the curly braces and the brackets and press enter now the underscore is a table the squared brackets defines the name of the column table name square bracket column name is going to give you a list so if I now commit to this I have gotten the same thing but in the form of a list the table is gone now it still is this is three rows of data and I don't yet want three rows of data I would want like one single row of the data so what I could use is I could use a function called table dot combine so it's not table dot combined text Dot combine so I'm going to say text Dot combine it says hey do you have multiple texts that you would want to combine sure enough I have multiple texts which is this one this one this one that I would like to combine but if you read carefully it says that please give me those texts in the form of a list so unless I Supply a list with multiple texts that I want to combine it's not going to accept that so I have to make sure that the input here is a list and that's sure enough this is a list and the the delimiter in which I would like to kind of combine these texts is just a simple space let's see what happens close the bracket uh did I make a mistake I think I made a mistake so just a bracket mistake that's going to be okay all right so now if you take a look at this one all the texts which are like three rows of the data have been combined into a single row and that's the entire like the entire row of the data not a problem now what we discussed was if I have to go ahead and kind of split this particular piece of text by delimiters then I need to have a delimiter here I need to have a delimiter here I need to have a delimiter here so on and so forth the way I'm going to be able to identify a delimiter is only if I have the identifying character which is nothing but the name of the column for which I already made a provision in my Excel and I created this column headers list which I will now kind of bring it to power query let's do that so zoom in again um I'm going to say right click new query and other sources in a blank query this is going to be nothing but my columns now how do we get the data from Excel I can use a simple function called Excel dot current workbook this is going to bring in all the tables that you have in the current Excel file press enter and that's the table that I want to kind of pull apart so I'm just going to maybe click on the table the table gets loaded nothing that complicated I can just remove the change type step the thing is that although we have gotten all the names of the columns which are exactly by the way exactly written the same way as you have it in the data so don't make the mistake of uh writing it in a different way or non-capitalizing it or lowercase and uppercase I'm sure you understand all of that now this is the table by the way it's not it's not a list so I in order for me to kind of do any sophisticated operations with this I need to convert this table into a list the easiest way to do that is that if you click on the column headers right here and say drill down this is going to be formed into a list and if you notice carefully this is exactly the same thing that we just did just a few seconds ago so name of the table which is nothing but the previous step which is right here and the name of the column which is column headers is going to give you a list so table name column name in the square brackets gives you a list and that's what also the UI does so nothing phenomenal here all right I'll go back now I know that let me resize so I know that if I want to kind of split this particular text by a delimiter then I know that the position of the delimiter has to be here and the identifying character is present in this particular list which are nothing but the names of the columns right so what do I do I'm going to go to the add columns Tab and start to write a custom column let's just do some manual work first so that you understand what am I trying to achieve here okay so um call this as maybe positions maybe because we are trying to find the position and I'm going to say something like um text Dot position off spelling is Right hopefully it's right okay so text Dot position off I'm gonna say um hey what's your text so my text is nothing but this particular text which is right here in the column called all so I'm going to feed that column all and then it says hey what um is your substring like what are you trying to find the position of at the moment I will enter a manual value so let's just say that I'm trying to find the position of the past due right uh because once I get the position of pass due like if I start counting from the left here if I start counting from the left when do I hit past U as a text so what's the position here right that's what I'm trying to find out I will just write that as a hard-coded value so past asked what do and by the way I'm just trying to find the first occurrence of that so there is an optional input in the text Dot position off which is the occurrence I'm just trying to find the first occurrence of that close the bracket and press enter and what this is going to give you is 12. now what does the 12 mean the 12 is nothing but the position number of this particular like a title or a column header in this particular piece of text now I know that on the 12th position here is where I would want to add the delimiter right now if I just maybe change that to a disc if I just maybe change that to a disc I'm gonna get to know that the discount is going to be on the 32nd position that means here is where I would want to add a delimiter but this is kind of quite uh you know kind of tedious to kind of keep writing these values manually and this is not going to serve any purpose the formula that I'm going to use is going to be this very formula but now I will use uh use a function called list.accumulate so essentially if I have to give you a broad logic that you understand what I'm trying to do is I'm trying to Loop through all of these columns names that I have created all of these column names right these column names are already present in the text so I'm just trying to Loop through all of these column names and trying to find the position of every single column so this is going to be the first I mean we start with total so zeroth position this is the 12th position this is the 32nd position so on and so forth that's what we saw so I am trying to get a list for the positions where the column name starts and that's where I will insert my own custom delimiter so that I am able to break the text right so what do I do I am going to go back to my function and uh just get rid of that just remember that's this particular function and then we will make use of this in fact let me just uh open up a notepad and write this function so that once everything kind of comes together uh we can go back to this particular function and see what parts of the function we replaced hopefully you have understood this all right I'm gonna go back and I'm gonna start writing something called X list dot a q mutate so the list don't accumulate function for the first time I'm explaining it on the live feed here accepts the first part as a list that means do you have a list where you want to Loop through every single item sure enough I have the list and that list is called the column list so for every column mentioned right here I want to find its position number all right so what do I do I am going to go ahead and damn it just move away so I'm gonna say my list is called columns and then it asks you for a seed now seed is like a starting point so I want you to imagine what is the kind of output that you're wanting to have so if I have let's say all of the columns so I'm not writing the column names here let's say column one column two column three and I want to find positions of every single column this is the zeroth position the 12th position the 32nd position all of these numbers are going to be given out to you in the form of a list that means the output of this particular function is going to be a list so I'm going to say that hey let's start accumulate I know which lists to accumulate through that's part one then the starting point is a blank list that means in the list that I would want the for it's going to be just like a blank list there's going to be no value in that list and then I want you to start feeding the values one by one of all the columns which are here basis on the text which is right here okay so I will start with the blank list then it says what's your accumulator so there are two parts here which is the state which is s and C now you can write state or any other name of the variable here just to make a short I write s for state and C for current uh and that's that's what creates the function and I'm just going to maybe write the function and I'll say something like text the same function that we wrote so text Dot position off Dot position off and it says hey what's the text so my the text that I want to start working with is this particular text and if you remember the name of this column is the all column so I will say hey um in the all column you go to the all column and uh the substring is nothing but C close the bracket and occurrence first now if you maybe just copy this particular line here and I compare that with my notepad now this is good because that's the same here is where I entered the manual value which is the name of the column called discount but here is where I have written C what exactly is the C do you notice I have declared two variables here the state and the current let's talk about the current first imagine a list the list has got the First Column the second column the third column so on and so forth so as the list is being built up which is the empty list once the loop starts so the first value goes in the loop that is labeled as C because C is your current item so it automatically picks up the first value which is total and pushes it right here and this is the same part then when it moves to the second part then again the second item in the list which is the list of the columns it's going to be called as C and then it just changes that c to the second item which is maybe the total or the discount or whatever that is so the C keeps on changing because you have to Loop through every single item in that list and C is nothing but the current item all right um hopefully I have made sense of that close the bracket and say okay let's see what we get now the results are bizarre I understand that it's minus 1 minus 1 minus 1 and 175 so I have to kind of draw a picture in the mind as to how this this came about and hopefully we'll be able to kind of make sense um so if you take a look at this particular list uh list or accumulate first went through the first item of the list then it went through the second item of the list then went through the third item of the list so on and so forth and finally when it went through the last item of the list cost it was not able to find the cost right here because it was not able to find the cost right here so the answer that it gave you was negative one I was not able to find it that's the output that you get however if you take a look at the last preview here it was able to find cost right here and the position at which the cost stands is nothing but 175 that means list.accumulate function has accumulated the result and is kind of giving you the result for all of these when the list or accumulate was going through the accumulation process it kind of discarded all the items that were previous and it just like went through all of these calculated the result discarded the result and it just gave you the result of the last item this is not what we want we want we want to capture what the position here zeroth we want to capture the position 12. we want to capture the position 32 so we want to capture the values we want to don't we don't really want to discard the values so how do we capture the values and that is the reason why we have made this particular list right here the empty list and we want to feed all of these values in the empty list all right so I'm gonna say hey list.accumulate once you find the position of every single item please don't discard it please save it and save it in a variable called s so what s stands for is the output of every single step so once list.accumulate uh was starting to work through the list it just went ahead and stored the first uh output in a variable called s when it calculated the second output it's stored again in the variable called s the third output again in the variable called s so so on and so forth so I'm trying to say that hey list.accumulate why don't you just maybe concatenate both of these together so this one and the S together at the moment this is giving you an error because s is nothing but the list and I want to kind of have this also in the form of a list a list can only be concatenated with another list you can't concatenate a text with a list all right so I'm gonna maybe form this as a list so curly brackets is nothing but a list and this gives you a list if I peek into the list this is exactly what we wanted so the position of the total is a zeroth position 12th position 32nd position 41st position so on and so forth and then you get minus one minus 1 because the revenue and the cost were not found in this piece of text good we want to clean this list up because at the moment is not usable because of these junk values here so on and so forth another thing that we want to do is that I don't really want to have a zero because I don't really want to insert a delimiter right at the start on the zeroth position because then the First Column is going to be the blank column so I'm going to go back to my function and I'm going to say list.accumulate the result is good but I want to clean it up a bit so I'm going to say something like list Dot select so list.select is going to work with a list you already have a list this is nothing but a list and I'm going to say hey please take a look at every single item of the list here and pick up all the items which are bigger than zero simple enough so I'm going to say each item of the list should be greater than equal to a zero and maybe also convert that to a number by number Dot from all right and gonna close that and that is critical say okay and we again get a list nothing that complicated but the zeros are gone and the negative ones are also gone that means that excuse me the first delimiter is going to be here right here and the second delimiter is going to be right here so on and so forth all right I hope people are still cruising with me it's not yet become too complicated we still have a couple of more steps to go I don't really want the change type I can get rid of that all right now inserting the delimiter so I want to kind of go ahead in this piece of text in this piece of text I have these numbers in this particular list and I want to pick up the number one by one so I want to pick up let's say 12 and go right here put the delimiter then pick up the second number 32 go right here and pick delimiter and then delimiter so on and so forth again this this is going to be like a list or accumulate so I want to accumulate through this particular list pick up the number one by one and start pushing in the delimiter let's do some bit bit of manual work first to understand what's the ground what's the formulation of the logic and then we'll proceed to add columns custom column again and let's just call this as custom is fine doesn't matter uh I am going to create a manual work a bit of a manual work here so I'm going to say that what was the function text Dot insert I believe so text Dot insert is the function and the text Dot insert function allows you to insert a piece of text between a string what is my string my string is this string and so I'm just gonna pick up all which is nothing but this particular column and I will say that I would like to insert a piece of text on the let's say for now I'm just doing it manually so on the 12th position which is manual entry at the moment I would like to insert a double pipe I believe double pipe is going to be unique and I can just play around with that particular symbol to delimit my columns so double Pi on the 12th position in this particular piece of text say okay and that is where I have got the delimiter and this is the right position on which I want to have it the moment this is manual because you have just put it on the first position but you also want it here you want it here so on and so forth so the base formula that we're going to use is nothing but this formula text Dot insert and then but this 12 is going to keep on changing so first insert it on the 12th position then 30 second position so on and so forth all right so let's just do some work here so what I'm going to do is I'm going to maybe copy this uh text like text Dot insert here and then we'll see what parts of the formula change all right so what do I do list dot accumulate now which list do I want to Loop through I want to Loop through this particular list and this list is in the positions column so I will just maybe add the positions right here and then I am going to have a seed so seed is nothing but your starting point how do you want to start so how do I want to start what's my first piece of text where I want the pipe symbol to be inserted the first piece of text where I want the pipe to be inserted inserted is this particular raw text which is there in the all column so I will just feed the all column right here and then it says uh what is your accumulator simple again s and C the two variable State and current I hope you've understood that and then it asks you to write a function and what's my function my function is this particular function so I'll say Ctrl C and I'll paste that right here text Dot insert now at the moment this is hard coded 12 but I will change that to a c closer bracket press enter to see what happens so and I did not insert the text let me see okay it's because all right it's the same problem that we had it earlier if you take a look at the output that we have received here we do have the double pipe inserted but the double pipe has been inserted at the last position I don't want that I want the results to be accumulated I don't really want the final output of list.accumulate so my formula is going to change a bit I'm gonna say that although I would want you to start from this piece of text but once you have inserted the first delimiter here then I want to then I want you to pick up this text and on top of this text I want you to insert another delimiter here and then I want you to insert another delimiter here so keep picking up the text that you have just processed and then keep on adding delimiters so instead of all here which is the row column right here I'm going to say that hey why don't you just pick up the variable s which is the state which is the output of every single step so I'm going to say s is going to be this and press enter let's just see what happens now uh seems right but it's not right let me help you understand why so at the 12th position the text got inserted pretty good then it inserted the second text at the 30 second position and this is correctly the 32nd position but the problem is it doesn't feel right I mean this was ten dollars and between ten dollars the two pipe symbols came not good why did it happen because when the when the text was being accumulated one on top of the other the first position was the 12th position and the second position was the 32nd position that means that the second position should have been the 34th position not the 32nd position because you've already inserted two characters here so we need to account for that as well so I need to kind of move these two characters to the right and then whatever you have again you again need to move that two characters further up because the two pipe symbols have to keep on shifting to the right as in when you move through the list so how do we kind of solve the problem um visualize this so I'm just going to get rid of that write the numbers 12 write the number 32 and write one more number 41. so the first thing that I'm going to do is I'm going to find the position of 12 in this list so what's what's the position of 12 in this list the position of 12 in this list is 1. then I'm going to multiply that with 2 and I'm going to add this uh which is not 12 to that now the output of this is the output of this is um what is that 2 sorry the position of 12 is not 1 it's 0 because power query starts the counting with zero my bad sorry about that so um 0 into 2 is 0 and then you have 12. so the first delimiter is going to come right here let's have the second equation so I'm going to say again what's the position of 32 in this particular list the position of 32 in this list is nothing but 2. so 2 into 2 why do I keep writing 2 because this is like two characters long so 22 is a 4 and then plus C which is 32 it's going to give you some output which is going to be let's say here right here so that's the logic that I'm trying to build this is not two this is actually one my bad so this is going to be uh 2 into 2 is 2 and 2 plus 32 is 34 and that is going to be maybe the position I believe here so on and so forth so that's the little you know kind of equation that we have built and we need to kind of fit this equation into our formula now so I'm going to say that I don't really want you to pick up the very number I don't really want that but however I want to find the position of that number so I'm going to use the function called list dot position off position off so I want to find there's a list which is nothing but positions let me just make sure the spelling is Right positions and I want to find the position of the C so I want to find there was a list right here and I want to find the position of 12 then I want to find the position of 32 so on and so forth to this position I wanna Maybe uh what did I do here multiply by 2 multiply by 2 and add a c to that now let's take a look at the output it's an error why is there an error uh no Plus let's see right let's start positions of what's the error spelling mistake position not positions okay all right now if you take a look at this output this is the correct output so here I get at the 12th position 34th position so on and so forth and now this is going to be delimited well it solves a lot of the problems that we were trying to deal with now I can just simply apply a split of the columns and all the columns are going to be split in multiple columns it's an easy so I'm gonna go ahead and what do I do I say right click that's your UI people working with UI you're going to like this so I will right click on this column I'll say split column by the delimiter and what is my delimiter my delimiter is going to be a double pipe that's the one and at each occurrence that's okay and I'll say okay and that is being split into multiple columns the change type step is annoying it gets inserted all the time don't want that I can get rid of that the only problem fellas is that at the moment these are bad column names so column one column two column three column four it's fine at the moment you can rename the columns but then what if another column gets added into your data you want the query to be as robust as possible so don't want that if you notice column dot one sorry custom.1 custom.2 custom.3 are all values which are packed in the form of a list so you have your own list for The Columns that you have sure enough we do have that and maybe just go right here this is the list again I can just delete all of this part and feed my list right here this is going to be gone so I can just delete all of this and I can say hey why don't you pick up the columns from this columns list that I have created change type annoying remove that and we're kind of good to go so yeah this is the totals column past you column discount column beautiful working all good now another problem at this point in time we have been able to split the data in multiple columns um the thing that seemed a little impossible at the start but at the moment that we we wanna we wanna remove the very word total from here and we want to remove the very word pass due from here that's one problem then I also need to remove these dashes from here you could maybe go here and clean the columns one by one by one so total you could just maybe say replace the word total in this column with nothing replace the dashes and all of all of that you're going to create a bunch of steps but I do not really want that two things one is that I'm not really sure how many columns I'm going to get in the future and I have taken care of that that's going to be split automatically but I also don't know people that what kind of junk character are you going to get in the column to remove because at the moment the junk character seems consistent which is a dash but there could be a possibility that the junk character could also be an equals to sign hmm so I need to kind of have a more robust approach to solve the problem so what I'm going to do is I'm gonna give you a visual picture as to how am I going to solve this so we will move crisscross across the across the data so what we're going to do is we are first of all going to get hold of every single column one at a time so pick up this particular column that's the first iteration in this column then pick up every single value in every single value what you need to remove is the very header so this is the header I want to get rid of the header that's one then I want to get rid of the dash but I don't know if this is going to be a dash always or this is going to be something else so I don't know what kind of character do you want to get rid of so I have to sort of make a list of all the possible junk characters and then I want to see that which junk character fits here so if one of the junk characters is Dash it takes a look at Dash and just replaces that I'm sure you understand there's a function called text Dot replace to mostly replace all the text values from things like that but what we need to do is we need to Loop through this so first cleaning up is going to be of the total column so you pick up this column and clean up every single value then you move right then you pick up the second column and then you clean up every single value so on and so forth let's do some manual work here and then try to proceed with some automation I'm going to make a step and in this step let's just uh call this as a like a split shorter name okay so I'm gonna say something like this so table dot transform columns if I want to perform Transformations on a column that's the standard function that I tend to use a lot now in the table dot transform columns the first part is the name of the table on which table would you like to perform your Transformations so here is the table on which I would like to perform Transformations and then it says hey what Transformations do you want to apply the Transformations that you input are column by column so let me write some piece of code and hopefully you'll understand so one second okay so what do I do I'm going to write a curly bracket because the Transformations are there as a list you can also take a look at the documentation here it says that please give me the transformations in the form of a list so unless I make a list this is not going to accept that all right so I'm going to start the curly brackets and within the curly brackets I would write My First Transformation every transformation is going to be a single columns transformation and if I want more Columns of transformation all of the transformations of every single column is going to be an individual list so let's say that the transformation that I want to apply is on the total column at the moment this is all manual work just trying to help you understand the logic and then it says hey what's the transformation so I'm going to say something like each value of this particular column and I want to replace the the word total so I'll say text Dot replace and what's my text this is my text my old text is the word total itself and the new text is nothing right and let's just close the brackets and commit to this function press enter now if you take a look at the total column that we were trying to work with the word total is replaced with like a empty string nothing the next thing that I want to do in this piece of function is that replace like wrap this around again in the text Dot replace function so text Dot replace function and here I'm going to say work with this particular text the old text is going to be a dash and replace it with nothing so I'm going to say the old text is a dash and replace it with nothing close the brackets hopefully that should work let's see all right that's your beautiful zero nice now if I want to repeat the process I want to do it like again so this is how it's going to look like so I will this is one list one list means Transformations on one column and we'll do it again I will have to repeat the list once again put a comma and repeat the list again now I'm going to say hey um this time I'm not trying to clean the total which is already cleaned I'm trying to clean the past due column so past t-u-e and text Dot replace text Dot replace replace the word pass to itself so fast do itself and the replacement is nothing and this and this and everything is okay just remove the last comma and we are kind of good to go and this is your ten dollars this is something exactly what I want to do but I can't possibly write all the Transformations like this because I don't know how many columns this is eventually going to get so this again is another looping that I have to do there are going to be two loopings here looping number one at the moment all that we have done so far is single looping like Loop through a list of numbers Loop through any other thing but this time it's going to be like two loopings the first looping is here if you take a look these are the names of the columns so I want to pick up every single column and I want to do a transformation on that that's looping number one then so the first looping goes like this the first looping goes like this to the right and the second looping goes like this so once it picks up the First Column then it goes like this so in every single in every single value right here I want you to do two Replacements first is the very column name and the second is this particular delimiter but I'm also not sure how many delimiters do I have so do I have one delimiter do I have two little meters and things like that so what I'm going to do is I'm going to make a list of delimiters and all the possible delimiters are going to be clean so even if the number of delimiters increase that's going to be okay you're just going to come at one single place add an additional delimiter that has been added to your data and it's still going to be okay there is not going to be a lot of manual work to change the query so right click new query other sources a blank query and I'm going to call this as junk these are your junk characters so to make a list I will initiate the square the curly braces one delimiter is nothing but a dash the second delimiter which we know at the moment is nothing but an equals to sign press enter you get a list the first one is a dash the second one is n equals sign and both these limit delimiters are going to be replaced with like an empty string so that's not a problem let's start to work so I'm going to go right here and I'm going to start to make changes to this so just hoping that I'm able to explain you but let's just see so I'm gonna maybe make a copy of this piece of text here so that once we start to replace this text you understand what parts have changed so again my favorite notepad paste that text right here and let's start to work so the very first thing to notice is that in this piece of text right here sorry in this piece of code right here you did not work with just one list you worked with a nested list that means there is a list outside within which there is a list inside and every single list is one single transformation so whatever code we write that code also needs to generate a list of a list structure if it doesn't your code is going to break with that in mind I'm going to get rid of this and continue using the table.transforms columns function but I will write my own list of list structures so I'm going to say something like list dot transform so now the good thing about list.transform is that the list.transform gives you a list that means the output of the list.transform function is a list right so what you're going to get out of this is a list so assume that we have already gotten a list within this list we need to have a smaller list and this smaller list needs to have two parts if you remember it needs to have the name of the column so like column one and then what transformation you want then it needs to have column two and then what transformation do you want so this is an auto list which we are already getting with this function list dot transform now the only thing that we have to worry about is this inner list that means whatever formula we write that also has to generate a list within the list or transform function so I'll say list.transform which list am I trying to work with so the first list that I'm trying to work with is the list of the columns that means I want to go horizontal so this is nothing but the list of the columns so let's just do that so I'm going to say let's do a transform I'm trying to work with these names of the columns and what's the transformation function so I'm going to say something like um let's just declare a variable so I'm going to say column name is the variable now this is very equivalent of writing the each function or the each keyword okay let me explain in a simple way so this column name which is a variable that I have created is going to hold the value of the list so once the list is going through the iteration so once the list is passing through the first value then passing through the second value then passing through the third value it captures every single value in the variable that we have declared which is nothing but column name all right next we say what do you want to do with every single column name and that's where we create a second list because we need to have a list of a list structure so I want to say something like a column name and I want to say something like um what was the function that we used um text Dot replace and this is where I will use the function.list.accumulate and I'm going to say that hey we don't really have one single error we have multiple errors that we want to deal with and the multiple errors are stored in the junk right here so at the moment I have picked up the First Column which is nothing but the total and in that total now my looping is going to go from top to bottom and I want to check every single value of the total and start to replace two things the first is the word total itself and the second is any junk character which could be found in this list right so let's not accumulate so I'm going to say list dot accumulate needs a list and that is my junk list so I want to take a look at all the possible junk values that I have then it asks you for a seat uh just one second all right then oh God myself one second um okay so then it asks you for a seed and the seed is going to be the cleaned text so I'll say something like text Dot replace and what do I want to replace uh I want to replace the very value with the column name and replace it with nothing that should work and then it gives you an accumulator function so snc again and then just put a comma and then I'll say hey I again want to do a text Dot replace and then uh text as an audible text which is nothing but my underscore the very value itself I think I missed out the each so this is going to be error out so text Dot replace underscore and what is your old text my old text is the current item so the now this time the current item is nothing but one of the items of the junk values and I would like to replace it with nothing hopefully that should work let's take a look all right oops did I provide an extra bracket I think I did all right nothing happened one second must have x dot replace column name each list will accumulate junk accumulate through the junk this takes dot replace replace the with column name and C and okay keep iterating through the S and then replace that okay that was a mistake all right and now we have been able to get the very numbers so this is all one this is all nine this is all values this is good all the text has been cleaned and finally um time to shine UI people so you can just kind of delete these two columns that we have close and load the data set the data types kind of good to go now only like three or four steps here but I believe uh list like what I was trying to explain through this particular problem is of course one way is to how do you think about the problems and then how do you carry your logic and use these functions in conjunction to create more sophisticated logic to solve the problem I think I've spoken a lot so I need some water for sure I'm gonna stop sharing my screen and that's the example that I wanted to kind of bring it to you we've also kind of nearly touched an hour and I want to invite questions foreign people folks do you have any questions for me um a lot of people I mean if you don't have any questions for me regarding the case study that we have done then what I could do is answer questions that people asked me when we reached 50 000 subscribers that was nice So Daniel asked would some of this also work using list.generated instead of list or accumulate possibly yes but I haven't tried that once again thanks um they're on I don't think this should be a problem but try to kind of load a lot of data and then try to kind of see how this works I would tend to add if I have to solve this using another approach maybe I would use like a record approach to solve this or list or generate approach to solve this maybe some other way but I don't think it should be a problem just load up some data and take a look at it okay Claudia says uh that was super thank you Claudia I can't understand the logic of the column name in the last could you please explain right sure enough I can let's transition back to Power bi the Excel sorry all right all right let's open the notepad on the side that's what we kind of had so if you take a look at the manual work that we did in this manual work we had like a wrapped text Dot replace text Dot replace straight here so here what we have done is list.transform gives us a list that's not a problem and which list do I want to kind of work with I want to work with this particular columns list so this columns list is what I want to work this is where I have the names of all the columns now what do you want to do with every single column so think about it with every single column once I go ahead and pick up the first column in the first column I want to do two things first there was total riddled all across this data so I want to get rid of that that's one and the second thing that I want to do is also get rid of any possible junk characters right so what I will do is I'm going to say if you now take a look at this this part this is how we wrote Our function right so we had the name of the column and then we had what do you want to do with that column so this particular text value which is hard-coded is represented by column name because column name is a variable which is going to Loop through this list of columns and capture the first value then the second value then the third value then the fourth value so as it moves across different columns it then replaces the previous with the next column so hopefully that clears up the second thing that you want to take a look where are you the second thing that you want to take a look is that this type of structure so we have the name of the column we have the name of the column and then we have the each we have the each and then we have used list.accumulate one more time to accumulate through all the possible errors and then first of all remove the very name of the column which is there in the data so every like this total needs to be removed and then on top of that we apply again text Dot replace to remove the c c is nothing but every single item of the junk list so it could be a dash it could be a comma it could be anything and that is like looping horizontally and looping vertically um I don't know maybe I confused you more than I explained it to you I'm sorry all right um thanks Google for the tip [Music] um the junk list is not Dynamic could be a dynamic list I mean if you have certain characters that you can draw from you can have that but at the moment I have created a list of young characters okay um so ayush asks uh what's the best way to learn the M language the roadmap so I'll talk about the way that I kind of approached the first thing that I did was I became really good with the ad Customs box I'll show it to you so if you take a look at the ad columns and the custom column right here sorry let's just go right here so custom columns right here come on this is what I tried to Master first like even if even if let's say for example you have uh created an error here it does not break the query your query is still functioning what you have received your error is contained in a column of a single row so it's a very safe space to kind of make mistakes uh the time you start writing formulas here and if you probably make a mistake your entire query breaks and that's very scary for a lot of people so the first thing that I did was I tried to write a lot of formulas as added custom step right here that's what I did um once I became comfortable creating custom columns then I started to approach the formula bar and understand the each keyword and you know things like that once I became good with that um and there is no definition of like there is no I mean if you're able to solve more and more complex problems that's a sign that you're becoming better so that's what is the check then I kind of moved on to the advanced editor and then started to write vanilla code as a block of code in the advanced editor right here so that's that's been my journey there are two very helpful points that I will mention Point number one is when you're learning M code it's very very important to understand any function that you use what goes inside of the function and what comes outside of the function so for example list.accumulate function has the capability to generate a single value it has a capability to generate a list it has a capability to generate a table or a record and things like that so when you're learning different functions notice what goes inside of the function does it is it does it accept a list or table and what comes out of it that is going to give you an understanding that if I have to provide a list am I kind of working with the list or not it's very very important this is a thing that has struck me very late and this is what I don't want people to be struggling with um and I forgot the second Point frankly if if if I could recall the second Point once again I'll just maybe come back to you all right all right people um there were a lot of people who posted uh uh comments on um like they they gave me um some questions to answer and I just want to kind of answer those questions as well we recently hit 50 000 subscribers on YouTube and I really want to be thankful of those people who provided me questions so let's just kind of go through a round of these questions first and then I'll come back to your questions so Gemma asks what is it living like in Dubai I would be interested in knowing that it's very hot in Dubai for sure um and uh like it's like the opposite of a cold country so a cold country would have everything air conditions to condition to keep it warm here everything is air conditioned to keep it cool uh other than that um the crime rate is low and everything is kind of set so running a business is a bit difficult here that's what I believe so opening up a bank account and all of all of those things is a bit difficult but other than that it's it's just a good place to be here however um now that I'm on the live stream I'm kind of going back to India very soon so I would leave Dubai and go back to India very soon um then Vanessa asks another question was it hard uh taking the decision of quitting your daily job and dedicating 100 to your Channel I did not do that actually before I started on YouTube uh I was doing a lot of physical training and Consulting work for Indian companies and the idea of creating a YouTube channel ever struck me uh because I was like pretty happy with those stuff that I was doing and you know that kind of kept me busy but then once covet hit and everybody just took away the work that I had I had nothing else to do so I thought okay maybe I've learned a bunch of stuff why don't I start speaking about that uh and then one thing led to the other and here we are so it was definitely not I did not quit my job I was doing the business uh for a very long time um and then YouTube came a lot later but if you want to kind of walk on that path I I would suggest that have the wheel rolling first before you make a jump so at least have some income first okay um till when can we accept an in-depth course on visualization sujit asks uh on power bi I don't think I'm gonna create a course on visualizations uh not because I don't like it but because the things are changing so fast in the UI of power bi that the course is going to be absolutely even before I make it so I think your best course of action is just to watch YouTube videos and try as many things out on your on your own okay uh what is the potential and future of office scripts um Sunday I really don't know I have not worked with office scripts so I don't know um I I can't really say much about that okay um Supriya asks I'm curious about your career uh where did you begin your profession in technology um when did you change your career to this um and some tips on developing an expertise could you please share so Supriya I mostly worked in finance and mostly worked with numbers uh it's just uh like I liked solving problems and you know one thing led to the other and I still don't consider myself to be a professional coder I uh just like learn things in a Brute Force way and try to solve problems so yeah I evolve as I go so it's I mean those days are gone when you have to kind of firmly study anything to be good at it so that's what I would recommend okay um what are my favorite YouTube channels uh Lionel asks there is this Channel that I really like called veritasium uh it's on the Science Channel and I do watch a lot of videos from there um other than that if you're talking about power bi channels then I I typically watch Channels like Microsoft is obviously a good channel to watch uh then I'm gonna I'm gonna watch Channels which are related somewhat related to power bi but not exactly so for example um the other day I was trying to play with the den app and I I got stuck and there was this channel I don't know the name of it I they were talking about Vega light as a programming language and I just took a look at that entire video um so and then anything that kind of feeds into my YouTube feed I kind of watched that uh but obviously everybody knows about big channels which are their kind of cube Google how to power bi um so all of those channels obviously are there where do you see yourself two to five years from now definitely in India I don't know what I'll be doing um what are your thoughts on the if statement um try to use a switch switch function in RBI um okay all right I'm going to go back to the questions here in the chat the pen the pen that I use is uh what's that called Zoom it that's the zoom it tool a lot of people ask me so that's a Zoomer tool uh okay all right what would you say among the Dax in the M code the learning curve and where is most difficult in your opinion which one is most difficult in your opinion okay so if you would have asked me this question two years three years ago I would have given you the answer something like this learn bubble query then learn Dax and then then learn the visuals now my answers have changed uh so the first the very very first thing if you want to learn power bi I will highly recommend all of you anybody a beginner out there is to First learn modeling what do I mean by modeling modeling is like a pen and paperwork so for example if you're trying to build up a power bi model what you need to do is just take a look at all the data sources identify which are the fact tables which are the dimension tables and try to build your model into a star schema shape now if you don't understand these terms you need to research about them what's the start schema what's a fact table what's the dimension table and things like that take a look at a few models understand how do these relationships work and how does the filter propagate from one table to the other what causes you to write vlookups in Excel and all of those things so the first thing that I would want anybody to learn in power bi is how on the piece of paper are you going to frame the model which are going to be your dimension tables which columns are you going to get in and which is going to be your fact table at what granularity are you going to get your fact tables at once you have built a model on a piece of paper then that model is going to be compared to your Source data and you will see that in order for me to reach this shape of the model I need to kind of clean the data in so and so fashion or I need to have so and so concatenated columns or I need to apply a group by so that I can reduce the granularity of the data or I need that column or I need this column and things like that the model is going to tell you automatically what is the steps that you need to take to clean your data that's one and that comes power query or SQL or whatever ETL tool that you're using and then once you have built the model then you take a look at the final output and that actually tells you that how much of work could you push to power query and how much of Dax are you going to write so the very first thing that I would recommend is the pen and paper logical understanding of what you're trying to build which is the data model from there you kind of go back to Power query and figure out what your video steps are going to be in order to reach this model and then kind of take a look at the Dax on the visualization side of it um the vet asks can you explain looping again pivot I will recommend you to watch my video on list.accumulate and that is where I have explained looping the thing with recording the videos is that I can make ton of mistakes while recording a video till the time I'm not convinced that I've explained it right I can re-record the video over and over again so just watch that video again please and thanks for the tip how do you move steps or do zoom in in the formula bar uh so there's a shortcut here Ctrl shift plus sign uh is the one that I use control shift Plus okay um so Pradesh asks me a question I started data analyst course most concentrate um more on which module um just uh perhaps the basics in fact uh so don't get hell-bent on the tool you can provide a lot of value with Excel as well um I read a I read a post on Twitter the other not Twitter but LinkedIn the other day and it was made by I think uh what Owen I guess um and it said that the CEO doesn't really care about the prettiness of your chart if the chart is not valuable to the business then it's definitely going in the garbage I that's not overbatim but I'm just kind of summarizing that so you can even provide a lot of sophisticated value to the business even with Excel so don't focus on the tool focus on what is that value that you need to provide and then from there you can pick up the tool based on the volume of the data the sophistication of the problem things like that but just get hold of the basics really well first will there be a finished file to download sure enough uh as soon as we kind of end the stream I can just take the very file and push it off as a downloadable link and then you can just download that if you worked with API and power query very briefly yes I have uh I got success once then many times I face to failure so I don't I don't have experience of comfortably saying that yes I know how to work with apis in power query I have once but then maybe I just got lucky that it worked the one time that I've worked with it and most times it didn't okay okay John uh I'll just take maybe um a few more questions before I end this how do you handle the issue of large data causing long refresh times in power query so I know a half cooked concept of something called as streaming I'm still trying to figure out a way to visualize streaming in power query and then probably make a video to explain you all but I know there's a concept of streaming streaming simply means that power query doesn't hold all the data it just shows you a preview of the first thousand rows of the data and in case you do any particular step that stops the streaming to happen that is when the query tends to become slow let me give you an example you have a table that shows the preview of the first thousand rows and you ask the table to be sorted in ascending order now even to generate a correct preview power query needs to read all the data assume that you have 10 million rows of data but you were seeing only a thousand rows it needs to read all the data and then even to generate the first 1000 rows which is the preview then it needs to sort the data and present the first thousand rows again that is where the streaming is going to break because you have done an operation in power query that breaks the streaming there are several such operations in Baba query that might just break the streaming and that's what you need to check how do you check that um and where could you get some guidance around that and is there a tool that helps you check that have you broken the streaming or not I don't know frankly I just kind of tweak around with the query play around with the m code and then see where it possibly could have broken the streaming in case my query is getting slow but that's what that is where I've seen mostly the problem being for slow queries I I hope I answered that question hey thank you Aaron appreciate that okay last question Gladwin says what's the difference between power query and data model power query is a data prep tool uh it allows you to prepare the data so that you can load it up in some Source be it Excel be it power bi and things like that data model is the data which is loaded in the vertipack engine of power bi where the tables are connected to one another and the Dax is written that's the data model where you can see all the relationships if you've taken a look at Power bi there are three tabs there is a visual tab there's a data Tab and there is a modeling tab at the modeling tab you will see all the tables connected with one another you will see relationships you will see inactive relationships you will see measures that thing is called the data model all right I think that's been it frankly I've been tired talking I can do another uh session uh based on how this one goes and uh you guys have been very kind to sit through all the uh looping that we have done today um I hope to see you uh sometime soon maybe we can do another session on Dax on visualizations or on power query again uh happy to have your feedback in the comments and of course if anybody has a better solution please do post the comments comments are an excellent way for everybody to learn um don't consider my videos to be absolutely sacrosaned this is the best way to solve the problem please do take a look at the comments as well there are some ingenious people writing comments and those Solutions are far better than what I have this is just my understanding my preliminary understanding of solving the problem in the moment then I was solving it and I presented that to you but please do take a look at the comments those are gold that's been it thank you everybody and uh I hope to see you sometime again soon and of course thank you for 50 000 subscribers uh I'm gonna end this stream by now