Transcript for:
Introduction to Alteryx Designer

welcome to an introduction to Alteryxdesigner  now Alteryx is a data tool set designed   from the ground up to empower business and  technology users alike in using their data   more effectively Alteryx means different things  to different people depending on who you ask   for some it's all about data acquisition and  enrichment however others place much more   emphasis on its predictive modeling application  developing or even geospatial capabilities Alteryx   allows users to rapidly prepare blend conform and  analyze data from any source including flat files   databases apis and more within a single  instance all of this is accomplished using   a visual workflow designer that you're seeing  up on screen right now and is it remarkably   intuitive and easy to use requiring no codes  but also cloud friendly should you prefer so let's get started the Alteryx designer  window is split up into four main components   the first being this toolbar which you're gonna  see atop the screen which contains all the tools   that you're gonna use and piece together to build  what's called a workflow now each tool as you can   see here is split out by their shape and their  color depending on what sort of function they   provide for us within the workflow so we have  our favorites palettes here which is selected   automatically but you've also got tools that take  data in and out of the workflow tools that start   to prepare that data so you're going to see such  tools as filtering data cleansing formulas here   you've also got a join category parsing  transformation and it just goes on and   on there's a whole host of different tools  that we've got here in all tricks but i think   just for the purpose of this introduction  we're just going to make use of these first   five tool categories here or even just the tools  that are present within the favorites palette   now once you've found the tool that you're  interested in using uh quite typically this   is in first instance is the input data tool let's  go ahead and click drag and place it onto where   it says drop tools here right in the middle this  is called the canvas that's where you're going   to build that workflow now once i've dropped  a tool onto the canvas i can give it a click   and on the left hand side we're going to  see what's called the configuration pane   now the configuration pane is where you  tell precisely what you want this tool to do   this being an input data tool i'm going to  go ahead and point to the file in which i'm   interested into reading to this workflow so  i'm going to click on this drop down menu here   pop open this data connections window and head  on over to files so you can see here a list of   all the supported file types now this isn't an  exhaustive list but it just shows you some of   the files that we can read into Alteryx nice and  easily these contain such things as csv files json   files uh xml files excel files you name it it's  all here and if you're not seeing the data source   which you're interested in reading into old tricks  from this screen you can also go to data sources   now this is where we can hook ourselves up to the  likes of microsoft sql server oracle databases   amazon databases ibm databases hadoop whatever  you've got we can make sure we can hook up to it   but i'm going to head back to files i'm going  to go ahead and select my file it's going to   pop open a windows explorer box here and i'm just  going to select the customers.csv file once i have   that selected i can click onto open and it's go  ahead and read that file into the input data tool   now next i'm going to do is head on back to my  favorites toolbar here find this tool called   the browse tool give it a click hold down my left  mouse button and drag it onto the canvas until i   see these dotted lines appear these dotted lines  once i drop this browse tool onto the canvas are   going to transform into what's called a connection  between these two tools now if this connection   isn't present by maybe you dropping down this  browser a little bit further away from the other   input data tool all you need to do is to hover  over this green anchor or nose uh some people   like to call them until you see this little hand  icon appear now if you click and hold your mouse   you can drag that connector and snap it into  place into the input node of the browse tool   now if i zoom back out this is in essence   the smallest workflow you can build read in  a file and go ahead and take a look at it so let's click on to run to execute the flow now that the workflow has finished running  i can take a look at the fourth part of the   designer window that being the results pane  so i can see here the contents of the csv file   now i can click on any one of these  fields let's take the customer segment   field for example now once i've got the  browse tool selected i can see here rather   than having the configuration pane for this tool  present it shows me what's called a data profile   and this allows you to take a little bit of a  deeper dive into what this data sets all about   so i can see here how many values we've got as  okay how many are unique how many are null not   okay or empty now a lot of people say to me  hey what's not okay me and that simply means   if you've got any trailing or leading white  space here got any tabs carriage returns or   new lines in your cells then that's going  to be highlighted right away as being a not   okay value and you can go ahead and prepare that  data before you go ahead and do anything else with   it we've also got some other information here we  just get a value by count so we can see how many   values we've got in the different customer  segments at a flash rather than having to scour   through our data set so definitely recommended to  drop down a browse tool when you're utilizing a   brand new data set just so you can have a look  at what that's all about and note down any   preparatory steps which you're going to need to  do to get this data set into a good shape so first   thing i might want to do in this workflow would be  to just remove some of these address parts for the   particular report that i'm creating here where i'd  like to find out what the total spend per customer   is and figure out what our top 10 customers  are for each one of the customer segments   these address parts here are of no use to me so  we're going to do it's going to drag this browse   tool over to the right hand side slightly and i  find the next tool that i'm interested in using   that being the select tool i'm going to click  and drag this tool onto the connector between   the input data tool and the browse tool once i do  so it's going to merge itself between those two   tools and i can see the select tools configuration  appear here in the left hand side of the window so   what i can do is i can see all of the fields  i've got here within my data set and simply   just use these check boxes to remove these fields  from my data set now just bear in mind that when   we're deselecting these fields here in alteryx  we're not updating or modifying that source file   in any way shape or form so you can remove and  play around this data set in confidence without   corrupting the source data whatsoever now this  field here called unknown is quite important   because if i select the input to my tool now just  so i can see the the data set coming on through   from the input data tool i can see here that i  don't actually have a field called unknown but   what that means is if anyone else goes into this  csv file and adds in let's say you know a another   zip field we have zip 3 for some reason then that  is constituted as an unknown field now Alteryx   saying here we're having this check box enabled  do i want to read through any fields that are not   directly referenced here into the workflow the  next time it's run should there be any or should   i say hey any unknown field i don't want to come  through and i'll leave this check box deselected   so once i've got that configuration  set up again i'm going to run the flow and now you notice here is uh  with the exception of course   being the input data tools and maybe some of  the endpoint tools such as the browse tool here   most Alteryx tools have some sort of input and  some sort of output depicted by these green icons   either side the select tool that we've got on  screen here so you can see that the input node   or anchor is selected at the moment and i can see  the state of the data set before the configuration   i've set for the select tool is applied now if i  select the output to this tool by clicking on here   i can see what the effects that configuration has  on the data set and you can quite easily flick   between the input and the output just to verify  that the tool has done precisely what you're after   all looks pretty good to me although what i  have noticed here in the data type field is   that all of my fields are coming through as  a v string now there's only really three main   data types i would say that you need to concern  yourself when you're first getting into Alteryx   those being a string type data which is the  text-based data you've also got some numerical   type data if i click here on this type drop  down we can see the full list of these data set   data types excuse me so we have the string  based data here so these all different text   types data we have int 16 32 and 64. that simply  stands for an integer so that's a whole number   rounded up to the nearest value we also  have a fixed decimal which you can imagine   is this a number uh rounded up to a certain number  of decimal places and then we have what's called   a double which is a the exact figure so no  rounding taking place there we then also have   some date time and then date time data as well so  those string numerical and date time data points   are probably the only types you need to concern  yourself in the first instance but as i say we   can see that all of the data types for this data  set here are coming through as a v string it's   always text but if i take a look in the results  pane i can see that i've got two columns here   namely the customer id and the store number fields  that should be numerical so i could go ahead and   manually update the data type by selecting them by  this drop down in the select tool or alternatively   in keeping with Alteryx fashion we like to keep  things as simple as possible what i'm going to do   is head on over to the preparation category and  find this tool here called the auto field tool   once i've found that tool i'm going to click  drag and then place it onto the connector   and all i have to do here is select the fields  in which i want the tool to go ahead and update   the data type for me now this auto field tool if i  give this workflow run it's going to scour through   the fields i had selected in the configuration  pane and go ahead and update the data types   to what Alteryx thinks is the best fit so  if i have a look here i can see the input   that's coming on in and i've affected the output  there's no apparent change but what's happening   here is we're actually updating what's called  the metadata as opposed to the data itself   you're going to see here in results grid  i've got two tabs one called data which   is selected now highlighted in blue i've also  got one here for metadata if i click onto that   i should be able to see the data types for all  the fields coming into the tool notice how i've   got the input node selected to this tool now if  i flick to the output after this auto field tools   configuration has been applied i can see that the  data types have been updated uh to a a better fit   all right let's flick back to the data tab  and there's still a couple things i want   to do this data set before it's ready to go  those being cleaning up this first and last   name fields here they're all capitalized and i  probably want to push them down to be title case   so again gonna find the tool that i'm interested  in using and this circumstance is going to be the   data cleansing tool i'm gonna drag this onto  the canvas place it onto the line once again   and configure the tool how i want it to work so  there's a variety of different cleansing options   we can do utilizing this data cleansing tool such  as remove all null rows or null columns should we   have any very useful indeed alternatively we  can just simply select the columns we'd like   to cleanse and in this example it's just going  to be the first name and the last name now we   can replace null values here in string field to  a blank value or in a numerical field with the   value of zero but that's not relevant for  us in this case or deselect those options   you've also have the opportunity to remove any  unwanted characters such as the leading trailing   white space or tabs line breaks and duplicate  white space now if you recall earlier on in the   video i mentioned in the data profile how he could  have some not okay records that can be coming up   and highlighted in red in the data profile from  the browse tool if you have any of those records   then this data cleansing tool and having  these two options selected will clean that   right up for you but again not useful in this  circumstance you're going to deselect those   what i'm most interested in is to modify the  case so i simply select this checkbox here   click onto the drop down menu and select  the case i'm interested in that being   the title case now again i'm going to run the  flow by clicking on this blue run button at   the top right of the window i'm going to have a  look at the input to the data cleansing tool see   what that data looks like coming into the tool now  if i flick to the output again i'm going to see   what the modified case action does and i can see  that first and last name has been updated nicely so i think this data set is good  to go next i'm going to need to do   is to drag in and read in the next input  to my flow so again i'm going to zoom out   a little bit and go back to my favorites  palette and find the input data tool again   dragging another one of these down i'm  going to place it somewhere around here   so i'm going to give it a click in the same  fashion i had done earlier go to my data   connections window files select a file and this  time i'm going to select the transaction.xml   okay once i've got that selected and click  onto open and drag down another browse tool   and just place it somewhere around here cool once  that's done you can get the workflow one more row okay so clicking onto this browse tool i  can see now the contents of this xml file   coming on through so this has my customer id in  here and it also has some of these sales values   for the products that these customers have bought  now to tally up the total sales for each one of my   customers customers excuse me what i'm going to  do is to join this transactions file here to the   customers data set that i've got at the top before  i can do so however i'm going to need to prepare   this data again ever so slightly now if i flick  to that metadata tab that i had a look at earlier   we can notice that all of the columns within our  data set are coming through as text once again   now if i flick back to the data tab i've certainly  got some fields here that need to be a numerical   value such as the sales if we're going to sum up  the sales value for each one of our customer ids   that needs to be numerical value as we  cannot sum up text as you can imagine so   what i could do is drag down another auto field  tool in the same fashion i had done in the above   work stream or alternatively as we had a look  at the select tool is great for updating any   of these data types manually should you wish so  i could even select the sales field here click   on that drop down and say hey i want that to be  a double data type which is numerical data type   once that's done again the workflow one more  run and we can take a look at the results so if   i flick to the metadata tab now we should be able  to see that all of my data types have been updated   nice and neatly and that's ready to go  although there is one more step to be had   before i can join that customer data set and  that is of course to sum up all of these sales   values so what i'm going to do is drag this browse  tool a little bit further to the right hand sides   head on back to my favorites palette and i'm going  to find this tool here called the summarize tool   now you'll be able to notice of course by its  shape and its color that is part of the transform   category i want to stick in the favorites palette  just for now again i'll find that tool drag it   onto the line drop it there and then on the  left hand side we're going to configure the   tool and tell it what we want it to do so we want  to first have one row per customer id so therefore   i'm going to select my customer id up here in the  fields add an action to it and then do a group buy   next what i'm going to do is select my sales  column add an action to that and then sum it   up now the output field name is going to be the  name of the field that's going to be coming out   of this tool by default it's called some sales  but i also have the opportunity here to give   it a nice new name and i'm going to call this  total sales probably sounds a little bit better   some other things i could do here are select the  order id field add an action to that and then   count it this could give me visibility on the  number of orders that created that total sales   figure and i will even go as far as selecting the  product name which we can see here add an action   and there's a whole host of string specific  or text specific actions that we've got here   and i can click onto concatenate and this  is give me a list of all of the products   wrapped up together in a single cell and of course  you can choose how those values are separated down   here and maybe i want to have a slash separating  those product names okay once that's been   configured let's run the workflow once again we  can take a look at the data coming into the tool   so this is our view that we're familiar with thus  far click onto the output and we can see now that   we've got one row per customer id a visibility  of the total sales for that customer how many   orders they've made and then if i double click  here we can see all of the products concatenated   together onto a single line with that slash as the  separator between the product names brilliant okay   so now i think we're in a good shape to join  our customers data set here at the top to the   transactional data set here which we can see at  the bottom so before i do so i'm going to get rid   of these browse tool now by highlighting  them both and then just selecting delete   they've kindly served their purpose for us now  and what we're gonna do is join these two work   streams together as you can imagine we're gonna  be using the join tool so i can find that up here   in the favorites palette again click and drag  and i'm going to place it somewhere along here   now this join tool looks a little different than  some of the other tools that we've been working   with thus far just because it's got two inputs  and three outputs uh they are labeled l and r   these inputs which simply just stand for left  and for right so we can take our customers data   set up here coming out of the data cleansing tool  click drag and place that into the left hand side   and similarly we're going to do the same for the  transactional data set by clicking dragging and   placing it into the right hand side now once i've  got these two data sets connected to one another   in this join tool as we can see here in the  configuration now i'm going to have to tell these   uh Alteryx that how i'm going to join these two  tables together and that being of course with our   customer id field so i can join by specific fields  click on the left and i'll select my customer id   and similarly i can click on the right  drop down and select my customer id here   now i can see here in the configuration of the  join tool i've got a view very similar to that   of the select tool that we used earlier on in the  flow but with one addition this column here called   input so if the records from these two data  sets are successfully joined together i can   see which fields are going to come from the  left hand side that being our customers file   and which field is going to come from our  right hand side that's the transactional   file i can even have the opportunity to use  these directional buttons here to kind of push   these transactional file uh records a little bit  further up into my table of course the further up   they are in this view the further left they will  be in our data set okay next thing i'm going to do   is just to right click onto this join  tool and select the option here to add   all browsers and what that's going to do as  you can see is add a browse tool so every one   of these three outputs that we've got here next  thing i'm going to do just before i run the flow   is just click onto a blank space on this workflow  just about here and that's going to bring up the   general workflow configuration and i'm interested  in this option here called connection progress   now by default that's selected as only show when  running but i want to say show all the time and   the reason for that will become apparent in just a  second because once that option has been selected   then these little boxes between the connectors  i've seen the tools are going to appear on the   flow and be present all the time now this shows  me the record count between each one of the tools   as we're going along and this is especially useful  when we're using a joint tool so if i run the flow   it shall illustrate my point nicely so i can  see here having these connection progress boxes   showing all the time now i can see coming in from  my customers file i have 2678 records coming into   the left-hand side 2703 records coming in from the  right-hand side that's our transaction file and i   can see here i've got my three outputs labeled l  j and r l stands for left r down to right and j   stands for join so out of the j for join output  in the middle here i can see that i've got 2   678 records joined that's great that's all of  my customer records and i can confirm this by   looking at the left output here which will show  me all of the left-hand records that do not   join to the right-hand records this being zero  records here so i can say that confidently   my customer records have all joined to the  transactional side but as we can see here   and the reason being why i've added these browse  tools to this outside of the join tool as i can   see here on the right output i actually have 25  records coming out of this output so that means   i've got 25 records for my transactional data set  that are not joining up to my customers data set   now this might be okay in some examples or in  other examples this could not be acceptable   but just for the purposes of this demonstration  what i'm going to do is just use this j output and   pretend we've got everything joined up just like  we needed to so now what we can do is we can see   that we've got a customer id field here alongside  all of the transactional based data and then the   customers data all in the same data set that's  great news now what i can do here is start to   formulate my output so i'm just going to get rid  of these browse tools on the left and the right   output for the time being and to drag this tool a  little bit further to the right hand side now what   i want to do is create a report that shows the top  10 customers by that total sales figure for each   one of the customer segments that i've got here so  first thing you need to do is to sort my data in   descending order by that total sales because of  course i'll have my best customers at the top of   my table so i'm going to drag this sort tool from  the favorites palette place it onto that line and   on the left hand side here i can select the field  i'd like to sort by that being my total sales and   then the order it needs to be sorted by and that  is descending okay give that a run now if i take   a look at the browser i've got here the end of my  flow i should see here my customer id with the id   3075 has spent 123 000 pounds with us so that's  a tremendous amount and we of course want to make   sure we nurture that client so now i've got my top  customers at the top here all i need to do here is   now group by this customer segment and i can do  that by selecting this sample tool so i drag it on   place it off the sort tool that i got here on the  canvas i can say that i want to select the first   n rows up here where the variable n is set to 10  so therefore first 10 rows and i can group that   by the customer segment field now if i give us a  run i know that i've got four customer segments   so therefore i should have 40 second the first 10  per customer segment coming out of the tools so i   can see here that i've got the top 10 for consumer  then for corporates then for home office and then   finally small business that's great so i guess  the final step that i need to do here is output   this prepared data set into an excel workbook so  that other people around the business can begin   to use this output so i'm going to do is just  delete this browse tool now and put in its place   this tool here the output data tool so go ahead  and drag that onto the flow and connect it up to   the sample tool and the same fashion where i first  selected to read in a file using the input data   so what i'm going to do here now is select where  i want to push or to write the file to so i can   select the file here and then of course navigate  to the area in which you would like to write the   file i'm just going to write it this folder here  do make sure you select the create the correct   save as type in this drop down menu here i'm  going to leave it as the default microsoft excel   i'm just going to call this Alteryx demo out like  so give that a save and it's going to ask you what   you'd like to call the sheets i'm going to leave  it's called sheet 1 for the time being i'm going   to change this output option here to overwrite  sheet or range now if i run this workflow i can see now here in the results grid i've got  a log of everything that's happened during the   flow and i can see here the output data tool has  written 40 records to this file and i can click   onto this link and it's going to pop open the  excel file for me to take a look at now if i just   reorganize these columns this is now the  output of the Alteryx flow and then people   around the business can then start to use  and consume this data set as they please   so if i close this file down and we just  take a zoom out at everything we've done   here in this Alteryx flow all we've done  is read in two files that may be from   source system a from source system b prepared that  data joined it together at an output the results   now if we're doing this in the traditional manual  sense let's think someone may have to go into   source system a to export a customers.csv file  maybe go into another system that being source b   and downloading this transactions xml  documents we have to open them both up   into excel in parallel with one another copy  some data across from one sheet to another   do a pivot table to get the summary of the total  spend for each one of the customer ids and maybe   do a vlookup here to join the values together  sort them and then figure out what the top 10   customers were for each one of the customer  segments that could take maybe even a skilled   analyst about half an hour 45 minutes maybe even  up to an hour to complete manually and of course   if they're doing that manually then that does  have the opportunity to introduce any human error   if they don't apply the correct formula or do the  vlookup correctly then we could introduce errors   unwittingly into that process now since we've got  this built out in Alteryx designer as you can see   here all these tools are laid out doing exactly  the same steps as we would do manually but when   we run this flow we're not going to deviate from  these steps whatsoever so we can be confident that   every time we run this flow we're going to get  exactly the same output and of course uh running   it within Alteryx is tremendously faster than  doing it manually let's say rather than taking   between half an hour to an hour to complete  such a process if i take a look at the results   we've done this in just over one second that's 1.3  seconds so that's the sort of time efficiency and   savings that we're looking to introduce to your  team by utilizing Alteryx designer if you are   interested in learning any more about Alteryx  designer then visit our website continuum.je   or reach out to us at inquiries at continuum  dot je and it'd be our pleasure to show you a   little bit more about old tricks and answer any  questions that you do have thanks for listening