Transcript for:
Exploring Predictive Analytics Fundamentals

I got another question this week that was sent to me, and I'm going to start off with it: "What exactly do you do, and what is predictive analytics? Signed, your mom." I'm going to fill you in and fill everybody in, and I got a little help to do it. Let's get to it. [Music]

Welcome, welcome, welcome. I'm your not-so-humble but very passionate host, Eric Wilson, and this is IBF On Demand, your podcast coming to you on YouTube or wherever you get your podcasts. We do have a sponsor for this: Arkieva, your one-source S&OP software solution. So please check them out. They've been involved with us from the beginning, and it's a great opportunity to see exactly what they have to offer you as well.

I want you to keep watching. I know we're going to be hitting about the five-minute mark now, and analytics-wise people start dropping off. You know who you are. So help me boost these analytics: just sit and watch the entire episode once or twice or five times, get those numbers up, and continue to like, share, and subscribe. We're almost at the 2,000 mark with subscribers and I want to get over that level, so if you haven't subscribed, please subscribe. And do me one more favor: when you do that sharing, get your friends and colleagues to subscribe as well, because we're trying to grow a community here. This is a family. IBF is about a family, and this podcast is about that family, giving you some insights into business forecasting, S&OP, predictive analytics, the things you're doing every day.

As I said, today we're going to talk a little bit about predictive analytics: what it is, and the question of what exactly I do. Well, what you do, because it's a tough question. When you're at the pool or going to a party and someone asks what exactly you do, you start talking, "I do business forecasting," and their eyes just glaze over. So what exactly is it that we do? Can we answer that question? I'm going to try to help you explain that, and I figured I
need some help, and two Erics are much better than one. It also gave me an excuse to bring him on. So we have a very special guest today: Eric Siegel. He's the instructor of the Coursera course Machine Learning for Everyone. He's a former Columbia University professor. He's a singer and songwriter, popularized by a geek rap video on YouTube about predictive analytics. He has received numerous teaching awards. He's the author of the bestseller Predictive Analytics: The Power to Predict Who Will Click, Buy, Lie, or Die. He's the founder of the Predictive Analytics World conference. He has appeared on talk shows and newscasts around the globe, and he almost equals my passion and makes predictive analytics and machine learning understandable and captivating. Help me welcome Eric Siegel.

Welcome, and thank you very much, Eric, for joining me. I'm excited. As I said, I'm using this as an excuse to get you on our podcast to talk. I've read your book a few times. I read it last year, and the conversational way you write is excellent. I also got the audiobook to freshen up for this, and it's equally engaging and entertaining. So kudos on your book, and kudos on what you're doing right now with Coursera as well, putting together those courses. I'm excited. I'll probably participate in those as well, to build my knowledge of machine learning for everyday use. So thank you very much.

Thanks, Eric. Thanks for having me, and thanks for saying all that. If you prefer to see the movie instead of reading the book, the movie is the Coursera course series, which is called Machine Learning for Everyone. It's broader and has more content: it's composed of 141 short videos.

Well, I'll check that out. I've done the book, I've done the audio, and now I'll do the movie as well. So I'll do
all three.

Excellent. You can watch it on the go, just like an audiobook, because Coursera's got that app, and it's free to access the entire contents. I would check that out.

So, as I started off: you get this question a lot, and you actually covered it in your book, and I'm going to have to ask it again for this audience, because we've got a lot of people out there still trying to figure out exactly what it is that I do, that we do. So I'm going to ask you the question I'm sure you've had: can you please explain to people exactly what it is that we do? What is predictive analytics?

Right. Well, predictive analytics is basically applications of machine learning for business problems. The way we most concretely define it is that it learns from data to render predictions for each individual. There's a continuum, a fuzzy boundary, between predictive analytics and forecasting, but I differentiate in that forecasting is usually talking about singular predictions: what's going to be the sales in the next quarter, or, with politics, which direction an election is going to go. Whereas with predictive analytics we're at a lower, more detailed level of granularity. So it's more for targeting marketing, for targeting fraud, for deciding who should be granted approval on an application for a credit card or what have you. It's rendering a predictive score for each individual, which could be a consumer or a corporate client or even a product: which satellite is most likely to run out of battery over the next two years, things like that. And since you're rendering a prediction for each individual, you're directly informing decisions at that level of granularity. So it drives large-scale operations in fraud detection, marketing, credit risk management, et cetera. It drives really all the main large-scale operations that we execute as organizations, more effectively. So there's
a lot of overlap with forecasting. There should be a lot more interaction between what are really two very siloed industries, but there are a lot of the same concepts, a lot of the same core analytical methods, a lot of the same thinking. They certainly would both belong under this sort of subjective umbrella known as data science. So I'm glad to be on, and also to hear about a book that you're writing where you're bringing the two fields together. What's the title of that book again?

Predictive Analytics for Business Forecasting. It'll be out about a month from now, so thank you for that plug. So with that: why exactly predictive analytics and machine learning? Why is it important, or why is it the latest evolution in information technology, in your opinion?

Well, I like to consider it the latest evolutionary step in the information age, because it drives all the large-scale operations more effectively. In a sense, we've moved from engineering, the big data movement of how to warehouse, manage, maintain, and collect bigger and bigger data, to the application of science to learn from that data: its content, what it means. What do we want to learn from data? We want to learn how to predict. Predicting per individual is basically by definition the most actionable form of analytics, because it directly informs those per-individual decisions, which make up all the main large-scale operations that we conduct as organizations.

And you mention that it makes up all the large-scale operations. I've said quite a few times that every business decision starts with a forward-looking projection, or a lag between what you know or assume right now and what actually occurs. That is a prediction.

Indeed, yes.

And being able to drive that prediction with data and science is pretty much what you're
talking about.

Yeah. The only way to scale prediction in general, and to optimize it, to do it in a scientific way, is to do it with data and use best-in-class, cutting-edge methods, which are known as machine learning. The methods on the business application side include decision trees, logistic regression, neural networks, ensemble models, et cetera, and in forecasting, time series and so on. But there are ways in which these two classes of methods really do interact and build upon one another.

Now, knowing this audience, we may have just scared a few people away. We're talking about ensemble models, regression...

No, no, no. They have fancy names, but they're totally intuitive. So let's plug our writing and stuff, right? The name of the Coursera course is Machine Learning for Everyone because we mean it: it is intuitive. There is a way to present it, and we worked hard to do that, so that it can be accessible, understandable, relevant, interesting, even entertaining to everyone. And the core technology is exciting because it provides this business value.

Sure, and that's why I said I'm finding someone on this podcast who almost equals my passion for predictive analytics. If we can make it exciting and entertaining, that's the thing.

My passion is bigger than yours.

Oh, I admit it. I'm not Dr. Data and I don't have a YouTube video yet, but I'm getting there.

Music video, yes. We'll do anything to educate on predictive analytics and machine learning. We have the Predict This rap music video, very educational, and a brand-new machine learning rap music video called Learn This, embedded in the middle of the course series on Coursera. Free to access: a two-and-a-half-minute best educational machine learning rap music video, stuck in the middle of 141 videos.

Okay, okay. I said you're a singer-songwriter, so you got me on that one. You may have a little more passion, but I'm getting there. I'm getting
there. I've got the book coming out.

Yeah, I know. You've got to write a rap, because you're not educating on machine learning unless you write some prose that rhymes.

Hey, I've already said no TikTok dances, and I don't think my audience wants to see me rap either, so I'll leave it to Dr. Data today.

I'll take the difference.

Okay, so getting back to predictive analytics and machine learning: we threw out a lot of terms early on. To break it down, let's take away the myth around some of these terms. You mentioned data quite a few times, so let's break down the myth of the term "big data," because I love where you say it's just a lot of data.

Exactly. A two-minute description of how machine learning actually works under the covers.

Right, in two minutes.

That's right. So data is not boring. Data is interesting because it's a list of things that have happened. It's a recording of history. It encodes the collective experience of your organization. So this is a process of learning from experience, and what gets learned comes from analyzing those things that have happened in the past where you know what the outcome was: how many ice cream cones were sold in the third quarter, which of your customers cancelled, which direction a political election went, which transactions turned out to be fraudulent, which credit card holders turned out to be bad credit risks. That's the experience. So you learn from that data. The number-crunching methods derive patterns, and the patterns, depending on the method, are pretty intuitive and understandable. They could be business rules like: if a customer lives in a rural neighborhood and has these demographic characteristics and has exhibited these particular behaviors (it's like a long English sentence), then they have, let's say, a four times more likely chance than average to buy your product. Which still may be
a relatively small chance, but when you're improving something like mass marketing, finding a pocket which has a four or five times more likely chance of behaving in a certain way, of having these outcomes, of making this purchase, is a dramatic improvement, and it dramatically improves the bottom-line metrics of that large-scale process, whether it's in marketing, fraud detection, credit assignment, or all different kinds of business operations. So I guess I might have gone two minutes and 25 seconds.

I lost track. It was just so entertaining, I was listening and lost track of time. So in that example, you pretty much talked about looking at historic events, historic drivers that we can identify, and putting probabilities to them, saying this percentage occurs, that percentage occurs. Logistic regression is pretty much what you're talking about there, right?

Logistic regression is one of the methods. Logistic regression is just a weighted sum with a little bit of fancy math added on. Each factor, each input to the model, each thing you're considering about, let's say, a consumer (demographic aspects, how many purchases they made last quarter, et cetera), each of these aspects, which is called an independent variable, gets a weight. So their relative importance is simply a bunch of weights. It's a weighted sum, and then you do this little nonlinear transformation that helps the thing work better, but for the most part it's all about the weighted sum. And in fact, a neural network: everyone hears about deep learning. Deep learning is a kind of neural network, but neural networks in general are actually just composed of those types of weighted sums. You just build it up into a big, complicated mathematical soup, which can be very difficult to decipher, even for the inventors of the method, even for the
scientists who created the neural network algorithms that do this. But the actual building of it is modular. If you look at the way people build a bridge, you'll see all these trestles under the bridge, often little triangular shapes, and we know how they support the structure. Each component is very simple. You build it up into this massive system, and it's very complex to understand all at once, but we know it works and you can measure how well it works: people drive across the bridge and it holds up under a car. Likewise, you can see how well these complex models work in a very simple way: you just try them out over many examples and see how well they predict.

So we just demystified neural networks. That's where I was leading, and you beat me to it. Real quick, back to that data thing. When we're talking about a lot of data, I want to make sure we cover that it can come in different ways: it can be breadth or depth of data. You could have a lot more features, or what they're calling drivers, the different predictors that you're looking at, or you could have a lot more of the same feature. So "a lot of data" can mean different things and have different impacts on a model, correct?

Yeah, absolutely. You could just think of the learning data as the list of historical examples where you already know the outcome. That's the value. The thing about big data isn't just that it's large physically, the volume of it, because there's always twice as much today as there was yesterday, so the word "big" is all just relative. What is absolutely big is the excitement about it, because of its value, and the value is that we have these historical outcomes: we know how things went. And what we also have is a
lot of pieces of information, those independent variables, and they are the columns. The learning data (the technical term for learning data is training data, go figure) is just a two-dimensional table. It's a bunch of rows, so it could be an Excel spreadsheet, although typically you want more advanced software than Excel, but the concept's the same. It's just a two-dimensional table: each row is an example, and each column is one of those independent variables, those factors, demographic and behavioral, known about each individual. So the data potentially does get quite wide, because there are so many different aspects known about each individual that could help predict, that could provide value as input to the model, the thing that's learned from crunching that data. We want the data to be very, very wide, even if ultimately the system is going to decide that only a handful, a couple dozen, of those factors will actually get used. You never really know which ones those will be until you try it out, so you just throw in a whole bunch of stuff and see which aspects turn out to be most predictive.

Okay, so let's get to that point, then: just throwing it all in. I don't want to overfit the models or have too many features. How do I know exactly which model is going to work? What does a predictive model look like, then?

Right, so the overfitting question is sort of the big question. The model might find patterns in the data that don't hold in general, and therefore it has failed. It hasn't found an actual truth that's going to help you predict in the future; it has just found arcane aspects of this particular list of examples, even if it's a really, really long list. So I was making the point that the data is really wide, and you're making the valid point that, well, if it gets too wide, then you're giving it so many opportunities to
find little weird arcane things, otherwise known as overfitting, that even if it's also really, really long, even if there are tens of millions of examples, it still has the potential to overfit, to find patterns that don't hold in general. Now, there are a lot of ways to try to fight against the potential to overfit, but the bottom line is extremely simple. Whatever strategies are in the algorithm to help it avoid overfitting, the only way you know, the way you can be totally sure that it hasn't overfit, is simply to test the resulting model over a bunch of examples that were held aside, that were quarantined, that were not used during that number crunching, during that learning process. That's the standard. It's not just a best practice, it's the only practice. All the software does this, typically automatically, even without you asking, although you can specify how to do it. It might hold, say, about 20 percent of the data aside. You're forgoing the value of that data for the learning process, so you get that much less data to actually learn from, but you have the absolutely necessary requirement fulfilled: whatever you do learn from, let's say, 80 percent is now validated over the separate held-aside data, which is called test data.

All righty. So, on the different models, I want you to hit on ensemble modeling real quick. You're talking about simplicity and being able to find the best features and the best model, and ensemble modeling can actually help with that. It has held up a lot in those Kaggle competitions and things of that sort as well. So do you want to explain a little bit about ensemble modeling for the audience?

Yeah, sure. An ensemble is when you take a bunch of relatively simple models, like a simple logistic regression, which is a weighted sum, or a decision tree, which is a way to encode those if-then business rules we mentioned earlier. But you
take a whole bunch of those models and you basically just have them all collaborate: you have them all vote, or you take the average output of all of them. It's totally analogous to the wisdom-of-the-crowds concept that happens with people. If I show you a glass jar of marbles and ask, "Can you guess how many marbles are in here? Whoever guesses closest is the winner," it's very difficult. You're looking at this jar with your eyes, trying to figure out how many marbles are in there. But if you get a lot of people to guess, some will guess way too high and some will guess way too low, but typically the average guess is a good guess, often better than the best individual guess. So the group of guessing people as a collective exhibits this emergent, so-called collective intelligence. And it's the same thing when you have a population not of humans but of models, even if the models are super simple. That's what an ensemble model is.

It kind of smooths out the edges. Take something like a decision tree, which is a way to summarize a bunch of if-then rules. If I have a list of, let's say, 50,000 examples and it grows the tree, creating a bunch of rules learned from that data, it typically does a pretty good job. But it's a little brittle, in the sense that if I change the data even just a little bit, there's a possibility that the resulting model would change a lot, just because of where all the boundaries are. If a few boundary cases changed, the whole resulting model might get a lot better or a lot worse just from those little changes, and we don't want that brittleness. But if you perturb the data a little bit randomly, create a new model, say a new decision tree, do that repeatedly to create a whole population of trees, and have them all vote rather than relying on just one model, then
that overall process is much more robust, much less brittle.

Okay, so the last thing, as we take apart predictive analytics and machine learning. The other thing I always say is that it's more forward-looking, as opposed to just looking at historic level and trend. It's more forward-looking because you are looking at those other features, that width of the data set you're looking at. Then it gets into things like uplift modeling, or persuasion modeling, which I think is the other word for it: things that can not only predict what the future is but help create a future going forward. Is that really the next stage for predictive analytics?

Yeah. Uplift modeling, also known as persuasion modeling (yes, that's the other word for it), is a better option for many application areas, especially marketing and, let's say, political elections, because you're predicting whether the individual would be persuaded, whether you would change the outcome with this marketing contact. Suppose I just predict, of all the customers to whom I apply the marketing treatment (let's say I spend two dollars sending each a marketing brochure), which ones are most likely to buy. That doesn't necessarily help me decide who I should send a brochure to, because I don't just want to send it based on who's going to buy if I send it. I want to know who's going to be more likely to buy if I send it in comparison to if I didn't send it. Does the marketing treatment actually help? Does it have an impact? It's the same thing with driving where campaign volunteers knock on doors for a political election. Obama's 2012 re-election campaign very much used uplift modeling, or persuasion modeling. So this proves that you really did read my book, because it's the very last chapter. I saved the best for last, in a sense. Same thing:
it's also covered at the end of our Coursera course series, Machine Learning for Everyone. The thing about uplift modeling is that its potential to do much better than standard predictive modeling or machine learning is huge, but it is also more complex and has additional data requirements, because you need to have a control set. In most applications of predictive analytics, that is to say most of the time when you use machine learning for business, you're operating on found data. Found data means data that wasn't actually collected for the purposes of machine learning. It's big data: all the data that's being collected as a side effect of conducting business as normal. You know that you did all this marketing last quarter, and you know which of those customers did and didn't buy. That's data that's sitting around, that was collected organically, in the sense that it came from the normal matter of conducting business, and now we're making use of it. If you want to move to uplift modeling, you have to conduct experiments; that is, you have to have a control set. In the case of marketing, you need to market to a bunch of people and track who did and didn't respond. That's already normally done. But you also need to purposefully not market to some control set, some evenly distributed sample of customers from whom you withhold the marketing treatment, and then also track who did or didn't make the purchase anyway. Or, in political elections, who did or didn't vote for your candidate, whether or not they received a phone call from a campaign volunteer. Same concept. So it's more complex, but it's a fascinating area.

And yes, I did read your book. I've got it a couple of times already, and I highly recommend it to people in the audience. If you want a good introduction to predictive analytics before my book comes out, this is an excellent, entertaining, conversational type of read. I highly recommend
picking this up as one of your must-reads, especially during the season coming in [inaudible].

Last question for you, then. I wish I could keep going, but the audience only has so long an attention span, and I can only go so long as well. It's about those noobs, the people like me who want to get started in this, who have this passion, or in whom we've now hopefully sparked that passion for predictive analytics, and who want to learn more. Say they're a demand planner or forecaster in a company. How do you recommend they start gaining the knowledge they're going to need in a few years?

That's a great question. My first answer will be to once again plug the Coursera course series, because I'm addressing an unmet need with it. Most machine learning trainings are technical: either they're hands-on, or they're abstract and academic with regard to the algorithms, the machine learning methods themselves. But the fact is there's another side to machine learning if you're going to actually make business value out of it, which is the whole organizational process, the way you're positioning the technology so that it's not just a cool, elegant model that's nifty, but it's actually actionable and will actually be deployed, that is, integrated into your existing operations so that they'll be changed and improved. What we did with Machine Learning for Everyone (the information about free access is at machinelearning.courses) is holistically address both sides: an accessible overview of the underlying machine learning methods that we've been talking about, but then also all the organizational requirements: the planning, the greenlighting, the staffing of the analytics project, the data requirements and how you prep the data, which is not rocket science but is very involved on a technical level and has certain particular requirements for
that table we were talking about, and all the ins and outs and pitfalls of how you understand how well a model really did, and how you can be sure that it has discovered something truly of value that will be actionable and will hold up in new situations.

Well, thank you very much. I really enjoyed this conversation. As I said, I'll probably have you back sometime to talk again, because I just enjoy talking with you. Someone who almost shares my passion for predictive analytics is hard to find, so I appreciate you coming on board and raising up the passion a little bit for what we do.

Thanks for the great questions. Good talking.

It was great talking with you, and we'll speak again soon.

Awesome. Thanks, Eric.

Thank you.

Not surprisingly, that was a wonderful discussion. I knew I'd enjoy talking with him. He does make predictive analytics understandable and entertaining, and that's why I really wanted to have him on board, to help explain exactly what we're talking about with predictive analytics and hopefully entertain you along the way.

If we wanted to sum up predictive analytics as a definition for business purposes: it's a process and strategy that uses a variety of advanced statistical algorithms to detect patterns and conditions that may occur in the future, for insights into what will happen and to help support various business needs. We talked about how every business decision is based on that forward-looking projection. It's a forecast, and if we can do it better, that's what we're all about, and predictive analytics will help us do that better.

So we're distinguishing what the difference is. He had a great definition, talking about the finite and probability versus the deterministic. He has some great insights on predictive analytics. The way I'm going to simplify it is based on three basic things that make it different. One, it's more and different
data. We're not just looking at our internal data sets anymore. That's not going to get you where you need to go, especially during times like COVID or any kind of shock to the system, when you need to understand your consumers and market better. To do that, you need to start looking at external drivers and external data, at the things that really drive your business and drive consumer behavior. Those are the different data sources. He talked about wide data, all the columns you're looking at to be able to forecast. We need to start bringing more data and more variables into our modeling. That's number one: first and foremost, data drives prediction, and looking at more and different types of data is one of the major characteristics that sets predictive analytics apart.

Second, it's more and different models and algorithms. It's not just your mama's moving average. You have to start looking at things that really look at correlation, things that bring different models together, different drivers, different types of machine learning that will help your system identify and learn from behaviors and patterns, to be able to make better predictions going forward. So it's more and different data, more and different models as well, and a lot more, you know, supervised or even unsupervised types of learning models.

And finally, the last piece that really differentiates predictive analytics from standard business forecasting, which we've been doing for years with a lot of success (but together you get to that next level): it needs to be more forward-looking. Believe it or not, what we do is not exactly groundbreaking and forward-looking. If we're looking at past order history, at the level, trend, seasonality, and noise, and extrapolating that forward, you're making some assumptions there, yes, and it's also
reactive to what has just occurred. If we can start understanding the whys, if we can start seeing the drivers, those can sometimes be turned into leading indicators that are truly more forward-looking. You're now driving your business with something that is more sure to happen in the future, or more correlated with what may occur in the future. So predictive modeling by its nature is, a lot of the time, more forward-looking.

So if I had to break down the difference: it's more and different data, more and different algorithms, and it's more forward-looking. That's the basis of predictive analytics. It isn't rocket science, though. If I can do it, you can, and I want everybody to get on board and start learning more.

He mentioned my book coming out, and I appreciate the plug he gave it. I'll give another plug: in about a month I'm going to have a book coming out, and that's going to be the topic of that podcast. And you've done a great job, audience. With every person I've had on with a book, I keep getting feedback from them and from you saying, "I bought that book, I bought that book." Please do that: buy Eric's book today, and then check out my book as well, because it's going to make business forecasting and predictive analytics easy and understandable, something you can grasp and hold on to, and make it more enjoyable and better going forward.

So this comes to the end of another IBF On Demand podcast. My name is Eric Wilson. You can find me at eric at ibf.org. Check out Arkieva as well. We talked about machine learning, and they just came out with a new Swiftcast, which actually uses machine learning in developing an ROI for exactly what you may be able to get with different types of models and different types of data, so you can look at your data set and actually see an ROI for that as well. So check out Swiftcast; you can find it on Arkieva's website. Please continue to listen, like, share, and subscribe, and continue to be
part of this family. Help us grow this community: over 2,000 is the next goal, so let's see if we can hit it before the next podcast. With that, I want to thank everybody. Thank you one more time, thank the audience, thank everyone, and don't forget: wash your hands. [Music]
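
Two ideas from the conversation, logistic regression as a weighted sum followed by a small nonlinear transformation, and checking a model on held-aside test data, can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the data is synthetic and made up for the example, and the function names, learning rate, epoch count, and 80/20 split are hypothetical choices, not anything taken from the speakers' courses, books, or software.

```python
import math
import random


def predict_probability(weights, bias, features):
    """Weighted sum of the inputs, squashed into (0, 1) by the logistic function."""
    weighted_sum = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-weighted_sum))


def train(rows, labels, epochs=50, learning_rate=0.1):
    """Learn one weight per input column with plain stochastic gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(rows, labels):
            error = predict_probability(weights, bias, features) - label
            bias -= learning_rate * error
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
    return weights, bias


def accuracy(weights, bias, rows, labels):
    """Fraction of examples where the 0.5-thresholded prediction matches the outcome."""
    hits = sum(
        (predict_probability(weights, bias, features) >= 0.5) == (label == 1)
        for features, label in zip(rows, labels)
    )
    return hits / len(rows)


# Synthetic "training table": each row is an example, each column an input variable.
# The outcome rule below is invented purely for illustration.
rng = random.Random(0)
rows = [[rng.uniform(-1, 1), rng.uniform(-1, 1)] for _ in range(500)]
labels = [1 if 2.0 * x1 - 1.0 * x2 > 0 else 0 for x1, x2 in rows]

# Hold about 20 percent aside; the rows are already in random order, so no shuffle.
split = int(0.8 * len(rows))
train_rows, test_rows = rows[:split], rows[split:]
train_labels, test_labels = labels[:split], labels[split:]

weights, bias = train(train_rows, train_labels)
test_accuracy = accuracy(weights, bias, test_rows, test_labels)
print(f"learned weights: {weights}, bias: {bias}")
print(f"accuracy on held-aside test data: {test_accuracy:.2f}")
```

The learned weights play the role of the "relative importance" of each input, and the accuracy is measured only on rows the training loop never saw, which is the held-aside check described in the discussion.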